      Constrained Dual-Level Bandit for Personalized Impression Regulation in Online Ranking Systems

          Abstract

          Impression regulation plays an important role in various online ranking systems; e.g., e-commerce ranking systems always need to meet local commercial demands on some pre-labeled target items, such as fresh item cultivation and fraudulent item counteracting, while maximizing their global revenue. However, local impression regulation may cause “butterfly effects” on the global scale; e.g., in e-commerce, price preference fluctuations in initial conditions (overpriced or underpriced items) may create a significantly different outcome, thus affecting the shopping experience and bringing economic losses to platforms. To prevent “butterfly effects”, some researchers define their regulation objectives with global constraints, using a contextual bandit at the page level that requires all items on one page to share the same regulation action, which fails to conduct impression regulation on individual items. To address this problem, in this article, we propose a personalized impression regulation method that can directly make regulation decisions for each user-item pair. Specifically, we model the regulation problem as a Constrained Dual-level Bandit (CDB) problem, where the local regulation action and reward signals are at the item level while the global effect constraint on the platform impression can be calculated at the page level only. To handle the asynchronous signals, we first expand the page-level constraint to the item level and then derive the policy update as a second-order cone optimization problem. Our CDB approaches the optimal policy by iteratively solving the optimization problem. Experiments are performed on both offline and online datasets, and the results, theoretically and empirically, demonstrate that CDB outperforms state-of-the-art algorithms.
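
          For readers unfamiliar with the term, the generic textbook form of a second-order cone program is sketched below. This is only the standard problem class, not the formulation derived in the article: the abstract does not state CDB's objective or constraints, and every symbol here (x, f, A_i, b_i, c_i, d_i, F, g) is a generic placeholder rather than the paper's notation.

          ```latex
          \begin{aligned}
          \min_{x \in \mathbb{R}^n} \quad & f^\top x \\
          \text{s.t.} \quad & \lVert A_i x + b_i \rVert_2 \le c_i^\top x + d_i, \quad i = 1, \dots, m, \\
          & F x = g.
          \end{aligned}
          ```

          Each inequality confines an affine image of x to a second-order (Lorentz) cone, so the problem is convex. That convexity is presumably what makes it practical to re-solve the problem at every policy update, as the abstract describes.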

                Author and article information

                Journal
                ACM Transactions on Knowledge Discovery from Data (ACM Trans. Knowl. Discov. Data)
                Association for Computing Machinery (ACM)
                ISSN: 1556-4681, 1556-472X
                July 21 2021
                Volume: 16
                Issue: 2
                Pages: 1-23
                Affiliations
                [1] Alibaba Group, Hangzhou, China
                [2] Peking University, Beijing, China
                Article
                10.1145/3461340
                © 2021