Robust Learning of Consumer Preferences

Operations Research (2022)

Abstract
When companies develop new products, they often must choose among competing designs to take to market. How to decide? Traditional methods, such as focus groups, do not scale to the modern marketplace, in which tastes evolve rapidly. In “Robust Learning of Consumer Preferences,” Feng, Caldentey, and Ryan develop a data-driven approach to deciding which design to produce by displaying a sequence of subsets of possible designs to potential customers. Their framework finds solutions that are robust to any model of consumer choice within a broad family that includes, as special cases, common choice models studied in the literature. Previous research, by contrast, focuses on algorithms whose performance is tied to a given choice model. The authors' algorithm is shown to be asymptotically optimal in a worst-case sense, and its promising practical performance is demonstrated through a comprehensive numerical study using real data.
This paper studies a class of ranking and selection problems faced by a company that wants to identify the most preferred product out of a finite set of alternatives when consumer preferences are a priori unknown. The only information available is that consumer preferences satisfy two key properties: (i) they are consistent with some unknown true ranking of the alternatives, and (ii) they are strict, namely, no two products are equally preferred. To learn the unknown ranking, the company can sample consumer preferences by sequentially showing different subsets of products to different consumers and asking each to report their top preference within the displayed set. The objective of the company is to design a display policy that minimizes the expected number of samples needed to identify the top-ranked product with high probability. We prove an instance-specific lower bound on the sample complexity of any policy that identifies the top-ranked product within a given (probabilistic) confidence.
We also propose a robust formulation of the company’s problem and derive a sampling policy (the myopic tracking policy), which is both worst-case asymptotically optimal and intuitive to implement. Roughly speaking, the myopic tracking policy randomly alternates between two extreme types of display strategies: (i) full display, which shows a consumer the entire menu so as to learn something about every product, and (ii) pair display, which shows a consumer only two products so as to maximize the informativeness of the choice made by the consumer. To assess the performance of our proposed myopic tracking policy, we conduct a comprehensive set of computational studies and compare it to alternative methods in the literature.
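To make the sampling setup concrete, the following is a minimal toy sketch of a display loop that randomly alternates between a full display and a pair display. It is not the paper's myopic tracking policy: the abstract does not give the policy's mixing rule or stopping rule, so the fixed mixing probability `p_full`, the fixed sample budget `n_samples`, the rule of pairing the two current leaders, and the weight-proportional choice model in `sample_choice` are all assumptions made for illustration, as are the function names themselves.

```python
import random
from collections import Counter


def sample_choice(display, weights, rng):
    """Simulate one consumer picking a top preference from `display`.

    Assumption: the choice probability is proportional to a positive
    attraction weight per product (a Luce/MNL-style model). The paper
    covers a broader family of choice models.
    """
    w = [weights[p] for p in display]
    return rng.choices(display, weights=w, k=1)[0]


def toy_display_policy(weights, n_samples, p_full=0.5, seed=0):
    """Alternate at random between full display and pair display.

    Hypothetical simplifications: fixed budget instead of a confidence-based
    stopping rule, and a constant mixing probability `p_full`.
    """
    rng = random.Random(seed)
    products = list(weights)
    wins = Counter()  # how often each product was reported as top choice
    for _ in range(n_samples):
        if rng.random() < p_full:
            display = products  # full display: show the entire menu
        else:
            # pair display: show the two products with the most wins so far
            ranked = sorted(products, key=lambda p: wins[p], reverse=True)
            display = ranked[:2]
        wins[sample_choice(display, weights, rng)] += 1
    # Report the product chosen most often as the estimated top product.
    best = max(products, key=lambda p: wins[p])
    return best, wins


best, wins = toy_display_policy({"A": 3.0, "B": 2.0, "C": 1.0}, n_samples=2000)
print(best, dict(wins))
```

In this sketch the full display keeps collecting information on every product, while the pair display concentrates samples on separating the two current front-runners, which mirrors the intuition stated above without claiming any of the paper's optimality guarantees.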
Keywords
Revenue Management and Market Analytics, sequential learning, maximum selection, best arm identification, dynamic assortments, preference learning