Active Learning for Recommender Systems with Multiple Localized Models

MSRA (2011)

Abstract
For effective predictive modeling in large-scale recommender systems, it is essential that many customers rate a large number of products, i.e., that a large amount of labeled data be obtained. However, most consumers do not provide their preferences without proper incentives. Given a budget to reward consumers for their feedback, it would be beneficial to have a policy that suggests which customers' ratings, and for which products, would be most cost-effective to acquire, so as to improve modeling the most. E-commerce businesses can use such a policy to incrementally and cost-effectively acquire consumers' ratings or other forms of feedback. This challenge maps to the problem of active learning [2], in which a learner aims to intelligently select the labels of particularly informative examples from a pool of prospective acquisitions, so as to improve generalization accuracy the most for a given number of acquisitions. While there are published results for active learning in a regression setting [4, 5], there has been little study of approaches that are applicable to important, practical scenarios. In particular, most proposed approaches aim at optimizing the training input density (i.e., they are not pool-based) rather than evaluating prospective acquisitions from an available pool, and they focus on linear least-squares models. In addition, most of the literature on active learning, especially in the machine learning community, involves learning in the context of classification problems [2]. Moreover, these methods consider predictions obtained by either a single "global" probabilistic classification model or an ensemble of global classifiers (e.g., bagging or boosting). As we discuss below, the large-scale data sets that are characteristic of recommender systems call for other types of modeling that are significantly more appropriate. The data encountered in large-scale recommender systems typically exhibit inherent heterogeneity among the customers and products.
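The pool-based acquisition loop described above can be illustrated with a minimal sketch: from a pool of unlabeled candidates, query the one whose prediction is most uncertain under a bootstrap ensemble of regression models. The function names, the ridge model, and the variance criterion are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression weights (illustrative base learner)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def select_query(X_train, y_train, X_pool, n_models=10, rng=None):
    """Return the index of the pool point with the highest prediction
    variance across a bootstrap ensemble -- a simple proxy for the
    'most informative' prospective acquisition."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X_train)
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)          # bootstrap resample
        w = ridge_fit(X_train[idx], y_train[idx])  # refit on the resample
        preds.append(X_pool @ w)
    return int(np.argmax(np.var(preds, axis=0)))
```

In practice the selected point's label would be acquired (e.g., by rewarding that customer for a rating), added to the training set, and the loop repeated until the budget is exhausted.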
For instance, Amazon.com, being a large retailer, has customers who exhibit varied purchase patterns. As we demonstrate, in this setting it is beneficial to model the behavior of different groups of customers separately. Similarly, because the retailer's products span a very wide range of categories, it is also advantageous to model ratings for homogeneous product groups separately. Thus, rather than induce a single predictive model, one can represent such data by a set of multiple local models, such that each local model captures consumers' ratings in a certain region of the input (consumer/product) space. We propose a radically different active learning scheme that (a) leverages a collection of localized predictive models, and (b) generalizes to both classification and regression problems. Our approach is also applicable to a wide range of model types, including nonlinear models such as MLPs and regularized linear models (ridge/lasso). Before developing our active learning strategy, we first briefly discuss how to learn a set of local predictive models that accurately represent such heterogeneous data. We recently proposed Simultaneous CO-clustering And Learning (SCOAL) [3], a versatile and effective framework for predictive modeling of large-scale, heterogeneous, dyadic data. SCOAL interleaves simultaneous partitioning along both the "customer" and "product" modes (co-clustering) with the construction of prediction models, iteratively improving both the assignment of each consumer's rating to a cluster and the fit of the model induced within each data cluster. SCOAL exploits both neighborhood information and the available customer/product attributes, thereby combining the benefits of collaborative filtering and content-based approaches.
The framework can be viewed as simultaneous co-segmentation and classification (or regression), and we show that it is substantially better than independently clustering the data a priori and then inducing models. In SCOAL, each model
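The alternation between co-clustering and per-cluster model fitting described above can be sketched as follows. This is a heavily simplified stand-in, not the SCOAL algorithm of [3]: it uses a block mean as the "model" for each co-cluster (where SCOAL fits a predictive model on customer/product attributes), and it ignores missing ratings.

```python
import numpy as np

def scoal_sketch(Z, k, l, n_iters=5, seed=0):
    """Simplified SCOAL-style alternating minimization.
    Z: (m, n) rating matrix; k, l: number of customer / product clusters.
    Returns row labels r, column labels c, and per-block models mu."""
    rng = np.random.default_rng(seed)
    r = rng.integers(0, k, size=Z.shape[0])   # customer cluster labels
    c = rng.integers(0, l, size=Z.shape[1])   # product cluster labels
    mu = np.zeros((k, l))
    for _ in range(n_iters):
        # Fit one "model" per co-cluster (here: the block mean rating).
        for a in range(k):
            for b in range(l):
                block = Z[np.ix_(r == a, c == b)]
                mu[a, b] = block.mean() if block.size else Z.mean()
        # Reassign each customer to the row cluster that best fits its ratings.
        for i in range(Z.shape[0]):
            r[i] = int(np.argmin([((Z[i] - mu[a, c]) ** 2).sum()
                                  for a in range(k)]))
        # Reassign each product symmetrically.
        for j in range(Z.shape[1]):
            c[j] = int(np.argmin([((Z[:, j] - mu[r, b]) ** 2).sum()
                                  for b in range(l)]))
    return r, c, mu
```

Each pass both refits the per-block models and improves the cluster assignments, mirroring the interleaved co-clustering and model-construction steps the abstract describes.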