GB-CENT: Gradient Boosted Categorical Embedding and Numerical Trees.

WWW 2017

Abstract
Latent factor models and decision tree based models are widely used in tasks of prediction, ranking and recommendation. Latent factor models have the advantage of interpreting categorical features through a low-dimensional representation, but such a representation does not naturally fit numerical features. In contrast, decision tree based models capture the nonlinear interactions of numerical features well, but their ability to handle categorical features is limited by the cardinality of those features. Since real-world applications usually have both abundant numerical features and categorical features with large cardinality (e.g., geolocations, IDs, tags), we design a new model, called GB-CENT, which leverages latent factor embedding and tree components to achieve the merits of both while avoiding their demerits. With two real-world data sets, we demonstrate that GB-CENT can effectively (i.e., quickly and accurately) achieve better accuracy than state-of-the-art matrix factorization, decision tree based models and their ensemble.
Keywords
recommender systems, matrix factorization, low-dimensional embedding, gradient boosting, decision trees, numerical and categorical features, large cardinality
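The abstract describes combining a low-dimensional embedding component for high-cardinality categorical features with gradient boosted trees for numerical features. Below is a minimal sketch of that idea, not the authors' implementation: a matrix-factorization-style score over categorical IDs is fit first, and boosted regression trees on the numerical features then fit its residual. The toy data, library choices (numpy, scikit-learn), and all hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Toy recommendation-style data: two high-cardinality categorical features
# (user ID, item ID) plus a few dense numerical features.
n, n_users, n_items, dim, n_num = 2000, 100, 80, 8, 4
users = rng.integers(0, n_users, size=n)
items = rng.integers(0, n_items, size=n)
X_num = rng.normal(size=(n, n_num))
y = rng.normal(size=n)

# Embedding ("CENT") part: biases plus a dot product of low-dimensional
# embeddings, trained with a few epochs of SGD on squared error. This is a
# stand-in for the paper's categorical embedding component.
bu, bi = np.zeros(n_users), np.zeros(n_items)
pu = 0.01 * rng.normal(size=(n_users, dim))
qi = 0.01 * rng.normal(size=(n_items, dim))
lr = 0.05
for _ in range(20):
    for u, i, r in zip(users, items, y):
        pred = bu[u] + bi[i] + pu[u] @ qi[i]
        err = pred - r
        gu, gi = err * qi[i], err * pu[u]
        bu[u] -= lr * err
        bi[i] -= lr * err
        pu[u] -= lr * gu
        qi[i] -= lr * gi

cent_score = bu[users] + bi[items] + np.sum(pu[users] * qi[items], axis=1)

# Gradient boosted ("GB") part: regression trees on the numerical features,
# fit to the residual left by the embedding component.
gb = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
gb.fit(X_num, y - cent_score)

# Final prediction sums the two components.
final_pred = cent_score + gb.predict(X_num)
print("train MSE:", round(float(np.mean((final_pred - y) ** 2)), 4))
```

The residual-fitting order reflects the division of labor the abstract motivates: the embedding handles large-cardinality IDs that trees split on poorly, and the trees capture nonlinear interactions among numerical features that a linear embedding score misses.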