Urban neighborhood socioeconomic status (SES) inference: A machine learning approach based on semantic and sentimental analysis of online housing advertisements

Habitat International (2022)

Abstract
Understanding the dynamic distribution of residents' socioeconomic status (SES) across neighborhoods within cities is essential for urban planning and policy-making aligned with the 2030 Sustainable Development Goals. Although the promise of explicitly linking geographical features to SES has been highlighted fairly clearly in previous work, scholars remain ambivalent about its prospects, given the absence of theoretical grounding, the heavy reliance on nontransparent proprietary data sources, and the relatively coarse resolution of predictions. Drawing on a case study of the Hangzhou metropolitan area in China, this paper aims to address these problems by demonstrating a novel approach to neighborhood SES inference based on online housing advertisements. We first revisit the theoretical debates on the linkage between neighborhood SES and online housing advertisements. Then, a Naïve Bayes classifier is employed to semantically identify topics in online housing advertisements, and the associated sentiments are quantified using a lexicon-based approach. Following that, seven commonly used machine learning algorithms are compared and used to infer fine-grained neighborhood SES at the scale of residential quarters, based on housing attributes and the topics extracted from online housing advertisements. Results show that machine learning algorithms vary in predictive ability and that tree-based algorithms are much more powerful in inferring neighborhood SES. More specifically, we identify eight reliable features that not only receive relatively high importance from all the machine learning algorithms but also exhibit great robustness in inferring neighborhood SES and show promising potential for unraveling social inequalities. We also observe noteworthy spatial heterogeneity in neighborhood SES across the research site.
The demonstrated approach not only enables policymakers to take stock of deprived neighborhoods in a timely manner, but also lays a firm foundation for framing contextualized strategies of urban governance. This study is among the first attempts to bridge the theoretical interpretation of housing attributes with the proxy-indicator-based approach to fine-grained neighborhood SES measurement.
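To make the lexicon-based sentiment step concrete, the following is a minimal sketch only, not the authors' implementation. The abstract does not name the lexicon, the scoring rule, or the tokenizer, so `TOY_LEXICON`, `sentiment_score`, and the mean-polarity rule are all hypothetical, and the English word matching stands in for whatever Chinese text segmentation the real advertisements would require.

```python
import re

# Hypothetical toy lexicon: word -> polarity. The lexicon used in the paper
# is not specified in the abstract; these entries are for illustration only.
TOY_LEXICON = {
    "spacious": 1.0,
    "bright": 1.0,
    "convenient": 1.0,
    "noisy": -1.0,
    "old": -1.0,
    "cramped": -1.0,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Return the mean polarity of lexicon words found in text.

    Returns 0.0 when no lexicon word occurs, so unmatched advertisements
    are treated as neutral (an assumption of this sketch).
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0


ad = "Spacious and bright flat, but on a noisy street"
score = sentiment_score(ad)  # two positive hits and one negative hit
```

A score of this kind could then sit alongside housing attributes and topic labels as one input feature for the SES classifiers; how the paper actually combines them is not stated in the abstract.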
Keywords
Neighborhood socioeconomic status, Area deprivation, Machine learning, Open data, Social inequalities, Online housing listings