Dynamic Features Spaces and Machine Learning - Open Problems and Synthetic Data Sets.

IUKM(2020)

Abstract
Dynamic feature spaces appear when different records or instances in databases are defined in terms of different features. This is in contrast with the usual (static) feature spaces in standard databases, where the schema of the database is known and fixed, so that all records in the database share the same set of variables, attributes, or features. Dynamic feature mining algorithms aim to extract knowledge from data in dynamic feature spaces. As an example, spam detection methods have been developed from a dynamic feature space perspective. Words are taken as features, and new words appearing in new emails are therefore considered new features. In this case, the problem of spam detection is represented as a classification problem (a supervised machine learning problem). The relevance of dynamic feature spaces is increasing. The large amounts of data currently available to or received by systems are not necessarily described using the same feature spaces. This is the case for distributed databases with data about customers, providers, etc. Industry 4.0, the Internet of Things, and RFIDs are, and will continue to be, sources of data in dynamic feature spaces. New sensors added in an industrial environment, new devices connected to a smart home, and new types of analysis and new types of sensors in healthcare are all examples of dynamic feature spaces. Machine learning algorithms are needed to deal with these types of scenarios. In this paper we motivate the interest in dynamic feature mining, give some examples of scenarios where these techniques are needed, review some of the existing solutions and their relationship with other areas of machine learning and data mining (e.g., incremental learning, concept drift, topic modeling), discuss some open problems, and discuss synthetic data generation for this type of problem.
Keywords
dynamic features spaces, synthetic data sets, machine learning