Using Machine Learning to Generate a Dictionary for Environmental Issues.

Daniel E. O'Leary, Yangin Yoon


The purpose of this paper is to investigate the use of machine learning approaches to build a dictionary of terms for analyzing text for ESG content using a bag-of-words approach, where ESG stands for “environment, social and governance.” Specifically, the paper reviews experiments performed to develop a dictionary for one environmental topic, “carbon footprint.” We investigate using Word2Vec based on text from Form 10-Ks and from earnings calls, together with queries of ChatGPT, and compare the results. In developing our dictionaries, we find that bigrams and trigrams are more likely to be found when using ChatGPT, suggesting that bigrams and trigrams provide a “better” approach for the dictionaries developed with Word2Vec. We also find that terms provided by ChatGPT were less likely to appear in Form 10-Ks or other business disclosures than terms generated using Word2Vec. In addition, we explored different ways of questioning ChatGPT to elicit different perspectives on carbon footprint, such as “reducing carbon footprint” or “negative effects of carbon footprint.” We then discuss combining the findings from each of these approaches to build a dictionary that could be used alone or with other ESG concept dictionaries.
Keywords: machine learning, dictionary
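
The bag-of-words dictionary approach described in the abstract can be illustrated with a minimal Python sketch. It counts how often dictionary terms occur in a passage, matching unigrams, bigrams, and trigrams. The dictionary entries, the sample sentence, and the function names are all hypothetical and chosen only for illustration; they are not the terms or the procedure reported in the paper, which derives its candidate terms from Word2Vec and ChatGPT.

```python
import re
from collections import Counter

# Hypothetical dictionary entries, chosen only for illustration;
# they are not the terms reported in the paper.
CARBON_DICTIONARY = {
    "emissions",
    "carbon footprint",
    "greenhouse gas",
    "renewable energy sources",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-word phrases as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_dictionary_hits(text, dictionary, max_n=3):
    """Count occurrences of dictionary terms among 1-grams up to max_n-grams."""
    tokens = tokenize(text)
    hits = Counter()
    for n in range(1, max_n + 1):
        for phrase in ngrams(tokens, n):
            if phrase in dictionary:
                hits[phrase] += 1
    return hits


# Hypothetical disclosure-style sentence used as input.
sample = (
    "We reduced our carbon footprint by cutting greenhouse gas emissions "
    "and investing in renewable energy sources."
)
hits = count_dictionary_hits(sample, CARBON_DICTIONARY)
for term, count in sorted(hits.items()):
    print(term, count)
```

Matching multi-word phrases directly, rather than single words only, is one simple way to accommodate the bigrams and trigrams that the abstract reports as more common in the ChatGPT output.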