Word Frequencies in Linguistic Articles Published in SINTA Indexed Journals

Jurnal Sosioteknologi (2023)

Abstract
Multiword sequences are language patterns that arise when groups of words recur within the same register. In research papers written by lecturers and students, different topic areas and indexes have produced lexical bundles with varying characteristics. This research is qualitative and uses a corpus design to identify word sequences within the texts. The corpus was drawn from five different indexed journals, all on the topic of linguistics. The papers were first converted in full to text format so that the program could read them. The program used was Orange Apps version 3.27, applying the Textable, Data Table, and Text Mining menus. The data sources are academic research articles indexed in SINTA 5 and published in 2020. The main theory used in this research is that of Biber (2007), which discusses the main characteristics and a number of criteria for defining word strings. The corpus comprised 207,896 characters and 33,636 words. Pre-processing, which included transformation, tokenization, and PoS-tagging, yielded 4,273 words. Of these 4,273 words, virus, deixis, and slang occurred most frequently. Based on these results, it can be concluded that the majority of the journal articles are about viruses and slang. These topics pertain to the pandemic, which was prevalent at the time the journals were published. This information may help authors identify a journal’s keywords and most frequent words while writing an article.
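As a rough illustration of the pre-processing and counting steps described in the abstract, the sketch below lowercases a text (transformation), splits it into word tokens (tokenization), and tallies word frequencies. It is not the Orange workflow used in the study: it relies only on the Python standard library, omits PoS-tagging, and the stop-word list, function names, and sample sentence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch;
# the study's actual filtering settings are not stated in the abstract).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "is", "are", "to", "as"}


def preprocess(text):
    """Lowercase the text (transformation) and split it into word tokens."""
    lowered = text.lower()
    # Keep alphabetic runs, allowing an internal apostrophe (e.g. "don't").
    return re.findall(r"[a-z]+(?:'[a-z]+)?", lowered)


def word_frequencies(text, stop_words=STOP_WORDS):
    """Count how often each non-stop-word token occurs in the text."""
    tokens = [t for t in preprocess(text) if t not in stop_words]
    return Counter(tokens)


# Made-up sample sentence, not taken from the study's corpus.
sample = (
    "Deixis and slang are studied. The virus shaped slang; "
    "slang spread as the virus spread."
)
freqs = word_frequencies(sample)
top = freqs.most_common(3)
print(top)
```

On a real corpus, the same counter could be fed the concatenated text of all converted articles, and `most_common` would then surface candidate keywords in the way the abstract describes.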