Automatic Question Generation using RNN-based and Pre-trained Transformer-based Models in Low Resource Indonesian Language

INFORMATICA-AN INTERNATIONAL JOURNAL OF COMPUTING AND INFORMATICS（2022）

引用 1|浏览8

暂无评分

摘要

Although Indonesian is the fourth most frequently used language on the internet, the development of NLP in Indonesian has not been studied intensively. One form of NLP application classified as an NLG task is the Automatic Question Generation task. Generally, the task has proven well, us-ing rule-based and cloze tests, but these approaches depend heavily on the defined rules. While this approach is suitable for automated question generation systems on a small scale, it can be-come less efficient as the scale of the system grows. Many NLG model architectures have recently proven to have significantly improved performance compared to previous architectures, such as generative pre-trained transformers, text-to-text transfer transformers, bidirectional auto -regressive transformers, and many more. Previous studies on AQG in Indonesian were built on RNN-based architecture such as GRU, LSTM, and Transformer. The performance of models in previous studies is compared with state-of-the-art models, such as multilingual models mBART and mT5, and monolingual models such as IndoBART and IndoGPT. As a result, the fine-tuned IndoBART performed significantly higher than either BiGRU and BiLSTM on the SQuAD dataset. Fine-tuned IndoBART on most of the metrics also performed better on the TyDiQA dataset only, which has fewer population than the SQuAD dataset.

查看译文

关键词

natural language processing,natural language generation,automatic question generation,recurrent neural network,long-short term memory,gated recurrent unit,transformer,fine-tuning

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要