Machine Learning in the Identification of Coding Regions in DNA Sequences of Filamentous Fungi

Gustavo Henrique Ferreira Cruz, Vinícius Menossi, Josiane Melchiori Pinheiro, Antônio Roberto dos Santos, Gustavo Luiz Furuhata Ferreira, Sarah Anduca de Oliveira

Proceedings of the XIII Computer on the Beach - COTB'22 (2022)

Abstract
Identifying intron and exon regions in genes is a very complex task that requires recognizing certain nucleotide patterns in the gene sequence. It can be done manually or through software that most often relies on genetic alignment techniques, which are not very effective for this purpose. As an opportunity for collaboration between biology and computer science using machine learning techniques, this work aimed to predict the intron and exon regions in filamentous fungi genes, as well as to translate the identified regions into protein codons. The problem was modeled as a supervised learning problem, trained on a set of genes obtained from GenBank that already have their intron and exon regions identified. The machine learning model used was Conditional Random Fields (CRF). The values of the metrics applied to the model show that it is possible to achieve good precision in identifying the intron and exon regions as well as the protein codons. Thus, although a greater diversity of database characteristics is needed to support the effectiveness of identifying the splicing sites, this paper gives evidence that these splicing sites can be predicted with good accuracy.
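The final step described in the abstract, turning predicted exon regions into codons, can be illustrated with a minimal sketch. It assumes a per-nucleotide labeling with `E` for exon and `I` for intron, such as a sequence tagger like a CRF might produce. The abstract does not specify the paper's actual label scheme, so the labels, the function names `splice_exons` and `to_codons`, and the toy gene below are all hypothetical.

```python
from typing import List


def splice_exons(sequence: str, labels: str) -> str:
    """Keep only the nucleotides labeled 'E' (exon); drop those labeled 'I' (intron)."""
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    return "".join(nt for nt, tag in zip(sequence, labels) if tag == "E")


def to_codons(coding: str) -> List[str]:
    """Split a coding sequence into complete triplets, ignoring a trailing partial codon."""
    usable = len(coding) - len(coding) % 3
    return [coding[i:i + 3] for i in range(0, usable, 3)]


# Toy gene (hypothetical): exon "ATGGCC", an intron starting with GT and
# ending with AG, then exon "GAATAA".
gene = "ATGGCC" + "GTAAGTTTTAG" + "GAATAA"
# Assumed per-nucleotide labels, standing in for a model's predictions.
labels = "E" * 6 + "I" * 11 + "E" * 6

coding_sequence = splice_exons(gene, labels)
codons = to_codons(coding_sequence)
print(coding_sequence)
print(codons)
```

In this sketch the codon split is only meaningful if the predicted exon boundaries are correct, since a boundary that is off by one nucleotide shifts the reading frame of every codon after it.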