ProtoCode: Leveraging Large Language Models for Automated Generation of Machine-Readable Protocols from Scientific Publications
arXiv (2023)
Abstract
Protocol standardization and sharing are crucial for reproducibility in life
sciences. In spite of numerous efforts for standardized protocol description,
adherence to these standards in literature remains largely inconsistent.
Curation of protocols is especially challenging due to the labor-intensive
process, requiring expert domain knowledge of each experimental procedure.
Recent advancements in Large Language Models (LLMs) offer a promising solution
to interpret and curate knowledge from complex scientific literature. In this
work, we develop ProtoCode, a tool leveraging fine-tuned LLMs to curate
protocols that are interpretable by both human and machine interfaces. Our
proof-of-concept, focused on polymerase chain reaction (PCR) protocols,
retrieves information from PCR protocols with an accuracy of 69-100%,
depending on the information content. In all the tested protocols, we
demonstrate that ProtoCode successfully converts literature-based protocols
into correct operational files for multiple thermal cycler systems. In
conclusion, ProtoCode can alleviate labor-intensive curation and
standardization of life science protocols to enhance research reproducibility
by providing a reliable, automated means to process and standardize protocols.
ProtoCode is freely available as a web server at
https://curation.taxila.io/ProtoCode/.
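
To make the idea of a machine-readable PCR protocol concrete, the following is a minimal sketch of how the thermal-cycling parameters in a published protocol might be represented and serialized. The schema, the class names (Step, PcrProtocol), and the example temperatures and times are assumptions chosen for illustration; they are not ProtoCode's actual format or output.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Step:
    """One thermal-cycling step (hypothetical schema, not ProtoCode's)."""
    name: str             # e.g. "denaturation"
    temperature_c: float  # hold temperature in degrees Celsius
    duration_s: int       # hold time in seconds


@dataclass
class PcrProtocol:
    """A PCR program: initial step, a repeated cycle, and a final step."""
    initial: Step
    cycle: List[Step]
    cycles: int
    final: Step

    def to_json(self) -> str:
        # Serialize to a generic machine-readable form that a separate
        # converter could translate into an instrument-specific file.
        return json.dumps(asdict(self), indent=2)

    def total_seconds(self) -> int:
        # Sum of hold times only; ramping between temperatures is ignored.
        per_cycle = sum(step.duration_s for step in self.cycle)
        return (self.initial.duration_s
                + self.cycles * per_cycle
                + self.final.duration_s)


# Illustrative values only, not taken from the paper.
protocol = PcrProtocol(
    initial=Step("initial denaturation", 95.0, 180),
    cycle=[
        Step("denaturation", 95.0, 30),
        Step("annealing", 55.0, 30),
        Step("extension", 72.0, 60),
    ],
    cycles=30,
    final=Step("final extension", 72.0, 300),
)

print(protocol.to_json())
print(protocol.total_seconds())
```

A structured record of this kind is the sort of intermediate that could be checked field by field against the source text before being converted for a particular thermal cycler.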