Ethical considerations in utilizing artificial intelligence for analyzing the NHGRI’s History of Genomics and Human Genome Project archives

Christopher Donohue, Kris Wetterstrand, Kristi Holmes, Luis A. Nunes Amaral, Mohammad Hosseini, Spencer Hong, Thomas Stoeger

Journal of eScience Librarianship (2024)

Abstract
Understanding “how to optimize the production of scientific knowledge” is paramount to those who support scientific research—funders as well as research institutions—to the communities served, and to researchers. Structured archives can help all involved to learn what decisions and processes help or hinder the production of new knowledge. Using artificial intelligence (AI) and large language models (LLMs), we recently created the first structured digital representation of the historic archives of the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health. This work yielded a digital knowledge base of entities, topics, and documents that can be used to probe the inner workings of the Human Genome Project, a massive international public-private effort to sequence the human genome, and several of its offshoots like The Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE). The resulting knowledge base will be instrumental in understanding not only how the Human Genome Project and genomics research developed collaboratively, but also how scientific goals come to be formulated and evolve. Given the diverse and rich data used in this project, we evaluated the ethical implications of employing AI and LLMs to process and analyze this valuable archive. As the first computational investigation of the internal archives of a massive collaborative project with multiple funders and institutions, this study will inform future efforts to conduct similar investigations while also considering and minimizing ethical challenges. 
Our methodology and risk-mitigating measures could also inform future initiatives in developing standards for project planning, policymaking, enhancing transparency, and ensuring ethical utilization of artificial intelligence technologies and large language models in archive exploration.

Author Contributions: Mohammad Hosseini: Investigation; Project administration; Writing – original draft; Writing – review & editing. Spencer Hong: Conceptualization; Data curation; Investigation; Methodology; Software; Visualization; Writing – original draft; Writing – review & editing. Thomas Stoeger: Conceptualization; Investigation; Project administration; Supervision; Writing – original draft; Writing – review & editing. Kristi Holmes: Funding acquisition; Supervision; Writing – review & editing. Luis A. Nunes Amaral: Funding acquisition; Supervision; Writing – review & editing. Christopher Donohue: Conceptualization; Project administration; Resources; Supervision; Writing – original draft; Writing – review & editing. Kris Wetterstrand: Conceptualization; Funding acquisition; Project administration.
Keywords
artificial intelligence, AI, large language models, National Human Genome Research Institute (U.S.), Human Genome Project, ethics