A Robust Schema for Machine Learning Data and Models Within the Granta MI Information Management System

Brandon L. Hearley, Steven M. Arnold, Joshua Stuckner

AIAA SCITECH 2023 Forum (2023)

Abstract
Recent advances in the development of machine learning (ML) algorithms have enabled the creation of predictive models that can improve decision making, decrease computational cost, and improve efficiency in a variety of fields. As an organization begins to develop and implement such models, the data used in training, validating, and testing them, the model parameters, and the use cases or limitations of the models must be properly stored to ensure that the models are both fully traceable and used correctly. In the context of predicting material behavior, advances in computationally intensive, physics-based modeling of material behavior at various length scales, together with the emergence of Integrated Computational Materials Engineering (ICME), have driven the need for data-driven surrogate models of the physics-based simulation tools, built using ML techniques. Surrogate model development allows for accurate material behavior prediction at a fraction of the cost of its physics-based counterpart. This makes multiscale simulations of real-world applications feasible and further enables the design of fit-for-purpose materials for a reasonable computational investment. However, training such models requires extensive data, and thus effective data management is necessary to reach the full potential that ML can offer to material design and ICME. This paper proposes a generalized, robust schema that allows organizations to store both the real (experimental) and virtual (simulation) data used to train machine learning models and the defining model parameters and architectures. The developed schema allows for various types of data inputs and outputs, including single point values, time-series data, and images, that can be used for various types of machine learning models while following outlined best practices for effective data management.
An effective schema for machine learning data and models can help prevent the recreation of virtual/real training data and surrogate models, reduce the time needed to create new models similar to existing ones by offering a starting point in the hyperparameter determination stages, minimize the resources devoted to verification and validation (V&V) and certification of models, and, through full traceability of both the data and the ML model, ensure that data and surrogate models are not misused. It also gives organizations access to models that have already been developed, so that they can be used in the design of new materials, enabling the overall goals of ICME.
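
As a purely illustrative aid, the kind of record structure the abstract describes could be sketched in Python as follows. This is not the schema proposed in the paper and it does not use any Granta MI API; every class, field, and function name here (`DataRecord`, `ModelRecord`, `trace_training_data`, and so on) is a hypothetical assumption chosen only to show how data records and model records might be linked for traceability.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


# Hypothetical value types mirroring the three kinds of inputs/outputs
# named in the abstract: single point values, time series, and images.
@dataclass
class PointValue:
    value: float
    unit: str


@dataclass
class TimeSeries:
    times: List[float]
    values: List[float]
    unit: str


@dataclass
class ImageRef:
    path: str  # assumed: images are stored as files and referenced by path


DataValue = Union[PointValue, TimeSeries, ImageRef]


@dataclass
class DataRecord:
    """One training/validation/testing sample (hypothetical layout)."""
    record_id: str
    source: str  # "experimental" (real) or "simulation" (virtual)
    inputs: Dict[str, DataValue]
    outputs: Dict[str, DataValue]

    def __post_init__(self) -> None:
        if self.source not in ("experimental", "simulation"):
            raise ValueError(f"unknown data source: {self.source!r}")


@dataclass
class ModelRecord:
    """A stored ML model with links back to its training data."""
    model_id: str
    architecture: str
    hyperparameters: Dict[str, float]
    training_ids: List[str]  # IDs of the DataRecords used for training
    limitations: str = ""    # documented use cases or limits


def trace_training_data(model: ModelRecord,
                        records: List[DataRecord]) -> List[DataRecord]:
    """Return the data records a model was trained on, in the stored order.

    Raises KeyError if any linked record is missing, since a broken link
    would mean the model is no longer fully traceable.
    """
    by_id = {r.record_id: r for r in records}
    missing = [i for i in model.training_ids if i not in by_id]
    if missing:
        raise KeyError(f"missing training records: {missing}")
    return [by_id[i] for i in model.training_ids]
```

In this sketch, traceability comes from storing the IDs of the training records on the model record itself, so a lookup such as `trace_training_data(model, records)` either recovers the exact data behind a model or fails loudly when a link is broken.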
Keywords
robust schema, machine learning data, machine learning, management system, models