
MilkOligoThesaurus, a dataset of mammalian milk oligosaccharide synonyms

Data in Brief (2024)

Abstract
Interest in milk oligosaccharides (MOs) is growing because of their numerous benefits for newborn and long-term health. A large number of MO structures have been identified in mammalian milk. Although described mostly in human milk, oligosaccharide richness, albeit less broad, has also been reported for a wide range of mammalian species. MO structures are particularly difficult to report because they result from the combination of 5 monosaccharides linked by various glycosidic bonds, forming structurally diverse and complex matrices of linear and branched oligosaccharides. Exploring the literature and extracting relevant information on MO diversity within or across species is a promising way to elucidate the structure-function relationships of MOs. Currently, given the complexity of these molecules, the main obstacle to such literature exploration is the heterogeneity in the way authors refer to them. Herein, we provide a thesaurus (MilkOligoThesaurus) including the names and synonyms of MOs collected from key selected articles on mammalian milk analyses. MilkOligoThesaurus gathers the names of the MOs with a complete description of their monosaccharide composition and structures. When available, each unique MO molecule is linked to its ID in the NCBI PubChem and ChEBI databases. MilkOligoThesaurus is provided in a tabular format. It gathers 245 unique oligosaccharide structures described by 22 features (columns), including the name of the molecule, its abbreviation, the chemical database IDs if available, the monosaccharide composition, chemical information (molecular formula, monoisotopic mass), synonyms, its formula in condensed form and in abbreviated condensed form, the abbreviated systematic name, the systematic name, the isomer group, and the scientific article sources. MilkOligoThesaurus is also provided in the SKOS (Simple Knowledge Organization System) format.
This thesaurus is a valuable resource that gathers MO naming variations not found elsewhere. It is intended for (i) text and data mining, to enable automatic annotation and rapid extraction of milk oligosaccharide data from scientific papers; and (ii) biology researchers aiming to search for or decipher the structure of milk oligosaccharides based on any of their names, abbreviations, or monosaccharide compositions and linkages.
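
As a rough illustration of the text-mining use described above, the following Python sketch maps name variants found in a paper to one preferred MO name. It is a minimal sketch under stated assumptions, not the dataset's actual schema: the column names (`name`, `abbreviation`, `synonyms`), the `|` synonym separator, and the two sample rows are hypothetical stand-ins for the real table. Prime marks are kept during normalization because they can distinguish different molecules.

```python
import csv
import io
import re

# Illustrative rows only. The column names and the "|" synonym separator
# are assumptions, not the actual MilkOligoThesaurus layout.
SAMPLE_TSV = (
    "name\tabbreviation\tsynonyms\n"
    "2'-fucosyllactose\t2'-FL\t2'-O-fucosyllactose\n"
    "3'-sialyllactose\t3'-SL\t3'-N-acetylneuraminyllactose\n"
)


def normalize(term):
    """Lowercase, unify prime characters, and drop whitespace and hyphens.

    Prime marks are preserved: removing them could merge distinct molecules.
    """
    term = term.lower().replace("\u2019", "'").replace("\u2032", "'")
    return re.sub(r"[\s\-]", "", term)


def build_index(tsv_text):
    """Map every normalized name, abbreviation, and synonym to the preferred name."""
    index = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        variants = [row["name"], row["abbreviation"]]
        variants += [s for s in row["synonyms"].split("|") if s]
        for variant in variants:
            index[normalize(variant)] = row["name"]
    return index


def lookup(term, index):
    """Return the preferred name for a term, or None if it is not indexed."""
    return index.get(normalize(term))


index = build_index(SAMPLE_TSV)
```

With the real table, `SAMPLE_TSV` would be replaced by the contents of the published file and the column names adjusted to match it; `lookup("2\u2032FL", index)` then resolves a typographic variant of an abbreviation to its preferred name.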
Key words
chemical nomenclature, normalized milk oligosaccharide name, milk oligosaccharide monoisotopic mass, milk oligosaccharide monosaccharide composition, oligosaccharide isomer name, vocabulary extraction, systematic names