Does the Use of Unusual Combinations of Datasets Contribute to Greater Scientific Impact?
CoRR(2024)
Abstract
Scientific datasets play a crucial role in contemporary data-driven research,
as they allow for the progress of science by facilitating the discovery of new
patterns and phenomena. The mounting demand for empirical research raises
important questions on how strategic data utilization in research projects can
stimulate scientific advancement. In this study, we examine a hypothesis
inspired by recombination theory, which suggests that innovative
combinations of existing knowledge, including the use of unusual combinations
of datasets, can lead to high-impact discoveries. We investigate the scientific
outcomes of such atypical data combinations in more than 30,000 publications
that leverage over 6,000 datasets curated within one of the largest social
science databases, ICPSR. This study offers four important insights. First,
combining datasets, particularly those infrequently paired, significantly
contributes to both scientific and broader impacts (e.g., dissemination to the
general public). Second, the combination of datasets with atypically combined
topics has the opposite effect – the use of such data is associated with fewer
citations. Third, younger and less experienced research teams tend to use
atypical combinations of datasets in research at a higher frequency than their
older and more experienced counterparts. Lastly, despite the benefits of data
combination, papers that amalgamate data remain infrequent. This finding
suggests that the unconventional combination of datasets is an under-utilized
but powerful strategy correlated with the scientific and broader impact of
scientific discoveries.