A Data Set Comparison Method Using Noise Statistics Applied to VUV Spectrum Match Determinations

Analytical Chemistry (2022)

Abstract
It has been demonstrated that a pair of spectra with a coefficient of determination (R²) as low as 0.976 can originate from the same chemical species, while a different pair of spectra with an R² as high as 0.9997 can originate from different chemical species. The R² between overlaid spectra depends on the signal-to-noise ratio, whereas the residual between two spectra should look like noise only when both spectra originate from the same chemical species. Numerical characteristics of the residual between two high-resolution spectra are invaluable for definitively eliminating many plausible matches of reference spectra to the sample spectra of analytes eluted from two-dimensional gas chromatography. Additionally, numerical characteristics beyond R² facilitate a logical ranking of all plausible matches, making positive identification of a single-component analyte possible provided a reference spectrum exists for every plausible match. Specifically, the experimental background noise is shown to follow a Gaussian distribution at all wavelengths, and a method is described to normalize the data such that the numerically adjusted noise distributions are independent of wavelength. The differences between matching spectra are further shown to exhibit numerical characteristics consistent with the Gaussian distribution of the background noise, common to all wavelengths. Seven criteria are described for judging the similarity between spectra: the R² between the two spectra; the R² of a Q-Q plot in which one axis is the ideal Gaussian quantiles and the other is the quantiles of the numerically adjusted residual; the maximum count of consecutive (by wavelength) identical signs in the residual; and the first four moments of the residuals.
One exemplar application of the methodology is the definitive match of n-undecane, n-dodecane, and n-tridecane sample spectra to their corresponding reference spectra; these are among the most challenging species within the volatility range of jet fuel to differentiate by spectral methods. While this example is a significant stress test of the approach, the general utility of the methodology lies in the subtle math and transparent criteria that unambiguously identify mismatches: the distributions of residuals between mismatching spectra are very clearly not Gaussian and have a high consecutive sign count, even when the R² between the compared spectra is ambiguous.
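The seven criteria named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the normalization, so the sketch assumes it amounts to dividing the residual at each wavelength by a known standard deviation of the residual noise at that wavelength. The function names, the `residual_noise_sd` parameter, the Q-Q plotting positions, and the synthetic Gaussian-shaped spectrum in the demo are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist, fmean


def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def qq_r_squared(z):
    """R^2 of the sorted values against ideal standard-normal quantiles."""
    n = len(z)
    # Plotting positions (i + 0.5) / n are one common convention (assumed here).
    ideal = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return r_squared(ideal, sorted(z))


def max_sign_run(residual):
    """Longest run of consecutive residuals that share the same sign."""
    best = run = prev = 0
    for r in residual:
        sign = (r > 0) - (r < 0)
        if sign != 0 and sign == prev:
            run += 1
        else:
            run = 1 if sign != 0 else 0  # exact zeros break a run
        best = max(best, run)
        prev = sign
    return best


def four_moments(z):
    """Mean, variance, skewness, and (non-excess) kurtosis of z."""
    n = len(z)
    mean = fmean(z)
    var = sum((v - mean) ** 2 for v in z) / n
    sd = math.sqrt(var)
    skew = sum(((v - mean) / sd) ** 3 for v in z) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in z) / n
    return mean, var, skew, kurt


def compare_spectra(sample, reference, residual_noise_sd):
    """Compute the seven similarity criteria for one sample/reference pair."""
    residual = [s - r for s, r in zip(sample, reference)]
    # Assumed normalization: scale each wavelength by its residual noise sd.
    z = [d / sd for d, sd in zip(residual, residual_noise_sd)]
    mean, var, skew, kurt = four_moments(z)
    return {
        "r2_spectra": r_squared(sample, reference),
        "r2_qq": qq_r_squared(z),
        "max_sign_run": max_sign_run(z),
        "mean": mean,
        "variance": var,
        "skewness": skew,
        "kurtosis": kurt,
    }


if __name__ == "__main__":
    random.seed(0)
    # Synthetic "spectrum": a single smooth peak plus Gaussian noise.
    truth = [math.exp(-((i - 200) / 60) ** 2) for i in range(400)]
    a = [t + random.gauss(0, 0.01) for t in truth]
    b = [t + random.gauss(0, 0.01) for t in truth]
    # The difference of two independent noisy spectra has sd = sqrt(2) * 0.01.
    print(compare_spectra(a, b, [0.01 * math.sqrt(2)] * 400))
```

Under these assumptions, a matching pair should give normalized residuals with a mean near 0, variance near 1, skewness near 0, kurtosis near 3, a Q-Q R² close to 1, and a short maximum sign run. A mismatch with a smooth systematic difference would be expected to show long sign runs and a clearly non-Gaussian residual, even when the R² between the spectra themselves remains high.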
Keywords
data set comparison method, noise statistics, spectrum