
dc.contributor.author: Garg, Ankita
dc.contributor.author: Enright, Catherine G.
dc.contributor.author: Madden, Michael G.
dc.date.accessioned: 2015-07-16T14:17:15Z
dc.date.available: 2015-07-16T14:17:15Z
dc.date.issued: 2015-04-22
dc.identifier.citation: Ankita Garg, Catherine G. Enright, Michael G. Madden (2015) 'Improving Spectral Library Search by Redefining Similarity Measures'. Journal Of Chemical Information And Modeling, 55 (5):963-971. [en_US]
dc.identifier.uri: http://hdl.handle.net/10379/5085
dc.description.abstract: Similarity plays a central role in spectral library search. The goal of spectral library search is to identify those spectra in a reference library of known materials that most closely match an unknown query spectrum, on the assumption that this will allow us to identify the main constituent(s) of the query spectrum. The similarity measures used for this task in software and the academic literature are almost exclusively metrics, meaning that the measures obey the three axioms of metrics: (1) minimality; (2) symmetry; (3) triangle inequality. Consequently, they implicitly assume that the query spectrum is drawn from the same distribution as that of the reference library. In this paper, we demonstrate that this assumption is not necessary in practical spectral library search and that in fact it is often violated in practice. Although the reference library may be constructed carefully, it is generally impossible to guarantee that all future query spectra will be drawn from the same distribution as the reference library. Before evaluating different similarity measures, we need to understand how they define the relationship between spectra. In spectral library search, we often aim to find the constituent(s) of a mixture. We propose that rather than asking which reference library spectra are similar to the mixture, we should ask which of the reference library spectra are contained in the given query mixture. This question is inherently asymmetric. Therefore, we should adopt a nonmetric measure. To evaluate our hypothesis, we apply a nonmetric measure formulated by Tversky known as the Contrast Model and compare its performance to the well-known Jaccard similarity index metric on spectroscopic data sets. Our results show that the Tversky similarity measure yields better results than the Jaccard index. [en_US]
dc.format: application/pdf [en_US]
dc.language.iso: en [en_US]
dc.publisher: Journal Of Chemical Information And Modeling [en_US]
dc.relation.ispartof: Journal Of Chemical Information And Modeling [en]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Ireland
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/3.0/ie/
dc.subject: Spectral library search [en_US]
dc.subject: Query spectrum [en_US]
dc.subject: Metrics [en_US]
dc.subject: Contrast model [en_US]
dc.subject: Data sets [en_US]
dc.title: Improving spectral library search by redefining similarity measures [en_US]
dc.type: Article [en_US]
dc.date.updated: 2015-06-10T11:06:52Z
dc.identifier.doi: 10.1021/acs.jcim.5b00077
dc.local.publishedsource: http://dx.doi.org/10.1021/acs.jcim.5b00077 [en_US]
dc.description.peer-reviewed: peer-reviewed
dc.contributor.funder: EU
dc.internal.rssid: 8975800
dc.local.contact: Catherine Enright, Information Technology, NUI Galway. Email: catherine.enright@nuigalway.ie
dc.local.copyrightchecked: Yes
dc.local.version: ACCEPTED
nui.item.downloads: 730
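
The distinction the abstract draws between the symmetric Jaccard index and Tversky's asymmetric Contrast Model can be illustrated with a small sketch. This is not the authors' implementation: representing each spectrum as a set of integer peak positions, using set size as the salience function, the parameter values (theta=1, alpha=1, beta=0), and all names and peak values below are assumptions made purely for illustration. The Contrast Model scores a pair as theta*f(A∩B) - alpha*f(A-B) - beta*f(B-A); setting beta to 0 asks only whether the reference is contained in the query and ignores query peaks contributed by other constituents.

```python
def jaccard(a, b):
    """Symmetric Jaccard index: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def tversky_contrast(reference, query, theta=1.0, alpha=1.0, beta=0.0):
    """Tversky's Contrast Model with set size as the salience function f.

    score = theta*f(R ∩ Q) - alpha*f(R - Q) - beta*f(Q - R)
    The default weights are illustrative assumptions: with beta = 0, query
    peaks that are absent from the reference are not penalised.
    """
    reference, query = set(reference), set(query)
    common = len(reference & query)
    only_reference = len(reference - query)
    only_query = len(query - reference)
    return theta * common - alpha * only_reference - beta * only_query


# Hypothetical peak positions (arbitrary units), invented for this example.
query_mixture = {100, 250, 400, 520, 610}
pure_component = {100, 250}               # fully contained in the mixture
lookalike = {100, 250, 400, 700, 800}     # shares peaks but has extras

jaccard_pure = jaccard(pure_component, query_mixture)
jaccard_lookalike = jaccard(lookalike, query_mixture)
tversky_pure = tversky_contrast(pure_component, query_mixture)
tversky_lookalike = tversky_contrast(lookalike, query_mixture)
tversky_reversed = tversky_contrast(query_mixture, pure_component)
```

With these made-up peaks, the Jaccard index ranks the lookalike above the component that is actually contained in the mixture, while the Contrast Model with beta = 0 ranks the contained component first. Swapping the arguments changes the Contrast Model score, which is the asymmetry the abstract refers to.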




Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Ireland