Machine learning methods for quantitative analysis of Raman spectroscopy data
Madden, Michael G.
Ryder, Alan G.
Michael G. Madden; Alan G. Ryder. Machine learning methods for quantitative analysis of Raman spectroscopy data. Proc. SPIE 4876, Opto-Ireland 2002: Optics and Photonics Technologies and Applications, 1130 (March 17, 2003); doi:10.1117/12.464039
The automated identification and quantification of illicit materials using Raman spectroscopy is of significant importance for law enforcement agencies. This paper explores the use of Machine Learning (ML) methods, in comparison with standard statistical regression techniques, for developing automated identification methods. The ML task is broken into two sub-tasks: data reduction and prediction. In well-conditioned data, the number of samples should be much larger than the number of attributes per sample, to limit the degrees of freedom in predictive models. In spectroscopy data of this kind, the opposite is normally true: each spectrum has many more attributes than there are samples. Predictive models based on such data have a high number of degrees of freedom, which increases the risk of over-fitting to the sample data and of poor predictive power. This work describes an approach to data reduction based on Genetic Algorithms. For the prediction sub-task, the objective is to estimate the concentration of a component in a mixture, based on its Raman spectrum and the known concentrations of previously seen mixtures. Here, Neural Networks and k-Nearest Neighbours are used for prediction. Preliminary results are presented for the problem of estimating the concentration of cocaine in solid mixtures, and are compared with previously published results in which statistical analysis of the same dataset was performed. Finally, this paper demonstrates how more accurate results may be achieved by using an ensemble of prediction techniques.
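The prediction sub-task and the ensemble idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (knn_predict, ensemble_predict), the synthetic single-peak spectra, the choice of k, and the two hand-picked attribute masks are all assumptions made for illustration. In particular, the masks merely stand in for attribute subsets that a Genetic Algorithm might select, and the ensemble here simply averages k-Nearest Neighbour predictions made on those subsets.

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Estimate a concentration as the mean of the k nearest training spectra."""
    dists = np.linalg.norm(train_X - query, axis=1)  # Euclidean distance to each sample
    nearest = np.argsort(dists)[:k]                  # indices of the k closest spectra
    return float(train_y[nearest].mean())

def ensemble_predict(train_X, train_y, query, masks, k=3):
    """Average k-NN estimates made on several attribute subsets (boolean masks)."""
    preds = [knn_predict(train_X[:, m], train_y, query[m], k) for m in masks]
    return float(np.mean(preds))

# Synthetic stand-in data (assumption): few samples, many attributes per sample.
rng = np.random.default_rng(0)
n_samples, n_points = 20, 200
axis = np.arange(n_points)
peak = np.exp(-0.5 * ((axis - 80) / 5.0) ** 2)       # one hypothetical Raman band
concentrations = np.linspace(0.0, 1.0, n_samples)    # known training concentrations
spectra = concentrations[:, None] * peak + 0.01 * rng.normal(size=(n_samples, n_points))

# Hand-picked masks standing in for GA-selected attribute subsets (assumption).
masks = [(axis >= 60) & (axis < 100), (axis >= 70) & (axis < 90)]

query = 0.5 * peak                                   # unseen mixture, true concentration 0.5
prediction = ensemble_predict(spectra, concentrations, query, masks)
print(f"estimated concentration: {prediction:.3f}")
```

Restricting each predictor to a small subset of attributes reflects the data-reduction motivation given above: with 20 samples and 200 attributes, distances computed over the full spectrum are dominated by noise-only channels.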