Peak-Based Machine Learning for Plastic Type Classification in Time-of-Flight Secondary Ion Mass Spectrometry

J Am Soc Mass Spectrom. 2024 Dec 4;35(12):3107-3115. doi: 10.1021/jasms.4c00325. Epub 2024 Nov 8.

Abstract

In this work, time-of-flight secondary ion mass spectrometry (ToF-SIMS) measurement data and machine learning were used to classify six types of plastics. To account for the characteristics of the measurement data, the local maxima (peaks) of the spectra were first identified in a preprocessing step. Several machine learning methods were then implemented to build a model that could classify the plastics. To visualize the data distribution, we applied principal component analysis, a dimensionality reduction method. Finally, to distinguish between the six types of plastics, we conducted an ensemble analysis using four tree-based algorithms: decision tree, random forest, gradient boosting, and LightGBM. This approach identifies the feature importance of the plastic samples and allows the chemical properties of each plastic type to be inferred. In this way, ToF-SIMS data can be used to classify plastics successfully and to make the classification more explainable.
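
The peak-based preprocessing step can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything below is an assumption made for illustration: the function names `find_local_maxima` and `peaks_to_features`, the `min_intensity` threshold, the grouping of peaks by nominal (integer) mass, and the normalization to total peak intensity are all hypothetical. The sketch shows how local maxima in a spectrum might be turned into a fixed-length feature vector of the kind a tree-based classifier could consume.

```python
from typing import Dict, List, Sequence, Tuple


def find_local_maxima(
    mz: Sequence[float],
    intensity: Sequence[float],
    min_intensity: float = 0.0,
) -> List[Tuple[float, float]]:
    """Return (m/z, intensity) pairs for points strictly higher than both neighbours.

    min_intensity is a hypothetical noise threshold, not a value from the paper.
    """
    peaks = []
    for i in range(1, len(intensity) - 1):
        if (
            intensity[i] > intensity[i - 1]
            and intensity[i] > intensity[i + 1]
            and intensity[i] >= min_intensity
        ):
            peaks.append((mz[i], intensity[i]))
    return peaks


def peaks_to_features(
    peaks: Sequence[Tuple[float, float]],
    nominal_masses: Sequence[int],
) -> List[float]:
    """Map peaks onto a fixed list of nominal masses (assumed feature layout).

    The strongest peak per nominal mass is kept, and the vector is
    normalized so that the selected intensities sum to 1.
    """
    strongest: Dict[int, float] = {}
    for peak_mz, peak_intensity in peaks:
        nominal = round(peak_mz)
        if peak_intensity > strongest.get(nominal, 0.0):
            strongest[nominal] = peak_intensity
    total = sum(strongest.get(m, 0.0) for m in nominal_masses)
    if total == 0:
        return [0.0 for _ in nominal_masses]
    return [strongest.get(m, 0.0) / total for m in nominal_masses]


# Synthetic toy spectrum (not measured data): three peaks near m/z 27, 29, 41.
mz = [26.9, 27.0, 27.1, 28.9, 29.0, 29.1, 40.9, 41.0, 41.1]
intensity = [5, 120, 8, 3, 60, 4, 2, 20, 1]

peaks = find_local_maxima(mz, intensity, min_intensity=10)
features = peaks_to_features(peaks, nominal_masses=[27, 29, 41, 43])
```

Each spectrum then becomes one row of a feature table with one column per mass. Because every column corresponds to a specific mass, the feature importances reported by a tree-based model can be traced back to individual secondary ions, which is the kind of explainability the abstract refers to.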

Keywords: PCA; ToF-SIMS; feature importance; machine learning; mass spectra preprocessing; tree-based algorithm.