Background: Wavelength selection is a key step in spectral analysis and plays an essential role in improving model prediction accuracy and computational efficiency. High-dimensional spectral datasets contain substantial irrelevant information and many redundant variables. Numerous existing wavelength selection methods address this problem, but they struggle to balance strong wavelength interpretability against prediction accuracy. A new method that achieves this balance is therefore urgently needed.
Results: We propose a new framework for wavelength selection based on wavelength importance clustering (WIC). WIC uses a clustering algorithm to establish a hierarchical relationship between wavelength points and their attributions to the response, and then performs combination and filtering to obtain the optimal wavelength combinations. In this paper, a new wavelength selection method (WIC-WRCKF) is constructed based on WIC and compared with four commonly used wavelength selection methods. Extensive experiments are carried out on three publicly available datasets: wheat, corn, and tablets. Among the compared methods, WIC-WRCKF achieves the highest prediction accuracy on all three datasets, with high stability, while selecting a small number of highly interpretable wavelengths, indicating that it has better predictive ability than the other methods.
Significance: The proposed wavelength selection method can significantly improve model prediction accuracy, and the WIC framework can effectively exploit the essential information in spectral data, giving it great potential for wavelength selection applications.
Keywords: Clustering; Near infrared spectroscopy; Wavelength importance; Wavelength selection.
Copyright © 2024 Elsevier B.V. All rights reserved.