Comparative Characterization of Crofelemer Samples Using Data Mining and Machine Learning Approaches With Analytical Stability Data Sets

Maulik K Nariya; Jae Hyun Kim; Jian Xiong; Peter A Kleindl; Asha Hewarathna; Adam C Fisher; Sangeeta B Joshi; Christian Schöneich; M Laird Forrest; C Russell Middaugh; David B Volkin; Eric J Deeds

doi:10.1016/j.xphs.2017.07.013

J Pharm Sci. 2017 Nov;106(11):3270-3279. doi: 10.1016/j.xphs.2017.07.013. Epub 2017 Jul 22.

Affiliations

¹ Department of Physics and Astronomy, University of Kansas, Lawrence, Kansas 66045.
² Department of Pharmaceutical Chemistry, University of Kansas, Lawrence, Kansas 66045.
³ Department of Pharmaceutical Chemistry, University of Kansas, Lawrence, Kansas 66045; Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66045.
⁴ Center for Drug Evaluation and Research, Office of Pharmaceutical Quality, U.S. Food and Drug Administration, Silver Spring, Maryland 20993.
⁵ Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas 66045; Center for Computational Biology, University of Kansas, Lawrence, Kansas 66045; Santa Fe Institute, Santa Fe, New Mexico 87501. Electronic address: deeds@ku.edu.

Abstract

There is growing interest in generating physicochemical and biological analytical data sets to compare complex mixture drugs, for example, products from different manufacturers. In this work, we compare various crofelemer samples prepared from a single lot by filtration with varying molecular weight cutoffs combined with incubation for different times at different temperatures. The 2 preceding articles describe experimental data sets generated from analytical characterization of fractionated and degraded crofelemer samples. In this work, we use data mining techniques such as principal component analysis and mutual information scores to help visualize the data and determine discriminatory regions within these large data sets. The mutual information score identifies chemical signatures that differentiate crofelemer samples. These signatures, in many cases, would likely be missed by traditional data analysis tools. We also found that supervised learning classifiers robustly discriminate samples with around 99% classification accuracy, indicating that mathematical models of these physicochemical data sets are capable of identifying even subtle differences in crofelemer samples. Data mining and machine learning techniques can thus identify fingerprint-type attributes of complex mixture drugs that may be used for comparative characterization of products.
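As a rough illustration of the three techniques named in the abstract (principal component analysis, a mutual information score per measured variable, and a supervised classifier), the following is a minimal sketch on synthetic data. Nothing in it comes from the paper: the simulated "spectra", the two sample classes, the small extra peak near x = 0.8, the histogram-based mutual information estimate, and the nearest-centroid classifier are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import numpy as np

def make_synthetic_spectra(n_per_class=30, n_points=200, seed=0):
    """Simulate two classes of noisy spectra (assumed data, not crofelemer measurements)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    base = np.exp(-((x - 0.5) ** 2) / 0.02)           # band shared by both classes
    spectra, labels = [], []
    for label, height in enumerate([0.0, 0.3]):
        # Class 1 carries a small extra peak near x = 0.8 (the "signature").
        peak = height * np.exp(-((x - 0.8) ** 2) / 0.001)
        noise = rng.normal(0.0, 0.05, size=(n_per_class, n_points))
        spectra.append(base + peak + noise)
        labels.append(np.full(n_per_class, label))
    return x, np.vstack(spectra), np.concatenate(labels)

def pca_scores(X, n_components=2):
    """Project mean-centered rows of X onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def mutual_information(feature, labels, n_bins=8):
    """Histogram estimate (in bits) of mutual information between one variable and the class label."""
    edges = np.histogram_bin_edges(feature, bins=n_bins)
    binned = np.digitize(feature, edges[1:-1])         # bin indices 0 .. n_bins-1
    joint = np.zeros((n_bins, labels.max() + 1))
    np.add.at(joint, (binned, labels), 1)              # joint count table
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0                                     # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))

def nearest_centroid_accuracy(X, y, test_fraction=0.3, seed=1):
    """Hold out a random test set and classify each test row by its closest class mean."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = order[:n_test], order[n_test:]
    classes = np.unique(y[train])
    centroids = np.stack([X[train][y[train] == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X[test][:, None, :] - centroids[None, :, :], axis=2)
    predicted = classes[np.argmin(dists, axis=1)]
    return float(np.mean(predicted == y[test]))

x, X, y = make_synthetic_spectra()
scores = pca_scores(X)                                 # 2-D coordinates for visualization
mi = np.array([mutual_information(X[:, j], y) for j in range(X.shape[1])])
print("most discriminatory position:", x[int(np.argmax(mi))])
print("held-out accuracy:", nearest_centroid_accuracy(X, y))
```

In this toy setting the mutual information score should peak around the position of the simulated extra peak, which mirrors the idea of locating discriminatory regions in a large data set. The numbers it prints say nothing about the accuracy reported in the abstract.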

Keywords: comparative characterization; crofelemer; data mining; supervised learning.

Publication types

Comparative Study
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, P.H.S.

MeSH terms

Antidiarrheals / chemistry*
Antidiarrheals / pharmacology
Cell Line
Chloride Channels / antagonists & inhibitors*
Chloride Channels / metabolism
Circular Dichroism
Data Mining
Drug Stability
Humans
Machine Learning
Principal Component Analysis
Proanthocyanidins / chemistry*
Proanthocyanidins / pharmacology
Spectrophotometry, Ultraviolet
Spectroscopy, Fourier Transform Infrared

Substances

Antidiarrheals
Chloride Channels
Proanthocyanidins
crofelemer
