Biomedical engineers at Duke University have improved the accuracy of machine learning (ML) algorithms for molecular biology and drug development by programming them to identify gaps in datasets.
Duke's Daniel Reker explained, "With active machine learning, the algorithm is essentially able to ask questions or request more information if it is confused or senses a gap in the data, rather than passively sifting through it. This makes active-learning models very efficient at predicting performance."
The researchers tested their algorithm against models trained on a dataset of molecules with different properties and against 16 cutting-edge subsampling methods.
They found that active subsampling identified and predicted molecular characteristics more accurately than every standard subsampling framework tested, and in some cases outperformed models trained on the full dataset by up to 139%.
The researchers also found the algorithm sometimes required just 10% of the available data.
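The querying behavior Reker describes can be illustrated with a toy sketch. This is not the Duke team's algorithm or data: the one-dimensional pool, the `oracle` labeling function, the threshold model, and the names `fit_threshold` and `active_subsample` are all assumptions made up for illustration. The learner repeatedly asks for the label of the point it is least sure about (the one closest to its current decision boundary) instead of labeling the whole pool.

```python
def fit_threshold(labeled):
    """Toy model: place the boundary midway between the largest labeled
    negative and the smallest labeled positive (defaults: 0.0 and 1.0)."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return (lo + hi) / 2.0


def active_subsample(pool, oracle, budget):
    """Label at most `budget` points, always querying the unlabeled point
    closest to the current boundary (the most uncertain one)."""
    unlabeled = list(pool)
    labeled = []
    for _ in range(min(budget, len(unlabeled))):
        threshold = fit_threshold(labeled)
        # The "question" the learner asks: which label would help most?
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return labeled, fit_threshold(labeled)


def accuracy(threshold, pool, oracle):
    """Fraction of pool points the threshold model classifies correctly."""
    hits = sum((x > threshold) == bool(oracle(x)) for x in pool)
    return hits / len(pool)


# Hypothetical data: 1,000 points in [0, 1) with a hidden boundary at 0.37.
pool = [i / 1000 for i in range(1000)]
oracle = lambda x: int(x > 0.37)

labeled, threshold = active_subsample(pool, oracle, budget=20)
print(f"labeled {len(labeled)} of {len(pool)} points")
print(f"estimated boundary: {threshold:.4f}")
print(f"accuracy on full pool: {accuracy(threshold, pool, oracle):.3f}")
```

In this toy setting the learner labels only 2% of the pool, because each query roughly halves the region where the boundary could lie. Real molecular datasets are high-dimensional and noisy, so the savings reported in the article should not be read off this sketch.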
From Duke University Pratt School of Engineering
Abstracts Copyright © 2023 SmithBucklin, Washington, D.C., USA