Comprehensive ensemble

Comprehensive Ensemble is a multi-subject ensemble. Instead of limiting ensemble diversity to a single subject, it comprehensively combines individual models that vary across several subjects: bagging, learning methods, and chemical compound input representations.
The ensemble includes a new type of individual QSAR classifier: an end-to-end neural network model based on 1D-CNN and RNN that automatically extracts sequential features from SMILES.
The set of models is combined by second-level combining learning (meta-learning), and the learned weights of the meta-learner provide an interpretation of the importance of each individual model.
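
To make the SMILES input concrete, the sketch below shows one common way to turn SMILES strings into fixed-length integer sequences that a 1D-CNN or RNN could consume. It is only an illustration: the function names `build_vocab` and `encode`, the character-level tokenization, and the padding id 0 are assumptions, not the tokenizer used by this project.

```python
# Hypothetical helpers for illustration; the project's own preprocessing may differ.

def build_vocab(smiles_list):
    """Map every character seen in the SMILES strings to an integer id.

    Id 0 is reserved for padding (and, in this sketch, for unseen characters).
    """
    chars = sorted({ch for s in smiles_list for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode(smiles, vocab, max_len):
    """Turn one SMILES string into a list of exactly max_len integer ids."""
    ids = [vocab.get(ch, 0) for ch in smiles[:max_len]]  # truncate long strings
    return ids + [0] * (max_len - len(ids))              # right-pad short ones


smiles_list = ["CCO", "c1ccccc1", "CC(=O)O"]  # ethanol, benzene, acetic acid
vocab = build_vocab(smiles_list)
encoded = [encode(s, vocab, max_len=10) for s in smiles_list]
```

Character-level tokenization splits two-character atoms such as `Cl` and `Br` into separate symbols; a production tokenizer would likely treat them as single tokens.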

Main Algorithm

Our proposed ensemble learning procedure is divided into two levels: first-level individual learning and second-level combining learning. In the first level, individual models are trained with diversified learning algorithms and chemical compound representations. The prediction probabilities produced by the first-level models are then used as inputs to the second level. The second-level combining learning makes the final decision by learning the importance of each individual model from these first-level predictions.
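
The two-level procedure can be sketched with a toy example. The code below assumes the first-level models have already produced prediction probabilities and fits a plain logistic regression on them as the second-level learner. The choice of logistic regression, the function names `fit_meta_learner` and `predict`, and the toy numbers are all assumptions made for illustration; the project's actual meta-learner may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_learner(first_level_probs, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on first-level probabilities (batch gradient descent).

    first_level_probs: one row per compound, one column per first-level model.
    Returns one weight per first-level model plus a bias term.
    """
    n_models = len(first_level_probs[0])
    n = len(labels)
    weights = [0.0] * n_models
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_models
        grad_b = 0.0
        for probs, y in zip(first_level_probs, labels):
            z = sum(w * p for w, p in zip(weights, probs)) + bias
            err = sigmoid(z) - y  # gradient of the log loss with respect to z
            for j, p in enumerate(probs):
                grad_w[j] += err * p
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(probs, weights, bias):
    """Final (second-level) probability for one compound."""
    return sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias)


# Toy first-level outputs: column 0 is an informative model,
# column 1 is a model whose output does not depend on the label.
first_level_probs = [
    [0.9, 0.5], [0.8, 0.4], [0.7, 0.6],  # active compounds
    [0.1, 0.5], [0.2, 0.4], [0.3, 0.6],  # inactive compounds
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_learner(first_level_probs, labels)
```

In this toy setup the informative model ends up with the larger weight magnitude, which is the sense in which the learned weights can be read as model importance. In practice the first-level probabilities fed to the meta-learner should come from held-out data, so that the second level does not simply reward overfitted first-level models.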

P R O G R A M

D A T A S E T s

1851_1a2 (5,902/6,974)
1851_2c19 (5,840/7,135)
1851_2c9 (4,065/8,361)
1851_2d6 (2,601/10,826)
1851_3a4 (5,175/7,446)
1915 (2,219/1,017)
2358 (1,006/934)
463213 (4,138/3,234)
463215 (2,941/1,695)
488912 (2,491/3,705)
488915 (3,568/2,628)
488917 (4,283/1,913)
488918 (3,691/2,505)
492992 (2,094/2,820)
504607 (4,825/1,406)
624504 (3,944/1,090)
651739 (4,043/1,322)
651744 (3,099/2,303)
652065 (2,965/1,286)
chemDB (593,170)