Internet Electronic Journal of Molecular Design - IEJMD, ISSN 1538-6414, CODEN IEJMAT
ABSTRACT - Internet Electron. J. Mol. Des. February 2005, Volume 4, Number 2, 181-193
Support Vector Regression Quantitative Structure-Activity Relationships (QSAR)
for Benzodiazepine Receptor Ligands
Ovidiu Ivanciuc
Internet Electron. J. Mol. Des. 2005, 4, 181-193
Abstract:
Support vector machines were developed by Vapnik as an
effective algorithm for determining an optimal hyperplane to
separate two classes of patterns. Comparative studies showed
that support vector classification (SVC) usually gives better
predictions than other classification methods. In a short period of
time SVC found significant applications in bioinformatics and
computational biology, such as cancer diagnosis, prediction of
protein fold, secondary structure, protein-protein interactions,
and subcellular localization. Using various loss functions, the
support vector method was extended for regression (support
vector regression, SVR). SVR can have significant applications
in QSAR (quantitative structure-activity relationships) if it is
able to predict better than other well-established QSAR models.
In this study we compare QSAR models obtained with multiple
linear regression (MLR) and SVR for the benzodiazepine
receptor affinity using a set of 52 pyrazolo[4,3-c]quinolin-3-ones.
Both models were developed with five structural
descriptors, namely the Hammett electronic parameter σ_{R'}, the
molar refractivity MR_{R8}, the Sterimol parameter L_{R'4'}, an
indicator variable I (1/0) for 7-substituted compounds, and the
Sterimol parameter B5_{R}. Extensive simulations using the dot,
polynomial, radial basis function, neural, and ANOVA kernels show
that the best predictions are obtained with the neural kernel. The
prediction power of the QSAR models was tested with complete
cross-validation: leave-one-out, leave-5%-out, leave-10%-out,
leave-20%-out, and leave-25%-out. While for the leave-one-out
test SVR is better than MLR (q^2_{LOO,MLR} = 0.481,
RMSE_{LOO,MLR} = 0.82; q^2_{LOO,SVR} = 0.511,
RMSE_{LOO,SVR} = 0.80), in the more difficult test of
leave-25%-out, MLR is better than SVR (q^2_{L25%O,MLR} = 0.470,
RMSE_{L25%O,MLR} = 0.83; q^2_{L25%O,SVR} = 0.432,
RMSE_{L25%O,SVR} = 0.86). The results obtained in the
present study indicate that SVR applications in QSAR must be
compared with other models, in order to determine if their use
brings any prediction improvement. Despite many over-optimistic
expectations, support vector regression can overfit the
data, and SVR predictions may be worse than those obtained
with linear models.
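
The five kernels named in the abstract can be illustrated with a minimal sketch. The functional forms below are the commonly used textbook definitions and are an assumption here, since the abstract does not state the exact formulas or parameter values used in the study. The parameter names (degree, gamma, a, b) and the two descriptor vectors are hypothetical.

```python
import math


def dot(x, y):
    """Inner product of two equal-length descriptor vectors."""
    return sum(a * b for a, b in zip(x, y))


def dot_kernel(x, y):
    # Linear ("dot") kernel: K(x, y) = x . y
    return dot(x, y)


def polynomial_kernel(x, y, degree=2):
    # Assumed form: K(x, y) = (x . y + 1)^degree
    return (dot(x, y) + 1.0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    # Assumed form: K(x, y) = exp(-gamma * ||x - y||^2)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def neural_kernel(x, y, a=0.5, b=0.0):
    # Assumed form (sigmoid/tanh kernel): K(x, y) = tanh(a * x . y + b)
    return math.tanh(a * dot(x, y) + b)


def anova_kernel(x, y, gamma=0.5, degree=2):
    # Assumed form: K(x, y) = (sum_i exp(-gamma * (x_i - y_i)^2))^degree
    s = sum(math.exp(-gamma * (xi - yi) ** 2) for xi, yi in zip(x, y))
    return s ** degree


# Two hypothetical compounds, each described by five made-up descriptor values.
x1 = [0.1, 1.0, 2.0, 1.0, 1.5]
x2 = [0.3, 0.5, 2.5, 0.0, 1.0]

for name, kernel in [
    ("dot", dot_kernel),
    ("polynomial", polynomial_kernel),
    ("radial basis function", rbf_kernel),
    ("neural", neural_kernel),
    ("anova", anova_kernel),
]:
    print(name, kernel(x1, x2))
```

Each function maps a pair of descriptor vectors to a similarity value; in SVR the chosen kernel replaces the plain inner product, which is what allows a nonlinear structure-activity model.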
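
The cross-validation statistics quoted in the abstract can be sketched as follows. This is a minimal illustration, assuming the usual definitions q^2 = 1 - PRESS / SS_total and RMSE = sqrt(PRESS / n), where PRESS is the sum of squared differences between observed values and cross-validated predictions. The abstract does not give the formulas or the way compounds were assigned to groups, so the fold assignment, the one-descriptor least-squares model standing in for MLR, and the data are all hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def leave_group_out_folds(n, n_folds):
    """Assign sample i to fold i % n_folds (an assumed, arbitrary scheme).

    n_folds == n gives leave-one-out; n_folds == 4 leaves about 25% out.
    """
    return [[i for i in range(n) if i % n_folds == k] for k in range(n_folds)]


def cross_validated_predictions(xs, ys, n_folds):
    """Predict every sample from a model fitted without its fold."""
    preds = [None] * len(xs)
    for test_idx in leave_group_out_folds(len(xs), n_folds):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        a, b = fit_line(train_x, train_y)
        for i in test_idx:
            preds[i] = a + b * xs[i]
    return preds


def q2_and_rmse(ys, preds):
    """Return (q^2, RMSE) computed from cross-validated predictions."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total, math.sqrt(press / n)


# Hypothetical descriptor values and activities, not data from the paper.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.2, 0.9, 2.3, 2.8, 4.1, 5.2, 5.8, 7.1]

q2_loo, rmse_loo = q2_and_rmse(
    ys, cross_validated_predictions(xs, ys, n_folds=len(xs)))
q2_l25, rmse_l25 = q2_and_rmse(
    ys, cross_validated_predictions(xs, ys, n_folds=4))
print("leave-one-out:", q2_loo, rmse_loo)
print("leave-25%-out:", q2_l25, rmse_l25)
```

Running the same procedure for two models on identical folds gives the kind of side-by-side q^2 and RMSE comparison reported for MLR and SVR. A model that predicts no better than the mean activity yields q^2 of zero under this definition.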