Internet Electronic Journal of Molecular Design - IEJMD, ISSN 1538-6414, CODEN IEJMAT
ABSTRACT - Internet Electron. J. Mol. Des. June 2004, Volume 3, Number 6, 316-334
Quantitative Structure-Electrochemistry Relationship Study
of Some Organic Compounds Using PC-ANN and PCR
Bahram Hemmateenejad and Mojtaba Shamsipur
Internet Electron. J. Mol. Des. 2004, 3, 316-334
Abstract:
A QSPR analysis has been conducted on the half-wave
reduction potential (E1/2) of a diverse set of organic compounds
by means of principal component regression (PCR) and
principal component-artificial neural network (PC-ANN)
modeling methods. A genetic algorithm was employed as a factor
selection procedure for both modeling methods. The results
were compared with those of two other factor selection methods,
namely eigenvalue ranking (EV) and correlation ranking (CR).
Using the Dragon software, more than 1000
structural descriptors were calculated for each molecule. The
descriptor data matrix was subjected to principal component
analysis and the most significant principal components (PCs)
were extracted. Multiple linear regression and an artificial neural
network were employed for linear and nonlinear modeling,
respectively, between the extracted principal components and E1/2.
First, the principal components were ranked by decreasing
eigenvalue and entered successively into each model.
In addition, the factors were ranked by their
corresponding correlation (linear correlation for PCR and
nonlinear correlation for PC-ANN models) with the half-wave
potentials and entered into the models. Finally, a genetic algorithm
(GA) was employed to select the best set of factors for
both models. The first 30 extracted PCs explained 96% of the
variance in the descriptor data matrix. Among
these, 10, 6 and 10 PCs were selected by EV, CR and GA,
respectively, for PCR, while for the ANN model, 7 PCs were
selected by all of the factor selection procedures. The ANN
model with EV, CR and GA factor selection procedures could
explain 78.4%, 94.3% and 96% of the variance in the E1/2 data,
respectively, while the corresponding values obtained from
the PCR models were 52.9%, 58.2% and 74.4%. The
results of this study showed that factor selection by
correlation ranking and genetic algorithm gives results superior
to those obtained by eigenvalue ranking. This confirms
that the magnitude of the eigenvalue of a PC is not necessarily
a measure of its significance in calibration. Moreover, it was
found that for the PCR method, the results obtained by GA differ
markedly from those obtained by the EV and CR procedures, while
for the PC-ANN method, the GA and CR factor selection procedures
give results close to each other.
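
The EV and CR factor selection procedures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic (randomly generated, not Dragon descriptors or measured E1/2 values), the function names pca_scores, eigenvalue_ranking, correlation_ranking and r_squared are hypothetical, only the linear PCR case is covered, and the GA and ANN steps are omitted. The synthetic response is assumed to depend on two low-variance PCs, so that the difference between the two rankings becomes visible.

```python
import numpy as np


def pca_scores(X, n_components):
    """Autoscale X, then return PC scores and eigenvalues via SVD."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, _ = np.linalg.svd(Xs, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    eigenvalues = s[:n_components] ** 2 / (X.shape[0] - 1)
    return scores, eigenvalues


def eigenvalue_ranking(eigenvalues):
    """EV: order the PCs by decreasing eigenvalue."""
    return np.argsort(eigenvalues)[::-1]


def correlation_ranking(scores, y):
    """CR (linear case): order the PCs by decreasing |r| with the response."""
    r = np.array([np.corrcoef(scores[:, j], y)[0, 1]
                  for j in range(scores.shape[1])])
    return np.argsort(np.abs(r))[::-1]


def r_squared(scores, y, selected):
    """Training-set R^2 of a least-squares fit of y on the selected PCs."""
    A = np.column_stack([np.ones(len(y)), scores[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Synthetic stand-in for a descriptor matrix: 60 "molecules", 200 "descriptors".
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
scores, eig = pca_scores(X, n_components=30)

# Assumption for illustration: the response depends on the 13th and 26th PCs
# (zero-based indices 12 and 25), which are not among the largest-eigenvalue PCs.
y = 2.0 * scores[:, 12] - 1.5 * scores[:, 25] + rng.normal(scale=0.5, size=60)

ev_order = eigenvalue_ranking(eig)
cr_order = correlation_ranking(scores, y)

n_factors = 5  # number of PCs entered into each model
r2_ev = r_squared(scores, y, ev_order[:n_factors])
r2_cr = r_squared(scores, y, cr_order[:n_factors])

print("EV-ranked PCs:", ev_order[:n_factors], "R^2 =", round(r2_ev, 3))
print("CR-ranked PCs:", cr_order[:n_factors], "R^2 =", round(r2_cr, 3))
```

Because PC scores are mutually uncorrelated, the training-set R^2 of a PCR model equals the sum of the squared correlations of the selected PCs with the response. For a fixed number of factors, correlation ranking therefore cannot fit the training data worse than eigenvalue ranking, which is consistent with the point that a large eigenvalue does not by itself make a PC useful for calibration.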