Internet Electronic Journal of Molecular Design - IEJMD, ISSN 1538-6414, CODEN IEJMAT
ABSTRACT - Internet Electron. J. Mol. Des. December 2002, Volume 1, Number 12, 675-684
The Numerical Characterization and Similarity Analysis of DNA
Primary Sequences
Yachun Liu
Internet Electron. J. Mol. Des. 2002, 1, 675-684
Abstract:
DNA sequencing has become routine and has resulted in an abundance
of data on primary sequences of DNA for various species. Hence, we
face the task of processing this large amount of data, which poses a
number of as yet unsolved problems. The motivation of this paper is to
introduce a new numerical characterization of DNA sequences. We
define a scheme that assigns a logic order to DNA primary sequences in terms
of the classification of nucleic acid bases. Using the logic sequences, we
generate a set of 4x6 matrices to represent DNA primary sequences,
which are based on counting all (0,1) triplets in the logic sequences.
Using the condensed representation of DNA primary sequences
and the eigenvalues of the corresponding symmetric real matrix, we
compare the exon-1 primary sequences of β-globin for human and
seven other species. With this procedure we extend
the matrix method to determine new invariants as descriptors for DNA
sequences. On the basis of this new scheme, we find that a new
similarity index, the informational compression ratio, can characterize
the evolutionary relationships among different species.
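
The first two steps described above (mapping a DNA primary sequence to (0,1) logic sequences by base classification, then counting (0,1) triplets) can be illustrated with a minimal sketch. The abstract does not give the exact encoding or how the counts are arranged into the 4x6 matrices, so everything here is an assumption: the three base classifications (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding) are the usual chemical groupings, the triplets are counted with overlap, and the names `CLASSES`, `logic_sequence`, `triplet_counts`, and `characterize` are hypothetical.

```python
from collections import Counter
from itertools import product

# ASSUMPTION: the usual three two-way classifications of the bases.
# Each entry lists the bases encoded as 1; the remaining bases are encoded as 0.
CLASSES = {
    "purine_pyrimidine": set("AG"),  # purines A, G -> 1; pyrimidines C, T -> 0
    "amino_keto": set("AC"),         # amino A, C -> 1; keto G, T -> 0
    "weak_strong": set("AT"),        # weak H-bond A, T -> 1; strong C, G -> 0
}

def logic_sequence(dna, ones):
    """Map a DNA string to a list of 0/1 values for one classification."""
    return [1 if base in ones else 0 for base in dna.upper()]

def triplet_counts(bits):
    """Count every overlapping (0,1) triplet; absent triplets are reported as 0."""
    counts = Counter(tuple(bits[i:i + 3]) for i in range(len(bits) - 2))
    return {t: counts.get(t, 0) for t in product((0, 1), repeat=3)}

def characterize(dna):
    """Return the triplet counts of each logic sequence derived from `dna`."""
    return {name: triplet_counts(logic_sequence(dna, ones))
            for name, ones in CLASSES.items()}
```

For example, `characterize("ATGGTG")` returns one dictionary of eight triplet counts per classification. Assembling such counts into matrices and extracting eigenvalues would follow the definitions in the full paper, which are not reproduced in this abstract.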