#DOTD: Z-curve

13 Jul 2017

The Z-curve of a DNA sequence is its representation as a three-dimensional graph based on three skew measurements. It is useful in genomic analysis because it gives a compact representation of a genome and can be converted back into the genomic sequence.



Each node in the graph has three co-ordinates:

x_i = (A_i + G_i) - (C_i + T_i)

which represents purine-pyrimidine skew

y_i = (A_i + C_i) - (G_i + T_i)

which represents amino-keto skew

z_i = (A_i + T_i) - (C_i + G_i)

which represents weak-strong hydrogen bond base skew

where i = 0, 1, ..., n, n is the length of the sequence, x_i, y_i and z_i are in the range [-n, n], and A_i, C_i, G_i and T_i are the cumulative counts of each base in the first i bases of the sequence. Each node is connected to its neighbouring node by a straight line. Because no bases have been counted at i = 0, the first node has coordinates [0, 0, 0].
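As a rough illustration, the coordinates can be computed by walking along the sequence and taking one step per base. The sketch below is only one possible way to do it: the names `STEPS`, `z_curve` and `sequence_from_z_curve` are made up for this example, the curve is assumed to start at the origin, and the input is assumed to contain only A, C, G and T. The second function shows why the curve can be turned back into the sequence: each base produces a distinct step between neighbouring nodes.

```python
# Step (dx, dy, dz) contributed by each base, read off the three definitions:
# x: purine (A, G) vs pyrimidine (C, T)
# y: amino (A, C) vs keto (G, T)
# z: weak (A, T) vs strong (C, G) hydrogen bonding
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}


def z_curve(sequence):
    """Return the Z-curve nodes as (x, y, z) tuples, starting at the origin."""
    nodes = [(0, 0, 0)]
    for base in sequence.upper():
        if base not in STEPS:
            # Assumption of this sketch: ambiguous bases such as N are rejected.
            raise ValueError(f"unsupported base: {base!r}")
        dx, dy, dz = STEPS[base]
        x, y, z = nodes[-1]
        nodes.append((x + dx, y + dy, z + dz))
    return nodes


def sequence_from_z_curve(nodes):
    """Recover the sequence from the differences between neighbouring nodes."""
    base_for_step = {step: base for base, step in STEPS.items()}
    return "".join(
        base_for_step[(x1 - x0, y1 - y0, z1 - z0)]
        for (x0, y0, z0), (x1, y1, z1) in zip(nodes, nodes[1:])
    )
```

For a sequence of length n this gives n + 1 nodes, and `sequence_from_z_curve(z_curve(s))` returns `s` in upper case.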



AT-skew, defined by (x_i + y_i)/2 = A_i - T_i, and GC-skew, defined by (x_i - y_i)/2 = G_i - C_i, are used to predict the location of the origin of replication in a prokaryotic genome [1].
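The two skews can be sketched in a few lines. This is an illustrative example rather than the method used in [1]: the function name `cumulative_skews` is hypothetical, the input is assumed to contain only A, C, G and T, and both curves are assumed to start at 0 before any base is counted.

```python
def cumulative_skews(sequence):
    """Return (at_skew, gc_skew), each a list with one value per node."""
    x = y = 0
    at_skew, gc_skew = [0], [0]
    for base in sequence.upper():
        if base not in "ACGT":
            # Assumption of this sketch: ambiguous bases such as N are rejected.
            raise ValueError(f"unsupported base: {base!r}")
        x += 1 if base in "AG" else -1  # purine vs pyrimidine
        y += 1 if base in "AC" else -1  # amino vs keto
        # x + y and x - y are always even, so integer division is exact.
        at_skew.append((x + y) // 2)  # equals A_i - T_i
        gc_skew.append((x - y) // 2)  # equals G_i - C_i
    return at_skew, gc_skew
```

Plotting either list against position gives a curve whose turning points can then be inspected; how those are interpreted when looking for a replication origin is described in [1].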




1. Zhang, R. and Zhang, C.T., 2005. Identification of replication origins in archaeal genomes based on the Z-curve method. Archaea, 1(5), pp.335-346.

PhDomics by Fatima