The Z-curve of a DNA sequence is its representation as a three-dimensional graph based on three skew measurements. It is useful in genomic analysis because it gives a compact representation of a genome and can be converted back into the genomic sequence.
Each node in the graph has three coordinates:
xi = (Ai + Gi) - (Ci + Ti)
which represents purine-pyrimidine skew
yi = (Ai + Ci) - (Gi + Ti)
which represents amino-keto skew
zi = (Ai + Ti) - (Ci + Gi)
which represents weak-strong hydrogen bond base skew
where i = 0, 1, ..., n, n is the length of the sequence, xi, yi and zi are in the range [-n, n], and Ai, Ci, Gi and Ti are the cumulative counts of each base in the first i bases of the sequence. Each node is connected to its neighbouring node with a straight line and, because all four counts are zero at i = 0, the first node has coordinates [0, 0, 0].
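The construction can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the names STEP, z_curve and sequence_from_z_curve are hypothetical, the input is assumed to contain only the letters A, C, G and T, and the curve is assumed to start at the origin before any base is read, so a sequence of length n yields n + 1 nodes. The second function illustrates why the curve can be converted back into the sequence: each of the four bases moves the curve by a distinct step.

```python
# Change in (x, y, z) caused by one base, read off the three definitions:
# x counts A, G as +1; y counts A, C as +1; z counts A, T as +1.
STEP = {
    "A": (1, 1, 1),
    "G": (1, -1, -1),
    "C": (-1, 1, -1),
    "T": (-1, -1, 1),
}


def z_curve(seq):
    """Return the Z-curve nodes of seq as a list of (x, y, z) tuples."""
    nodes = [(0, 0, 0)]  # assumed starting node, before any base is read
    for base in seq.upper():
        dx, dy, dz = STEP[base]  # raises KeyError for symbols other than A, C, G, T
        x, y, z = nodes[-1]
        nodes.append((x + dx, y + dy, z + dz))
    return nodes


def sequence_from_z_curve(nodes):
    """Recover the sequence: the step between neighbouring nodes identifies the base."""
    base_for_step = {step: base for base, step in STEP.items()}
    return "".join(
        base_for_step[(x1 - x0, y1 - y0, z1 - z0)]
        for (x0, y0, z0), (x1, y1, z1) in zip(nodes, nodes[1:])
    )
```

For example, z_curve("AGCT") gives the nodes (0, 0, 0), (1, 1, 1), (2, 0, 0), (1, 1, -1) and (0, 0, 0), and passing those nodes to sequence_from_z_curve returns "AGCT".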
AT-skew, defined by (xi + yi)/2 = Ai - Ti, and GC-skew, defined by (xi - yi)/2 = Gi - Ci, are used to predict the locus of the origin of replication in a prokaryotic genome [1].
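The two skews can be computed directly from the x and y coordinates, as in the sketch below. The function names cumulative_skews and skew_extrema are hypothetical, and the input is assumed to contain only A, C, G and T. Reporting the extrema of a cumulative skew curve as candidate loci is an assumption made here for illustration; the text above does not specify the decision rule.

```python
def cumulative_skews(seq):
    """Return (at_skew, gc_skew), each a list with one value per node i = 0..n."""
    x = y = 0
    at_skew, gc_skew = [0], [0]
    for base in seq.upper():
        if base not in "ACGT":
            raise ValueError(f"unexpected symbol: {base!r}")
        x += 1 if base in "AG" else -1  # purine-pyrimidine skew
        y += 1 if base in "AC" else -1  # amino-keto skew
        # x + y and x - y are always even, so integer division is exact.
        at_skew.append((x + y) // 2)  # equals A_i - T_i
        gc_skew.append((x - y) // 2)  # equals G_i - C_i
    return at_skew, gc_skew


def skew_extrema(values):
    """Return the positions of the first minimum and first maximum of a skew curve."""
    lowest = min(range(len(values)), key=values.__getitem__)
    highest = max(range(len(values)), key=values.__getitem__)
    return lowest, highest
```

For the toy sequence "CCCGGG" the GC-skew is 0, -1, -2, -3, -2, -1, 0, so skew_extrema reports its minimum at position 3. Whether a minimum or a maximum marks a candidate origin in a real genome depends on the convention and the organism, so any such result should be checked against the cited method.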
1. Zhang, R. and Zhang, C.T., 2005. Identification of replication origins in archaeal genomes based on the Z-curve method. Archaea, 1(5), pp.335-346.