A New Classification Scheme of the Genetic Code

 

J. Mol. Evol. 59: 598-605 (2004)

BIOforum Europe: A Purine-Pyrimidine Classification Scheme of the Genetic Code, 06/2004: 46-49.


A new classification scheme of the genetic code is based on a binary representation of the bases (pyrimidine = 0, purine = 1):

We have 2^3 = 8 different binary codon representations: 000, 001, …, 111. Each of these again covers 8 codons,
for instance:

  000    stands for three pyrimidines:   CCC, CCU, UUC, …, UUU

  111    stands for three purines:       GGG, GGA, GAA, …, AAA
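
The binary representation can be sketched in a few lines of Python. This is only an illustration: the function names `binary_class` and `codons_by_class` are hypothetical and do not come from the paper, and the assignment pyrimidine = 0, purine = 1 is taken from the two examples above.

```python
from itertools import product

PURINES = frozenset("AG")  # A and G are purines; C and U are pyrimidines


def binary_class(codon):
    """Return the binary word of a codon: purine -> 1, pyrimidine -> 0."""
    return "".join("1" if base in PURINES else "0" for base in codon)


def codons_by_class():
    """Group all 64 RNA codons by their three-bit binary word."""
    classes = {}
    for triple in product("UCAG", repeat=3):
        codon = "".join(triple)
        classes.setdefault(binary_class(codon), []).append(codon)
    return classes


classes = codons_by_class()
for word in sorted(classes):
    print(word, " ".join(classes[word]))
```

Each of the 8 binary words collects 2 x 2 x 2 = 8 codons, which accounts for all 64 codons.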
 
The pair C-G binds via 3 hydrogen bonds in complementary base pairing.
The pair A-U binds via 2 hydrogen bonds in complementary base pairing.

The number of columns in the new classification scheme of the genetic code would be 8, but we reduced it to 4, because the third codon position matters only as purine or pyrimidine (i.e. in the binary sense). Thus, eight rows and four columns are sufficient to place the 20 amino acids as well as the termination codons of the genetic code.

                  

  1. In the first column the first two positions are occupied by G and/or C. These always pair with their anticodon base via 3 hydrogen bonds, i.e. the first two bases together always guarantee 6 hydrogen bonds. For that reason Lagerkvist (1978) called them strong codons. In the second and third columns, the first two bases guarantee exactly 5 bonds (mixed codons), and in the fourth column, where both are A or U, just 4 bonds (weak codons).
    This pattern corresponds very well to the importance of the third base in the triplet codon: if the first two bases are G and/or C (first column), the third base is never important, and in the second and third columns, the third base is important in exactly half of the cases (if there is a purine in the second position – lower half of the table). In the fourth column the third base is always necessary for the determination of the correct amino acid.
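
The relation between hydrogen bonds and the importance of the third base can be checked against the standard genetic code. The following is a minimal sketch, not the authors' procedure: the helper names are hypothetical, and the amino acid string is the standard code (NCBI translation table 1) written in the usual U, C, A, G codon order.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hydrogen bonds formed by each base with its complementary partner
H_BONDS = {"G": 3, "C": 3, "A": 2, "U": 2}


def doublet_strength(doublet):
    """Hydrogen bonds guaranteed by the first two codon positions."""
    return H_BONDS[doublet[0]] + H_BONDS[doublet[1]]


def third_base_matters(doublet):
    """True if the four codons sharing this doublet do not all code alike."""
    return len({CODE[doublet + b] for b in BASES}) > 1


def summarize():
    """Map bond count (6, 5, 4) to a list of (doublet, third_base_matters)."""
    result = {6: [], 5: [], 4: []}
    for pair in product(BASES, repeat=2):
        doublet = "".join(pair)
        result[doublet_strength(doublet)].append((doublet, third_base_matters(doublet)))
    return result


for bonds, entries in summarize().items():
    print(bonds, entries)
```

Under these assumptions the strong doublets (6 bonds) never need the third base, the weak doublets (4 bonds) always do, and among the eight mixed doublets (5 bonds) the third base matters for exactly the four with a purine in the second position.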

  2. Each row contains exactly 4 different amino acids (including the termination codon).

     In the standard code, the exceptions are the second row with two leucines and, in the fourth row, the AU* start codon. Note that deviations from the standard code are also found here. Interestingly, the yeast mitochondrial code shows no exception: each row contains exactly four different entries in four different columns. In this sense the yeast mitochondrial code is the most regular one.

     

  3. The mitochondrial genetic code shows no exception:  32 positions contain exactly 32 entries.
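
One possible reading of this statement is that every one of the 32 cells (first two bases plus purine/pyrimidine in the third position) holds a single entry. The sketch below tests that reading. It assumes the vertebrate mitochondrial code (NCBI translation table 2), which the text does not name explicitly; all function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop)
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def make_code(changes=None):
    """Build a codon table from the standard code plus optional reassignments."""
    code = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD)}
    code.update(changes or {})
    return code


standard = make_code()
# Assumption: vertebrate mitochondrial code, which reassigns four codons
vert_mito = make_code({"AUA": "M", "UGA": "W", "AGA": "*", "AGG": "*"})


def cells(code):
    """Group the 64 codons into 32 cells: (first two bases, R/Y of third base)."""
    grid = {}
    for codon, aa in code.items():
        key = (codon[:2], "R" if codon[2] in "AG" else "Y")
        grid.setdefault(key, set()).add(aa)
    return grid


def ambiguous_cells(code):
    """Cells whose two codons do not code for the same entry."""
    return sorted(key for key, entries in cells(code).items() if len(entries) > 1)


print("standard:", ambiguous_cells(standard))
print("vertebrate mitochondrial:", ambiguous_cells(vert_mito))
```

With these tables the standard code leaves two cells ambiguous (AU followed by a purine, and UG followed by a purine), while the vertebrate mitochondrial code leaves none.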
     

  4. There are 22 tRNA genes in the mammalian mitochondrial genomes: Table 2.

    "The mammalian mitochondrial genomes contain ONE gene for each tRNA, with the exceptions of tRNA Leucine and tRNA Serine for which TWO genes are present." This is no exception to our scheme, as can be seen in the mammalian mitochondrial code via tRNA.
     

  5. This corresponds to the known fact that transition mutations (e.g. purine A vs. purine G) occur more frequently than transversion mutations (e.g. purine A vs. pyrimidine U).
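
The distinction between the two mutation types follows directly from the purine/pyrimidine classes. As a small illustration (the function name `mutation_type` is hypothetical):

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CU")


def mutation_type(old, new):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not an RNA base: {base!r}")
    if old == new:
        raise ValueError("no substitution: both bases are identical")
    # A transition stays within one chemical class, a transversion changes it
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print(mutation_type("A", "G"))
print(mutation_type("A", "U"))
```

A transition leaves the purine/pyrimidine binary word of a codon unchanged, whereas a transversion flips one bit.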

    Our scheme yields some support for the “adaptive genetic code” hypothesis (Freeland 2002) which states that the code has evolved to minimize the deleterious effects of mutation and translation error (Haig and Hurst 1991, Freeland and Hurst 1998). The purine-pyrimidine binary coding scheme, given in the table, gives a much higher regularity than a binary coding according to the base pairs (A,U – 1; G,C – 0).

     

  6. The deviations of non-standard genetic codes. As can be seen in the table, nearly all deviations occur in codons with a purine at the third position. The only exception is the yeast mitochondrial genetic code, where CU* does not code for Leu, but rather for Thr.
     

  7. Three perfect symmetries in our scheme of the genetic code.

         
    The first is the codon-anticodon symmetry: the thick horizontal line in Fig. 2 marks the symmetry axis.
    For instance, codon CCC (Pro, first column, first row) has the anticodon GGG (Gly, first column, last row).
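
The codon-anticodon relation can be sketched as follows. The sketch assumes position-wise pairing (the anticodon is written in the antiparallel alignment, so base i of the codon pairs with base i of the anticodon); the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
PURINES = frozenset("AG")


def anticodon(codon):
    """Position-wise complement of a codon (assumed alignment, see lead-in)."""
    return "".join(COMPLEMENT[base] for base in codon)


def binary_class(codon):
    """Purine -> 1, pyrimidine -> 0."""
    return "".join("1" if base in PURINES else "0" for base in codon)


codon = "CCC"
print(codon, binary_class(codon), "->", anticodon(codon), binary_class(anticodon(codon)))
```

Because every purine pairs with a pyrimidine, complementation inverts each bit of the binary word, so 000 is mirrored onto 111.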

      
    The second is the point symmetry corresponding to Halitsky’s family – nonfamily symmetry operation (“E-M bifurcation”, Halitsky 2003), indicated by the point in the center of the table. Halitsky observed that all the 32 “family codons” CC*, CU*, UC*, GC*, GU*, AC*, CG*, GG* can be mapped into the 32 “nonfamily codons” UU*, AU*, CA*, UG*, UA*, GA*, AG*, AA* by exchanging the two amino bases A and C with one another, and the two keto bases U and G with one another. For instance, the family codon GUA (Val) is mapped into the nonfamily codon UGC (Cys). Thus, this point symmetry is behind the family – nonfamily symmetry in our scheme (shaded vs. unshaded regions).
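
The mapping described here, exchanging A with C and U with G, is easy to verify on the doublets listed above. A minimal sketch (the name `halitsky` is hypothetical):

```python
# Exchange A <-> C and U <-> G
SWAP = str.maketrans("ACUG", "CAGU")

FAMILY = ["CC", "CU", "UC", "GC", "GU", "AC", "CG", "GG"]
NONFAMILY = ["UU", "AU", "CA", "UG", "UA", "GA", "AG", "AA"]


def halitsky(seq):
    """Apply the base exchange to a codon or doublet."""
    return seq.translate(SWAP)


mapped = sorted(halitsky(d) for d in FAMILY)
print(mapped == sorted(NONFAMILY))
print(halitsky("GUA"))
```

The exchange is its own inverse, so applying it twice returns the original codon, and it maps the eight family doublets exactly onto the eight nonfamily doublets.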
     

  8. In the fourth column all amino acids are ketogenic (leucine, lysine) or glucogenic and ketogenic (isoleucine, phenylalanine, threonine, asparagine, methionine and tyrosine).

    The carbon skeletons of amino acids are generally conserved as carbohydrate, via gluconeogenesis, or as fatty acid via fatty acid synthesis pathways. In this respect amino acids fall into three categories: glucogenic, ketogenic, or glucogenic and ketogenic. Glucogenic amino acids are those that give rise to a net production of pyruvate or TCA cycle intermediates, such as α-ketoglutarate or oxaloacetate, all of which are precursors to glucose via gluconeogenesis.
     

  9. Correlation of codon strength and amino acid properties: Table 1.
     

  10. Evolution of the  genetic code: doublet code?

 

Thanks for your visit!  Your comments are  welcome!  mailto:sweta@imb-jena.de

 

Last update 09.01.2006