Genes VII

7.1 Introduction

Key terms defined in this section
Stop codons are the three triplets (UAA, UAG, UGA) which terminate protein synthesis.

The sequence of a coding strand of DNA, read in the direction from 5′ to 3′, consists of nucleotide triplets (codons) corresponding to the amino acid sequence of a protein read from N-terminus to C-terminus. Sequencing of DNA and proteins makes it possible to compare corresponding nucleotide and amino acid sequences directly. There are 64 codons (each of 4 possible nucleotides can occupy each of the three positions of the codon, so that there are 4³ = 64 possible trinucleotide sequences). Each of these codons has a specific meaning in protein synthesis: 61 codons represent amino acids; 3 codons cause the termination of protein synthesis.
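The counting argument can be checked by enumerating the triplets. The following is a minimal sketch in Python, not part of the original text: the names BASES, MEANINGS, and code are illustrative, and the string of one-letter amino acid symbols (with '*' for termination) is the standard code written out in U, C, A, G order.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases, in the order used for the table string below

# Standard code as one-letter amino acid symbols; '*' marks a termination codon.
# The symbols follow the order UUU, UUC, UUA, UUG, UCU, ... produced by product().
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Every possible trinucleotide: 4 choices at each of 3 positions.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
code = dict(zip(codons, MEANINGS))

sense = [c for c in codons if code[c] != "*"]  # codons representing amino acids
stops = [c for c in codons if code[c] == "*"]  # codons causing termination

print(len(codons), len(sense), sorted(stops))
# prints: 64 61 ['UAA', 'UAG', 'UGA']
```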

The meaning of a codon that represents an amino acid is determined by the tRNA that corresponds to it; the meaning of the termination codons is determined directly by protein factors.

The breaking of the genetic code originally showed that genetic information is stored in the form of nucleotide triplets, but did not reveal how each codon specifies its corresponding amino acid. Before the advent of sequencing, codon assignments were deduced on the basis of two types of in vitro studies. A system involving the translation of synthetic polynucleotides was introduced in 1961, when Nirenberg showed that polyuridylic acid [poly(U)] directs the assembly of phenylalanine into polyphenylalanine. This result means that UUU must be a codon for phenylalanine. A second system was later introduced in which a trinucleotide was used to mimic a codon, by causing the corresponding aminoacyl-tRNA to bind to a ribosome. By identifying the amino acid component of the aminoacyl-tRNA, the meaning of the codon can be found. The two techniques together assigned meaning to all of the codons that represent amino acids (418, 423).

Figure 7.1 All the triplet codons have meaning: 61 represent amino acids, and 3 cause termination (STOP).

The code is summarized in Figure 7.1. Because there are more codons (61) than there are amino acids (20), almost all amino acids are represented by more than one codon. The only exceptions are methionine and tryptophan. Codons that have the same meaning are called synonyms. Because the genetic code is actually read on the mRNA, it is usually described in terms of the four bases present in RNA: U, C, A, and G.
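Grouping the codons by meaning makes the synonyms explicit. This is an illustrative sketch only: the table string and the names code and synonyms are assumptions introduced here, using one-letter amino acid symbols with '*' for termination.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard code in U, C, A, G order; '*' marks a termination codon.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MEANINGS)}

# Collect the synonymous codons for each amino acid, ignoring termination codons.
synonyms = defaultdict(list)
for codon, aa in code.items():
    if aa != "*":
        synonyms[aa].append(codon)

# Amino acids represented by exactly one codon.
single = sorted(aa for aa, group in synonyms.items() if len(group) == 1)

print(len(synonyms), single)  # prints: 20 ['M', 'W']
print(synonyms["L"])          # the six codons for leucine
```

Here M and W are the one-letter symbols for methionine and tryptophan, the two exceptions named above.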

Codons representing the same or related amino acids tend to be similar in sequence. Often the base in the third position of a codon is not significant, because the four codons differing only in the third base represent the same amino acid. Sometimes a distinction is made only between a purine versus a pyrimidine in this position. The reduced specificity at the last position is known as third-base degeneracy.
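The pattern of third-base degeneracy can be tallied by sorting the 16 groups of codons that share their first two bases. The sketch below is an illustration under stated assumptions: the function classify and its three category labels are hypothetical names, and termination is treated as a "meaning" like any other, so the UA group (UAU/UAC for tyrosine, UAA/UAG for termination) counts as a purine versus pyrimidine split.

```python
from itertools import product

BASES = "UCAG"  # U and C are pyrimidines; A and G are purines
# Standard code in U, C, A, G order; '*' marks a termination codon.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MEANINGS)}

def classify(prefix):
    """Classify the four codons that share the two-base prefix."""
    u, c, a, g = (code[prefix + third] for third in BASES)
    if u == c == a == g:
        return "third base irrelevant"
    if u == c and a == g:
        return "purine vs pyrimidine"
    return "other"

families = {"".join(p): classify("".join(p)) for p in product(BASES, repeat=2)}

counts = {}
for kind in families.values():
    counts[kind] = counts.get(kind, 0) + 1

print(counts)
```

Under these assumptions, 8 of the 16 groups ignore the third base entirely, 6 distinguish only purine from pyrimidine, and 2 (the UG and AU groups) follow neither pattern.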

The interpretation of a codon requires base pairing with the anticodon of the corresponding aminoacyl-tRNA. The reaction occurs within the ribosome: complementary trinucleotides in isolation would usually be too short to pair in a stable manner, but the interaction is stabilized by the environment of the ribosomal A site. Also, base pairing between codon and anticodon is not solely a matter of A·U and G·C base pairing. The ribosome controls the environment in such a way that conventional pairing occurs at the first two positions of the codon, but additional reactions are permitted at the third base. As a result, a single aminoacyl-tRNA may recognize more than one codon, corresponding with the pattern of degeneracy. Furthermore, pairing interactions may also be influenced by the introduction of special bases into tRNA, especially by modification in or close to the anticodon.

The tendency for similar amino acids to be represented by related codons minimizes the effects of mutations. It increases the probability that a single random base change will result in no amino acid substitution or in one involving amino acids of similar character. For example, a mutation of CUC to CUG has no effect, since both codons represent leucine; and a mutation of CUU to AUU results in replacement of leucine with isoleucine, a closely related amino acid.
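The two mutations in this example can be checked against the code table. The following is a minimal sketch: the helper names point_mutants and effect are hypothetical, and the table uses one-letter amino acid symbols (L for leucine, I for isoleucine) with '*' for termination.

```python
from itertools import product

BASES = "UCAG"
# Standard code in U, C, A, G order; '*' marks a termination codon.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MEANINGS)}

def point_mutants(codon):
    """Return the nine codons that differ from codon at exactly one position."""
    return [codon[:i] + base + codon[i + 1:]
            for i in range(3) for base in BASES if base != codon[i]]

def effect(before, after):
    """Describe how the meaning changes when one codon mutates into another."""
    old, new = code[before], code[after]
    return "silent" if old == new else f"{old} -> {new}"

print(effect("CUC", "CUG"))  # prints: silent
print(effect("CUU", "AUU"))  # prints: L -> I

# Single-base changes to CUC that leave the amino acid unchanged.
silent = [m for m in point_mutants("CUC") if code[m] == code["CUC"]]
print(silent)                # prints: ['CUU', 'CUA', 'CUG']
```

For CUC, all three silent changes fall at the third position, in line with the third-base degeneracy described earlier.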

Figure 7.2 The number of codons for each amino acid does not correlate closely with its frequency of use in proteins.

Figure 7.2 plots the number of codons representing each amino acid against the frequency with which the amino acid is used in proteins (in E. coli). There is only a slight tendency for amino acids that are more common to be represented by more codons, and therefore it does not seem that the genetic code has been optimized with regard to the utilization of amino acids.

The three codons (UAA, UAG, and UGA) that do not represent amino acids are used specifically to terminate protein synthesis. One of these stop codons marks the end of every protein-coding sequence.
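Reading a message codon by codon until a stop codon is reached can be sketched as follows. This is a simplified illustration, not a model of the ribosome: the function translate is a hypothetical helper, it assumes the reading frame begins at the first base of the string, and it ignores initiation entirely.

```python
from itertools import product

BASES = "UCAG"
# Standard code in U, C, A, G order; '*' marks a termination codon.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
code = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MEANINGS)}

def translate(mrna):
    """Read successive triplets 5' to 3' and stop at the first termination codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # UAA, UAG, or UGA: synthesis terminates here
        protein.append(amino_acid)
    return "".join(protein)

print(translate("AUGUUUUGGUAACCC"))  # prints: MFW (the CCC after UAA is not read)
print(translate("U" * 12))           # prints: FFFF
```

The second call mirrors the poly(U) experiment: a message consisting only of U is read as a run of UUU codons, giving polyphenylalanine.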

Is the genetic code the same in all living organisms?

Comparisons of DNA sequences with the corresponding protein sequences reveal that the identical set of codon assignments is used in bacteria and in eukaryotic cytoplasm. As a result, mRNA from one species usually can be translated correctly in vitro or in vivo by the protein synthetic apparatus of another species. So the codons used in the mRNA of one species have the same meaning for the ribosomes and tRNAs of other species.

The universality of the code argues that it must have been established very early in evolution. Perhaps the code started in a primitive form in which a small number of codons were used to represent comparatively few amino acids, possibly even with one codon corresponding to any member of a group of amino acids. More precise codon meanings and additional amino acids could have been introduced later. One possibility is that at first only two of the three bases in each codon were used; discrimination at the third position could have evolved later. (Originally there might have been a stereochemical relationship between amino acids and the codons representing them. Then a more complex system evolved.)

Evolution of the code could have become "frozen" at a point at which the system had become so complex that any changes in codon meaning would disrupt existing proteins by substituting unacceptable amino acids. Its universality implies that this must have happened at such an early stage that all living organisms are descended from a single pool of primitive cells in which this occurred.

Exceptions to the universal genetic code are rare. Changes in meaning in the principal genome of a species usually concern the termination codons. For example, in a mycoplasma, UGA codes for tryptophan; and in certain species of the ciliates Tetrahymena and Paramecium, UAA and UAG code for glutamine. Systematic alterations of the code have occurred only in mitochondrial DNA.
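The effect of these reassignments can be illustrated by translating the same message under different tables. In this sketch the names STANDARD, MYCOPLASMA, and CILIATE, the helper translate, and the two test messages are all assumptions made for illustration; the variant tables encode only the changes named in the text and leave every other codon as in the standard code.

```python
from itertools import product

BASES = "UCAG"
# Standard code in U, C, A, G order; '*' marks a termination codon.
MEANINGS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(t): aa for t, aa in zip(product(BASES, repeat=3), MEANINGS)}

# Variant tables: copy the standard code and reassign the termination codons.
MYCOPLASMA = {**STANDARD, "UGA": "W"}           # UGA codes for tryptophan
CILIATE = {**STANDARD, "UAA": "Q", "UAG": "Q"}  # UAA and UAG code for glutamine

def translate(mrna, code):
    """Read triplets from the first base until the table gives a termination codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

message_1 = "AUGUGAUUUUAG"  # AUG UGA UUU UAG
message_2 = "AUGUAAUUUUGA"  # AUG UAA UUU UGA

print(translate(message_1, STANDARD), translate(message_1, MYCOPLASMA))
# prints: M MWF
print(translate(message_2, STANDARD), translate(message_2, CILIATE))
# prints: M MQF
```

A codon that terminates synthesis under the standard table is read through under the variant table, so the same message yields a longer product.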

Research
418: Nirenberg, M. W. and Matthaei, H. J. (1961). The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc. Nat. Acad. Sci. USA 47, 1588-1602.
423: Nirenberg, M. W. and Leder, P. (1964). The effect of trinucleotides upon the binding of sRNA to ribosomes. Science 145, 1399-1407.
