Codes and DNA

I am still trying to assemble an explanation of the term “DNA code”: what it is and what it is not. Older posts on this can be found here and here.

“Code” may be defined as any of the following:

1) a set of rules for converting information into another form or representation (encoding) for later retrieval
2) a set of rules or principles or laws (not applicable in this case)
3) a system of transmitting messages for brevity and/or security (also not applicable)
4) the symbolic arrangement of data or instructions in a computer program or the set of such instructions

Is DNA a code?

A DNA molecule is, itself, neither a set of rules nor a symbolic arrangement of data or instructions. DNA molecules are sequences of nucleotides which can be broken down into reading frames, promoters, enhancers, structural and spatial segments, and so forth. Since the DNA molecule has such variability and physical functionality inherent within it, the metaphor of DNA being information or a “code” is both overly simplistic and inaccurate.

What about RNA?

Additionally, functional segments of the “code” exist without ever being “decoded.” Some RNAs, for example ribozymes and the RNA components of spliceosomes and ribosomes, do their work as RNA, with a sequence taken directly from the DNA template, and are never translated into protein. Even ignoring the function inherent in many RNA transcripts (spliceosomes, ribozymes, ribosomes, etc.), which are still “the raw code,” we can look at both ends of the process. The DNA sequence itself, as well as the transcripts which are translated into proteins, are similarly not information. While codes rely upon symbolic arrangements to transmit data or instructions, the arrangement of DNA merely serves as a template from which RNA transcripts are made. This is more like a mold than a code, as the resulting copy of a DNA template is the reverse complement of the template itself: the DNA sequence 5′-AAGCTTGGCAT-3′ is transcribed into the RNA sequence 5′-AUGCCAAGCUU-3′. What, then, is the magical “information” this DNA sequence is supposed to carry? Thus far, it appears that neither the RNA nor the DNA itself is information.
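The template-to-transcript step in the example above can be sketched in a few lines of Python. This is only an illustration of the base-pairing rule applied to a string of letters; the names `PAIRING` and `transcribe` are made up for this sketch, and it ignores everything a real polymerase depends on (promoters, strand choice, termination, processing).

```python
# Base-pairing rule for DNA template -> RNA transcript.
# PAIRING and transcribe are hypothetical names used only for this sketch.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3: str) -> str:
    """Return the RNA (written 5'->3') made from a DNA template written 5'->3'.

    The transcript is the reverse complement of the template, with U in
    place of T, so we walk the template backwards and pair each base.
    """
    return "".join(PAIRING[base] for base in reversed(template_5_to_3.upper()))


print(transcribe("AAGCTTGGCAT"))  # AUGCCAAGCUU
```

Note that the function does nothing more than read the template backwards and swap each letter for its partner, which is the “mold” behaviour described above.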


Could it be the proteins which result from some RNA transcripts which are the “information” in DNA? No. Proteins also do not transmit information; while the protein is useful, it is as much “information” as a car is a manual for assembling a new one. Proteins were never “encoded” from protein to RNA to DNA in order to be properly “decoded.” The function of a protein is based upon its physical structure, as is the function of RNA and DNA.

Well, what is the “genetic code?”

We hear this phrase all the time, mostly from individuals who have no understanding of even the basics of genetics. The genetic code is the means by which we (humans) organize DNA into bits which we can understand. We remove the three-dimensional context and focus exclusively on the sequence. From this comes the simplistic explanation (DNA->RNA->“magical protein”) that DNA is arranged in codons. These codons hold the “magical information” which becomes “magical protein,” which, as we all know, is what DNA is all about. This is where the confusion lies. DNA does not just contain sequences for protein; functional units such as promoters, enhancers, and suppressors exist outside of any reading frame. Other functional segments exist as well (centromeres, telomeres, and origins of replication), which contradict the idea that DNA is like a software program. The genetic code is, in this respect, a human construct for understanding the specific regions of DNA which do contain sequences used for protein production. This is the only use this phrase has.
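To make concrete what the “genetic code” amounts to as a bookkeeping device, here is a minimal Python sketch of the standard codon table used as a plain lookup on a sequence string. The names `CODON_TABLE` and `translate` are assumptions of this sketch, not anything from a particular library, and the function deliberately does what the paragraph above describes: it strips away all context and reads letters three at a time.

```python
from itertools import product

# Standard codon table, built from the conventional U, C, A, G ordering.
# "*" marks a stop codon. CODON_TABLE and translate are hypothetical names.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna: str, frame: int = 0) -> str:
    """Look up each complete codon of `rna`, starting at offset `frame`.

    Trailing bases that do not fill a codon are ignored. Stop codons are
    reported as "*" rather than ending the read, to keep the sketch short.
    """
    rna = rna.upper()
    codons = (rna[i:i + 3] for i in range(frame, len(rna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


print(translate("AUGCCAAGCUU"))     # MPS (the final "UU" is ignored)
print(translate("AUGCCAAGCUU", 1))  # CQA (same letters, different frame)
```

The same string gives a different answer in each frame, and the table has nothing at all to say about promoters, enhancers, centromeres, or telomeres, which is the limitation the post is pointing at.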


10 Responses to “Codes and DNA”

  1. 1 Pliny-the-in-Between
    July 9, 2009 at 1:13 pm

    Great post, though I for one would like some follow-up on information definitions. I think this is a really fascinating discussion even for people who didn’t start out on a molecular biology track. The concept that DNA is a code in the common sense of the term hampers (in my mind) the frame of reference for discussing the origins of life. When we think of DNA as a hard code, it tends to create the conundrum of how life’s precursors could evolve without first having complex nucleic acids. The alternative notion is that nucleic acids over time became a more efficient way for organisms to institutionalize discrete enough data sets (presumably already present in some form, i.e. the actual information content, which may have evolved independent of nucleic acids), and that this jump-started the great engine of natural selection: once there was something more discrete upon which to act, life never looked back. Just a thought.

  2. 2 jaredcormier
    July 9, 2009 at 2:20 pm

    Precisely, and since RNA can, and does, act as both template and catalyst, even small units (13 bases, I think) can have enzymatic activity of some kind. It may be that RNAs are effective at catalyzing all the necessary reactions, but above a certain length DNA, being more stable, is more effective as a template, and amino acids, being more diverse in structure, make more efficient enzymes.

  3. 3 Pliny-the-in-Between
    July 9, 2009 at 2:50 pm

    It’s also interesting to consider that if RNA was the first nucleic acid of import in early life, its instability might actually have been an advantage early on (first 300 million years or so…) in that greater variety of peptides might have been catalyzed by partially denatured RNA strands which reconstituted. Since natural selection had not yet had time to confer any decisive advantage to the DNA paradigm, reactions such as these might account for the appearance of complex organic ‘fossils’ seen at 3.7 billion years in the past and the true bacteria seen by 3.4 billion years ago. That’s an extraordinary amount of time to be doing experiments with self-catalyzing molecules bathed in all the energy around geothermal hot spots for example.

    Again your point about RNA’s inherent catalytic function may be why it persists today in protein synthesis: early peptides may have required (or been limited by) the catalytic activity of RNA until such time as some mutations allowed proto-enzymes to become available boosting the range of energetically favorable reactions expanding the pool of proteins even more.

    At that point, there is nothing that can’t be explained by evolution.

  4. 4 jaredcormier
    July 9, 2009 at 5:50 pm

    Interesting thoughts, Pliny; as always, I appreciate your insight and comments. You made me think of a completely new post, which I don’t have time for right now, on the question “where does evolution actually begin?” This follows from the evolutionary activity which occurs as soon as self-replication begins.

  5. 5 eddie
    July 17, 2009 at 7:39 pm

    I’ve been thinking of how the dna-rna-amino acid-protein system is analogous to a computer system. It seems best to me to think of the rna being like working memory; making sure that sequences of instructions are implemented in the right order. Application programs are the analog to amino acids. They are run in order to do jobs like typing a letter or flying a plane, or to make a body or phenotype.
    The dna here is analogous to a code library that rna calls on, but also to a resilient storage medium: a high-availability raid box.

  6. July 17, 2009 at 9:16 pm

    I am rather fond of pointing out that tree rings express information, some of which can be interpreted by school-age children. And yet, the source of that information is not exclusively genetic. It is the product of an interaction between the genome and the environment. The genetic code follows a part of that interaction, and it happens to be the part which is most easily conceptualized as a series of strings. So I know that there is a limit to the metaphor of ‘the Central Dogma’.

    But I’m also a high school science teacher who is charged with teaching transcription and translation. The non-linear aspects of protein synthesis do not typically appear either in the standards or the textbooks, and they are (frankly) quite beyond the intellect of most people. There is a time and a place for making the leap to a more realistic conceptualization of the whole affair for those who can, and it will typically be in a second or third-year undergraduate course in cell or molecular biology. Trying to bring this level of context to the general public is a tall order. That’s why I like the analogy to tree rings. Make the point that the information arises from a complex interaction that is more than a linear string of texts, and hope it sinks in?

  7. 7 jaredcormier
    July 17, 2009 at 9:34 pm

    But the difference here, Scott, is that tree rings do not, in fact, express information either. The process of ring formation is understood to the point that we can infer the conditions the tree experienced during that portion of its growth. It is not “information” until it has been interpreted by an observer, at which point the information can be discussed. Prior to this, the rings are simply the manifestation of the tree’s growth history. Similarly, the “Genetic Code” is the way we interpret the DNA, not anything present in the DNA itself.


