HOME
Table of Contents (all articles on this disk)
This Article: THE DNA COMPUTER AND BIO-ALGEBRA
For this article:  References

The DNA COMPUTER and BIO-ALGEBRA

The new "biotechnology" revolution has brought related advances in mathematics. For most of our history Mathematics has been thought of as equivalent to philosophy; something to be thought about, reasoned with, but rarely applied to biology. That image is changing.

In order to explain just one relationship between mathematics and DNA, we are going to back into the topic by providing a little information about a recent experiment in molecular computing to show that DNA molecules are can be used in applied mathematics. In a future issue we will introduce the idea that our DNA might be thought of as a special kind of mathematical set (a Julia Set) which is subject to mathematical rules and techniques.

Under the headline of "COMPUTING WITH DNA: USC Scientist States and Solves Computational Problem Using Tools of Molecular Biology -- Writing mathematics in the language of life." a press release from the University of Southern California, School of Engineering, send over the Internet announced the latest experiment of Leonard M. Adleman of their department. Computer designers are constantly looking for ways to make computers smaller and faster. Professor Adleman's experiment demonstrates that it is possible to carry out computations at the molecular level, using DNA as the computational engine (another word for "computer").

As was reported in the Nov. 11, 1994 issue of the journal Science, "Dr. Adleman solved a simple example of a well-known computational problem by encoding mathematical structures in molecules of DNA and carrying out computational steps in a test tube using standard techniques from molecular biology."

"Although the long-term potential of this concept is difficult to assess at this time," Adler noted that " ... while current super computers can execute about a trillion operations per second, molecular computers conceivably could execute more than a thousand trillion operations per second. Further, molecular computers might be as much as a billion times more energy efficient than current electronic computers. Also, storing information in DNA requires about 1 trillionth the space required by existing storage media such as video tape."

The problem that the DNA computer solved is called the "directed Hamiltonian path problem." This problem involves finding a special path through a network of points. A simple example might be finding a path or flight itinerary from city to city. Consider cities such as Atlanta, Baltimore, Chicago and Detroit (any number of cities will do - Adleman used seven cities but we will use four in our discussion). Assume that non-stop flights are scheduled only from Atlanta to Chicago, Chicago to Detroit, Chicago to Baltimore, and Baltimore to Detroit (or any such combination.) The directed Hamiltonian path problem becomes: Could a traveler fly exactly three flights in a sequence -- starting in Atlanta and ending in Detroit -- that would take him to each of the four cities? For simple maps and itineraries we can see the answer. In fact every time someone must travel to more that one location and wishes to minimize travelling, they solve a directed Hamiltonian path problem. In our example, fly from Atlanta to Chicago, from Chicago to Baltimore, then from Baltimore to Detroit. That's three flights to four cities.

If the number of cities and flights grows even slightly larger, the number of possible itineraries becomes astronomical. Even the most widely used algorithms running on super computers will fail to find a solution. To date, mathematical analysis indicates that the directed Hamiltonian path problem is one of a class of problems for which an efficient algorithm may never be found.

DNA-encoded molecular computation for such a problem can be illustrated by way of the following example. For each of the four cities, a "DNA name" written in the four-letter alphabet of DNA -- A, T, G, C -- was chosen at random. Although Adleman used chains of 20 such "letters" (actually, base-pairs) in length. For this discussion we will shorten these to lengths of just 6 letters. Remember, we use the word "letter" to refer to one of the bases, G, C, A, or T (Guanine, or "G", Cytosine, or "C", Adenine, or "A", and Thymine, or "T" - the chemicals that form the cross-links joining the double-helix structure of DNA and carrying genetic information). Here's how our city names might look:

CITY

DNA NAME

Atlanta
Baltimore
Chicago
Detroit

aatgcg
ccgatc
ggctta
ggtccg

James D. Watson and Frances H.D. Crick, Nobel laureates, showed that every sequence of DNA bases has a "complementary sequence" (cDNA) that sticks to it (this sticking together of DNA and its cDNA strands is what gives DNA its double-helix shape). In the complementary sequence, each A is matched with a T (and each T with an A), each C with a G (and each G with a C). For example, the complementary sequence for the DNA sequence "CTACG" would be "GATGC." Notice how each C was matched with a G, each G with a C, and so on.

Applying the concept of cDNA to our example, each of the DNA strands representing our four cities also has a "cDNA name":

CITY
DNA NAME
cDNA NAME

Atlanta
Baltimore
Chicago
Detroit
aatgcg
ccgatc
ggctta
ggtccg
ttacgc
ggctag
ccgaat
ccaggc

For each flight, Adleman then created a "DNA flight number" by taking the three letters at the end of the DNA name for the departure city and attaching the three letters at the beginning of the DNA name for the destination city. Thus:

FLIGHT
DNA FLIGHT
NUMBER

Atlanta-Chicago
Chicago-Detroit
Chicago-Baltimore
Baltimore-Detroit
gcgggc
ttaggt
ttaccg
atcggt

It is now a routine process to synthesize strings of DNA like these with any desired sequence of bases. In Adleman's experiment, 30 trillion molecules of each city's complementary DNA name and each DNA flight number were synthesized and added to a test tube.

The diagram below illustrates the role of a small strand of cDNA ("ccgaat") in splinting or sewing cDNA sequences representing flight numbers together.

For example:

ATLANTA-CHICAGO CHICAGO-BALTIMORE
DNA FLIGHT NUMBER DNA FLIGHT NUMBER
\ /
\ /
gcgggcttaccg
. .
. .
ccgaat

Notice how the splint segment of cDNA ("ccgaat") represents the cDNA name for Chicago, holding together two 6-letter segments representing the Flight Names for Atlanta-Chicago and Chicago-Baltimore. The "answer" is, of course, "gcgggcttaccg" which, for a simple 3-city problem, is obvious. The Hamiltonian path is from Atlanta to Chicago to Baltimore. Given the available flights, there is no other possible solution. Of course, it gets a lot more complicated with seven cities.

After a sufficient reaction time (the whole experiment lasted about 7 days), the test tube contained trillions of long DNA molecules consisting of DNA flight numbers splinted together by cDNA names. Considering the numbers of molecules used for this experiment, it was assumed that all possible combinations of flights were formed. Given the large number of reacting molecules, probability makes it virtually certain, by chance alone, that a molecule will form that corresponds to the sought after combination (in our example - Atlanta to Chicago, Chicago to Baltimore, and eventually Baltimore to Detroit). The molecule encoding this solution can be distinguished from the others because of its length (in our example, it is four DNA city names long), its beginning (the DNA name for Atlanta), its end (the DNA name for Detroit) and the fact that it contains the DNA names for both remaining cities (Baltimore and Chicago). It is relatively straightforward, using the tools of molecular biology, to isolate the DNA molecule coding the solution to the problem from all of the others that formed in the test tube and analyze ("sequence") it.

In the experiment reported in Science, a small example with seven cities was solved. The methods described could be scaled up to handle larger Hamiltonian path problems. Solution of such problems is important in a number of real-world applications, including designing telephone networks, scheduling work tasks and artificial intelligence problem solving.

Professor Adleman wrote,

"the potential of molecular computation is impressive. What is not clear is whether such massive numbers of inexpensive operations can be productively used to solve real computational problems. One major advantage of electronic computers is the variety of operations they provide and the flexibility with which these operations can be applied. Whereas two 100-digit integers can be multiplied quite efficiently on an electronic computer, it would be a daunting task to do such a calculation on a molecular computer using the currently available protocols and enzymes."

"Nonetheless, for certain intrinsically complex problems, such as the directed Hamiltonian path problem where existing electronic computers are very inefficient and where massively parallel searches can be organized to take advantage of the operations that molecular biology currently provides, it is conceivable that molecular computation might compete with electronic computation in the near term."

From this experiment we can see that DNA and mathematics may be closely related. Not math applied to understanding the nature of DNA but, as in our example, DNA performing as if it were a mathematical operator.



HOME
Table of Contents (all articles on this disk)
This Article: THE DNA COMPUTER AND BIO-ALGEBRA
For this article:  References