Dideoxy Sequencing

Earlier descriptions of molecular genetic techniques liberally omitted many important steps in order to focus on conceptual issues. Here, the major technique for sequencing DNA in the Human Genome Project is described in some detail. This should give you an appreciation of the laboratory techniques used in modern molecular genetics.

As with all human molecular techniques, sequencing begins with biological material, almost always blood. Through a series of steps that need not concern us, the DNA from the blood cells is extracted and purified. A very small sample of this DNA is then used for analysis.

The physical process of extraction and purification breaks the DNA at a large number of random places. However, very long fragments of DNA, too long to work with, still remain. Hence, the DNA is often bathed with a restriction enzyme to cut it into fragments of more manageable size.
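The cutting step can be sketched in a few lines of Python. The text does not say which restriction enzyme is used, so the sketch assumes EcoRI, which recognizes GAATTC and cuts just after the G. The function name digest and its parameters are illustrative, and the DNA is treated as a single string, which ignores the overhanging ends that a real digest leaves.

```python
from typing import List


def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut `dna` at every occurrence of the recognition `site`.

    `cut_offset` is the number of bases into the site where the cut falls
    (1 for EcoRI, which cuts between G and AATTC). Both defaults are
    assumptions made for this illustration.
    """
    fragments = []
    start = 0
    pos = dna.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(dna[start:cut])
        start = cut
        pos = dna.find(site, pos + 1)
    fragments.append(dna[start:])  # whatever remains after the last cut
    return fragments


if __name__ == "__main__":
    print(digest("AAGAATTCCCGAATTCTT"))
```

Joining the fragments back together reproduces the original string, which is a quick check that nothing is lost at the cut points.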

The next step is to amplify the DNA section that we want to sequence by using PCR. At the end of this stage, there are now many millions of copies of the DNA section that is to be sequenced.

The actual sequencing procedure begins with the first few steps of the PCR technique (Figure 1). The DNA is heated so that the double-stranded helix breaks apart into two single strands (a process called denaturing). Then a primer—the identical one used in the PCR—is added. This primer is a small section of single-stranded nucleotides that will bind with its complementary section of DNA that was previously amplified in the PCR.
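To make the idea of a primer binding to its complementary section concrete, here is a minimal sketch. It assumes both strands are written as strings in the conventional 5' to 3' direction, so the primer pairs with the reverse complement of its own sequence. The function names and the example sequences are made up for illustration, and real primer binding also depends on temperature and mismatches, which are ignored here.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the strand that pairs with `seq`."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_binding_site(template: str, primer: str) -> int:
    """Return the index in `template` where `primer` can anneal, or -1.

    Only a perfect match is considered (a simplifying assumption).
    """
    return template.find(reverse_complement(primer))


if __name__ == "__main__":
    template = "GGATCCTTAGCA"   # hypothetical single strand after denaturing
    primer = "TGCTA"            # pairs with the TAGCA stretch of the template
    print(find_binding_site(template, primer))  # 7
```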

The next step is to synthesize the rest of the double-stranded DNA, starting with the primer. Essentially, this is the same process as DNA replication, but here it is performed in the test tube instead of the cell nucleus. The two major ingredients for replication are: (1) a large number of free nucleotides; and (2) a polymerase enzyme that will build the chain.

Figure 2 shows the DNA with its nucleotide "soup." The majority of nucleotides in the soup have the same chemical composition as the nucleotides in ordinary DNA. The clever trick in this step of the process is the addition of a small amount of specially engineered nucleotides that have two key features. The first is that these nucleotides are chain terminating. That is, whenever one of these special nucleotides is incorporated into the chain, the process of building the double helix stops dead in its tracks. The second important property is that each special nucleotide is "color-coded" through a chemical tag so that it will fluoresce in a specific color when exposed to the appropriate light conditions. Each type of special nucleotide is given a different color (e.g., green for adenine, yellow for thymine, and so on).

Next, millions of polymerase molecules are added (Figure 3). The polymerase is a complicated enzyme that ordinarily acts to replicate DNA in the cell after the hydrogen bonds are broken and the double helix separates into two single strands. To imbue polymerase with a sentience that it really lacks, one could say that the molecule grabs free nucleotides, inspects the next nucleotide in the single-stranded chain, and then "glues" the appropriate nucleotide partner into the other DNA strand. For example, if the next nucleotide on the single strand is A, then the polymerase will place a T on the growing strand.
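The pairing rule the polymerase follows is simple enough to write down as a lookup table. The sketch below is only an illustration: it assumes the template is written in the order in which the polymerase reads it, and the name extend and the example sequence are hypothetical.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend(template: str, primer_length: int) -> str:
    """Return the bases added to the growing strand past the primer.

    `template` is the single strand, written in the order the polymerase
    reads it; the first `primer_length` bases are already covered by the
    primer, so synthesis starts after them.
    """
    new_bases = []
    for base in template[primer_length:]:
        new_bases.append(PAIR[base])  # A pairs with T, C pairs with G
    return "".join(new_bases)


if __name__ == "__main__":
    print(extend("ATGCCGTA", primer_length=3))  # GGCAT
```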

The next step is simply to wait and let nature take its course. Because there are millions of single-stranded DNA fragments with primer attached to them, millions of polymerase molecules, and several gazillion nucleotides, millions of double-stranded DNA molecules will be synthesized. However, these complementary DNA strands will be of different lengths because of the chance incorporation of special nucleotides. Whenever one of these special nucleotides is placed into the chain, further synthesis of the DNA strand stops. The net result is many millions of copies of double-stranded DNA, but all of different lengths (Figure 4).
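A small simulation shows why chance incorporation of the special nucleotides produces strands of every length. This is a sketch under stated assumptions: each added nucleotide has the same fixed probability (terminator_fraction) of being a chain terminator, the template string stands for the region past the primer, and all names are invented for the example. Real reactions tune the ratio of ordinary to special nucleotides, and the value used here is arbitrary.

```python
import random
from collections import Counter

PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def synthesize_once(template, terminator_fraction, rng):
    """Build one strand; return (strand, terminated)."""
    strand = []
    for base in template:
        strand.append(PAIR[base])
        # By chance, the nucleotide just added may be a chain terminator.
        if rng.random() < terminator_fraction:
            return "".join(strand), True
    return "".join(strand), False  # reached the end without a terminator


def run_reaction(template, copies=10000, terminator_fraction=0.2, seed=1):
    """Count terminated strands by length over many template copies."""
    rng = random.Random(seed)
    lengths = Counter()
    for _ in range(copies):
        strand, terminated = synthesize_once(template, terminator_fraction, rng)
        if terminated:
            lengths[len(strand)] += 1
    return lengths


if __name__ == "__main__":
    counts = run_reaction("ATGCCGTA")
    for length in sorted(counts):
        print(length, counts[length])
```

With enough copies, every possible length is represented, although short strands are more common than long ones because a strand must escape termination at every earlier position to grow long.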

If the DNA mixture is now heated, the double-stranded DNA will break down into its single strands, giving a large number of single-stranded DNA molecules of various lengths (Figure 5). The next step is to load the single-stranded DNA onto an electrophoretic gel. Shorter fragments migrate through the gel faster than longer ones, so the fragments sort themselves by length. The new forms of electrophoresis are so sensitive that they can detect the difference between two DNA strands that differ in length by only one nucleotide. The result of the electrophoresis is depicted in Figure 6, where the fragments were loaded at the top of the figure.
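In computational terms, electrophoresis sorts the fragments by length with single-nucleotide resolution. The sketch below mimics that by grouping strings into "bands." The function name run_gel and the sample fragments are hypothetical, and the model ignores everything about a real gel except the ordering by length.

```python
from collections import defaultdict


def run_gel(fragments):
    """Group fragments into bands by length.

    Returns a list of (length, fragments) pairs ordered from the shortest
    fragments (which travel farthest from the loading point) to the longest.
    """
    bands = defaultdict(list)
    for fragment in fragments:
        bands[len(fragment)].append(fragment)
    return [(length, bands[length]) for length in sorted(bands)]


if __name__ == "__main__":
    for length, band in run_gel(["GGC", "G", "GG", "G"]):
        print(length, band)
```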

When the gel is viewed under the appropriate lighting, the bands will fluoresce. Because the special nucleotides are color-coded, simply reading the colors in order from the shortest fragment to the longest gives the nucleotide sequence of the DNA (Figure 7).
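Reading the gel amounts to translating a list of colors back into letters. In the sketch below, green for adenine and yellow for thymine follow the example colors given earlier; blue for cytosine and red for guanine are assumptions added only to complete the table. Each fragment is represented by the bases of the newly built strand, so its last letter is the color-coded terminator.

```python
# A and T colors follow the text's example; C and G colors are assumed.
COLOR_OF = {"A": "green", "T": "yellow", "C": "blue", "G": "red"}
BASE_OF = {color: base for base, color in COLOR_OF.items()}


def band_colors(fragments):
    """Return the color of each band, shortest fragment first.

    The color of a band is set by the terminating (last) nucleotide of
    the fragments in it.
    """
    color_by_length = {}
    for fragment in fragments:
        color_by_length[len(fragment)] = COLOR_OF[fragment[-1]]
    return [color_by_length[n] for n in sorted(color_by_length)]


def read_sequence(colors):
    """Translate a list of band colors into a nucleotide sequence."""
    return "".join(BASE_OF[color] for color in colors)


if __name__ == "__main__":
    fragments = ["G", "GG", "GGC", "GGCA", "GGCAT", "GG"]
    colors = band_colors(fragments)
    print(colors)
    print(read_sequence(colors))  # GGCAT
```

Note that the letters read this way are those of the newly synthesized strand, which is the complement of the single strand that served as the template.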

This type of sequencing procedure is now highly automated. There are specialized sequencing machines that are essentially computerized robots that perform the automated tasks of timing, heating and cooling, and pipetting mixtures. The newer models also contain a special capillary tube that permits the electrophoresis to be done automatically. Laser lighting and optical scanning allow the colors to be read by a computer. All the data are processed by specialized software that analyzes the sequence, flags areas of uncertainty, and stores the data.
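How software might flag an area of uncertainty can be illustrated with a toy example. The sketch assumes the scanner reports, for each position, a fluorescence intensity for each of the four colors, and it calls a position uncertain when the strongest signal is not at least min_ratio times the runner-up. The data layout, the threshold, and the function name are all hypothetical; actual sequencing software uses more elaborate quality models.

```python
def call_bases(peaks, min_ratio=2.0):
    """Call one base per position and flag ambiguous positions.

    `peaks` is a list of dicts mapping each base to its signal intensity.
    Returns a list of (base, uncertain) pairs; an uncertain position is
    reported as "N", the conventional symbol for an unknown nucleotide.
    """
    calls = []
    for intensities in peaks:
        ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
        (best_base, top), (_, second) = ranked[0], ranked[1]
        # Flag the position if there is no signal or no clear winner.
        uncertain = top == 0 or top < min_ratio * second
        calls.append(("N" if uncertain else best_base, uncertain))
    return calls


if __name__ == "__main__":
    peaks = [
        {"A": 900, "C": 50, "G": 40, "T": 30},   # clean adenine signal
        {"A": 400, "C": 380, "G": 20, "T": 10},  # two colors nearly tied
    ]
    print(call_bases(peaks))  # [('A', False), ('N', True)]
```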