How tall of a stack of paper would we need to print out an entire human genome?

September 2005

The answer depends on many things, such as the choice of font and its point size, the margin width, whether we print capital or lower-case letters, the size of a sheet of paper, and whether we print on both sides or on a single side of each sheet. Diverse claims have been made about the number of sheets required and how tall a stack those sheets would form, so I am providing some accurate numbers here that might be useful in teaching students about the vast size of the human genome.

Using Microsoft Word, with a 12-point Times New Roman TrueType font on 8.5" × 11" (21.59 cm × 27.94 cm) paper with 1" (2.54 cm) margins on all sides, and displaying A, C, G and T in capital letters in equal numbers, I can fit 57 letters on a line and 46 lines on one side of a page, for 57 × 46 = 2,622 base pairs per page (single sided). To give you a feel for it, here is an example of such a page in your choice of two file formats:

PDF format
MS Word format

The human genome consists of approximately 3.1467 billion base pairs (a number just slightly less than the number of seconds in 100 years: 3.15576 billion). At 2,622 base pairs per page, about 1,200,115 single-sided sheets would be needed to display the entire human genome in this way. A modern ream of printer paper consists of 500 sheets and is about 2.1" (5.33 cm) thick, so one would need about 2,401 reams of paper to print the human genome in the format just described. A stack of that many reams would stand about 420 feet (128 m) tall, approximately midway in height between the Statue of Liberty and the Washington Monument:


Illustration modified from http://www.nps.gov/jeff/images/compare.jpg
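
The paper-stack arithmetic above can be reproduced with a short script. This is only a sketch built from the figures quoted in the text (57 letters per line, 46 lines per page, 3.1467 billion base pairs, 500 sheets and 2.1" per ream); the function name `paper_stack` and the choice to round partial sheets and reams up to whole ones are my own assumptions, and the exact totals shift slightly with different inputs or rounding.

```python
# Inputs taken from the text; everything else is derived from them.
LETTERS_PER_LINE = 57
LINES_PER_PAGE = 46
GENOME_BASE_PAIRS = 3_146_700_000   # approximately 3.1467 billion
SHEETS_PER_REAM = 500
REAM_THICKNESS_IN = 2.1             # inches per ream


def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up (a partial sheet or ream still counts)."""
    return -(-a // b)


def paper_stack(base_pairs: int = GENOME_BASE_PAIRS) -> dict:
    """Return sheets, reams and stack height for a single-sided printout."""
    per_page = LETTERS_PER_LINE * LINES_PER_PAGE
    sheets = ceil_div(base_pairs, per_page)
    reams = ceil_div(sheets, SHEETS_PER_REAM)
    height_in = reams * REAM_THICKNESS_IN
    return {
        "base_pairs_per_page": per_page,
        "sheets": sheets,
        "reams": reams,
        "height_ft": height_in / 12,
        "height_m": height_in * 0.0254,
    }


if __name__ == "__main__":
    for name, value in paper_stack().items():
        print(f"{name}: {value:,.2f}" if isinstance(value, float) else f"{name}: {value:,}")
```

Changing any of the constants (for example, a smaller font that fits more letters per line) shows how sensitive the height of the stack is to the formatting choices listed at the start of the article.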

Thus, the human genome is very large. On the other hand, modern digital technology allows us to fit the genome into a fairly small device. If we were to use binary digits to represent letters in the following way...

A 00
C 01
G 10
T 11

...we could then represent four base pairs using only one byte (eight binary digits), and the entire genome would use approximately 787 million bytes, which is about 12% more than fits on a single, typical compact disc (CD). A DVD of the same physical size as a CD holds about 6 times as much data and could therefore hold more than 5 human genomes. The 1 GB flash drive that I carry in my pocket can hold about 1.25 human genomes' worth of data.
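
To make the two-bits-per-letter idea concrete, here is a minimal sketch of packing a sequence into bytes using the code table above (A=00, C=01, G=10, T=11). The function names `pack`, `unpack` and `packed_size_bytes` are illustrative, and padding the last partial byte with zero bits is an assumption of this sketch, not something the scheme above specifies.

```python
# Two-bit code table from the text.
CODES = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i is the letter whose code is i


def pack(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four letters per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        # Assumption: pad a final partial group with zero bits on the right.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Recover the first n_bases letters from packed bytes."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters[:n_bases])


def packed_size_bytes(n_bases: int) -> int:
    """Bytes needed to store n_bases letters at two bits each."""
    return -(-n_bases // 4)  # integer division rounded up


if __name__ == "__main__":
    packed = pack("GATTACA")
    print(packed.hex(), unpack(packed, 7))
    print(f"{packed_size_bytes(3_146_700_000):,} bytes for the whole genome")
```

Because the padding makes the last byte ambiguous, the number of bases has to be stored alongside the packed data; that is why `unpack` takes `n_bases` as an argument.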

Of course, nature is still cleverer than we humans: every cell of our bodies holds an entire human genome (really four copies, since the maternally inherited and paternally inherited genomes are both double-stranded), and much more, within its nucleus, in a space so small that we need a microscope just to see it. Each base pair in DNA (in the B form), whether AT or GC, extends the length of the DNA by about 3.4 ångströms (0.34 nm or 3.4 × 10⁻¹⁰ m). This means that 3.1467 billion base pairs of B-DNA would be about 1.07 meters long, or about 3 feet 6 1/8 inches. Thus, DNA is a very long and skinny molecule that is wound up extremely tightly ("supercoiled") in our chromosomes.
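
The length estimate is a single multiplication, sketched below with the two figures from the paragraph above (3.4 ångströms per base pair and 3.1467 billion base pairs). The helper names `dna_length_m` and `to_feet_inches` are illustrative only.

```python
RISE_PER_BASE_PAIR_M = 3.4e-10      # 3.4 angstroms, B-form DNA (from the text)
GENOME_BASE_PAIRS = 3_146_700_000   # approximately 3.1467 billion
METERS_PER_INCH = 0.0254


def dna_length_m(base_pairs: int = GENOME_BASE_PAIRS) -> float:
    """End-to-end length in meters of a stretch of B-DNA."""
    return base_pairs * RISE_PER_BASE_PAIR_M


def to_feet_inches(meters: float) -> tuple:
    """Convert meters to (whole feet, remaining inches)."""
    total_inches = meters / METERS_PER_INCH
    feet = int(total_inches // 12)
    return feet, total_inches - 12 * feet


if __name__ == "__main__":
    length = dna_length_m()
    feet, inches = to_feet_inches(length)
    print(f"{length:.3f} m, or {feet} ft {inches:.2f} in")
```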

published on the web on October 15, 2005 by

Michael B. Miller, Ph.D.
Assistant Professor
Division of Epidemiology and Community Health
and Institute of Human Genetics
University of Minnesota
