A rough map of the human genome

Researchers now have a rough map of the human genome — the genetic blueprint for a human being.

Ninety-five percent of the human genome is called “junk DNA” because scientists don’t know why it’s there. The five percent that scientists would most like to understand is scattered throughout the junk. Geneticists such as Lisa Stubbs, at Lawrence Livermore National Laboratory in California, are using the mouse genome, completed last December, to sort through the human genome.

Lisa Stubbs: It turns out mice and humans have virtually all of the same genes. They’re not identical, they’ve changed a lot over time, but they’ve only changed in very constrained ways . . .

If genes critical for life mutated too much, the animals wouldn’t survive to pass on their genes. Junk DNA, by contrast, is free to change without consequences, so the junk DNA of a mouse and that of a human no longer look alike. That has proved handy for DNA research.

Lisa Stubbs: . . . if you take a piece of human DNA sequence and a related piece of mouse DNA sequence, and you line them up . . . the similar sequences shine out at you as a very small percentage of the genome . . .

This process of comparison has proven essential to spotting the hardest-to-find, most important human genes.

Genetic information is stored in DNA. You can think of DNA as a long ladder. At each rung, there is a pair of letters. But DNA is extremely small. If you could lay a piece of DNA on its side and zoom in on it so that you could see the individual letters, you could start at one end and “read” the genetic information contained in it. There are only four letters in the DNA alphabet. So an example of what a bit of DNA might say is: GACCTAAGCGGTATT … and so on …
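To make the ladder picture concrete, here is a minimal Python sketch that treats the example string above as one side of the ladder and works out the letters on the opposite side. It relies on the standard base-pairing rule (A pairs with T, C pairs with G), which the text does not spell out. The function name `partner_strand` is hypothetical.

```python
# Standard base-pairing rule: each rung joins A with T, or C with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}

def partner_strand(strand):
    """Return the letter sitting opposite each letter of `strand`, rung by rung."""
    return "".join(PAIRING[letter] for letter in strand)

example = "GACCTAAGCGGTATT"  # the sample string from the text
print(example)
print(partner_strand(example))
```

Because each letter determines its partner, reading one side of the ladder is enough to know the other.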

In the case of the human genome, there are billions of letters (about three billion). So it took many years just to “read” the string of letters in a typical human. Now, an even bigger task is figuring out what it says.

The DNA building blocks (nucleotides) that make up the human genome are symbolized by the first letters of their chemical names: A, T, C and G. Studying the human genome map means trying to find meaningful patterns within a long chain built from these four letters. The “junk DNA” is (apparently) merely filler, although some of it is made of units scientists understand, such as repetitive, self-replicating elements (related to viruses).

Other junk is completely unrecognizable.

The important regulatory sequences hidden in the remaining five percent are essential to our understanding of how cells and genes communicate and keep our bodies functioning.

The (approximately) five percent of the code that scientists would most like to understand is distributed throughout the junk, and computational tools are used to extract the gems from it. Existing programs have done a pretty good job of finding certain protein-coding stretches, but comparing human DNA with mouse (and other species) sequences has proven essential to spotting the hardest-to-find, most important genes. These are the regulatory sequences that instruct protein-coding genes to turn on or off in particular cells at particular times. They act as switches, and they make up a significant fraction of our DNA.
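The comparative approach can be illustrated with a toy sketch in Python. It assumes the two sequences have already been lined up position by position and are the same length (real comparisons need an alignment step that handles insertions and deletions), and it simply reports windows where the fraction of matching letters is high. The sequences, the window size, the threshold and the function names are all invented for illustration; this is not the software the researchers used.

```python
def window_identity(seq_a, seq_b, start, size):
    """Fraction of positions in one window where the two sequences match."""
    matches = sum(1 for i in range(start, start + size) if seq_a[i] == seq_b[i])
    return matches / size

def conserved_windows(seq_a, seq_b, size=8, threshold=0.9):
    """Return start positions of windows whose identity meets the threshold."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this sketch expects pre-aligned, equal-length sequences")
    return [
        start
        for start in range(len(seq_a) - size + 1)
        if window_identity(seq_a, seq_b, start, size) >= threshold
    ]

# Hypothetical, made-up sequences: the middle stretch is shared, the flanks differ.
human = "ATGCATGC" + "GACCTAAGCGGT" + "TTAACCGG"
mouse = "CGTACGAT" + "GACCTAAGCGGT" + "GGCCTTAA"

print(conserved_windows(human, mouse))  # window starts inside the shared stretch
```

Only the windows that fall entirely inside the shared middle stretch pass the threshold, which is the sense in which similar sequences "shine out" against a background that has drifted apart.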

Every cell in the human body has some of our approximately 40,000 genes turned on, while the rest are silent. (For example, a keratin gene involved in making hair shouldn’t be turned on in every cell all the time.) Each cell has its own on/off pattern; in effect, every cell “chooses” a set of genes. A skin cell might choose 15,000 genes because it recognizes the controlling sequences of those genes. These controlling sequences are extremely important, but scientists barely understand them. Locating them is the first step.

Over time DNA accumulates mutations. Every time a cell replicates, a few mistakes are made when the DNA is copied. Mutations can also come from exposure to outside elements (e.g. germs). So over evolutionary time quite a few mutations accumulate. Such mutations are passed on to future generations as changes in the genes. This makes us different from each other, and can cause genetic improvements. But mutations within important sequences can cripple a gene. Diseases like cystic fibrosis can result from a single change. People who inherit such mutations are less likely to survive and contribute to the gene pool, whereas junk DNA can accumulate all sorts of changes with no consequences. The important sequences are therefore described as “protected”: damaging changes to them tend not to be passed on. The unprotected junk has been passed on and altered for so long that there is now almost no similarity left between human junk and rodent junk.
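The argument in this paragraph can be played out in a small simulation. The Python sketch below is a deliberately crude model, not a description of real population genetics: it assumes one random letter change per generation, and it assumes any change that lands in the "important" stretch is never passed on. The sequence, the region boundaries, the number of generations and the function names are all made up for illustration.

```python
import random

LETTERS = "ACGT"

def evolve(ancestor, protected, generations, rng):
    """Copy `ancestor` for many generations, one random letter change each time.

    Changes inside the `protected` range are discarded, standing in for
    offspring that do not survive to pass the change on.
    """
    seq = list(ancestor)
    for _ in range(generations):
        pos = rng.randrange(len(seq))
        if pos in protected:
            continue  # harmful change: not inherited
        seq[pos] = rng.choice([b for b in LETTERS if b != seq[pos]])
    return "".join(seq)

def identity(a, b, positions):
    """Fraction of the given positions where two sequences still match."""
    positions = list(positions)
    return sum(a[i] == b[i] for i in positions) / len(positions)

rng = random.Random(0)  # fixed seed so the run is repeatable
ancestor = "".join(rng.choice(LETTERS) for _ in range(300))
protected = range(100, 130)  # hypothetical important stretch
junk = [i for i in range(300) if i not in protected]

# Two lineages descend independently from the same ancestor.
lineage_a = evolve(ancestor, protected, 2000, rng)
lineage_b = evolve(ancestor, protected, 2000, rng)

print("important stretch identity:", identity(lineage_a, lineage_b, protected))
print("junk identity:", identity(lineage_a, lineage_b, junk))
```

In this toy model the important stretch stays identical in both lineages by construction, while the junk drifts toward the level of agreement expected by chance with a four-letter alphabet (roughly one position in four).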

Researchers suspected the great similarities between mouse and human genes as far back as ten years ago, but they’ve only found proof within the last couple of years. There is, however, a very small percentage of rodent genes that are quite different from human genes. Researchers want to be sure not to accidentally overlook such genes by mixing them in with the differentiated junk. That’s one of the reasons they’re working on mapping the genomes of more and more species: the more samples available for comparative purposes, the fewer missed genes. Scientists are currently mapping not only the mouse genome, but the cow, the chimpanzee and others, hoping to identify more of these critical regulatory genes in humans.

Scientists hope that understanding the important sequences that mice and humans do not share will lead to a greater understanding of species-specific functions. For example, mice can digest drugs and toxins better than humans.
