All the most important single-gene inherited illnesses were tracked down within a few years. Huntington's Disease leads to a degeneration of the nervous system and death in middle age. It was once called Huntington's Chorea (a word with the same root as choreography) after the involuntary dancing movements of those afflicted. An eighteenth-century Harvard professor claimed that those with the disease were blasphemers as their gestures were imitations of the movements of Christ on the Cross and some sufferers were burned. It is a dominant, but with a nasty twist: because of the late onset of symptoms, those at risk are left in uncertainty about their predicament. In 1983 came a breakthrough helped by great good luck. Soon after the search started, the approximate site of the Huntington's gene was found by following its association with a linked DNA variant some distance away on the same chromosome. Then, luck ran out, and it took ten years to find the gene. It has now been tracked to the tip of chromosome 4. The shape of the protein which has gone wrong (huntingtin, as it is with some lack of imagination called) has been worked out to give, for the first time, some insight into the nature of the disease, which involves nerve cells in effect committing suicide when the aberrant protein (which looks like nothing else in the cell) instructs them to do so. Many more damaged genes soon fell victim to the genetic explorers and were pinned onto the map.
Type in the four letters OMIM (Online Mendelian Inheritance in Man) into any search engine and a list of ten thousand inherited diseases at once appears; symptoms, inheritance patterns, and, for nearly all, chromosomal grid reference. From the hunt for inherited illness, the search shifted to a wider set of genes. No longer were diseases needed as a first clue. To look for genes only when they go wrong is like trying to work out the principles of the internal combustion engine from car breakdowns. Now, the machine itself can be dismantled and its mechanism inferred directly.
When a gene makes something, it generates a complementary molecule — a messenger, as it is known — which transfers information from DNA to the main part of the cell. Because it produces nothing, most DNA generates no messengers at all. To find such molecules is hence an excellent way to search out working genes. There are tens of thousands of distinct messengers. What most do is quite unknown. In most cells, most are switched off but in the brain a large proportion are at work at any time. The brain is more active than is any other tissue (which may help to explain why more than a quarter of all inherited diseases lead to mental illness).
The hunt for genes is more like that for Timbuctu than for El Dorado. The mappers soon found that genes are oases of sense in a desert of nonsense. At one time, it seemed scarcely worth sifting the sands between the genetic cities, but, in the end, the complete map was made mainly on the grounds that it was worth while as one never knows what might turn up. It reaffirmed one of the most misunderstood facts in science; that it is possible to solve most problems by throwing money at them.
The assault on the physical map is best compared to surveying a country with a six-inch ruler, starting at one end and driving on to the opposite frontier. Twenty and more years ago, when the job began, one person could do about five thousand DNA bases a year. Now, it is routine to do thousands of times as many. Much of the intellectual effort of the job has moved from the simple accumulation of information to understanding it. Computer wizardry has played as important a part in the gene map as has biochemical machinery.
Once a segment of DNA has been sequenced, the local maps — the town plans — must be put in the right order. One way to build up a larger chart is to make a series of overlapping sequences of short pieces of DNA. The approach is a little like putting pages ripped out of a street guide back together by looking at the overlaps at the edge of each page in an attempt to find streets which run into each other. Sophisticated programs look for superimposed segments, long or short, and reassemble the torn fragments of DNA. That is much harder than it seems. An alphabet of just four letters and — like the map of an American city — many repeats of the same pattern of streets, gives plenty of chances for confusion. There are some short cuts. One trick, useful in the early days, was to jump several pages in the guide in the hope of missing out particularly tedious parts of the neighbourhood but for completion even the dullest parts of town must be charted.
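The street-guide method can be sketched in a few lines of Python. What follows is a toy illustration, not the software the mappers used: the function names, the three-letter minimum overlap and the sample fragments are all invented for the purpose. It repeatedly joins the two fragments whose edges overlap the most, and assumes error-free reads with no fragment buried wholly inside another.

```python
def overlap(left: str, right: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`.

    Overlaps shorter than `min_len` (an arbitrary, assumed threshold) count as 0.
    """
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Merge the best-overlapping pair again and again until no overlaps remain."""
    pieces = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, -1, -1
        # Compare every ordered pair of "pages" and remember the largest overlap.
        for i, left in enumerate(pieces):
            for j, right in enumerate(pieces):
                if i == j:
                    continue
                size = overlap(left, right, min_len)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return pieces  # nothing left to join
        # Glue the two pieces together, writing the shared letters only once.
        merged = pieces[best_i] + pieces[best_j][best_size:]
        pieces = [p for k, p in enumerate(pieces) if k not in (best_i, best_j)]
        pieces.append(merged)


# Three made-up fragments torn from the sequence ATGCGTACCTGA.
print(greedy_assemble(["ATGCGTA", "GTACCT", "CCTGA"]))  # ['ATGCGTACCTGA']
```

With a four-letter alphabet and many repeats, two unrelated fragments can share an edge by chance, and a greedy join of this kind will then stitch the wrong pages together. That is the confusion described above, and the reason real assemblers demand long overlaps and many readings of each stretch.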
New and powerful computers have made it possible, in principle at least, to make a whole genetic atlas at once, rather than piecing it together page by page. The 'random shotgun' approach lives up to its name. It blasts copies of the genome into thousands of segments, again and again, and, like a taxidermist rebuilding a single pheasant from the casual slaughter of many by a blind man with a twelve-bore, reconstitutes the whole thing from scratch. A giant program puts all the shattered pieces together, until at last they look like a map (or a game-bird). That approach worked well in fruit-flies, whose genome was sequenced before that of our own, but flies have a tenth as many DNA letters and far less repetition of easily-confused short sequences than we do. The less audacious 'clone by clone' approach takes tiny fragments (each about a twenty-thousandth of the whole of human DNA) and sequences them one by one. Then, it reassembles short segments of genes and, in time, re-forms the whole atlas. The approach, plodding as it may be, has worked well with humans and was used by the publicly-funded mappers to publish each clone as it appeared and to help thwart the privatised plan to sequence (and patent) the whole of our DNA at one fell swoop.
The physical map does not look at all like the linkage maps which emerged from family studies. The central difficulty is one of scale. A few tens of thousands of functional genes fit into three thousand million DNA letters. As most genes use only the information coded into several thousand bases there seems to be far more DNA than is needed. Mapping shows that just one part in twenty represents part of a gene. Our genome has an extraordinary and quite unexpected structure.
A geographical analogy may help. Imagine the journey along the whole of your own DNA as a trip from Land's End to John o'Groat's via London; about a thousand miles altogether. To fit in all the DNA letters into a road map on this scale, there have to be fifty DNA bases per inch, or about three million per mile. The journey passes through twenty-three counties of different sizes. These administrative divisions, conveniently enough, are the same in number as the twenty-three chromosomes into which human DNA is packaged. With the exception of some short segments a few hundred yards long which, for various technical reasons, have proved recalcitrant, the whole lot has been mapped out with an accuracy of one part in fifty thousand, an inch in a mile (which is as good or better than the maps sold by the Ordnance Survey).
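The arithmetic behind the road map is easy to check. The short Python sketch below uses only the round figures given in the text (three thousand million letters, a thousand miles); the constant and function names are invented for the illustration.

```python
GENOME_BASES = 3_000_000_000  # "three thousand million DNA letters"
JOURNEY_MILES = 1_000         # Land's End to John o'Groat's via London
INCHES_PER_MILE = 63_360      # 1,760 yards x 36 inches


def bases_per_mile() -> float:
    """DNA letters that must be packed into each mile of the road map."""
    return GENOME_BASES / JOURNEY_MILES


def bases_per_inch() -> float:
    """DNA letters per inch on the same scale."""
    return bases_per_mile() / INCHES_PER_MILE


print(f"{bases_per_mile():,.0f} bases per mile")  # 3,000,000 bases per mile
print(f"{bases_per_inch():.1f} bases per inch")   # 47.3 bases per inch
```

The figure per inch comes out a little under fifty, which is the rounded value used in the analogy.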
The scenery for most of the trip is tedious. Like much of modern Britain it seems to be unproductive. About a third of the whole distance is covered by repeats of the same message. Fifty miles, more or less, is filled with words of five, six or more letters, repeated next to each other. Many are palindromes. They read the same backwards as forwards, like the obituary of Ferdinand de Lesseps: 'A man, a plan, a canal: Panama!' Some of these 'tandem repeats' are scattered in blocks all over the genome. The position and length of each block varies from person to person. The famous 'genetic fingerprints', the unique inherited signature used in forensic work, depend on variation in the number and position of such segments. Other repeated sequences involve just the two letters, C and A, multiplied thousands of times while yet more are remnants of ancient viruses. Large sections of the genome are given over to long and complicated messages that seem to say nothing.
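To see how a tandem repeat can serve as a signature, here is a small Python sketch that counts back-to-back copies of a short word in a stretch of DNA. The function name, the two sample sequences and the choice of the CA repeat are assumptions made for illustration; real forensic tests work on particular, agreed sites and are far more elaborate.

```python
import re


def tandem_runs(sequence: str, unit: str, min_copies: int = 2) -> list[tuple[int, int]]:
    """Return (start position, number of copies) for each run of `unit`
    repeated next to itself at least `min_copies` times."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    return [(m.start(), len(m.group()) // len(unit))
            for m in pattern.finditer(sequence)]


# Two invented people who differ only in the length of one CA block.
person_a = "GGT" + "CA" * 5 + "TTG"
person_b = "GGT" + "CA" * 8 + "TTG"

print(tandem_runs(person_a, "CA"))  # [(3, 5)]
print(tandem_runs(person_b, "CA"))  # [(3, 8)]
```

The block sits in the same place in both sequences but differs in length, five copies against eight. Differences of that kind, compared across several sites, are what make up a genetic fingerprint.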