Some Assembly Required, Neil Shubin (review) [Long]
A shorter version of this review has appeared on 3 Quarks Daily.
This book will be of interest to anyone who is interested in the way in which evolution actually proceeds, and the insights that we are now gaining into the genome, which controls the process. The author, Neil Shubin, has made major contributions to our understanding, using in turn the traditional methods of palaeontology and comparative anatomy, and the newer methods of molecular biology that have emerged in the last few decades. He is writing about subject matter that he knows intimately, often describing the contributions of scientists that he knows personally. Like Shubin’s earlier writings, the book is a pleasure to read, and I was not surprised to learn here that Shubin was a teaching assistant in Stephen Jay Gould’s lectures on the history of life.
Shubin is among other things Professor of Organismal Biology and Anatomy at the University of Chicago. He first came to the attention of a wider public for the discovery of Tiktaalik, completing the bridge between lungfish and terrestrial tetrapods, and that work is described and placed in context in his earlier book, Your Inner Fish. The present volume is an overview, from his unique perspective, of our understanding of evolutionary change, from Darwin, through detailed palaeontological studies, and into the current era of molecular biology, a transition that, as he reminds us, parallels his own intellectual evolution.
18 March 2020, this just in: Fish finger fossils show the beginnings of hands, https://edition.cnn.com/2020/03/18/world/fish-finger-fossil-scn/index.html
In addition to the underlying science narrative, Shubin time and again builds in interesting and unexpected biographical details about those responsible for the major discoveries that he describes, placing both discovery and discoverer in their scientific and cultural context. These details are not mere embroidery, but an integral part of his exposition. For example, I was aware that Linus Pauling, with Emile Zuckerkandl, was a pioneer in the use of sequence differences as a molecular clock, but did not know how this related to Pauling’s interest in radiation damage to proteins, a topic that brought together his scientific and political concerns.
The exposition is clear enough to be followed by readers without background scientific training, but the range of topics discussed, the choice of illustrative details, and the historical and biographical background are such that I would expect even experts to find much in this book to inform and delight. The endnotes, as well as providing leading references and background material of interest to those who wish to dig deeper, add numerous interesting details worthy of the attention of any reader.
We start with a blackboard sketch from the author’s student days; a single arrow, reflecting what was known at the time, connecting a fish to an amphibian. But how could this possibly happen? There are so many things that have to change. Fins have to become legs, and breathing has to change from gills extracting oxygen from water, to lungs breathing air. But neither of these changes seems possible unless the other one has already taken place. Worse, we can understand that natural selection would refine the structures of legs and lungs, but what use would they be to a fish, and so how did they get started in the first place? To quote (as Shubin quotes) one of the strongest objections offered to Darwin’s theory when first presented, how can we get past “The incompetency of natural selection to account for the incipient stages of useful structures”?
Darwin paid great attention to such criticisms, and even added a whole chapter (“Miscellaneous Objections to the Theory of Natural Selection”) to the sixth edition of On the Origin of Species. His answer to the problem of incipient stages is encapsulated within the last five words of one sentence: “This subject is intimately connected with that of the gradation of the characters, often accompanied by a change in function.” For example, as shown clearly by embryological comparisons in 1895, the lungs of the lungfish and its terrestrial descendants, and the swim bladder used by other fish to control their density and position in the water, start out in exactly the same way, and we now know that their development is controlled by closely related genes. So the lungs first appeared, not in terrestrial vertebrates to which they are essential, but in fish that were already able to absorb oxygen from water through their gills. Likewise, we can regard tetrapod limbs as derived from fins, and as having had the early function of supporting part of the weight of the fish in shallow water.
Shubin quotes an observation made by the playwright Lillian Hellman, “Nothing, of course, begins at the time you think it did.” On inspection, the roots always live far deeper. To which I would add an observation of my own; nothing that is at the time essential can arise by evolution. For if it had been essential, the organism could not have survived to that stage without possessing it. This is in the spirit of Lillian Hellman’s observation, since everything that is now essential must have first arisen at a stage when it was possible to live without it. Your backbone is essential; its Cambrian precursor tissue was not.
The problem of incipient stages is even more obvious in the case of birds, which TH Huxley had seen, as early as the 1860s, as related to dinosaurs and alligators. Archaeopteryx, described too late for the first edition of Origin but briefly mentioned in the fourth, was clearly in some sense intermediate between a modern bird and a reptile, but what kind of reptile? Dinosaurs were back then thought of as lumbering smooth-skinned quadrupeds, very different from birds with their speed, slender frame, and feathered bodies. More recent work led to the discovery of light-boned bipedal dinosaurs, covered in feathers, whose tracks showed a rapid gait. Flight feathers seem to have arisen by progressive modification of a downy covering, which would have evolved towards their present elaborate form as an aid to gliding, while wings of course derived from modification of the forelimbs of bipedal dinosaurs. Bird wings show the signature structure of all tetrapod limbs; one bone, two bones, wrist bones, digits, although the digits are now vestigial. Birds also, I would add, have a highly efficient respiratory system, in which air sacs drive air through the lungs in a single direction. Again, this did not evolve to meet the high metabolic demands of flight, since many relevant features of the anatomy are shared with fossil dinosaurs and even with alligators, and may have been relevant to the success of the dinosaurs in the relatively oxygen-poor atmosphere of the Triassic. As Shubin puts it, the changes required for such transitions as water to land, or the development of flight, must already have been in place, or the transition would not have been possible. The crucial steps do not involve the development of new features, but the repurposing[1] of what is already in place.
Repurposing also occurs during development. The embryo gill arches, which in fish become, as their name suggests, gills and gill slits, develop in tetrapods to become the lower jaw and throat, while the notochord, present in sharks and fish as connective tissue and identifiable as such in the vertebrate embryo, gives rise to the discs between the vertebrae. This is not, as once imagined, recapitulation of earlier evolutionary stages, but the sharing of stages of development. To understand what is happening, we need to understand how development is controlled, and development is controlled, like so much else, by DNA. And it is this DNA that provides the theme of the final two thirds of the book.
Shubin first takes us through some well-travelled territory. Genetic information is held in DNA, there is a lot of it (with humans, roughly 3.2 billion units, each of which can contain any one out of 4 so-called bases, identifying initials A, T, G, C), and most sexual organisms including us have two copies of each stretch of DNA, one from the male parent and one from the female. This full set is present in almost every cell in our body.[2] The most obvious function of DNA is to code for proteins, large molecules constructed by joining together amino acids, and the genetic code specifies which amino acid corresponds to which segment of DNA.[3] Mutations are, at their simplest, copying errors in DNA, and such a change may have major effects on the protein derived from a gene. For example, one single mutation in the relevant gene is enough to cause a change in the shape of the haemoglobin molecule, giving rise to sickle cell anaemia.
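The arithmetic behind the genetic code, and the reach of a single copying error, can be made concrete with a few lines of Python. This is only an illustrative sketch: the three-entry excerpt of the standard codon table and the helper name `point_mutate` are my own choices, and the GAG to GTG change is the textbook sickle cell substitution rather than something spelled out in the book.

```python
from itertools import product

BASES = "ACGT"

# Every three-letter "word" that can be written in a four-letter alphabet.
ALL_TRIPLETS = ["".join(t) for t in product(BASES, repeat=3)]

# A tiny excerpt of the standard genetic code (coding-strand DNA triplets).
# The full table maps all 64 triplets to some 20 amino acids or a stop signal.
CODON_EXCERPT = {
    "GAG": "Glu",  # glutamic acid
    "GAA": "Glu",  # a second triplet coding for the same amino acid
    "GTG": "Val",  # valine
}


def point_mutate(triplet: str, position: int, new_base: str) -> str:
    """Return the triplet with the base at `position` replaced (hypothetical helper)."""
    return triplet[:position] + new_base + triplet[position + 1:]


normal = "GAG"
sickle = point_mutate(normal, 1, "T")  # a single A-to-T copying error

print(len(ALL_TRIPLETS))                       # number of possible triplets
print(normal, "->", CODON_EXCERPT[normal])     # amino acid before the mutation
print(sickle, "->", CODON_EXCERPT[sickle])     # amino acid after the mutation
```

One changed letter out of billions swaps one amino acid for another, which is all it takes to alter the behaviour of the haemoglobin molecule.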
Before DNA sequencing became available, Zuckerkandl and Pauling had realized that accumulated mutations would give rise to differences in the proteins found in different species, and that the degree of difference was a measure of the length of time that had passed since those species shared a common ancestor. By 1965,[4] they had developed this technique to the point where they could validate the method, by showing that the split in ancestry between horses and cattle was more recent than that between hoofed mammals and humans. They were also able to show that two different major portions of the haemoglobin molecule were related to each other by a gene duplication that had already taken place in the common ancestor of all three species.
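The logic of the molecular clock can be sketched in a few lines of Python. The sequences below are invented toy strings, not real haemoglobin data, and the function names are my own; the sketch also assumes the sequences are already aligned and that differences accumulate at a roughly steady rate, which is the simplification on which the method rests.

```python
from itertools import combinations


def differences(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def closest_pair(sequences: dict) -> tuple:
    """Return the pair of names whose sequences differ at the fewest positions."""
    pairs = combinations(sorted(sequences), 2)
    return min(pairs, key=lambda p: differences(sequences[p[0]], sequences[p[1]]))


# Invented toy fragments, for illustration only.
toy = {
    "horse": "VLSAADKTNVKAAWSKVGGH",
    "cow":   "VLSAADKGNVKAAWGKVGGH",
    "human": "VLSPADKTNVKAAWGKVGAH",
}

for a, b in combinations(sorted(toy), 2):
    print(a, b, differences(toy[a], toy[b]))

print(closest_pair(toy))
```

With these made-up strings the two hoofed animals differ from each other at fewer positions than either does from the human, which is the pattern that would place their common ancestor more recently.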
Sequencing proteins is tedious, but gross differences can be detected by how quickly they move through a gel under the influence of an electric field, since this depends on the size and charge of the protein molecule. This technique found no difference in these properties between human and chimpanzee proteins, an initially surprising result that is now explained by the fact that human and chimp DNA are 95% to 98% similar. Indeed, the difference between humans and chimps is less than the difference between some species of fruit fly.
Protein sequencing has long since been superseded by DNA sequencing. Over the past two decades, DNA sequences have been obtained and published for numerous species, and by comparing these sequences we can construct evolutionary family trees that turn out to confirm, sometimes in great detail, what was already known from the fossil record and comparative anatomy. We can also pick out the sequences corresponding to particular proteins, and see how these have changed. As Zuckerkandl and Pauling had already realized in 1965, some kinds of change occur more often than others, depending on whether they affect properties important to protein function. One surprise is that only some 2% of the DNA directly codes for proteins. A lot of it must be junk, since there is no connection between the amount of DNA and the complexity of the organism.[5] One other interesting question also arises. How is it that a brain cell, a liver cell, and a skin cell develop in such extremely different ways, since they all contain exactly the same DNA?
It turns out that genes can be switched on and off depending on the environment that a cell finds itself in. One relatively simple example is the bacterium E. coli, of which you have huge numbers in your gut. This bacterium can digest either one of two different sugars, glucose or lactose, but each sugar requires a different enzyme, and it would be a waste of resources to make what is not required. It turns out that among the proteins present in E. coli are molecular switches, responsive to the presence of these sugars. When one of these sugars is present, the switch turns on the gene that makes the enzyme required to digest it. In other words, genes don’t automatically come into play, but need to be activated. In development, these activators are produced by genes which are themselves responsive to their environment, giving rise to complex networks that cause different development-controlling genes to be turned on at different stages. If that sounds complicated, that’s because it is. The molecular mechanisms of development, and how they themselves evolved, are among the most exciting areas of current research, and Shubin devotes considerable space to describing for us the current state of play.
Left: Millipede (Harpaphe haydeniana); note repeated segmented structure
Some features turn out to be common across the entire animal kingdom. Worms have segments, insects and other arthropods have segments, and at a very fundamental level our backbone is a series of segments. One common theme among segments is the ability to develop more or less similar appendages, such as the extremely similar appendages of millipedes, the very similar (at a deep structural level) limbs of tetrapods, and the remarkably diverse sequences of appendages of some of the creatures that Shubin discusses. The genes controlling development are placed close together, like beads on a string, and using gene-editing techniques, specific genes can be switched off one at a time, making it possible to identify the developmental features that each one controls. These genes in worms, crabs and insects are directly related to the Hox genes of vertebrates, although vertebrates carry four separate sets, each one on a different chromosome. The Hox genes that control the development of separate sections of the spine also control the development of limbs. The one bone – two bones – wrist bones – digits pattern that is almost universal among mammals[6] relies, again, on this same set of genes. Block a particular gene from acting in a developing mouse’s tail, and you will get a tail-less mouse. Stop the same gene from acting in the developing limbs, and you get a mouse with upper and lower legs, but no feet. Fish don’t have feet, but they do have Hox genes, and the same genes that are active in the development of toes are also active in the development of the terminal ends of fishes’ fins. Everywhere, we see the redeployment of similar subroutines.
We are not told (this is my only serious criticism of the book) how it is possible for us to tell what gene is active where, or to block the action of a specific gene, either in general or at some specific location within the body, but the chapter notes contain copious references, and I suspect that the techniques are too complex for brief description to a general audience.
Similar surprises emerge when we examine the structure of the genetic material itself. Long before the advent of DNA sequencing, we could learn a great deal about this simply from the appearance of the chromosomes in which it is embedded, which show patterns of light and dark stripes. As early as 1936, Calvin Bridges was able in this way to relate a mutation in the eye development of fruit flies to duplication of one specific stretch of chromosomal material, or as we would now say, of the flies’ DNA. Simply weighing photographs of chromosomes turns out to be surprisingly informative. The way chromosomes are organized in mammals is highly variable. We have 23 pairs of chromosomes, while the black rhinoceros has 84 chromosomes (42 pairs). However, the total amount of chromosomal material varies relatively little among mammals. It’s very different with salamanders, where superficially similar species can differ in their amounts of DNA by a factor of 10, the more extensive material deriving, as examination of the banding pattern showed, from repeated duplications.
A sperm or ovum normally contains just half as much genetic material as the parental cell (a single copy of the genome, generated from combining sections of the two copies present in the parent). But sometimes this process goes wrong, so that the cell gets more (or less) than it should, and if fertilization follows, that excess or deficit is passed on to the offspring. Down syndrome is a condition that occurs in humans as the result of an extra copy of one particular chromosome. The existence of two separate subunits within the haemoglobin molecule is, as mentioned earlier, the result of a duplication older than the split between primates and hoofed animals. The fourfold repetition of vertebrate Hox genes, also mentioned earlier, is the result of two episodes in each of which the entire genome was duplicated. Colour vision is a much quoted example of the possibilities created by gene duplication. Old world monkeys and their close relatives (including us) have three-colour vision, unlike the two-colour vision of most mammals, because of duplication of the gene for green-sensitive visual pigment. The new copy was then free to pursue its own evolutionary path, and was selected for sensitivity to red. Human and chimpanzee genomes are 95% to 98% similar, and the most obvious difference between humans and chimps, a three-fold increase in the size of the human brain, seems to be directly related to the three-fold replication of one particular control gene.
Sequencing shows that multiple repetition is extremely common. There is one fragment of DNA, around 300 bases long, known as Alu, that occurs so often in the genomes of all primates that in humans it makes up 13% of the total. There are other commonly repeated fragments, with no known function, and two thirds of the entire human genome is made up of these. Two particular fragments, Alu and LINE1, which between them make up a third of the human genome, have the additional property of jumping around from place to place within the genome, which may be enhancing their ability to replicate themselves wherever they land. Such jumping genes were first discovered by Barbara McClintock, in work dating back to the 1940s, although it was decades before its full significance was acknowledged. But replicating huge amounts of useless material must have a metabolic cost, and genomes must have a way of protecting themselves against parasitic duplicators, or else this duplication would be killing species in much the same way that unregulated duplication of cells kills cancer patients.
We can understand major changes in body form, even such major changes as those between fish and land animal, or between running dinosaur and bird, in terms of a sequence of small interlocking adaptations. The appearance of a new kind of cell, such as bone cells, presents a much more difficult problem, since it entails an entire new suite of proteins, so that it looks as if many different things would have had to happen at once.
The placenta, found of course in humans and in almost all mammals other than marsupials, is a striking example. It contains cells of a very specific kind, decidual stromal cells, which are produced from cells known as fibroblasts under the influence of the pregnancy hormone progesterone. These cells play an essential role in protecting the pregnancy from the mother’s own immune system. It turns out that there are several hundred genes that are active in the decidual stromal cells, but inactive (although of course they must be present) in the fibroblasts, and these genes can be recognized by the fact that they contain a sequence responsive to progesterone. That sequence also resembles a sequence with the ability to jump, which explains its presence in so many different locations in the genome. However, one part of the sequence, essential to jumping, has been deleted. With the progesterone-responsive sequences now tethered in place, the genes into which they have been inserted can undergo the normal processes of evolution.
Pregnancy holds other surprises. Proper development of the placenta requires a protein known as syncytin, which regulates transport of materials between the mother and the fetus. From the structure of the protein, we can infer the sequence of the DNA in the gene that regulates its production, and see where else similar genetic material occurs, since this will give information about its evolution. The surprising result is that the only other place that such a sequence occurs is in a virus, where it has a role in promoting the adhesion of host cells, so that the virus can transmit itself from an infected cell to its neighbours.[7] The human version of the gene is recognizably the same as that in all other primates, while rodents have their own rather different version. So we are witnessing two separate cases of a virus being co-opted by the host genome, once in a common ancestor of all primates, and once in a common ancestor of all rodents. A virus-derived gene is also involved in the formation of human memories; I do not know how widely this one is shared throughout the animal kingdom.
How predictable is evolution? More than might be imagined. Parasites with different ancestries end up looking very similar to each other, having retained the same set of organs still essential to their lifestyle, and discarded the others. Salamanders have an extremely elaborate mechanism (Shubin goes into details) for shooting out their sticky tongues at insect targets. The evolution of this process involves complete loss of one muscle, and the re-purposing of gill bones to form a projectile, but this has happened independently at least three times in types of salamander that live far apart from each other and are only distantly related. This poses a problem for biologists; when are similarities a result of common descent, and when are they merely the product of parallel evolution? Are there even, despite the huge diversity of life, general rules that guide evolution?
Some generalisations arise from physical forces. Animals living in cold regions generally have smaller appendages (limbs, ears, snouts, tails) than their relatives in more equable climates, and tend to grow to larger size, all of which makes sense in terms of reducing heat loss in relation to body mass. But are there other propensities built into the evolutionary process itself?
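The physical argument is one of scaling: heat is lost through the surface, while body mass goes with the volume. A rough Python sketch makes the point, under the loudly stated and purely illustrative assumption that an animal can be treated as a sphere; the function name and the radii are my own.

```python
import math


def surface_to_volume(radius_m: float) -> float:
    """Surface area divided by volume for a sphere of the given radius (in metres).

    Algebraically this is 3 / radius, so it falls as the body gets larger.
    """
    area = 4 * math.pi * radius_m ** 2
    volume = (4 / 3) * math.pi * radius_m ** 3
    return area / volume


# Illustrative radii only; no real animal is a sphere.
for radius in (0.05, 0.1, 0.2, 0.4):
    print(f"radius {radius:4.2f} m  surface/volume {surface_to_volume(radius):6.1f} per m")
```

Each doubling of the radius halves the surface available per unit of volume, which is why larger bodies and stubbier appendages both help to conserve heat.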
One of Shubin’s own early research projects involved examining the internal foot structure of a thousand salamanders that had frozen in a cold snap, by using a dye that stuck to cartilage and immersing in glycerin to make the flesh transparent. The variations he found corresponded to changes found during development. Where (as often happens) digits are lost, the first digits to be lost were always the last to develop. Where (another common variant) bones were joined together, they were bones that bud off from each other during development. These variations, occasionally occurring between individuals, also resembled variations found between species. With hindsight this should not surprise us. If similar control genes are involved in the development of the same organ in different species, then similar mutations will give rise to similar evolutionary changes. There is a certain repertoire of possible changes, so we can expect similar changes to occur independently in separate lines of descent, under the influence of similar environmental pressures.
We can see this with lizards on different Caribbean islands, where canopy-living lizards on different islands resemble each other more than they do ground-living lizards on the same island, although DNA sequences show that the canopy and ground lizards on any one island are more closely related to each other than to any lizards on other islands, having arisen from the same small original population of colonists. We can see such parallel evolution on a grand scale when we compare the marsupial mammals that made their way to Australia with the placental mammals that now dominate the other continents. Shubin mentions a marsupial flying squirrel, a marsupial mole, a marsupial ground cat, and among extinct species marsupial lions, wolves, and tigers.
Left: a Flying squirrel. Below, right, a squirrel glider, its marsupial look-alike, but actually a relative of the opossum.
Towards the end of the book, Shubin turns his attention to what he terms “Mergers and acquisitions”. Here the most important work was carried out by Lynn Margulis in the 1960s, although others had made similar suggestions during earlier decades. The kind of cell found in all animals, plants and fungi is quite a complicated structure, with a nucleus containing nearly all the cell’s DNA, and little compartments, organelles, each with its own boundary membrane and its own small chromosome. Moreover, while the chromosomes in the nucleus are packed into bundles, the organelle chromosomes are simple loops, reminiscent in appearance of the chromosomes of bacteria. Margulis’ insight was that this happened because in a very real sense, the organelles were bacteria, which had been engulfed and domesticated. This radical notion was not generally accepted until the deployment of DNA sequencing, when it became clear that nuclear DNA and the DNA in organelles occupy very different positions near the base of the evolutionary tree. The deepest branching in that tree is between two different kinds of single-celled organism, known technically as eubacteria (bacteria for short) and archaea. Our nucleus is descended from archaea, but our mitochondria, in which much of the work of metabolism actually takes place, come from bacteria. Mergers like this have happened more than once in the history of life. Green plants contain two kinds of organelle, mitochondria and the green organelles (chloroplasts) that are responsible for photosynthesis. These green organelles are descended from cyanobacteria, commonly known by the rather misleading name of blue-green algae.
Cyanobacteria are ancient. We can see the isotopic signature of photosynthesis[8] in scraps of carbon within very ancient rocks. The oxygen that they generated would have presented both challenges and opportunities to other ancient lifeforms. The incorporation of mitochondria, which use reactions between oxygen and organic molecules to produce energy, within cells capable of producing many different kinds of protein, is what made complex multicellular life possible.
The individual cells of complex life forms are complex assemblages. The individual organisms are assemblages of such complex cells. But what is the origin of these higher level assemblages, which require a range of control genes, and specialized connective tissues?
Single-celled creatures known as choanoflagellates clump together, and their genome already shows the presence of genes that make collagen and sticky proteins. Here they are used to ward off predators, or to attach food particles. Once again, something absolutely essential at a later stage, the molecular scaffolding of complex organisms, had its origin in something much earlier.
So bodies are assemblages of assemblages, and each component has been selected for its ability to reproduce itself; otherwise it would have gone extinct. At every level, then, there is the need for monitoring and control, or our bodies would rapidly be taken over by cancer cells and our genomes by jumping genes. All these interactions generate the complexities that are the raw material of evolution.
Shubin concludes with one final example. A strange pattern was discovered in salt-loving bacteria. A short palindromic sequence (one that reads the same forwards and backwards) occurred multiple times in the genome, with the space between repetitions filled by sequences resembling those in viruses to which the bacteria had become resistant. The palindrome-space-palindrome assemblage was acting as a scalpel, cutting up invading viral nucleic acid. This system has now been refined into the molecular editing technique known as CRISPR, an extremely powerful tool for DNA editing, already in use in laboratories throughout the world. A system developed by one of the simplest of organisms to protect itself from an even simpler invader is now being used by the most complex organism on Earth to carry out feats of DNA modification that, in nature, would take millions of years. The ultimate in repurposing.
[1] The formal term is exaptation, but I prefer Shubin’s jargon-free replacement.
[2] Red blood cells, which function essentially as passive carriers of haemoglobin, are an exception.
[3] As many readers will know, there are some 20 different amino acids, each of which is coded for by one or more of the 64 possible different DNA triplets.
[4] Evolutionary Divergence and Convergence in Proteins, Emile Zuckerkandl and Linus Pauling, Evolving Genes and Proteins, 1965, Pages 97-166, and earlier papers.
[5] My favourite example (Gregory’s onion test, proposed by T. Ryan Gregory): an onion has five times as much DNA as a human. Gregory, T.R. (2007). The onion test. Genomicron, http://www.genomicron.evolverzone.com/2007/04/onion-test/.
[6] Porpoise embryos develop hind leg buds, but these fail to develop further, and it has recently been shown https://www.nytimes.com/2020/02/08/science/horses-toes-hooves.html that five toes are evident in the developing horse embryo.
[7] This was featured in the Alice Roberts – Aoife McLysaght 2018 Royal Institution Christmas Lectures https://www.rigb.org/christmas-lectures/2018-who-am-i/ .
[8] Carbon has two stable isotopes, carbon-12 and carbon-13 (the short-lived isotope, carbon-14, used in radiocarbon dating, is not relevant here). CO2 containing carbon-12 is slightly more reactive towards photosynthesis than that containing carbon-13, so carbonaceous material derived from photosynthesis is measurably depleted in the heavier isotope.
Disclosure: I have been guilty of piracy, albeit unintentionally. The copy of the book that I have been working from, which was obtained from an online bookseller before publication date, turned out to be an uncorrected “not for sale” proof. This also means that I cannot comment on the index or illustrations, since my own copy lacks these.
Shubin image from University of Chicago website. Tiktaalik image from The Field Museum, Chicago, via Wikipedia, under Creative Commons license. Human embryo image from Gray’s Anatomy (1918) via Wikipedia. Millipede (Harpaphe haydeniana) image by Walter Siegmund via Wikipedia under Creative Commons license. Flying squirrel and squirrel glider images, public domain. Choanoflagellate rosette image by Dzhanette, public domain, via Wikipedia.
Posted on March 16, 2020, in Evolution, Fossil record, Science and tagged endogenous retrovirus, Hox genes, Molecular phylogeny, Neil Shubin, Some Assembly Required.