Thursday, September 27, 2012

Endogenous Retroviruses: Fascinating Evidence for Evolution

Endogenous retroviruses are without a doubt one of the coolest phenomena in all of biology. The human genome is made up of a string of some 3 billion "letters" (A, T, C, or G) written in the alphabet of DNA. However, a significant portion of that string of letters (~8%) is not human in origin. Instead, it is composed of strands of virus DNA (chunks around 10,000 letters long) that have inserted themselves into our genome. How does this happen, and is there anything we can learn from it?

As you may have learned in high school biology, viruses are so effective because they hijack our own cellular machinery to spread themselves. A virus can be as simple as a strand of DNA or RNA (with a protective coat of protein) that contains the instructions for copying and making more of itself. When this virus gets inside one of your cells, your cell can't tell the difference between your genetic material and the viral genetic material. So your cellular machinery starts reading the viral DNA/RNA and follows the instructions. Unfortunately, instead of instructions that tell the cell to make more healthy human proteins, the instructions tell the cell to make more virus! Thousands of brand new viruses then exit that cell, proceed to infect other cells, and begin a chain reaction that can make you ill or worse as more and more cells get infected.

Sometimes the short strand of viral DNA floats around inside the nucleus of a cell on its own, but sometimes it actually inserts itself into the massive length of letters that is human DNA. (Only one specific type of virus does this: retroviruses, of which HIV is the most well-known example.)
Retroviral DNA Insertion

Every once in a while, a combination of three rare events occurs in a retroviral infection:

1) The virus happens to infect a germ cell (either a sperm or an egg)
2) The virus inserts itself into the germ cell DNA without seriously harming the cell or the organism
3) An infected sperm or egg cell happens to be the one that combines with its complementary partner during sexual reproduction, creating a new organism.

When an individual is born from a cell infected with a retrovirus, that genetic material will be embedded in all of his or her cells. It will actually have become part of the individual's genome. (This usually only happens when the virus is no longer infective for one or more reasons. Otherwise the germ cell or offspring organism would probably not survive!)

Finally, one more rare event has to occur for the retrovirus to move from being part of the genome of an individual to being part of the genome of a population:

4) The infected individual has to get lucky enough to pass his or her copy of the now "Endogenous" RetroVirus [ERV] on to descendants, and that copy eventually has to spread throughout the entire population of the species.

Retroviral infections occur all the time. However, it takes an extremely lucky chance for all 4 of the above events to occur together. Even in a population of millions or billions of individuals, it can take thousands of years for a single retrovirus to become endogenous to a species. But guess what? Life has been evolving for billions of years. Our DNA is absolutely riddled with ERVs.

This gives us a great chance to confirm the phylogenetic tree of life that people have been working on for centuries. Previously, people had to guess which animals were more closely related to each other by comparing how similar they look. For example, we had a pretty good idea that chimps are more closely related to gorillas than they are to spider monkeys, because they look more alike. Now, we can prove it with ERVs (we can prove it with lots of other genetic evidence as well, but ERVs are one of the coolest and most straightforward ways).

Imagine that a common ancestor of humans and chimpanzees was infected by a retrovirus that then became endogenous. Over the roughly 6 million years since, humans and chimpanzees diverged, but the short ERV (a strand of DNA about 10,000 "letters" long) is still unmistakable in my genome, your genome, and the genomes of all chimpanzees, in precisely the same location. Without that last detail, the shared location, one could easily argue that chimpanzees don't actually share an ancestor with us:

 "Couldn't chimpanzees and humans both have been separately infected by the same retrovirus?"

Sure, but then the ERV in chimps would not have happened to insert itself between the exact same two letters as the ERV in humans, out of a genome 3 billion letters long. Yet the same location is exactly what we observe.
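
To get a feel for how unlikely that coincidence is, here is a rough back-of-the-envelope sketch in Python. It assumes that a retrovirus is equally likely to insert at any of the roughly 3 billion positions in the genome and that separate infections are independent. Both are simplifications (real insertion sites are not perfectly uniform), and the function name and the example counts are made up for illustration.

```python
import math

# Approximate genome length in "letters", taken from the figure used in this post.
GENOME_SITES = 3_000_000_000


def log10_chance_same_sites(shared_ervs, sites=GENOME_SITES):
    """Return log10 of the chance that `shared_ervs` separate infections in a
    second species each land on the exact site already used in the first.

    Assumption (simplified): every site is equally likely and every
    insertion is independent, so the chance is (1 / sites) ** shared_ervs.
    Working in log10 avoids numbers too small for a float.
    """
    return -shared_ervs * math.log10(sites)


if __name__ == "__main__":
    for count in (1, 10, 100):  # illustrative counts, not measured data
        exponent = log10_chance_same_sites(count)
        print(f"{count} shared ERV(s): about 10^{exponent:.1f}")
```

Under these assumptions a single matching insertion is about a 1-in-3-billion coincidence, and the odds shrink multiplicatively with every additional shared ERV.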

We don't have to be hypothetical about all of this. As it turns out, this exact scenario has happened tens of thousands of times. Most of that 8% of our genome made up of ERVs is shared with chimpanzees. There are of course a few ERVs that were inserted after our divergence from chimps, as would be expected, since that divergence didn't happen too long ago on the geologic time scale.
Young Chimpanzees. Attribution: Delphine Bruyere, CC License
Humans and chimps share many ERVs. Gorillas also share many ERVs with us (see ERV1 in the following diagram), but slightly fewer than chimps share with us, because gorillas diverged from our ancestors before chimps did. Humans and chimps shared the same common ancestors for a couple million more years, providing time for additional retroviral insertions that are now shared by both species (see ERV2 in the following diagram).
Hypothetical ERV Insertion in Great Apes
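
The logic of the diagram can be written as a tiny Python sketch. It uses the rough divergence times mentioned in this post (about 6 million years ago for chimps and about 8 million for gorillas) and assumes an ERV, once fixed in an ancestral population, is inherited by every lineage that splits off afterwards. The dictionary and function names are invented for illustration, and the sketch ignores later loss of an ERV.

```python
# Approximate time (millions of years ago) at which each lineage split from
# the line leading to humans. Rough figures from this post, not precise dates.
SPLIT_FROM_HUMAN_LINE_MYA = {"human": 0, "chimp": 6, "gorilla": 8}


def carriers(insertion_mya):
    """Species expected to carry an ERV that became endogenous
    `insertion_mya` million years ago in the lineage leading to humans.

    A species inherits the ERV only if its lineage split off AFTER the
    insertion, i.e. its split time is more recent than the insertion.
    """
    return {
        species
        for species, split_mya in SPLIT_FROM_HUMAN_LINE_MYA.items()
        if split_mya < insertion_mya
    }


if __name__ == "__main__":
    print(sorted(carriers(10)))  # inserted before the gorilla split (like ERV1)
    print(sorted(carriers(7)))   # after the gorilla split, before the chimp split (like ERV2)
    print(sorted(carriers(3)))   # after the chimp split
```

An insertion older than the gorilla split shows up in all three species, one that falls between the two splits shows up only in humans and chimps, and a recent one is found in humans alone.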

If the theory of evolution were a good theory, it would make testable predictions. A testable prediction of the theory of evolution would look something like this:

All ERVs that chimps share (in the same genetic location) with gorillas will also be shared with humans.

We can make such a claim because if a gorilla and a chimp share an ERV (like ERV1, the red ERV in the diagram above), it would have been acquired more than 8 million years ago by a common ancestor of chimps and gorillas. Since humans share that same common ancestor with gorillas, humans should also carry the same ERV. Of course, we now have the technology to test this prediction, and we find it to be nearly always true. In fact, we can show the same thing with all sorts of organisms! House cats and tigers share tons of ERVs, many of which are not shared by humans, because cats diverged from our lineage so long ago. However, any ERV that house cats share with humans will also be found in tigers!
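
Here is a minimal sketch of how one might check that prediction in Python, treating each species as a set of ERV insertion locations. The data below are toy values invented to mirror the examples in this post ("ERV0" and "catERV" are made-up labels), not real genome data, and the function name is hypothetical.

```python
# Toy data: each species maps to the set of ERV loci found in its genome.
# Labels are illustrative only; real data would be genomic coordinates.
ERVS = {
    "human":     {"ERV0", "ERV1", "ERV2"},
    "chimp":     {"ERV0", "ERV1", "ERV2"},
    "gorilla":   {"ERV0", "ERV1"},
    "house cat": {"ERV0", "catERV"},
    "tiger":     {"ERV0", "catERV"},
}


def prediction_violations(species_a, species_b, expected_carrier, ervs=ERVS):
    """Return ERVs shared by species_a and species_b that are missing from
    expected_carrier.

    expected_carrier should be a species that descends from the same common
    ancestor as the pair, so common descent predicts an empty result.
    """
    shared = ervs[species_a] & ervs[species_b]
    return shared - ervs[expected_carrier]


if __name__ == "__main__":
    # Every ERV shared by chimps and gorillas should also be in humans.
    print(prediction_violations("chimp", "gorilla", "human"))
    # Every ERV shared by house cats and humans should also be in tigers.
    print(prediction_violations("house cat", "human", "tiger"))
```

With the toy data both checks come back empty, which is what common descent predicts. With real genomes the interesting question is how close to empty the result is.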

I find ERVs fascinating not only because of the unique mechanism by which they become a part of us, but also because of the great confirmation they give us of the theory of evolution. There is simply no other explanation for the near-perfect pattern we see: closely related animals share a huge number of ERVs, more distant cousins share fewer, and distant cousins almost never share ERVs in the same location that the closer cousins lack.

This is intended to be a high-level introduction to ERVs. A more thorough discussion would include an analysis of things like insertion locus specificity, excision of ERVs, ERVs that act as promoter regions or have other functions, latent infectivity, etc. At this point I will leave it up to the reader to learn more about that elsewhere.
