This is the fourth (and final) part of the miniseries on the genetics and process of speciation. To start from Part One, click here.
In last week’s post, we looked at how we can use genetic tools to understand and study the process of speciation, and particularly the transition from populations to species along the speciation continuum. Following on from that, the question of “how many species do I have?” can be further examined using genetic data. Sometimes, it’s entirely necessary to look at this question using genetics (and genomics).
Genetic tools to study species: the ‘Barcode of Life’
A classically employed method that uses DNA to detect and determine species is referred to as the ‘Barcode of Life’. This uses a very specific fragment of DNA from the mitochondria of the cell: the cytochrome c oxidase I gene, CO1. The standard barcode region of this gene is 648 base pairs long and is found pretty well universally: this and the fact that CO1 evolves very slowly make it an ideal candidate for easily testing the identity of new species. Additionally, mitochondrial DNA tends to be a bit more resilient than its nuclear counterpart; thus, small or degraded tissue samples can still be sequenced for CO1, making it amenable to wildlife forensics cases. Generally, two sequences will be considered as belonging to different species if they are a certain percentage different from one another.
Despite the apparent benefits of CO1, there are of course a few drawbacks. Most of these revolve around the mitochondrial genome itself. Firstly, because mitochondria are passed on from mother to offspring (and not at all from the father), mitochondrial DNA reflects the genetic history of only one sex of the species. Secondly, the actual cut-off for species using CO1 barcoding is highly contentious and possibly not as universal as previously suggested. Levels of CO1 sequence divergence between species that have previously been determined to be separate (through other means) have varied anywhere from 2% to 12%. The actual translation between CO1 sequence divergence and species identity is not all that clear.
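To make the "percentage different" idea concrete, here is a minimal Python sketch of how the divergence between two aligned barcode sequences might be calculated and compared against a cut-off. The function names, the toy 20-base sequences and the cut-off values are all made up for illustration; real barcoding pipelines use full-length alignments and often model-corrected distances rather than this raw proportion.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def exceeds_cutoff(seq_a: str, seq_b: str, cutoff: float) -> bool:
    """True if the two sequences differ by more than the chosen cut-off.

    The cut-off is an assumption supplied by the user, not a universal value.
    """
    return p_distance(seq_a, seq_b) > cutoff


# Toy sequences (hypothetical, far shorter than a real barcode).
sample_1 = "ACGTACGTACGTACGTACGT"
sample_2 = "ACGTACGTACGTACGTACGA"

divergence = p_distance(sample_1, sample_2)  # 1 difference over 20 sites
print(f"Divergence: {divergence:.1%}")
print("Different at a 2% cut-off: ", exceeds_cutoff(sample_1, sample_2, 0.02))
print("Different at a 12% cut-off:", exceeds_cutoff(sample_1, sample_2, 0.12))
```

The same pair of sequences counts as "different species" at one cut-off and "same species" at another, which is exactly why the choice of threshold is so contentious.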
Gene tree – species tree incongruences
One particularly confounding aspect of defining species based on a single gene, and of using phylogenetic-based methods, is that the history of that gene might not actually be reflective of the history of the species. This can be a little confusing to think about but essentially leads to what we call “gene tree – species tree incongruence”. Different evolutionary events cause different effects on the underlying genetic diversity of a species (or group of species): while these may be predictable from the genetic sequence, different parts of the genome might not be equally affected by exactly the same process.
A classic example of this is hybridisation. If we have two initial species, which then hybridise with one another, we expect our resultant hybrids to be approximately made of 50% Species A DNA and 50% Species B DNA (if this is the first generation of hybrids formed; it gets a little more complicated further down the track). This means that, within the DNA sequence of the hybrid, 50% of it will reflect the history of Species A and the other 50% will reflect the history of Species B, which could differ dramatically. If we randomly sample a single gene in the hybrid, we will have no idea if that gene belongs to the genealogy of Species A or Species B, and thus we might make incorrect inferences about the history of the hybrid species.
There are a number of other processes that could similarly alter our interpretations of evolutionary history based on analysing the genetic make-up of the species. The best way to handle this is simply to sample more genes: this way, the effect of variation of evolutionary history in individual genes is likely to be overpowered by the average over the entire gene pool. We interpret this as a set of individual gene trees contained within a species tree: although one gene might vary from another, the overall picture is clearer when considering all genes together.
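As a rough illustration of why more genes help, the Python sketch below tallies the topologies of a set of gene trees and reports which one is most common. This is a deliberately simplified stand-in: proper species tree methods model how gene trees arise within a species tree rather than just counting votes. The function name and the toy trees are hypothetical, and the sketch assumes each topology has already been written in one consistent form so that identical trees are identical strings.

```python
from collections import Counter


def most_common_topology(gene_trees):
    """Return the most frequent gene tree topology and its share of all genes."""
    if not gene_trees:
        raise ValueError("need at least one gene tree")
    counts = Counter(gene_trees)
    topology, n = counts.most_common(1)[0]
    return topology, n / len(gene_trees)


# Ten hypothetical genes sampled from three species (A, B and C).
gene_trees = (
    ["((A,B),C)"] * 7    # seven genes group A with B
    + ["((A,C),B)"] * 2  # two genes group A with C
    + ["((B,C),A)"]      # one gene groups B with C
)

topology, share = most_common_topology(gene_trees)
print(topology, share)
```

Had we sampled only one of the three discordant genes, we would have drawn the wrong conclusion about which species are most closely related; across all ten, the dominant signal is much easier to see.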
In earlier posts on The G-CAT, I’ve discussed the biogeographical patterns unveiled by my Honours research. Another key component of that paper involved using statistical modelling to determine whether cryptic species were present within the pygmy perches. I didn’t exactly elaborate on that in that section (mostly for simplicity), but this type of analysis is referred to as ‘species delimitation’. To try and simplify complicated analyses, species delimitation methods evaluate possible numbers and combinations of species within a particular dataset and provide a statistical value for which configuration of species is most supported. One program that employs species delimitation is Bayesian Phylogenetics and Phylogeography (BPP): to do this, it uses a plethora of information from the genetics of the individuals within the dataset. This includes how long ago the different populations/species separated; which populations/species are most related to one another; and a pre-set assignment of individuals to candidate species, which caps the number of species that can be found (BPP will try to combine these in estimations, but not split them, due to computational constraints). This all sounds very complex (and to a degree it is), but this allows the program to give you a statistical value for what is a species and what isn’t based on the genetics and statistical modelling.
The end result of a BPP run is usually reported as a species tree (i.e. a phylogenetic tree describing species relationships) and statistical support for the delimitation of species (0-1 for each species). Because of the way the statistical component of BPP works, it has been found to give extremely high support for species identities. This has been criticised, as BPP can, at times, provide high statistical support for genetically isolated lineages (i.e. divergent populations) which are not actually species.
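To give a feel for where a 0-1 support value can come from, here is a small Python sketch that takes a set of sampled species configurations and reports the fraction of samples in which each grouping appears as its own species. This is only an illustration of the general idea of posterior support: it is not BPP's algorithm or its output format, and the function name, population labels and sample counts are invented.

```python
from collections import Counter


def delimitation_support(samples):
    """Fraction of sampled configurations in which each grouping is a species.

    `samples` is a list of configurations; each configuration is a collection
    of groups, and each group is a collection of population names.
    """
    if not samples:
        raise ValueError("need at least one sampled configuration")
    counts = Counter()
    for configuration in samples:
        # Use a set so a grouping is counted at most once per sample.
        for group in {frozenset(g) for g in configuration}:
            counts[group] += 1
    return {group: n / len(samples) for group, n in counts.items()}


# Ten hypothetical samples for three candidate populations.
samples = (
    [[{"P1"}, {"P2"}, {"P3"}]] * 8  # all three kept separate
    + [[{"P1", "P2"}, {"P3"}]] * 2  # P1 and P2 merged into one species
)

support = delimitation_support(samples)
for group, value in sorted(support.items(), key=lambda item: sorted(item[0])):
    print(sorted(group), value)
```

In this toy case P3 is supported as a species in every sample, while P1 and P2 are each supported as separate species in 80% of samples. Note that high support here only says the lineages are consistently distinct in the model, not that they meet any other definition of a species.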
Improving species identities with integrative taxonomy
Due to this particular drawback, and the often complex nature of species identity, using solely genetic information such as species delimitation to define species is extremely rare. Instead, we use a combination of different analytical techniques, which can include genetic-based evaluations, to more robustly assign and describe species. In my own paper example, we suggested that up to three ‘species’ of N. vittata that were determined as cryptic species by BPP could potentially exist, pending further analyses. We did not describe or name any of the species, as this would require a deeper delve into the exact nature and identity of these species.
As genetic data and analytical techniques improve into the future, it seems likely that our ability to detect and determine species boundaries will also improve. However, the additional support provided by alternative aspects such as ecology, behaviour and morphology will undoubtedly be useful in the progress of taxonomy.
Given the strong influence of genetic identity on the process and outcomes of the speciation process, it seems a natural connection to use genetic information to study speciation and species identities. There is a plethora of genetics-based tools we can use to investigate how speciation occurs (both the evolutionary processes and the external influences that drive it). One clear way to test whether two populations of a particular species are actually two different species is to investigate genes related to reproductive isolation: if the genetic differences demonstrate reproductive incompatibilities across the two populations, then there is strong evidence that they are separate species (at least under the Biological Species Concept; see Part One for why!). But this type of analysis requires several tools: 1) knowledge of the specific genes related to reproduction (e.g. formation of sperm and eggs, genital morphology, etc.), 2) the complete and annotated genome of the species (to be able to find and analyse the right genes properly) and 3) a good amount of data for the populations in question. As you can imagine, for people working on non-model species (i.e. ones that haven’t had the same history and detail of research as, say, humans and mice), this can be problematic. So, instead, we can use other genetic information to investigate and suggest patterns and processes related to the formation of new species.
Is reproductive isolation naturally selected for or just a consequence?
A fundamental aspect of studies of speciation is a “chicken or the egg”-type paradigm: does natural selection directly select for rapid reproductive isolation, preventing interbreeding; or does reproductive isolation arise as a secondary consequence of general adaptive differences accumulated over a long history of evolution? This might be a confusing distinction, so we’ll dive into it a little more.
The reproductive incompatibility of two populations (thus making them species) is often intrinsically linked to the genetic make-up of those two species. Some conflicts in the genetics of Population 1 and Population 2 may mean that a hybrid having half Population 1 genes and half Population 2 genes will have serious fitness problems (such as sterility or developmental problems). Dramatic genetic differences, particularly a difference in the number of chromosomes between the two parents, are a significant component of reproductive isolation and are usually to blame for sterile hybrids such as zorses and mules.
We can study the process of speciation in the natural world without focussing on the ‘reproductive isolation’ element of species identity as well. For many species, we are unlikely to have the detail (such as an annotated genome and known functions of genes related to reproduction) required to study speciation at this level in any case. Instead, we might choose to focus on the different factors that are currently influencing the process of speciation, such as how the environmental, demographic or adaptive contexts of populations play a role in the formation of new species. Many of these questions fall within the domain of phylogeography; particularly, how the historical environment has shaped the diversity of populations and species today.
Although these can help answer some questions related to speciation, new tools are constantly needed to provide a clearer picture of the process. Understanding how and why new species are formed is a critical aspect of understanding the world’s biodiversity. How can we predict if a population will speciate at some point? What environmental factors are most important for driving the formation of new species? How stable are species identities, really? These questions (and many more) remain elusive for a wide variety of life on Earth.
This is Part 1 of a four-part miniseries on the process of speciation: how we get new species, how we can see this in action, and the end results of the process. This week, we’ll start with a seemingly obvious question: what is a species?
The definition of a ‘species’
‘Species’ are a human definition of the diversity of life. When we talk about the diversity of life, and the myriad of creatures and plants on Earth, we often talk about species diversity. This might seem glaringly obvious, but there’s one key issue: what is a species, anyway? While we might like to think of them as discrete and obvious groups (a dog is definitely not the same species as a cat, for example), the concept of a singular “species” is actually the result of human categorisation.
In reality, the diversity of life is spread across a huge spectrum of differentiation: from things which are closely related but still different to us (like chimps), to more different again (other mammals), to hardly relatable at all (bacteria and plants). So, what is the cut-off for calling something a species, and not a different genus, family, or kingdom? Or alternatively, at what point do we call a specific sub-group of a species a sub-species, or another species entirely?
This might seem like a simple question: we look at two things, and they look different, so they must be different species, right? Well, of course, nature is never simple, and the line between “different” and “not different” is very blurry. Here’s an example: consider that you knew nothing about the history, behaviour or genetics of dogs. If you simply looked at all the different breeds of dogs on Earth, you might suggest that there are hundreds of species of domestic dogs. That seems a little excessive though, right? In fact, the domestic dog, Eurasian wolf, and the Australian dingo are all the same species (but different subspecies, along with about 38 others…but that’s another issue altogether).
One widely used definition is the Biological Species Concept, under which two groups belong to the same species if they can interbreed and produce fertile offspring. For example, a horse and zebra can breed to produce a zorse; however, zorses are fundamentally infertile (due to the different number of chromosomes between a horse and a zebra) and thus a horse is a different species to a zebra. In contrast, a German Shepherd and a chihuahua can breed and make a hybrid mutt, so they are the same species.
To try and account for the issues with the Biological Species Concept (BSC), taxonomists try to push for the usage of “integrative taxonomy”. This means that species should be defined by multiple different agreeing concepts, such as reproductive isolation, genetic differentiation, behavioural differences, and/or ecological traits. The more traits that can separate the two, the greater support there is for the species to be separated: if they disagree, then more information is needed to determine whether or not they should be called different species. Debates about taxonomy are ongoing and are likely going to be relevant for years to come, but form critical components of understanding biodiversity, patterns of evolution, and creating effective conservation legislation to protect endangered or threatened species (for whichever groups we decide are species).
As regular readers of The G-CAT are likely aware, my first ever scientific paper was published this week. The paper is largely the results of my Honours research (with some extra analysis tacked on) on the phylogenomics (the same as phylogenetics, but with genomic data) and biogeographic history of a group of small, endemic freshwater fishes known as the pygmy perch. There are a number of different messages in the paper related to biogeography, taxonomy and conservation, and I am really quite proud of the work.
To my honest surprise, the paper has received a decent amount of media attention following its release. Nearly all of this has focused on the biogeographic results and interpretations of the paper, which is arguably the largest component of the paper. In these media releases, the articles are often opened with “…despite the odds, new research has shown how a tiny fish managed to find its way across the arid Australian continent – more than once.” So how did they manage it? These are tiny fish, and there’s a very large desert area right in the middle of Australia, so how did they make it all the way across? And more than once?!
The Great (southern) Southern Land
To understand the results, we first have to take a look at the context for the research question. There are seven officially named species of pygmy perches (‘named’ is an important characteristic here…but we’ll go into the details of that in another post), which are found in the temperate parts of Australia. Of these, three are found within southwest Western Australia, in Australia’s only globally recognised biodiversity hotspot, and the remaining four are found throughout eastern Australia (ranging from eastern South Australia to Tasmania and up to lower Queensland). These two regions are separated by arid desert regions, including the large expanse of the Nullarbor Plain.
As one might expect, the formation of the Nullarbor Plain was a huge barrier for many species, especially those that depend on regularly accessible water for survival. In many species of both plants and animals, we see in their phylogenetic history a clear separation of eastern and western groups around the time the plain formed; once widely distributed species became fragmented by the plain and diverged from one another. We would most certainly expect this to be true of pygmy perch.
This is where the real difference between everything else and pygmy perch happens. For most species, we see only one east and west split in their phylogenetic tree, associated with the Nullarbor Plain; before that, their ancestors were likely distributed across the entire southern continent and were one continuous unit.
Not for pygmy perch, though. Our phylogenetic patterns show that there were multiple splits between eastern and western ancestral pygmy perch. We can see this visually within the phylogenetic tree; some western species of pygmy perches are more closely related, from an evolutionary perspective, to eastern species of pygmy perches than they are to other western species. This could imply a couple of different things: either some species came about by migration from east to west (or vice versa), with this happening at least twice, or two different ancestral pygmy perches were distributed across all of southern Australia and each split east-west at some point in time. These two hypotheses are called “multiple invasion” and “geographic paralogy”, respectively.
So, which is it? We delved deeper into this using a type of analysis called ‘ancestral clade reconstruction’. This tries to estimate the likely distributions of species’ ancestors using different models and statistical analysis. Our results found that the earliest east-west split was due to the fragmentation of a widespread ancestor ~20 million years ago, and that a later migration event, facilitated by changing waterways from the Nullarbor Plain, pushed some eastern pygmy perches to the west to form the second group of western species. We argue for more than one migration across Australia since the initial ancestor of pygmy perches must have expanded from some point (either east or west) to encompass the entirety of southern Australia.
So why do we see this for pygmy perch and no other species? Well, that’s the real mystery; out of all of the aquatic species found in southeast and southwest Australia, pygmy perch are one of the worst at migrating. They’re very picky about habitat, small, and don’t often migrate far unless pushed (by, say, a flood). It is possible that unrecorded extinct species of pygmy perch might help to clarify this a little, but the chances of finding a preserved fish fossil (let alone for a fish less than 8cm in size!) are extremely low. We can really only theorise about how they managed to migrate.
What does this mean for pygmy perches?
Nearly all species of pygmy perch are listed as threatened or worse in conservation legislation; there have been many conservation efforts to try and save the worst-off species from extinction. Pygmy perches provide a unique insight into the history of the Australian climate and may be a key in unlocking some of the mysteries of what our land was like so long ago. Every species is important for conservation and even those small, hard-to-notice creatures that we might forget about play a role in our environmental history.