The roles of aridification and sea level changes in the diversification and persistence of freshwater fish lineages
The process of publishing science is a lengthy one – there are many rounds of revisions, assessments, and review required before a paper can be published. With that, I’m very proud to announce that the first paper from my PhD has recently been published in the journal Molecular Ecology. This paper is a collection of a lot of complex analyses, and addressing some relatively complicated biogeographical questions, so I’ve decided to provide a simplified summary here.
This is partly where the concept of a ‘model’ comes into it: it’s much easier to pick a particular species to study as a target, and use the information from it to apply to other scenarios. Most people would be familiar with the concept based on medical research: the ‘lab rat’ (or mouse). The common house mouse (Mus musculus) and the brown rat (Rattus norvegicus) are some of the most widely used models for understanding the impact of particular biochemical compounds on physiology and are often used as the testing phase of medical developments before human trials.
The idea of using the genetic sequences of living organisms to understand the evolutionary history of species is a concept much repeated on The G-CAT. And it’s a fundamental one in phylogenetics, taxonomy and evolutionary biology. Often, we try to analyse the genetic differences between individuals, populations and species in a tree-like manner, with close tips being similar and more distantly separated branches being more divergent. However, this runs on one very key assumption; that the patterns we observe in our study genes matches the overall patterns of species evolution. But this isn’t always true, and before we can delve into that we have to understand the difference between a ‘gene tree’ and a ‘species tree’.
However, a phylogenetic tree based on a single gene only demonstrates the history of that gene. What we assume in most cases is that the history of that gene matches the history of the species: that branches in the genetic tree mirror when different splits in species occurred throughout history.
The easiest way to conceptualise gene trees and species trees is to think of individual gene trees that are nested within an overarching species tree. In this sense, individual gene trees can vary from one another (substantially, even) but by looking at the overall trends of many genes we can see how the genome of the species have changed over time.
One of the most prolific, but more complicated, ways gene trees can vary from their overarching species tree is due to what we call ‘incomplete lineage sorting’. This is based on the idea that species and the genes that define them are constantly evolving over time, and that because of this different genes are at different stages of divergence between population and species. If we imagine a set of three related populations which have all descended from a single ancestral population, we can start to see how incomplete lineage sorting could occur. Our ancestral population likely has some genetic diversity, containing multiple alleles of the same locus. In a true phylogenetic tree, we would expect these different alleles to ‘sort’ into the different descendent populations, such that one population might have one of the alleles, a second the other, and so on, without them sharing the different alleles between them.
If this separation into new populations has been recent, or if gene flow has occurred between the populations since this event, then we might find that each descendent population has a mixture of the different alleles, and that not enough time has passed to clearly separate the populations. For this to occur, sufficient time for new mutations to occur and genetic drift to push different populations to differently frequent alleles needs to happen: if this is too recent, then it can be hard to accurately distinguish between populations. This can be difficult to interpret (see below figure for a visualisation of this), but there’s a great description of incomplete lineage sorting here.
Hybridisation and horizontal transfer
Another way individual genes may become incongruent with other genes is through another phenomenon we’ve discussed before: hybridisation (or more specifically, introgression). When two individuals from different species breed together to form a ‘hybrid’, they join together what was once two separate gene pools. Thus, the hybrid offspring has (if it’s a first generation hybrid, anyway) 50% of genes from Species A and 50% of genes from Species B. In terms of our phylogenetic analysis, if we picked one gene randomly from the hybrid, we have 50% of picking a gene that reflects the evolutionary history of Species A, and 50% chance of picking a gene that reflects the evolutionary history of Species B. This would change how our outputs look significantly: if we pick a Species A gene, our ‘hybrid’ will look (genetically) very, very similar to Species A. If we pick a Species B gene, our ‘hybrid’ will look like a Species B individual instead. Naturally, this can really stuff up our interpretations of species boundaries, distributions and identities.
This can have a profound impact as paralogous genes are difficult to detect: if there has been a gene duplication early in the evolutionary history of our phylogenetic tree, then many (or all) of our study samples will have two copies of said gene. Since they look similar in sequence, there’s all possibility that we pick Variant 1 in some species and Variant 2 in other species. Being unable to tell them apart, we can have some very weird and abstract results within our tree. Most importantly, different samples with the same duplicated variant will seem similar to one another (e.g. have evolved from a common ancestor more recently) than it will to any sample of the other variant (even if they came from the exact same species)!
Overcoming incongruence with genomics
Although a tricky conundrum in phylogenetics and evolutionary genetics broadly, gene tree incongruence can largely be overcome with using more loci. As the random changes of any one locus has a smaller effect of the larger total set of loci, the general and broad patterns of evolutionary history can become clearer. Indeed, understanding how many loci are affected by what kind of process can itself become informative: large numbers of introgressed loci can indicate whether hybridisation was recent, strong, or biased towards one species over another, for example. As with many things, the genomic era appears poised to address the many analytical issues and complexities of working with genetic data.
There are a massive number of potential traits we could focus on, each of which could have a large number of different (and interacting) impacts on evolution. One that is often considered, and highly relevant for genetic studies, is the influence of dispersal capability.
Dispersal is essentially the process of an organism migrating to a new habitat, to the point of the two being used almost interchangeably. Often, however, we regard dispersal as a migration event that actually has genetic consequences; particularly, if new populations are formed or if organisms move from one population to another. This can differ from straight migration in that animals that migrate might not necessarily breed (and thus pass on genes) into a new region during their migration; thus, evidence of those organisms will not genetically proliferate into the future through offspring.
As these individuals occupy large ranges, localised impacts are unlikely to critically affect their full distribution. Individual organisms that are occupying an unpleasant space can easily move to a more favourable habitat (provided that one exists). Furthermore, with a large population (which is more likely with highly dispersive species), genetic drift is substantially weaker and natural selection (generally) has a higher amount of genetic diversity to work with. This is, of course, assuming that dispersal leads to a large overall population, which might not be the case for species that are critically endangered (such as the cheetah).
A large number of species, however, are likely to occupy a more intermediate range of dispersal ability. These species might be able to migrate to neighbouring populations, or across a large proportion of their geographic range, but individuals from one end of the range are still somewhat isolated from individuals at the other end.
Species with low dispersal capabilities are often at risk of local extinction and are unable to easily recolonise these habitats after the event has ended. Their movement is often restricted to rare environmental events such as flooding that carry individuals long distances despite their physiological limitations. Because of this, low dispersal species are often at greater risk of total extinction and extinction vertices than their higher dispersing counterparts.
Accounting for dispersal in population genetics
Incorporating biological and physiological aspects of our study taxa is important for interpreting the evolutionary context of species. Dispersal ability is but one of many characteristics that can influence the ability of species to respond to selective pressures, and the context in which this natural selection occurs. Thus, understanding all aspects of an organism is important in building the full picture of their evolution and future prospects.