The space for species: how spatial aspects influence speciation

Spatial and temporal factors of speciation

The processes driving genetic differentiation, and the progressive development of populations along the speciation continuum, are complex in nature and influenced by a number of factors. Generally, on The G-CAT we have considered the temporal aspects of these factors: how much time is needed for genetic differentiation, how this might not be consistent across different populations or taxa, and how a history of environmental changes affects the evolution of populations and species. We’ve also touched on the spatial aspects of speciation and genetic differentiation before, but in significantly less detail.

To expand on this, we’re going to look at a few different models of how the spatial distribution of populations influences their divergence, and particularly how these factor into different processes of speciation.

What comes first, ecological or genetic divergence?

One key paradigm in understanding speciation is something of an analogy to the “chicken and the egg” scenario, albeit with ecological vs. genetic divergence. This concept is based on the idea that two aspects are key for determining the formation of new species: genetic differentiation of the populations in question, and ecological (or adaptive) changes that provide new ecological niches for species to inhabit. Without both, we might have new morphotypes or ecotypes of a single species (in the case of ecological divergence without strong genetic divergence) or cryptic species (genetically distinct but ecologically identical species).

The order of these two processes has been debated for some time, and different aspects of species and the environment can influence how (or if) these processes occur.

Different spatial models of speciation

Generally, when we consider the spatial models for speciation we divide these into distinct categories based on the physical distance of populations from one another. Although there is naturally a lot of grey area (as there is with almost everything in biological science), these broad concepts help us to define and determine how speciation is occurring in the wild.

Allopatric speciation

The simplest model is one we have described before called “allopatry”. In allopatry, populations are distributed distantly from one another, so that they are separated and isolated. A common way to imagine this is as islands of populations separated by an ocean of unsuitable habitat.

Allopatric speciation is considered one of the simplest and oldest models of speciation as the process is relatively straightforward. Geographic isolation of populations separates them from one another, meaning that gene flow is completely stopped and each population can evolve independently. Small changes in the genes of each population over time (e.g. due to different natural selection pressures) cause these populations to gradually diverge: eventually, this divergence will reach a point where the two populations would not be compatible (i.e. are reproductively isolated) and thus considered separate species.

Allopatry_example
The standard model of allopatric speciation, following an island model. 1) We start with a single population occupying a single island.  2) A rare dispersal event pushes some individuals onto a new island, forming a second population. Note that this doesn’t happen often enough to allow for consistent gene flow (i.e. the island was only colonised once). 3) Over time, these populations may accumulate independent genetic and ecological changes due to both natural selection and drift, and when they become so different that they are reproductively isolated they can be considered separate species.
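
As a rough illustration of step 3, here is a minimal sketch of how drift alone is expected to pull completely isolated populations apart. It assumes idealised (Wright-Fisher) populations of constant size with no gene flow, mutation or selection, under which the expected differentiation after t generations is approximately F_ST = 1 - (1 - 1/(2N))^t. The function name and the population sizes are made up for illustration.

```python
def expected_fst_in_isolation(pop_size, generations):
    """Expected F_ST between isolated populations under drift alone.

    Assumes idealised diploid populations of constant size `pop_size`
    with no gene flow, mutation or selection (assumptions of this
    sketch, not a claim about any real system).
    """
    return 1 - (1 - 1 / (2 * pop_size)) ** generations


# Hypothetical island populations: the smaller one diverges much faster.
for pop_size in (50, 5000):
    for generations in (10, 100, 1000):
        fst = expected_fst_in_isolation(pop_size, generations)
        print(f"N={pop_size:>5} t={generations:>5} F_ST={fst:.3f}")
```

The point of the sketch is only the shape of the curve: with gene flow switched off entirely, differentiation can only go up with time, and it goes up fastest in small populations.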

Although relatively straightforward, one complex issue of allopatric speciation is providing evidence that the populations couldn’t hybridise if they reconnected, or deciding whether populations should be considered separate species if they can hybridise, but only under forced conditions (i.e. it is highly unlikely that the two ‘species’ would interact outside of experimental conditions).

Parapatric and peripatric speciation

A step closer in bringing populations geographically together in speciation are “parapatry” and “peripatry”. Parapatric populations are often geographically close together but not overlapping: generally, the edges of their distributions touch but do not overlap one another. A good analogy would be to think of countries that share a common border. Parapatry can occur when a species is distributed across a broad area, but some form of narrow barrier cleaves the distribution in two: this can be the case across particular environmental gradients where two extremes are preferred over the middle.

The main difference between parapatry and allopatry is the allowance of a ‘hybrid zone’. This is the region between the two populations which may not be a complete isolating barrier (unlike the space between allopatric populations). The strength of the barrier (and thus the amount of hybridisation and gene flow across the two populations) is often determined by the strength of the selective pressure (e.g. how unfit hybrids are). Parapatry is expected to reduce the rate and likelihood of speciation occurring, as some (even if reduced) gene flow across populations reduces the amount of genetic differentiation between those populations: however, speciation can still occur.

Parapatric speciation across a thermocline.jpg
An example of parapatric species across an environment gradient (in this case, a temperature gradient along the ocean coastline). Left: We have two main species (red and green fish) which are adapted to either hotter or colder temperatures (red and green in the gradient), respectively. A small zone of overlap exists where hybrid fish (yellow) occur due to intermediate temperature. Right: How the temperature varies across the system, forming a steep gradient between hot and cold waters.

Related to this are peripatric populations. This differs from parapatry only slightly in that one population is an original ‘source’ population and the other is a ‘peripheral’ population. This can happen when a new population is founded from the source by a rare dispersal event, generating a new (but isolated) population which may diverge independently of the source. Alternatively, peripatric populations can be formed when the broad, original distribution of the species is reduced during a population contraction, and a remnant piece of the distribution becomes fragmented and ‘left behind’ in the process, isolated from the main body. Speciation can then follow processes similar to allopatric speciation if gene flow is entirely interrupted, or to parapatric speciation if it is significantly reduced but still present.

Peripatric distributions.jpg
The two main ways peripatric species can form. Left: The dispersal method. In this example, there is a central ‘source’ population (orange birds on the main island), which holds most of the distribution. However, occasionally (more frequently than in the allopatric example above) birds can disperse over to the smaller island, forming a (mostly) independent secondary population. If the gene flow between this population and the central population doesn’t overwhelm the divergence between the two populations (due to selection and drift), then a new species (blue birds) can form despite the gene flow. Right: The range contraction method. In this example, we start with a single widespread population (blue lizards) which has a rapid reduction in its range. However, during this contraction one population is separated from the main body (i.e. as a refugium), which may also be a precursor of peripatric speciation.

Sympatric (ecological) speciation

On the other end of the distribution spectrum, the two diverging populations undergoing speciation may actually have completely overlapping distributions. In this case, we refer to these populations as “sympatric”, and the possibility of sympatric speciation has been a highly debated topic in evolutionary biology for some time. One central argument rears its head against the possibility of sympatric speciation, in that if populations are co-occurring but not yet independent species, then gene flow should (theoretically) occur across the populations and prevent divergence.

It is in sympatric speciation that we see the opposite order of ecological and genetic divergence happen. Because of this, the process is often referred to as “ecological speciation”, where individual populations adapt to different niches within the same area, isolating themselves from one another by limiting their occurrence and tolerances. As the two populations are restricted from one another by some kind of ecological constraint, they genetically diverge over time and speciation can occur.

This can be tricky to visualise, so let’s invent an example. Say we have a tropical island, which is occupied by one bird species. This bird prefers to eat the large native fruit of the island, although there is another fruit tree which produces smaller fruits. However, there’s only so much space and eventually there are too many birds for the number of large fruit trees available. So, some birds are pushed to eat the smaller fruit, and adapt to a different diet, changing physiology over time to better acquire their new food and obtain nutrients. This shift in ecological niche causes the two populations to become genetically separated as small-fruit-eating-birds interact more with other small-fruit-eating-birds than large-fruit-eating-birds. Over time, these divergences in genetics and ecology cause the two populations to form reproductively isolated species despite occupying the same island.

Ecological sympatric speciation
A diagram of the ecological speciation example given above. Note that ecological divergence occurs first, with some birds of the original species shifting to the new food source (‘ecological niche’) which then leads to speciation. An important requirement for this is that gene flow is somehow (even if not totally) impeded by the ecological divergence: this could be due to birds preferring to mate exclusively with other birds that share the same food type; different breeding seasons associated with food resources; or other isolating mechanisms.

Although this might sound like a simplified example (and it is, no doubt) of sympatric speciation, it’s a basic summary of how we ended up with so many species of Darwin’s finches (and why they are a great model for the process of evolution by natural selection).

The complexity of speciation

As you can see, the processes and context driving speciation are complex to unravel and many factors play a role in the transition from population to species. Understanding the factors that drive the formation of new species is critical to understanding not just how evolution works, but also in how new diversity is generated and maintained across the globe (and how that might change in the future).

 

What’s the (allele) frequency, Kenneth?

Allele frequency

A number of times before on The G-CAT, we’ve discussed the idea of using the frequency of different genetic variants (alleles) within a particular population or species to test a number of different questions about evolution, ecology and conservation. These are all based on the central notion that certain forces of nature will alter the distribution and frequency of alleles within and across populations, and that these patterns are somewhat predictable in how they change.

One particular distinction we need to make early here is the difference between allele frequency and allele identity. In these analyses, we are often working with the same alleles (i.e. particular variants) across our populations; it’s just that each of these populations may possess these particular alleles in different frequencies. For example, one population may have an allele (let’s call it Allele A) very rarely – maybe only 10% of individuals in that population possess it – but in another population it’s very common and perhaps 80% of individuals have it. This is a different level of differentiation than comparing how different alleles mutate (as in the coalescent) or how these mutations accumulate over time (like in many phylogenetic-based analyses).

Allele freq vs identity figure.jpg
An example of the difference between allele frequency and identity. In this example (and many of the figures that follow in this post), the circles denote different populations, within which there are individuals which possess either an A allele (blue) or a B allele. Left: If we compare Populations 1 and 2, we can see that they both have A and B alleles. However, these alleles vary in their frequency within each population, with an equal balance of A and B in Pop 1 and a much higher frequency of B in Pop 2. Right: However, when we compare Pop 3 and 4, we can see that not only do they vary in frequencies, they vary in the presence of alleles, with one allele in each population but not the other.
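
To make the distinction concrete, here is a small sketch that counts alleles in four made-up populations loosely mirroring the figure. The genotypes, sample sizes and function names are all hypothetical; the only thing the sketch is meant to show is the difference between comparing frequencies and comparing which alleles are present at all.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes such as ("A", "B")."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Hypothetical samples of four diploid individuals per population.
populations = {
    "Pop 1": [("A", "B")] * 4,                # equal balance of A and B
    "Pop 2": [("B", "B")] * 3 + [("A", "B")],  # B much more common
    "Pop 3": [("A", "A")] * 4,                # only A present
    "Pop 4": [("B", "B")] * 4,                # only B present
}
frequencies = {name: allele_frequencies(g) for name, g in populations.items()}


def shared_alleles(name1, name2):
    """Alleles present in both populations (identity, not frequency)."""
    return set(frequencies[name1]) & set(frequencies[name2])


for name, freqs in frequencies.items():
    print(name, freqs)
print("Pop 1 vs Pop 2 share:", shared_alleles("Pop 1", "Pop 2"))
print("Pop 3 vs Pop 4 share:", shared_alleles("Pop 3", "Pop 4"))
```

Pop 1 and Pop 2 differ only in how common each allele is, whereas Pop 3 and Pop 4 share no alleles at all.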

Non-adaptive (neutral) uses

Testing neutral structure

Arguably one of the most standard uses of allele frequency data is the determination of population structure, one which more avid The G-CAT readers will be familiar with. This is based on the idea that populations that are isolated from one another are less likely to share alleles (and thus have similar frequencies of those alleles) than populations that are connected. This is because gene flow across two populations helps to homogenise the frequency of alleles within those populations, by either diluting common alleles or spreading rarer ones (in general). There are a number of programs that use allele frequency data to assess population structure, but one of the most common ones is STRUCTURE.

Gene flow homogeneity figure
An example of how gene flow across populations homogenises allele frequencies. We start with two initial populations (Pop 1 and Pop 2, from above), which have very different allele frequencies. Hybridising individuals across the two populations means some alleles move from Pop 1 and Pop 2 into the hybrid population: which alleles move is random (the smaller circles). Because of this, the resultant hybrid population has an allele frequency somewhere in between the two source populations: think of it like mixing red and blue cordial and getting a purple drink.
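
The cordial analogy can be written down as a very simple recursion. The sketch below assumes two populations that swap the same fraction `m` of their gene pools every generation and ignores drift, selection and mutation entirely; the starting frequencies and migration rate are invented for illustration.

```python
def exchange_migrants(p1, p2, m, generations):
    """Allele frequencies in two populations after symmetric gene flow.

    Each generation a fraction `m` of each population's gene pool is
    replaced by migrants from the other (a deliberately simplified,
    deterministic model with no drift or selection).
    """
    for _ in range(generations):
        p1, p2 = (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1
    return p1, p2


# Hypothetical starting point: the allele is common in one population
# and rare in the other.
for generations in (0, 10, 50):
    p1, p2 = exchange_migrants(0.9, 0.1, m=0.05, generations=generations)
    print(f"after {generations:>2} generations: {p1:.3f} vs {p2:.3f}")
```

Under these assumptions the gap between the two populations shrinks by a factor of (1 - 2m) every generation, while the average frequency across both stays put.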

 

Simple YPP structure figure.jpg
An example of a Structure plot which long-term The G-CAT readers may be familiar with. This is taken from Brauer et al. (2013), where the authors studied the population structure of the Yarra pygmy perch. Each small column represents a single individual, with the colours representing how well the alleles of that individual fit a particular genetic population (each population has one colour). The numbers and broader columns refer to different ‘localities’ (different from populations) where individuals were sourced. This shows clear strong population structure across the 4 main groups, except for in Locality 6 where there is a mixture of Eastern and Merri/Curdies alleles.

Determining genetic bottlenecks and demographic change

Other neutral aspects of population identity and history can be studied using allele frequency data. One big component of understanding population history in particular is determining how the population size has changed over time, and relating this to bottleneck events or expansion periods. Although there are a number of different approaches to this, which span many types of analyses (e.g. also coalescent methods), allele frequency data is particularly suited to determining changes in the recent past (hundreds of generations, as opposed to thousands of generations ago). This is because we expect that, during a bottleneck event, it is statistically more likely for rare alleles (i.e. those with low frequency) in the population to be lost due to strong genetic drift: because of this, the population coming out of the bottleneck event should have an excess of more frequent alleles compared to a non-bottlenecked population. We can determine if this is the case with tests such as the heterozygosity excess, M-ratio or mode shift tests.

Genetic drift and allele freq figure
A diagram of how allele frequencies change in genetic bottlenecks due to genetic drift. Left: Large circles again denote a population (although across different sequential times), with smaller circles denoting which alleles survive into the next generation (indicated by the coloured arrows). We start with an initial ‘large’ population of 8, which is reduced down to 4 and 2 in respective future times. Each time the population contracts, only a select number of alleles (or individuals) ‘survive’: assuming no natural selection is in process, this is totally random from the available gene pool. Right: We can see that over time, the frequencies of alleles A and B shift dramatically, leading to the ‘extinction’ of Allele B due to genetic drift. This is because it is the less frequent allele of the two, and in the smaller population size has much less chance of randomly ‘surviving’ the purge of the genetic bottleneck.
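
To put some rough numbers on why rare alleles are the first to go, here is a minimal sketch of a single-generation bottleneck. It assumes survivors are a purely random draw from the gene pool (no selection), and the allele frequencies, number of survivors and replicate count are all invented for illustration.

```python
import random


def loss_probability(freq, n_survivors):
    """Chance that an allele at frequency `freq` is absent from a random
    sample of `n_survivors` diploid individuals (2 * n_survivors copies)."""
    return (1 - freq) ** (2 * n_survivors)


def simulate_losses(freq, n_survivors, replicates, rng):
    """Count replicate bottlenecks in which the allele is lost by chance."""
    lost = 0
    for _ in range(replicates):
        # Draw each surviving allele copy independently from the gene pool.
        copies = sum(rng.random() < freq for _ in range(2 * n_survivors))
        if copies == 0:
            lost += 1
    return lost


rng = random.Random(42)  # fixed seed so the run is repeatable
for freq in (0.05, 0.5):
    lost = simulate_losses(freq, n_survivors=10, replicates=2000, rng=rng)
    print(f"freq={freq}: lost in {lost / 2000:.1%} of bottlenecks "
          f"(expected {loss_probability(freq, 10):.1%})")
```

This is not one of the formal bottleneck tests mentioned above, just the sampling logic that those tests rely on.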

Adaptive (selective) uses

Testing different types of selection

We’ve also previously discussed how different types of natural selection can alter the distribution of allele frequency within a population. There are a number of different predictions we can make based on the selective force and the overall population. For understanding particular alleles that are under strong selective pressure (i.e. are either strongly adaptive or maladaptive), we often test for alleles which have a frequency that strongly deviates from the ‘neutral’ background pattern of the population. These are called ‘outlier loci’, and the fact that their frequency differs so much from the average across the genome is attributed to natural selection placing strong pressure on either maintaining or removing that allele.
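
As a crude stand-in for the idea (real outlier tests model the neutral background far more carefully), the sketch below flags loci whose differentiation sits more than a chosen number of standard deviations from the genome-wide average. The per-locus F_ST values, the locus names and the threshold of 2 are all hypothetical.

```python
from statistics import mean, stdev


def flag_outlier_loci(fst_by_locus, z_threshold=2.0):
    """Return loci whose F_ST deviates strongly from the background.

    A deliberately naive z-score rule used only for illustration; it is
    not the method of any particular outlier-detection program.
    """
    values = list(fst_by_locus.values())
    background, spread = mean(values), stdev(values)
    return [locus for locus, fst in fst_by_locus.items()
            if abs(fst - background) / spread > z_threshold]


# Hypothetical per-locus differentiation between two populations.
fst_by_locus = {
    "locus_1": 0.05, "locus_2": 0.04, "locus_3": 0.06, "locus_4": 0.05,
    "locus_5": 0.05, "locus_6": 0.04, "locus_7": 0.06, "locus_8": 0.05,
    "locus_9": 0.45, "locus_10": 0.05,
}
print(flag_outlier_loci(fst_by_locus))
```

Only the one locus that stands well clear of the background is flagged; the rest are treated as the ‘neutral’ pattern.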

Other selective tests are based on the idea of correlating the frequency of alleles with a particular selective environmental pressure, such as temperature or precipitation. In this case, we expect that alleles under selection will vary in relation to the environmental variable. For example, if a particular allele confers a selective benefit under hotter temperatures, we would expect that allele to be more common in populations that occur in hotter climates and rarer in populations that occur in colder climates. This is referred to as a ‘genotype-environment association test’ and is a good way to detect polygenic selection (i.e. when alleles at multiple loci contribute to a change in a single phenotypic trait).

Genotype by environment figure.jpg
An example of how the frequency of alleles might vary under natural selection in correlation to the environment. In this example, the blue allele A is adaptive and under positive selection in the more intense environment, and thus increases in frequency at higher values. In contrast, the red allele B is maladaptive in these environments and decreases in frequency. For comparison, the black allele shows how the frequency of a neutral (neither adaptive nor maladaptive) allele doesn’t vary with the environment, as it plays no role in natural selection.
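
The simplest possible version of this logic is a plain correlation between allele frequency and an environmental variable across populations. The sketch below uses a hand-rolled Pearson correlation on invented data for five populations; real genotype-environment association methods also account for neutral population structure, which this sketch does not attempt.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical mean temperature (degrees C) at five sampled populations,
# and the frequency of three alleles in each of those populations.
temperature = [10, 15, 20, 25, 30]
allele_frequency = {
    "adaptive (A)":    [0.1, 0.3, 0.5, 0.7, 0.9],
    "maladaptive (B)": [0.9, 0.7, 0.5, 0.3, 0.1],
    "neutral":         [0.6, 0.4, 0.5, 0.4, 0.6],
}
for name, freqs in allele_frequency.items():
    print(f"{name:<16} r = {pearson(temperature, freqs):+.2f}")
```

In this toy data the adaptive and maladaptive alleles track temperature in opposite directions, while the neutral allele shows no association.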

Taxonomic (species identity) uses

At one end of the spectrum of allele frequencies, we can also test for what we call ‘fixed differences’ between populations. An allele is considered ‘fixed’ if it is the only allele for that locus in the population (i.e. has a frequency of 1), whilst the alternative allele (which may exist in other populations) has a frequency of 0. Expanding on this, ‘fixed differences’ occur when one population has Allele A fixed and another population has Allele B fixed: thus, the two populations have allele frequencies that are as different as possible (for that one locus, anyway).

Fixed differences are sometimes used as a type of diagnostic trait for species. This means that each ‘species’ has genetic variants that are not shared at all with its closest relative species, and that these variants are so strongly under selection that there is no diversity at those loci. Often, fixed differences are considered a level above populations that differ by allelic frequency only as these alleles are considered ‘diagnostic’ for each species.

Fixed differences figure.jpg
An example of the difference between fixed differences and allelic frequency differences. In this example, we have 5 cats from 3 different species, and have sequenced a particular target gene. Within this gene, there are three possible alleles: T, A or G. You’ll quickly notice that one allele is both unique to Species A and present in all cats of that species (i.e. is fixed). This is a fixed difference between Species A and the other two. The other two alleles, however, are present in both Species B and C, and thus are not fixed differences even if they have different frequencies.
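
Checking for fixed differences is simple enough to sketch directly. The function below follows the strict definition given in the text above (each population carries a single allele at the locus, and those alleles differ); the two species, the loci and the sampled alleles are all made up.

```python
def fixed_differences(pop1, pop2):
    """Return loci where each population has one allele and they differ.

    `pop1` and `pop2` map locus names to the alleles sampled at that
    locus (one entry per sampled allele copy).
    """
    fixed = []
    for locus in pop1:
        alleles1, alleles2 = set(pop1[locus]), set(pop2[locus])
        if len(alleles1) == 1 and len(alleles2) == 1 and alleles1 != alleles2:
            fixed.append(locus)
    return fixed


# Hypothetical data: five sampled alleles per locus per species.
species_1 = {
    "locus_1": ["T", "T", "T", "T", "T"],   # fixed for T
    "locus_2": ["A", "A", "G", "A", "A"],   # mostly A
}
species_2 = {
    "locus_1": ["A", "A", "A", "A", "A"],   # fixed for A
    "locus_2": ["G", "G", "A", "G", "G"],   # mostly G
}
print(fixed_differences(species_1, species_2))
```

Here only the first locus counts as a fixed difference; the second differs strongly in frequency, but both alleles are still shared.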

Intrapopulation (relatedness) uses

Allele frequency-based methods are even used in determining relatedness between individuals. While it might seem intuitive to just check whether individuals share the same alleles (and are thus related), it can be hard to tell whether they are genetically similar due to direct inheritance or whether the entire population is just ‘naturally’ similar, especially at a particular locus. This is the distinction between ‘identical-by-descent’, where alleles that are similar across individuals have recently been inherited from a shared ancestor (e.g. a parent or grandparent), and ‘identical-by-state’, where alleles are similar just by chance. The latter doesn’t contribute to or determine relatedness, as all individuals (whether they are directly related or not) within a population may be similar.

To distinguish between the two, we often use the overall frequency of alleles in a population as a basis for determining how likely two individuals share an allele by random chance. If alleles which are relatively rare in the overall population are shared by two individuals, we expect that this similarity is due to family structure rather than population history. By factoring this into our relatedness estimates we can get a more accurate overview of how likely two individuals are to be related using genetic information.
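
A back-of-the-envelope version of this reasoning: how often would two unrelated individuals both carry a given allele purely by chance? The sketch below assumes Hardy-Weinberg proportions and random mating (assumptions added here for the sake of the example), and the two allele frequencies are arbitrary.

```python
def chance_both_carry(freq):
    """Probability that two unrelated diploid individuals both carry at
    least one copy of an allele with population frequency `freq`.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    carrier = 1 - (1 - freq) ** 2  # one individual has at least one copy
    return carrier ** 2            # two independent individuals both do


for freq in (0.01, 0.8):
    print(f"allele frequency {freq}: shared by chance with probability "
          f"{chance_both_carry(freq):.4f}")
```

Sharing a common allele tells us almost nothing, since nearly any pair of individuals would share it, whereas sharing a rare allele is unlikely without recent common ancestry.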

The wild world of allele frequency

Despite appearances, this is just a brief foray into the many applications of allele frequency data in evolution, ecology and conservation studies. There is a plethora of different programs and methods that can utilise this information to address a variety of scientific questions and refine our investigations.

Notes from the Field: Cliff racer

Scientific name

Cinis descendens

Meaning: Cinis: from [ash] in Latin; descendens from [descends] in Latin.

Translation: descending from the ash; describes hunting behaviour in ash mountains of Vvardenfell.

Common name

Cliff racer

cliff racer
A cliff racer hovering above a precipice on Vvardenfell.

Taxonomic status

Kingdom Animalia; Phylum Chordata; Class Aves; Subclass Archaeornithes; Family Vvardidae; Genus Cinis; Species descendens

Conservation status

Least Concern [circa 3E 427]

Threatened [circa 4E 433]

Distribution

Once widespread throughout the north eastern region of Tamriel, occupying regions from the island of Vvardenfell to mainland Morrowind and Solstheim. Despite their name, the cliff racer is found across nearly all geographic regions of Vvardenfell, although the species is found in greatest densities in the rocky interior region of Stonefalls.

Following a purge of the species as part of pest control management, the cliff racer was effectively exterminated from parts of its range, including local extinction on the island of Solstheim. Since the cull the cliff racer is much less abundant throughout its range although still distributed throughout much of Vvardenfell and mainland Morrowind.

Morrowind
The province of Morrowind, which largely contains the distribution of the cliff racer. The island of Solstheim is found to the northwest of the map (the lower half of the island can be seen in brown).

Habitat

Although, much as the name suggests, the cliff racer prefers rocky outcroppings and mountainous regions in which it can build its nest, the species is frequently seen in lowland swamp and plains regions of Morrowind.

Behaviour and ecology

The cliff racer is a highly aggressive ambush predator, using height and range to descend on unsuspecting victims and lashing at them with its long, sharp tail. Although preferring to prey on small rodents and insects (such as kwama), cliff racers have been known to attack much larger beasts such as agouti and guar if provoked or desperate. The highly territorial nature of cliff racers means that they often attack travellers, even if they pose no immediate threat or have done nothing to provoke the animal.

Cliff_Racer_(Online).png
A cliff racer descends upon its prey.

Despite the territoriality of cliff racers, large flocks of them can often be found in the higher altitude regions of Vvardenfell, perhaps facilitated by an abundance of food (reducing competition) or communal breeding grounds. Attempts by researchers to study these aggregations have been limited due to constant attacks and damage to equipment by the flock.

Demography

Prior to the purging of cliff racers in the early 4E by Saint Jiub, the cliff racer was overly abundant throughout its range and considered a pest species by native peoples. Although formal studies on the population structure of the species were never conducted due to their aggressive nature, suppositions of migratory rates, distances and geographies suggested that potentially three major populations (ESUs) existed: one on Solstheim, one on Vvardenfell, and another on mainland Morrowind.

Following the control measures implemented, the size of these cliff racer populations declined severely; however, given the survival of the majority of the population it does not appear this bottleneck has severely impacted the longevity of the species. The extirpation of the Solstheim population of cliff racers likely removed a unique ESU from the species, given the relative isolation of the island. Whether the island will be recolonised in time by Vvardenfell cliff racers is unknown, although the return of any cliff racers to Solstheim would likely be met with strong opposition from the local peoples.

Adaptive traits

The broad wings, dorsal sail and long tail allow the cliff racer to travel large distances in the air, serving them well in hunting behaviour. The drawback of this is that, if hunting during the middle hours of the day, the cliff racer leaves an imposing shadow on the ground and silhouette in the sky, often alerting aware prey to their presence. That said, the speed of descent and disorienting cry of the animal often startles prey long enough for the cliff racer to attack.

The plumes of the cliff racer are a well-sought-after commodity by local peoples, used in the creation of garments and household items. Whether these plumes serve any adaptive purpose (such as sexual selection through mate signalling) is unknown, given the difficulties with studying wild cliff racer behaviour.

Management actions

Although suffering from a strong population bottleneck after the purge, the cliff racer is still relatively abundant across much of its range and maintains a somewhat stable population size. Management and population control of the cliff racer is necessary across the full distribution of the species to prevent strong recovery and maintain public safety and ecosystem balance. Breeding or rescuing cliff racers is strictly forbidden and the species has been widely declared a ‘native pest’, despite the somewhat oxymoronic nature of the phrase.

Moving right along: dispersal and population structure

The impact of species traits on evolution

Although we often focus on the genetic traits of species in molecular ecology studies, the physiological (or phenotypic) traits are equally important in shaping their evolution. These different traits are not only the result themselves of evolutionary forces but may further drive and shape evolution into the future by changing how an organism interacts with the environment.

There are a massive number of potential traits we could focus on, each of which could have a large number of different (and interacting) impacts on evolution. One that is often considered, and highly relevant for genetic studies, is the influence of dispersal capability.

Dispersal

Dispersal is essentially the process of an organism migrating to a new habitat, to the point of the two being used almost interchangeably. Often, however, we regard dispersal as a migration event that actually has genetic consequences; particularly, if new populations are formed or if organisms move from one population to another. This can differ from straight migration in that animals that migrate might not necessarily breed (and thus pass on genes) into a new region during their migration; thus, evidence of those organisms will not genetically proliferate into the future through offspring.

Naturally, the ability of organisms to disperse is highly variable across the tree of life and reliant on a number of other physiological factors. Marine mammals, for example, can disperse extremely far throughout their lifetimes, whereas some very localised species like some insects may not move very far within their lifetime at all. The movement of organisms directly facilitates the movement of genetic material, and thus has significant impacts on the evolution and genetic diversity of species and populations.

Dispersal vs pop structure
The (simplistic) relationship between dispersal capability and one aspect of population genetics, population structure (measured as Fst). As organisms become more capable of dispersing over longer distances (or more frequently), the barriers between populations become weaker.

Highly dispersive species

At one end of the dispersal spectrum, we have highly dispersive species. These can move extremely long distances and thus mix genetic material from a wide range of habitats and places into one mostly-cohesive population. Because of this, highly dispersive species often have strong colonising abilities and can migrate into a range of different habitats by tolerating a wide range of conditions. For example, a single whale might hang around Antarctica for part of the year but move to the tropics during other times. Thus, this single whale must be able to tolerate both ends of the temperature spectrum.

As these individuals occupy large ranges, localised impacts are unlikely to critically affect their full distribution. Individual organisms that are occupying an unpleasant space can easily move to a more favourable habitat (provided that one exists). Furthermore, with a large population (which is more likely with highly dispersive species), genetic drift is substantially weaker and natural selection (generally) has a higher amount of genetic diversity to work with. This is, of course, assuming that dispersal leads to a large overall population, which might not be the case for species that are critically endangered (such as the cheetah).

Highly dispersive animals often fit the “island model” of Wright, where individual subpopulations all have equal proportions of migrants from all other subpopulations. In reality, this is rare (or unreasonable) due to environmental or physiological limitations of species; distance, for example, is not implicitly factored into the basic island model.

Island model
The Wright island model of population structure. In this example, different independent populations are labelled in the bold letters, with dispersal pathways demonstrated by the different arrows. In the island model, dispersal is equally likely between all populations (including between B and D in this example, even though there aren’t any arrows showing it). Naturally, this is not overly realistic and so the island model is used mostly as a neutral, base model.
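
One reason the island model is so popular as a baseline is that it gives a tidy rule of thumb linking dispersal to population structure: at equilibrium, F_ST is approximately 1 / (4Nm + 1), where N is the size of each subpopulation and m is the proportion of migrants per generation. The sketch below simply evaluates that approximation; it assumes many equally sized subpopulations at drift-migration equilibrium, and the values of N and m are invented.

```python
def island_model_fst(pop_size, migration_rate):
    """Approximate equilibrium F_ST under Wright's island model.

    Assumes many equally sized subpopulations exchanging migrants at the
    same rate, at drift-migration equilibrium (textbook simplifications).
    """
    return 1 / (4 * pop_size * migration_rate + 1)


# Hypothetical subpopulations of 100 individuals with increasing dispersal.
for migration_rate in (0.0, 0.0025, 0.01, 0.1):
    migrants = 100 * migration_rate
    fst = island_model_fst(100, migration_rate)
    print(f"{migrants:>5.2f} migrants per generation -> F_ST = {fst:.3f}")
```

Under these assumptions even a handful of migrants per generation is enough to erode most of the structure between populations, which matches the downward trend sketched in the dispersal figure.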

Intermediately dispersing species

A large number of species, however, are likely to occupy a more intermediate range of dispersal ability. These species might be able to migrate to neighbouring populations, or across a large proportion of their geographic range, but individuals from one end of the range are still somewhat isolated from individuals at the other end.

This often leads to some degree of population structure; different portions of the geographic range are genetically segregated from one another depending on how much gene flow (i.e. dispersal) occurs between populations. In the simplest scenario, this can lead to what we call isolation-by-distance. Rather than forming totally independent populations, gene flow occurs across short ranges between adjacent ‘populations’. This causes a gradient of genetic differentiation, with one end of the range being clearly genetically different to the other end, with a gradual slope throughout the range. We see this often in marine invertebrates, for example, which might have somewhat localised dispersal but still occupy a large range by following oceanographic currents.

River IDB network
An example of how an isolation-by-distance population network might come about. In this example, we have a series of populations (the different pie charts) spread throughout a river system (that blue thing). The different pie charts represent how much of the genetics of that population matches one end of the river: either the blue end (left) or red end (right). Populations can easily disperse into adjacent populations (the green arrows) but less so to further populations. This leads to gradual changes across the length of the river, with the far ends of the river clearly genetically distinct from the opposite end but relatively similar to neighbouring populations.
River IDB pop structure.jpg
The genetic representation of the above isolation-by-distance example. Each column represents a single population (in the previous figure, a pie chart), with the different colours also representing the relative genetic identity of that population. As you can see, moving from Population 1 to 10 leads to a gradient (decreasing) in blue genes but increase in red genes. The inverse can be said moving in the opposite direction. That said, comparing Population 1 and Population 10 shows that they’re clearly different, although there is no clear cut-off point across the range of other populations.
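The river example can be mimicked with a very simple 'stepping-stone' sketch, in which each population only receives migrants from its immediate neighbours. This is a deterministic toy model under assumed values (ten populations, a migration rate of 0.1 per neighbour, 50 generations) and it ignores drift and selection entirely; the function name is hypothetical.

```python
def stepping_stone(freqs, migration_rate, generations):
    """Deterministic 1D stepping-stone model: each generation, a population
    receives a fraction `migration_rate` of its gene pool from each
    adjacent neighbour and keeps the remainder."""
    p = list(freqs)
    for _ in range(generations):
        new_p = []
        for i, own in enumerate(p):
            neighbours = [p[j] for j in (i - 1, i + 1) if 0 <= j < len(p)]
            immigrant_share = migration_rate * len(neighbours)
            new_p.append((1 - immigrant_share) * own
                         + migration_rate * sum(neighbours))
        p = new_p
    return p


# Hypothetical river: five fully 'blue' populations, then five fully 'red'.
start = [1.0] * 5 + [0.0] * 5
gradient = stepping_stone(start, migration_rate=0.1, generations=50)
print([round(x, 2) for x in gradient])  # proportion of 'blue' per population
```

The sharp blue/red boundary smooths into a gradual slope, with neighbouring populations similar and the two ends still clearly different. Note that without drift or new mutations this toy gradient would eventually flatten out completely, so it should be read as a snapshot rather than a stable outcome.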

Medium dispersal capabilities are also often a requirement for forming ‘metapopulations’. In this population arrangement, several semi-independent populations are present within the geographic range of the species. Each of these is subject to its own local environmental pressures and demographic dynamics, and because of this may go locally extinct at any given time. However, dispersal connections between many of these populations lead to recolonisation and gene flow, allowing extinction-dispersal dynamics to sustain the overall metapopulation. Generally, this would require greater levels of dispersal than those typically found within isolation-by-distance species, as individuals must traverse uninhabitable regions relatively frequently to recolonise locally extinct habitat.

Metapopulation structure.jpg
An example of metapopulation dynamics. Different subpopulations (lettered circles) are connected via dispersal (arrows). These different subpopulations can be different sizes and are mostly independent of one another, meaning that a single subpopulation can go locally extinct (the red X) without collapsing the entire system. The different dispersal pathways mean that one population can recolonise extinct habitat and essentially ‘rebirth’ other subpopulations (the green arrows).
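One classic way to formalise this balance between local extinction and recolonisation is the Levins patch-occupancy model, which is brought in here purely as an illustration (it is not necessarily what the figure depicts). It tracks the fraction of habitat patches that are occupied, p, and predicts an equilibrium of 1 - e/c when the colonisation rate c exceeds the extinction rate e. The rates below are made up, and the discrete-time update is a simplifying assumption.

```python
def levins_occupancy(colonisation, extinction, start=0.1, steps=200):
    """Iterate a discrete-time version of the Levins patch-occupancy model.

    p is the fraction of habitat patches currently occupied.
    """
    p = start
    for _ in range(steps):
        # Empty patches are colonised from occupied ones; occupied ones go extinct.
        p += colonisation * p * (1 - p) - extinction * p
    return p


print(round(levins_occupancy(colonisation=0.3, extinction=0.1), 3))  # persists
print(round(levins_occupancy(colonisation=0.1, extinction=0.3), 3))  # collapses
```

When recolonisation outpaces local extinction, the metapopulation settles with a stable share of patches occupied even though individual patches keep blinking out; when it doesn't, the whole system winds down.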

Weakly dispersing species

At the far opposite end of the dispersal ability spectrum, we have low dispersal species. These are often localised, endemic species that for various reasons might be unable to travel very far at all; for some, they may spend their entire adult life in a sedentary form. The lack of dispersal lends to very strong levels of population structure, and individual populations often accumulate genetic differences relatively quickly due to genetic drift or local adaptation.

Species with low dispersal capabilities are often at risk of local extinction and are unable to easily recolonise these habitats after the event has ended. Their movement is often restricted to rare environmental events such as flooding that carry individuals long distances despite their physiological limitations. Because of this, low dispersal species are often at greater risk of total extinction and extinction vortices than their higher dispersing counterparts.

Accounting for dispersal in population genetics

Incorporating biological and physiological aspects of our study taxa is important for interpreting the evolutionary context of species. Dispersal ability is but one of many characteristics that can influence the ability of species to respond to selective pressures, and the context in which this natural selection occurs. Thus, understanding all aspects of an organism is important in building the full picture of their evolution and future prospects.

The direction of selection

The nature of adaptation

One of the most fundamental aspects of natural selection and evolution is, of course, the underlying genetic traits that shape the physical, selected traits. Most commonly, this involves trying to understand how changes in the distribution and frequencies of particular genetic variants (alleles) occur in nature and what forces of natural selection are shaping them. Remember that natural selection acts directly on the physical characteristics of species; if these characteristics are genetically-determined (which many are), then we can observe the flow-on effects on the genetic diversity of the target species.

Although we might expect that natural selection is a fairly predictable force, there are a myriad of ways it can shape, reduce or maintain genetic diversity and identity of populations and species. In the following examples, we’re going to assume that the mentioned traits are coded for by a single gene with two different alleles for simplicity. Thus, one allele = one version of the trait (and can be used interchangeably). With that in mind, let’s take a look at the three main broad types of changes we observe in nature.

Directional selection

Arguably the most traditional perspective of natural selection is referred to as ‘directional selection’. In this example, natural selection causes one allele to be favoured more than another, which causes it to increase dramatically in frequency compared to the alternative allele. The reverse effect (natural selection pushing against a maladaptive allele) is still covered by directional selection, except that it functions in the opposite way (the allele under negative selection has reduced frequency, shifting towards the alternative allele).

Directional selection diagram
An example of directional selection. In this instance, we have one population of cats and a single phenotypic trait (colour) which ranges from 0 (yellow) to 1 (red). Red colour is selected for above all other colours; the original population has a pretty diverse mix of colours to start. Over time, we can see the average colour of the entire population moves towards more red colours whilst yellow colours start to disappear. Note that although the final population is predominantly red, there is still some (minor) variation in colours. These changes are reflected in the distribution of the colour-coding alleles (right), as it moves towards the red end of the spectrum.
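As a minimal sketch of directional selection, we can track one favoured allele in a haploid population where carriers have a relative fitness of 1 + s. The haploid set-up, the starting frequency and the value of s are all assumptions chosen for simplicity; a negative s gives the 'selected against' case.

```python
def directional_selection(p, s, generations):
    """Track the frequency of an allele in a haploid model where carriers
    have relative fitness 1 + s and non-carriers have fitness 1."""
    trajectory = [p]
    for _ in range(generations):
        mean_fitness = p * (1 + s) + (1 - p)
        p = p * (1 + s) / mean_fitness
        trajectory.append(p)
    return trajectory


favoured = directional_selection(p=0.1, s=0.2, generations=50)
disfavoured = directional_selection(p=0.9, s=-0.2, generations=50)
print(round(favoured[-1], 3), round(disfavoured[-1], 3))
```

The favoured allele climbs steadily towards (but never quite reaches) a frequency of 1, which matches the caption's point that a little variation can linger even after the population has shifted.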

Balancing selection

Natural selection doesn’t always push allele frequencies in a particular direction, however, and sometimes maintains the diversity of alleles in the population. This is what happens in ‘balancing selection’ (sometimes loosely referred to as ‘stabilising selection’, although that term more strictly describes selection favouring intermediate values of a trait). In this example, natural selection favours non-extreme allele frequencies, and pushes the distribution of allele frequencies more to the centre. This may happen if deviations from the original gene, regardless of the specific change, can have strongly negative effects on the fitness of an organism, or in genes that are most fit when there is a decent amount of variation within them in the population (such as the MHC region, which contributes to immune response). There are a couple of other reasons balancing selection may occur, though.

Heterozygote advantage

One example is known as ‘heterozygote advantage’. This is when an organism with two different alleles of a particular gene has greater fitness than an organism with two identical copies of either allele. A seemingly bizarre example of heterozygote advantage is related to sickle cell anaemia in African people. Sickle cell anaemia is a serious genetic disorder which is encoded for by recessive alleles of a haemoglobin gene; thus, a person has to carry two copies of the disease allele to show damaging symptoms. While this trait would ordinarily be strongly selected against in many populations, it is maintained in some African populations by the presence of malaria. This seems counterintuitive; why does the presence of one disease maintain another?

Well, it turns out that malaria is not very good at infecting sickle cells; there are a few suggested mechanisms for why but no clear single answer. Naturally, suffering from either sickle cell anaemia or malaria is unlikely to convey fitness benefits. In this circumstance, natural selection actually favours having one sickle cell anaemia allele; while being a carrier isn’t ordinarily as healthy as having no sickle cell alleles, it does actually make the person somewhat resistant to malaria. Thus, in populations where there is a selective pressure from malaria, there is a heterozygote advantage for sickle cell anaemia. For those African populations without likely exposure to malaria, sickle cell anaemia is strongly selected against and less prevalent.

Malaria and sickle diagram
A diagram of how heterozygote advantage works in sickle cell anaemia and malaria resistance. On the top we have our two main traits: the blood cell shape (which has two different alleles; normal and sickle celled) and malaria infection by mosquitoes. Blue circles indicate that the trait has good fitness, whilst red crosses indicate the trait has bad fitness. For the left hand person, having two sickle cell alleles (ss) means they are symptomatic of sickle cell anaemia and are unlikely to have a good quality of life. On the right, having two normal blood cell alleles (SS) means that they are susceptible to malaria infection. For the middle person, however, having only one sickle cell allele (Ss) means they are asymptomatic but still resistant to malaria. Thus, being heterozygous for sickle cell is actually beneficial over being homozygous in either direction: this is reflected in the distribution of alleles (bottom). The left side is pushed down by sickle cell anaemia whilst the right side is pushed down by malaria, thus causing both blood cell alleles (s and S) to be maintained at an intermediate frequency (i.e. balanced).
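The balance in the diagram can be sketched with the standard one-gene, two-allele selection model. Here the heterozygote (Ss) is given a fitness of 1, the normal homozygote (SS) pays a cost from malaria, and the sickle homozygote (ss) pays a cost from anaemia. The two cost values are invented purely for illustration, and random mating is assumed. In theory the sickle allele should settle at cost_malaria / (cost_malaria + cost_anaemia), regardless of where it starts.

```python
def next_sickle_frequency(q, cost_malaria, cost_anaemia):
    """One generation of selection on the sickle allele (frequency q)."""
    p = 1 - q
    w_SS = 1 - cost_malaria   # normal homozygote: susceptible to malaria
    w_Ss = 1.0                # heterozygote: fittest genotype
    w_ss = 1 - cost_anaemia   # sickle homozygote: sickle cell anaemia
    mean_fitness = p * p * w_SS + 2 * p * q * w_Ss + q * q * w_ss
    return (q * q * w_ss + p * q * w_Ss) / mean_fitness


def equilibrium_frequency(q, cost_malaria, cost_anaemia, generations=500):
    """Repeat selection for many generations from a starting frequency q."""
    for _ in range(generations):
        q = next_sickle_frequency(q, cost_malaria, cost_anaemia)
    return q


# Hypothetical costs: malaria reduces SS fitness by 10%, anaemia reduces ss by 80%.
print(round(equilibrium_frequency(0.01, 0.1, 0.8), 3))  # with malaria present
print(round(equilibrium_frequency(0.10, 0.0, 0.8), 3))  # malaria absent
```

With malaria present, the sickle allele is held at an intermediate frequency; remove the malaria cost and the same allele is steadily selected against, mirroring the contrast between populations with and without malaria exposure.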

Frequency-dependent selection

Another form of balancing selection is called ‘frequency-dependent selection’, where the fitness of an allele is inversely proportional to its frequency. Thus, once the allele has become common due to selection, the fitness of that allele is reduced and selection will start to favour the alternative allele (which is at much lower frequency). The constant back-and-forth tipping of the selective scales results in both alleles being maintained at an equilibrium.

This can happen in a number of different ways, but often the rarer trait/allele is fundamentally more fit because of its rarity. For example, if one allele allows an individual to use a new food source, it will be very selectively fit due to the lack of competition with others. However, as that allele accumulates within the population and more individuals start to feed on that food source, the lack of ‘uniqueness’ will mean that it’s not particularly better than the original food source. A balance between the two food sources (and thus alleles) will be maintained over time as shifts towards one will make the other more fit, and natural selection will compensate.

Frequency dependent selection diagram
An example of frequency-dependent selection. The colour of the cat indicates both their genotype and their food sources: black cats eat red apples whilst green cats eat green apples (this species has apparently developed herbivory, okay?) To start with, the incredibly low frequency of green cats means that the one green cat can exploit a huge food source compared to black cats. Because of this, natural selection favours green cats. However, in the next generation evolution overcompensates and produces way too many green cats, and now black cats are getting much more food. Natural selection bounces back to favour black cats. Eventually, this causes an equilibrium balance of the two cat types (as shifts one way will cause a shift back the other way immediately after). These changes are reflected in the overall frequency of the two types over time (top right), which eventually evens out. The bottom right figure demonstrates that for both cat types, the frequency of that colour is inversely proportional to the overall fitness (measured as a proxy by amount of food per cat).
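A bare-bones version of the cat example can be written as a haploid model in which each colour's fitness drops linearly as that colour becomes more common. The linear fitness functions and the strength value are assumptions made for simplicity. This simple version also approaches the balance smoothly, rather than overshooting back and forth as in the cartoon.

```python
def frequency_dependent_selection(p, strength, generations):
    """Haploid model in which each type's fitness falls as it becomes common.

    p is the frequency of green cats; 1 - p is the frequency of black cats.
    """
    trajectory = [p]
    for _ in range(generations):
        w_green = 1 - strength * p          # fitness of green cats
        w_black = 1 - strength * (1 - p)    # fitness of black cats
        mean_fitness = p * w_green + (1 - p) * w_black
        p = p * w_green / mean_fitness
        trajectory.append(p)
    return trajectory


rare_green = frequency_dependent_selection(p=0.05, strength=0.5, generations=100)
common_green = frequency_dependent_selection(p=0.95, strength=0.5, generations=100)
print(round(rare_green[-1], 3), round(common_green[-1], 3))
```

Whether green cats start out rare or common, the two types end up at the same balance point, because whichever type is rarer always has the higher fitness.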

Disruptive selection

A third category of selection (although not as frequently mentioned) is known as ‘disruptive selection’, which is essentially the direct opposite of balancing selection. In this case, both extremes of allele frequencies are favoured (e.g. 1 for one allele or 1 for the other) but intermediate frequencies are not. This can be difficult to untangle in natural populations since it could technically be attributed to two different cases of directional selection. Each allele of the same gene is directionally selected for, but in opposite populations and directions so that the overall pattern shows very few intermediates.

In direct contrast to balancing selection, disruptive selection can often be a case of heterozygote disadvantage (although it’s rarely called that). In these examples, it may be that individuals which are not genetically committed to one end or the other of the frequency spectrum are maladapted since they don’t fit in anywhere. An example would be a species that occupies both the desert and a forested area, with little grassland-type habitat in the middle. For the relevant traits, strongly desert-adapted genes would be selected for in the desert and strongly forest-adapted genes would be selected for in the forest. However, the lack of gradient between the two habitats means that individuals that are half-and-half are less well adapted to both the desert and the forest. A case of jack-of-all-trades, master of none.

Disruptive selection diagram
The above example of disruptive selection. Bird colour is coded for by a single gene; green birds have an HH genotype, orange birds have an hh genotype, and yellow birds are heterozygotes (Hh). Habitats where the two homozygote colours are most adaptive are found; green birds do well in the forest whereas orange birds do well in the desert. However, there’s no intermediate habitat between the two and so yellow birds don’t really fit well anywhere; they’re outcompeted in the forest and desert by the respective other colours. This means selection favours either extreme (homozygotes), shown in the top right. If we split up the two alleles of the genotype though, we can see that this disruptive selection is really the product of two directionally selective traits working in inverse directions: H is favoured at one end and h at the other.
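The heterozygote disadvantage part of this can be sketched by reusing the same one-gene selection model, but with the heterozygote (Hh) as the least fit genotype. This is a deliberate simplification: it treats the birds as a single randomly mating population and ignores the two separate habitats, and the fitness cost is an invented value. In that setting, intermediate allele frequencies are unstable and whichever allele starts out more common takes over.

```python
def heterozygote_disadvantage(p, cost, generations):
    """Frequency of allele H over time when Hh birds have fitness 1 - cost
    and both homozygotes (HH and hh) have fitness 1."""
    for _ in range(generations):
        q = 1 - p
        w_het = 1 - cost
        mean_fitness = p * p + 2 * p * q * w_het + q * q
        p = (p * p + p * q * w_het) / mean_fitness
    return p


# Hypothetical 50% fitness cost for yellow (Hh) birds.
print(round(heterozygote_disadvantage(0.6, cost=0.5, generations=200), 3))
print(round(heterozygote_disadvantage(0.4, cost=0.5, generations=200), 3))
```

Starting slightly above or below an even split sends the population to opposite extremes, which is the allele-frequency signature of selection against intermediates.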

Direction of selection

Although it would be convenient if natural selection was entirely predictable, it often catches us by surprise in how it acts and changes species and populations in the wild. Careful analysis and understanding of the different processes and outcomes of adaptation can feed our overall understanding of evolution, and aid in at least pointing in the right direction for our predictions.

Fantastic Genes and Where to Find Them

The genetics of adaptation

Adaptation and evolution by natural selection remains one of the most significant research questions in many disciplines of biology, and this is undoubtedly true for molecular ecology. While traditional evolutionary studies have been based on the physiological aspects of organisms and how this relates to their evolution, such as how these traits improve their fitness, the genetic component of adaptation is still somewhat elusive for many species and traits.

Hunting for adaptive genes in the genome

We’ve previously looked at the two main categories of genetic variation: neutral and adaptive. Although we’ve focused predominantly on the neutral components of the genome, which inform questions about demographic history, geographic influences and the effect of genetic drift, these cannot tell us (directly) about the process of adaptation and natural selective changes in species. To look at this area, we’d have to focus on adaptive variation instead; that is, genes (or other related genetic markers) which directly influence the ability of a species to adapt and evolve. These are directly under natural selection, either positively (‘selected for’) or negatively (‘selected against’).

Given how complex organisms, the environment and genomes can be, it can be difficult to determine exactly what is a real (i.e. strong) selective pressure, how this is influenced by the physical characteristics of the organism (the ‘phenotype’) and which genes are fundamental to the process (the ‘genotype’). Even determining the relevant genes can be difficult; how do we find the needle-like adaptive genes in a genomic haystack?

Magnifying glass figure
If only it were this easy.

There’s a variety of different methods we can use to find adaptive genetic variation, each with particular drawbacks and strengths. Many of these are based on tests of the frequency of alleles, rather than on the exact genetic changes themselves; adaptation works more often by favouring one variant over another rather than completely removing the less-adaptive variant (this would be called ‘fixation’). So measuring the frequency of different alleles is a central component of many analyses.

FST outlier tests

One of the most classical examples is called an ‘FST outlier test’. This can be a bit complicated without understanding what FST actually measures: in short, it’s a statistical measure of ‘population differentiation due to genetic structure’. The FST value between two populations indicates how genetically similar they are to one another. An FST value of 1 implies that the two populations are as genetically different as they could possibly be, whilst an FST value of 0 implies that they are genetically identical populations.
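To make the 0-to-1 scale a little more concrete, here is a minimal sketch of FST for a single gene with two alleles, calculated from expected heterozygosity as (H_T - H_S) / H_T. It assumes two equally sized populations and takes the frequency of one allele in each as input; real estimators also correct for sample size, which is ignored here.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """FST for one biallelic locus in two equally sized populations,
    computed as (H_T - H_S) / H_T from expected heterozygosities."""
    p_mean = (p1 + p2) / 2
    h_total = 2 * p_mean * (1 - p_mean)                      # H_T: pooled
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # H_S: average within
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele everywhere
    return (h_total - h_within) / h_total


print(fst_two_populations(0.5, 0.5))            # identical frequencies
print(fst_two_populations(1.0, 0.0))            # fixed for different alleles
print(round(fst_two_populations(0.6, 0.4), 3))  # mildly different
```

Identical allele frequencies give an FST of 0, populations fixed for opposite alleles give 1, and most real comparisons fall somewhere in between.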

Generally, FST reflects neutral genetic structure: it gives a background of how different two populations are, on average. However, if we know what the average amount of genetic differentiation should be for a neutral DNA marker, then we would predict that adaptive markers are significantly different. This is because a gene under selection should be more directly pushed towards or away from one variant (allele) than another, and much more strongly than the neutral variation would predict. Thus, we might assume that the alleles that are far more or less frequent than the average pattern are under selection. This is the basis of the FST outlier test; by comparing two or more populations (using FST), and looking at the distribution of allele frequencies, we can pick out a few alleles that vary from the average pattern and suggest that they are under selection (i.e. are adaptive).
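Dedicated outlier methods compare each locus against a modelled neutral distribution. As a much cruder stand-in, the sketch below simply flags loci whose FST sits far from the genome-wide median, using the median absolute deviation as a yardstick. The cut-off of three deviations, the locus names and the FST values are all made up for illustration.

```python
import statistics


def flag_fst_outliers(fst_by_locus, n_deviations=3.0):
    """Return loci whose FST is far from the genome-wide median.

    Uses the median absolute deviation (MAD) as a robust measure of spread,
    so unusually high and unusually low loci can both be flagged.
    """
    values = list(fst_by_locus.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    return [locus for locus, value in fst_by_locus.items()
            if abs(value - median) > n_deviations * mad]


# Hypothetical per-locus FST values between two populations.
toy_fst = {"locus_1": 0.04, "locus_2": 0.05, "locus_3": 0.03,
           "locus_4": 0.06, "locus_5": 0.05, "colour": 0.62}
print(flag_fst_outliers(toy_fst))
```

In this toy dataset only the 'colour' locus stands out from the background, but the result depends entirely on where the cut-off is placed.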

There are a few significant drawbacks for FST outlier tests. One of the biggest is that genetic drift can also produce a large number of outliers; in a small population, for example, one allele might be fixed (has a frequency of 1, with no alternative allele in the population) simply because there is not enough diversity or population size to sustain more alleles. Even if this particular allele was extremely detrimental, it’d still appear to be favoured by natural selection just because of drift.

Drift leading to outliers diagram
An example of genetic drift leading to outliers, featuring our friends the cat population. Top row: Two cat populations, one small (left; n = 5) and one large (middle, n = 12) show little genetic differentiation between them (right; each triangle represents a single gene or locus; the ‘colour’ gene is marked in green). The average (‘neutral’) pattern of differentiation is shown by the dashed line. Much like in our original example, one cat in the small population is horrifically struck by lightning and dies (RIP again). Now when we compare the frequency of the alleles of the two populations (bottom), we see that (because a green cat died), the ‘colour’ locus has shifted away from the general trend (right) and is now an outlier. Thus, genetic drift in the ‘colour’ gene gives the illusion of a locus under selection (even though natural selection didn’t cause the change, since colour does not relate to how likely a cat is to be struck by lightning).

Secondly, the cut-off for a ‘significant’ vs. ‘relatively different but possibly not under selection’ can be a bit arbitrary; some genes that are under weak selection can go undetected. Furthermore, recent studies have shown a growing appreciation for polygenic adaptation, where tiny changes in allele frequencies of many different genes combine together to cause strong evolutionary changes. For example, despite the clear heritable nature of height (tall people often have tall children), there is no clear ‘height’ gene: instead, it appears that hundreds of genes are potentially very minor height contributors.

Polygenic height figure final
In this example, we have one tall parent (top) who produces two offspring; one who is tall (left) and one who isn’t (right). In order to understand what genetic factors are contributing to their height differences, we compare their genetics (right; each dot represents a single locus). Although there aren’t any particular loci that look massively different between the two, the cumulative effect of tiny differences (the green triangles) together make one person taller than the other. There are no clear outliers, but many (poly) different genes (genic) acting together.

Genotype-environment associations

To overcome these biases, sometimes we might take a more methodological approach called ‘genotype-environment association’ (GEA). This analysis differs in that we select what we think our selective pressures are: often environmental characteristics such as rainfall, temperature, habitat type or altitude. We then take two types of measures per individual organism: the genotype, through DNA sequencing, and the relevant environmental values for that organism’s location. We repeat this over the full distribution of the species, taking a good number of samples per population and making sure we capture the full variation in the environment. Then we perform a correlation-type analysis, which seeks to see if there’s a connection or trend between any particular alleles and any environmental variables. The most relevant variables are often pulled out of the environmental dataset and focused on to reduce noise in the data.
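In its very simplest form, the 'correlation-type analysis' could look like the sketch below, which correlates the frequency of an allele in each population with a single environmental variable. Everything here is a toy assumption: the five populations, the temperatures, the allele frequencies and the locus names are invented, and real analyses use more sophisticated models that also account for population structure.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical mean temperature (degrees C) at five sampled populations.
temperature = [10, 14, 18, 22, 26]
# Hypothetical frequency of one allele per locus in those same populations.
allele_freqs = {
    "locus_a": [0.10, 0.30, 0.45, 0.70, 0.90],
    "locus_b": [0.50, 0.40, 0.55, 0.45, 0.50],
}

associations = {locus: pearson_r(temperature, freqs)
                for locus, freqs in allele_freqs.items()}
for locus, r in associations.items():
    print(f"{locus}: r = {r:.2f}")
```

Here locus_a tracks temperature closely and would be a candidate for further investigation, whereas locus_b shows no obvious trend.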

The main benefit of GEA over FST outlier tests is that it’s unlikely to be as strongly influenced by genetic drift. Unless (coincidentally) populations are drifting at the same genes in the same pattern as the environment, the analysis is unlikely to falsely pick it up. However, it can still be confounded by neutral population structure; if one population randomly has a lot of unique alleles or variation, and also occurs in a somewhat unique environment, it can bias the correlation. Furthermore, GEA is limited by the accuracy and relevance of the environmental variables chosen; if we pick only a few, or miss the most important ones for the species, we won’t be able to detect a large number of very relevant (and likely very selective) genes. This is a universal problem in model-based approaches and not just limited to GEA.

New spells to find adaptive genes?

It seems likely that with increasing datasets and better analytical platforms, many more types of analysis will be developed to delve deeper into the adaptive aspects of the genome. With whole-genome sequencing starting to become a reality for non-model species, better annotation of current genomes and a steadily increasing database of functional genes, the ability of researchers to investigate evolution and adaptation at the genomic level is also increasing.

Drifting or driving: directionality in evolution

How random is evolution?

Often, we like to think of evolution fairly anthropomorphically; as if natural selection actively decides what is, and what isn’t, best for the evolution of a species (or population). Of course, there’s not some explicit Evolution God who decrees how a species should evolve, and in reality, evolution reflects a more probabilistic system. Traits that give a species a better chance of reproducing or surviving, and can be inherited by the offspring, will over time become more and more dominant within the species; contrastingly, traits that do the opposite will be ‘weeded out’ of the gene pool as maladaptive organisms die off or are outcompeted by more ‘fit’ individuals. The fitness value of a trait can be determined from how much the frequency of that trait varies over time.

So, if natural selection is just probabilistic, does this mean evolution is totally random? Is it just that traits are selected based on what just happens to survive and reproduce in nature, or are there more direct mechanisms involved? Well, it turns out both processes are important to some degree. But to get into it, we have to explain the difference between genetic drift and natural selection (we’re assuming here that our particular trait is genetically determined).  

Allele frequency over time diagram
The (statistical) overview of natural selection. In this example, we have two different traits in a population; the blue X and the red O. Our starting population is 20 individuals (N), with 10 of each trait (a 1:1 ratio, or 50% frequency of each). We’re going to assume that, because the blue X is favoured by natural selection, it doubles in frequency each generation (i.e. one individual with the blue X has two offspring with one blue X each). The red O is neither here nor there and is stable over time (one red O produces one red O in the next generation). So, going from Gen 1 to Gen 2, we have twice as many blue Xs (Nt) as we did previously, changing the overall frequency of the traits (highlighted in yellow). Because populations probably don’t exponentially increase every generation, we’ll cut it back down to our original total of 20, but at the same ratios (Np). Over time, we can see that the population gradually accumulates more blue Xs relative to red Os, and by Gen 5 the red O is extinct. Thus, the blue X has evolved!

When we consider the genetic variation within a species to be our focal trait, we can tell that different parts of the genome might be more related with natural selection than others. This makes sense; some mutations in the genome will directly change a trait (like fur colour) which might have a selective benefit or detriment, while others might not change anything physically or change traits that are neither here-nor-there under natural selection (like nose shape in people, for example). We can distinguish between these two by talking about adaptive or neutral variation; adaptive variation has a direct link to natural selection whilst neutral variation is predominantly the product of genetic drift. Depending on our research questions, we might focus on one type of variation over the other, but both are important components of evolution as a whole.

Genetic drift

Genetic drift is considered the random, selectively ‘neutral’ changes in the frequencies of different traits (alleles) over time, due to completely random effects such as random mutations or random loss of alleles. This results in the neutral variation we can observe in the gene pool of the species. Changes in allele frequencies can happen due to entirely stochastic events. If, by chance, all of the individuals with the blue fur variant of a gene are struck by lightning and die, the blue fur allele would end up with a frequency of 0 i.e. go extinct. That’s not to say the blue fur ‘predisposed’ the individuals to be struck by lightning (we assume here, anyway), so it’s not like it was ‘targeted against’ by natural selection (see the bottom figure for this example).

Because neutral variation appears under a totally random, probabilistic model, the mathematical basis of it (such as the rate at which mutations appear) has been well documented and is the foundation of many of the statistical aspects of molecular ecology. Much of our ability to detect which genes are under selection is by seeing how much the frequencies of alleles of that gene vary from the neutral model: if one allele is way more frequent than you’d expect by random genetic drift, then you’d say that it’s likely being ‘pushed’ by something: natural selection.

Manhattan plot example
A Manhattan plot, which measures the level of genetic differentiation between two different groups across the genome. The x-axis shows the length of the genome, in this example colour-coded by the specific chromosome of the sequence, while the y-axis shows the level of differentiation between the two groups being studied. The dots represent certain spots (loci, singular locus) in the genome, with the level of differentiation (Fst) measured for that locus in one group vs that locus in the other group. The dotted line represents the ‘average differentiation’: i.e. how different you’d expect the two groups to be by chance. Anything above that line is significantly different between the two groups, either because of drift or natural selection. This plot has been slightly adapted from Axelsson et al. (2013), who were studying domestication in dogs by comparing the genetic architecture of wild wolves versus domestic dogs. In this example we can see that certain regions of the genome are clearly different between dogs and wolves (circled); when the authors looked at the genes within those blocks, they found that many were related to behavioural changes (nervous system), competitive breeding (sperm-egg recognition) and interestingly, starch digestion. This last category suggests that adaptation to an omnivorous diet (likely human food waste) was key in the domestication process.

Natural selection

In contrast to genetic drift, natural selection is when particular traits are directly favoured (or unfavoured) in the environmental context of the population; natural selection is very specific to both the actual trait and how the trait works. A trait is only selected for if it confers some kind of fitness benefit to the individual; in evolutionary genetics terms, this means it allows the individual to have more offspring or to survive better (usually).

While this might be true for a trait in a certain environment, in another it might be irrelevant or even have the reverse effect. Let’s again consider white fur as our trait under selection. In an arctic environment, white fur might be selected for because it helps the animal to camouflage against the snow to avoid predators or catch prey (and therefore increase survivability). However, in a dense rainforest, white fur would stand out starkly against the shadowy greenery of the foliage and thus make the animal a target, making it more likely to be taken by a predator or avoided by prey (thus decreasing survivability). Thus, fitness is very context-specific.

Who wins? Drift or selection?

So, which is mightier, the pen (drift) or the sword (selection)? Well, it depends on a large number of different factors such as mutation rate, the importance of the trait under selection, and even the size of the population. This last one might seem a little different to the other two, but it’s critically important to which process governs the evolution of the species.

In very small populations, we expect genetic drift to be the stronger process. Natural selection is often comparatively weaker because small populations have less genetic variation for it to act upon; there are fewer gene variants to choose from that might be more beneficial than others. In severe cases, many of the traits are probably very maladaptive, but there’s just no better variant to be selected for; look at the plethora of physiological problems in the cheetah for some examples.

Genetic drift, however, doesn’t really care if there’s “good” or “bad” variation, since it’s totally random. That said, it tends to be stronger in smaller populations because a small, random change in the number or frequency of alleles can have a huge effect on the overall gene pool. Let’s say you have 5 cats in your species; they’re nearly extinct, and probably have very low genetic diversity. If one cat suddenly dies, you’ve lost 20% of your species (and up to that percentage of your genetic variation). However, if you had 500 cats in your species, and one died, you’d lose only 0.2% of your species (and at most that share of your genetic variation) and the gene pool would barely even notice. The same applies to random mutations, or if one unlucky cat doesn’t get to breed because it can’t find a mate, or any other random, non-selective reason. One way we can think of this is as ‘random error’ with evolution; even a perfectly adapted organism might not pass on its genes if it is really unlucky. A bigger sample size (i.e. more individuals) means this will have less impact on the total dataset (i.e. the species), though.

Drift in small pops
The effect of genetic drift on small populations. In this example, we have two very similar populations of cats, each with three different alleles (black, blue and green) in similar frequencies across the populations. The major difference is the size of the population; the left is much smaller (5 cats) compared to the right (20 cats). If one cat randomly dies from a bolt of lightning (RIP), and assuming that the colour of the cat has no effect on the likelihood of being struck by lightning (i.e. is not under natural selection), then the outcome of this event is entirely due to genetic drift. In this case, the left population has lost 1/5th of its population size and 1/3rd of its total genetic diversity thanks to the death of the genetically unique blue cat (He will be missed) whereas the right population has only really lost 1/20th of its size and no changes in total diversity (it’ll recover).
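
If you’d like to see this effect play out over several generations, here is a rough sketch of a Wright-Fisher-style simulation, a standard textbook model of drift. Everything in it is an assumption for illustration: the function names, the population sizes, the starting allele frequency of 50% and the number of generations. There is no selection anywhere in the code, so any change in allele frequency is purely down to chance.

```python
import random


def next_generation(freq, n_copies, rng):
    """Draw n_copies gene copies at random from the parental gene pool."""
    carriers = sum(1 for _ in range(n_copies) if rng.random() < freq)
    return carriers / n_copies


def fraction_lost_or_fixed(n_individuals, generations=30, replicates=100, seed=1):
    """Share of replicate populations in which one allele is lost or fixed."""
    rng = random.Random(seed)
    n_copies = 2 * n_individuals  # diploid: two gene copies per individual
    finished = 0
    for _ in range(replicates):
        freq = 0.5  # both alleles start out equally common
        for _ in range(generations):
            freq = next_generation(freq, n_copies, rng)
            if freq in (0.0, 1.0):
                break  # one allele has disappeared; no variation is left
        if freq in (0.0, 1.0):
            finished += 1
    return finished / replicates


small = fraction_lost_or_fixed(5)    # a population of 5 cats
large = fraction_lost_or_fixed(200)  # a population of 200 cats
print(f"5 cats: {small:.2f}, 200 cats: {large:.2f}")
```

The printed values are the proportion of simulated populations that lost one of the two alleles entirely within 30 generations. The tiny population should lose variation far more often than the larger one, which is the same pattern as the lightning-struck cats above.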

Both genetic drift and natural selection are important components of evolution, and together shape the overall patterns of evolution for any given species on the planet. The two processes can even feed into one another; random mutations (drift) might become the genetic basis of new selective traits (natural selection) if the environment changes to suit the new variation. Therefore, to ignore one in favour of the other would fail to capture the full breadth of the processes which ultimately shape and determine the evolution of all species on Earth, and thus the formation of the diversity of life.

“Who Do You Think You Are?”: studying the evolutionary history of species

The constancy of evolution

Evolution is a constant, endless force which seeks to push and shape species based on the context of their environment: sometimes rapidly, sometimes much more gradually. Although we often think of discrete points of evolution (when one species becomes two, when a particular trait evolves), it is nevertheless a continual force that influences changes in species. These changes are often difficult to ‘unevolve’ and have a certain ‘evolutionary inertia’ to them; because of these factors, it’s often critical to understand how a history of evolution has generated the organisms we see today.

What do I mean when I say evolutionary history? Well, the term is fairly diverse and can relate to the evolution of particular traits or types of traits, or the genetic variation and changes related to these changes. The types of questions and points of interest of evolutionary history can depend at which end of the timescale we look at: recent evolutionary histories, and the genetics related to them, will tell us different information to very ancient evolutionary histories. Let’s hop into our symbolic DeLorean and take a look back in time, shall we?

Labelled_evolhistory
A timeslice of evolutionary history (a pseudo-phylogenetic tree, I guess?), going from more recent history (bottom left) to deeper history (top right). Each region denoted in the tree represents the general area of focus for each of the following blog headings. 1: Recent evolutionary history might look at individual pedigrees, or comparing populations of a single species. 2: Slightly older comparisons might focus on how species have arisen, and the factors that drive this (part of ‘phylogeography’). 3: Deep history might focus on the origin of whole groups of organisms and a focus on the evolution of particular traits like venom or sociality.

Very recent evolutionary history: pedigrees and populations

While we might ordinarily consider ‘evolutionary history’ to refer to events that happened thousands or millions of years ago, it can still be informative to look at history just a few generations ago. This often involves looking at pedigrees, such as in breeding programs, and trying to see how very short term and rapid evolution may have occurred; this can even include investigating how a particular breeding program might accidentally be causing the species to evolve to adapt to captivity! Rarely does this get referred to as true evolutionary history, but it fits on the spectrum, so I’m going to count it. We might also look at how current populations are evolving differently to one another, to try and predict how they’ll evolve into the future (and thus determine which ones are most at risk, which ones have critically important genetic diversity, and the overall survivability of the total species). This is the basis of ‘evolutionarily significant units’ or ESUs which we previously discussed on The G-CAT.

Captivefishcomic
Maybe goldfish evolved 3 second memory to adapt to the sheer boringness of captivity? …I’m joking, of course: the memory thing is a myth and adaptation works over generations, not a lifetime.

A little further back: phylogeography and species

A little further back, we might start to look at how different populations have formed or changed in semi-recent history (usually looking at the effect of human impacts: we’re really good at screwing things up, I’m sorry to say). This can include looking at how populations have (or have not) adapted to new pressures, how stable populations have been over time, or whether new populations are being ‘made’ by recent barriers. At this level of populations and some (or incipient) species, we can find the field of ‘phylogeography’, which involves the study of how historic climate and geography have shaped the evolution of species or caused new species to evolve.

Evolution of salinity
An example of trait-based phylogenetics, looking at the biogeographic patterns and evolution/migration to freshwater in perch-like fishes, by Chen et al. (2014). The phylogeny shows that a group of fishes adapted to freshwater environments (black) from a (likely) saltwater ancestor (white), with euryhaline tolerance evolving two separate times (grey).

One high profile example of phylogeographic studies is the ‘Out of Africa’ hypothesis and debate for the origination of the modern human species. Although there has been no shortage of debate about the origin of modern humans, as well as the fate of our fellow Neanderthals and Denisovans, the ‘Out of Africa’ hypothesis still appears to be the most supported scenario.

human phylogeo
A generalised diagram of the ‘Out of Africa’ hypothesis of human migration, from Oppenheimer, 2012. 

Phylogeography is also a key component for determining and understanding ‘biodiversity hotspots’; that is, regions which have generated high levels of species diversity and contain many endemic species and populations, such as tropical hotspots or remote temperate regions. These are naturally of very high conservation value and contribute a huge amount to Earth’s biodiversity, ecological functions and potential for us to study evolution in action.

Deep, deep history: phylogenetics and the origin of species (groups)

Even further back, we start to delve into the more traditional concept of evolutionary history. We start to look at how species have formed; what factors caused them to become new species, how stable the new species are, and what are the genetic components underlying the change. This subfield of evolution is called ‘phylogenetics’, and relates to understanding how species or groups of species have evolved and are related to one another.

Sometimes, this includes trying to look at how particular diagnostic traits have evolved in a certain group, like venom within snakes or eusocial groups in bees. Phylogenetic methods are even used to try and predict which species of plants might create compounds which are medically valuable (like aspirin)! Similarly, we can try and predict how invasive a pest species may be based on their phylogenetic (how closely related the species are) and physiological traits in order to safeguard against groups of organisms that are likely to run rampant in new environments. It’s important to understand how and why these traits have evolved to get a good understanding of exactly how the diversity of life on Earth came about.

evolution of venom
An example of looking at trait evolution with phylogenetics, focusing on the evolution of venom in snakes, from Reyes-Velasco et al. (2014). The size of the boxes demonstrates the number of species in each group, with the colours reflecting the number of venomous (red) vs. non-venomous (grey) species. The red dot shows the likely origin of venom.

Phylogenetics also allows us to determine which species are the most ‘evolutionarily unique’; all the special little creatures of planet Earth which represent their own unique types of species, such as the tuatara or the platypus. Naturally, understanding exactly how precious and unique these species are suggests we should focus our conservation attention on them, since there’s nothing else in the world that even comes close!

Who cares what happened in the past, right? Well, I do, and you should too! Evolution forms an important component of any conservation management plan, since we obviously want to make sure our species can survive into the future (i.e. adapt to new stressors). Trying to maintain the most ‘evolvable’ groups, particularly within breeding programs, can often be difficult when we have to balance inbreeding depression (not having enough genetic diversity) with outbreeding depression (obscuring good genetic diversity by adding bad genetic diversity into the gene pool). Often, we can best avoid these by identifying which populations are evolutionarily different to one another (see ESUs) and using that as a basis, since outbreeding vs. inbreeding depression can be very difficult to measure. This all goes back to the concept of ‘adaptive potential’ that we’ve discussed a few times before.

In any case, a keen understanding of the evolutionary trajectory of a species is a crucial component for conservation management and to figure out the processes and outcomes of evolution in the real world. Thus, evolutionary history remains a key area of research for both conservation and evolution-related studies.

Playing around with science

Science in pop culture

For most people, scientific research can seem somewhat distant and detached from the average person (and society generally). However, the distillation of scientific ideas into various forms of media has been done for ages, and is particularly prevalent in (although not limited to) science fiction. It’s not all that uncommon for scientists to trace the origin of their scientific interest back to classic sci-fi movies, TV shows, or games. I’m not saying dinosaurs haven’t always been cool, but after seeing them animated and ferocious in Jurassic Park, I have no doubt a new generation of palaeontologists were inspired to enter the field. I’m sure the same must also be at least partially true for archaeology and Indiana Jones. While I can guarantee the actual scientific research is nowhere near as adventurous or high-octane as those movies would depict, their respective popularities renew interest in the science and inspire new students of the disciplines.

Velociraptor
Sure, they’re not perfectly scientifically accurate, but they certainly get the attention of the public. Source: Jurassic Park wiki.

The inclusion of science within pop culture media such as movies, TV shows, music and video games can have profound impacts on the overall perception of science. This influence seems to go either way depending on how the science is presented and perceived: positive outlooks on science can succinctly present scientific matter in a way that is easy to interpret, and thus can generate interest in the fields of science. In contrast, negative outlooks on science, or misinterpretations of science, can drastically impact what people understand about scientific theory. For example, the film Lucy is built on the horrendously outdated belief that the average human only uses 10% of their brain capacity: upon reaching 100% brain capacity using a stimulant, the titular character becomes miraculously superhuman. While this concept is clearly outrageously behind the times for anyone who follows psychological sciences, a disturbing number of people apparently still believe this notion. Thus, misrepresentation of scientific theory perpetuates outdated concepts.

10% brain comic
I mean, someone may as well, right?

Don’t get me wrong: I love ridiculous science fiction as much as the next nerd, and I’m certainly not of the expectation that all science-based information needs to be 100% accurate, without fail (after all, the fiction and fantasy has to fit somewhere…). But it’s important to make sure the transition from scientific research to popular media doesn’t lose the important facts along the way.

Evolution’s relationship with pop culture has been a little more complicated than other scientific theories. Sometimes it’s invoked rather loosely to explain supernatural alien monsters (e.g. Xenomorphs; Alien franchise); other times it’s flipped on its head to show a type of de-evolution (Planet of the Apes). Science fiction has long recognised the innovative and seemingly endless possibilities of evolution and the formation of new species. Generally, the audience is fairly familiar with the concept of evolution (at least in principle) and it makes for a useful tool for explaining the myriad of life in science fiction stories.

Evolution in video games?

It probably doesn’t come as a huge surprise to note that I’m a nerd in all aspects of my life, not just my career. For me, this is particularly a love of video games. Rarely, however, do these two forms of nerdism coincide for me: while some games apply science and scientific theory, they are usually biased towards physics and engineering disciplines (looking at you, Portal). As far as my field is concerned, there are a few notable examples (such as Spore) which encapsulate the essence and majesty of evolution, but rarely do they incorporate the ‘genetic’ aspect that I love.

Spore screenshot
There’s nothing quite like making a horrific carnivorous monster and collapsing ecosystems by exterminating all of the wildlife, then taking over the Universe. Hmm…

You can then imagine my utter delight at the discovery of a game that actually incorporates both population genetics and interesting gameplay. The indie survival game, aptly named Niche: A Genetics Survival Game, very literally represents this ‘niche’ for me (and I will not apologise for the pun!). Combining simplified models of population genetics processes such as genetic diversity, inbreeding (and associated inbreeding depression), natural selection, and stochastic events, Niche beautifully incorporates scientific theory (albeit toned down to a layman level) with challenging, yet engaging, gameplay mechanics and an adorable art style.

Niche screenshot
Niche: A Genetics Survival Game epitomises the intersection of evolutionary theory and pop culture.

As one might expect from the title, Niche is at heart a survival game: the aim is to have your very own population of animals (dubbed ‘Nichelings’) survive the stresses of the world, through balancing population size, gene pools, resources (such as food, nests, space) and fighting off predators. Over time, the genetics component drives the evolution of your Nichelings, pushing them to be better at certain tasks depending on the traits selected for: the ultimate aim of the game is to create the perfectly adapted species that can colonise all of the land masses randomly generated.

Niche screenshot DNA
The user interface of Niche. A: The ranking of the selected Nicheling, moving from alpha, to beta, to gamma. This determines the order the Nichelings eat in (gammas get the short end of the stick). B: The traits of the selected Nicheling. In order, these are the physical traits (i.e. the strength, speed and abilities of the animal), the genetic sequence (genotype) of the animal (expanded in C), the user-chosen mutations for that Nicheling and the pedigree of Nichelings. C: The expanded DNA sequence of the selected Nicheling, showing the paternal and maternal variants (alleles) of all the possible genes. Highlighted traits are the expressed trait (dominant) whilst the faded ones indicate recessive carrier genes that aren’t expressed. D: Collected food, one of the most important resources in the game. E: Nest material, required to build nests and produce offspring. F: The different senses (sight, smell, hearing) which can be toggled to give different viewpoints of the surrounding environments (with different benefits and weaknesses).

Niche requires cunning strategy, good foresight and planning, and sometimes a little luck. Although I’m decidedly not very good at Niche yet (I think my rates of extinction would mirror the real world a little too much for my liking…), the chance to bring my scientific background into my favourite hobby is a somewhat magical experience.

Niche screenshot extinct
Oh god, I hope this isn’t a premonition for my career!

You might wonder why I care so much about a video game. While the game is in and of itself an interesting concept, to me it exemplifies one way we can make science an enjoyable and digestible concept for non-scientists. It’s possible that Niche could open the door of population-level genetics and evolution to a new audience, and potentially inspire the next generation of scientists in the field. Although that might be an extraordinarily long shot, it is my hope that the curiosity, mystery and creativity of scientific research is at least partially represented in media such as gaming to help integrate science and society.

Using video games for science?!

Both science and society can benefit from the (accurate) representation of science in pop culture, not just through fostering a connection between scientific theory and the recreational hobbies of people. On rare occasions, pop culture can even be used as a surrogate medium for testing scientific theories and hypotheses in a specific environment: for example, World of Warcraft has unwittingly contributed to scientific progress. As part of a particular boss battle, characters could become infected with a particular disease (called “Corrupted Blood”), which would have significant effects on players but only for a few seconds. While this was supposed to be removed after leaving the area of the fight, a bug in the game caused it to stay on animal pets that were afflicted, and it thus became a viral phenomenon when it started to spread into the wider world (of Warcraft). The presence of the epidemic wiped out swathes of lower level players and caused significant social repercussions in the World of Warcraft community as players adjusted their behaviour to avoid or prevent transmission of the deadly disease.

This unique circumstance allowed a group of scientists to use it as a simulation of a real viral outbreak, as the spread of the disease was directly related to the social behaviour and interactivity of players within the game. The “Corrupted Blood” incident so enthralled scientists that multiple papers were published discussing the feasibility of using virtual gaming worlds to simulate human reactions to epidemic outbreaks and viral transmissions on an unparalleled scale. Comparisons were drawn between the transmission and behavioural responses seen in the game and those of real-world events such as the avian flu epidemic.

Corrupted blood event
And you thought Bird Flu was bad, at least they couldn’t teleport! Source: GameRant.

This isn’t even the only example of World of Warcraft informing research, with others using it to model economic theories through a free market auction system. While these may seem extraordinarily strange (to scientists and non-scientists alike), these examples demonstrate how popular media such as gaming can be an important interactive front between science and society.

“How do you conserve genes?”: clarifying conservation genetics

Sometimes when I talk about the concept of conservation genetics to friends and family outside of the field, there can be some confusion about what this actually means. Usually, it’s assumed that this means the conservation of genetics: that is, instead of trying to conserve individual animals or plants, we try to conserve specific genes. While in some cases this is partially true (there might be genes of particular interest that we want to maintain in a wild population), often what we actually mean is using genetic information to inform conservation management and to give us the best chance of long-term rescue for endangered species.

DNA Zoo comic
Don’t worry, it’s an open range zoo: the genes have plenty of room to roam.

See, the DNA of individuals contains much more information than just the genes that make up an organism. By looking at the number, frequency or distribution of changes and differences in DNA across individuals, populations or species, we can see a variety of different patterns. Typically, genetics-based conservation analysis is based on a single unifying concept: that different forces create different patterns in the genetic make-up of species and populations, and that these can be statistically evaluated using genetic data. The exact type or scale of effect depends on how the data is collected and what analysis we use to evaluate that data, although we could do multiple types of analysis using the same dataset.

Oftentimes, we want to know about the current or historical state of a species or population to best understand how to move forward: by understanding where a species has come from, what it has been affected by, and how it has responded to different pressures, we can start to suggest how best to manage these species into the future.

However, there are lots of possible avenues for exploration: here are just a few…

Evolutionarily significant units (ESUs) and management units (MUs)

One commonly used application of genetic information for conservation is the designation of what we call ‘Evolutionarily Significant Units’ (ESUs). Using genetics, we can determine the boundaries of particular populations which correspond to their own unique evolutionary groups. These are often the results of historical processes which have separated and driven the independent evolution of each ESU, usually with low or no gene flow across these units. Generally, managing and conserving each of these can lead to overall more robust management of the species as a whole by making sure certain groups that have unique and potentially critical adaptations are maintained in the wild. Although the boundaries of ESUs can sometimes be arguable (particularly when there is some, but not much, gene flow across units), they form an important aspect of conservation designations.

In cases of shorter term separations across these populations, where there are noticeable differences in the genetics of the populations but not necessarily massively different evolutionary histories, conservationists will sometimes refer to ‘Management Units’ (MUs). These have much weaker evolutionary pressure behind them but might be indicative of very recent impacts, such as human-driven fragmentation of habitat or contemporary climate change. MUs often reflect very sudden and recent changes in populations and might have profound implications for the future of these groups: thus, they are an important way of assessing the current state of the species. The next couple of figures demonstrate this from one of my colleagues’ research papers.

YPP_map
The geographic distributions of Yarra pygmy perch populations, generously taken from Brauer et al. (2013). Each dot and number on the map represents a single population of pygmy perch used in the analysis. The colour of the population represents which MU it belongs to, whilst the shape of the marker represents the ESU. To make this easier to visualise, the solid lines indicate the boundaries of ESUs while the dashed lines represent MU boundaries. You’ll notice that MUs are subsets of ESUs, and that Population 6 actually fits into two different ESUs: see below.
YPP_Structure
An example of the output of an analysis (STRUCTURE) that determines population boundaries for Yarra pygmy perch using genetic data, generously taken from Brauer et al. (2013). Structure is an ‘assignment test’; using the input genetic information, it tries to make groups of individuals which are more similar to one another than other groups. In the graphs, each small column represents a single individual, with the colour bars representing how well it fits that (colour) population. The smaller numbers at the bottom and the labels above the graphs represent geographic populations (see the figure above). A) Shows the 4 major ESUs of Yarra pygmy perch, with some clear mixing between the Eastern ESU and the Merri/Curdies ESU in population 6. The rest of the populations fit pretty well entirely into one ESU. B) The MUs of Yarra pygmy perch, which shows the genetic structure within ESUs that can’t be seen well in A). Notice that some ESUs are made of many MUs (E.g. Central) while others are only one MU (e.g. MDB).

The two can be thought of as part of the same hierarchy, with ESUs reflecting more historic, evolutionary groups and MUs reflecting more recent (but not necessarily evolutionary) groups. For conservation management, this has traditionally meant that each ESU was managed independently of the others (to preserve their ‘pure’ evolutionary history) whilst translocations of individuals across MUs were common and often recommended. This is based on the idea that mixing very genetically different populations could cause adaptive genes in each population to become ‘diluted’, negatively affecting the ability of the populations to evolve: this is referred to as ‘outbreeding depression’ (OD).

Coffee comic
Sometimes, adding something can make what you had even worse than before. The most depressing analogy of outbreeding depression; a ruined coffee.

However, more recent research has suggested that the concerns with OD from mixing across ESUs are less problematic than previously thought. Analysis of the effect of OD versus not supplementing populations with more genetic diversity has shown that OD is not the more dangerous option, and there is a current paradigm push to acknowledge the importance of mixing ESUs where needed.

Adaptive potential and future evolution

Understanding the genetic basis of evolution also forms an important research area for conservation management. This is particularly relevant for ‘adaptive potential’: that is, the ability of a particular species or population to adapt to a variety of future stressors based on their current state. It is generally understood that having lots of different variants (alleles) of genes in the total population or species is a critical part of evolution: the more variants there are, the more choices there are for natural selection to act upon.

We can estimate this from the amount of genetic diversity within the population, as well as by trying to understand their previous experiences with adaptation and evolution. For example, it is predicted that species which occur in much more climatically variable habitats (such as in desert regions) are more likely to be able to handle and tolerate future climate change scenarios since they’ve demonstrated the ability to adapt to new, more extreme environments before. Examples of this include the Australian rainbowfishes, which are found in pretty well every climatic region across the continent (and therefore must be very good at adapting to new, varying habitats!).
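
As a rough illustration of what ‘the amount of genetic diversity’ can mean in practice, one commonly used summary is expected heterozygosity: the chance that two gene copies picked at random from the population carry different alleles. The sketch below calculates it for a single gene. The function name and the two gene pools are hypothetical, and a real study would average this across many genes and correct for sample size.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Chance that two gene copies drawn at random (with replacement) differ."""
    counts = Counter(alleles)
    total = sum(counts.values())
    # One minus the chance of drawing the same allele twice
    return 1 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical gene pools: a diverse population and a depleted one
diverse = ["black"] * 4 + ["blue"] * 3 + ["green"] * 3
depleted = ["black"] * 9 + ["blue"] * 1
print(expected_heterozygosity(diverse), expected_heterozygosity(depleted))
```

The diverse gene pool scores much higher than the depleted one, even though both contain ten gene copies: what matters is how many variants there are and how evenly they are spread, which is the raw material natural selection has to work with.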

Rainbowfish both.jpg
Left: The distribution of rainbowfish across Australia, with each colour representing a particular ecotype. Right: A photo of a (very big) tropical rainbowfish taken from a recent MELFU field trip. Source: MELFU Facebook page. He really got around after that one stint in that children’s story.

Genetics-based breeding programs and pedigrees

A much more direct use of genetic information for conservation is in designing breeding programs. We know that breeding related individuals can have very bad results for offspring (this is referred to as ‘inbreeding depression’): so obviously, we would avoid breeding siblings together. However, in complex breeding systems (such as polygamous animals), or in wild populations, it can be very difficult to evaluate relationships and overall relatedness.

That’s where genetics comes in: by looking at how similar or different the DNA of two individuals is, we can not only check what relationship they are (e.g. siblings, cousins, or very distantly related) but also get an exact value of their genetic relatedness. Since we know that having a diverse gene pool is critical for future adaptation and survival of a species, genetics-based breeding programs can maximise the amount of genetic diversity in following generations. We can even use a computer algorithm to make the very best of breeding groups, using a quirky program called SWINGER.
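
To give a feel for how DNA similarity can be turned into a number, here is a very simple sketch that counts the proportion of alleles two individuals share across a handful of genes. This is only a crude stand-in for relatedness: proper estimators also account for how common each allele is in the population, and this is not how SWINGER works under the hood. The function names, cats and genotypes are all invented for the example.

```python
from collections import Counter


def shared_alleles(genotype_a, genotype_b):
    """Number of alleles (0, 1 or 2) two diploid genotypes have in common."""
    overlap = Counter(genotype_a) & Counter(genotype_b)
    return sum(overlap.values())


def allele_sharing(individual_a, individual_b):
    """Average proportion of alleles shared across all loci (0 to 1)."""
    per_locus = [shared_alleles(a, b) for a, b in zip(individual_a, individual_b)]
    return sum(per_locus) / (2 * len(per_locus))


# Hypothetical genotypes at four loci for three cats
tom = [("A", "A"), ("C", "T"), ("G", "G"), ("A", "T")]
felix = [("A", "A"), ("C", "C"), ("G", "T"), ("A", "T")]
garfield = [("T", "T"), ("G", "G"), ("C", "C"), ("C", "G")]
print(allele_sharing(tom, felix), allele_sharing(tom, garfield))
```

In a breeding program, pairs with a low score like this would generally be preferred over pairs with a high one, since their offspring are less likely to suffer from inbreeding depression.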

Cats DNA dating
If You Are the One, conservation genetics edition.

Taxonomy for conservation legislation

Another (slightly more complicated) application of genetics is the designation of species status. Large amounts of genetic information can often clarify complex issues of species descriptions (later issues of The G-CAT will discuss exactly how this works and why it’s not so straightforward…).

Why should we care what we call a species or not? Well, much of the protective legislation at the government level is designed at the species level: legislative protections are often designated for a particular species, but don’t often distinguish particular populations. Thus, misidentified species can sometimes be lost if they were never detected as a unique species (and were assumed to be just a population of another species). Alternatively, managing two species as one based on misidentification could mess with the evolutionary pathways of both by creating unfit hybrids between species which do not naturally come into contact (say, breeding individuals from one species with another because we thought they were the same species).

Cryptic cats comic
Awkward.

Additionally, if we assume that multiple different species are actually only one species, this can provide an overestimate of how well that species is doing. Although in total it might look like there are plenty of individuals of the species around, if this was actually made of 4 separate species then each one would be doing ¼ as well as we thought. This can feed back into endangered status classification and thus conservation management.

These are just some of the most common examples of applied genetics in conservation management. No doubt going into the future more innovative and creative methods of applying genetic information to maintaining threatened species and populations will become apparent. It’s an exciting time to be in the field and inspires hope that we may be able to save species before they disappear from the planet permanently.