The ‘other’ allele frequency: applications of the site frequency spectrum

The site-frequency spectrum

In order to simplify our absolutely massive genomic datasets down to something more computationally feasible for modelling techniques, we often reduce them to some form of summary statistic. These are various aspects of the genomic data that can summarise the variation or distribution of alleles within the dataset without requiring the entire genetic sequence of all of our samples.

One very effective summary statistic that we might choose to use is the site-frequency spectrum (aka the allele frequency spectrum). Not to be confused with other measures of allele frequency which we’ve discussed before (like Fst), the site-frequency spectrum (abbreviated to SFS) is essentially a histogram of how frequent certain alleles are within our dataset. To do this, the SFS classifies each allele into a certain category based on how common it is, tallying up the number of alleles that occur at that frequency. The total number of categories is the maximum number of times an allele could occur: for organisms with two copies of every chromosome (‘diploids’, including humans), this is double the number of samples included. For example, a dataset comprised of genomic sequence for 5 people would have 10 different frequency bins.
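To make the tallying concrete, here is a minimal Python sketch (using NumPy) of how an SFS could be counted from genotype data. The 0/1/2 genotype encoding, the function name and the toy data are all assumptions for illustration, not the format of any particular program. Note that the resulting array has 2N + 1 entries, because it keeps the 0 class (sites where the counted allele is absent) alongside the 10 classes for alleles that occur 1 to 10 times.

```python
import numpy as np

def site_frequency_spectrum(genotypes):
    """Tally how many sites carry the counted allele 0, 1, ..., 2N times.

    genotypes: array of shape (n_sites, n_individuals) holding 0, 1 or 2
    copies of the counted allele per diploid individual (assumed encoding).
    """
    genotypes = np.asarray(genotypes)
    n_chromosomes = 2 * genotypes.shape[1]
    allele_counts = genotypes.sum(axis=1)  # copies of the allele at each site
    # minlength keeps the empty bins; entry i = number of sites with i copies
    return np.bincount(allele_counts, minlength=n_chromosomes + 1)

# Toy data (made up): 6 sites scored in 5 diploid individuals
toy = np.array([
    [0, 0, 0, 0, 0],  # non-variable site (0 copies)
    [1, 0, 0, 0, 0],  # singleton
    [0, 0, 1, 0, 0],  # singleton
    [1, 1, 0, 0, 0],  # doubleton spread over two individuals
    [2, 0, 0, 0, 0],  # doubleton carried by one individual
    [2, 2, 1, 0, 0],  # allele occurring 5 times
])
sfs = site_frequency_spectrum(toy)
print(sfs)  # [1 2 2 0 0 1 0 0 0 0 0]
```

The two doubleton sites land in the same bin even though the copies are distributed differently across individuals, which is exactly the information the SFS throws away.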

For one population

The SFS for a single population – called the 1-dimensional SFS – is very easy to visualise as a concept. In essence, it’s just a frequency distribution of all the alleles within our dataset. Generally, the distribution follows an exponential shape, with many more rare (e.g. ‘singletons’) alleles than there are common ones. However, the exact shape of the SFS is determined by the history of the population, and like other analyses under coalescent theory we can use our understanding of the interaction between demographic history and current genetic variation to study past events.

1DSFS example.jpg
An example of the 1DSFS for a single population, taken from a real dataset from my PhD. Left: the full site-frequency spectrum, counting how many alleles (y-axis) occur a certain number of times (categories of the x-axis) within the population. In this example, as in most species, the vast majority of our DNA sequence is non-variable (frequency = 0). Given the huge disparity in number of non-variable sites, we often select only the variable ones (and even then, often discard the 1 category to remove potential sequencing errors) and get a graph more like the one on the right. Right: the ‘realistic’ 1DSFS for the population, showing a general exponential decline (the blue trendline) for the more frequent classes. This is pretty standard for an SFS. ‘Singleton’ and ‘doubleton’ are alternative names for ‘alleles which occur once’ and ‘alleles which occur twice’ in an SFS.

Expanding the SFS to multiple populations

Further to this, we can expand the site-frequency spectrum to compare across populations. Instead of having a simple 1-dimensional frequency distribution, for a pair of populations we can have a grid. This grid specifies how often a particular allele occurs at a certain frequency in Population A and at a different frequency in Population B. This can also be visualised quite easily, albeit as a heatmap instead. We refer to this as the 2-dimensional SFS (2DSFS).

2dsfs example
An example of a 2DSFS, also taken from my PhD research. In this example, we are comparing Population A, containing 5 individuals (as diploid, 2 x 5 = max. of 10 occurrences of an allele) with Population B, containing 4 individuals. Each row denotes the frequency at which a certain allele occurs in Population B whilst the columns indicate the frequency at which a certain allele occurs in Population A. Each cell therefore indicates the number of alleles that occur at the exact frequency of the corresponding row and column. For example, the first cell (highlighted in green) indicates the number of alleles which are not found in either Population A or Population B (this dataset is a subsample from a larger one). The yellow cell indicates the number of alleles which occur 4 times in Population B and also 4 times in Population A. This could mean that in one of those populations 4 individuals have one copy of that allele each, or two individuals have two copies of that allele, or that one has two copies and two have one copy. The exact composition of how the alleles are spread across samples within each population doesn’t matter to the overall SFS.
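The same grid can be sketched in a few lines of Python. This is only an illustration under assumed inputs: genotypes are taken to be coded as 0, 1 or 2 copies per diploid individual, the function name is made up, and the toy data are invented. Rows follow Population B and columns follow Population A, matching the layout of the figure.

```python
import numpy as np

def joint_sfs_2d(geno_a, geno_b):
    """Build a 2D SFS from two genotype matrices scored at the same sites.

    geno_a, geno_b: arrays of shape (n_sites, n_individuals) with 0, 1 or 2
    copies of the counted allele per individual (assumed encoding).
    """
    geno_a = np.asarray(geno_a)
    geno_b = np.asarray(geno_b)
    counts_a = geno_a.sum(axis=1)  # allele count per site in Population A
    counts_b = geno_b.sum(axis=1)  # allele count per site in Population B
    shape = (2 * geno_b.shape[1] + 1, 2 * geno_a.shape[1] + 1)
    sfs2d = np.zeros(shape, dtype=int)
    # Add 1 to cell (count in B, count in A) for every site
    np.add.at(sfs2d, (counts_b, counts_a), 1)
    return sfs2d

# Toy data (made up): 4 sites, 5 individuals in A and 4 individuals in B
pop_a = np.array([[0, 0, 0, 0, 0],
                  [1, 1, 1, 1, 0],
                  [2, 2, 0, 0, 0],
                  [1, 0, 0, 0, 0]])
pop_b = np.array([[0, 0, 0, 0],
                  [2, 2, 0, 0],
                  [1, 1, 1, 1],
                  [0, 0, 0, 0]])
sfs2d = joint_sfs_2d(pop_a, pop_b)
print(sfs2d.shape)   # (9, 11)
print(sfs2d[4, 4])   # 2 sites occur 4 times in B and 4 times in A
```

The second and third toy sites reach the "4 and 4" cell through different mixes of heterozygotes and homozygotes, but both are counted identically.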

The same concept can be expanded to even more populations, although this gets harder to represent visually. Essentially, we end up with a set of different matrices which describe the frequency of certain alleles across all of our populations, merging them together into the joint SFS. For example, a joint SFS of 4 populations would consist of 6 (4 x 4 total comparisons – 4 self-comparisons, then halved to remove duplicate comparisons) 2D SFSs all combined together. To make sense of this, check out the diagrammatic tables below.

populations for jsfs
A summary of the different combinations of 2DSFSs that make up a joint SFS matrix. In this example we have 4 different populations (as described in the above text). Red cells denote comparisons between a population and itself – which is effectively redundant. Green cells contain the actual 2D comparisons that would be used to build the joint SFS: the blue cells show the same comparisons but in mirrored order, and are thus redundant as well.
annotated jsfs heatmap
Expanding the above jSFS matrix to the actual data, this matrix demonstrates how the matrix is actually a collection of multiple 2DSFSs. In this matrix, one particular cell demonstrates the number of alleles which occur at frequency x in one population and frequency y in another. For example, if we took the cell in the third row from the top and the fourth column from the left, we would be looking at the number of alleles which occur twice in Population B and three times in Population A. The colour of this cell is more or less orange, indicating that ~50 alleles occur at this combination of frequencies. As you may notice, many population pairs show similar patterns, except for the Population C vs Population D comparison.
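The count of 6 pairwise comparisons for 4 populations can be checked with a short Python sketch. The population labels simply follow the A to D naming used in the figures; nothing here is specific to any SFS software.

```python
from itertools import combinations

populations = ["A", "B", "C", "D"]  # labels as used in the figures

# Every unordered pair of different populations: no self-comparisons
# (the red cells) and no mirrored duplicates (the blue cells)
pairs = list(combinations(populations, 2))

n = len(populations)
# Same arithmetic as in the text: (n x n total - n self-comparisons) / 2
n_by_formula = (n * n - n) // 2

print(len(pairs), n_by_formula)  # 6 6
print(pairs)
```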

The different forms of the SFS

Which alleles we choose to use within our SFS is particularly important. If we don’t have a lot of information about the genomics or evolutionary history of our study species, we might choose to use the minor allele frequency (MAF). Given that SNPs tend to be biallelic, for any given locus we could have Allele A or Allele B. The MAF chooses the least frequent of these two within the dataset and uses that in the summary SFS: since the other allele’s frequency would just be 2N minus the frequency of the minor allele, it’s not included in the summary. An SFS made of the MAF is also referred to as the folded SFS.

Alternatively, if we know some things about the genetic history of our study species, we might be able to divide Allele A and Allele B into derived or ancestral alleles. Since SNPs often occur as mutations at a single site in the DNA, one allele at the given site is the new mutation (the derived allele) whilst the other is the ‘original’ (the ancestral allele). Typically, we would use the derived allele frequency to construct the SFS, since under coalescent theory we’re trying to simulate that mutation event. An SFS made of the derived alleles only is also referred to as the unfolded SFS.
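The relationship between the two forms can be shown with a small Python sketch that folds an unfolded SFS: each derived-allele count i is re-labelled as the minor-allele count, min(i, 2N - i). The function name and the example counts are made up for illustration.

```python
import numpy as np

def fold_sfs(unfolded):
    """Fold an unfolded SFS (entries for derived counts 0 .. n) into a
    minor-allele SFS (entries for minor counts 0 .. n // 2)."""
    unfolded = np.asarray(unfolded)
    n = len(unfolded) - 1  # number of chromosomes sampled (2N for diploids)
    folded = np.zeros(n // 2 + 1, dtype=unfolded.dtype)
    for derived_count, n_sites in enumerate(unfolded):
        # The minor allele is whichever of the two alleles is rarer
        minor_count = min(derived_count, n - derived_count)
        folded[minor_count] += n_sites
    return folded

# Made-up unfolded SFS for 5 diploids (derived counts 0 to 10)
unfolded = [100, 40, 20, 12, 8, 6, 4, 3, 2, 1, 0]
folded = fold_sfs(unfolded)
print(folded)  # [100  41  22  15  12   6]
```

Folding never changes the total number of sites; it only merges each class with its mirror image, which is why the folded SFS needs no knowledge of which allele is ancestral.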

Applications of the SFS

How can we use the SFS? Well, it can more or less be used as a summary of genetic variation for many types of coalescent-based analyses. This means we can make inferences of demographic history (see here for more detailed explanation of that) without simulating large and complex genetic sequences and instead use the SFS. Simulating the expected SFS under a given scenario (such as a bottleneck) and comparing it to our observed SFS allows us to estimate the likelihood of that scenario.
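As a rough illustration of what "comparing observed and expected" can look like, the Python sketch below scores a made-up observed SFS against two expected spectra using a multinomial log-likelihood. Several things here are assumptions rather than anything from a specific program: the constant-size expectation uses the standard neutral result that the number of alleles occurring i times is proportional to 1/i, the "flattened" spectrum is a crude hypothetical stand-in for a post-bottleneck expectation (a real analysis would simulate it), and sites are treated as independent.

```python
import numpy as np

def neutral_expected_proportions(n_chromosomes):
    """Expected share of variable sites in count classes 1 .. n - 1 under a
    neutral, constant-size population (proportional to 1 / i)."""
    i = np.arange(1, n_chromosomes)
    weights = 1.0 / i
    return weights / weights.sum()

def sfs_log_likelihood(observed, expected_proportions):
    """Multinomial log-likelihood of the observed counts, leaving out the
    constant term that is the same for every scenario."""
    observed = np.asarray(observed, dtype=float)
    expected_proportions = np.asarray(expected_proportions, dtype=float)
    return float(np.sum(observed * np.log(expected_proportions)))

# Made-up observed SFS for 5 diploids (classes 1 to 9, variable sites only)
observed = [40, 20, 13, 10, 8, 7, 6, 5, 4]

constant_size = neutral_expected_proportions(10)
flattened = np.full(9, 1.0 / 9)  # hypothetical flat post-bottleneck spectrum

ll_constant = sfs_log_likelihood(observed, constant_size)
ll_flat = sfs_log_likelihood(observed, flattened)
print(ll_constant > ll_flat)  # True: this toy SFS fits constant size better
```

Only the difference between the two log-likelihoods is meaningful here, since the dropped constant cancels when scenarios are compared on the same data.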

For example, we would predict that under a scenario of a recent genetic bottleneck, alleles which are rare in the population will be disproportionately lost due to genetic drift. Because of this, the overall shape of the SFS will shift to the right dramatically, leaving a clear genetic signal of the bottleneck. This works under the same theoretical background as coalescent tests for bottlenecks.

SFS shift from bottleneck example.jpg
A representative example of how a bottleneck causes a shift in the SFS, based on a figure from a previous post on the coalescent. Centre: the diagram of alleles through time, with rarer variants (yellow and navy) being lost during the bottleneck but more common variants surviving (red). Left: this trend is reflected in the coalescent trees for these alleles, with red crosses indicating the complete loss of that allele. Right: the SFS from before (in red) and after (in blue) the bottleneck event for the alleles depicted. Before the bottleneck, variants are spread in the usual exponential shape: afterwards, however, a disproportionate loss of the rarer variants causes the distribution to flatten. Typically, the SFS would be built from more alleles than shown here, and extend much further.

Contrastingly, a large or growing population will have a larger number of rare (i.e. unique) alleles from the sudden growth and increase in genetic variation. Thus, opposite to the bottleneck the SFS distribution will be biased towards the left end of the spectrum, with an excess of low-frequency variants.

SFS shift from expansion example.jpg
A similar diagram as above, but this time with an expansion event rather than a bottleneck. The expansion of the population, and subsequent increase in Ne, reduces the loss of alleles from genetic drift, allowing more new (and thus rare) mutations to persist in the population. This is shown by both the coalescent tree (left) and a shift in the SFS (right).
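One standard way to put a single number on these shifts is Tajima’s D, which is not used in this post but can be calculated directly from the SFS: an excess of rare alleles (as after an expansion) gives negative values, while a deficit of rare alleles (as after a bottleneck) gives positive values. The Python sketch below is a minimal version assuming an unfolded SFS of variable sites only; the two example spectra are made up.

```python
import numpy as np

def tajimas_d(sfs_variable):
    """Tajima's D from an SFS of variable sites.

    sfs_variable[k] = number of sites where the allele occurs k + 1 times,
    covering counts 1 .. n - 1 in a sample of n chromosomes.
    """
    xi = np.asarray(sfs_variable, dtype=float)
    n = len(xi) + 1
    i = np.arange(1, n)
    s = xi.sum()  # number of variable (segregating) sites
    # Mean number of pairwise differences, computed from the SFS
    pi = np.sum(xi * i * (n - i)) / (n * (n - 1) / 2)
    # Constants from Tajima (1989)
    a1 = np.sum(1.0 / i)
    a2 = np.sum(1.0 / i**2)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / np.sqrt(e1 * s + e2 * s * (s - 1))

# Made-up spectra for 5 diploids (classes 1 to 9)
expansion_like = [60, 15, 8, 5, 4, 3, 2, 2, 1]          # excess of singletons
bottleneck_like = [12, 12, 11, 11, 11, 11, 11, 11, 10]  # flattened spectrum

print(tajimas_d(expansion_like) < 0)   # True
print(tajimas_d(bottleneck_like) > 0)  # True
```

A spectrum that follows the neutral 1/i expectation exactly gives D = 0, so the sign of D indicates which end of the SFS is over-represented.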

The SFS can even be used to detect alleles under natural selection. For strongly selected parts of the genome, alleles should occur at either high (if positively selected) or low (if negatively selected) frequency, with a deficit of more intermediate frequencies.

Adding to the analytical toolbox

The SFS is just one of many tools we can use to investigate the demographic history of populations and species. Using a combination of genomic technologies, coalescent theory and more robust analytical methods, the SFS appears to be poised to tackle more nuanced and complex questions of the evolutionary history of life on Earth.

Notes from the Field: Octoroks

Scientific name

Octorokus infletus

Meaning: Octorokus from [octorok] in Hylian; infletus from [inflate] in Latin.

Translation: inflating octorok; all varieties use an inflatable air sac derived from the swim bladder to float and scan the horizon.

Varieties

Octorokus infletus hydros [aquatic morphotype]

Octorokus infletus petram [mountain morphotype]

Octorokus infletus silva [forest morphotype]

Octorokus infletus arctus [snow morphotype]

Octorokus infletus imitor [deceptive morphotype]

All octoroks.jpg
The various morphotypes of inflating octoroks. A: The water octorok, considered the morphotype closest to the ancestral physiology of the species. B: The forest octorok, with grass camouflage. C: The deceptive octorok, which has replaced its tufted vegetation with a glittering chest as bait. D: The mountainous octorok, with rock camouflage. E: The snow octorok, with tundra grass camouflage.

Common name

Variable octorok

Taxonomic status

Kingdom Animalia; Phylum Mollusca; Class Cephalopoda; Order Octopoda; Family Octopodidae; Genus Octorokus; Species infletus

Conservation status

Least Concern

Distribution

The species is found throughout all major habitat regions of Hyrule, with localised morphotypes found within specific habitats. The only major region where the variable octorok is not found is within the Gerudo Desert, suggesting some remnant dependency on standing water.

Octorok distribution.jpg
The region of Hyrule, with the distribution of octoroks in blue. The only major region where they are not found is the Gerudo Desert in the bottom left.

Habitat

Habitat choice depends on the physiology of the morphotype; so long as the environment allows the octorok to blend in, it is highly likely there are many around (i.e. unseen).

Behaviour and ecology

The variable octorok is arguably one of the most diverse species within modern Hyrule, exhibiting a large number of different morphotypic forms and occurring in almost all major habitat zones. Historical data suggests that the water octorok (Octorokus infletus hydros) is the most ancestral morphotype, with ancient literature frequently referring to them as sea-bearing or river-traversing organisms. Estimates from the literature suggest that their adaptation to land-based living is a recent evolutionary step which facilitated rapid morphological radiation of the lineage.

Several physiological characteristics unite the variable morphological forms of the octorok into a single identifiable species. Other than the typical body structure of an octopod (eight legs, largely soft body with an elongated mantle region), the primary diagnostic trait of the octorok is the presence of a large ‘balloon’ within the top of the mantle. This appears to be derived from the swim bladder of the ancestral octorok, which has shifted to the cranial region. The octorok can inflate this balloon using air pumped through the gills, filling it and lifting the octorok into the air. All morphotypes use this to scan the surrounding region to identify prey items, including attacking people if aggravated.

inflated octorok
A water morphotype octorok with balloon inflated.

Diets of the octorok vary depending on the morphotype and based on the ecological habitat; adaptation to different ecological niches is facilitated by a diverse and generalist diet.

Demography

Although limited information is available on the amount of gene flow and population connectivity between different morphotypes, by sheer numbers alone it would appear the variable octorok is highly abundant. Some records of interactions between morphotypes (such as at the water’s edge within forested areas) imply that the different types are not reproductively isolated and can form hybrids: how this impacts resultant hybrid morphotypes and development is unknown. However, given the propensity of morphotypes to be largely limited to their adaptive habitats, it would seem reasonable to assume that some level of population structure is present across types.

Adaptive traits

The variable octorok appears remarkably diverse in physiology, although the recent nature of their divergence and the observed interactions between morphological types suggest that they are not reproductively isolated. Whether these differences are the result of phenotypic plasticity (with environmental pressures responsible for the associated physiological changes in different environments) or are genetically coded at early stages of development is unknown due to the cryptic nature of octorok spawning.

All octoroks employ strong behavioural and physiological traits for camouflage and ambush predation. Vegetation is usually placed on the top of the cranium of all morphotypes, with the exact species of plant used dependent on the environment (e.g. forest morphotypes will use grasses or ferns, whilst mountain morphotypes will use rocky boulders). The octorok will then dig beneath the surface until just the vegetation is showing, effectively blending in with the environment and only occasionally choosing to surface by using the balloon. Whether this behaviour is passed down genetically or taught from parents is unclear.

Management actions

Few management actions are recommended for this highly abundant species. However, further research is needed to better understand the highly variable nature and the process of evolution underpinning their diverse morphology. Whether morphotypes are genetically hardwired by inheritance of determinant genes, or arise from alterations in gene expression caused by the environmental context of octoroks (i.e. phenotypic plasticity), provides an intriguing avenue of insight into the evolution of Hylian fauna.

Nevertheless, the transition from the marine environment onto the terrestrial landscape appears to be a significant stepping stone in the radiation of morphological structures within the species. How this has been facilitated by the genetic architecture of the octorok is a mystery.

Not that kind of native-ity: endemism and invasion of Australia

The endemics of Australia

Australia is world-renowned for the abundant and bizarre species that inhabit this wonderful island continent. We have one of the highest numbers of unique species in the entire world (in the top few!): this is measured by what we call ‘endemism’. A species is considered endemic to a particular place or region if that is the only place it occurs: it’s completely unique to that environment. In Australia, a whopping 87% of our mammals, 45% of our birds, 93% of our reptiles, 94% of our amphibians, 24% of our fishes and 86% of our plants are endemic, making us a real biodiversity paradise! Some lists even label us as a ‘megadiverse country’, which sounds pretty awesome on paper. And although we traditionally haven’t been very good at looking after it, our array of species is a matter of some pride to Aussies.

Endemism map
A map representing the relative proportion of endemic species in Australia, generated through the Atlas of Living Australia. The colours range from no (white; 0% endemics) or little (blue) to high levels of endemism (red; 100% of species are endemic). As you can see, some biogeographic hotspots are clearly indicated (southwest WA, the east coast, the Kimberley ranges).

But the real question is: why are there so many endemics in Australia? What is so special about our country that lends to our unique flora and fauna? Although we naturally associate tropical regions with lush, vibrant and diverse life, most of Australia is complete desert. That said, most of our species are concentrated in the tropical regions of the country, particularly in the upper east coast and far north (the ‘Top End’).

There are a number of different factors which contribute to the high species diversity of Australia. Most notable is how isolated we are as a continent: Australia has been separated from most of the rest of the world for millions of years. In this time, the climate has varied dramatically as the island shifted northward, creating a variety of changing environments and unique ecological niches for species to specialise into. We refer to the species groups that have been here throughout as ‘Gondwana relicts’, since their last ancestor with the rest of the world would have been distributed across the supercontinent Gondwana over 100 million years ago. These include marsupials, many bird groups (including ratites and megapodes), many fish groups and a plethora of others. A Gondwanan origin explains why they are only found within Australia, southern Africa and South America (the closest landmasses that were also historically connected to Gondwana).

Early arrivals and naturalisation to the Australian ecosystem 

But not all of Australia’s species are so ancient and ingrained in the landscape. As Australia drifted northward and eventually collided with the Sunda plate (forming the mountain ranges across southeast Asia), many new species and groups managed to disperse into Australia. This includes the first indigenous people to colonise Australia, widely regarded as one of the oldest human civilisations and estimated to have arrived down under over 65 thousand years ago.

Eventually, this connection also brought with it one of our most iconic species: the dingo. Estimates of their arrival date the migration at around 6 thousand years ago. As Australia’s only ‘native’ dog, there has been much debate about its status as an Australian icon. To call the dingo ‘native’ implies it’s always been there: but 6 thousand years is more than enough time to become ingrained within the ecosystem in a stable fashion. So, to balance the debate (and prevent the dingo from being labelled as an ‘invasive pest’ unfairly), we often refer to them as ‘naturalised’. This term helps us to disentangle modern-day pests, many of which are immensely destructive to the natural environment, from other species that have naturally migrated and integrated many years ago.

Patriotic dingo
Although it may not be a “true native”, the dingo will forever be a badge of our native species pride.

Invaders of the Australian continent

Of course, we can never ignore the direct impacts of humans on the ecosystem. Particularly with European settlement, another plethora of animals were introduced for the first time into Australia; these were predominantly livestock animals or hunting-related species (both as predators and prey). This includes the cane toad, widely regarded as one of the biggest errors in pest control on the planet.

When European settlers in the 1930s attempted to grow sugar cane in the far eastern part of the country, they found their crops decimated by a local beetle. In an effort to eradicate them, they brought over a species of cane toad, with the idea that they would control the beetle population and all would be well. Only, cane toads are particularly lazy and instead of targeting the cane beetles, they just thrived on all the other native invertebrates around. They’re also very resilient and adaptable (and highly toxic), so their numbers exploded and they’ve since spread across a large swathe of the country. Their toxic skin makes them fatal food objects for many native predators and they strongly compete against other similar native animals (such as our own amphibians). The cane toad introduction of 1935 is the poster child of how bad failed pest control can be.

DSC_0867_small
This guy here, he’s a bastard. Spotted in my parents’ backyard in Ipswich, QLD. Source: me, with spite.

But is native always better?

History tells a very stark tale about the poor native animals and the ravenous, rampaging pest species. Because of this, it is a widely adopted philosophical viewpoint that ‘native is always best’. And while I don’t disagree with the sentiment (of course we need to preserve our native wildlife, and not the massively overabundant pests), there are rare examples where nature is a little more complicated. In Australia, this is exemplified in the noisy miner.

The noisy miner is a small bird which, much like its name implies, is incredibly noisy and aggressive. It’s highly abundant, found predominantly throughout urban and suburban areas, and seems to dominate the habitat. It does this by bullying out other bird species from nesting grounds, creating a monopoly on the resource to the exclusion of many other species (even larger ones such as crows and magpies). Despite being native, it seems to have thrived on human alteration of the landscape and is a serious threat to the survival and longevity of many other species. If we thought of it solely under the ‘native is best’ paradigm, we would dismiss the noisy miner as ‘doing what it should be.’ The truth is really more of a philosophical debate: is it natural to let the noisy miner outcompete many other natives, possibly resulting in their extinction? Or is it only because of human interference (and thus is our responsibility to fix) that the noisy miner is doing so well in the first place? It’s not a simple question to answer, although the latter seems to be incredibly important.

Noisy miner harassing currawong
An example of the aggressive behaviour of the noisy miner (top), swooping down on a pied currawong (bottom). Despite the size differences, noisy miners will frequently attempt to harass and scare off other larger birds. Image source: Bird Ecology Study Group website.

The amazing biodiversity of Australia is a badge of honour we should wear with patriotic pride. Conservation efforts for our endemic fauna are severely limited by a lack of funding and resources, and despite a general acceptance of the importance of diverse ecosystems we remain relatively ineffective at preserving them. Understanding and connecting with our native wildlife, whilst finding methods to control invasive species, is key to conserving our wonderful ecosystems.

Why we should always pander to diversity

Diversity in the natural world

‘Diversity’ is a term that gets used a lot these days, albeit usually in reference to social changes and structures. However, diversity is not merely a human construct and reflects an extremely important aspect of the natural world at a variety of levels. From the smallest genes to the biggest ecosystems, diversity is a trait that confers a massive range of benefits to individuals, populations, species and even the entire globe. Let’s dissect this diversity down at different scales and see how beneficial it can be.

Hierarchy of diversity
The generalised hierarchy of life, with diversity being an important component of each tier. At the smallest tier, genes underpin all life. The collection of genetic diversity is often summarised into a population (as a single cohesive genetic unit). Several populations can be pooled together into a single (usually) cohesive species. Different species are then components of a larger community (which in turn are components of a broader ecosystem).

Genetic diversity

At the smallest scale in the hierarchy of genetic differentiation, we have the genes themselves. It is a well-established concept that having a diversity of genetic variants (alleles) within a population or species is critical to their future adaptation, evolution and persistence. This is because different alleles will have different benefits (or costs) depending on the environmental pressure that influences them; natural selection might favour one allele over another at one time, but a different one as the pressure changes. Having a higher number of alleles within the population or species means that there is a greater chance at least a few individuals will possess an adaptive gene with the changing environment (which we know can be quite rapid and very, very strong). The diversity serves as a ‘buffer’ against extinction; evolution by natural selection functions best when there are many options to choose from.

Without this diversity, species run the risk of having no adaptive genes at the ready to deal with a selective pressure. Either a new adaptive gene must mutate (or come about in other ways, such as through gene flow from another population or species) or the population/species will suffer and potentially go extinct. As strong selection causes the species to dwindle, it enters what is referred to as the ‘extinction vortex’. Without genetic diversity, they can’t adapt: thus, more individuals die off, causing more genetic diversity to be lost from the population. This pattern is a vicious cycle which can inevitably destroy species (without serious intervention).

Extinction vortex
A very dramatic representation of the extinction vortex.
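To give a rough sense of how quickly small populations lose diversity, the Python sketch below uses the standard expectation for heterozygosity under genetic drift alone, H_t = H_0 x (1 - 1/(2Ne))^t. This is an idealised model (no mutation, selection or gene flow) that is not discussed in the post itself, and the starting heterozygosity, population sizes and number of generations are made-up values.

```python
def expected_heterozygosity(h0, effective_size, generations):
    """Expected heterozygosity after drift alone:
    H_t = H_0 * (1 - 1 / (2 * Ne)) ** t."""
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations

# Hypothetical populations starting from the same diversity (H_0 = 0.5)
for ne in (10, 100, 1000):
    remaining = expected_heterozygosity(0.5, ne, generations=50)
    print(f"Ne = {ne}: expected heterozygosity after 50 generations = {remaining:.3f}")
```

The smaller the population, the faster the decline, which is the feedback at the heart of the vortex: fewer individuals means faster loss of diversity, which in turn makes further declines more likely.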

For this reason, captive breeding programs aim to maintain as much of the genetic diversity of the original population as possible. This reduces the probability of entering a downward extinction spiral from inbreeding depression and helps to maintain populations into the future (both the captive one and the wild population when we reintroduce individuals into the wild).

“Population” diversity

Because genetic diversity is critically important for species survival, we must also try to preserve the diversity of the entire gene pool of a species. This means conserving highly genetically differentiated populations within a species as a priority, as they may be the only ones that possess the necessary adaptive genes to save the rest of the species. This adaptive genetic variation can then be introduced into other populations in genetic rescue programs and serve as a means to semi-naturally allow the species to evolve. Evolutionarily-significant units (ESUs) are one measure of the invaluable nature of genetically unique populations.

Although many more traditional conservationists strongly believe that ESUs should be managed entirely independently of one another (to preserve their evolutionary ‘pedigree’ and prevent the risk of outbreeding depression), it has been suggested that the benefit of genetic rescue in many cases significantly outweighs this risk of outbreeding depression. For some species, this really is an act of rescue: they are at the edge of extinction, and if we do nothing we condemn them to die out.

Introducing genetic material across populations (or even species!) can generate new functional genes that allow the recipient species to adapt to selective pressures. This might sound very strange, and could be extremely rare, but examples of adaptive genetic material in one species originating from another species through hybridisation do exist in nature. For example, the black coat of wolves is a highly adaptive trait in some populations and is encoded by the K locus (the beta-defensin gene CBD103). However, the specific mutation at the K locus that generates the black coat colour actually first originated in domestic dogs; when wild wolves and domestic dogs interbred, this mutation was transferred into the wolf gene pool. Natural selection strongly favoured this new variant, and it very rapidly underwent strong positive selection. Thus, the adaptiveness of black wolves is thanks to a domestic dog mutation!

Species diversity

At a higher level of the hierarchy, the diversity of species within a particular community or ecosystem has been shown to be important for the health and stability of said community. Every species, however small or seemingly unimpressive, plays a role in the greater ecosystem balance, through interactions with other species (e.g. as predator, as prey, as competitor) and the abiotic environment. While some species are known to have very strong impacts on the immediate ecosystem (often dubbed ‘keystone species’, such as apex predators), all species have some influence on the world around them (we’re especially good at it).

Species interactions flowchart

The overall health and stability of an ecosystem, as well as the benefits it can provide to all living things (including humans) is largely determined by the diversity of species. For example, ‘habitat engineers’ are types of species that, by altering the physical environment around them (such as to build a home), directly provide new habitat for other species. They are a fundamental underpinning of many incredibly vibrant ecosystems; think of what a reef system would look like if there were no corals in it. There’d be no anemones growing colourfully; no fish to live in them; no sharks to feed on these non-existent fish. This is just one example of a complex ecosystem that truly relies on its inhabiting species to function.

Ecosystem jenga
Much like Jenga, taking out one block (a species) could cause the entire stack (the ecosystem) to collapse in on itself. Even if it stands up, however, the system will still be weaker without the full diversity to support it.

Protecting our diversity

Diversity is not just a social construct and is an important phenomenon in nature, at a variety of different levels. Preserving the full diversity of life, from genetic diversity within populations and species to full species diversity within ecosystems, is critical to maintaining healthy and robust natural systems. The more diversity we have at each level of this hierarchy, the greater robustness and security we will have in the future.

The history of histories: philosophy in biogeography

Biogeography of the globe

The distribution of organisms across the Earth, both over time and across space, is a fundamental aspect of the field of biogeography. But our understanding of the mechanisms by which organisms are distributed across the globe, and how this affects their evolution, can be at times highly enigmatic. Why are Australia and the Americas the only two places that have marsupials? How did lemurs get all the way to Madagascar, and why are they the only primate that has made the trip? How did Darwin’s famous finches get over to the Galápagos, and why are there so many species of them there now?

All of these questions can be addressed with a combination of genetic, environmental and ecological information across a variety of timescales. However, the overall field of biogeography (and phylogeography as a derivative of it) has traditionally been rooted in a strong yet changing theoretical basis. The earliest discussions and discoveries related to biogeography as a field of science date back to the 18th Century, and to Carl Linnaeus (to whom we owe our binomial classification system) and Alexander von Humboldt. These scientists (and undoubtedly many others of that era) were among the first to notice how organisms in similar climates (e.g. Australia, South Africa and South America) showed similar physical characteristics despite being so distantly separated (both in their taxonomic groups and in geographic distance). The communities of these regions also appeared to be highly similar. So how could this be possible over such huge distances?

Arctic and fennec final
A pretty unreasonable mechanism (and example) of dispersal in foxes. And yes, all tourists wear sunglasses and Hawaiian shirts, even arctic fox ones.

 

Dispersal or vicariance?

Two main explanations for these patterns are possible: dispersal and vicariance. As one might expect, dispersal denotes that an ancestral species was distributed in one of these places (referred to as the ‘centre of origin’) before it migrated and inhabited the other places. In contrast, vicariance suggests that the ancestral species was originally distributed everywhere, with its range covering all of the contemporary ranges. However, changes in geography, climate or the formation of other barriers caused the range of the ancestor to fragment, with each fragmented group evolving into its own distinct species (or group of species).

Dispersal vs vicariance islands
An example of dispersal vs. vicariance patterns of biogeography in an island bird (pale blue). In the top example, the sequential separation of parts of the island also causes parts of the distribution of the original bird species to become fragmented. These fragments each evolve independently of their ancestor and form new species (red, and then blue). In the bottom example, the island geography doesn’t change but in rare events a bird disperses from the main island onto a new island. The new selective pressures of that island cause the dispersed birds to evolve into new species (red and blue). In both examples, islands that were recently connected or are easy to disperse across do not generate new species (as with the sandy island in the bottom right). You’ll notice that both processes result in the same biogeographic distribution of species.

In early biogeographic science, dispersal was the most heavily favoured explanation. At the time, there was no clear mechanism by which organisms could be present all over the globe without some form of dispersal: it was generally believed that the world was a static, unmoving system. Dispersal was well supported by some biological evidence such as the diversification of Darwin’s finches across the Galápagos archipelago. Thus, this concept was supported through the proposals of a number of prominent scientists such as Charles Darwin and A.R. Wallace. For others, however, the distance required for dispersal (such as across entire oceans) seemed implausible and biologically unrealistic.

 

A paradigm shift in biogeography

Two particular developments in theory are credited with a paradigm shift in the field: cladistics and plate tectonics. Cladistics simply involved using shared biological characteristics to reconstruct the evolutionary relationships of species (think of phylogenetics, but using physical traits instead of genetic sequence). Just as important, however, was plate tectonic theory, which provided a clear way for organisms to spread across the planet. Understanding that, deep in the past, all continents had been directly connected to one another provides a convenient explanation for how species groups spread. Instead of requiring species to travel across entire oceans, continental drift meant that one widespread and ancient ancestor on the historic supercontinent (Pangaea; or subsequently Gondwana and Laurasia) could become fragmented. It only required that groups were very old, but not necessarily very dispersive.

Lemur dispersal
Surf’s up, dudes! Although continental drift was no doubt an important factor in the distribution and dispersal of many organisms on Earth, it actually probably wasn’t the reason lemurs got to Madagascar. Sorry for the mislead.

From these advances in theory, cladistic vicariance biogeography was born. The field rapidly overtook dispersal as the most likely explanation for biogeographic patterns across the globe by providing not only a clear mechanism to explain these patterns but also an analytical framework to test questions relating to them. Further developments in the analytical backbone of cladistic vicariance allowed for more nuanced questions of biogeography to be asked, although these still fundamentally ignored the role of potential dispersals in explaining species’ distributions.

Modern philosophy of biogeography

So, what is the current state of the field? Well, the more we research biogeographic patterns with better data (such as with genomics) the more we realise just how complicated the history of life on Earth can be. Complex modelling (such as Bayesian methods) allows us to more explicitly test the impact of Earth history events on our study species, and can provide a more detailed overview of the evolutionary history of the species (such as by directly estimating times of divergence, amount of dispersal and extent of range shifts).

From a theoretical perspective, the consistency of patterns across groups is always in question, and exactly what determines which species occurs where is still somewhat debatable. However, the greater number of types of data we can now include (such as geological, paleontological, climatic, hydrological, genetic…the list goes on!) allows us to paint a better picture of life on Earth. By combining information about what we know happened on Earth with what we know has happened to species, we can start to make links between Earth history and species history to better understand how (or if) these events have shaped evolution.