Sat. Oct 1st, 2022
A building in a Ndebele village, South Africa.  The Ndebele speakers, currently about a million strong, arrived in South Africa with the Bantu expansion.

Humanity originated in Africa and stayed there for tens of thousands of years. To understand our shared genetic history, we inevitably have to look to Africa. Unlike elsewhere on the planet, however, African populations were present throughout our history, so they were not subject to the same kind of founder effects seen when populations expanded into unoccupied areas. Instead, those populations got jumbled up as groups migrated to new areas within the continent.

Figuring all this out would be a challenge, but it’s a challenge made even more difficult by the fact that most of the genome data comes from humans in the industrialized world, making Africa’s huge populations poorly sampled. That is beginning to change, and a new paper reports on the efforts of a group that has just analyzed more than 400 African genomes, many of which come from populations that have never participated in genome studies before.

New diversity

New genetic variants are constantly emerging. As a result, the oldest populations, those in Africa, should carry the most undescribed variation. But identifying that variation can be difficult when there are so many populations to sample; the study notes that there are more than 2,000 ethnolinguistic groups in sub-Saharan Africa, and only a small number of them have been sampled. The new study is a huge step forward, with more than 400 complete genome sequences from geographically dispersed populations. But even it is limited, with only 50 new ethnolinguistic groups and two vast regions of the continent represented by people from a single country (Zambia for Central Africa and Botswana for Southern Africa).

That said, the study still picked up about 3.4 million genetic variants that hadn’t been described before. These are sites in the genome with a base (A, T, C, or G) that was not seen there in other populations.

To put that in perspective, most of us have a lot of genetic variations. In the typical person in the new study, these newly identified variants only account for about 2-5 percent of the total variations in their genomes — the rest had been seen before. In addition, a large majority of them (88 percent) were only seen in one individual and thus may only represent a variation that occurred due to a mutation in the last few generations. So while there are some new variations here that will help us unravel Africa’s population history, most of what we’ve found is the kind of thing you’d expect from looking at random people elsewhere.
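The "seen in only one individual" figure is a simple tally, and a minimal sketch may make it concrete. The variant labels and the `singleton_fraction` function below are hypothetical illustrations, not the study's pipeline: each individual is represented as a set of variant identifiers, and the function reports what share of all distinct variants is carried by exactly one person.

```python
from collections import Counter


def singleton_fraction(genomes):
    """Return the share of distinct variants carried by exactly one individual.

    genomes: iterable of collections of variant identifiers, one per person
    (a toy representation, assumed here for illustration).
    """
    carriers = Counter()
    for variants in genomes:
        # set() guards against a variant being listed twice for one person.
        carriers.update(set(variants))
    if not carriers:
        return 0.0
    singletons = sum(1 for count in carriers.values() if count == 1)
    return singletons / len(carriers)


# Toy data: v1 is shared by everyone, the other four variants are private.
toy_genomes = [{"v1", "v2", "v3"}, {"v1", "v4"}, {"v1", "v5"}]
print(singleton_fraction(toy_genomes))  # 0.8
```

In this toy set, four of the five distinct variants appear in a single person, which is the same kind of pattern the 88 percent figure describes.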

If we were getting close to a good handle on the genetic variation present in Africa, we would expect the number of new variants to decrease as we add new genome sequences to the analysis, as each new genome would contain fewer and fewer undiscovered variants. So the researchers analyzed the genomes one by one and found no evidence that this was happening; we’re a long way from fully cataloging human diversity. However, they find that looking beyond West African populations would give us the greatest increase in previously undescribed variation.
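The one-by-one test can be sketched as a discovery curve. This is a toy illustration under stated assumptions, not the researchers' actual method: the `discovery_curve` function and the variant labels are hypothetical, and each genome is reduced to a set of variant identifiers. If a catalog were nearing completion, the counts returned would trend toward zero.

```python
def discovery_curve(genomes, known=()):
    """For each genome, in order, count the variants not seen in any earlier one.

    genomes: iterable of sets of variant identifiers (toy representation).
    known:   variants already in an existing catalog (optional).
    """
    seen = set(known)
    new_counts = []
    for variants in genomes:
        novel = set(variants) - seen
        new_counts.append(len(novel))
        seen |= novel
    return new_counts


# Toy data with hypothetical variant labels.
toy_genomes = [
    {"chr1:100:A>G", "chr1:250:C>T", "chr2:40:G>A"},
    {"chr1:100:A>G", "chr2:40:G>A", "chr3:7:T>C"},
    {"chr1:250:C>T", "chr3:7:T>C"},
]
print(discovery_curve(toy_genomes))  # [3, 1, 0]
```

The toy curve flattens after three genomes. The article's point is that the real data shows no such flattening.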

Population trend

To try to identify what the genomes tell us about population histories, the researchers turned to principal component analysis, which identifies key sources of differences in a wide range of data. The main difference separated speakers of Niger-Congo languages from the rest. The second largest difference reflected the geographic distance between Niger-Congo speakers in West Africa and those in Southern Africa. This is likely a product of the Bantu migration, which spread a mix of technology, language and DNA from a source in West-Central Africa, taking them to the rest of the continent.
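For readers unfamiliar with principal component analysis, here is a minimal sketch of the idea on toy genotype data. It is an assumption-laden illustration, not the study's analysis: the 0/1/2 genotype encoding, the `first_pc_scores` function, and the use of plain power iteration (rather than a dedicated genetics package) are all choices made here for simplicity. Each individual gets a score on the axis that captures the most variation, and individuals from the same toy group end up on the same side of that axis.

```python
import math


def first_pc_scores(genotypes, iterations=200):
    """Score each individual on the first principal component.

    genotypes: list of rows, one per individual, holding 0/1/2 counts of the
    alternate allele at each variant site (toy encoding, assumed here).
    The sign of the returned scores is arbitrary.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Center each variant (column) on its mean across individuals.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]

    def project(axis):
        return [sum(x * a for x, a in zip(row, axis)) for row in centered]

    # Power iteration on X^T X converges to the dominant axis of variation.
    axis = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        scores = project(axis)
        new_axis = [
            sum(centered[i][j] * scores[i] for i in range(n)) for j in range(m)
        ]
        norm = math.sqrt(sum(a * a for a in new_axis))
        if norm == 0.0:
            # No variation at all between individuals.
            return [0.0] * n
        axis = [a / norm for a in new_axis]
    return project(axis)


# Toy data: the first two individuals resemble each other, as do the last two.
toy_genotypes = [
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [2, 2, 0, 0],
    [2, 2, 0, 1],
]
for label, score in zip(["A1", "A2", "B1", "B2"], first_pc_scores(toy_genotypes)):
    print(label, round(score, 2))
```

Real analyses involve millions of variants and further components, but the principle is the same: the first axis picks out the largest source of difference, which in the study separated Niger-Congo speakers from everyone else.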

The researchers use this data to argue that the Bantu migration passed through Zambia on its way to southern and eastern Africa, but their data included many people from Zambia, so it’s not clear if that affected their results.

The work also identified a number of ethnolinguistic groups worth examining more closely. One was genetically similar to East Africans, but was located in West Africa. Two other populations were clearly associated with known language groups, but were not part of the narrow genetic cluster that most other speakers of that language fell within.

Almost every population on Earth is a mishmash of many sources. Native Americans, for example, are largely a hodgepodge of East Asian and ancient Siberian populations. Africans are certainly no different, but the fact that they have stayed on the same continent for so long increases the complexity of these interactions. The new data really drives that home when analyzed for the origin of different segments of DNA.

People from the far west of Africa have a large majority of their DNA from what you might call a West African source. But as you go east, into Central Africa, there’s an increasing amount of what you might call West-Central African DNA, which is then joined and later displaced by Central African and then some Southern and East African sources. There is a sudden shift towards a majority from East African sources as you leave Central Africa to the east, with an increasing contribution from Southern African sources as you move a little south.

While geography seems to cause the most differences, contributions from distant regions of the continent are in all populations. So while the Bantu migration may have been the greatest event in recent African history, it is layered on top of a long history of population interactions.

What is changing?

Most variations in the human genome are completely silent, as they do not affect genes or other functions and thus drift randomly through populations. However, a few offer an evolutionary advantage, and it may be possible to detect the signal of selection for or against specific variations.

Looking for these signals, the authors found exactly what you would expect based on previous studies of human populations. The strongest pressure on human evolution is disease, and the genes under most pressure are involved in immune functions. After disease comes diet, and again Africans are quite typical, with strong signs of selection on a handful of genes involved in carbohydrate and lipid metabolism. However, there were some strange results, such as selection for variants of genes involved in DNA repair, kidney disease and fibroids. Obviously these need to be examined in more detail before we can learn anything from them or tell whether they are just spurious.

Immune function isn’t the only way to deal with disease, as the effects of the sickle cell trait on malaria make clear. And since these are African populations, there is evidence of selection for that trait in some of them. But hemoglobin is not the only route to malaria resistance, and some populations show evidence of selection for another gene (G6PD). In some cases, populations with a high frequency of the sickle cell trait have ended up right next to others with strong G6PD selection, probably due to migration.

In addition to the cases where there are clear signals of selection, there are a number of cases where genes have been knocked out by mutation but are still present in multiple individuals in this data set. This has been seen a number of times before and has caused some confusion. In many cases we have no idea what the gene is doing and so we can’t say whether we should be surprised by the loss or not. In others, the gene actually appears to be essential based on studies of its loss in mice. Over time, we’ll probably better understand what’s going on, but each of these genes will need to be studied individually in order to do this.

The beginning of a story

While this is a great effort to understand humanity’s shared genetic history, it is more of a prologue than a complete story. We’re closer to capturing the full diversity of African populations, but we’re clearly not done yet. And we’ve been able to gather more information about some of the migrations within Africa that we know of, but we’re not at the point where we can infer anything about the migrations that we do not know about.

That last point is quite critical. At this stage, we can examine a piece of DNA and determine that it probably comes from, say, a West African population. But we can’t say much about how it got to West Africa in the first place. Evidence suggests that just as Eurasian populations picked up archaic DNA from Neanderthals, African populations picked up DNA from earlier branches of the human family tree. But without fossil or DNA-based descriptions of those branches, they remain “ghost lineages” invisible to us. It is possible that a small percentage of the stretches of DNA we currently assign to an African region belong to one of these branches, and we do not yet have the tools to identify them.

Nature, 2020. DOI: 10.1038/s41586-020-2859-7

By akfire1
