B4GEN

Wednesday, 1 August 2018

SCIENCE GENETIC: WHAT IS ANCESTRY?


Source: https://medium.com/@dl1dl1/what-is-ancestry-842109cb8ebd




DNA Land
Know your #genome. Help #science. Non-profit.
Feb 16, 2016


What is ancestry?

This post is by Dr. Joe Pickrell (@joe_pickrell)
Anyone who has used commercial genetic testing products like those offered by 23andMe or AncestryDNA will be familiar with the idea of “genetic ancestry”. After you mail in a saliva kit, these companies return a report with seemingly precise numbers that tell you “what percent of your DNA” (to quote from the 23andMe report) can be traced back to different populations around the world.
At a superficial level, it seems like getting this estimate should be straightforward: look at someone’s genome, apply some fancy statistics, and out pop numbers like “20.7% British and Irish” or “40% Great Britain, 6% Ireland”. (These are numbers from my own 23andMe and AncestryDNA reports, respectively. An astute reader might be wondering: wait, shouldn’t those numbers be the same? Hold that thought.)
Once you state the problem of “ancestry inference” in more precise terms, however, you very quickly find yourself in the realm of sociology and psychology rather than statistics and genetics. In this series of blog posts, I’m going to discuss the approach to ancestry inference we’ve taken in DNA.Land, and in the process answer some frequently asked questions (as well as some infrequently asked ones) about our estimates.
But in this first post, I want to start at the beginning, with a discussion that’s broader than DNA.Land: what exactly is the goal of “ancestry inference” anyway?

What is “ancestry”?

A useful question for anyone working on algorithms for learning about ancestry from genetic data is: “How would you describe your ancestry?” Try to answer the question yourself. Ask your friends. Bug some strangers on the Internet.
If the people you talk to are anything like the people I’ve talked to, the answers will generally break down into two broad categories:
  1. Many people use geographic labels to describe their ancestry, often based on current political borders, e.g. “French” or “Chinese”.
  2. Many people use ethnic labels to describe their ancestry, e.g. “Jewish” or “Caucasian” [1].
Let’s take it for granted that the “correct” definition of “ancestry” is something that aligns with these intuitive responses. This suggests that people expect a genetic “ancestry test” to predict the geographic and/or ethnic labels of their ancestors.
Unfortunately, if you sit down and try to write an algorithm to do this, you will immediately come across two huge and mostly intractable problems.

Problem #1: What time depth are we talking about?

Obviously we all have ancestors that lived at different times. You had maybe 8 ancestors living 100 years ago, but many thousands that lived 500 years ago. So whose geographic and/or ethnic labels should we try to guess — those of your ancestors living 100 years ago, or those living 500 years ago? (Or 1,000 years ago? Or…?).
A reasonable first guess is that when people talk about their ancestry they’re generally talking about recent ancestors, such that the “correct” answer to this question is something like 100 years ago. But this isn’t satisfying: in the United States there are many people whose ancestors immigrated to the country hundreds of years ago but who think of their ancestry as (for example) “British” or “Chinese” rather than “Michigander” or “Californian”.
So it’s not totally clear what time depth people generally think of when they think about their ancestry. Indeed, it seems plausible that the “correct” time depth to report in an ancestry test depends on a user’s…ancestry. This should be a hint that this is not something that can be objectively read from DNA.
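The ancestor counts mentioned above follow from simple doubling: two parents, four grandparents, eight great-grandparents, and so on. A minimal sketch of that arithmetic is below. The function name `max_ancestors` and the 30-year generation length are assumptions chosen for illustration, not figures from the post, and the result is only an upper bound, because in a real family tree the same person can fill more than one slot.

```python
def max_ancestors(years_ago, years_per_generation=30):
    """Upper bound on the number of genealogical ancestors alive `years_ago`.

    Assumes a fixed generation length (hypothetical default: 30 years) and
    counts only complete generations. Real pedigrees contain repeated
    individuals, so the true number of distinct ancestors is smaller.
    """
    generations = years_ago // years_per_generation
    return 2 ** generations


for years in (100, 500):
    print(f"{years} years ago: at most {max_ancestors(years):,} ancestors")
```

Under these assumptions, 100 years corresponds to three generations and 500 years to sixteen, which is consistent with “maybe 8” ancestors in the first case and “many thousands” in the second.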

Problem #2: Ancestry identifiers are influenced by social and political factors

This becomes even clearer when you notice the fundamental problem that some of these labels that we think of as “ancestry” are strongly influenced by (and indeed sometimes determined by) social and political factors. Obviously no genetic markers change when someone converts to Judaism, or when the territory where someone lives is annexed by a neighboring country. But these events often have dramatic influences on how the descendants of these individuals think of their ancestry, via cultural transmission of things like languages and traditions.
Indeed, construction of a shared ancestral identity was (and remains) a method for consolidation of political power over diverse cultures (see e.g. Franco in Spain). This is largely invisible to genetics, except after hundreds or thousands of years (if shared identities influence subsequent marriage and/or migration patterns).

A solution

To get around all of these problems, what you would ideally like to have is a detailed list of your ancestors at different time depths, each labeled with their geographic location and any ethnic self-identifiers. You could then say, for example, that 100 years ago 25% of your ancestors lived in Illinois and identified as Jewish, while 500 years ago 5% of your ancestors lived in present-day Andalucia and identified as Muslim [2].
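The idealized list described above can be sketched as a small data structure. This is an illustration only: the `Ancestor` class, its field names, the `label_shares` function, and the toy records are all hypothetical, and no genetic test produces data in this form.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Ancestor:
    years_ago: int   # time depth at which this ancestor was alive
    location: str    # geographic label
    identity: str    # ethnic or religious self-identifier


def label_shares(ancestors, years_ago):
    """Fraction of ancestors at one time depth carrying each (location, identity) pair."""
    cohort = [a for a in ancestors if a.years_ago == years_ago]
    if not cohort:
        return {}
    counts = Counter((a.location, a.identity) for a in cohort)
    return {label: n / len(cohort) for label, n in counts.items()}


# Toy pedigree: 8 ancestors alive 100 years ago, 2 of them in Illinois.
# The "elsewhere"/"other" records are placeholders, not real data.
toy = [Ancestor(100, "Illinois", "Jewish")] * 2 + [Ancestor(100, "elsewhere", "other")] * 6

shares = label_shares(toy, 100)
print(shares[("Illinois", "Jewish")])  # 0.25
```

The point of the sketch is that the answer depends on which time depth you pass in: the same pedigree yields different proportions at 100 and at 500 years.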
Unfortunately, genetic tests are about as useful as Ouija boards for obtaining much of this information, so we’re going to have to compromise with some dramatic approximations [3]. Specifically, the approach taken by all of the commercial companies (and that we take as well) is to try to estimate the general geographic regions where your ancestors lived (and in a small number of select cases their ethnic identifiers) at some indeterminate time in the past, probably something like a few hundred years ago.
Does this all sound a bit vague? It should, because it is. The precision suggested by these reports is an illusion: there’s plenty of wiggle room in the definition of “general geographic regions” and “some indeterminate time in the past” to allow for very different interpretations [4].
But the key is this: if we replace an impossible goal of perfectly understanding the geography and ethnicity of your ancestors with the more realistic goal of getting a general understanding about some of them, we can now make some progress. This might seem a bit disappointing, in that we’ve abandoned the exactness and objectivity that seem promised by a “genetic test”.
But in many cases even an approximate understanding can be quite meaningful. Millions of people around the world have purchased these tests. Some have uncovered aspects of their family history that were kept secret for fear of discrimination (indeed I’m one of them). Some have discovered hospital mixups that led to puzzling mismatches between their cultural and genetic ancestries. Still others have confronted the genetic legacy of slavery in their own genomes.
So these types of inference, despite their important limitations, are nothing to scoff at. In the next post, I’ll discuss the exact methodologies used for ancestry inference by the commercial companies, and then the software we developed for DNA.Land.
References:
[1] Though the Caucasus is a geographic region, the word “Caucasian” is used in the United States as an ethnic identifier approximately synonymous with “white”.
[2] You might also be interested in whether you actually inherited any genetic material from each ancestor, but let’s avoid opening that can of worms for now and assume the properties of your genealogical ancestors are the same as those of your genetic ancestors.
[3] It’s plausible that by including things like written records you might be able to do reasonably well for recent ancestors, as is done by Ancestry.com without DNA. But this of course relies on all your ancestors having lived in places where written records are available and reliable.
[4] Note the different ancestry proportions reported to me by 23andMe and AncestryDNA. Most people think of these differences as different algorithmic solutions to the same question, but it’s entirely possible that the algorithms used by the two companies are answering slightly different questions! For example, it may be possible that the 23andMe algorithm is looking at slightly more recent ancestry on average than the AncestryDNA algorithm (I actually think this is indeed the case, for what it’s worth). On this general topic it’s worth reading this great post by Debbie Kennett comparing ancestry composition results across companies.
    Posted by david at 22:59