Use Family SNP Data to Phase Your Own Genome


Photo by: liber

I’ve already written a post about the challenges of phasing genotype data, but now I’m here to help you accomplish that task. Here is a checklist of what you will need:

  • Your personal SNP information (through either 23andMe, Navigenics, deCODEme, etc.)
  • The SNP information for your parents (preferably through the same company/microarray platform)

If you want to use my specific method you also need:

Of course, you can implement your own version of this phasing protocol with basic familiarity with a programming language or (more tediously) with some macros and IF statements in Microsoft Excel.

How to Phase Your Genome: A Conceptual Overview

[Figure: Phased v Unphased]

With information from both parents, it is possible to phase your genome (for the vast majority of SNP calls). We rely on the fact that, in most situations, you can identify exactly which allele was inherited from your father and which was inherited from your mother.

For example: If at a particular position, your genotype call is AT, your father’s genotype call is AA, and your mother’s genotype call is TT, then you know that the A must have come from your father, and the T must have come from your mother. Simple! We will refer to situations where phase can be determined as informative.

[Figure: Informative SNPs Chart]

The chart above outlines exactly which situations are informative. The good news is that every situation is informative with the exception of one: when both parents and the child are heterozygous. Here, we are unable to say for certain which allele was inherited from each parent.
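
The rule above can be sketched in a few lines of Python. This is only an illustration of the logic, not the code inside my Java program: the function name `phase_snp` is made up for this example, genotypes are assumed to be two-letter strings such as "AT", and anything that cannot be resolved (the all-heterozygous case, a no-call, or a combination that does not fit the parents) simply comes back as `None`.

```python
def phase_snp(child, father, mother):
    """Return (paternal_allele, maternal_allele) for one SNP, or None.

    Hypothetical helper for illustration. Each genotype is assumed to be
    a two-letter string such as "AT".
    """
    if any(len(g) != 2 for g in (child, father, mother)):
        return None

    # Try both ways of splitting the child's two alleles between the
    # parents and keep the splits that each parent could have provided.
    candidates = {
        (paternal, maternal)
        for paternal, maternal in ((child[0], child[1]), (child[1], child[0]))
        if paternal in father and maternal in mother
    }

    # Exactly one surviving split means the SNP is informative.
    if len(candidates) == 1:
        return candidates.pop()
    return None  # ambiguous (all three heterozygous) or not resolvable


print(phase_snp("AT", "AA", "TT"))  # informative: father gave A, mother gave T
print(phase_snp("AT", "AT", "AT"))  # the one uninformative situation
```

A homozygous child is the easy case: both splits are identical, so the set collapses to a single candidate.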

A sample implementation of how to phase a child’s DNA is illustrated below:

Phasing Data

Implementing this Phasing Strategy: I’m Here to Help

If you have all the files mentioned above and would like to phase your genome, then I am more than happy to provide you with a Java archive that will allow you to accomplish this task. Even more, I will provide detailed instructions as to how to use this archive (it’s really simple, I swear).

You can download the Java program here as a zip file. Once the download has finished, unzip the contents into the same folder. You should see Launcher.jar and PhaseME.jar. To launch the program, click Launcher.jar (I thought that was pretty obvious), and the GUI pictured below should appear:


The input files need to contain four columns in this order: rsid, chromosome, position, genotype. The genotype data should simply be AA, AT, TT, etc., without any slashes or quotation marks.
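
If you want to sanity-check a file before loading it, a small reader along these lines may help. It is a sketch under a few assumptions that go beyond what is stated above: the columns are tab-separated, the first row is the single header row, and a genotype is any two characters that are not slashes, quotes, or spaces. The names `Snp` and `read_genotypes` are invented for the example.

```python
from typing import Iterable, List, NamedTuple


class Snp(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str


def read_genotypes(lines: Iterable[str]) -> List[Snp]:
    """Parse rows of rsid, chromosome, position, genotype.

    Assumes tab-separated columns and exactly one header row.
    Raises ValueError on the first malformed row.
    """
    rows = []
    for number, line in enumerate(lines, start=1):
        if number == 1:
            continue  # skip the single header row
        line = line.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(f"line {number}: expected 4 columns, got {len(fields)}")
        rsid, chromosome, position, genotype = fields
        if len(genotype) != 2 or any(ch in "/\"' " for ch in genotype):
            raise ValueError(f"line {number}: bad genotype {genotype!r}")
        rows.append(Snp(rsid, chromosome, int(position), genotype))
    return rows
```

With a real file you would pass an open file handle, for example `read_genotypes(open("child.txt"))`, where `child.txt` is a placeholder name.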

Before running the program, you do need to make sure that all three files (both parents and the child) have the same number of rows and list the same SNPs in the same order. I have not incorporated any checks into the program for this. My recommendation is to use an IF statement in Microsoft Excel (version 2007) to make sure that all three files line up. Also, make sure that each file has only one header row.
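
If you would rather not do the line-up check in Excel, the same comparison can be sketched in Python. This is not part of my program; the function name `first_mismatch` is hypothetical, and each file is assumed to have already been read into a list of (rsid, chromosome, position, genotype) tuples with the header row removed.

```python
def first_mismatch(father, mother, child):
    """Return a description of the first alignment problem, or None.

    Each argument is a list of (rsid, chromosome, position, genotype)
    tuples, one per data row, in file order.
    """
    if not (len(father) == len(mother) == len(child)):
        return (
            f"row counts differ: father={len(father)}, "
            f"mother={len(mother)}, child={len(child)}"
        )
    for index, (f, m, c) in enumerate(zip(father, mother, child), start=1):
        # Compare the rsid in the first column of each row.
        if not (f[0] == m[0] == c[0]):
            return f"data row {index}: rsids differ ({f[0]}, {m[0]}, {c[0]})"
    return None


father = [("rs1", "1", 100, "AA"), ("rs2", "1", 200, "CT")]
mother = [("rs1", "1", 100, "TT"), ("rs2", "1", 200, "CC")]
child = [("rs1", "1", 100, "AT"), ("rs3", "1", 300, "GG")]
print(first_mismatch(father, mother, child))
```

A result of `None` means the three files line up row for row.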

Finally, select the files to be compared (father, mother, child), and select an output location and choose a name for the outputs. There will be 23 outputs with the following filenames: <yourchosenname>.chr<chromosome>.phased.txt. Each chromosome has its own output that shows you the haplotype inherited from the father and the haplotype inherited from the mother. This program just ignores Y and MT data. However, the program does have the ability to recognize whether the child is male or female, and it assigns the X chromosome haplotypes accordingly.
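
To make the output naming concrete, here is a rough sketch of how rows could be split into per-chromosome outputs. It is an illustration rather than the program's actual code: the function name `outputs_by_chromosome` is invented, and the chromosome labels ("1" to "22", "X", "Y", "MT") are an assumption about how your export labels them.

```python
from collections import defaultdict

# Chromosomes that are left out of the output, as described above.
SKIPPED = {"Y", "MT"}


def outputs_by_chromosome(rows, name):
    """Group rows under filenames of the form <name>.chr<chromosome>.phased.txt.

    rows is a list of (rsid, chromosome, position, genotype) tuples.
    """
    grouped = defaultdict(list)
    for row in rows:
        chromosome = row[1]
        if chromosome in SKIPPED:
            continue
        grouped[f"{name}.chr{chromosome}.phased.txt"].append(row)
    return dict(grouped)


rows = [
    ("rs1", "1", 100, "AT"),
    ("rs2", "X", 5, "AA"),
    ("rs3", "Y", 7, "AA"),
    ("rs4", "MT", 9, "AA"),
]
for filename in sorted(outputs_by_chromosome(rows, "me")):
    print(filename)
```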

Let me know about any problems with the program (e.g., if it does not produce any output), and I will check whether (1) your input files are the problem or (2) the program is the problem.


35 thoughts on “Use Family SNP Data to Phase Your Own Genome”

  1. 1. Thanks
    2. 22 files were created; there are no results for the X chromosome.
    3. How do we know that the processing is finished?

  2. Hi Leon,

    I’ve checked the program, and it turns out that for 23andMe data (which I assume you used) the X chromosome does not process properly since they only report 1 letter for male chromosomes (and the program expects two letters…even for males, they would all look homozygous on the X chromosome).

    I have not put in a bar to monitor the progress of the program, but when the Process button becomes depressed it is finished.

    I’ll work on a version that handles the 23andMe X chromosome, but if you would like to try on your own, edit your 23andMe data so that for males, when only one allele is reported (e.g., A), it is written as an apparent homozygote, “AA”.
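
If you would rather script that edit than do it by hand, something like the following would work. It is only a sketch: the function name `double_hemizygous_calls` is made up, the rows are assumed to be (rsid, chromosome, position, genotype) tuples, and the X chromosome is assumed to be labelled "X".

```python
def double_hemizygous_calls(rows):
    """Rewrite one-letter X-chromosome calls as two letters (A -> AA).

    rows is a list of (rsid, chromosome, position, genotype) tuples.
    Rows on other chromosomes and two-letter calls are left untouched.
    """
    fixed = []
    for rsid, chromosome, position, genotype in rows:
        if chromosome == "X" and len(genotype) == 1:
            genotype = genotype * 2
        fixed.append((rsid, chromosome, position, genotype))
    return fixed


rows = [("rs1", "X", 100, "A"), ("rs2", "X", 200, "CT"), ("rs3", "1", 300, "G")]
print(double_hemizygous_calls(rows))
```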


  3. Thanks very much for putting this together, Alex. I have written a little BASIC program to work with my family’s data, and the results are basically (ha!) similar except for a couple of variations:

    1) I flag “Mendelian inconsistencies,” which could indicate genotyping error, or more interestingly, microdeletions when several inconsistencies are clustered close together.

    Father CC
    Mother CC
    Child CT

    2) I make no attempt to phase the genotypes when one of the triad has a no-call, and I’m sure I could derive some haplotypes if I added some rules. I see you handle some no-calls well, but there are exceptions. For instance, I saw one where G was listed for the paternal haplotype:

    Father AG
    Mother --
    Child AA

    3) I go one step further and derive the maternal and paternal alleles that are *not* passed on to the child. These aren’t truly haplotypes, but they do cover long stretches of a chromosome. I found this useful for my purposes, where my son did not inherit an autosomal dominant trait from me, and I could compare my other alleles with cousins who did inherit the familial hearing impairment.

    4) I didn’t work with the X chromosome data myself, but I don’t seem to get any output for the X, even though I doubled the base call for the males.

    I look forward to exploring all of your current and future tools!

  4. Thanks for your comment Ann. Your basic program sounds very interesting, and you’ve spotted some good holes for me to plug in my app.

    I realize that for null calls, I was assuming that the data would be NN in my programming since the data I used had that. However, in my next version I’ll be sure to include --.

    Identification of hemizygous microdeletions was actually one of the next entries that I’m working on. For that, I’m putting together a program that provides some visual output along with an explanation of methods. Maybe that might be another interesting tool for personal genome/family genome analysis?

    Thanks for reading!
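
For anyone following along, the inconsistency flag and the "not passed on" alleles that Ann describes can be sketched roughly like this. It is my reading of her description, not her BASIC program and not my Java code: the names `mendelian_status` and `untransmitted` are invented, and no-calls are assumed to appear as either NN or --.

```python
NO_CALLS = {"NN", "--"}  # assumed no-call spellings


def mendelian_status(child, father, mother):
    """Return "no-call", "consistent" or "inconsistent" for one trio SNP."""
    if any(g in NO_CALLS or len(g) != 2 for g in (child, father, mother)):
        return "no-call"
    # Consistent if some split of the child's alleles fits both parents.
    splits = ((child[0], child[1]), (child[1], child[0]))
    if any(p in father and m in mother for p, m in splits):
        return "consistent"
    return "inconsistent"


def untransmitted(parent, transmitted):
    """Return the parental allele that was NOT passed on to the child.

    Raises ValueError if the parent does not carry the transmitted allele.
    """
    position = parent.index(transmitted)
    return parent[1 - position]


print(mendelian_status("CT", "CC", "CC"))  # Ann's example from point 1
print(untransmitted("AG", "A"))            # father AG passed on A, kept G
```

A run of "inconsistent" results close together on a chromosome is the pattern Ann mentions as a possible microdeletion.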

  5. Trying your program now, no output files created?
    Will this work on Affy platform?

  6. Hi Lesley,

    It should work with an Affy platform. I’m wondering if your input files are (1) all the same length and (2) formatted properly. If you want, I can check them for you.


  8. I’ve been working on writing my own program to phase my mother’s data (but her parents are dead), though since I don’t know as much about it, I’ve been going about it in a sideways fashion of taking regions where me and my parent match a given relative, and then comparing me and both my parents’ genomes there. Anyways, I’ve tried to give your program a go, but it’s not creating any output. I made sure the data files had exactly the same SNPs (had to do it for my own program’s purposes as well), so I’m wondering if it’s an issue with formatting…

  9. Oh, you already did exactly what I did. Even Ann has done the micro-deletion/inconsistencies thing. Reinventing the wheel is always fun 😛

  10. Anyone working with FAMILYTREEDNA data? More specifically, utilizing Y-DNA raw data AND familyfinder (autosomal data) to learn more about the mtdna?

  12. Program won’t run. I’m using Windows 7-64bit. Is this the problem?

    Result after clicking: asks if I want to save.

    Thanks, Greg
