
I’ve been brought in to work on several cases in which incest has been suspected in a family. Some of these cases involve a woman having a child with a brother or uncle.

The first thing I like to ask someone when they discover that there’s incest in their family is if they feel they need to talk to a counselor. People shouldn’t have to navigate those feelings on their own if they don’t want to.

Most of the cases I’m aware of result from a person running the ‘Are Your Parents Related?’ (AYPR) tool at GEDmatch.com. If there’s a significant number of total centiMorgans (cM) in that result, GEDmatch will provide you with a link for a counselor. I used to think that this was a link to a traditional counselor, but I eventually learned that it only provides contact information for genetic genealogists who have experience with cases like this. While this is helpful, I believe that there should be a second link to a traditional counselor in the event that a person wants credentialed emotional support, and not just help navigating genetic genealogy.

After talking to a counselor, people are likely going to look for answers. On top of knowing that, for example, your parents are closely related, imagine how it feels to not know how they’re related.

The AYPR tool analyzes a person’s DNA kit and shows them the total cM they have with respect to runs of homozygosity (ROH). This blog post revealed a handy rule-of-thumb for improving upon the result of the AYPR tool. (I also have an article about how you can further improve upon that.)

However, it’s usually more useful to just look at the resulting total from the AYPR tool and see what relationships could result in that amount of cM or percentage of ROH. One can do that by looking at the tables I present below.

People who work on incest cases have developed some skills with regard to identifying possible relationships when a new case comes up. This should in no way give them confidence that they have all of the answers, or that they can arrive at the answer by using very limited datasets and, frankly, limited experience given the nature of this work. There is no substitute for the tables below. One could work on millions of cases and meticulously compile the data from each, but the resulting dataset would be very flawed compared to the numbers below. Why is that? Working on a case, by its very nature, is a result of not knowing how two parents were related. So one analyzes the numbers and then makes a determination, which is their best guess for a possible relationship that the parents have to each other. Only sometimes can this determination be subsequently verified through traditional genealogical methods. This also doesn’t rule out additional relationships that may exist. If a dataset is compiled from all of the cases in which a determination was made, then one would get reference points for what ranges of total cM a certain relationship might normally produce. But this is circular. How can data for unknown relationships then be used to determine other unknown relationships? The answer is that it would have to be done very carefully, that it would require thousands of cases, and that all of the data would have to be error free. This is impossible. What’s necessary is a scientific way of finding out exactly what the averages could be as well as a highly accurate method for determining the ranges.

I find myself in the position of having a highly accurate model that can calculate the likelihood of different relationships. (A word to those who might not understand this model: Barring simulation user error or misinterpretation of the results afterwards, the averages given by the model cannot be wrong. As for the ranges, since the model is trained to achieve standard deviations from peer-reviewed literature, the ranges will be better than what could be found in datasets, assuming that there ever will be data for some of these scenarios.)

In order to help more people than the ones I’ve worked with, and to help people find answers more quickly, I’ve started providing some statistics that I’ve already calculated, just as I do with my other models. I’m also adding additional scenarios as they come up or as I think of them.

If you are here because of incest in your own family, I do hope that you’ve already gotten any support that you feel you need. Below you will find tables that show what the AYPR results will look like for various scenarios.

Figure 1. Statistics for comparison of one’s own genome to itself, such as can be done with the ‘Are Your Parents Related?’ tool at GEDmatch.com.

Figure 2. The same statistics as in Figure 1, except converted to cM. I usually don’t report statistics in cM, since percentages are universal and cM would require a separate table for each platform. However, since one is likely to only find ROH numbers at GEDmatch, this is likely the only table that’s necessary. These cM values were obtained by multiplying the numbers in Figure 1 by 35.87 (and dropping the percentage units). These numbers can be compared directly to the AYPR tool.

If you have discovered that your father is actually a close relative of your mother, you may not be able to tell exactly how they’re related by looking at the regions over which your chromosome copies are identical. If your father is your mother’s brother or her son, the averages of those values are the same: You would expect to have about a quarter of your segments identical to each other. However, if you know that one of the two options is the case, Figures 1 and 2 could assist you if your identical regions add up to less than 9.3% or more than 45% of your genome. In those cases, and even for values that are several percentage points further towards the inside of that range, you could deduce that your father is not also your maternal uncle–it would be the case listed as ‘brother’ instead.
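As a rough illustration of that check, here is a minimal Python sketch. It assumes you start from a GEDmatch AYPR total in cM, converts it back to a percentage using the 35.87 factor described for Figure 2, and tests it against the 9.3% and 45% bounds mentioned above. The function and constant names are hypothetical and chosen only for this example; they are not part of any GEDmatch tool. The sketch also only applies when you already know that one of the two options (father is mother's brother, or father is mother's son) is the case.

```python
# Assumption: GEDmatch ROH conversion factor quoted in this article (percent -> cM).
GEDMATCH_ROH_CM_PER_PERCENT = 35.87

# Bounds quoted in the paragraph above for the parents-are-siblings scenario.
SIBLING_PARENTS_MIN_PCT = 9.3
SIBLING_PARENTS_MAX_PCT = 45.0


def roh_percent_from_cm(total_cm: float) -> float:
    """Convert an AYPR total (cM) back to a percentage of the genome."""
    return total_cm / GEDMATCH_ROH_CM_PER_PERCENT


def sibling_parents_ruled_out(roh_percent: float) -> bool:
    """True when the ROH percentage falls outside the quoted bounds.

    Only meaningful if the parents are already known to be either
    siblings or mother and son.
    """
    return (roh_percent < SIBLING_PARENTS_MIN_PCT
            or roh_percent > SIBLING_PARENTS_MAX_PCT)


if __name__ == "__main__":
    # Hypothetical AYPR totals, for illustration only.
    for total_cm in (250.0, 900.0):
        pct = roh_percent_from_cm(total_cm)
        print(f"{total_cm} cM -> {pct:.1f}% ROH, "
              f"siblings ruled out: {sibling_parents_ruled_out(pct)}")
```

A result of False does not confirm either relationship; it only means the total falls inside the quoted bounds. The sketch uses the hard bounds only and does not try to encode the softer "several percentage points" margin mentioned above.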

Below are some other relationship possibilities that have come up. A person can compare a percentage of shared DNA to a known relative and see if it’s possible based on the tables below. Percentages can also be converted to cM, usually by multiplying by a value in the range 68-72, depending on the total cM for a genotyping site. For GEDmatch, the value is 71.74. However, when converting FIR or ROH percentages to cM, you have to use half of the conversion rate, so 35.87 for GEDmatch. Also please note that X-DNA values, such as what would be reported at 23andMe, must be removed from your data before comparing to the tables below.
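The conversion described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the `fir_or_roh` flag are hypothetical, the default factor is the GEDmatch value of 71.74 given above, and other sites would need their own factor, usually somewhere in the 68-72 range.

```python
# Assumption: GEDmatch conversion factor quoted in this article (percent -> cM).
GEDMATCH_CM_PER_PERCENT = 71.74


def percent_to_cm(percent: float,
                  cm_per_percent: float = GEDMATCH_CM_PER_PERCENT,
                  fir_or_roh: bool = False) -> float:
    """Convert a percentage from the tables to cM.

    FIR and ROH percentages use half of the usual conversion rate,
    e.g. 35.87 instead of 71.74 for GEDmatch.
    """
    rate = cm_per_percent / 2 if fir_or_roh else cm_per_percent
    return percent * rate


if __name__ == "__main__":
    # Hypothetical table values, for illustration only.
    print(round(percent_to_cm(25.0), 2))                   # ordinary shared DNA
    print(round(percent_to_cm(25.0, fir_or_roh=True), 2))  # FIR or ROH
```

Keeping the halving behind a flag, rather than hard-coding 35.87, means the same function works for a site with a different total cM by passing a different `cm_per_percent`.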

The ranges of shared DNA can be very useful. Sometimes there is a clear distinction between the possible relationships, where one scenario is highly likely and all others are very unlikely. A lot of the time, more than one scenario is probable, but usually one is slightly more probable.

Figure 3. Shared percentage of HIR (when not noted), HIR + FIR, or FIR DNA for various relatives when one’s parents are siblings to each other. In cases in which FIR values are present, HIR-only values are not included, as they are less useful than the other values. However, one can calculate the HIR values by subtracting half of the FIR values from the HIR + FIR values. Parents in this scenario always have HIR-only values of 50%, as usual. There is only one scenario in the above table in which oneself is not compared to another relative. That’s the last row, in which one’s child is compared to their half first-cousin, who is the half first-cousin, once removed, of one’s child.

Figure 4. Shared percentage of HIR (when not noted), HIR + FIR, or FIR DNA for various relatives when one’s father is also their mother’s father. All other information is the same as in Figure 3.

Figure 5. Shared percentages for your 1C1R, whose father’s parents are your maternal uncle and that uncle’s close relative listed in each row. Each simulation consists of 500k trials, which is important for understanding the minimum and maximum values listed.

I hope you’ve found these results useful. More will be on the way.

Feel free to ask me about modeling & simulation, genetic genealogy, or genealogical research. To see my articles on Medium, click here. And check out my nifty calculator that’s based on the first of my three genetic models. It lets you find the amount of an ancestor’s DNA you have when combined with various relatives.