Saturday, December 9, 2023

Types of DNA evidence and how much you need


This blogpost is an attempt to summarise the typical types of DNA evidence that might be used to support genealogical conclusions when using genetic evidence.  It is intended to be used in conjunction with the DNA Case Study Template developed by Danielle Lautrec. These documents are to be discussed at the SAG: DNA Tools in Practice Meeting on 16th December 2023. It should be noted that these are my views and others may disagree.  As always, I welcome your feedback.


How much evidence is needed?

In 2021 I first wrote about the DNA Research Methodology.  This research methodology promotes slight variations depending on whether you are looking at close relations, such as 3rd cousins or closer, or more distant ancestors.  We need to adopt a similar approach when confirming genetic relationships, with additional evidence needed and more rigour applied when looking at distant relationships beyond 2nd great grandparents, or where there is no paper trail.





Not all DNA conclusions need the same level of supporting evidence; we need to think of it as a continuum.  Often it will be the impact the decision will have on others that drives how much evidence is needed, particularly if the person or their close family are still alive.  There will also be differences depending on whether your ancestor left behind a strongly documented paper trail, or you have started with a blank sheet!





When assessing close relationships, genetic conclusions need to be ‘beyond reasonable doubt’, particularly where there is no existing paper trail and they affect living people.


In contrast, given the passage of time, conclusions regarding more distant relationships may not always need to be ‘beyond reasonable doubt’.  However, as DNA inheritance is random, these decisions still need sufficient genetic evidence to support the hypothesis on ‘the balance of probabilities’, consistent with Genealogical Standards.


Naturally, genetic evidence cannot be used in isolation. Any conclusion reached needs to be reasonable and defensible, supported by other contextual information and genealogical records.  Continue to gather more evidence to move up the continuum.  



Examples of DNA evidence

The following table summarises the types of evidence you might use when documenting your conclusions; each is expanded upon later in this post.  There are probably more types of evidence that could be used, but I suspect these are the main ones.

Evidence examples, compared for '3rd cousin and closer' and 'More distant relationships'

1. Individual DNA matches - Assess likely shared cMs range for relationship

  3rd cousin and closer:
  • Can be confirmed with documented paper trail, if relationship and range met.
  • Segments for close cousins can be utilised to build pedigree segment base for ancestors (only indicative without triangulation).

  More distant relationships:
  • Indicative only - need additional confirmation including shared match groups and triangulation.
  • Segments for close cousins should be utilised to ‘walk back’ the segments to more distant ancestors.

2. Triangulated DNA matches/groups - Assess likely shared cMs range for relationship

  3rd cousin and closer:
  • Can be confirmed with documented paper trail, if relationship and range met.
  • Segments for close cousins can be utilised to build pedigree segment base for ancestors (triangulation ensures segments confirmed).
  • Helps to build hypothesis for no paper trail.

  More distant relationships:
  • Those with ‘tree and segment triangulation’ can be confirmed with documented paper trail, provided triangulation fundamentals are met.
  • Segments need to be verified by ‘walking back’ the segment, confirming earlier generations first. Utilise visual phasing data when available.
  • Triangulated groups help to build hypotheses for no paper trail.

3. Shared match groups and cluster reports

  3rd cousin and closer:
  • Tree triangulation within shared match clusters such as DNAGedcom.
  • Tools such as AncestryDNA Thru-Lines and Theories of Relativity at My Heritage strengthen the evidence.
  • Helps to build the hypothesis for close cousin relationships.

  More distant relationships:
  • Indicative only - shared matches <30cMs need to be examined very carefully when determining shared match groups.
  • Can be used to support hypothesis where there are members of clusters also in ‘triangulated segment groups’ with ‘tree triangulation’.

4. WATO charts

  3rd cousin and closer:
  • Relationship probability tool best used for predicting close relationships.
  • For predicted probabilities to be more reliable, matches need to be 40cMs or greater.

  More distant relationships:
  • WATO charts can be useful as a visual aid but as distant matches are usually <40cMs probabilities cannot be relied upon.
  • Helpful for documenting matches and overlaying other evidence that supports the hypothesis, ie. where triangulated groups or clusters exist.

5. Ethnicity reports

  3rd cousin and closer:
  • Percentages more useful where unique ethnicities exist. 25% = likely grandparent, 12.5% = likely great grandparent etc.

  More distant relationships:
  • Indicative only - Smaller percentages can be a mix of both sides and need to be used with caution.

6. Y-DNA and mtDNA

  • Best for confirming more distant relationships on patrilineal and matrilineal lines - ideal for testing hypotheses.



1. Individual DNA matches - shared cMs

All DNA sites have relationship predictors; for this post I am using the one from AncestryDNA.  Another popular predictor is the Shared cM Project Tool at DNA Painter.

For each of the following three matches, I have identified our most recent common shared ancestors, but can I confirm the genetic relationship with just match details alone?




Match A - Joanne:
* The paper trail for Joanne and me is strongly documented, our relationship being 2nd cousins, sharing great grandparents (Coat-Bradley);
* We share 227cMs over 10 segments, which in 69% of cases is suggestive of a 2nd cousin relationship;
* Given the paper trail and the genetic evidence both point to a 2nd cousin relationship (3rd cousins or closer), we can confirm the DNA match;
* The 10 segments shared with Joanne will be very helpful to further analyse more distant relationships and it is recommended that close matches be asked to upload to a chromosome site if possible;
* Until each segment is analysed individually there is no guarantee that all 10 segments came from the same common ancestor.  In fact, as Joanne shares great grandparents with me, each of these segments will probably have been inherited from different 4th or 5th great grandparents from either one of the 'identified MRCA couple';
* Segments from close cousins are the starting point for 'walking back the segments' and confirming your more distant ancestors.


Match B - Jennifer:
* There is 'no paper trail' for Jennifer and me, as our relationship is on my unknown great grandfather's line;
* We share 54cMs over 3 segments, which in 37% of cases is suggestive of a relationship as close as a 3rd cousin;
* I have undertaken detailed genetic analysis not documented in this post and I believe my great grandfather to be one of three brothers of Prussian descent.  Based on Jennifer's shared matches, her connection is also probably on this line;
* If my hypothesis is correct, my genealogical research suggests Jennifer is most likely a 'double' cousin, sharing two sets of common ancestors (one at the 4th cousin level and the other at 5th cousin level);
* If I had only identified one of the cousin relationships, I would have found that a fourth cousin relationship was likely in only 4% of cases - a trigger that further research was needed;
* Given there is no paper trail and the genetic evidence points to a relationship more distant than 3rd cousins, we cannot confirm the DNA match without further evidence;
* Ideally, the 3 segments shared with Jennifer need analysis at a chromosome site if possible;
* Until each segment is analysed individually there is no guarantee that all 3 segments came from the same common 'ancestor couple' or whether they split between the 'two ancestor couples';
* Shared match groups and clusters for this match need to be carefully examined, to ensure they are not inadvertently allocated to the wrong ancestral couple group;
* This match can only be marked as 'tentatively confirmed' to both ancestor couples.


Match C - Peter:
* The paper trail for Peter and me is strongly documented, our relationship being 4th cousins once removed, sharing my 4th GGP's (Richards-Coggan);
* The shared cMs of 21 over 2 segments for a 4th cousin once removed relationship is only found in 11% of cases, which is at the lower end.  In contrast, many closer alternate relationships are suggested.  Is the genealogy wrong?
* Whilst we have a paper trail the genetic evidence is low.  As the relationship is also distant (greater than 3rd cousins) we cannot confirm the DNA match without further evidence;
* The 2 segments shared with Peter need analysis at a chromosome site if possible;
* Until each segment is analysed individually there is no guarantee that both segments came from the same identified 'ancestor couple';
* Based on the analysis being solely at AncestryDNA this match can only be marked as 'tentatively confirmed'.
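
The reasoning applied to Matches A, B and C can be sketched as a small decision rule. This is only an illustrative Python sketch: the `Match` fields, the function name and the two status labels are hypothetical, and the rule covers only the 'individual match' stage, before any triangulation or cluster evidence is added. It also assumes a most recent common ancestor has already been identified or hypothesised.

```python
from dataclasses import dataclass


@dataclass
class Match:
    name: str
    has_paper_trail: bool         # documented genealogy back to the MRCA
    third_cousin_or_closer: bool  # documented or hypothesised relationship
    cm_in_expected_range: bool    # shared cMs plausible for that relationship


def individual_match_status(match: Match) -> str:
    """Status that can be assigned from individual match details alone."""
    if (match.has_paper_trail
            and match.third_cousin_or_closer
            and match.cm_in_expected_range):
        return "confirmed"
    # Distant relationship, no paper trail, or low cMs: more evidence needed.
    return "tentatively confirmed"


joanne = Match("Joanne", True, True, True)       # 2nd cousin, 227cMs
jennifer = Match("Jennifer", False, False, True) # no paper trail, 4th/5th cousin
peter = Match("Peter", True, False, True)        # 4th cousin once removed, 21cMs

for m in (joanne, jennifer, peter):
    print(m.name, "->", individual_match_status(m))
```

Only Joanne meets all three conditions, which matches the conclusions reached for the three matches above.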


2. Triangulated DNA matches

Chromosome analysis sites have an advantage in providing segment details for our matches.  'Segment Triangulation' is a process applied to identify matches who have tested their autosomal DNA and all match on the same chromosome, in the same segment area and all match each other.  It is applied to all chromosomes 1-23 and is evidence of a shared common ancestor.  

These groups of matches are called 'triangulated groups'.  We can confirm the match if we can also identify 'Tree Triangulation' for at least 3 matches in the triangulated group, provided triangulation fundamentals are met (for more information on triangulation concepts refer to the resource list at the end of this post).  
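
The definition of 'Segment Triangulation' can be expressed as a short check. The Python sketch below is a simplified illustration, not how any chromosome site implements it: the `Segment` structure, the function names and the positions in the example are all hypothetical, and it deliberately ignores minimum segment size and the other triangulation fundamentals.

```python
from dataclasses import dataclass
from itertools import combinations, product


@dataclass(frozen=True)
class Segment:
    kit_a: str
    kit_b: str
    chromosome: int
    start: int  # start position of the shared segment
    end: int    # end position of the shared segment


def common_region(segments):
    """Region shared by every segment, or None if they do not all overlap."""
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return (start, end) if start < end else None


def is_triangulated(kits, segments, chromosome):
    """True if every pair of kits matches each other on this chromosome
    and all of those pairwise segments overlap in one common region."""
    candidates = []
    for a, b in combinations(sorted(kits), 2):
        shared = [s for s in segments
                  if s.chromosome == chromosome and {s.kit_a, s.kit_b} == {a, b}]
        if not shared:
            return False  # this pair does not match each other here
        candidates.append(shared)
    # Try each combination of one segment per pair.
    return any(common_region(choice) is not None
               for choice in product(*candidates))


# Hypothetical positions, for illustration only.
segments = [
    Segment("me", "peter", 13, 20_000_000, 35_000_000),
    Segment("me", "cousin", 13, 22_000_000, 36_000_000),
    Segment("peter", "cousin", 13, 21_000_000, 34_000_000),
]
print(is_triangulated({"me", "peter", "cousin"}, segments, 13))
```

The important design point is the 'all match each other' requirement: if the Peter-to-cousin segment is missing, the function returns False even though both match me in the same area.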

The 23rd chromosome has unique inheritance properties which can be useful for narrowing down relationships. Whilst all companies test the 23rd chromosome (also called 'X'), you can only view these matches at GEDmatch, 23andMe and FTDNA.


Match B - Jennifer:
* Jennifer has not transferred her results to a chromosome site;
* However, her son has tested at My Heritage and shares 36cMs on one segment on Chromosome 13;
* A triangulated group has been established, consisting of several matches who have ancestors from Germany;
* Whilst the genetic evidence confirms the group shares a common ancestor, without 'tree triangulation' we are unable to determine which of the two possible shared ancestor couples is the relevant one;
* As the segment inherited by the son is the same size as the largest of Jennifer's 3 segments, we can conclude that the other 2 segments may be quite small and potentially the shared ancestor for those segments may be a long way back (possibly further back than the 4 likely 3rd/4th GGP's);
* Our confidence level that this is a confirmed DNA match may have increased given the genealogy of the triangulated group on chromosome 13, but without 'tree triangulation' with a third match it can still only be marked as 'tentatively confirmed';
* Without further evidence we still cannot assign the match/segment to either one of the two sets of possible shared ancestors 3GGP's (Noll-Zimmerman) or 4GGP's (Wedding-Rawolle).  For the purposes of my ongoing analysis the match/segments can only be assigned to the lowest level of confidence, in this case my 2nd GGP's (Noll-Wohling).



NOTE Current issues at MH prevent adding a triangulation image to this post.  
It will be updated when available.


Match C - Peter:
* Peter uploaded his results to GEDmatch which enabled analysis of our match at the chromosome level;
* GEDmatch indicates we share 13.4cMs on Chromosome 13 and 15.6cMs on Chromosome 14, a total of 29cMs.  Whilst AncestryDNA estimated we shared 21cMs over 2 segments, it is not unusual for variations like this to occur due to differences in reporting rules;
* By examining other matches who matched on the same chromosome, I identified another match descended from the same couple I shared with Peter, my 4th GGP's (Richards-Coggan).  This match also matched in the identical segment area on both chromosomes;
* My analysis has identified 'Segment Triangulation' AND 'Tree Triangulation' on both segments providing evidence of shared common ancestors, my 4th GGP's (Richards-Coggan), for Peter (4th cousin once removed) and a 5th cousin once removed;
* Given the paper trail is robust for both matches and the genetic evidence is confirmed by 'Segment Triangulation' (distant relationships), we can confirm both DNA matches.


3. Shared Match Groups and Cluster Reports

Shared match groups at AncestryDNA are very useful for classifying your matches into grandparents groups and more distant ancestral groups.  They are most useful for classifying matches to the grandparent and great grandparent levels.  Difficulty arises for more distant cluster groups as we are unable to view how matches match each other, unless we have access to individual match lists. 

Another useful tool is DNAGedcom which can provide a visual representation of cluster matches and also displays which matches are in common with each other.  The images used in this section were extracted from DNAGedcom with matches down to 20cMs consistent with AncestryDNA shared matches.  

Remember however that DNAGedcom provides us with details of 'shared matches' not 'shared segments' so these reports are always 'clues'.  We can hypothesise about what they are telling us based on the number of shared segments for each match and using other known 'triangulated segment' data.
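
To illustrate why shared match reports are only 'clues', the sketch below groups matches purely from shared-match pairs. It is a deliberately simple Python illustration (grouping anyone linked directly or indirectly) and is an assumption for teaching purposes, not the clustering method used by DNAGedcom or AncestryDNA; the function name and the example data are hypothetical.

```python
from collections import defaultdict


def shared_match_groups(shared_pairs):
    """Group matches that are linked, directly or indirectly,
    by appearing on each other's shared match lists."""
    links = defaultdict(set)
    for a, b in shared_pairs:
        links[a].add(b)
        links[b].add(a)

    groups, seen = [], set()
    for start in sorted(links):
        if start in seen:
            continue
        group, queue = set(), [start]
        while queue:
            person = queue.pop()
            if person in group:
                continue
            group.add(person)
            queue.extend(links[person] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical shared-match pairs.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
print(shared_match_groups(pairs))
```

Notice that A and C end up in the same group without any information about which segments are involved. A could share one segment with B while C shares a completely different one, which is why segment data is still needed to confirm the relationship.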


Match A - Joanne:
* In this case we know the relationship, have a strong paper trail and have DNA confirmed the 2nd cousin match (3rd cousin or closer); 
* Unfortunately Joanne does not have her results on a chromosome site, however her brother is on FTDNA but not AncestryDNA;
* Shared matches with Joanne at AncestryDNA show 38 shared matches, that could come from any of the 4 ancestral lines associated with our shared great grandparents;
* Of these matches I have identified shared ancestors for 27 of them, split between all 4 possible ancestral lines.  10 remain unidentified;
* The DNAGedcom report suggests 3 main cluster groups which, based on the matches included, are consistent with our 4 shared 2nd GGP's.  The cluster report shows many interrelationships, as is common for a close relationship like 2nd cousin.  Only 24 matches appear in the DNAGedcom report, suggesting that the other 13 matches may not share other matches over 20cMs with Joanne;
* 13 shared matches have their results on a chromosome site, all of whom are triangulated with Joanne's brother, on multiple chromosomes at GEDmatch, providing further evidence that our DNA confirmation was correct;
* The triangulated data for Joanne's brother adds further weight to the evidence that the shared matches appearing in these groups and clusters are also likely to be related on the same ancestral line.





Match B - Jennifer:
* The hypothesis for this match is that Jennifer is a 'double' cousin, both a 4th and 5th cousin, related to me via two sets of shared ancestors Noll-Zimmerman (3GGP's via Noll-Wohling 2GGP's) and Wedding-Rawolle (4GGP's via Noll-Wohling 2GGP's and Wohling-Wedding 3GGP's);
* There is a high level of confidence regarding the largest segment match due to identifying a triangulated group on Chromosome 13 with Jennifer's son, but without 'Tree Triangulation' the shared ancestral couple has not been identified;
* There are 18 shared matches with Jennifer at AncestryDNA, all of whom are consistent with being connected on my Prussian line, ancestors from my 'mystery' great grandfather, his parents Noll/Wohling (2GGP's).  13 matches have identified MRCA's, 2 x Noll-Wohling; 3 x Wohling-Wedding; 6 x Noll-Zimmermann; and 2 x Wedding-Rawolle; 
* The DNAGedcom shared match cluster report shows Jennifer in Cluster 22 with nine others.  7 of these share the ancestors of Noll-Zimmerman (3rd GGP's). A number of matches in Cluster 22 also share with others from Clusters 9 and 11;
* Jennifer shares with two matches from Cluster 9 sharing the MRCA of Wohling-Wedding (2GGP's); 
* Jennifer shares a match from Cluster 11 sharing the MRCA of Wohling-Wedding (3GGP's);
* Jennifer shares a match from Cluster 14 sharing the MRCA of Wedding-Rawolle (4GGP's);
* The many shared matches within the various cluster groups appear to support the hypothesis;
* Cluster 22 provides further evidence to support the relationship of Jennifer to the shared ancestral couple of Noll-Zimmerman;
* It can be concluded that the shared matches with Jennifer in Clusters 9, 11 and 14 are all potentially segments from the Wedding line.  This further supports the shared ancestral couple of Wedding-Rawolle and could be an indication of shared segments 'walking back up the line';
* It still remains unclear from looking at clusters alone which ancestor group is the MRCA for the triangulated group with Jennifer's son on Chromosome 13.  Looking at the sheer numbers of confirmed matches one might suspect the largest segment of 35cMs would be coming from the Noll-Zimmermann group, also being the closest ancestral couple.  However, when looking at shared matches for known Wedding-Rawolle descendants it seems more likely to be a segment on the Wedding-Rawolle line (but this is by no means certain); 
* Assuming my hypothesis regarding the identity of my great grandfather is correct, then we do have documentary evidence that two different relationships would exist between Jennifer and me;
* Whilst we do not have segment data to definitively confirm either of the genetic relationships, the cluster reports provide a much higher level of confidence that both connections to the MRCA's of Noll-Zimmerman and Wedding-Rawolle are also highly likely genetically.
* Where do we now sit on the continuum?  To answer my great grandfather question I need to rely on much more evidence than this match alone, so perhaps it is sufficient for my purposes given I have a lot of other supporting data.  However in other circumstances, it may be quite critical to resolve where each segment was coming from to be able to answer your research question.  
As a side note, when I first started researching my great grandfather, it was this match that gave me the most heartache, but in the end it was the one that helped the most to finally narrow down the possibilities between two families. 






Match C - Peter:
* In this case we know the relationship is 4th cousin once removed, have a strong paper trail but the DNA evidence was initially low;
* The connection was later DNA confirmed through segment triangulation, on two chromosomes but with the same match;
* At AncestryDNA there are only 6 shared matches with Peter.  Of those, 5 matches have identified MRCA's, 1 sharing my 2GGP's (Coat-Richards) and 4 sharing our 4GGP's (Richards-Coggan) the same ancestors I share with Peter.  Whilst we have no segment data for all but one of these, these matches provide additional evidence to support our conclusions about our genetic relationship, increasing our confidence level;
* The DNAGedcom report places Peter in Cluster 45, which includes 4 of the identified matches for our shared 4GGP's (Richards-Coggan), including the match who triangulates on both Chromosomes 13 and 14;
* Clusters 44-47 form a super cluster, with a number of shared matches between the groups.  Peter only has one match shared with Cluster 44, but these are additional clues we can use to build our DNA evidence;
* Cluster 44 contains matches who share my 3GGPs (Richards-Richards) on the same ancestral line as my connection to Peter, via Susannah Richards (no known relation to her husband John);
* The match that Peter shares in Cluster 44 (green arrows) is the same as at AncestryDNA who shares my 2GGP's (MRCA Coat-Richards).  This suggests that the match in Cluster 44 matches me on more chromosomes than Peter, but the one she shares with Peter is a Richards-Coggan segment.  Other segment matches for her in Cluster 44 may have been inherited from John Richards' line, Richards-Lloyd;
* My Coat-Richards match in Cluster 44 (red arrows), shows two shared matches with Cluster 45, which we can hypothesise as likely Richards-Coggan segments.  The first we know is Peter and the second turns out to be the same match that is on GEDmatch and triangulates on Chromosomes 13 and 14.  
* This suggests that our matches in Cluster 45 (who match Peter and the other one in the C13/14 TG) probably triangulate on the same segment area on at least one of those chromosomes.  Whilst these matches have not uploaded to a chromosome site, the additional evidence provided by the DNAGedcom cluster report increases the level of likelihood of the DNA relationship with Peter to highly likely;
* Whilst Peter shares with other matches who share the Richards-Coggan ancestors in Cluster 44, there is one he does not share with.  Whilst this match shares with others in the cluster who are Richards-Coggan descendants, it suggests their shared segment is probably on a different chromosome, not C13/14 where Peter shares.  It should also be remembered that without seeing segment data it could be coming from a different ancestor entirely, particularly in close communities;
* We started out with an individual match to a distant MRCA with a strong paper trail but our genetic evidence was low and we could only mark the relationship as 'Tentatively Confirmed';
* By next using segment triangulation, we were able to mark the relationship as 'DNA Confirmed', but there was only one other match with 'tree triangulation' in the triangulated group;
* As we can now see, by working between the triangulated segment and the shared cluster data, utilising DNAGedcom cluster reports, we were able to gather much more evidence to support the DNA Confirmation, increasing our confidence level to 'highly confident'.




4. WATO Reports

The 'What Are The Odds?' tool from DNA Painter is a fabulous resource that uses DNA and probability estimates to predict potential relationships.  It works best for closer relationships and, for predicted probabilities to be more reliable, matches need to be 40cMs or greater.  A low match, or even the absence of a match, can correctly rule out hypotheses that might otherwise have been possible.

The first image shows an example of how a half second cousin might fit into the tree.  This could be used to support a hypothesis regarding potential relationships.  In this example the predicted probability is '1', which indicates the relationship is possible, but other relationships are also possible.  More evidence needs to be gathered, both genealogical and genetic, particularly if there is no paper trail.





The second chart takes this analysis further for more distant relationships on the same ancestral line.  The underlying probability statistics in WATO don't go below 40cMs and distant relationships have many segment matches under that amount. Whilst not useful for probability predictions for distant ancestors, WATO is a great way to plot your matches to give you a visual representation of the size and spread of matches and what additional evidence you have gathered.  I use it as the base for my analysis documentation in PowerPoint and add evidence of clusters and triangulated groups to the page.  Notes can also be added if there are anomalies that need to be looked at.

Remember, at the distant cousin level isolated matches are not evidence, just clues.  To provide additional evidence they need to form part of either a segment triangulated group or a shared match cluster.
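
The 40cMs guideline can be applied mechanically before relying on any predicted probabilities. The following Python sketch is illustrative only: the constant and function names are hypothetical, the threshold is simply the figure quoted in this post, and the shared cMs are the AncestryDNA totals for the three matches discussed earlier.

```python
WATO_MIN_CM = 40  # threshold quoted in this post for more reliable probabilities


def split_for_wato(matches):
    """Split {name: shared cMs} into matches suited to probability
    predictions and matches best plotted as a visual aid only."""
    for_probabilities = {n: cm for n, cm in matches.items() if cm >= WATO_MIN_CM}
    plot_only = {n: cm for n, cm in matches.items() if cm < WATO_MIN_CM}
    return for_probabilities, plot_only


matches = {"Joanne": 227, "Jennifer": 54, "Peter": 21}
for_probabilities, plot_only = split_for_wato(matches)
print("Use for probabilities:", for_probabilities)
print("Plot only:", plot_only)
```

Peter falls below the threshold, so his match would be plotted on the chart and supported with triangulation or cluster evidence rather than a predicted probability.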




5. Ethnicity Reports

Ethnicity reports are great examples of supplementary evidence that can help validate or confirm your conclusions regarding relationships, particularly where unique ethnicities are involved.  The following image shows how Rob relates to his highest maternal and paternal match.  Clearly his Serbian heritage comes from his paternal side. 
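
The rule of thumb from the summary table (25% = likely grandparent, 12.5% = likely great grandparent) follows from the expected share halving with each generation. The Python sketch below is a rough illustration under that assumption; the function name and labels are hypothetical, and because inheritance is random and ethnicity estimates are imprecise, the result is indicative only.

```python
import math

GENERATION_LABELS = {
    1: "parent",
    2: "grandparent",
    3: "great grandparent",
    4: "2nd great grandparent",
}


def likely_generation(percentage):
    """Nearest number of generations back at which a single ancestor
    would be expected, on average, to contribute this percentage."""
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be greater than 0 and at most 100")
    # Expected share from one ancestor n generations back is 100 * 0.5 ** n.
    return round(math.log2(100 / percentage))


for pct in (25, 12.5):
    n = likely_generation(pct)
    print(f"{pct}% -> {GENERATION_LABELS.get(n, f'{n} generations back')}")
```

Smaller percentages map to more distant generations, where the estimate may also be a mix from both sides of the family, so they need to be used with caution.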




6.  Y-DNA and mtDNA Tests

For more distant ancestors on either patrilineal or matrilineal lines, Y-DNA testing and mtDNA testing are a great way to confirm hypotheses about these relationships.  The following source citations are examples from Wikitree:
  • Paternal relationship is confirmed through Y-chromosome DNA test results on Family Tree DNA. Tester #1, FTDNA kit # XXXXX, and his 3rd cousin, Tester #2, FTDNA kit # XXXXX, match at a Genetic Distance of 5 on 111 markers, thereby confirming their direct paternal lines back to their most-recent common ancestor who is John Coat, the 2nd great grandfather of both Tester #1 and Tester #2.
  • Maternal relationship is confirmed through mtDNA Full Sequence test results on Family Tree DNA between Tester #1, FTDNA kit # XXXXX, Tester #2, FTDNA kit # XXXXX, and Tester #3, FTDNA kit # XXXXX for King Richard III.  The full sequence match confirms their direct maternal lines back to their most-recent common ancestor who is Cecily Neville York, the mother of Tester #3 King Richard III, the 16th great grandmother of Tester #1 and the 18th great grandmother of Tester #2.

Further information and examples of my approaches to DNA confirmation and sourcing in my Family History Program (FTM) can be found in the resources section below.




More Resources:



Building Evidence Summary Table:






Veronica Williams
First published - 9th December 2023
Last updated - 30 January 2024