Tuesday, November 23, 2021

DNA Research Framework 4 - Use genetic research to prove and expand your pedigree

   There are five parts to the DNA Research Framework:

  1. Understand DNA Basics
  2. Know what you are working with
  3. Combine genetic and genealogical research
  4. Use genetic research to prove and expand your pedigree
  5. Continuous review

Within the framework we apply a DNA research methodology to ensure we systematically and methodically review our results, to improve our productivity and success rates.

The following blog posts provide more detailed information about the DNA Research framework, applying the DNA Research Methodology and building your DNA analysis skills:

This post contains reference material relevant to Module 4 
The ISOGG site also has a lot of useful material; refer to the ISOGG Beginners' guides to genetic genealogy.  You can find material relevant to earlier modules here:

Use genetic research to prove and expand your pedigree - Total cMs approach
This module focusses on using your genetic research to prove and expand your pedigree.  It considers what relationships can be 'confirmed by DNA' (including when there is no paper trail) and the Genealogical Proof Standard.  


Use genetic research to prove and expand your pedigree - Chromosome analysis
To confirm your pedigree beyond 3rd cousins or where there is no documented paper trail  (with the exception of parent/child and sibling relationships) you need to undertake detailed chromosome analysis, using segment data. We use triangulated segment data to provide evidence regarding our hypotheses for cousin matching to confirm our shared ancestors.  Examining other matches in these groups can often help expand your pedigree back further generations and potentially break down brick walls.  

When using your DNA analysis to confirm relationships, don't forget the requirements of the 'Genealogical Proof Standard (GPS)'.  There are five key elements to consider.

  • Reasonably exhaustive research
  • Complete and accurate source citations
  • Critical tests of relevant evidence through processes of analysis and correlation
  • Resolution of conflicting evidence
  • Soundly reasoned, coherently written conclusion

Most genealogical questions cannot meet the 'reasonably exhaustive research' test without the consideration of DNA evidence.  The GPS includes standards about each of the following aspects associated with DNA:

  • Planning DNA tests
  • Analysing DNA test results
  • Extent of DNA evidence
  • Sufficient verifiable data 
  • Integrating DNA and documentary evidence
  • Conclusions about genetic relationships
  • Respect for privacy rights

Remember, to confirm segments as belonging to a particular ancestor, you must ensure that three or more matches share an identical AND overlapping segment, i.e. a 'triangulated segment' that is 'identical by descent'.  There also needs to be sufficient genealogical distance between the matches in the group to demonstrate how the segment was passed down (i.e. not just close matches such as parent/child/siblings). 
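
As a rough illustration of the overlap requirement above, the following Python sketch finds the region common to a set of segments on one chromosome.  The `Segment` tuple, its field names and the sample positions are assumptions made up for this example; they are not the export format of any DNA site.  Overlap on its own is not triangulation: each match must also match the others across that region (something you check at the testing site), and the group still needs sufficient genealogical distance.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    match_name: str   # hypothetical field names, for illustration only
    chromosome: str
    start: int        # start position (base pairs)
    end: int          # end position (base pairs)


def common_overlap(segments: list[Segment]) -> Optional[tuple[int, int]]:
    """Return the (start, end) region shared by every segment, or None."""
    if len(segments) < 3:
        return None  # three or more matches are needed
    if len({s.chromosome for s in segments}) != 1:
        return None  # all segments must be on the same chromosome
    start = max(s.start for s in segments)
    end = min(s.end for s in segments)
    return (start, end) if start < end else None


# Made-up sample positions
candidates = [
    Segment("Match A", "7", 10_000_000, 42_000_000),
    Segment("Match B", "7", 18_000_000, 50_000_000),
    Segment("Match C", "7", 15_000_000, 39_000_000),
]
print(common_overlap(candidates))  # (18000000, 39000000)
```

A result of `None` means the segments have no region in common, so they cannot form a triangulated segment no matter how the matches relate to each other.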

When assessing likelihood of relationships it is important to:-
  • Assess the probable shared cMs for each individual match and their predicted relationship; 
  • Consider the relationship in the context of other matches who descend from the same common ancestors;
  • Use statistical analysis tools such as DNA Painter WATO (What Are The Odds) to assess probabilities for your hypotheses.  It should be noted that using total shared cMs alone will not be sufficient for distant matches as the WATO tool does not use matches below 40cMs in its underlying statistics;
  • Remember that WATO is a great visual tool to demonstrate relationships of matches beyond 3rd cousins, but it needs to be supported by additional evidence when matches are below 40cMs.  This should take the form of segment analysis including (ideally) multiple tests and triangulated groups, demonstrating that the chromosome segment has been 'walked back' through multiple generations;
  • When drawing conclusions, make sure you examine and review any conflicting evidence and document the reasons for your decision.
Mapping confirmed chromosome segment data can be useful in determining possible ancestral groups for new DNA matches and can demonstrate how confirmed segments have held up over multiple generations.

Chromosome analysis techniques can be taken further with more advanced applications, such as phasing, inferred mapping and DNA reconstruction.  These techniques should not be attempted unless you fully understand the underlying DNA theory covered in Modules 1 and 2.





In closing

I hope you have enjoyed our chromosome analysis journey applying the DNA research methodology.  There is much to learn.  It can be challenging but equally rewarding.  It's an iterative process, requiring continuous review.  Go back over what you have learnt since Module 1 and apply the methodology in practice; it will soon become second nature.

Some of our DNA mysteries may take decades to resolve, like my George Courtney - I've been at it so long I've dedicated a whole blog to him!  The key thing is to be systematic and methodical and eventually you should be able to solve your mystery.







REFERENCE MATERIAL

DNA Confirmations

Chromosome Mapping

You may also be interested in reading these blogposts relating to DNA research for my own family:

These and more can be found on 'The Genemonkey, my ancestors and DNA questions' blog @Wordpress.


Veronica Williams

Originally published: 23 November 2021
Last updated:  10 December 2023


Sunday, October 24, 2021

Tools: Clustering for Chromosome Analysis

Clustering tools are a great way to quickly identify groups of matches with a relationship to each other.  

As we know, shared matches are 'clues', whilst shared segments are 'evidence' of a shared ancestor.  All cluster tools identify matches that are likely to be clues of common relationships.   The best types of clusters, in my opinion, are shared segment clusters.  However, every cluster analysis you do is likely to give you new ideas about how your matches might relate to you.

This post is aimed at identifying where clustering can be undertaken and whether each tool provides shared match or shared segment clusters.  Click on the heading hyperlinks for more information, and please refer to the blog posts at the end of this page for more information about using each of the different tools.

  

DNAGedcom Client 

Using the cluster tool at DNAGedcom requires a subscription which can be taken out for a minimum of one month.  The DNAGedcom client provides clusters based on the Collins Leeds Method for AncestryDNA and FamilyTreeDNA.  

As AncestryDNA does not provide segment data, the clusters generated are 'shared match' clusters.  Generating clusters with matches >30cMs will generally give an indication of shared ancestry, however clusters generated with less than that amount may give false leads, with matches potentially sharing different, more distant ancestors.
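
To show why the cMs threshold matters for 'shared match' clusters, here is a minimal Python sketch that groups matches by following shared-match links.  It is a simplified connected-groups approach, not the Collins Leeds Method or the way DNAGedcom actually works; the function name, the input layout and the sample names and cMs values are all invented for illustration.

```python
from collections import defaultdict


def shared_match_clusters(total_cm, shared_with, min_cm=30.0):
    """Group matches that are linked through shared-match lists.

    total_cm:    {match name: total cMs shared with the tester}
    shared_with: {match name: set of names they share with the tester}
    min_cm:      matches below this are ignored (assumed default of 30)
    """
    keep = {name for name, cm in total_cm.items() if cm >= min_cm}
    links = defaultdict(set)
    for name in keep:
        for other in shared_with.get(name, set()):
            if other in keep and other != name:
                links[name].add(other)
                links[other].add(name)

    clusters, seen = [], set()
    for name in sorted(keep):
        if name in seen:
            continue
        group, queue = set(), [name]
        while queue:  # walk every link reachable from this match
            current = queue.pop()
            if current in group:
                continue
            group.add(current)
            queue.extend(links[current] - group)
        seen |= group
        clusters.append(sorted(group))
    return clusters


# Made-up sample data
total_cm = {"Ann": 120.0, "Bob": 85.0, "Cat": 45.0, "Dan": 33.0, "Eve": 22.0}
shared_with = {"Ann": {"Bob"}, "Bob": {"Ann", "Cat"}, "Dan": {"Eve"}}

print(shared_match_clusters(total_cm, shared_with))
# [['Ann', 'Bob', 'Cat'], ['Dan']]
print(shared_match_clusters(total_cm, shared_with, min_cm=20.0))
# [['Ann', 'Bob', 'Cat'], ['Dan', 'Eve']]
```

Lowering the threshold pulls Eve into Dan's cluster.  With real data, links added this way are exactly the ones that may turn out to be false leads.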


DNAGedcom has another advantage when working with AncestryDNA data.  It allows you to identify other shared matches that are below the 20cMs threshold, provided they share >20cMs with the match you have in common.  This can be confusing; hopefully this cheat sheet helps.

The FamilyTreeDNA clusters generated by DNAGedcom are based on shared segments and most matches in the cluster are usually triangulated groups.  Due to this, smaller cMs defaults can be used and each cluster will provide clues of a shared common ancestor.  Additionally, shared matches are also shown in grey.


  

GEDmatch 

There are now 2 cluster tools at GEDmatch.  They require a Tier 1 subscription, which can be taken out for a minimum of one month.

The first tool is the AutoTree Clustering tool which is based on shared segments and each cluster of the same colour is usually a triangulated group.  Similar to the DNAGedcom tool for FTDNA, smaller cMs defaults can be used and each cluster will provide matches of interest sharing common ancestors.



In Oct 2021 a second tool was released called AutoSegment.  It provides triangulated data.  Refer to the Roberta Estes blogpost for more information on this great new tool.





My Heritage

The free tool at My Heritage is an example of a 'shared match' cluster, so whilst the matches in each cluster have something in common, they may not all descend from the same common ancestor.  The advantage of this cluster tool is that it also provides segment detail.  In the example below interrogation of Cluster 3 (yellow) revealed four distinct triangulated groups, each providing separate clues regarding the identity of the common ancestor.  The main limitation with this tool is that you are unable to change or refine the search parameters.

Genetic Affairs

Genetic Affairs is a third-party tool created by Evert-Jan Blom that provides great tools for use with 23andMe, FTDNA, GEDmatch and My Heritage DNA data.  Evert also developed the free My Heritage cluster tool and the new GEDmatch AutoSegment tool.  

The Genetic Affairs site is another subscription site that provides a range of tools to assist with your analysis.  Running the tools at Genetic Affairs provides more flexibility and in particular for the My Heritage cluster also shows interrelationships, something not provided by the free tool.

Genetic Affairs provides many other useful tools on its site, including AutoKinship, AutoTree and AutoPedigree.


Shelley Crawford Clusters 

Connected DNA used to provide fabulous network maps of shared match clusters for AncestryDNA, FamilyTreeDNA, and 23andMe.  However, as at October 2021 Shelley is taking a break and no orders are being taken at the moment.  We hope she will be back soon.


DNA Painter Cluster Auto Painter 

You can upload clusters created from DNAGedcom (FTDNA only), My Heritage, GEDmatch and Genetic Affairs to the DNA Painter Cluster Auto Painter to visualise the segments and make notes about the results of your analysis.  This allows for a more visual approach to analysing your segment data.  The image below has been generated from the output of the AutoSegment at GEDmatch.



Shared Clustering Tool

The Shared Clustering tool by Jonathan Brecher works from AncestryDNA shared match lists. Rather than providing a single list of matches ordered only by the strength of the match, Shared Clustering divides that list into smaller clusters of matches that are likely related to each other.  Downloading directly from AncestryDNA has been disabled, but files extracted using DNAGedcom can be uploaded.  It only works using Windows.



RootsFinder 

RootsFinder is a family tree building and DNA analysis website. The premium level has DNA features for a subscription fee.  The triangulation (cluster) view allows you to view your matches in clusters – otherwise known as a network graph.



Ideas for exploring your cluster

* Explore the matches in the cluster and check if the shared matches are also sharing the same segments;

* Are there any triangulated groups within the cluster?  Explore these matches first.  Remember you can expand each triangulated group by checking your segment data for others not appearing in the cluster report, who may not have met the cluster criteria.

* Are there any 'bridge' matches in the cluster?  Use these to help you to find others who match in the same segment area at other sites.

* If the cluster includes matches not triangulating with the core group, explore those segment areas as these may provide additional clues to the possible relationship of those in the cluster. 

* Do the genealogy!  Build research trees for your matches and revisit your own pedigree to search for what is in common between the cluster group - surnames, locations, ethnicities?

* Once you have identified a 'side' or MRCA make notes on your master list and at the DNA site for all the matches in the triangulated group. 

* You may also wish to allocate a reference number to your Cluster for future reference.  Remember all the tools have their own numbering systems which constantly change with each report.

* Make sure you are systematic with your notes so that next time you generate a cluster report you can easily see matches that have been worked on before.

* Don't forget to consider the other side of the chromosome - use what you now know to mark the segments on the opposing side.   Explore the other side of the 'specific segment area' for more triangulated groups.  By looking at both sides concurrently you can then mark others not triangulating as potentially false positive matches, saving analysis time in the future.

My group numbering system has changed many times over the years and has been adapted from the Jim Bartlett method.  Yours doesn't need to be as complex as this, but it needs to be meaningful for you.



More information:

Check out these links for useful blogs that may help you interpret how the different cluster sites work:

* The Leeds Method, Dana Leeds, 2018

* What to do after Clustering (YouTube), Dana Leeds, 2024

* DNA Sleuth, 2019

* Walking the Clusters Back, Jim Bartlett 2019

* Cluster Auto Painter, Jonny Perl 2019

* Shared Clustering - A great tool! Jim Bartlett, 2019

* Connecting the Dots, My Heritage 2020

* Fast Ways to Cluster your DNA Matches, Family Locket 2020

* Genetic Affairs Tools and how to use them, Roberta Estes 2020

* Walking Back the Clusters, Veronica Williams 2020.  Demonstration of applying Jim's method.

* Using DNA tools to solve a family mystery, Vicki Hails 2020

* Annotating a Cluster Auto Painter Map, Jonny Perl 2020

* Auto Segment Triangulation Tool at GEDmatch, Roberta Estes 18 Oct 2021

* RootsFinder Network Graphs, Family Locket Oct 2021

* RootsFinder - Making Triangulated Network Graphs, Family Locket Dec 2021



AncestryDNA

Whilst AncestryDNA does not provide chromosome information, it is always best to start your analysis there, due to the large numbers in the database and its many pedigrees.  AncestryDNA clustering can be done using the DNAGedcom CLM tool, the Shared Clustering Tool and the DNA2Tree app.  You will often find Ancestry testers at other sites, particularly GEDmatch and My Heritage (a growing database with lots of family trees).  Always look for 'bridge matches' between the sites as they can help to expand your research pool and help you tie your triangulated groups together with broader shared match clusters.


Veronica Williams

First published: 24 Oct 2021

Last updated: September 2024


2024 NOTE:  Since September 2023 many downloads have been unavailable due to restrictions at the DNA companies, this includes the very handy DNA2Tree Tool.  GEDmatch has remained available throughout this time and finally FTDNA returned their reports in late August 2024.  We hope things will change at My Heritage and 23andMe soon.  

Tuesday, October 19, 2021

DNA Research Framework 3 - Combining genetic and genealogical research

  There are five parts to the DNA Research Framework:

  1. Understand DNA Basics
  2. Know what you are working with
  3. Combine genetic and genealogical research
  4. Use genetic research to prove and expand your pedigree
  5. Continuous review

Within the framework we apply a DNA research methodology to ensure we systematically and methodically review our results, to improve our productivity and success rates.

The following blog posts provide more detailed information about the DNA Research framework, applying the DNA Research Methodology and building your DNA analysis skills:

This post contains reference material relevant to Module 3 
The ISOGG site also has a lot of useful material; refer to the ISOGG Beginners' guides to genetic genealogy.  You can find material relevant to earlier modules here:

Combining genetic and genealogical research - Total cMs, the broad approach
This module covers techniques and tools to help you combine your genetic and traditional genealogy research, using your genetic data as the starting point for more targeted research.

Make sure you have maximised your findings from your AncestryDNA results and have attempted to identify all your closest matches up to 3rd cousins at all DNA sites.  You should also have applied the AncestryDNA grouping process or Leeds method to identify likely match groups out to your 16 x 2nd great grandparents (where possible).

Ideally, you want to be fairly certain that your closest matches support your pedigree out to your 2GGPs.  When working with your DNA matches you need to ensure you are working from a solid base and that the pedigree you have researched is in fact your true genetic ancestral line too.  If you have no matches on some of these lines and/or a group of unknown matches sharing large amounts of DNA with you, you may need to question whether your documented pedigree is accurate.

The concepts associated with developing research trees from this information have been extensively covered in the AncestryDNA course and will not be covered in depth in this module.  Refer to Mossie's Musings for more information.

Recap on the key concepts from Modules 1 and 2:


Combining genetic and genealogical research - Chromosome analysis
To confirm your pedigree beyond 3rd cousins or where there is no documented paper trail  (with the exception of parent/child and sibling relationships) you need to undertake chromosome analysis, using segment data.

As a matter of course, it is recommended that any close matches at AncestryDNA who are not on any chromosome site should be approached and encouraged to upload their results to GEDmatch, My Heritage, FamilyTreeDNA or LivingDNA (preferably all); all are free.  The following blog post can be a good one to send to your matches. I usually offer to help them interpret the results once they upload, but this depends on how critical the match is for my analysis.

In chromosome analysis our guiding principles are to:
  • Start with genetic evidence (such as known ancestral or triangulated segments and groups);
  • Limit our research to matches in that specific group for each segment area to identify the shared ancestor (increase productivity, don't go down rabbit holes based on common names etc with matches that don't triangulate);
  • Inform our research by 'clues' (such as shared matches, cross platform comparisons, cluster tools);
  • Understand patterns emerging from each triangulated group on specific ancestral lines;
  • Use each specific segment group to inform our knowledge of our genome and pedigree, pushing back segments to more distant ancestors by 'walking back the chromosome' (Module 4).

The general chromosome analysis process is:
  • Firstly, identify 'triangulated segments' on a particular chromosome of interest (between 3 matches) using 'segment triangulation' techniques (Modules 1 and 2).  Remember, by ensuring the segment triangulates with others it is validating the segment, making sure the match is 'real';
  • Next, seek to identify more people who triangulate on the same chromosome in the same segment area (between 4 or more matches), forming 'triangulated groups' (Module 2);
  • The matches in triangulated groups have all inherited a common segment of DNA, so it is 'evidence' that they all share a 'common ancestor';
  • Look for the pedigrees of your matches and try to identify what's common between those in the triangulated group, it could be names, locations, ethnicity etc.  Use that knowledge to research and identify possible common ancestors between you and your DNA match using 'tree triangulation';
  • If only limited information is provided by the match, try searching for their ancestors in other public trees. 'One World Trees' such as Wikitree, Family Search and Geni allow you to easily search for connections if your tree is already published there or search on Ancestry or My Heritage as they have the largest repositories of family trees;
  • If needed, 'develop research trees' to expand the genealogy of your matches to find the common ancestor (AncestryDNA course);
  • Once you have identified and confirmed a common ancestor for at least 3 descendants (not closely related) in the triangulated group, you can mark the segments for the others in the group to a particular ancestral line or ancestral couple.  NOTE: Other members of the group may connect via different ancestors in the line of descent, but will be connected either via a descendant or ancestor of the shared ancestors you identified for the initial 3 matches;
  • Whilst strict triangulation can only be achieved by looking at matches on the same DNA site, make sure you compare matches in the 'identical segment locations' on all DNA sites for 'bridge matches' (matches at multiple sites) as these can be additional 'clues';
  • Create a chromosome map of confirmed segments for future analysis - DNA Painter is ideal (refer to Activity 3 exercise below).  This is covered in more detail in Module 4 .  
  • Remember, to confirm the segment as belonging to a particular ancestor, you must ensure that all matches share an identical AND overlapping segment, i.e. a 'triangulated segment' that is 'identical by descent'.  There also needs to be sufficient genealogical distance between the matches in the group to demonstrate how the segment was passed down (i.e. not just close matches such as parent/child/siblings).  This is covered in more detail in Module 4.
NOTE:  Particular care needs to be taken when dealing with endogamous communities and where there is pedigree collapse as there may be more than one shared common ancestor.  

These posts from Module 2 may be good to review:


How to decide where to start

Where you start will depend on your genealogical goals.  In my own genealogy, I am interested in confirming my pedigree back to my 5th great grandparents (6th cousin level, 254 ancestors).  I want to ensure I have the right ancestors in my tree and autosomal DNA is generally considered able to identify segments of DNA back to 5th cousins.  I find the predictions are often one generation out, hence working to 6th cousins!  I also have a couple of problematic 2nd great grandfathers.  Different strategies are needed for both.
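
The figure of 254 ancestors can be checked with a few lines of Python.  This sketch assumes a complete pedigree with no pedigree collapse, so the number of ancestors doubles at each generation; the function name is invented for the example.

```python
def ancestors_up_to(nth_great_grandparents: int) -> int:
    """Total ancestors from parents back to the nth great grandparents."""
    # Parents are 1 generation back, grandparents 2, great grandparents 3,
    # so nth great grandparents are n + 2 generations back.
    generations = nth_great_grandparents + 2
    return sum(2 ** g for g in range(1, generations + 1))


print(ancestors_up_to(5))  # 254
```

Of those 254, the final generation alone accounts for 128 (the 5th great grandparents).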

The following suggestions are made to give you ideas about how or where to begin.  It doesn't really matter where you start, every analysis will help to identify the origins of your DNA segments.  As long as you record your findings for each segment, the historical information will help you expand your research and pedigree over time.  Chromosome Analysis is a long term endeavour - you may not find anything today but each match in the future will add more clues relating to that segment and help identify your shared ancestors.  Every small segment you identify, can lead you to more discoveries and it can be a snowball effect.

  • Do you already have 3rd cousin or closer matches where you have identified your MRCA?  Do 'one on ones' to find the segments they match on and mark those segments to the appropriate side, maternal, paternal or both (close relations, children, siblings), noting the shared ancestor or ancestral couple.
  • Matches of interest?  Do 'one on ones' to find the segments they match on, then interrogate those segment areas for likely triangulated groups who will share a 'common ancestor'.  This is the best place to start for those large unknown matches.
  • Interested in an ancestor of specific ethnic origins?  Identify the ethnic origins of your ancestors using chromosome maps available at FTDNA or 23andMe and Admixture reports at GEDmatch.  Whilst not very useful for those of us with predominantly the same ethnicity, people with a specific ancestor of a very different ethnicity such as Asian, African etc often find these maps show the specific segments they inherited from that ancestor.  AncestryDNA has recently introduced its Sideview™ technology, however care should be taken when using the chromosome painter ethnicity data.  It is still in BETA and from my observations the ethnicity predictions are not reliable. However, Sideview™ technology for allocating parental sides is proving to be remarkably accurate.


Use match information to expand your research

Don't be satisfied by just identifying the shared ancestor between you and your matches, use the information to identify more matches, discard false positive matches and expand your pedigree.
  • Known 3rd cousins or closer?  Interrogate the segment areas for these matches - everyone who also matches you and this match on the same segment will form a triangulated group and you already know your closest shared 'common ancestor'!   Do more research, aim to push 'EACH triangulated segment' back a further generation by continually reviewing the group to find more common ancestors.
  • Known maternal/paternal segments?  Remember there are two sides to every chromosome.  Once you have identified any match on a specific side, provided that 'segment area' is triangulated you can use that information to determine the other side of the chromosome.
  • Know one parent but not the other? Don't despair if you are looking for your father but can only identify matches on your mother's side.  Use known maternal matches to identify maternal segment areas, then look at all your matches on the reverse side of the chromosome, who don't match your maternal cousin.  These are most likely paternal matches, or false positives.
  • Have three siblings tested?  Visual Phasing is a process that can be applied to the results of 3 siblings to determine what segments each inherited from their 4 grandparents.  This is an advanced technique but can be very useful for researching more distant ancestors (we will touch on this technique in Module 4).
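
The 'know one parent but not the other' idea above can be sketched in Python as follows.  The data layout, names and positions are assumptions for illustration only: `shares_with_cousin` stands for whatever you have recorded from the shared match or triangulation tools at the testing site.  Names placed in the second list are only candidates for the other side; as noted above, some will be false positives.

```python
def split_by_known_cousin(region, matches, shares_with_cousin):
    """Split matches overlapping a known maternal region into two lists.

    region:             (chromosome, start, end) confirmed via a maternal cousin
    matches:            list of (name, chromosome, start, end) tuples
    shares_with_cousin: names already confirmed as also matching that cousin
                        on this segment
    """
    chrom, r_start, r_end = region
    same_side, other_side_or_false = [], []
    for name, m_chrom, m_start, m_end in matches:
        overlaps = m_chrom == chrom and m_start < r_end and m_end > r_start
        if not overlaps:
            continue  # outside the segment area, so not relevant here
        if name in shares_with_cousin:
            same_side.append(name)
        else:
            other_side_or_false.append(name)
    return same_side, other_side_or_false


# Made-up sample data
maternal_region = ("7", 18_000_000, 39_000_000)
my_matches = [
    ("Match B", "7", 17_000_000, 30_000_000),
    ("Match C", "7", 20_000_000, 45_000_000),
    ("Match D", "7", 50_000_000, 60_000_000),
    ("Match E", "9", 18_000_000, 39_000_000),
]
maternal, paternal_or_false = split_by_known_cousin(
    maternal_region, my_matches, shares_with_cousin={"Match B"}
)
print(maternal)            # ['Match B']
print(paternal_or_false)   # ['Match C']
```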


Doing the genealogy....

No matter which way you approach DNA research, whether you stick with the broad approach or get into chromosome analysis techniques you can't escape having to do lots of traditional genealogy to bring paper trails and genetic evidence into alignment.  Both your tree and the trees of your DNA matches probably need work and don't go back far enough to find the common ancestor.  Follow these key steps:
  • Build your pedigree out as much as possible for all known lines, ideally at least to 3rd great grandparent level, preferably more, i.e. 5th great grandparents (Module 1);
  • Ensure tree completeness, working from the base, aim to achieve as close to 100% at each level with 'paper' evidence for your pedigree (Module 1);
  • Work with your closest DNA matches first to ensure that they are supporting your known pedigree out to 2nd Great Grandparents, if possible (Module 2);
  • If you have no matches on some of these lines, and/or a group of unknown matches sharing large amounts of DNA with you, you may need to question whether your documented pedigree is accurate and review your assumptions (Module 2);
  • Continually review and expand your known pedigree and tree completeness based on information your DNA analysis is telling you (Module 3);
  • Review pedigree information provided by the match or found by searching one of their ancestors on other pedigree sites and use it to try and identify your MRCA, particularly for all 3rd cousin and closer matches.  This is essential if you wish to be successful in chromosome analysis where we are often dealing with much smaller segment matches and more distant common ancestors (Module 3);
  • Develop research trees for your matches to help connect your two lines, common names, locations etc (AncestryDNA course);
  • Working from the base, aim to achieve DNA confirmations as close to 100% at each generation level so you know you are continuing to work from a solid base (Module 1).   This will help you 'walk back the chromosome' and build reliable chromosome maps for your autosomal DNA (Module 4);
  • Don't forget that for more distant ancestors, Y-DNA or mtDNA analysis may be a more effective way to confirm ancestry along your patrilineal and matrilineal lines.  Given the additional costs for these tests, they may be better utilised to confirm a hypothesis rather than used as a fishing expedition, particularly if you are recruiting others to test for you.





Suggested resources:


Word of caution

* Remember segment matches of 20-30cMs and below can end up being many, many generations back.  You may not always find your common ancestor even when researching back multiple generations.

* Amassing clues from multiple triangulated groups may help you solve the puzzle over time.  For most people this is a long term research exercise - and not for the fainthearted!


Suggested activities

The following two activities are aimed to help you apply 'Module 3' in practice:
Cross Platform Exercise: and 



Veronica Williams
Originally posted: 19 October 2021
Last updated: 6 September 2024











All DNA Sites - Working from chromosome data to find your MRCA

In Module 3 of 'Combining genetic and genealogical research' we discussed ways to work from your chromosome data to maximise your chance of finding your MRCA.  The purpose of undertaking this exercise is to consolidate your understanding of the principles and processes we discussed.  By examining results at all DNA sites and working between them, you will increase your knowledge of specific segment areas which may lead to more successful outcomes.

In Module 2 we discussed all the DNA companies that provide chromosome data analysis tools including GEDmatch, FTDNA, MyHeritage and 23andMe, including how to download match and segment data and triangulation techniques.  As the suggested activity for Module 2 was focussed on the MyHeritage site, make sure you have also practised applying the same techniques at the other chromosome data sites where you have either tested or uploaded your DNA data.

Use your identified triangulated groups across all sites to solve the puzzle of how matches might fit into your pedigree. I recommend analysing everything you know about a match and trying to identify the likely ancestral line/group they belong to, before making contact.

Before moving to Module 4, make sure you understand the theory behind two sides to every chromosome, plus how to identify triangulated segments and groups. You should also have decided upon your method for retaining details of the DNA matches you have researched, the segments they match on and the results of your analysis, to avoid rework and improve your productivity.


Using triangulated groups to help solve the puzzle


Which match to work on for this exercise?

Perhaps there is a perplexing match on a DNA site that provides chromosome data and you don't know how they fit, or choose a known relative.  We will call this match - Match A.

  • It's best to choose a large match, ideally one who matches on a number of segments;  
  • Apply what you have already learnt in Module 2;
  • Examine their match at the testing site you found them, try to identify a side, check if they are in any broader triangulated groups with others on EACH of the different chromosomes/segments;
  • Run any cluster reports you can depending upon where you found the match - remember to check if the cluster report is showing 'shared matches' or 'shared segments';
  • Find any pedigree information (provided by the match or found by searching one of their ancestors) and use it to try and identify your MRCA;
  • Build research trees to expand the known genealogy of both your match and your own tree;
  • Update your preferred recording method with the results of your analysis;
  • It doesn't matter if you find the MRCA or not - it is always useful to go to the next step by looking at the implications of this result on all the other sites. 

NOTE:  Choosing a match you already know can help you to focus on names and locations that you are familiar with, and you may find it easier to identify how others in each of the TGs might connect.  If working with a known match, particularly if they are a close relative, remember others in the TG may be connecting further back on only ONE of your shared ancestors' ancestral lines.


Combining your match data

* Download the matches and segment data for your kit from each of the chromosome data sites where you have your DNA uploaded - GEDmatch, My Heritage, FamilyTreeDNA and 23andMe (Module 2).  NOTE: As at August 2024 this is limited to GEDmatch and FTDNA due to current data download issues.  However, if you have any old reports from the other companies you can incorporate them. 

* Combine all reports into one new 'master' list.  How you do it depends on your chosen method of storing and maintaining your data (Module 2).

* If you are using the spreadsheet method, you might like to first add a column to the start of each company's report and insert the testing company name for every match in that report, before combining them into one list.

* Alternatively, you could make each company's report a separate colour, so when you combine them you can easily recognise the different sites.

* If preferred, it is possible to work with the separate spreadsheets, but it may become more time consuming to have to go back and forth between each one in the analysis process.

*  Sort the report by chromosome number, then segment start position (smallest to largest). It might look something like this:
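For those comfortable with a little scripting, the combine-and-sort steps above can be sketched in Python. This is only an illustration under stated assumptions: the two small reports and their column names (name, chromosome, start, end, cm) are made up, and real downloads from each company use their own headings that you would need to map first.

```python
import csv
import io

# Hypothetical, simplified exports. Real company downloads use different column names.
GEDMATCH_CSV = """name,chromosome,start,end,cm
J Burke,7,5000000,21000000,18.2
A Smith,2,100000,9000000,11.5
"""
FTDNA_CSV = """name,chromosome,start,end,cm
J Burke,7,5200000,20500000,17.6
M Jones,X,3000000,15000000,12.0
"""

def load_report(text, company):
    """Read one company's report and tag every row with the company name."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "company": company,
            "name": row["name"],
            "chromosome": row["chromosome"],
            "start": int(row["start"]),
            "end": int(row["end"]),
            "cm": float(row["cm"]),
        })
    return rows

def chromosome_key(chrom):
    """Sort chromosomes 1-22 numerically, with X placed last as 23."""
    return 23 if chrom.upper() == "X" else int(chrom)

def build_master_list(reports):
    """Combine all reports, then sort by chromosome and segment start position."""
    master = []
    for company, text in reports.items():
        master.extend(load_report(text, company))
    master.sort(key=lambda r: (chromosome_key(r["chromosome"]), r["start"]))
    return master

master = build_master_list({"GEDmatch": GEDMATCH_CSV, "FTDNA": FTDNA_CSV})
for r in master:
    print(r["company"], r["name"], r["chromosome"], r["start"], r["end"], r["cm"])
```

The 'company' column plays the same role as the extra spreadsheet column described above, so each row can still be traced back to its testing site after the reports are combined.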


You can also limit the size of your spreadsheet by setting a default cut-off, e.g. 15cMs, but be mindful that some lower matches may have trees that could provide the answer you are looking for!

Sorting by name will help you see if people are duplicated across the sites.
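As a rough sketch of these two ideas, the Python below applies a cut-off and lists names that appear at more than one site. The rows are invented for illustration, the cut-off is assumed here to apply to each segment row, and matching on a tidied-up name is only a hint: the same person may use different names on different sites, and different people can share a name.

```python
from collections import defaultdict

# Hypothetical rows from a combined master list.
master = [
    {"company": "FTDNA", "name": "J Burke", "cm": 17.6},
    {"company": "My Heritage", "name": "j burke ", "cm": 18.0},
    {"company": "GEDmatch", "name": "A Smith", "cm": 11.5},
    {"company": "GEDmatch", "name": "M Jones", "cm": 32.4},
]

def apply_cutoff(rows, min_cm=15.0):
    """Keep only rows at or above the chosen cut-off (15cMs by default)."""
    return [r for r in rows if r["cm"] >= min_cm]

def names_on_multiple_sites(rows):
    """Group rows by a normalised name and report names seen at more than one site."""
    sites = defaultdict(set)
    for r in rows:
        key = " ".join(r["name"].lower().split())  # ignore case and stray spaces
        sites[key].add(r["company"])
    return {name: sorted(found) for name, found in sites.items() if len(found) > 1}

kept = apply_cutoff(master)
duplicates = names_on_multiple_sites(master)
print(len(kept), "rows kept after cut-off")
print(duplicates)
```

Note that the duplicate check is run on the full list rather than the cut-down one, so a person who falls below the cut-off at one site is still picked up.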


NOTES:  

* If you have any AncestryDNA match lists, it can be useful to add these into your master list, to help identify common names, across platforms.

* If you are finding manipulating spreadsheets troublesome, uploading your matches into DNA Painter using their Bulk Import Tool may be a good alternative (under the settings icon in the chromosome map).  Whilst the matches can't be sorted into an alphabetical name list, the visual presentation suits many researchers and you can just work on one chromosome at a time.

* Once you have come to grips with the underlying theory of chromosome analysis I would recommend using the GDAT database (Genealogical DNA Analysis Tool).


Going to the next level with chromosome data

You will have already applied the Module 2 research techniques at the testing site where you found 'Match A'.

* Identify each of the segments you share with Match A - it should look something like this:


* Then for each chromosome go to your 'new master list' and find other matches who overlap in the same segment area.  NOTE:  It won't be exact, but find the overlapping segment area that includes the segment area where you match 'Match A', even if the area does not include the entire segment - just make sure the 'overlapping' segment area is at least 7cMs.

* Examine all the names and segment matches to see if there is any duplication in matches across sites - these matches are key to working across platforms where you cannot directly compare matches.  NOTE: I call these 'Bridge matches' as they are matches that will allow you to connect your groups across sites.  J Burke in the list below appears to be a 'Bridge match', appearing to be on FTDNA and My Heritage.


* Apply the same process we applied in Module 2 for each of the testing sites, identifying potential triangulated groups AT EACH testing site for the identical segment area.

* Can you determine the 'side' or MRCA for any matches that are on multiple sites?  NOTE: Whilst matches on different sites cannot be considered 'triangulated' DNA segments in the strict sense of the word because they can't be compared to each other, they can be used as 'clues' to help you work with the specific segment area.  This applies only if they are actually also 'triangulating' with other matches on the different DNA sites.

* If you can identify sides at one site and you can find a 'bridge match' you can use that match to start to hypothesise about the TG segment area.  NOTE:  Over time, as you work through your match list you should start to see patterns forming for start and end locations along each chromosome.

* Develop hypotheses for sides, or ancestral groups based on the combined knowledge you have gained from examining the segment area at all sites. 
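The search for overlapping matches described in the steps above can be sketched in Python as follows. Everything here is hypothetical: the Match A segment, the master list rows and the function names are made up for illustration. The size of the overlap in cMs is only estimated by scaling each match's cMs by the fraction of its segment that falls inside the Match A area, which is a crude assumption because cMs are not spread evenly along a chromosome. The result is a list of candidates to check for triangulation at each site, not evidence in itself.

```python
from collections import defaultdict

MIN_OVERLAP_CM = 7.0  # minimum overlapping segment area, as in the steps above

# Hypothetical segment you share with Match A.
match_a = {"chromosome": "7", "start": 5_000_000, "end": 21_000_000, "cm": 18.2}

# Hypothetical rows from the combined master list.
master = [
    {"company": "FTDNA", "name": "J Burke", "chromosome": "7",
     "start": 5_200_000, "end": 20_500_000, "cm": 17.6},
    {"company": "My Heritage", "name": "K Lee", "chromosome": "7",
     "start": 18_000_000, "end": 30_000_000, "cm": 14.0},
    {"company": "GEDmatch", "name": "P Ryan", "chromosome": "7",
     "start": 10_000_000, "end": 26_000_000, "cm": 16.0},
    {"company": "GEDmatch", "name": "A Smith", "chromosome": "2",
     "start": 100_000, "end": 9_000_000, "cm": 11.5},
]

def estimated_overlap_cm(row, target):
    """Estimate the overlap in cMs by linear scaling (an assumption, not a true map)."""
    if row["chromosome"] != target["chromosome"]:
        return 0.0
    length = row["end"] - row["start"]
    shared = min(row["end"], target["end"]) - max(row["start"], target["start"])
    if length <= 0 or shared <= 0:
        return 0.0
    return row["cm"] * shared / length

def candidates_by_site(rows, target, min_cm=MIN_OVERLAP_CM):
    """List, for each testing site, the matches overlapping the target segment area."""
    groups = defaultdict(list)
    for row in rows:
        if estimated_overlap_cm(row, target) >= min_cm:
            groups[row["company"]].append(row["name"])
    return dict(groups)

candidates = candidates_by_site(master, match_a)
print(candidates)
```

Grouping the candidates by company reflects the point that triangulation has to be checked AT EACH testing site, with any 'Bridge matches' used only as clues to connect the groups.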


Can AncestryDNA help?

* Whilst AncestryDNA doesn't provide segment data, it has the best collection of pedigree information available.  It is often disappointing that so many of their matches don't upload to other DNA sites where we can access the chromosome data.

* You should have already applied the broad approach to classifying your matches at AncestryDNA using the Leeds or Groupings processes (refer - AncestryDNA course).

* The new Pro Tools feature, Enhanced Shared Matching, can be useful in identifying links between matches and identifying family groups.  Make sure you review these as part of this exercise.  You might be surprised by finding relevant pedigrees with lower cMs matches you may not have examined in the past.

* You should also have already approached your largest matches and any matches of specific interest to encourage them to upload to a chromosome site.  Not only will the knowledge of the specific chromosomes help your research, it provides additional information to potentially connect your Ancestry groups with your triangulated segment groups.

* When analysing your chromosome data be on the lookout for common names that might suggest your AncestryDNA match might also be on a chromosome site.

* At GEDmatch, look for kits starting with 'A'.  These are old AncestryDNA kits, so the testers are likely to also be on your match list at AncestryDNA.  Sometimes you won't be able to find them there; this could be due to the Timber algorithm at AncestryDNA eliminating some segments so they don't appear as a valid match.  If you match only on the X or 23rd chromosome, they will not be reported as matches at AncestryDNA, but they will show up as matches at GEDmatch and 23andMe.  NOTE:  FTDNA does also provide X data, but will only report it as a match if you also have a match on chromosomes 1-22.

* If you have found a match at a chromosome site who is also in your AncestryDNA match list, go back and try to find them at AncestryDNA.  What 'group' have they been allocated to?  Consider how this might apply to your Triangulated Group if you have identified one.  NOTE:  Remember that the grouping technique at AncestryDNA is working on 'shared matches' - they are only 'clues' not 'evidence' and may lead you down the wrong path, particularly for lower cMs matches. Be careful about what conclusions you draw for matches under 30cMs.


Run clusters to identify groups

* In Module 2 we discussed a number of sites that provide cluster information which can help to identify groups of matches who are likely to be related to each other.  Run as many different types of clusters as you can to help identify matches you may have missed.  Make sure you distinguish between 'shared match' and 'shared segment' clusters when drawing conclusions.

* Are there more people in the 'shared segment' clusters than you have identified so far in your Triangulated Groups?  Examine them and see if they can provide more information.

* Sometimes clusters based on 'shared matches' may reveal additional triangulated groups.  Explore everyone in the cluster using the process we discussed in Module 2.

Check out my 2021 blogpost Clustering for Chromosome Analysis for more tips and information.


Do the genealogy....

* Examine all available pedigree information for your shared matches in each triangulated group at all the sites, looking for common names and locations;

*  Explore pedigree tools to help you automate the searching process within pedigrees;

*  Develop research trees and expand your matches' trees to find the common ancestor;

* Whilst we started looking for the relationship between you and Match A, if none can be identified, find other relationships between matches in each of the triangulated and cluster groups.  If you can identify pairs within triangulated and cluster groups, it is likely your shared ancestor is somewhere back from the ancestral couple those matches share.


Rinse and Repeat....

* Hopefully by now you will have explored multiple TGs in the segment areas on chromosomes where you matched 'Match A'.

* If 'Match A' was a known relative, let's say on the 'maternal' side, you now know that, because chromosomes have two sides, any matches in the same segment area who don't also match 'Match A' must be on the opposing side, so they are 'paternal' matches (or false positives).

* Go back and repeat the steps to investigate the matches and TGs on the opposing side that you have not yet explored, once again comparing matches in the same segment area at all the sites.

* Don't forget to record your findings in a way that will be preserved.  This can help speed up the analysis process when you get additional matches in these segment areas down the track.

* Continue to look at both sides of the chromosome concurrently, this helps you to be more productive in working through your genome and will save time in future analyses.
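The 'two sides' reasoning in the steps above can be written down as a small Python sketch. The names and the function are hypothetical, and the set of people who triangulate with Match A is assumed to have come from the comparison tools at the testing site; the code only records the hypothesis, it does not test it.

```python
def label_sides(overlapping_names, triangulates_with_a, known_side="maternal"):
    """Label each match in the segment area relative to the known side of Match A."""
    opposite = "paternal" if known_side == "maternal" else "maternal"
    labels = {}
    for name in overlapping_names:
        if name in triangulates_with_a:
            labels[name] = known_side
        else:
            # Same segment area but no match to Match A: opposing side or a false positive.
            labels[name] = f"{opposite} (or false positive)"
    return labels

# Hypothetical matches overlapping the Match A segment area at one site.
overlapping = ["J Burke", "P Ryan", "K Lee"]
# Hypothetical result of checking who also matches Match A on this segment.
with_match_a = {"J Burke"}

labels = label_sides(overlapping, with_match_a)
for name, side in labels.items():
    print(name, "-", side)
```

Saving these labels alongside the segment details in your preferred recording method makes it quicker to place new matches that turn up in the same segment area later.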


Remember:

* Every unknown match is an opportunity to find out more about your family;

* Don't ignore known matches - every known match is an opportunity to expand your pedigree; and

* Continually look for the patterns that help put the puzzle pieces together!


Refer to the Blogpost for Module 3 for links to relevant material associated with this exercise.


Veronica Williams

Originally published: 19 October 2021

Last updated:  30 Aug 2024


2024 NOTE:  Chromosome data downloads are currently restricted at My Heritage and 23andMe.  However, GEDmatch and FTDNA are available for use with this exercise.