Genemonkey explains....: September 2021

Wednesday, September 22, 2021

DNA Research Framework 2 - Organising your DNA data and determining match groups

There are five parts to the DNA Research Framework:

Understand DNA Basics
Know what you are working with
Combine genetic and genealogical research
Use genetic research to prove and expand your pedigree
Continuous review

Within the framework we apply a DNA research methodology to ensure we systematically and methodically review our results, to improve our productivity and success rates.

The following blog posts provide more detailed information about the DNA Research framework, applying the DNA Research Methodology and building your DNA analysis skills:

This post contains reference material relevant to Module 2

The ISOGG site also has a lot of useful material; refer to the ISOGG Beginners' guides to genetic genealogy. You can find earlier material relevant to Module 1 here.

Know what you are working with - Total cMs, the broad approach

This module is designed to help you understand how to identify/manage/organise your DNA data so you know what you are working with.

Recap on the key concepts from Module 1:

Concepts: Inheritance, Roberta Estes 2020
What is a match? Roberta Estes, August 2021 (subscribe to this new series!)
How many matches do I have, Roberta Estes 2021
Relationship predictions - cousinship (ISOGG)
Relationship predictions - shared cMs (v4 Blaine Bettinger 2020)

Before venturing into the labyrinth of chromosome analysis, make sure you have maximised your findings from your AncestryDNA (and results from other sites) and have attempted to identify all your closest matches up to 3rd cousins. You should have also applied the grouping process (Leeds Method or similar) at AncestryDNA (and other sites) to identify likely match groups for each of your 16 x 2nd great grandparents.

Ideally, you want to be fairly certain that your closer matches appear to support your pedigree out to your 2GGP's (as much as possible). When working with your DNA matches you need to ensure you are working from a solid base and that the pedigree you have researched is in fact your true genetic ancestral line too. If you have no matches on some of these lines, and/or a group of unknown matches sharing large amounts of DNA with you, you may need to question whether your documented pedigree is accurate.

Filling in the Blanks - The Legal Genealogist 2019
8 ways to use ancestral trees in DNA Painter, Jonny Perl 2020
Matching at AncestryDNA and what it means, Roberta Estes 2021
AncestryDNA Grouping process, Christine Woodlands 2020
Leeds Method - Dana Leeds 2018
Tips for Triangulating, Diahan Southard 2025

Know what you are working with - Chromosome analysis

After working with your DNA results broadly for a time you will probably want to delve into analysing your results at the chromosome level. To confirm your pedigree beyond 3rd cousins or where there is no documented paper trail (with the exception of parent/child and sibling relationships) you need to undertake detailed chromosome analysis, using segment data.

Download your segment data:-

Downloading segment data and why you should - Roberta Estes, 2021
How to download your DNA match lists and segment files - Roberta Estes, 2022
23andMe
FTDNA Chromosome Browser
GEDmatch Segment Search - Tier 1 $'s
My Heritage
Third party tools - DNAGedcom Client, YourDNA.family (currently only 23andMe but plans to expand in the future to include other sites);

Decide how you are going to organise your data and keep track of research undertaken. These are the 3 main methods:

DNA Painter. The visual method suits many, but DNA Painter lacks the ability to manipulate data, draw upon historical notes, sort like groups in different ways etc. Even so, the site has lots of fabulous features so get your account now! This video outlines its features - Introduction to DNA Painter.
The spreadsheet method - Jim Bartlett's Blog of 2014 (updated in 2021) can give you some ideas about how to design your spreadsheet. If you are serious about working with results in the longer term, you will want to retain any work you do and not fall into the trap of constantly duplicating your research. This necessitates capturing a lot of data as you go. Beware - over time this spreadsheet will become extremely large and you need good technical skills to be able to manage it. Jim has made further enhancements to his spreadsheet since the introduction of Pro Tools in 2024.
In January 2022, Danielle Lautrec also published a useful blogpost about how she uses Excel in her DNA analysis.
Genealogical DNA Analysis Tool - This is a 'built for purpose' database, specifically for analysing your autosomal DNA results. It keeps everything organised and all in one place. Before starting to work with this tool you need to understand the underlying theory of chromosome analysis.

Apply the research methodology to your analysis process:

DNA Research Methodology - Veronica Williams, January 2021

Top 10 Hints for Autosomal Research - Veronica Williams, September 2016

Understand the different uses of the term 'in common with'. Distinguish between shared matches and shared segments.

One chromosome, two sides, no zipper - Roberta Estes 2015 (for use with FTDNA, slides are dated but the concepts are well explained)
Shared ancestral segments and false positives (IBD, IBP, IBC/IBS) - Roberta Estes 2017
Fully Identical and Half Identical Regions (FIR/HIR), whoareyoumadeof.com 2021
X marks the spot - Roberta Estes 2012
Pile up regions, Family History Fanatics, 2021
Recombining Relations, The Legal Genealogist 2021
Determining Sides, a Segment-ology TIDBIT, Jim Bartlett, 2023

Focus on triangulation to identify match groups who share a common ancestor and don't waste your time chasing likely false positive matches:

Triangulation: ISOGG - 2 types: 'tree triangulation' and 'segment triangulation'.
Triangulation: The what, why and how - Roberta Estes, 2021
How to Triangulate - Jim Bartlett, Segmentology. May 2015
The Benefits of Triangulation - Jim Bartlett, Segmentology, May 2015
Triangulating your genome using My Heritage - Jim Bartlett, Segmentology, Dec 2020.
Identify your ancestors - Follow nested segments - Roberta Estes, Jan 2022
My Heritage - Why don't the segments triangulate? , Family Locket, May 2022
FTDNA Matrix Tool, February 2025
Help, my segments are so sticky! - Kevin Borland, 2020

Once you have mastered the underlying theory of 'triangulated segments' and 'triangulated groups' you may wish to experiment with other tools that provide quick ways to identify matches who might be related to you in the same ancestor group. Consider using clustering tools to help you quickly identify groups of interest to explore, but remember to keep in mind whether the clusters are based on 'shared matches' or 'shared segments'.

GEDmatch - You need to subscribe to Tier 1 tools to access the clustering tools.
GEDmatch Auto Segment Analyser
- Read Roberta Estes' review here, 19 Oct 21.
My Heritage - Free
DNAGedcom Client - Collins Leeds Method for AncestryDNA and FTDNA
Genetic Affairs - 23andMe, FTDNA, GEDmatch, My Heritage.
DNA Painter Cluster Map Tool
Triangulation in action at DNA Painter - Roberta Estes, 2020
Genetic Affairs - Auto segment, includes identifying personal 'pile up' regions.
Clustering for autosomal analysis, Veronica Williams 2021

2024 Note: Some of these systems are currently unavailable due to the restrictions on downloads, check each site for the latest information.

Suggested activities

A number of activities have been developed to help you apply 'Module 2' in practice.

In the 2021 and 2023 series we had this exercise using the My Heritage site, however due to temporary disablement of downloads at My Heritage it cannot be completed at this time. Once downloads return (or if you still have old reports), it would be advisable to re-visit this exercise.

For the 2024 program the activity has been modified. A new exercise has been developed using GEDmatch data to practice identifying triangulated groups and false positive matches. The My Heritage exercise has also been adapted and is now limited to identifying triangulated groups, however it introduces a different tool to help extract data, called Pedigree Thief.

* GEDmatch - Exercise to identify triangulated groups and false positive matches.
* My Heritage - Modified exercise to identify triangulated groups using Pedigree Thief.

If you are not very familiar with the My Heritage site this video may be a good introduction. Alternatively, contact SAG Education for recent webinars presented at the Society of Australian Genealogists for both My Heritage and GEDmatch.

Advanced reading:

Jim Bartlett has written a four part series on the distribution of Triangulated Groups, which you might find of interest. This follows his earlier discussion on Triangulating your genome using My Heritage, in December 2020.

* Part 1 Distribution of Triangulated Groups, 1 Oct 2021

* Part 2 Distribution of Triangulated Groups, 5 Oct 2021

* Part 3 Distribution of Triangulated Groups, 8 Oct 2021

* Part 4 Distribution of Triangulated Groups (spreadsheet), 13 Oct 2021

2024 NOTE: Unless you have old data available to you, the suspension of data downloads at My Heritage since late 2023 will impact on your ability to complete Jim's activity at this time.

Veronica Williams

Originally posted: 22 September 2021

Last updated: 2 August 2024

Tuesday, September 21, 2021

My Heritage: Exercise to identify triangulated segments and likely triangulated groups

In Module 2 of 'Organising your DNA data and determining match groups' we discussed ways to organise data and a process for working with My Heritage to identify match groups. Before moving to Module 3, make sure you understand the theory of segment triangulation and have established a method for retaining records of your DNA research, to avoid rework and improve your productivity.

The purpose of undertaking these exercises is to consolidate your understanding of how to determine triangulated segments, identify potential match groups and whether a match is likely to be a false positive. By utilising the tools at My Heritage you can practice analysing your matches manually to ensure you appreciate the underlying theory. Remember, whilst shared matches are 'CLUES' of a shared ancestor, only shared and triangulated segments are 'EVIDENCE' of a common ancestor.

My Heritage is a DNA testing site that allows uploads from other testing companies. It's free to upload, but if you are not a My Heritage subscriber you will be required to pay a small 'one off' unlock fee to access DNA tools. My Heritage has two downloads that will be useful for chromosome analysis. These are the basic reports that you should start with:

* DNA Matches list

* DNA Matches shared segments list

My Heritage also has a clustering tool which can assist you to determine where to start your research, but this is best utilised once you understand more about triangulation. The clusters are an indication of matches in common, however understanding segment triangulation and how triangulated groups are formed will assist you to analyse the clusters more fully, helping to identify the key matches in the cluster more likely to share a common ancestor.

First steps

The following activities are suggested to help you apply 'Module 2' in practice. First you will need to download 2 reports from your DNA matches page. Click the 'three dots' icon and request the top two reports, the 'DNA matches list' and the 'Shared DNA segments report'. They will be emailed to you as .zip files. Download to your computer and open the zip file, creating 2 new .csv files. To open a zip file – right click on the zipped file – click on extract all – save the extracted files to a folder on your computer.

Exercise 1: Using the Broad Approach by Total cMs - My Heritage www.myheritage.com

Utilising the 'DNA matches list' you downloaded from My Heritage for your DNA kit, sort the Total cMs shared column (largest to smallest).
Make yourself familiar with the data contained in the downloaded spreadsheet.
Add 3 additional columns for side (eg P, M, Both), MRCA (Most Recent Common Ancestor) and notes. Delete or Hide unnecessary columns for manageability (optional).
Give your spreadsheet a working title, ie Total cMs Broad Approach - My Heritage.
Analyse at least your top 10-20 matches - those with the highest total cMs (ideally >40cMs).
Do you know if they are maternal or paternal (or both - ie share both sides, close relations) - mark your spreadsheet with the known side. Don't worry if you are not able to allocate sides at this stage.
Think about the likely relationship: how many generations back in your tree might you expect to find the MRCA? (In Module 1 we discussed the Shared cMs tool to predict relationships. You might like to compare those estimates to those in the My Heritage prediction column.)
Notate your spreadsheet with the known or likely MRCA couple, if you can.
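If you prefer to script the first few steps rather than do them by hand in Excel, the sort-and-add-columns part of Exercise 1 can be sketched roughly as follows. This is only an illustration: the sample data and the column headings ('Name', 'Total cM shared' and so on) are assumptions for the example and may not match the headings in your actual My Heritage download.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded 'DNA matches list'.
# The column names and people are assumptions, not real My Heritage data.
SAMPLE_CSV = """Name,Total cM shared,Number of shared segments
Dan Example,18.7,1
Bob Example,60.0,2
Alice Example,212.4,9
Carol Example,41.3,1
"""


def prepare_broad_sheet(csv_text, min_cm=40.0):
    """Sort matches by total cMs (largest first) and add working columns."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: float(r["Total cM shared"]), reverse=True)
    for row in rows:
        # Blank working columns, to be filled in by hand as you research.
        row["Side"] = ""
        row["MRCA"] = ""
        row["Notes"] = ""
        # Flag the matches above the suggested >40cMs review threshold.
        over = float(row["Total cM shared"]) > min_cm
        row["Review first"] = "Y" if over else ""
    return rows


sheet = prepare_broad_sheet(SAMPLE_CSV)
for row in sheet:
    print(row["Name"], row["Total cM shared"], row["Review first"])
```

The side, MRCA and notes columns are deliberately left blank, as deciding what goes in them is the research part of the exercise.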

You may end up with a sheet that looks something like the one below - in this example I have highlighted my first unknown match in green. This match shares 60cMs, over 2 segments, the largest segment being 51.6cMs.

In Exercise 2 we want to examine an 'unknown' match who only shares on one segment (to make it easier to interrogate) so match #34 would be unsuitable. If your first unknown match also shares more than 1 segment, you will need to drill down further in the list to find another potentially suitable match to examine for this exercise.

Exercise 2: Using Chromosome Analysis (Specific Approach) - My Heritage www.myheritage.com

For this exercise we are going to utilise the 'Shared DNA segments report' and the My Heritage website.

Step 1. Prepare the spreadsheet:

Open the 'Shared DNA segments report'.csv file you downloaded from My Heritage.
Save it as a separate sheet and give it a working title, ie 'Chromosome Analysis - Specific Approach MH'. We will be amending this version, but we also want to retain the original 'Shared DNA segments report'.
In your newly created 'Chromosome Analysis - Specific Approach MH' sheet, the columns we will be using will be Match Name, Chromosome, Start Location, End Location, Centimorgans.

Delete or hide any unnecessary columns for manageability (optional).
You may also wish to format the Centimorgans column to display the numbers showing one decimal point so that it is easier to read (ie 168 is the default number, re-format the column so it would appear as 168.0).
Freeze the header row so that you can easily see and sort each column heading later.

In your newly created 'Chromosome Analysis - Specific Approach MH' sheet, add columns for side (eg P, M, Both, I/F), TG (Triangulated Group), MRCA (Most Recent Common Ancestor) and notes.
In your newly created 'Chromosome Analysis - Specific Approach MH' sheet, sort the centimorgans column into descending order to identify the matches with the largest segments, colour code any over 40cMs so you can identify them later (if you don't have any >40cMs just reduce your threshold);
Go back to the original 'Shared DNA segments report' file from My Heritage. Examine the list and find the first 'unknown match' in the list who only shares on one segment (select one of your coloured matches if they fall into this category).
In your 'Chromosome Analysis - Specific Approach MH' sheet, you can re-sort the matches by name to see how many chromosomes they match on. Otherwise you can always go back to the 'DNA matches list' (Exercise 1 by Total cMs) which lists the number of shared segments. (NOTE: My first unknown match highlighted in green above shares 2 segments so is not suitable for this exercise).
In your 'Chromosome Analysis - Specific Approach MH' sheet, highlight your unknown match, in a different colour to the one you used for the cMs column.
In your 'Chromosome Analysis - Specific Approach MH' sheet, re-sort the sheet by Chromosome and start and end locations so that they appear in order.

Navigate back to your unknown match (Control/Command F - type in the name).
See how many matches there are overlapping on the same segment shared by your unknown match.
If there are a large number, double check that these are not in a known false positive region, or pile up areas. The chromosome map at DNA Painter is probably the easiest place to check this. If it is in a known false positive region, use a different match for the exercise, otherwise it may be too time consuming for what we are trying to demonstrate. Make a note in the notes column.
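For those comfortable with a little scripting, the three sorting tasks in Step 1 (ordering by chromosome and location, picking out the larger segments, and finding matches who share only one segment) could be sketched like this. The rows, names and field labels are hypothetical and only stand in for the 'Shared DNA segments report'; locations are shown in megabases for readability, and chromosomes are assumed to be plain numbers.

```python
from collections import Counter

# Hypothetical rows standing in for the 'Shared DNA segments report'.
# Names, field labels and values are illustrative assumptions only.
segments = [
    {"name": "Person 1", "chrom": 18, "start": 11.4, "end": 59.0, "cm": 48.2},
    {"name": "Person 2", "chrom": 18, "start": 9.7, "end": 24.2, "cm": 21.5},
    {"name": "Person 3", "chrom": 18, "start": 30.1, "end": 52.3, "cm": 19.8},
    {"name": "Person 4", "chrom": 3, "start": 100.2, "end": 140.9, "cm": 41.0},
    {"name": "Person 4", "chrom": 7, "start": 5.0, "end": 20.0, "cm": 15.3},
]


def sort_by_position(rows):
    """Order rows by chromosome, then start and end location."""
    return sorted(rows, key=lambda r: (r["chrom"], r["start"], r["end"]))


def large_segments(rows, threshold_cm=40.0):
    """Rows to colour code: segments over the chosen cMs threshold."""
    return [r for r in rows if r["cm"] > threshold_cm]


def single_segment_matches(rows):
    """Names of matches who appear on exactly one row (one shared segment)."""
    counts = Counter(r["name"] for r in rows)
    return sorted(name for name, n in counts.items() if n == 1)


ordered = sort_by_position(segments)
candidates = single_segment_matches(segments)
highlighted = [r["name"] for r in large_segments(segments)]
print(candidates)
print(highlighted)
```

A script like this cannot tell you whether a busy region is a known pile up area, so that check at DNA Painter still needs to be done by eye.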

Your sheet should look something like the one below.

Step 2. Identify the 'unknown match' you want to start with from Step 1 - Match A:

Login to My Heritage site and navigate to your DNA Matches list.
Find the 'unknown match' in your DNA Matches list on the MH site. We will call this match 'Match A'. Double check they only share on one segment and appear to have at least one triangulated match with other matches (look for the symbol in their shared match list).
Decide on a name for the triangulated group you hope to identify for Match A from this exercise - it could be TG001-Side A (or whatever name you choose - it can reflect maternal/paternal if you know that information - eg M_001 or something similar).
Go back to your 'Chromosome Analysis - Specific Approach MH' sheet that you created in Step 1.

It should already be sorted by chromosome, then start and end locations.
Find 'Match A' and put the TG number you created in the TG column.
Identify the start and end locations of 'Match A's' shared segment area - eg Chromosome 18 from 11.4 - 59.0 and see how many matches have overlapping segments in this segment area. It may be helpful to colour code this group for easy identification of all the matches that you are now going to check at My Heritage to start the sorting process.
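Finding who overlaps with Match A is really just a comparison of start and end locations on the same chromosome. A minimal sketch of that test is shown below, using the Chromosome 18 example figures for Match A and Match B from this post; the other matches and the field labels are made up for illustration.

```python
# Match A and Match B use the example locations from this post (in Mb);
# Matches C, D and E and all field labels are hypothetical.
segments = [
    {"name": "Match A", "chrom": 18, "start": 11.4, "end": 59.0},
    {"name": "Match B", "chrom": 18, "start": 9.7, "end": 24.2},
    {"name": "Match C", "chrom": 18, "start": 30.1, "end": 52.3},
    {"name": "Match D", "chrom": 18, "start": 60.5, "end": 70.0},
    {"name": "Match E", "chrom": 5, "start": 11.4, "end": 59.0},
]


def overlaps(seg, other):
    """True when both segments are on one chromosome and share any positions."""
    return (
        seg["chrom"] == other["chrom"]
        and seg["start"] < other["end"]
        and other["start"] < seg["end"]
    )


def overlapping_matches(rows, target_name):
    """Names of every other match whose segment overlaps the target's."""
    target = next(r for r in rows if r["name"] == target_name)
    return [
        r["name"]
        for r in rows
        if r["name"] != target_name and overlaps(target, r)
    ]


to_check = overlapping_matches(segments, "Match A")
print(to_check)
```

Keep in mind an overlap only tells you who to check at My Heritage. It does not tell you which side of the chromosome a match is on, or whether they triangulate.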

Return to your DNA Matches list on the My Heritage site. For the next part of the exercise you need to look for these two symbols on your My Heritage DNA match page, check where they appear before you start step 3.

Part 3. Identify likely triangulated groups from shared matches with triangulated segments.

On the 'My Heritage DNA Matches' page, open up the comparison page between you and Match A, by clicking the purple 'Review DNA Match' button. Double check that you only share one segment with your selected match.
As you look down the 'Shared Match List' on the My Heritage site, you and Match A will share at least one triangulated segment with anyone who has the TG symbol showing on the right. For this exercise - because we know Match A only shares one triangulated segment with you, all the matches with the triangulation symbol MUST BE matching on the same chromosome, so you and Match A and each match showing the symbol are forming a single triangulated group. You can click the TG segment symbol to check.

We are now going to identify every shared match that has the triangulation symbol, this is evidence the 3 people have a triangulated segment (you, Match A and the match with the TG symbol). It is highly likely everyone who triangulates with you and Match A will also form a larger triangulated group, all being connected via a common ancestor.
Review the list of shared matches looking for every shared match who has the 'triangulated segment symbol'.
If you know a side or the expected line for the common ancestor, you may wish to add an appropriate coloured dot as you work through this exercise for the triangulated matches ONLY.
For each match with the TG symbol, go back to your 'Chromosome Analysis - Specific Approach MH' sheet and notate the TG number in the TG column.

Continue to work through the 'Shared Match List' on the My Heritage site, looking at all the shared matches between yourself and 'Match A' (keep pressing 'Show more DNA matches' when you get to the bottom of each page) and identifying those with the triangulated symbol and going back to the spreadsheet to notate the TG number the same as we did before. You can click on the TG symbol to see which segment is triangulating, but they all should be on the same chromosome and in a location common to Match A because we chose a match sharing only one segment. NOTE: Make sure you stay on the match page for 'Match A' and don't accidentally start reviewing shared matches of a different match by mistake.
The triangulated symbol (shown in the shared match list) is telling you that you have at least one triangulated segment. Because we chose a match who only shared on one segment there will be no need to keep double checking the chromosome, as they should all be triangulating on the same segment. (This saves time and we will check the whole group later). For matches sharing more than one segment you will need to check the TG, this will be a much more time consuming exercise.

As you work through the list, for those matches that have a tree, look to see if you can identify common names, locations or (if you are very lucky) the shared MRCA. Notate your master spreadsheet as you go. It is up to you whether you record these notes on the 'DNA matches list' or the 'Shared DNA segments spreadsheet', the MH site or all three!

When you have exhausted all matches with the TG symbol, sort your spreadsheet by the TG column and review how many matches are noted as all being in the same triangulated group.

Colour coding the names in the TG may assist in knowing you have reviewed them (optional);
All members of this TG should share a common ancestor.
After completing the rest of this exercise, come back to this group and look for common names or locations and try to find any genealogical connections within the group.

Your 'Chromosome Analysis - Specific Approach MH' sheet should now look something like this.

NOTE: In some cases you might find that you have very large numbers of matches appearing in the overlapping segment area. If so, you can reduce the number of matches you compare in the second part of the exercise to a more manageable level by reducing the segment size to say >10cMs (this is a judgment call by you). If you do this, you will not be able to discard all false positive segments as you will not have done the complete comparison exercise.

Remember what we are trying to demonstrate for a valid triangulation is that A matches B, B matches C and C matches A, we need to do 3 comparisons. When we extend that out to a group, we need to do many more comparisons, ie for a 4th match, we need to ensure A matches B, A matches C, A matches D, B matches C, B matches D and C matches D, so there would be 6 comparisons. As the group gets larger the more comparisons we need to do to ensure every member is part of the group.
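The number of comparisons grows quickly: for a group of n people it is n x (n - 1) / 2, which gives 3 for three people and 6 for four. The short sketch below lists the pairs you would need to check and tests whether a group is fully triangulated, assuming you have recorded which pairs you have already confirmed as matching on the segment. The function names are my own and are for illustration only.

```python
from itertools import combinations


def comparisons_needed(members):
    """Every pair that must match for the group to be fully triangulated."""
    return list(combinations(members, 2))


def fully_triangulated(members, confirmed_pairs):
    """True only if every pair in the group has been confirmed as matching."""
    confirmed = {frozenset(pair) for pair in confirmed_pairs}
    return all(frozenset(pair) in confirmed for pair in combinations(members, 2))


three = comparisons_needed(["A", "B", "C"])
four = comparisons_needed(["A", "B", "C", "D"])
print(len(three), len(four))

# A, B and C all match each other, but D has only been checked against A and B.
checked = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D"), ("B", "D")]
print(fully_triangulated(["A", "B", "C"], checked))
print(fully_triangulated(["A", "B", "C", "D"], checked))
```

In the last line the group of four fails because the C and D comparison has not been confirmed.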

Part 4. Review the other side of the chromosome - starting with Match B (see below):

After reviewing all the triangulated segments for 'Match A' at My Heritage, go back to your 'Chromosome Analysis - Specific Approach MH' and re-sort it by chromosome number and segment location.

Look for the colour coded area indicating all matches that had overlapping segments with 'Match A'.
Find the largest segment match in the list that has not been marked as belonging to the first TG group. We will call this 'Match B'. If possible, find one that matches only on one segment to make the comparison process easier.
If Match B matches on more than one segment, you will need to check each potentially triangulated segment and record only those on the opposing side of the same chromosome as Match A.

Repeat the process in Part 3 for this match. Because 'Match B' did not triangulate in the first exercise, they must be a match on the other side of the chromosome, or are a false match. Make sure they triangulate with at least one other match. If they don't, choose someone different.

Decide on a name for this second triangulated group - it could be TG001-Side B (or whatever name you choose - it can reflect maternal/paternal if you know that information - eg P_001 or something similar).
Notate each match with the TG number on your shared segment spreadsheet in the TG column.
If you end up with smaller TG subgroups on the alternate side (different overlapping areas) you can number them B1 and B2.
Colour code the names of the group in another colour so you can see who is included.

Sort the 'Chromosome Analysis - Specific Approach MH' by the TG column and it should look something like this.

Part 5. Review what's left:

Where there are TG's on both sides of the chromosome, any matches who do not match either side WITHIN THE SAME SEGMENT AREA can be marked as false positives, notate the side column F (False) or I (IBS/IBC).
Remember in this example we chose a match who matched only on one segment so if they don't triangulate with either of the two groups they are a false match. That is not to say that, if they shared on multiple segments, some of the other segments might not be valid.
In any remaining segment area where there is a TG on only one side of the chromosome, the opposing side should be left blank as we cannot determine anything about the validity of those segment matches at this stage. The alternative is to mark them to the opposing side, until further work is done, but you need to remember that many of them will end up being false positives.
Your final list might look something like the one below.
Retain the spreadsheet for future reference and build on it with further research.

In this example the segment area for Match A was 11.4 - 59, whilst Match B shared 9.7 - 24.2. We found two triangulated groups, one on either side of the chromosome. As we were able to identify the MRCA for some of the matches we were able to allocate them to maternal (TG_M_001) and paternal (TG_P_001) groups.

Because we did not investigate the second side of the chromosome between 24-59, this means all the matches who did not triangulate with TG_P_001 in this segment location area must be either maternal matches or potential false positives.
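The arithmetic behind this can be sketched in a few lines. Using the example figures above, the area covered by a TG on both sides is simply where the two ranges overlap, and only leftover matches sitting wholly inside that area are marked 'F'. This is a simplified illustration: it assumes each TG can be described by a single start and end location, and the function names are hypothetical.

```python
def both_sides_region(span_a, span_b):
    """Area where the TGs on both sides of the chromosome overlap, if any."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    return (start, end) if start < end else None


def mark_leftover(segment, tg, region):
    """Return the TG if one was recorded, 'F' if the match is provably false,
    or blank where nothing can be concluded yet."""
    if tg:
        return tg
    inside = region is not None and region[0] <= segment[0] and segment[1] <= region[1]
    # Only mark false when the segment sits wholly inside the area where
    # both sides have been examined (an assumption made for this sketch).
    return "F" if inside else ""


# Example figures from this post: Match A 11.4 - 59, Match B 9.7 - 24.2.
region = both_sides_region((11.4, 59.0), (9.7, 24.2))
print(region)

print(mark_leftover((12.0, 20.0), "", region))
print(mark_leftover((30.0, 45.0), "", region))
print(mark_leftover((12.0, 20.0), "TG001-Side A", region))
```

Here the area with a TG on both sides works out to 11.4 - 24.2, so an untriangulated match at 12 - 20 is marked false, whilst one at 30 - 45 is left blank until the second side of that area has been investigated.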

Part 6. Go back to the end of step three and explore your triangulated group for the MRCA (optional).

Is the whole group triangulated?

You can now also go back to the 'triangulated match tool' on the My Heritage site and check to see if all matches in the group you have identified (TG001-Side A) triangulate with each other. If the first segment was quite long, you may find the triangulations are in sub groups, like in the diagram below. You will need to play around with your match comparisons to identify the subgroups, but eventually you might see this sort of pattern.

This is demonstrating that whilst everyone is triangulating with 'Match A' (red), they are coming into the group at different levels. You and 'Match A' share the longest segment, so 'Match A' is probably a closer relation to you and your shared MRCA couple. The other matches may reflect matches to the same MRCA couple or perhaps a segment belonging to an older ancestor from one of the ancestors in that MRCA ancestral couple.

If you can identify the MRCA couple for Match A, others in each of sub groups could be segments coming from any of the 4 parents of that couple depending upon whether there was a recombination event in the segment area for Match A.

For example if Match A (red) was a 2nd cousin, they share your great grandparents. The subgroups could belong to either of those, or potentially different 2nd great grandparents depending on where recombination events occurred. Whilst I would initially call this one triangulated group for the entire length of Match A, if you identify the subgroups as belonging to more distant ancestors, you may wish to rename your subgroups as separate TG's.

Read more about reviewing your DNA matches at My Heritage.

Next Steps - the quick way!

After you have mastered identifying your triangulated groups, access the 'My Heritage Auto Clusters' tool under the DNA tools menu. Remember that these are shared match clusters and need to be examined carefully to identify triangulated groups.

Check to see if the match you selected for Exercise 2 appears in the cluster report;
Who else is in the cluster?
Which matches in the cluster are triangulated and who is only a shared match?
Explore some new groups (optional);
Don't forget to add any analysis to your master spreadsheet for future reference.

Need a bigger challenge?

Try this exercise from Jim Bartlett - Triangulating your genome using My Heritage (Segment-ology, Dec 2020). I'd suggest attempting one chromosome to start, probably one of the smaller ones, Chromosome 20 might be good as it doesn't have any known false positive regions.

Veronica Williams

First Published: 21 Sep 2021

Last Updated: 19 December 2023

Tuesday, September 7, 2021

GEDmatch - Exercises to understand the difference between 'shared matches', 'shared segments' and 'triangulated segments'

In Module 1 of 'Understanding DNA Basics' we discussed the key concepts and theories relating to DNA analysis for genealogy. Before moving to Module 2, make sure you understand the theory of segment triangulation.

The purpose of undertaking these exercises is to consolidate your understanding of the difference between shared matches and shared segments. By using the free tools at GEDmatch you can practice analysing your shared matches to check if you have any shared segments, then to identify triangulated segments with other matches in common. Whilst shared matches are 'CLUES' of a shared ancestor, shared and triangulated segments are 'EVIDENCE' of a common ancestor.

GEDmatch is a third party tool that allows you to do comparisons between people who have tested their DNA at different sites. Its free version has these 3 basic tools that you should start with:

* 'One to many' report

* 'One to one' comparison

* 'People who match both kits, or 1 of 2 kits'.

The Tier 1 subscription has more tools but it is recommended that you start with the free version and master that first.

Suggested activities

The following activities are suggested to help you apply 'Module 1' in practice, these can be done using the GEDmatch 'free' version.

Exercise 1: Using the Broad Approach by Total cMs - GEDmatch www.gedmatch.com

Run the ‘one to many - limited version’ tool at GEDmatch for your DNA kit, sort by Total cMs shared. Leave the list default at '50 matches', but adjust the match size to 40cMs.
Copy/Paste (special) to a spreadsheet, add columns for side (eg P, M, Both), MRCA (Most Recent Common Ancestor) and Notes.
Give your spreadsheet a working title 'Total cMs - Broad approach'.
Analyse at least your top 10 matches - those with the highest cMs. Do the whole 50 if possible, you will probably have close kits in the list that you already know and some duplicates.
Do you know if they are maternal or paternal (or both - ie share both sides, close relations) - mark your spreadsheet with the known side (optional - add tags in GEDmatch if you have a Tier 1 subscription). GEDmatch lists the original testing company on the 'one to many' report. If they have tested at another site, like AncestryDNA, you may find more information at the original test site.
Notate your spreadsheet with the MRCA couple name, if known.
If you don't know anything about a match in the list, run the 'People who match both kits, or 1 of 2 kits' report between you and the match. See if you share matches with any of your known relatives, this may give an indication on the 'side'.
If you recognise any matches from AncestryDNA, check the 'sideview' feature and use that as a working theory on side - eg paternal/maternal. If you mark a match based on either of these tools, make sure you add a comment in the notes field.
Think about the likely relationship (remember the Shared cMs tool to predict relationships) and how far back in your tree you might expect to find the MRCA? The 'Gen' column also gives you an estimate.
Add any other notes you think might be helpful later. To help spreadsheet manipulation later try to be consistent in how you update your notes field, develop a system.

By using the output of the 'one to many' as your starting point, your spreadsheet might look something like the one below. Don't worry if you are not able to allocate sides at this stage.

Note that in this example, John B appears to have two kits on GEDmatch, and they might actually be the same person. Run a 'one to one' on the two kits: if they are the same person, you should see fully identical regions (FIR's - green) on every chromosome.
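
To decide which pairs are worth a 'one to one' check, you could flag kits that have the same name and almost the same total cMs. This is a rough heuristic of my own for illustration, not a GEDmatch feature, and the 1cM tolerance and field names are assumptions. Only the 'one to one' comparison can confirm that two kits belong to the same person.

```python
from itertools import combinations

# Hypothetical rows from the 'Total cMs - Broad approach' worksheet.
rows = [
    {"kit": "A000001", "name": "John B", "total_cm": 212.4},
    {"kit": "A000003", "name": "John B", "total_cm": 212.1},
    {"kit": "A000004", "name": "Ann D", "total_cm": 75.0},
]


def possible_duplicate_kits(rows, tolerance_cm=1.0):
    """Return pairs of kit IDs with the same name and nearly equal total cMs.

    These are candidates only; confirm each pair with a 'one to one'.
    """
    pairs = []
    for a, b in combinations(rows, 2):
        same_name = a["name"].strip().lower() == b["name"].strip().lower()
        close_total = abs(a["total_cm"] - b["total_cm"]) <= tolerance_cm
        if same_name and close_total:
            pairs.append((a["kit"], b["kit"]))
    return pairs


candidates = possible_duplicate_kits(rows)
print(candidates)
```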

Shared segments of close matches can be used for more intensive chromosome analysis.

Exercise 2: Using Chromosome Analysis - GEDmatch www.gedmatch.com

PART 1:

Do a ‘one to one comparison’ at GEDmatch between yourself and your largest unknown match (from the total cMs list in Exercise 1), to ensure it is a valid match (graphics and positions). Look at the graphics, taking account of the HIR (yellow) and no match (grey) regions. It is unlikely you will have any FIR's (green) unless the match is a close relation, or is related to you on both sides.
If you have no matching segments, select another match and repeat the 'one to one comparison' step.
Create a second worksheet (called 'Chromosome Analysis - Targeted approach') to track matches by chromosome. To start, create 3 header columns: Match name, GED_ID and ICW_ID. Add your selected match's name and GED_ID to the 'Chromosome Analysis - Targeted approach' spreadsheet.
Do the ‘one to one comparison’ between yourself and your largest unknown match again, this time using the 'positions only' radio button. Copy and paste the data from the table into the 'Chromosome Analysis - Targeted approach' spreadsheet from column 4. Copy the match name and GED_ID down to any additional rows created by adding the table, this will depend on how many chromosomes you match on.
We won't need to use all the columns created from the GEDmatch output, but leaving it the same will save you time as you add additional data. Don't worry too much how the spreadsheet looks, you can reformat and hide some of the columns later. If you do reduce it, keep columns for at least the Match_Name, GED_ID, ICW_ID, chromosome number, cMs, start and finish locations and SNP’s. Add additional columns for Side, MRCA and Notes at the end.

PART 2:

Next you are going to run the ‘People who match both kits, or 1 of 2 kits' report at GEDmatch between you and your match. The people who will appear in this report are 'shared matches' that you and your selected match both share. When running the report, change the second default 'total matching segments' to select matches >20cMs as a starting point - depending on your total number of shared matches you may get too few or too many. You can go back and reduce the threshold down to 10 or 15cMs if necessary, just get a large enough group to be able to practice, ideally about 5-10.
Do additional ‘one to one comparisons’ between yourself and each of the shared matches (position only).
Exclude known close relatives for this part of the exercise, as there will be too many shared segments. (Make your own assessment of how many matches you look at, some people will have many matches, others only a few, so change the threshold if needed to reduce the number of shared matches to be more manageable for the exercise).
Add your name and GED_ID to the 'Chromosome Analysis - Targeted approach' spreadsheet on the next available row. Then add the GEDmatch ID of the next shared match under the ICW_ID column. Copy and paste the 'one to one' comparison from the GEDmatch output table to the spreadsheet from Column 4.
Continue with other shared matches, add the results of all the comparisons to the spreadsheet.
Sort your 'Chromosome Analysis - Targeted approach' spreadsheet by chromosome, start and end locations.
Do any matches share on the same segment, in the same location? Colour code each of the individual sets where it appears they have overlapping segments, in slightly different colours.
Add any 'known' matches with identified shared common ancestors to the MRCA column.
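
The sorting and colour-coding steps in Part 2 can be sketched in code. The example below sorts segments by chromosome, start and end location, then collects runs of segments that overlap so they can be colour coded as a set. The Segment fields, match names and positions are invented for illustration, and chromosomes are assumed to be numbered autosomes. Overlapping with you on the same location is only a hint at this stage; it does not yet show that the matches share the segment with each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    match_name: str
    chromosome: int
    start: int   # start location
    end: int     # end location
    cm: float


def group_overlaps(segments):
    """Sort by chromosome, start, end and return runs of overlapping segments.

    Only runs with two or more segments are returned. A run can chain
    (A overlaps B, B overlaps C), so treat each run as candidates to check.
    """
    ordered = sorted(segments, key=lambda s: (s.chromosome, s.start, s.end))
    groups = []
    current = []
    current_end = None
    for seg in ordered:
        same_chrom = bool(current) and seg.chromosome == current[0].chromosome
        if same_chrom and seg.start < current_end:
            current.append(seg)
            current_end = max(current_end, seg.end)
        else:
            if len(current) > 1:
                groups.append(current)
            current = [seg]
            current_end = seg.end
    if len(current) > 1:
        groups.append(current)
    return groups


segments = [
    Segment("Match A", 2, 10_000_000, 30_000_000, 22.5),
    Segment("Match B", 2, 25_000_000, 45_000_000, 18.0),
    Segment("Match C", 7, 5_000_000, 15_000_000, 12.3),
    Segment("Match D", 2, 60_000_000, 80_000_000, 15.1),
    Segment("Match E", 7, 90_000_000, 99_000_000, 10.2),
]

groups = group_overlaps(segments)
for group in groups:
    print([s.match_name for s in group])
```

In this made-up data only Match A and Match B overlap (on chromosome 2), so they would be the first pair to colour code and carry forward to Part 3.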

NOTE: In the output report from 'People who match both kits, or 1 of 2 kits', you may see a tab called 'Save ICW'. This generates a list of kit ID's and can be an alternative way to start your spreadsheet. Some users have reported that this button is not visible. If you cannot see it, don't worry: it is not essential, and the exercise can be completed by following the instructions in Part 1.

PART 3:

Looking at your 'Chromosome Analysis - Targeted approach' spreadsheet sorted by chromosome, and start and end locations, examine those that appear to be overlapping.
For those that share on the same chromosome, in the same segment location area, run a one to one between the two matches to see if they triangulate on the same segment area, ie all 3 people (you and the two matches) share a common segment location with each other. Expressed another way - A matches B, B matches C, C matches A = triangulation. As you move through the shared match list you may find others that triangulate on that same segment; you will need to ensure that they also match the others in the group. Make a note of this in the notes column.
Identify the matches requiring additional comparisons - in green shown below.

On the 'Chromosome Analysis - Targeted approach' spreadsheet, add the details of the additional 'one to one' queries, making sure you adjust columns 1-4 to correctly reflect who is the subject of each comparison.
As you add the comparison matches to the spreadsheet, you may wish to add a note to Column 1 why the comparison was made, eg 'Check TG_C02' (check triangulated group, chromosome 2), or the name of the match.
The next step is to check that these are triangulated segments.
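
The triangulation test in Part 3 (A matches B, B matches C, C matches A on the same segment) can be expressed as a simple position check. This is a minimal sketch using invented positions: each comparison is a (chromosome, start, end) tuple, and three comparisons triangulate only if they are on the same chromosome and share a common stretch. It looks at positions only and ignores segment size in cMs and SNP counts, which you should still review in the GEDmatch output.

```python
def common_region(*segments):
    """Return the (chromosome, start, end) shared by every segment, or None.

    Each segment is a (chromosome, start, end) tuple taken from a
    'one to one' comparison in 'positions only' mode.
    """
    chromosomes = {seg[0] for seg in segments}
    if len(chromosomes) != 1:
        return None  # not all on the same chromosome
    start = max(seg[1] for seg in segments)
    end = min(seg[2] for seg in segments)
    if start >= end:
        return None  # no stretch common to all comparisons
    return (segments[0][0], start, end)


# Hypothetical positions for the three 'one to one' comparisons.
you_vs_a = (2, 10_000_000, 30_000_000)
you_vs_b = (2, 25_000_000, 45_000_000)
a_vs_b = (2, 24_000_000, 31_000_000)

triangulated = common_region(you_vs_a, you_vs_b, a_vs_b)
print(triangulated)

# If A and B only match each other elsewhere, there is no triangulation.
a_vs_b_elsewhere = (5, 1_000_000, 9_000_000)
print(common_region(you_vs_a, you_vs_b, a_vs_b_elsewhere))
```

When another shared match appears to sit on the same segment, pass all of the relevant comparisons to the same check, since every member of the group has to match every other member.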

After you have run the additional queries, your 'Chromosome Analysis - Targeted approach' worksheet might look like one of these examples. Don't spend too much time worrying about how the spreadsheet looks as we will be discussing how to organise your data in Module 2. What we are trying to evaluate from this exercise is whether our shared matches also share the same triangulated segment, with one or more of the shared matches.

Chromosome Analysis - Targeted Approach

Using 'People who match two kits' as the base spreadsheet

PART 4:

What was the total number of shared matches generated by the tool that BOTH you and your largest unknown match shared?
Of the total shared matches, how many had shared segments on the same chromosome?
Did some matches share segments on more than one chromosome?
Of those who had shared segments on the same chromosome, how many had triangulated segments with the other match?
Did you identify any additional triangulated segments not shared with the initial match?
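
Some of the Part 4 questions are simple counts, which the sketch below works out from a list of (match name, chromosome) pairs. The names and chromosome numbers are invented, and 'same chromosome' is read here as 'a chromosome where your initial match also shares a segment with you', which is my assumption about the question. The triangulation questions still need the Part 3 checks.

```python
from collections import defaultdict

INITIAL = "Initial match"  # placeholder name for your largest unknown match

# Hypothetical (match name, chromosome) pairs from your 'one to one' results.
rows = [
    (INITIAL, 2),
    (INITIAL, 7),
    ("Match A", 2),
    ("Match B", 2),
    ("Match B", 7),
    ("Match C", 11),
]


def chromosomes_by_match(rows):
    """Map each match name to the set of chromosomes they share with you."""
    result = defaultdict(set)
    for name, chromosome in rows:
        result[name].add(chromosome)
    return dict(result)


by_match = chromosomes_by_match(rows)
initial_chromosomes = by_match[INITIAL]
shared = {name: c for name, c in by_match.items() if name != INITIAL}

total_shared = len(shared)
same_chromosome = sorted(n for n, c in shared.items() if c & initial_chromosomes)
multiple_chromosomes = sorted(n for n, c in shared.items() if len(c) > 1)

print(total_shared, same_chromosome, multiple_chromosomes)
```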

PART 5:

Does your match (or shared/triangulated matches) have a pedigree listed on the 'one to many' report - GED/WIKI? By reviewing their tree, can you see how they might relate to you? (It is not necessary to go looking elsewhere for additional information for this exercise, just look at what clues you can see in GEDmatch).
Make notes of any other information and details of triangulations (some optional ideas for clues include - shared matches, pedigree, surnames, locations).
Notate your spreadsheet with the MRCA couple, if known (don't worry if you can't tell yet).
Repeat for other matches of interest (optional).

Once you have mastered these 3 basic tools and understand the concepts of how triangulated groups are formed, you can experiment with the many other tools that GEDmatch has to offer.

The 'Tier 1' paid subscription provides additional tools/reports including:

* Multi kit analysis (MKA) report. This can be useful to compare multiple kits from the 'one-to-many' report. However, the tool may be confusing if you are new to GEDmatch. MKA also includes colour coding of matches called Tagged Groups;

* Segment search;

* Triangulation report, including cross matching; and

* Cluster reports.

In all the examples in this program, we first work through the 'longhand' manual method so that you understand the DNA theory. We aim to use the 'free tools' so that everyone can easily participate and work the same way. In Module 3 we will look at other tools that are available to help you speed up the analysis process.

Once you have completed Exercises 1 and 2, you might like to try Exercise 3 - Identifying segments of a known ancestor to push back further generations.

Veronica Williams

First published 9 Jul 2021

Last updated 10 July 2024