A method to use user-submitted DNA-data files and Geolocation to rank the probability of having a co
Over the last few months I've developed a method to find common ancestors using GIS and DNA analysis.
Using this method, by hand, I was able to track down a common ancestor 7 generations back with a distant cousin in Pennsylvania. (I now live in the States, but I'm a fifth-generation Argentinian.) The grandmother of my newly discovered cousin was a natural child, so we couldn't use the regular process to find where we were connected. With my method I could identify the most likely branch of my tree, and the time and place where our common ancestor lived.
With that success, and since my bills-paying job is as a Data Scientist, I took a couple of days off to automate the process.
Unfortunately, I don't have access to the Family Search API, so I had to download the individual GEDs using a third-party tool (RootsMagic).
Family Search could easily run these processes in the background and give researchers suggestions about where their trees are most likely connected.
Furthermore, with more and more DNA data available for Family Search (as users start using this functionality) new connections could be inferred that would be impossible by just comparing two DNA files.
I published a brief demonstration of the mapping (though not the ranking) on LinkedIn:
Please contact me if you want to know more.
Thanks!
Herman
Comments
-
Normally, I don't want to discourage using DNA to establish relationships. However, the most commonly-used DNA test (Autosomal) is only good for a few generations. Y-DNA and mtDNA trace a single paternal or maternal line.
Regardless, because Autosomal DNA is so limited, any connections beyond 4 generations really need to be backed up with a solid paper trail.
I have three ancestral lines that came from geographical locations, and all three basically came from a single (end-of-line) ancestor. Attempting to trace the lines back further, even using Y-DNA, has established only one other immigrant family link among the 25 Huber families who settled in what was originally Lancaster County.
0 -
I'm not suggesting using only DNA data to establish relationships. What I'm suggesting is using it to prioritize lines of research.
Also, while it is true that a single DNA match can be inaccurate after a few generations, several thousand DNA matches, all concentrated in the same geographic area, are certainly a good indicator of common ancestry (by mere preponderance of the evidence).
My method does not find (nor does it claim to find) common ancestors, but it helps orient the research. A confirmed relationship still has to be backed by the usual documentation.
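To make the "preponderance of the evidence" idea concrete, here is a minimal sketch in Python. It is not my actual pipeline (which works on GEDCOM files and GIS coordinates); it only assumes that each DNA match's tree has already been reduced to a set of ancestral place names. The function name `rank_places` and the place labels are hypothetical.

```python
from collections import Counter

def rank_places(my_places, matches_places):
    """Rank places in my tree by how many DNA matches have an ancestor there.

    my_places: set of place names found in my own tree (assumed input).
    matches_places: one iterable of place names per DNA match (assumed input).
    """
    counts = Counter()
    for places in matches_places:
        # Count each match at most once per place, and only for places
        # that also appear in my own tree.
        for place in set(places) & set(my_places):
            counts[place] += 1
    # Highest counts first: these are the places to research first.
    return counts.most_common()

# Made-up labels, for illustration only.
my_places = {"town_a", "town_b", "town_c"}
matches_places = [
    ["town_a", "town_x"],
    ["town_a", "town_c"],
    ["town_a"],
]
ranking = rank_places(my_places, matches_places)
print(ranking)
```

The output is only a priority list for research; a place at the top still needs the usual documents to confirm anything.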
0 -
I'm not as pessimistic as Tom Huber about autosomal DNA. Up to 6 generations or so it can be confidently used. For more distant relationships, you'd want triangulation (three people sharing the same segment, and with a unique common ancestor).
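As a rough illustration of the segment part of triangulation, the check can be sketched in Python as below. This assumes each shared segment is given as a (chromosome, start, end) tuple; the function names are hypothetical, no minimum segment length is applied, and confirming the unique common ancestor still has to be done in the trees.

```python
def overlap(seg_a, seg_b):
    """Return the overlapping (chromosome, start, end) of two segments, or None."""
    chrom_a, start_a, end_a = seg_a
    chrom_b, start_b, end_b = seg_b
    if chrom_a != chrom_b:
        return None
    start, end = max(start_a, start_b), min(end_a, end_b)
    return (chrom_a, start, end) if start < end else None

def triangulates(seg_me_a, seg_me_b, seg_a_b):
    """True if all three pairwise shared segments cover a common region.

    seg_me_a: segment I share with person A
    seg_me_b: segment I share with person B
    seg_a_b:  segment A and B share with each other
    """
    common = overlap(seg_me_a, seg_me_b)
    return common is not None and overlap(common, seg_a_b) is not None

# Made-up positions, for illustration only.
print(triangulates((7, 10, 50), (7, 30, 80), (7, 20, 60)))
```

Requiring the third pair (A with B) to match on the same region is what separates triangulation from two unrelated matches that merely sit at the same position.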
I think the idea is good, but you'd need to speak to FS Staff about the possibility of making this tool available through FS.
0 -
My bad! In the post I meant 4/5 generations ago, not 7. The 7 comes from another line of research where this method took me to the common town in Denmark, but I haven't figured out the common ancestor(s) there yet (I'm not that familiar with the Danish records).
0 -
Here I posted another example. In this case the method pointed me to the exact towns in Denmark where my father and someone in Manti had common ancestors.
0