check out the answers here:
WHY ARE THERE SO MANY DUPLICATES
Very interesting clip. Worth watching, but with some provisos!
Firstly, I'm sure many of the records talked about (especially the user-submitted IGI ones) have still not been transferred to Family Tree.
Secondly, it makes very depressing viewing. Say I have just merged around a hundred records (probably not in one day, either): the thought that two-thirds of Family Tree probably consists of duplicates - amounting to millions of IDs - makes me doubt if it was all worth the effort.
If there are thousands of users matching my efforts, maybe we will get a little way in tackling the duplicates problem. But then another batch of duplicates is likely being added every day to counteract this!
Also, from my work in Family Tree, I believe I have identified a far greater problem than genuine duplicates: the individuals (I can't imagine how many thousands) who have disappeared as such, by being merged with another person of a totally different identity.
Now, if I'm finding this on a regular basis, it really is worrying that many individuals are being lost from Family Tree just because they shared the same name - and even that is not always the case (I have had a William Payne merged with a Thomas Payne, etc.).
In view of all this, I sometimes wonder if it is really worth the effort, when the problem is so huge and careless users are just compounding it.
can you clarify what you meant by this comment:
on the family trees that I most work with - most of the duplication has already been removed.
but of course -- duplicates always seem to creep in (most often from GEDCOM uploads)
note that the video was created back in 2018 -- and even then was probably old info
a large amount of great work has been done in removing the dups in most of the lines I work with.
Now if we can just keep people from uploading GEDCOMS.
@Dennis J Yancey
Maybe I am getting IGI records mixed-up with the IDs created by way of these records.
Some records remain exclusively in the Genealogies section, but I agree I have no evidence to show that all the individuals created via those records haven't come across to Family Tree. In fact, this is probably a major part of the Family Tree duplicates problem. Take the example discussed in other threads: those 10 x "John Smiths" (one for each child whose IGI record mentions him) - and even more if the event appears twice (i.e. around 20 duplicates if there were two projects based on the same records).
I understand that all of the "extraction" part of the IGI was moved over, but don't know for sure (perhaps you can confirm) if everyone in the other parts that now make up Genealogies (including Ancestral File) really has been moved across, as implied in the video.
Regarding your comment on GEDCOMS, you may remember Joe Martel was always quite insistent (in GetSat days) that, when the stats were analysed, GEDCOMS were not the major part of the problem.
I need to review the video
IGI -- I would expect all to have been there a long time ago
items from Genealogies -- stay in genealogies - unless the user goes the extra VERY TEDIOUS step of adding them to Family Tree
BUT I need to go back and rewatch the video
Thanks for your response, Dennis. I was editing my post at the time, so you might wish to re-read that, too!