The Possible Duplicates feature of Family Tree does not work and will continue to defeat the purpose
Marcia Lee Sorenson said: This is not an idea or praise, it is a complaint! I have sent this problem to feedback at least 8 or 10 times in the last 4 years, and the problem has not been fixed. In my opinion, the problem has gotten worse. This is the problem...THE POSSIBLE DUPLICATES FEATURE DOES NOT WORK! Supposedly, the goal of Family Tree as the improved version of new.familysearch or temple ready was to avoid adding duplicate individuals to the system. Supposedly, the technology was now better and able to recognize that John Johnson, Jon Jonson, and Jon Jonsen were the same person when they came from the same place and had the same parents, etc. Well, I am so discouraged with the possible duplicates feature because it does not work. Only one time out of 100 will the system come up with a possible duplicate when I am looking for an individual or adding a new individual to Family Tree. Yet, if I type in the same person using the find button, the person is almost always there. Actually, there are usually many copies of the same person in the find button. However, I have typed in a person in the find button and it will tell me there are 2000 individuals with that name. Then the find button says, "The following results do not match, but they may help." What criteria do you use to locate individuals with the find button?
I just typed in Finbo Hasli in the find button because the possible duplicates says there are NO possible duplicates in the system. Actually, the second person in the Find button under "The following results do not match" was Finnbo Hasli! He was born in the same county in Norway, had the same wife's name, and the two birth years were within 17 years of each other. After finding both Finbo Haslis under the Find button, I went back to my Finnbo Hasli's individual page and then clicked on the Possible Duplicates feature. Guess what?!! The system said there were NO POSSIBLE DUPLICATES! How can we get the Family Search engineers to fix this problem?
The whole goal of Family Tree as the improved version of new.familysearch was to eliminate the DUPLICATES. Instead, many duplicates are being added every day and will continue to be added to the system as long as the possible duplicates feature does not work.
Comments
Roy F Hunt said: Marcia, I personally have not had the problems with "Possible Duplicates" that you describe. But then I am dealing with fairly common English names, and there are dozens if not hundreds of possible duplicates. I have been told that the system ranks the possible duplicates from 1 to 5 stars, and only the 3-5's are presented as possible duplicates. In your case above, the minor spelling difference combined with a 16-17 year difference in birth date may have been sufficient to reduce the probability of a match to the 1 or 2 star level. All that needs to be done is to go back to the find screen and copy the PID, then go to the possible duplicates, click on Merge by ID, and paste the PID in. You can then merge the two records.
In any case, since your experience with your names indicates that the "Possible Duplicates" and/or the "Add a person" functions are unable to find matches, you should start with the "Find" function. If the record is there, then it is a simple matter of copying the PID and then pasting it into the add a person screen. The computer can't do everything for us. Some things just have to be done by a human.
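The star cutoff Roy describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` fields, the `star_rating` formula, the 20-year cap, and the birth years are all hypothetical and are not FamilySearch's actual algorithm. The only part taken from the comment above is the idea of a 1 to 5 star rank where candidates below 3 stars are hidden.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Person:
    name: str
    birth_year: int


MIN_STARS_SHOWN = 3  # per the comment above: only 3-5 star candidates are presented


def star_rating(a: Person, b: Person) -> int:
    """Hypothetical 1-5 star score: name similarity minus a birth-year penalty."""
    name_score = SequenceMatcher(None, a.name.lower(), b.name.lower()).ratio()
    # Assumed penalty: grows linearly with the birth-year gap, capped at 20 years.
    year_penalty = min(abs(a.birth_year - b.birth_year), 20) / 20
    score = max(0.0, name_score - year_penalty)
    return 1 + round(score * 4)  # map a 0.0-1.0 score onto 1-5 stars


def possible_duplicates(target: Person, candidates: list[Person]) -> list[Person]:
    """Keep only the candidates rated at or above the display threshold."""
    return [c for c in candidates if star_rating(target, c) >= MIN_STARS_SHOWN]


# Birth years below are made up for illustration.
mine = Person("Finbo Hasli", 1560)
far = Person("Finnbo Hasli", 1577)   # 17-year gap
near = Person("Finnbo Hasli", 1560)  # same year
print([p.birth_year for p in possible_duplicates(mine, [far, near])])
```

Under these assumed weights, the one-letter spelling difference costs very little, while the 17-year gap alone is enough to push the candidate below the 3-star cutoff, which is the kind of effect Roy suggests may explain the missing match.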
gasmodels said: Just to add to Roy's comment: my biggest problem with my English ancestors is that people merge multiple different individuals with the same name just because they show in possible duplicates. The problem can go both ways. Sometimes the duplicates are difficult to find, and sometimes there are many presented that should not be. The current system seems to provide a reasonable balance, but human judgement is necessary. To quote Roy above: The computer can't do everything for us. Some things just have to be done by a human.
Gordon Collett said: I do a lot of Norwegian research and would offer a word of caution. Two Finnbo Hasli born 17 years apart in the same area are more likely to be cousins, brothers, or even father and son than the same person. Even if the wives' names are the same. Especially if the wife's name is something common like Anna Olsdatter!
Assuming the birth years are correct, those two are not duplicates and should never be merged. If the birth years are not correct, you can't expect the computer to recognize that.
What are the ID numbers for the two Finnbo Hasli? When I do a Find on that name, the only ones that come up with wife names at all both have wives named "Mrs. Finnbo" which is, of course, meaningless. I'd be interested in looking further at the two you found to see if they look like duplicates when the birth years are ignored.
(P.S. You might be interested to know that several months ago, someone wrote a post matching yours in its passionate pleading begging to have the Possible Duplicates turned off completely for any Scandinavian name because there were so many incorrect duplicates being found and she found too many cases of people combining completely different individuals because of that.)
Marcia Lee Sorenson said: Thank you, Gordon, for your interest in my post. I am not sure how I got to this site, as my intent was to give feedback to the FS engineers, based on the recommendation of the missionary who answered my call to the help line.
Scandinavian names are indeed duplicated over and over because of the patronymic naming pattern. However, I would think that those who combine without doing due diligence are probably either overzealous or inexperienced at doing Scandinavian research.
What irritates me so much is that new.familysearch had a workable and effective possible duplicates feature. When family tree first required us to look at its possible duplicates feature, its filters were so ineffective that as you looked for a person who was born and died in Norway, the feature would show you 5 Norwegians, 10 Danes, 8 Swedes, 5 Finns and even individuals from France or England. I did complain about that lack of filtering several times. Now I think they filter too close so that it can't even find two people who are obviously the same or even maybe the same.
As far as my Finnbo Hasli is concerned, he is definitely the duplicate of the one I found in the Find feature. If you know about Norwegian research, you know that the church records begin in the 1700's. If you are fortunate enough to find your person going back to the 1500's like Finnbo Hasli, all of the birth and death dates are estimated based on tax lists and legal records. Seventeen years apart would not indicate a father and son because the males married between the ages of 25 and 30, never 17. If you are also fortunate enough to have bygdeboks written for parishes by professional genealogical researchers, you can count on the fact that they have dug up all the information available. Finnbo is listed in the Biri Bygdebok and his children and grandchildren are also listed there. Everyone who lived at Hasli farm used the farm name to identify which Ole Olsen they were. They are not necessarily relatives, but often workers on the farm.
My main frustration is the fact that not only have I wasted tons of time researching individuals who are already in the FS system, but I have added to the duplication of individuals because the Possible Duplicates feature failed to bring him up when I first looked. As far as the Find feature goes, I have found my person down a hundred spaces on the list. Who but a dedicated researcher would take the time to look through 100 individuals to find a duplicate that a computer should have recognized in the first few options?
Thank you for mentioning the request to eliminate the possible duplicates feature based on incorrect mergers. I had not thought of that problem.
Marcia Lee
Gordon Collett said: You ended up on this board because this board has two main purposes.
It is the board to post feedback about anything on FamilySearch including complaints, bug reports, and suggestions for new features or improvements. We have been reassured many times that every single post is read by a Family Search employee and forwarded to the proper people if appropriate. Often someone will reply from FamilySearch, but not always.
In its second role, this board is a user community discussion and help board. There are a lot of very nice people on this board who can answer all sorts of questions about how to work in FamilySearch and give information about helpful tricks and common pitfalls. They are just here because they want to help. Also, with the users answering many of the common questions, the engineers can spend their time working instead of answering questions.
As far as Norwegian research goes, my wife was born in Norway. Her family is in Hordaland, a lot of it in Stord, but with one branch from Buskerud. There is one wife of a g-g-uncle from Biri, but that is it for Oppland in her family.
The biggest trouble I run into with duplicates is that over the past few decades, many researchers have been thrilled to run across the bygdebøker and then copy out all the information for their ancestors. Over the years, while working in the IGI, New Family Search, and now Family Tree I have found some families on Stord with at least four sets of information, if not more, where the information has clearly been directly copied out of the bygdebok.
Then there are the extraction records in which some parishes were extracted two or three times.
So I can definitely sympathize with you. Depending on the parish I am working in, I don't stop looking for duplicates until I have found all three extraction records and all three copies of the bygdebok and the records that had temple work done in the 1920s. More often than not, the only way to find these is to modify the Find parameters over and over. I keep working because I know they are there. The Duplicates routine is usually not helpful because the information was entered in such different ways.
But I am hopeful that with the great capabilities that Family Tree has, including being able to enter alternate names that all get searched on and the ability to source things so well, maybe the day will come that someone new to Norwegian research will start at Family Tree to see what is there and not just pick up a bygdebok and start typing.
As a final note, if you find a person in the Find routine that is a hundred spaces below where he should be, post the exact search you did and what was wrong with the results. That way the engineers in charge of the Find routine can investigate and figure out what went wrong and use that information to improve the routine.
Ronald Tilby said: Marcia, I now understand why you think that two people born 17 years apart could be duplicates. You're using your knowledge of the time period (which wasn't mentioned in your first message) and your knowledge of the available records in the location and time period. It would be great if the possible duplicate matching logic was 'artificially intelligent' and took into account the knowledge and insight that thousands of researchers build up over a lifetime, but alas, that capability is still in the realms of science fiction.
In time periods and locations where available records make individual identification complex for experienced researchers, and all you really have are estimates and inferences regarding vital event dates and places, I don't think you can reasonably expect computer algorithms to be as smart as yourself or any other experienced researcher.
Some of the record matching computer algorithms used in previous systems committed grievous errors combining people who shouldn't have been. There are risks of the computer being wrong in both directions. I think we have to learn and work within the limitations of the current system.
Marcia Lee Sorenson said: Of course we do. Thank you for adding a sensible rationale to my frustration with the possible duplicates feature of FS/FT.
Marcia Lee Sorenson said: Gordon,
Forgive me for acting like a "know it all" and not thinking that you had done as much Norwegian research as me. I do realize that the bygdeboks are subject to human error depending upon the expertise and abilities of those who compiled the information. Despite that, they are a blessing that only Norway has, and I feel more confident in using their information than trying to read the usually awful handwriting of the priests. Now that Norway's records are digitized it is much easier to try to cross reference the bygdebok information, but as you say, many of these individuals were added in the 1920's when information was much harder to get.
I have not been as diligent as you and have not modified the Find parameters over and over the way you do because you know they are there. What do you mean by your statement that there are three extraction records and three copies of the bygdebok? The professional researcher and native Norwegian that has helped me throughout the years has told me that almost all of Norway's parish records have been extracted. I guess I should just quit working on my family tree. That is a negative attitude and I don't think I have ever said it before. Now that I am retired and can work on FT full time, I guess I will just have to be more diligent and try to find and fix errors.
By the way, I have family lines in Stord and Sveio and many other parishes in Rogaland and Hordaland. Their bygdeboks are much better than some I have searched. However, researchers in the Oymark Parish in Ostfold just finished their bygdebok series a few years ago and there are still some human errors evident. I guess making errors helps to humble us.
Marcia Lee
joe martel said: Thanks all for your replies to the issue. I love that GetSatisfaction provides a venue to voice concerns and the community to help out. It is also monitored by people at FS.
Possible duplicates has been tuned more tightly (smaller net) than the Find (wider net). You can go deeper and deeper into results in Find that seem far away from the Possible Dups. Possible dups is a gate to certain processes so we don't want to present "noise" to the user. As Gordon and Ron correctly describe, it is very hard to encapsulate human knowledge into a search algorithm that works for all times, locales, languages, circumstances... This algorithm is always being evaluated. So providing detailed data like PID and search results will help the engineers better understand what to change.
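The "smaller net" versus "wider net" distinction can be pictured as one similarity score filtered at two different cutoffs. The sketch below is a loose illustration only: the `similarity` measure, both threshold values, and the sample names are assumptions made up for this example, not anything FamilySearch has published.

```python
from difflib import SequenceMatcher

FIND_THRESHOLD = 0.6       # hypothetical wide net: many loosely similar results
DUPLICATE_THRESHOLD = 0.9  # hypothetical narrow net: only near-identical results


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search(query: str, names: list[str], threshold: float) -> list[str]:
    """Return names scoring at or above the threshold, best match first."""
    scored = [(similarity(query, name), name) for name in names]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Sample names are invented for illustration.
names = ["Finnbo Hasli", "Finn Haslien", "Ole Olsen"]
print(search("Finbo Hasli", names, DUPLICATE_THRESHOLD))
print(search("Finbo Hasli", names, FIND_THRESHOLD))
```

Raising the cutoff cuts the noise but also drops true matches whose data were entered differently. Lowering it recovers them at the price of longer result lists, which is the trade-off described in the comment above.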
Gordon Collett said: Please, please, please, don’t get discouraged or give up! Not when Norwegian genealogy is more fun than ever. Not only are the records so good, but they are so easy to access now. Not only are the digitized parish records so easy to work with, but the collections of transcribed records being done under the auspices of the Norwegian National Archives are growing very quickly. This helps a lot with the handwriting issue. Those transcriptions are far better than the Family Search Norwegian historical databases because they are being done by Norwegians who can better read the records than the Americans who extracted the records from microfilm back in the 70’s. They are also usually full transcriptions rather than indexes. In addition, the transcription refers back to the parish record where the original can be found. There have been records I have searched for over and over in the parish records and not been able to find, then had the transcription come online. Being able to see which page the record is on, along with who is before and after the one I am searching for, has helped me finally be able to find it in the parish record.
First off, regarding being told that “all of the records have been extracted.” That is not entirely true. To take Stord for example, the parish records have only been extracted from 1815 to about 1870. So there is plenty to find between 1725 and 1815 and between 1870 and 1910. Of the records that were extracted for Stord, some were sent for temple work, some never were but were just put in the Vital Records Index and still need all ordinance work. There is one entire extraction batch for Stord where all the girls have had baptisms and confirmations done but nothing else and the boys have had nothing done at all.
As another point, I find the bygdebøker and the extracted records to be complementary. I have seen several bygdebøker where the author did not bother to include anyone who died before age twelve or so. Those children are in the extracted records. But the bygdebok does include all the children that appear in census and other records whose christening apparently never got written down because they are just not in the parish christening records and so not in the extraction records either.
Basically, I view the status of Norwegian records in Family Tree to be that of a giant jigsaw puzzle with some fairly good sized blocks completed but with a huge number of loose pieces still rattling around. In addition, half the pieces are missing. I’m having a lot of fun assembling my wife’s part of the puzzle and finding a surprising amount of temple work that needs to be done despite the efforts of the extraction program.
As far as there being multiple copies of extraction records, here is a good example:
https://familysearch.org/tree/#view=a...
When you look under marriage records, you will see that there are two extracted records for Ole’s first marriage, none for his second marriage that I have been able to find, and four for his third marriage.
What I meant by there being multiple copies of some bygdebøker in Family Tree is that there have been various researchers in past years who have diligently worked through a bygdebok and submitted the information to one of the predecessors of Family Tree. I run into them so often that they feel like old friends. Kjell Gunnar and Marvin must have both done the entire Fusa bygdebok because I run into them in those records over and over. There are two other people who also have done a lot of copying out of the Fusa bygdebok because when I find duplicates from them, the only information is the person’s name exactly as it is in the bygdebok with a birth year and place, death year and place and family structure just like it is in the bygdebok. There have … [truncated]
Chuckie King said: Well, it's not just Scandinavian names. I find it odd that my Ancestral Quest can find 2-3 nearly exact duplicates, when FT can find none. The original poster is right. Until FS fixes this, more duplicates will be added to the system.
Marcia Lee Sorenson said: Tusen takk Chuckie,
(Thousand thanks in Norwegian),
Have you sent that feedback to the Family Search engineers?
Robert Wren said: I was about to start a new thread, but ran across this one. Here's an example of a "possible duplication" problem (by NOT including one).
Could someone offer me an explanation of why these two individuals do not appear automatically when selecting "Possible Duplicates" https://familysearch.org/tree/#view=m...
The names are exactly the same, the birth YEAR is the same, but KV27-829 has TWO sets of parents, for the two DIFFERENT individuals that were 'combined' apparently through the ubiquitous "familysearch" data migration.
The recent change record shows the two different birth sources:
https://familysearch.org/ark:/61903/1...
https://familysearch.org/ark:/61903/1...
(One is actually one I have looked for quite a while - YEA)
The dates alternate back and forth with 'familysearch' inputs. I can probably 'correct' it, but "been there, done that" elsewhere - only to find it changed again, either by FS (demonstrating, again, the problems with auto entries and unsourced changes) or another individual.
In my opinion, these problems with "duplicates" actually cause a great number of ADDITIONAL duplicates to be created, because the criteria seem inadequate to find them all.
I'm sure the response will be 'it will all be corrected when we stop the transfer from nFS.'
Hopefully SOON!!
FWIW, this is the latest of many feedbacks I have made concerning various tree problems. The 'best' response so far was being advised of the getsatisfaction forum, which does offer a little response, but little in solutions as far as I can see.
Roy F Hunt said: Robert, The two records do not appear automatically because it is obvious that they are different people. The names and birth years may be the same, but the relationships, wives and children, are totally different. Mark them "not a match" and be done with them.
Kenneth Ferguson said: I find the system works extremely well when resolving duplicate record issues. It does take a little time to learn how it works but if you follow instructions it works well.
This discussion has been closed.