Merge is alerting not helpful

Julia Szent-Györgyi · June 13, 2022

I went to merge two profiles for the same guy. One of the profiles had his death as "after 1835", the other as "24 March 1846". The first merge screen yelled at me in red: "These 2 people died more than 3 years apart." Huh?? No, they didn't: "after 1835" covers several decades of possible death dates, and 1846 fits the range perfectly well.

So then I went to merge two profiles that had identical names, except one name had the language as Hungarian, the other as Other. I got a Merge Warnings box across the top: "These persons may be twins". Uh, no, they very definitely may not be twins, because twins have different names.

Gail Swihart Watson · June 13, 2022

Julia, I’m so sorry! I am working in the tree and in sources for many hours a day now and I find so many things frustrating that slow me down. My frequent issues are different from yours because I rarely merge with my current research. My problems are with linking sources, moving sources from one person to another, adding sources, searching for new sources and searching the tree. I get errors or blank screens almost constantly. I am beginning to lose hope that it will ever be fixed. What is going on?

Gordon Collett · June 13, 2022

I agree that the warning messages need some work.

Regarding @Gail Swihart Watson's comment, what I am hoping is going on is a bunch of behind the scenes work we are not being completely informed about.

Quite a while back in one of his presentations Ron Tanner reported that the entire FamilySearch website had to completely shift its underlying engine. I forget the terms he used. This change gave the opportunity to basically rewrite the entire site and update its code. Over the past year or so we have seen a new merge routine, new search pages, a new change log, a new appearance for various lists, new summary cards in some areas, a new view relationship screen in some areas, a new source box, and other updates. We can see that the various pedigree screens are probably close to being updated to match the new First Ancestor pedigree view. Ron stated that these are not just cosmetic changes but are part of that process of completely updating things. I suspect so many people are reporting frustration working in the site, like you are, because they have reached a critical point in which there is so much new code that making the old code work with it is becoming a waste of time and resources. With the trouble you see working with linking and moving sources, I have to wonder if we are going to see a brand new Source Linker this summer.

I do wish they would scatter a generous number of "Warning - Roadwork Ahead" signs in places where they know people are having problems due to the ground up pavement, sharp drop offs, bumps, and detours.

Cheryl Viering · June 13, 2022

I do a lot of merging, and see these warnings a lot. Family Search doesn't seem to handle date ranges well. You'll see the same warning with "about 1835" & "1846".

Putting an "about" date of birth would seem to be a good idea, as it lets others easily see if that person could reasonably be the one they are searching for. But it gets treated like an exact date, and screws up the search algorithm. It makes the person impossible to be found by searching with the correct date. Also, it seems like the qualifiers like "about" & "after" sometimes get dropped during merges.

I also get a lot of warnings about merging "twins". It's best to just chuckle at the ridiculousness.

dontiknowyou · June 14, 2022

I also do a lot of merging and see these warnings, which I must point out appear only on pairs of profiles that are so complete that it is certain they are not twins, they are duplicates. The "may be twins" warning on duplicate records must be off-putting to newer contributors!

Julia Szent-Györgyi · June 14, 2022

As a twin myself, the "may be twins" warning on Exactly The Same Name bugs me especially badly. I mean, really? You consider it somehow reasonable and even likely that parents would give both babies the same exact name, as if they were a single person? Does it actually work like that in your universe?

CookeWilliamB1 · June 14, 2022

All FS wants to do is to ensure that you the patron are 100 percent sure that the merge you are doing is RIGHT.

Julia Szent-Györgyi · June 14, 2022

@CookeWilliamB1, what FS is actually doing is crying wolf: they're alerting to such utterly impossible things that one learns to just ignore anything in red. In fact, it's best to not even read the warnings, to avoid the spike in blood pressure caused by the infuriating ridiculousness.

Gordon Collett · June 15, 2022

I have tried as many configurations of siblings with the same name and birth information as I could come up with and can't trigger that possible twin warning, even with the two copies' names set to different languages. Was there anything else on the records that could have given any possibility that they might be two different people? I've always found it an interesting challenge to try and figure out the programming logic in this sort of strange behavior.

Julia Szent-Györgyi · June 15, 2022

The surviving profile is Kirchlehner Borbála LKK7-FQK, and the deleted one was Borbála Kirchlehner GX48-258. I cannot go back and check the name's settings on the merge-deleted one, unfortunately, but I assume it was set to either Other or German.

Hmm, perhaps the language was set to Hungarian both times, but the name got entered the wrong way 'round in the deleted one? It's an incredibly easy mistake to make when working on multilingual families...

Gordon Collett · June 15, 2022

So there is the bug. You got me to think of a variation I had not tried. Here is a family I made in Beta:

Screen Shot 2022-06-14 at 6.59.13 PM.png

These siblings with same name and same birth data and place are flagged as duplicates. When I merge, things are fine:

Screen Shot 2022-06-14 at 7.01.13 PM.png

If I change one of them to Hungarian, I see this:

Screen Shot 2022-06-14 at 7.03.00 PM.png

They are still flagged as possible duplicates because they have the same first name, same last name, and same birth date and place. However, when I go to merge them, I see:

Screen Shot 2022-06-14 at 7.04.42 PM.png

Apparently, the Merge Warning routine is ignoring the language setting and just looking at the first box for the name and the second box for the name, that is, it is comparing the first name of one with the last name of the other and vice versa.

To confirm this, I set both back to English then reversed the first and last name of one of them to get this:

Screen Shot 2022-06-14 at 7.09.21 PM.png

Notice that since the names are now different, they are no longer flagged as possible duplicates. If I try to merge by ID, I get, as you would expect, the possible twin warning.

So this is a reportable bug: the possible duplicate routine correctly takes into account the name template and name order in comparing two names. However, the merge warning routine ignores the name template and name order in determining if two profiles could be twins, and incorrectly posts the warning flag for identical names that just happen to have a different name order due to the template being used.

One last test, I will set the first child's template to Hungarian again, then enter the name backwards:

Screen Shot 2022-06-14 at 7.24.23 PM.png

Here they look the same but the possible duplicates routine refuses to identify them as the same because one is actually John James and the other James John and so they are probably twins. Merging by ID, however, has no problem with merging them and no warning is given even though their names really are completely different:

Screen Shot 2022-06-14 at 7.27.20 PM.png
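
The behavior seen in these tests can be sketched in a few lines of Python. This is only an illustration of the suspected logic, not FamilySearch's actual code: the `NameEntry` type, the `SURNAME_FIRST` set, and both comparison functions are hypothetical names made up for the example, and the assumption is simply that the Hungarian template puts the family name in the first box.

```python
from dataclasses import dataclass

# Assumption: templates listed here show the family name in the first box.
SURNAME_FIRST = {"hu"}


@dataclass(frozen=True)
class NameEntry:
    first_box: str   # whatever was typed into the first name box
    second_box: str  # whatever was typed into the second name box
    template: str    # hypothetical language/template code, e.g. "en" or "hu"


def positional_match(a: NameEntry, b: NameEntry) -> bool:
    """Compare box to box and ignore the template (the suspected merge-warning logic)."""
    return (a.first_box, a.second_box) == (b.first_box, b.second_box)


def to_given_family(n: NameEntry) -> tuple[str, str]:
    """Return (given name, family name), using the template to decide box order."""
    if n.template in SURNAME_FIRST:
        return (n.second_box, n.first_box)
    return (n.first_box, n.second_box)


def template_aware_match(a: NameEntry, b: NameEntry) -> bool:
    """Compare given name to given name and family name to family name."""
    return to_given_family(a) == to_given_family(b)


english = NameEntry("John", "James", "en")          # given John, family James
hungarian = NameEntry("James", "John", "hu")        # same person, family name first
hungarian_backwards = NameEntry("John", "James", "hu")  # given James, family John

print(positional_match(english, hungarian))                # False: would look like "twins"
print(template_aware_match(english, hungarian))            # True: really the same name
print(positional_match(english, hungarian_backwards))      # True: no warning given
print(template_aware_match(english, hungarian_backwards))  # False: names really differ
```

If the merge warning used something like `template_aware_match`, the first pair would merge without a twin warning and the last pair would get one, which is the opposite of what the screenshots show.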

Do you want me to rework this and post a more coherent bug report? Or do you want to?

Cheryl Viering · June 15, 2022

I've seen that twin warning a lot. I don't usually look at the language of the name.

It tends to happen when a couple gets merged with duplicates, and one of their children has a duplicate shown as a sibling.

Most of the people I am working with were created at the same time, from a single source. I checked a few random ones, and they were all set to language 'other'. The duplicate was usually created by someone entering data in by hand, and attaching no sources.

Gordon Collett · June 15, 2022

Take a good look next time you see this, Cheryl, and note if the names are truly identical or not. Identical names should not be flagged but similar names should be.

Julia Szent-Györgyi · June 15, 2022

I think you covered all the possibilities, Gordon, and your comment seems to me to be a sufficient bug report, but if you feel the need to post it again separately, it can't hurt.

KennethRLee · June 15, 2022

For all those who are on this message trail, remember computers are basically dumb, or rather they are very literal. A date that says "about xxxx" is taken by the computer to mean that the date is xxxx, so as far as the computer can tell the records being merged were for 1835 and 1846, so, yeah, they were more than three years apart. As for the twins issue, the computer sees the same parents with the same birth date, so it suggests they may be twins. Could these be handled by the program? Yes, but it is not as easy as you might think, because it is very hard to program in common sense. Also common sense is not so easy to codify, even for ourselves, as you would think. I generally think of "about" as meaning within five years of the given date. Others may think within ten years. Can you really say either of us is "right"?

Note: I am not associated with the writing of the code for the merge, but have worked in IT for 40 years and this answer is based on my experience in the field.

Julia Szent-Györgyi · June 15, 2022

@KennethRLee, a date with a qualifier shouldn't be interpreted as exact by the computer. There's no point to allowing "before" or "after" or "about" if it's just going to be ignored. There's no excuse for that much laziness in programming.

A twins warning for same parents and same birthday but different names makes some sense. (Although it's a complicated question: does it alert to nicknames, such as Sally versus Sarah, or to translations, such as Lipót versus Leopold?) The problem is, I was getting it for profiles with identical names, except for the displayed ordering. It's a bug that the merge routine goes by appearance rather than function, and thus gives a possible twins alert when twins is one thing that those profiles cannot possibly be.

Cheryl Viering · June 16, 2022

The software should be able to handle date qualifiers. For example, with the GRAMPS software, "about" defaults to +/- 50 years. Family Search simply needs to define a range for each valid qualifier. And hopefully document what it is, in a way that can be found.
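
As a rough illustration of what "define a range" could look like, here is a minimal Python sketch. Everything in it is an assumption for the sake of the example: the `YearRange` type, the function names, the 5-year `ABOUT_WINDOW`, the inclusive reading of "after" and "before", and the year bounds are all hypothetical, and nothing here reflects how FamilySearch actually stores or compares dates. The 3-year threshold is taken from the warning text quoted at the top of this thread.

```python
from dataclasses import dataclass

# Hypothetical values chosen only for this sketch.
MIN_YEAR, MAX_YEAR = 1, 9999
ABOUT_WINDOW = 5  # assumed +/- years for "about"; the right width is a policy choice


@dataclass(frozen=True)
class YearRange:
    start: int
    end: int


def to_range(qualifier: str, year: int, about_window: int = ABOUT_WINDOW) -> YearRange:
    """Turn a qualified year into the span of years it could mean."""
    if qualifier == "exact":
        return YearRange(year, year)
    if qualifier == "about":
        return YearRange(year - about_window, year + about_window)
    if qualifier == "after":
        return YearRange(year, MAX_YEAR)   # treated as inclusive here (an assumption)
    if qualifier == "before":
        return YearRange(MIN_YEAR, year)   # treated as inclusive here (an assumption)
    raise ValueError(f"unknown qualifier: {qualifier}")


def gap_in_years(a: YearRange, b: YearRange) -> int:
    """Smallest possible distance between two ranges; 0 if they overlap."""
    return max(0, a.start - b.end, b.start - a.end)


def should_warn(a: YearRange, b: YearRange, threshold: int = 3) -> bool:
    """Warn only when even the closest possible dates are too far apart."""
    return gap_in_years(a, b) > threshold


# "after 1835" versus "24 March 1846": the ranges overlap, so no warning.
print(should_warn(to_range("after", 1835), to_range("exact", 1846)))  # False
# Two exact years 11 years apart still warn.
print(should_warn(to_range("exact", 1835), to_range("exact", 1846)))  # True
```

With the assumed 5-year window, "about 1835" against 1846 would still warn (the closest possible years are 1840 and 1846), while a 10-year window would not. That is why the window needs to be defined and documented somewhere.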

Actually I've been thinking that a glossary is needed. Put definition/description of words like "Find", "Search", "Merge", "Feedback". Not many people realize that feedback really means "go to message boards", or that 'find' and 'search' aren't synonyms.

Julia Szent-Györgyi · June 16, 2022

I did some mucking about in Beta.

Entering "Testy Sally" (lang=hu) 1802-1900, it finds as a possible match Testy Sarah (lang=hu) 1802-1900, and entering "Laurence Testy" (lang=en) 1800-1900, it finds as a possible match Testy Lőrinc (lang=hu) 1800-1900.

Possible Dupes flags Sarah and Sally, but not Laurence and Lőrinc, not even if Laurence is changed to lang=hu.

Merge flags both Sarah/Sally and Laurence/Lőrinc as possible twins, regardless of language settings.

Possible Dupes doesn't find John/János, nor Johannes/János, but does flag Johan/Johannes and John/Johan. It does not flag John/Johannes.

Merge thinks John and Johan are possible twins.

So it appears that the add person routine casts the widest net; it seems to use the same name-matching engines or algorithms as Records-Search and/or Find. (I think the name-matching part of the process may be shared between those two.)

Possible Duplicates is smarter than Merge: it takes the name template into account, and "knows" about some name equivalents. It does not appear to use the Search or Find engines, however, because it fails to recognize many name equivalents that those algorithms routinely identify.

Merge is the most simplistic (read: dumbest) of the processes. It looks strictly at appearance, and even a single letter of difference is enough for it to say "twins". The net effect of this is to train users to ignore its warnings: most of the time, they're ridiculously wrong. (I know brothers named Jacob and James, so Sally and Sarah as twinsy-twins is unfortunately not impossible, but John and Johan? I think even the most clueless of parents would notice something wrong there.)
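
To show the difference between "any difference means twins" and a comparison that knows about name equivalents, here is a small Python sketch. The `EQUIVALENTS` table and the function names are hypothetical and exist only for this example. A real name-matching engine would need a far larger table and is certainly more sophisticated. Whether equivalent names such as Sally and Sarah should still get a warning is a policy choice. The sketch assumes they should not.

```python
# Hypothetical equivalence groups, built from the names tested above.
EQUIVALENTS = [
    {"sally", "sarah"},
    {"laurence", "lőrinc"},
    {"john", "johan", "johannes", "jános"},
]


def canonical(name: str) -> str:
    """Map a given name to one stable representative of its equivalence group."""
    key = name.strip().casefold()
    for group in EQUIVALENTS:
        if key in group:
            return min(group)  # any deterministic pick works
    return key


def classify_given_names(a: str, b: str) -> str:
    """Return 'identical', 'equivalent', or 'different' for two given names."""
    if a.strip().casefold() == b.strip().casefold():
        return "identical"
    if canonical(a) == canonical(b):
        return "equivalent"
    return "different"


def twin_warning(a: str, b: str) -> bool:
    """Suggest 'may be twins' only when the names are genuinely different."""
    return classify_given_names(a, b) == "different"


print(classify_given_names("John", "Johan"))  # equivalent
print(twin_warning("John", "Johan"))          # False
print(twin_warning("Jacob", "James"))         # True
```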

dontiknowyou · June 16, 2022

The net effect of this is to train users to ignore its warnings

I agree. And this is the lede, the important point of this discussion.

kathryngz · July 20, 2022

I'd like to offer a different opinion. I work in Family Tree almost daily. I do my own family history and assist many other users with theirs. By far most of the data errors I see in Family Tree are valid and need to be addressed.

This thread has called out two errors that definitely seem buggy: the twins error, and the date range error. Yes, they need to be fixed. But to conclude that these relatively infrequent errors lead all users to ignore all data errors seems somewhat exaggerated. It definitely doesn't match my experience.

I've also done software development, and my experience matches @KennethRLee's--this type of logic is not as simple as one might think. There are many more variables than most users realize.

Finally, if I could say a word about the FamilySearch developers--those I know are amazing, talented people who are working hard to provide a valuable service to the family history community. Instead of assuming that they are careless or worse, let's see them as teammates who want to move the work forward just like we do. If we find errors, yes, let's share them--respectfully and with the assumption that they will do their best to correct the errors.

CESchultz · July 21, 2022

Never include "after" or "before" for birth or death date ranges as it will not display properly in the tree and is of little value. Just leave the field blank if this is not known or no source has a correct date.

After 1835 can mean 1836 which is ten years from 1846 and it is correct to mention this is more than 3 years apart because it is.

Gail Swihart Watson · July 21, 2022

CESchultz please note that "after" or "before" have abbreviations in the FamilySearch link below that are recommended for use. When using after or before, you have a source which confirms an upper or lower bracket on the date when an event could have occurred. I don't know how you could say that is of little value.

In the example Julia Szent-Györgyi was talking about when she attempted to merge, I assume she studied the sources and was prepared to use whichever date was more correct. A specific full date implies there should be a source, but if people have not added any supporting source to the record, then I would never use it.

https://www.familysearch.org/en/wiki/Acronyms_and_Abbreviations

CESchultz · July 22, 2022

They have a use but not in the birth or death date fields. I used them all the time in the marriage date field for instance.

Using them in the birth and death date fields produces erroneous merge suggestions and record hints.

In my experience they have never helped find the correct information for either field and only contributed to other issues. I do use date ranges all the time when record searching but I leave those specific fields blank when I do not have any sources for them.

Also when looking at the tree view of a person it abbreviates these date fields to just the year and then people incorrectly copy this information elsewhere on the Internet as the actual birth or death date.

Gail Swihart Watson · July 22, 2022

CESchultz To be honest, it sounds like you have not really done much genealogy farther back than 1900. For many people who died before 1900, there is no confirmable death date. For many there is no tombstone or Bible record. Death registers are precious few and far between. The facts may be limited to the husband listed in the 1850 Census, but missing from the 1860 census and his wife was listed as a widow. It is absolutely correct to put the husband's death date as before 1860. Other examples are marriage records where the line for consent by the bride's parents is blank. That means she was 18 or older, and an inferred birth date can be given as before 18 years prior to the marriage date. There are so many examples. Use of "before", "after" or "between" are critically important in birth and death records.

dontiknowyou · July 22, 2022

I have learned not to use any date modifiers except "about"; about behaves reasonably as expected. I regret my past use of "before", "after", etc. and when I encounter them again I remove them. Live and learn.

Gail Swihart Watson · July 22, 2022

dontiknowyou Why do you make information LESS correct? Why is the display more important than quality of the information?

Should you ever stumble on a family of my second greats, Arthur Irvin Austin (1855–1936, L2HY-YL3) and Mary Belle McKittrick (1856–1926, L5GT-QX9), would you change the records of 2 of their children who died? I painstakingly worked out the approximate years of their birth from 1) gaps in the birth years of the surviving children and 2) a photo I have of the 6 children, all alive, but only 2 of whom I recognized. I also noted their deaths as before 1900 because they were missing from the 1900 Census by name, but reported as deceased.

Would you really advocate changing the dates to hard coded dates?

Paul W · July 22, 2022

@Gail Swihart Watson

I have worked on my genealogy for over thirty five years, so have plenty of experience in dealing with records of all time periods. I also hate the use of "before" and "after" inputs as they are completely meaningless and, outside of the Person pages, give the impression that these are dates that have been established.

Many users enter "after 1910" because that is the last trace they have of a relative they found in the census of that year. Maybe (as suggested earlier) they died many years later, but from a pedigree view page it appears they literally did die in 1910.

I now only use "about" - and, even then, only if I am convinced an event was within a short period +/- the year I have inputted. Anything else is not genealogically sound and can thoroughly mislead those who choose to carry out much of their work from Landscape, or other pages away from the Person one, where the true situation might be that little bit more obvious.

CESchultz · July 25, 2022

@Gail Swihart Watson I have worked extensively with genealogy pre-1900s, and "before" and "after" serve no purpose outside of searching records. If you are writing up a bio about someone and need to put something you can mention it there too, but the date fields in Family Search do not handle those well and it can produce erroneous merge suggestions and record hints. As I and Paul W mentioned, FS tree view abbreviates these dates, showing the person born or dead on that exact year. After working on hundreds of thousands of records they have never helped and I always remove them.

Gail Swihart Watson · July 25, 2022

I'm sorry we all agree to falsify information just for the sake of the precious pedigree view. Information as accurate as possible serves no purpose. I'm amazed. Good grief. Tell me why you all care how it looks in tree view more than how accurate it is. I want to know why accuracy serves no purpose. I'm sorry, but what this tells me is you are not detail-oriented. Take a look at my family group of ancestors, Arthur Irvin Austin (1855–1936, L2HY-YL3) and Mary Belle McKittrick (1856–1926, L5GT-QX9). There are 2 children of Arthur and Mary Belle whose dates of death are unknown, other than they were dead by 1900 AND I have a curious undated photo where all 6 children are alive. I did spend quite a lot of time figuring out who everyone was. I'm very curious to know what is a BETTER way for those 2 unfortunate children than my death dates of "before 1900".

Julia Szent-Györgyi · July 25, 2022

I agree with Gail that the failure of the tree and other abbreviated views to label modified dates is a fault of those views, not of the modified dates, and should not prevent us from using modifiers to accurately capture exactly what we know and don't know about "when". If I haven't found a death date, but I know the person was alive in 1911 but deceased by 1923, why shouldn't I enter "between 1911 and 1923" for the death date? (Well, apart from the fact that FS doesn't actually have "between", only "from ... to ...", which is, um, rather dumb, but anyway, my question stands.)

Paul W · July 25, 2022

I (and I'm sure others) have requested that at least some indication should be shown on pedigree pages to show a person was not really born, or did not die, in those years - if that's what it says on the Person Page. Even a simple "c1800 - c1880" would show these dates had not been firmly established, so (from a pedigree view) it could be clearly seen they were still in need of some research.

Even so, I still stand by my view that "after 1910" is totally meaningless. How can this possibly be useful, when the individual concerned might have died some thirty to forty years after that date?

I just don't understand how many users feel that every field available has to be completed. They even put (say), "of London, England" in the birthplace field, just because the person married in London - probably some 20-30 years (or longer) after their birth. They could have been born anywhere, at any time, so why input speculative suggestions, as though they represent facts?

Sadly, it seems to be beyond many users' comprehension that, if details of an event are completely without evidence, they can (and should) just leave the field blank.
