Why are Anna and Anne not equivalent in Family Tree Find?
Why does the fuzziness of the Find routine in Family Tree not consider Anna and Anne to be the same name:
vs.
The only difference in the searches is the change from Anne to Anna.
Using a wild card helps:
Strangely enough, not using a wild card and just dropping the final vowel helps:
Using Ann finds Anne, Ann, and Ane.
(The person I was looking for was Anne Marie Knutsdatter born 1893 at Stord)
Answers
-
The same reason Ferencz and Ferenc are treated as different species, even though they're Exactly The Same Thing, just using different versions of the spelling rules (pre- and post-WWI, roughly, respectively)?
Find brings up all sorts of people named Louis when I search for Lajos, and Stephen, Stephan, Steve, etc. when I search for István. It breaks on the obvious equivalents. I suspect this is because nobody is in the habit of entering Anna as an alternate name for Anne, or Ferencz as an alternate name for Ferenc. What we're seeing as "the search algorithm knows this equivalence" is actually "this equivalent has been entered as an alternate name". In other words, I don't think Find's name matching is fuzzy at all. It's quite literal-minded, in fact.
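To make that guess concrete, here is a minimal Python sketch of what purely literal matching backed by entered alternates would look like. Everything in it is an assumption for illustration: the `ALTERNATES` table, its entries, and the `literal_match` function are hypothetical and are not FamilySearch's actual data or code.

```python
# Hypothetical table of alternate names that users have entered.
# Illustrative entries only; this is not real FamilySearch data.
ALTERNATES = {
    "lajos": {"louis"},
    "istván": {"stephen", "stephan", "steve"},
}


def literal_match(query: str, candidate: str) -> bool:
    """Match only on identical spelling or on an entered alternate name."""
    q, c = query.casefold(), candidate.casefold()
    if q == c:
        return True
    # Look the pair up in both directions; no fuzziness is involved at all.
    return c in ALTERNATES.get(q, set()) or q in ALTERNATES.get(c, set())


print(literal_match("Lajos", "Louis"))     # an entered alternate
print(literal_match("Anna", "Anne"))       # nobody entered this pair
print(literal_match("Ferenc", "Ferencz"))  # nobody entered this pair
```

Under this model a translation pair matches as soon as someone records it, while a one-letter variant that nobody bothers to record never does, which is the pattern described above.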
0 -
Just to say this is nothing unique to FamilySearch. If I use a "Similar Sounding Variations" or "Name Soundex" option on different websites I am provided with a long list of names that appear to have nothing whatsoever in common with the inputted (sur)name. However, what is usually not produced are names with just one letter that is different - including even an added "s" at the end of the name! (E.g. no DANIELS results for a search on DANIEL, and vice versa.)
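For what it's worth, the DANIEL/DANIELS behaviour is what classic American Soundex would produce. The sketch below is my own minimal implementation of that standard scheme; I am only assuming that a "Name Soundex" option works along these lines, and no particular website is confirmed to use exactly this code.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter plus three digits."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit

    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""

    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate consonants with the same code
        digit = codes.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad or truncate to four characters


for n in ("Daniel", "Daniels", "Anna", "Anne"):
    print(n, soundex(n))
```

The trailing "s" adds a digit, so the two surnames get different codes and a strict code comparison separates them. Anna and Anne, on the other hand, get the same code, because Soundex ignores vowels after the first letter.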
0 -
The phonetic matching algorithm that FamilySearch uses is clearly under development, because over the years the close matches I get from searches I run repeatedly have changed. I am all for more development. In particular I want "sounds like" to be expanded to include "looks like".
For example, transcription errors are very common for certain handwritten letters that tend to be almost indistinguishable: L and S and Z, M and W, I and J, C and O and G, n and u. Et cetera. There are also spelling variations in the original documents that are exact equivalents. For example, Dutch i and ij. FamilySearch treats i and ij as identical even when exact match is turned on.
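A "looks like" comparison along these lines could be sketched as follows. This is only an illustration of the idea in Python: the letter groups are the ones listed above, while `LOOKALIKE_GROUPS`, `lookalike_key`, and `looks_like` are names I made up, and nothing here reflects how FamilySearch actually works.

```python
# Letters that are easily confused in handwriting, as listed above.
# Case matters: L/S/Z are capital-letter confusions, n/u lowercase ones.
LOOKALIKE_GROUPS = ["LSZ", "MW", "IJ", "COG", "nu"]

# Map every letter in a group to the first letter of that group.
CANON = {ch: group[0] for group in LOOKALIKE_GROUPS for ch in group}


def lookalike_key(name: str) -> str:
    """Reduce a name to a key that ignores the confusable letters."""
    name = name.replace("ij", "i")  # treat Dutch ij and i as one spelling
    return "".join(CANON.get(ch, ch) for ch in name)


def looks_like(a: str, b: str) -> bool:
    return lookalike_key(a) == lookalike_key(b)


print(looks_like("Larsen", "Sarsen"))  # L read as S
print(looks_like("Hansen", "Hauseu"))  # n read as u
print(looks_like("Dijk", "Dik"))       # ij versus i
print(looks_like("Anna", "Anne"))      # a and e are not in any group
```

A real version would need far more groups and probably a score rather than a yes/no answer, but even this small table catches the transcription slips mentioned above.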
1 -
From the searching I have done on FS, it appears that using wildcards turns it into an exact search. You will get no matches that don't exactly match your pattern.
The search algorithm not allowing for extra letters is especially problematic for me. In 18th century German areas, it was common to write the woman's name in the possessive. That usually meant adding an 's' or 'n' to the end. So I have to use a wildcard with every search to account for possible extra letters. And that brings back all the problems of not having a fuzzy search.
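Here is a small Python sketch of the behaviour I mean, where a wildcard pattern is matched literally apart from the wildcard itself. It uses the standard library's `fnmatch` module as a stand-in; `wildcard_match` and the example surnames are hypothetical, and I am only assuming FS behaves similarly.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, name: str) -> bool:
    """Case-insensitive match where * stands for any run of characters."""
    # Everything outside the wildcard must match letter for letter.
    return fnmatchcase(name.casefold(), pattern.casefold())


# A trailing * picks up the possessive endings...
print(wildcard_match("Meyer*", "Meyers"))
print(wildcard_match("Meyer*", "Meyern"))
# ...but any other spelling difference is a miss.
print(wildcard_match("Meyer*", "Maier"))
print(wildcard_match("Ann*", "Ane"))
```

So the wildcard covers the added 's' or 'n', but the rest of the name then has to be spelled exactly as typed.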
1