Use the Levenshtein distance algorithm to accept and correct misspellings of place names
Let's take the example of Albuquerque, an American city of half a million people. And let's say some users want to add a death place of Albuquerque for their ancestors. If they try Albuquerqe, the system won't find a matching standardized place name. If they switch to Albuqerque, it's also not a match. If they haven't given up by this point, they might find out that neither Albequerque nor Abaquerque is recognized by FamilySearch.
I think this problem is happening because of how FamilySearch has implemented a controlled vocabulary for its place names. I actually think that having standardized place names is a great idea. But if a user can come up with 10 out of the 11 characters in a place name, and they have these letters in the correct order, how helpful is it to reject their non-standardized submission without offering a standardized spelling that could match what they intended to write?
I think an implementation of the Levenshtein distance algorithm would resolve this problem. The algorithm could evaluate the place name offered by the user and quickly identify the closest match from the existing list of standardized places. That would be a great way to meet users halfway when they already have most of the answer for a place name and just need help getting the letters in the right order, or adding a missing letter or two, so that their submission will match the standardized format.
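To make the idea concrete, here is a minimal sketch in Python of how such a lookup might work. It is not FamilySearch's code: the function names, the tiny `STANDARDIZED_PLACES` list, and the `max_distance` cutoff of 2 are all assumptions made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


# Hypothetical stand-in for the real list of standardized places.
STANDARDIZED_PLACES = ["Albuquerque", "Albany", "Alberta", "Albion"]


def closest_place(query, places=STANDARDIZED_PLACES, max_distance=2):
    """Return the standardized name nearest to query, or None when
    nothing is within max_distance edits (an assumed cutoff)."""
    best = min(places, key=lambda p: levenshtein(query.lower(), p.lower()))
    if levenshtein(query.lower(), best.lower()) <= max_distance:
        return best
    return None


for attempt in ["Albuquerqe", "Albuqerque", "Albequerque", "Abaquerque"]:
    print(attempt, "->", closest_place(attempt))
```

Under these assumptions, all four misspellings from the example resolve to Albuquerque. One caveat: plain Levenshtein distance counts two swapped neighboring letters as two edits, so if letter-order mistakes turn out to be common, the Damerau-Levenshtein variant, which counts such a swap as a single edit, might be a better fit.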
Comments
I don't know what algorithm FamilySearch is using to assist users while entering places and standardizing them. But in this Albuquerque example - IF the user pays attention while entering "Albuq" they will indeed see the suggested standardized place "Albuquerque, Bernalillo, New Mexico, United States" along with about 20 other standardized suggestions from all over the globe (or at least that is what I am seeing).
Of course, this means the user won't even need to enter 10 of the 11 characters, but it does assume they pay attention to the suggested locations as they type.