Use Levenshtein distance algorithm to accept and correct misspellings for place names

Sterling Fluharty
August 14, 2022 in Suggest an Idea

Let's take the example of Albuquerque, an American city of half a million people. And let's say some users want to add a death place of Albuquerque for their ancestors. If they try Albuquerqe, the system won't find a matching standardized place name. If they switch to Albuqerque, it's also not a match. If they haven't given up by this point, they might find out that neither Albequerque nor Abaquerque is recognized by FamilySearch.

I think this problem is happening because of how FamilySearch has implemented a controlled vocabulary for its place names. I actually think that having standardized place names is a great idea. But if a user can come up with 10 out of the 11 characters in a place name, and they have these letters in the correct order, how helpful is it to reject their non-standardized submission without offering a standardized spelling that could match what they intended to write?

I think an implementation of the Levenshtein distance algorithm would resolve this problem. The algorithm could evaluate the place name offered by the user and quickly identify the closest match from the existing list of standardized places. That would be a great way to meet users halfway when they already have most of the answer for a place name and just need help getting the letters in the right order, or adding a missing letter or two, so that their submission will match the standardized format.
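
As a rough sketch of the idea (not how FamilySearch actually works), the matching step could look something like the Python below. The `STANDARD_PLACES` list, the `suggest_place` helper, and the `max_distance` cutoff of 2 are all made up for illustration; a real standardized place list would be far larger and would probably need an index rather than a full scan.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # drop ca
                current[j - 1] + 1,      # add cb
                previous[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical sample data, not the real FamilySearch place list.
STANDARD_PLACES = [
    "Albuquerque, Bernalillo, New Mexico, United States",
    "Albany, New York, United States",
    "Alberta, Canada",
]


def suggest_place(typed, places=STANDARD_PLACES, max_distance=2):
    """Return the standardized place whose first component is closest
    to what the user typed, or None if nothing is within max_distance."""
    typed = typed.strip().lower()
    best, best_distance = None, max_distance + 1
    for place in places:
        name = place.split(",")[0].strip().lower()
        distance = levenshtein(typed, name)
        if distance < best_distance:
            best, best_distance = place, distance
    return best


for attempt in ["Albuquerqe", "Albuqerque", "Albequerque", "Abaquerque"]:
    print(attempt, "->", suggest_place(attempt))
```

With this sample list, all four misspellings from the example come back as "Albuquerque, Bernalillo, New Mexico, United States". The first three are one edit away from the correct spelling and Abaquerque is two edits away, so a cutoff of 2 catches them all.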

Tagged:
  • Place Names
  • Standardized places
2 votes

Active · Last Updated August 14, 2022

Comments

  • genthusiast
    August 14, 2022 edited August 14, 2022

    I don't know what algorithm FamilySearch uses to assist users while they enter and standardize places. But in this Albuquerque example, if the user pays attention while entering "Albuq", they will see the suggested standardized place "Albuquerque, Bernalillo, New Mexico, United States" along with about 20 other standardized suggestions from all over the globe (or at least that is what I am seeing).

    image.png

    Of course, this means the user won't even need to enter 10 of the 11 characters, but it assumes they pay attention to the suggested locations as they type.
