
Auto-standardization error: Hungary to Uganda

Julia Szent-Györgyi ✭✭✭✭✭
March 12, 2022 edited July 30, 2024 in Search

There's no way that the enormous problem created by auto-standardization can be fixed piecemeal like this, but this is my parents' hometown, and it is very definitely NOT in Uganda, so I'm reporting it as instructed.

Search result with hundreds of them: https://www.familysearch.org/search/record/results?count=100&q.anyDate.from=1000&q.anyDate.to=1900&q.anyPlace=B%2C%20Wakiso%2C%20Eastern%2C%20Uganda%20&f.collectionId=1787825

Randomly-chosen individual example: https://www.familysearch.org/ark:/61903/1:1:66D3-1RWM

Death Place   B, Wakiso, Eastern, Uganda

Death Place (Original)   B Gyarmath

The correct place is Balassagyarmat, Nógrád, Hungary, which in these records is often abbreviated to B Gyarmat or similar. In older records it's sometimes two words or hyphenated: Balassa Gyarmat or Balassa-Gyarmat. (Decorative 'h' at the end optional.)
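The spelling variants listed above can be sketched as a single pattern. This is only an illustration, not how FamilySearch's standardizer works: `standardize_place`, `STANDARD_NAME`, and the regular expression are hypothetical names, and accepting a period after the initial ("B. Gyarmat") is an assumption about what "or similar" covers.

```python
import re

# Hypothetical pattern covering only the spellings mentioned in this post:
# "B Gyarmat", "Balassa Gyarmat", "Balassa-Gyarmat", "Balassagyarmat",
# each with an optional trailing "h". The optional period is an assumption.
_BGYARMAT = re.compile(r"^b(?:alassa)?[\s.\-]*gyarmath?$", re.IGNORECASE)

STANDARD_NAME = "Balassagyarmat, Nógrád, Hungary"


def standardize_place(original):
    """Return the standard name if the text looks like Balassagyarmat, else None."""
    if _BGYARMAT.match(original.strip()):
        return STANDARD_NAME
    return None


for text in ["B Gyarmath", "Balassa-Gyarmat", "B, Wakiso, Eastern, Uganda"]:
    print(repr(text), "->", standardize_place(text))
```

A rule this narrow would only be safe inside a collection already known to be Hungarian; applied globally, a bare "B" prefix is exactly the kind of ambiguity that produced the Uganda match.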


Best Answer

  • N Tychonievich
    N Tychonievich ✭✭✭✭✭
    April 3, 2022 Answer ✓

    @Julia Szent-Györgyi Sorry this one was not addressed sooner. We will report the inaccurate auto-standardization to engineers.


Answers

  • dontiknowyou
    dontiknowyou ✭✭✭✭✭
    March 12, 2022

    Hopefully, corrections won't be piecemeal. That's why these samples need to be escalated to the engineering team.

  • Paul W
    Paul W ✭✭✭✭✭
    March 13, 2022

    I agree that the problem is far too big to be addressed in this way. The engineers must have enough examples by now to realise this is a project that has not met expectations, so - if possible - should be totally abandoned. Probably too late for that, but what a mess this auto-standardization exercise has caused!

  • dontiknowyou
    dontiknowyou ✭✭✭✭✭
    March 14, 2022

    My take on this situation is very different from @Paul W's. To me, this is a wonderful achievement that saves incredible amounts of repetitive labor; it just needs some improvements here and there.

    Robo-standardization has been going on behind the scenes for years. It was apparent in the occasionally inexplicable presence or absence of certain records from Search results. Now we are able to see a little more of what goes on, and can help to fine-tune the mapping from the place name as written to the place name in the gazetteer.

  • Julia Szent-Györgyi
    Julia Szent-Györgyi ✭✭✭✭✭
    March 14, 2022

    @dontiknowyou, it's not "here and there". It's hundreds of thousands of errors.

    See for example Gordon's demonstration (https://community.familysearch.org/en/discussion/116744/errors#latest) using the 1891 census of Norway: 682+318+44+1000=2044 on the wrong continent, 595+24+369+1104=2092 in the wrong country -- over 4100 errors just for people with names starting with A, in just this one collection from just this one (smallish) country.

  • dontiknowyou
    dontiknowyou ✭✭✭✭✭
    March 14, 2022

    4100 out of how many?

    To me, what this says is that the gazetteer needs to be a higher priority.

  • Julia Szent-Györgyi
    Julia Szent-Györgyi ✭✭✭✭✭
    March 15, 2022

    4100 out of 282000, which is a 1.5% failure rate.

    Or take the "Hungary, Jewish Vital Records Index", which is a partially-published ongoing indexing project. There are currently 29339 results for a surname beginning with K. 527 of them have a birth, marriage, death, or other place outside of continental Europe, and another 60 have events in countries that have no overlap with Austria-Hungary. (For example, Spain or Greece.) That's a solid 2% failure rate.
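    For anyone who wants to check the arithmetic, here is a minimal sketch using only the counts quoted in this thread. The helper name `failure_rate` is made up for illustration, and the totals are the search-result counts as reported above, not independently verified figures.

```python
def failure_rate(errors, total):
    """Return the share of erroneous records as a percentage."""
    return 100.0 * errors / total


# 1891 census of Norway, names starting with A (counts quoted in the thread).
norway_wrong_continent = 682 + 318 + 44 + 1000
norway_wrong_country = 595 + 24 + 369 + 1104
norway_errors = norway_wrong_continent + norway_wrong_country
norway_rate = failure_rate(norway_errors, 282000)

# Hungary, Jewish Vital Records Index, surnames starting with K.
hungary_errors = 527 + 60
hungary_rate = failure_rate(hungary_errors, 29339)

print(f"Norway: {norway_errors} errors, {norway_rate:.1f}%")
print(f"Hungary: {hungary_errors} errors, {hungary_rate:.1f}%")
```

    Both rates count only placements that are obviously wrong (wrong continent or wrong country), so they are lower bounds; errors within the right country are not included.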

  • dontiknowyou
    dontiknowyou ✭✭✭✭✭
    March 15, 2022

    Sounds like a basic data validation step has not been implemented: screen each collection for place names out of scope for the collection. That will catch almost all of this cruft.

    The Search result filters are making it easier for us to spot these batches of incorrectly localized records, aren't they?
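    The screening step suggested here could look roughly like the sketch below. Everything in it is hypothetical: the record layout, the `country_of` and `out_of_scope` helpers, and the allowed-country set are assumptions for illustration, not FamilySearch's actual data model or process.

```python
def country_of(place):
    """Take the last comma-separated part of a standardized place as its country."""
    return place.rsplit(",", 1)[-1].strip()


def out_of_scope(records, allowed_countries):
    """Return the records whose standardized country is outside the collection's scope."""
    return [
        record
        for record in records
        if country_of(record["standardized"]) not in allowed_countries
    ]


# Illustrative records; the first one is the example from this thread.
records = [
    {"original": "B Gyarmath", "standardized": "B, Wakiso, Eastern, Uganda"},
    {"original": "Balassagyarmat", "standardized": "Balassagyarmat, Nógrád, Hungary"},
]

# Assumed scope for a Hungarian civil registration collection (illustrative only).
allowed = {"Hungary"}

for record in out_of_scope(records, allowed):
    print("Needs review:", record["original"], "->", record["standardized"])
```

    A real scope list would need the historical jurisdictions a collection covers, so a flagged record is a candidate for review rather than a confirmed error.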

This discussion has been closed.