Auto-standardization error: Hungary to Uganda
There's no way that the enormous problem created by auto-standardization can be fixed piecemeal like this, but this is my parents' hometown, and it is very definitely NOT in Uganda, so I'm reporting it as instructed.
Search result with hundreds of them: https://www.familysearch.org/search/record/results?count=100&q.anyDate.from=1000&q.anyDate.to=1900&q.anyPlace=B%2C%20Wakiso%2C%20Eastern%2C%20Uganda%20&f.collectionId=1787825
Randomly-chosen individual example: https://www.familysearch.org/ark:/61903/1:1:66D3-1RWM
Death Place: B, Wakiso, Eastern, Uganda
Death Place (Original): B Gyarmath
The correct place is Balassagyarmat, Nógrád, Hungary, which in these records is often abbreviated to B Gyarmat or similar. In older records it's sometimes two words or hyphenated: Balassa Gyarmat or Balassa-Gyarmat. (Decorative 'h' at the end optional.)
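To make the variants concrete, here is a rough Python sketch of a pattern that recognizes the spellings listed above. The pattern and the function name `looks_like_balassagyarmat` are purely illustrative, not anything FamilySearch is known to use, and allowing a period after the "B" is an assumption. Real records will surely contain variants it misses.

```python
import re

# Hypothetical pattern for the spellings mentioned in this post:
# "B Gyarmat", "Balassa Gyarmat", "Balassa-Gyarmat", "Balassagyarmat",
# each with an optional trailing "h". The optional "." is an assumption.
BGYARMAT = re.compile(r"^b(?:alassa)?[\s.\-]*gyarmath?$", re.IGNORECASE)

def looks_like_balassagyarmat(place: str) -> bool:
    """Return True if the original place text matches one of the listed variants."""
    return BGYARMAT.match(place.strip()) is not None
```

For instance, `looks_like_balassagyarmat("B Gyarmath")` is true, while the auto-standardized string "B, Wakiso, Eastern, Uganda" does not match.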
Best Answer
-
@Julia Szent-Györgyi Sorry this one was not addressed sooner. We will report the inaccurate auto-standardization to engineers.
Answers
-
Hopefully, corrections won't be piecemeal. That's why these samples need to be escalated to the engineering team.
-
I agree that the problem is far too big to be addressed in this way. The engineers must have enough examples by now to realise this is a project that has not met expectations and so, if possible, should be totally abandoned. Probably too late for that, but what a mess this auto-standardization exercise has caused!
-
My take on this situation is very different from @Paul W's. To me, this is a wonderful achievement, saving incredible amounts of repetitive labor; just some improvements are needed here and there.
Robo-standardization has been going on behind the scenes for years. It was apparent in the occasionally inexplicable presence or absence of certain records from Search results. Now we are able to see a little more of what goes on, and can help to fine-tune the mapping from the place name as written to the place name in the gazetteer.
-
@dontiknowyou, it's not "here and there". It's hundreds of thousands of errors.
See for example Gordon's demonstration (https://community.familysearch.org/en/discussion/116744/errors#latest) using the 1891 census of Norway: 682+318+44+1000=2044 on the wrong continent, 595+24+369+1104=2092 in the wrong country -- over 4100 errors just for people with names starting with A, in just this one collection from just this one (smallish) country.
-
4100 out of how many?
To me, what this says is that the gazetteer needs to be a higher priority.
-
4100 out of 282000, which is a 1.5% failure rate.
Or take the "Hungary, Jewish Vital Records Index", which is a partially-published ongoing indexing project. There are currently 29339 results for a surname beginning with K. 527 of them have a birth, marriage, death, or other place outside of continental Europe, and another 60 have events in countries that have no overlap with Austria-Hungary. (For example, Spain or Greece.) That's a solid 2% failure rate.
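For anyone who wants to check the arithmetic, here is a small Python sketch using only the counts quoted in this thread. The helper name `failure_rate` is just for illustration.

```python
def failure_rate(errors: int, total: int) -> float:
    """Share of records with a misplaced event, as a percentage."""
    return 100.0 * errors / total

# Norway 1891 census, names starting with A (counts quoted in this thread).
wrong_continent = 682 + 318 + 44 + 1000   # 2044
wrong_country = 595 + 24 + 369 + 1104     # 2092
norway = failure_rate(wrong_continent + wrong_country, 282_000)

# Hungary, Jewish Vital Records Index, surnames starting with K.
hungary = failure_rate(527 + 60, 29_339)

print(f"Norway: {norway:.2f}%  Hungary: {hungary:.2f}%")
```

This gives about 1.47% for the Norwegian sample and 2.00% for the Hungarian one, in line with the rounded figures above.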
-
Sounds like a basic data validation step has not been implemented: screen each collection for place names out of scope for the collection. That will catch almost all of this cruft.
The Search result filters are making it easier for us to spot these batches of incorrectly localized records, aren't they?
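A minimal sketch of the kind of screen I mean, in Python. Everything here is an assumption for illustration: the collection name, the allow-list of countries, the record layout, and the idea that the country is the last comma-separated part of the standardized place. The second sample record is made up; I have no idea how the real pipeline is structured.

```python
from collections import Counter

# Hypothetical allow-list: countries a collection could plausibly cover.
EXPECTED_COUNTRIES = {
    "example Hungarian collection": {"Hungary", "Austria", "Slovakia", "Romania"},
}

def country_of(standardized_place: str) -> str:
    """Assume the country is the last comma-separated part of the place."""
    return standardized_place.rsplit(",", 1)[-1].strip()

def out_of_scope(collection: str, records: list[dict]) -> list[dict]:
    """Return records whose standardized place is outside the collection's scope."""
    allowed = EXPECTED_COUNTRIES[collection]
    return [r for r in records if country_of(r["standardized"]) not in allowed]

# Sample data: the first pair comes from this thread, the second is invented.
records = [
    {"original": "B Gyarmath", "standardized": "B, Wakiso, Eastern, Uganda"},
    {"original": "Balassa-Gyarmat", "standardized": "Balassagyarmat, Nógrád, Hungary"},
]

flagged = out_of_scope("example Hungarian collection", records)
# Tally flagged records by country so whole batches stand out for review.
summary = Counter(country_of(r["standardized"]) for r in flagged)
```

Here only the Uganda record is flagged. A screen like this would not decide what the right place is; it would just hold suspicious batches back for a human to look at.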