Overzealous place name standardization can lead to off-the-map results
Many of my Swedish ancestors have "research help" hints pointing to Orkney, Scotland. They are actually from Örkened parish, Kristianstad, Sweden. (Remember what Emerson said about foolish consistency?)
See https://www.familysearch.org/tree/person/research-help/KHQC-MDR
Comments
-
The original document that I looked at has "Örk" for the birthplace. The index for that record has kept "Örk" as the birthplace.
The Record Hint has standardised the "Örk" in Birthplace to "Orkney ... etc" because "Ork" is an alternative name (abbreviation actually) for Orkney.
The problem is that the standardisation process is done automatically, with no opportunity for someone to say that the chosen entry is nonsense. I have previously complained that automatic standardisation is itself a poor tactic. This is another illustration.
"Örk" is in the source and it would be bad practice for the index to contain anything else.
"Ork" is a legitimate abbreviation for Orkney.
Automatic standardisation is always going to decide that "Örk" is Orkney.
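To make the chain above concrete, here is a minimal sketch of how an accent-insensitive lookup could end up at Orkney. It is only an illustration: the actual standardisation code is not shown anywhere in this thread, and the table `ALTERNATE_NAMES` and the functions `fold` and `auto_standardize` are names made up for the example.

```python
import unicodedata

# Hypothetical alternate-name table; the real place database is not shown here.
ALTERNATE_NAMES = {
    "ork": "Orkney, Scotland",
    "orkney": "Orkney, Scotland",
    "orkened": "Örkened, Kristianstad, Sweden",
}


def fold(text):
    """Strip diacritics and case so that 'Örk' and 'Ork' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.strip().casefold()


def auto_standardize(raw_place):
    """Return the single standard place for a raw value, or None if unknown."""
    return ALTERNATE_NAMES.get(fold(raw_place))


print(auto_standardize("Örk"))      # the indexed abbreviation
print(auto_standardize("Örkened"))  # the full parish name
```

Under these assumptions the abbreviation "Örk" folds to "ork" and matches the Orkney entry exactly, while only the full name "Örkened" reaches the Swedish parish. Nothing in the lookup knows that the record itself is Swedish.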
The only way forward that I can see is to stop this automatic background standardisation altogether and let the human decide whether the hint is legitimate and what the standard value should be. Presumably the situation couldn't get any worse if we did stop automatic background standardisation, because it has found the index data correctly (I assume).
-
The automatic process needs to improve.
-
Jordi - I'm not sure that an automatic process can be improved sufficiently. Yes, no doubt there are tweaks that can be done, but a lot of the issue is with reality, not with software. Given that (say) there are two places called Weston, Cheshire, England in real life, any automatic system is on a loser. I did get the Places team to set them up as "Weston, Cheshire, England" (for the one with the most records - I hope!) and the other as "Weston by Runcorn, Cheshire, England".
But I'll bet that any census birthplaces for the one by Runcorn will just be recorded as "Weston" and therefore end up as "Weston, Cheshire, England" - which is absolutely the wrong side of Cheshire.
Can I come up with a reasonably fool-proof way of getting the right one in? Tricky... I suggest a lot more work needs to be done at the indexing stage for starters, but that where multiple choices exist, somehow, in some devious fashion, we should deal with both options and stop choosing whichever happens (sometimes) to be first on the list. But in all honesty I'm not even sure I can totally define the problem.
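As a rough sketch of what "deal with both options" might look like, the lookup could return every candidate and only set a standard value when there is exactly one. This is an assumption about one possible design, not a description of how the Places database works; `GAZETTEER`, `Resolution` and `resolve` are hypothetical names, and the table holds only example entries.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical gazetteer: one indexed name can map to several standard places.
GAZETTEER = {
    "weston": [
        "Weston, Cheshire, England",
        "Weston by Runcorn, Cheshire, England",
    ],
    "runcorn": ["Runcorn, Cheshire, England"],
}


@dataclass
class Resolution:
    standard: Optional[str]  # set only when the match is unambiguous
    candidates: List[str]    # every place the raw value could mean
    needs_review: bool       # True when a human has to decide


def resolve(raw_place):
    """Keep all candidates instead of silently taking the first one."""
    candidates = GAZETTEER.get(raw_place.strip().casefold(), [])
    if len(candidates) == 1:
        return Resolution(candidates[0], candidates, False)
    return Resolution(None, candidates, True)


result = resolve("Weston")
print(result.standard, result.needs_review)
for place in result.candidates:
    print(" -", place)
```

With this shape a census birthplace of plain "Weston" stays unresolved and carries both Cheshire places along with it, so whoever reviews the hint can pick the right side of the county.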
-
hmm yeah tricky business.