I have a problem with the Search>Records function
This problem occurs at URL https://www.familysearch.org/search/record/results?q.anyDate.from=1856&q.anyPlace=North%20Carolina&q.givenName=arch&q.surname=holly
As you can see, this is a search for First Names-Arch, Last Names-Holly, Birth-North Carolina, Date From-1856. This is a real person with census records in Decatur and Hardin Co. TN
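For anyone reproducing the search, the parameters can be read straight out of the URL. This is a minimal Python sketch using only the standard library. It only decodes the URL given above and assumes nothing about how FamilySearch handles these fields internally:

```python
from urllib.parse import urlsplit, parse_qs

url = ("https://www.familysearch.org/search/record/results"
       "?q.anyDate.from=1856&q.anyPlace=North%20Carolina"
       "&q.givenName=arch&q.surname=holly")

# parse_qs decodes percent-escapes and returns each value as a list,
# so take the first (and only) value for each parameter.
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

Going by the parameter names alone, the place and date are passed as q.anyPlace and q.anyDate.from, which suggests they apply to any event rather than specifically to birth.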
The search returns a reference to Archibald Holley, 1870 census, District 13, Grainger Co. TN among others.
Clicking to go to the census image for this reference takes me to the 1870 census, District 13, Hardin Co. TN. The image is correct: Archibald Holley was indeed in Hardin Co. in 1870, but the search result shows him in Grainger Co.
This is a general issue for census searches showing up as Grainger Co. when they should be Hardin or Decatur Co. I have seen this for many searches in Decatur Co. This is the first that I have seen that was incorrect for Hardin Co. showing as Grainger but I suspect there are others. Other counties may have similar issues.
Can this search issue please be addressed?
Best Answer
-
How do I report these mismatches for correction?
The point of some earlier comments is that you can report these one by one here in Community, as you have done in this thread, but you won't be able to keep up with the auto-standardization routine, which may continue to change these incorrectly (sorry, no more information on when or how the routines may run).
How do I know they are accepted for correction?
Once you report a problem here in Community you will know it is accepted when a moderator says it is - as below:
How do I know they are corrected?
You will know it is corrected by performing the same search and seeing that it has changed to the correct place (sorry, as the moderator mentioned, no timeframe can be given).
0
Answers
-
Ah yes, place standardization is the culprit again.
This is not a search error. The problem is that an algorithm has gone through all the place names and picked one - usually not the right one. At least, in this case, it's a place in the correct state. Often it's not the correct country. See this thread about New Jersey burials incorrectly shown as in nearly any place other than New Jersey. https://community.familysearch.org/en/discussion/130585/error-report-robo-index-collection#latest
You can tell when it is a place standardization algorithm problem by the word "original" next to the correct location.
1 -
What is "place standardization?" Is it not part of the search algorithm? Can this not be corrected?
0 -
@N Tychonievich or @Mike357 please.
0 -
@Jerry Butler_2 Place standardization is a tool FamilySearch uses when an index came from another website. Its purpose is to standardize the place as per FamilySearch's standards. Sometimes this computer tool messes up the place when it standardizes it. When that happens, we pass the example on to the group that can fix it. We'll add this one to the items they have in the queue to fix. I am seeing these get fixed every day. But be aware that the queue of items is fairly long, so it will be some time before you see this particular one corrected. Meanwhile, you can attach the record as a source and note the error in your "reason to attach" notes. Or you can decide to simply use the image as your source instead, thus bypassing the errors in the index.
If you want to use the image as the source, click to open the image of the census page. In the top-right click Source Box and add the image to your source box. Now you can use the item in your source box as a Family Tree source. Here are Help Center articles with additional details:
0 -
@Jerry Butler_2, Áine used the short label. The fuller label is "automatic standardization" or "auto-standardization". I often call it "auto-corruption".
Up until a few years ago, all searches on FamilySearch were the basic, old-fashioned text-based kind: "show me everything that starts with these letters and contains these other letters", and variations ("these letter-combinations are often equivalent"). That's what indexes were (and are) set up to generate data for.
This was working pretty well, but it's kind of slow and processor-intensive. There's a different sort of search that's much faster and easier on resources, if the database is set up for it. I don't know what the official name for it is, but I call it entity-based search. It requires that the data it's searching through be in the form of entities: specific database entries with specific properties and categorizations, such as "Tennessee", which is in the category "United States", and contains all of the counties of Tennessee below it. When someone makes a query for, say, "Hancock, Tennessee", the algorithm doesn't look for places starting with the letter H, but for daughter places of the entity labeled "Tennessee", and serves up all of the Hancock county results that match the other search parameters.
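To make the difference concrete, here is a rough Python sketch of the two kinds of lookup. The place table, its ids, and the function names are all made up for illustration. I have no idea what FamilySearch's actual data structures look like, so treat this as a toy model of the idea only:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Place:
    id: int
    name: str
    parent_id: Optional[int]  # None for a top-level place

# Hypothetical miniature place table; the ids are invented.
PLACES = [
    Place(1, "United States", None),
    Place(2, "Tennessee", 1),
    Place(3, "Hancock", 2),
    Place(4, "Hardin", 2),
    Place(5, "Grainger", 2),
    Place(6, "Georgia", 1),
    Place(7, "Hancock", 6),
]

def text_search(prefix: str) -> list[Place]:
    """Text-style search: compare letters, ignoring what the place belongs to."""
    return [p for p in PLACES if p.name.lower().startswith(prefix.lower())]

def entity_search(child_name: str, parent_name: str) -> list[Place]:
    """Entity-style search: find the parent entity, then look only at its children."""
    parent_ids = {p.id for p in PLACES if p.name == parent_name}
    return [p for p in PLACES if p.parent_id in parent_ids and p.name == child_name]

# The text search finds both Hancock entries (Tennessee and Georgia).
print([p.id for p in text_search("Han")])
# The entity search finds only the Hancock under Tennessee.
print([p.id for p in entity_search("Hancock", "Tennessee")])
```

The catch is that the entity search is only as good as the link between each record and its place entity. If a record is attached to the wrong entity, it will be served up under the wrong place no matter what the original text says.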
The method obviously works better for some types of data than others. I don't think it can ever be made to work for people's names, for example. The fields that FamilySearch has decided to apply it to are dates and places. They therefore needed to associate all of the date and location fields in their database of indexed records with entities in their databases of dates and places. They used automated processes to create these associations.
The bots worked mostly OK on dates. They lost some information (when dates were in the database in a different language, for example), and created all sorts of nonsense out of incomplete dates (such as month and day converted into first-century years), but in general, the database's dates are only partially corrupted now.
Places are a different story. I don't know what process was used to pick an entity to go with any particular text string, but it is clearly apparent from the results that there was no data validation step involved: place entries in a collection of Canadian records got associated with the same-named place in Australia, birthplaces in Hungarian records are now associated with places in Africa, and so forth and so on.
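As an illustration of the kind of sanity check that seems to be missing, here is a small Python sketch. Everything in it is hypothetical: the record fields, the example strings, and the rule itself are my own invention and not anything FamilySearch is known to use. It simply flags a record when the leading part of the standardized place does not appear anywhere in the original place text:

```python
from dataclasses import dataclass

@dataclass
class IndexedRecord:
    name: str
    original_place: str      # place text as indexed (illustrative values)
    standardized_place: str  # place chosen by the automated process

def looks_mismatched(record: IndexedRecord) -> bool:
    """Flag a record when the leading part of the standardized place
    (the county, in these examples) is absent from the original text."""
    leading = record.standardized_place.split(",")[0].strip().lower()
    return leading not in record.original_place.lower()

# Made-up example rows modeled on the Hardin/Grainger case in this thread.
records = [
    IndexedRecord("Archibald Holley",
                  "District 13, Hardin, Tennessee",
                  "Grainger, Tennessee, United States"),
    IndexedRecord("Example Person",
                  "District 13, Hardin, Tennessee",
                  "Hardin, Tennessee, United States"),
]

flagged = [r.name for r in records if looks_mismatched(r)]
print(flagged)
```

A plain substring comparison like this would miss spelling variants and renamed places, so it is only meant to show that a cheap cross-check between the original and standardized values is possible in principle.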
1 -
Thanks for all the responses and explanations. I am not sure whether this describes my issue. The search works correctly based on the birthplace and year. The issue is that some of the results are labeled with the wrong county.
How do I report these mismatches for correction? How do I know they are accepted for correction? How do I know they are corrected?
0 -
@Jerry Butler_2 The report has been made. I tagged 2 mods who deal with this issue, and N Tychonievich responded that the item had been queued for correction.
Hi Julia - I almost used your favorite description when I posted my first reply.
1 -
Thanks!
1 -
@genthusiast and @Jerry Butler_2 One correction to genthusiast's comment that, after corrections are made, we are in danger of a computer making the same error again: I am pretty confident that is not the case. Engineers use the corrections to "teach" the software to not make the same mistake again. It is worthwhile for you to report the incidents you find where auto-standardization went awry. As the engineers work through each one, they correct all incidents of a given error within a record collection. So, your report fixes the error for many other users as well.
@Julia Szent-Györgyi I know you disagree, but the auto-standardization process has done more standardization correctly than incorrectly. In my own research and in the research of most of our users, places are showing correctly most of the time in the record collections. It is always wise to not trust indexes completely, but neither are they completely untrustworthy.
1 -
@N Tychonievich I am glad corrections will stick and that computers 'learn' the first time (as opposed to me...).😉
0 -
@ChrisChalcraft The basic reason is that the date search criterion is not programmed to restrict results to the range you enter (at least, that is the last I heard or recall). You will need to use the filter bubbles to restrict the date or range.
If you want to supply your Search URL that might be more helpful to give you a more detailed answer.
0 -
Then why does it even ask for a date range??
0 -
The search you are displaying in the image is for Any Life Event between 1400 and 1500 for the name "John" Smith (a common name). Of the 340,370 results, those showing at the top clearly fall outside that range. So why are they showing? Because the Search algorithm wants to 'helpfully' display results for "John" Smith: matching names are a priority when results are scored, especially for the collections those results come from (not all collections may rank search results with the same criteria priority). Not a very good explanation, I know. Basically, a wide net was cast for the results, except that you specified "John" as an exact criterion, so all of the top results match that priority criterion. Now you need to use the results filters to narrow them closer to what you would like. If you know more limiting information, such as when or where John was born, you should enter it in the Birth Life Event. I have noted in prior posts that results seem more random when using Any Life Event.
0 -
@ChrisChalcraft, the search parameters are joined with a logical "or", not "and": inputting John Smith between 1500 and 1600 technically matches John Jones in 1700, William Smith in 1900, and William Jones in 1550. That last will not actually show up in the list, because the primary name inputs are weighted too heavily, but another consequence of this weighting is that John Smith with any date comes before, say, Jno Smith in the desired date range.
To apply a logical "and" to your search, you have to use the filters.
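Here is a toy Python model of what I mean. The weights, the sample people, and the scoring rule are all invented for illustration. This is not FamilySearch's actual ranking code, which as far as I know is not public, and it does not try to model the special handling of the primary name:

```python
from dataclasses import dataclass

@dataclass
class Person:
    given: str
    surname: str
    year: int

QUERY = {"given": "John", "surname": "Smith", "from": 1500, "to": 1600}

# Hypothetical weights: a name match counts far more than a date match.
NAME_WEIGHT = 10
DATE_WEIGHT = 1

def in_range(p: Person) -> bool:
    return QUERY["from"] <= p.year <= QUERY["to"]

def score(p: Person) -> int:
    """Add up the weights of whichever criteria match (logical "or")."""
    total = 0
    if p.given == QUERY["given"]:
        total += NAME_WEIGHT
    if p.surname == QUERY["surname"]:
        total += NAME_WEIGHT
    if in_range(p):
        total += DATE_WEIGHT
    return total

people = [
    Person("Jno", "Smith", 1550),
    Person("John", "Smith", 1850),
    Person("William", "Smith", 1900),
]

# "Or" search: anything with a nonzero score is a hit, ranked by score.
ranked = sorted((p for p in people if score(p) > 0), key=score, reverse=True)
print([(p.given, p.surname, p.year) for p in ranked])

# Filter step: keep only the hits inside the date range (logical "and").
filtered = [p for p in ranked if in_range(p)]
print([(p.given, p.surname, p.year) for p in filtered])
```

With these made-up weights, John Smith in 1850 outranks Jno Smith in 1550 even though only the latter falls in the requested range. After the filter step, only the 1550 entry is left.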
1 -
@Julia Szent-Györgyi I don't think I've ever heard that the Search algorithm uses inclusive OR (do you have a document reference?) - I guess I'll have to try the search above to see if John Jones is one of the results...
Interestingly, in the Family Tree mobile app on an Android phone, this search returns 10k+ results, not 340k, and the first result does match the date range. The last page of results still appears to have only John Smith ... not John Jones ... Perhaps one should try Any Life Event searches exclusively from the mobile app?
0 -
Chris's screenshot with 340k results is Search - Tree (aka Tree - Find), not Search - Records. (The only way to tell them apart is by the results: if everyone has PIDs, it's Find.)
As I wrote, the primary name input is treated somewhat differently from the other search parameters: in practice, John Smith will match Smith or J Smith or William J Smith, but not John Jones.
2 -
Thank you for your response and suggestions. I just find it frustrating that the search pane asks for a date range but does not respect it. And there is no "exact search" check-box like there is for other parameters. I have come to a bit of a brick wall on my "Chalcraft" line, where there are several so-named families in the neighboring parishes that all used (and reused) similar names, and even a few people with the exact same name who apparently relocated and then did the discourtesy of dying in the same year as each other. So, after picking all the low-hanging fruit, I am now taking a process-of-elimination approach where I collect all the data for a given full name and/or surname in all the neighboring parishes, export it to Excel, and reconstruct the families for analysis. It's not as bad as my sample "John Smith", but it could be a lot easier if the search allowed more date range limitations. I guess I haven't been using those filters at the top and need to familiarize myself with them. Cheers! Chris
1