Search Result Filter By Year Problem
There have been several discussions here about how the auto-standardization routine converted originally indexed place names into completely wrong place names and how that has made place names as shown in search results very suspect and difficult to use.
Today I ran across a problem with dates that makes filtering by a date range problematic. Since I have not seen a discussion or report of this before, I thought I would bring it up here. I do hope the engineers are aware of this and working on a way to repair it.
I was just poking around in the Search routine and tried filtering a results list by marriage date and was very surprised to see this:
Filtering for the sixteen people married in the year 6000 AD and going to the first one that has an image, I find a "marriage date" of 6073 (https://www.familysearch.org/ark:/61903/1:1:6D3V-3C3Q). Looking at the image for this marriage, which took place in 1916, I can't see anything on the page to account for the 6073 number. Scrolling through a few more images on the film, I see that all of them have goofy marriage dates in the index.
Are all these strange dates indexing errors? If so, how did they ever make it past a review and through to the final database? Are these another post-processing problem? Looking at the numbers of events in each century, this is admittedly a small percentage of records in this list of over 69 million results, but it still throws into doubt all the results. Is there any hope of these ever getting cleaned up?
Or do we just use such things to continually hammer home to everyone that while indexes are good finding aids, they can never be fully trusted and you always, always, always need to check original images?
@Gordon Collett, very interesting. I've done a similar search and got (generally) the same numbers that you show. I don't have definitive answers, just a couple of observations, based on a very shallow "dive."
Looking at the numbers of marriages by year-range, I notice that for the most part, we are looking at very few potential issues - meaning that any marriages outside of 1500 - 2000 are likely suspect. Setting aside the range of <100 for a moment, we are looking at only very small numbers. Those could well be indexing errors, although I haven't tried to support this statement.
What is interesting is the rather large number of marriages found with dates < 100. Some of these, I found, are records from partner sites; for some, the marriage date is not the principal event (e.g., marriage dates in an obituary). What's interesting here is that the index record (also referred to as the record details page) does not show a marriage date, although one can be found in the original obituary (with a proper date). I would take from this that the marriage date, in such a record, was not in the index template. I also find, in this year-range, records from various marriage collections (e.g., state and county marriage records) where you would expect to see marriage dates in the record details page -- but from the very(!) few that I looked at, there was no marriage date provided (nor found in the original record). So, I wonder how it then shows up in the search results.
It took me a moment to remind myself that of the 69 million records returned for the search, only a small subset were actually marriage records (probably fewer than 8 million).
With respect to the example you provide, I admit that I see nothing that drives me to any real conclusion; it could simply be an indexing error, although it is such an unusual error that one wonders why it wasn't caught in review.
"Is there any hope of these ever getting cleaned up?" - I suspect that before that could be answered, we would have to look at these data points within the context of Collections, and then see what we find.
I absolutely agree with your last statement, regarding the value/purpose of indexes. As a rule, I just always assume that an index is wrong in some element.
Thanks for your thoughts.
An additional concern that your comments brought up is that the records between 1500 and 2000 really could be just as suspect as the dates out of that range. The fact that the ones outside that range are clearly wrong does not mean the ones inside are clearly right.
So something else people need to understand is that if one is certain a record, in this case a marriage record, must exist at a certain location and time and it is just not coming up in the search, they might have to ignore the year in the search results, or search without the year and look at every couple with the right names, or may just have to go back to the original film and go through it one record at a time.
Could this be part of the reason that despite repeated requests, the search engine never strictly adheres to the date range put in the search parameters?
@Gordon Collett, yes, all you say is true. I carved the range 1500-2000 out of consideration simply because it's a tougher bird to pick at, requiring a much more careful examination.
Your comments reinforce the need to be careful when using indexes -- finding no results from a search probably shouldn't be taken as an indication that the record doesn't exist but, more likely, that we may need to reexamine our search strategy and criteria.
With regard to searching using date ranges, it appears that we should not expect the search engine to treat date ranges strictly; evidently there is no way to force a date range to be observed exactly, as there is for names and locations when selecting the option to match records exactly.
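For anyone who copies or exports a results list into their own notes, a strict date range can at least be applied after the fact. What follows is only a rough Python sketch: the record layout (`name`, `marriage_year`) and both function names are made up for illustration and are not anything FamilySearch provides, and the 1500-2000 span is just the rule of thumb used earlier in this thread.

```python
from typing import List, Optional, TypedDict


class IndexedRecord(TypedDict):
    # Hypothetical shape of a copied search result; not FamilySearch's schema.
    name: str
    marriage_year: Optional[int]


def strict_year_filter(records: List[IndexedRecord], start: int, end: int) -> List[IndexedRecord]:
    """Keep only records whose indexed year falls inside [start, end]."""
    return [
        r for r in records
        if r["marriage_year"] is not None and start <= r["marriage_year"] <= end
    ]


def flag_suspect(records: List[IndexedRecord], earliest: int = 1500, latest: int = 2000) -> List[IndexedRecord]:
    """Return records whose indexed year is missing or outside the plausible span."""
    return [
        r for r in records
        if r["marriage_year"] is None
        or not (earliest <= r["marriage_year"] <= latest)
    ]


# Made-up sample entries; 6073 mirrors the odd indexed year discussed above.
sample: List[IndexedRecord] = [
    {"name": "Couple A", "marriage_year": 1916},
    {"name": "Couple B", "marriage_year": 6073},
    {"name": "Couple C", "marriage_year": 12},
    {"name": "Couple D", "marriage_year": None},
]

in_range = strict_year_filter(sample, 1910, 1920)
suspect = flag_suspect(sample)
```

Note that a check like this only catches the obviously impossible years. An indexed date that is wrong but still falls inside the plausible span passes straight through, so the flagged entries are a list of records to compare against the original image, not a list of the only errors.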