Extraneous records in search results
If I search for birth records in Blair Atholl, Perthshire, Scotland using the exact parameter to limit to one parish for a specific year prior to 1800, I get only records for the specific year.
If I search for birth records in Blair Atholl, Perthshire, Scotland using the exact parameter to limit to one parish for a specific year after 1800, I get records for the specific year plus about 200 records from the 1700s mixed in.
Is this a feature or a bug? Please explain.
Answers
-
As you can see, only 1 of the first 5 is in the requested year, 1805. The other 4 aren't even close to 1805.
0 -
As you can see, the Birth on those four records has somehow been set to "0005". A faulty indexing process has led to the search algorithm including these records as (possibly) being for the year 1805. Had the Birth field been left blank the search would have been based on the Christening input. However, anything in the Birth field of a record overrides the date in the Christening field.
In summary, the date range being searched does not appear to be directly relevant. Had those four records not been "formatted" in this way, they would not have been included in the results.
I have seen other such examples, for which the origin of the problem lies in incorrect data being added (not always to the Birth field) that produces these numbers, whether "0005" or whatever. To correct this, a FamilySearch employee would need to be able to reprocess the whole collection so that the Birth fields are cleared. As I believe the problem has existed in other records for several years (and has no doubt been reported), I am pessimistic that any changes will be made, meaning the problem will persist.
0 -
So why does this happen for years following 1800, but not for years prior to 1800?
In the process of looking at the early years for this parish I found many births on microfilm that do not appear in FamilySearch. I offered to give the information that I found to FamilySearch, but have never been contacted. I learned that they do not track the status of problems that have been reported. I would like to be able to share this information with others because these records are for someone's ancestors not just mine.
0 -
Don't ask me to explain how the programming works as I am not that computer savvy!
However, I did a search on the Isabel McDougal in the screenshot. Whether searching on 1787 or 1805 as the event year she still appeared as a "result". The program appears to be reading the "0005" (shown in the Birth field) as 1805 and also picks up the 1787 in the Christening field.
If the Birth field does not contain "0005" (or similar), just the records for the actual year are produced, so whether they are pre or post 1800 is not part of the issue.
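For anyone who wants to picture the behaviour described above, here is a minimal sketch. It is only a guess at the logic, not FamilySearch's actual code: the `IndexedRecord` class, its field names, and the `matches` function are all made up for illustration, and it assumes that the "0005" shown in the Birth field carries a hidden standardized year of 1805.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexedRecord:
    """Hypothetical stand-in for one index entry (not a real FamilySearch type)."""
    name: str
    birth_year: Optional[int]        # standardized year behind the Birth field, if any
    christening_year: Optional[int]  # standardized year behind the Christening field


def matches(record: IndexedRecord, search_year: int) -> bool:
    """Assumed rule: a record is returned if either date field has the searched year."""
    return search_year in (record.birth_year, record.christening_year)


# Birth shows "0005" (assumed to be standardized as 1805); Christening is 1787.
faulty = IndexedRecord("Isabel McDougal", birth_year=1805, christening_year=1787)
# The same entry as it would look if the Birth field had been left blank.
clean = IndexedRecord("Isabel McDougal", birth_year=None, christening_year=1787)

for year in (1787, 1805):
    print(year, matches(faulty, year), matches(clean, year))
```

Under these assumptions the faulty entry is returned for both 1787 and 1805, while the entry with a blank Birth field is returned only for 1787.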
0 -
I noticed today that if you search for 1805, then the extraneous records all have May dates. If you search for 1810, then the extraneous records all have October dates.
The last 2 digits of the search year match the 2 digit code for the month. This explains why the extraneous files don't show up for 1800. There is no month corresponding to 00.
Do the records for the 1700s have a different format than the records for the 1800s?
Or is there an error in the search software - comparing a year to a month?
The records with the 4 digits displayed in the birth date in the search results correspond to baptismal records on the microfilm which do not have a birth date. The birth field in the search results contains a 4-digit code corresponding to the month in the baptismal date, and that is being compared to the lowest 2 digits of the search year. Comparing 2 digits of a year to a 2-digit month code makes no sense.
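The pattern I am describing can be written out as a small sketch. This only models what I observed in the results; the function name is made up, and I am not claiming this is how the search software is written.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def month_matching_search_year(search_year: int):
    """Return the month whose number equals the last two digits of the year, or None."""
    last_two = search_year % 100
    if 1 <= last_two <= 12:
        return MONTH_NAMES[last_two - 1]
    return None  # e.g. 1800 -> 00, and there is no month 00


for year in (1800, 1805, 1810, 1813):
    print(year, month_matching_search_year(year))
```

If the observation holds, only search years ending in 01 through 12 would pick up extraneous records, which fits 1800 returning none of them.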
0 -
I don't think the error is at the search end. I think the index entries with unknown years have been mis-interpreted by an automated process as if the month number were a year in the early 1800s.
FS has been working on changing over from text-based searching to entity-based searching for dates and places (i.e., the fields that can be represented in the background by a single number [a date] or pair of numbers [latitude and longitude]). A major part of that endeavor is associating standardized forms (which are pointers to those numerical entities) with all of the date and location fields in all of the indexes in the database. They used automated routines to do this. This is not a surprising decision, given the sheer size of FS's databases, but unfortunately, said routines Got It Totally Wrong a whole lot. As in, "entire microfilms assigned to the wrong continent" levels of Totally Wrong, and "new swaths discovered daily" levels of a whole lot.
In my opinion, FS needs to revert all of the automated standards and start over, with much more human input into the revised process, but so far, there hasn't really even been any official acknowledgement that a problem exists, never mind the problem's enormous magnitude.
1 -
It would be nice for someone to check the search code used for this specific problem and see if it is merely off by one field, or if there is something more extensive happening.
Unfortunately it appears that there is no system for reporting and tracking problems through to their solutions. If you cannot report a problem, then how can there be any expectation that any problem will be solved?
0 -
Given that FS doesn't have a search field for month (or day), and that "Scotland Births and Baptisms" has single-field dates, the problem simply cannot be the search being "off by one".
If you type just "5" into a date field in Family Tree, the drop-down offers 0005, 1805, 1905, and 2005, in that order. I believe the auto-standardization routines were based on the algorithms that generate those drop-downs; most of the errors I've seen have been the top choice if you typed in the originally-indexed string. It makes sense if nearly-BC dates were eliminated for the bot, leaving 1805 as the top choice.
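To make that guess concrete, here is a minimal sketch of the mechanism I have in mind. Everything in it is an assumption: the function names are invented, the candidate order simply copies what the Family Tree drop-down shows for "5", and the cutoff used to drop "nearly-BC" years is an arbitrary choice for illustration.

```python
def year_candidates(typed: str) -> list[str]:
    """Hypothetical model of the drop-down for a one- or two-digit entry."""
    n = int(typed)
    # Literal year first, then the same digits in the 1800s, 1900s and 2000s.
    return [f"{n:04d}"] + [f"{century + n:04d}" for century in (1800, 1900, 2000)]


def bot_choice(typed: str, earliest_allowed: int = 1000) -> str:
    """Assumed bot behaviour: discard very early years, then take the top choice."""
    remaining = [c for c in year_candidates(typed) if int(c) >= earliest_allowed]
    return remaining[0]


print(year_candidates("5"))  # candidates in drop-down order
print(bot_choice("05"))      # a May month code treated as a year
print(bot_choice("10"))      # an October month code treated as a year
```

With these assumptions, a month code of 05 ends up standardized as 1805 and 10 as 1810, which would match the May and October records turning up in those searches.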
0