
Report: a proportion of records in a film was assigned bad dates. Prevention and fixes...

JuanZuluaga3 ✭
October 2, 2021 edited October 4, 2021 in Search

[Edited on 10/03.]

Film 005263680 contains 1494 (originally stated as 2241) christening records from a parish in Valle del Cauca, Colombia, dated 1935 to 1938.

[Added: 1494 = 2241 - 747. I realized that the sequential numbers for baptism events in that volume start at 748. This also means that most recorded events do have assigned dates; there are no large portions of missing records.]

(https://www.familysearch.org/search/film/005263680?cat=1483998)

If we run a couple of reports on the range of birth dates, we get

[Images: two screenshots of the birth-date range reports]

[Added:

1) These numbers should presumably not add up to more than 1494, but this is a relatively minor concern.

2) Why would a volume of baptisms from 1935 to 1938 contain births from before 1920? (Catholic baptism of adults was very rare at that time and place.) And why would it contain births from after 1939?]

[deleted....]


For some of these errors, the recognition software may at some point have treated the sequence number written by the parish as the event date.

[added: for instance, these supposedly born in the 1600s: https://www.familysearch.org/search/record/results?q.filmNumber=5263680&c.birthLikeDate1=on&f.birthLikeDate0=1600

https://www.familysearch.org/ark:/61903/1:1:68FJ-SFRB ]

[Image: untitled screenshot]


I do not know how to retrieve the records that have an empty birth year (birthyear=""), which I would need in order to track down the cause.

Fortunately, it should be easy to:

  • prevent this from happening, by restricting the OCR to a very specific area of the image,
  • run a report at the end of each OCR pass, to check whether dates and places make sense,
  • correct the mistake, by retrieving the affected records and rewriting them with a regular expression.
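As a rough sketch of the second point (the end-of-run report), something like the following could flag suspicious birth years. The record layout (a dict with `seq` and `birth_year` keys), the function names, and the 20-year tolerance are assumptions made for illustration only; this is not FamilySearch's actual data model or pipeline. The year range and the sequence range are the ones seen in this volume.

```python
from collections import Counter

# Volume metadata as described in this post (hypothetical constants).
VOLUME_YEARS = (1935, 1938)   # from the title "Bautismos 1935-1938"
SEQ_RANGE = (748, 2241)       # parish entry numbers seen on the pages
MAX_AGE_AT_BAPTISM = 20       # assumed tolerance; adult baptisms were rare


def classify(record):
    """Return a label describing how plausible the indexed birth year is."""
    year = record.get("birth_year")
    if year is None:
        return "missing"
    first, last = VOLUME_YEARS
    if first - MAX_AGE_AT_BAPTISM <= year <= last:
        return "plausible"
    if SEQ_RANGE[0] <= year <= SEQ_RANGE[1]:
        # The "year" falls inside the parish's entry numbering,
        # so it may be a sequence number that was read as a date.
        return "maybe_sequence_number"
    return "implausible"


def report(records):
    """Count records per label so a reviewer can spot bad batches."""
    return Counter(classify(r) for r in records)


if __name__ == "__main__":
    # Made-up sample records, not real index entries.
    sample = [
        {"seq": 750, "birth_year": 1935},
        {"seq": 1650, "birth_year": 1650},
        {"seq": 1857, "birth_year": 1857},
        {"seq": 900, "birth_year": None},
        {"seq": 901, "birth_year": 2250},
    ]
    print(report(sample))
```

A report like this would not fix anything by itself, but a large "maybe_sequence_number" count for a single film would be a strong hint about where the recognizer went wrong.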

--------------------

I see similar problems in other films, like https://www.familysearch.org/search/film/004442413?cat=2015238

In this example, the OCR finds the word "Castrillón" but, instead of treating it as a surname, the software interprets it as the birth place and associates it with a town in Spain. The software is overreaching: it reads where it should not and makes unfounded associations.

[Image: Screenshot from 2021-10-02 00-06-25.png]


Tagged:
  • incorrect dates
  • transcription error;

Answers

  • genthusiast
    genthusiast ✭✭✭✭✭
    October 2, 2021

    @JuanZuluaga3

    I find your report interesting and do not claim to understand it completely.

    When I click the link you post: ( https://www.familysearch.org/search/film/005263680?cat=1483998 )

    I see a film of 506 images. When I click the 'Colombia, Catholic Church Records, 1576-2018' drop-down, I get these waypoints: Valle del Cauca > Buga > Santa Bárbara > Bautismos 1935-1938

    If I select Buga, I see other churches in the parish, which I assume would also have records from 1935 to 1938. So I am a little unclear whether your question is limited to these 506 images or whether you have already included the ~75 towns in Valle del Cauca.

    I am just trying to understand what you have already included and where you get the figures for wrong birth years, etc.

  • JuanZuluaga3
    JuanZuluaga3 ✭
    October 4, 2021 edited October 4, 2021

    Hi Genthusiast,

    My initial report was only about the "Bautismos 1935-1938" tome from Santa Bárbara parish, which has 506 scanned pages. Film number 5263680 comes from that tome alone; no other parish has data in that film [added: except possibly notes about later marriages of those who were baptized, but I understand those are not usually recorded]. Any other places of christening are probably wrong.

    The count of 2241 baptism events came from finding the last sequential number on the 506th page, 2241 [added: and subtracting from it the sequential number of the first record minus 1, that is 748 - 1 = 747, which gives 1494]. On average, that is about 3 events recorded per page [added: 1494/504 ≈ 3], not the 4.5 I first stated.
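The arithmetic above can be checked in a few lines. This is only a worked example using the numbers quoted in this thread; the page count of 506 is the film's image count, which is an assumption about how many pages actually carry entries.

```python
first_seq = 748   # sequence number on the first record of the volume
last_seq = 2241   # sequence number on the last record (506th page)
pages = 506       # scanned images in film 5263680 (assumed to all hold entries)

# Entries 748..2241 inclusive, i.e. 2241 - (748 - 1).
events = last_seq - (first_seq - 1)

print(events)                      # 1494
print(round(events / pages, 2))    # 2.95 events per page
print(round(last_seq / pages, 2))  # 4.43, the source of the earlier "4.5"
```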

    I got tables like this one

    [Image: untitled screenshot of the table]


    by doing a search on records that come from that film #5263680:

    https://www.familysearch.org/search/record/results?q.filmNumber=5263680

    The search produces 12442 results; I interpret this as one result for every person mentioned in the 1494 (originally 2241) baptism events. Since only the "principal" person in each event (the person being baptized) has a recorded birth date or christening date, the counts in the table should add up to approximately 1494.

    Let us look at some of the 23 records returned for the 1800s range:

    https://www.familysearch.org/search/record/results?q.filmNumber=5263680&c.birthLikeDate1=on&f.birthLikeDate0=1800

    The first one that shows up is: "Colombia, registros parroquiales y diocesanos, 1576-2018", database with images, FamilySearch (https://www.familysearch.org/ark:/61903/1:1:68FJ-M6KW : 1 September 2021), Pojos Mariauna, octubre de 1857. This birth year is clearly wrong.
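For anyone who wants to repeat these searches for other date ranges, here is a small sketch that builds the same kind of URL. The parameter names (`q.filmNumber`, `c.birthLikeDate1`, `f.birthLikeDate0`) are copied from the links in this thread; the function name is made up, and I am only assuming that other starting years are accepted in the same way as 1600 and 1800.

```python
from urllib.parse import urlencode

BASE = "https://www.familysearch.org/search/record/results"


def film_search_url(film_number, birth_range_start=None):
    """Build a record-search URL for one film, optionally filtered by birth range."""
    params = {"q.filmNumber": str(film_number)}
    if birth_range_start is not None:
        # Same filter parameters as in the links posted in this thread.
        params["c.birthLikeDate1"] = "on"
        params["f.birthLikeDate0"] = str(birth_range_start)
    return f"{BASE}?{urlencode(params)}"


if __name__ == "__main__":
    print(film_search_url(5263680))
    for start in (1600, 1800):
        print(film_search_url(5263680, start))
```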
