AI-indexed Italian records added
Lots of new additions to FamilySearch in the past couple of weeks, including AI-indexed church records from the Dioceses of:
- Napoli
- Grosseto
- Piazza Armerina
- Firenze
- Caltanissetta
- Como
- Trapani
- Potenza
- Ivrea
- Foggia-Bovino
- Vercelli
- Palermo
- Torino
- Acerra
- Biella
You can sort the list by date last updated to see recent changes.
Use the AI indexes with caution and don't rely on them completely. There are many spelling errors, especially in the Latin. Also use your own eyes.
FS users can correct the indexing (and you should if you find an error)!
Note also that in some cases the images are not freely available online and must be accessed from a FS Center or FS Affiliate Library.
Finally note that the indexing is in process and is by no means complete.
Comments
-
I'm not seeing these as indexed. For example, this collection (https://www.familysearch.org/search/collection/2043557) appears as having been updated on 2024-10-09 after being online for many years. Where did you find this collections interface anyway?
0 -
Not all the updated collections have been indexed. I only listed church record collections which have been AI indexed. Civil record collections like the Castrovillari tribunale collection may have had a small number of records added (you can find details on the FamilySearch blog on their monthly update posts). Also, the catalog has not been updated in some time and will not be updated until the new version of the catalog is released.
The collections interface is accessed from the main record search page (browse all collections).
1 -
In Abruzzo, Aquila do you know when we will get more records? Right now most if not all go only to 1865. Thanks in advance!
0 -
All the 4 records appear to be "Page not found"
0 -
@genealogiadavini The images are there but, as you note, the transcriptions don't show up. It's a work in progress (especially considering that the first record indexes a death occurring 22 years before the birth and baptism…)
0 -
@SerraNola Any way to connect with whoever is programming the handwriting recognition models / templates for Italy? I've been comparing the results in one of my ancestral towns to the manual indexing I've already done and am noting some recurring issues that could be solved with some additional logic parameters.
1 -
@Cousin Vinny Yes, it is possible if you have data showing a recurring type of error and it can be quantified. Post what you have.
1 -
@SerraNola Sure. These are 1863 birth records from Modugno, Bari, Puglia, Italy. I am reviewing the additional years from 1861-1865 because those images are, for some reason, not available on Antenati (although I happen to have personal copies of them) and thus can only be linked to the FamilySearch transcribed records.
1. A significant number of records were processed by the AI as occurring in 1873 rather than 1863. This should be an easily fixable processing problem as the records are in chronological and serial number order and grouped by year in the film (not to mention that 1866-1874 Italian birth records are in a completely different format than 1815-1865). Examples:
- https://www.familysearch.org/ark:/61903/1:1:6NHY-Y1B7
- https://www.familysearch.org/ark:/61903/1:1:6NHY-NQCB
- https://www.familysearch.org/ark:/61903/1:1:6NHY-LN58
- https://www.familysearch.org/ark:/61903/1:1:6NHY-YCPY
- https://www.familysearch.org/ark:/61903/1:1:6NHY-YCRB
- https://www.familysearch.org/ark:/61903/1:1:6NHY-NQ8F
- https://www.familysearch.org/ark:/61903/1:1:6NHY-NQ6H
- https://www.familysearch.org/ark:/61903/1:1:6NHY-BDHV
- https://www.familysearch.org/ark:/61903/1:1:6NHY-5S3M
- https://www.familysearch.org/ark:/61903/1:1:6NHY-T4RP
- https://www.familysearch.org/ark:/61903/1:1:6NHY-6C8Q
- https://www.familysearch.org/ark:/61903/1:1:6NHB-MJL3
- https://www.familysearch.org/ark:/61903/1:1:6NHY-1F65
- https://www.familysearch.org/ark:/61903/1:1:6NHY-1FFZ
- https://www.familysearch.org/ark:/61903/1:1:6NHB-MJ1C
- https://www.familysearch.org/ark:/61903/1:1:6NHY-895F
- And a separate instance: https://www.familysearch.org/ark:/61903/1:1:6NHB-SSC7?lang=en and https://www.familysearch.org/ark:/61903/1:1:6NHY-LYWB recording the birth year as 1863 when the year was actually 1864.
- Similarly, https://www.familysearch.org/ark:/61903/1:1:6NHY-TBS1, https://www.familysearch.org/ark:/61903/1:1:6NHY-TB7W, https://www.familysearch.org/ark:/61903/1:1:6NHY-TB44, https://www.familysearch.org/ark:/61903/1:1:6NHY-TBHF, https://www.familysearch.org/ark:/61903/1:1:6NHY-XHJH, https://www.familysearch.org/ark:/61903/1:1:6NHY-5GGK, https://www.familysearch.org/ark:/61903/1:1:6NHY-LYQM recording the birth year as 1875 when the year was actually 1865.
2. Months misinterpreted. Again, should be an easily fixable problem as the record images are in chronological and serial number order in the film.
- https://www.familysearch.org/ark:/61903/1:1:6NHY-Y116 should be June 1863 and not January
- https://www.familysearch.org/ark:/61903/1:1:6NHY-PLHT should be June 1863 and not January
- https://www.familysearch.org/ark:/61903/1:1:6NHB-9C3G should be January 1863 and not June
3. Sort of a subset of 2: the month is misinterpreted, but probably because the date the birth was recorded and the date of the birth fell in two separate months. These records contain five dates:
- Date the birth was recorded (top)
- Date of birth (middle)
- Date baptism was recorded (margin, top)
- Date baptism form was sent to the church (margin, near top)
- Date of baptism (margin, middle)
In some instances a birth took place near the end of the month but was not recorded until the following month. It seems perhaps the logic in the text recognition may have been set up to record the month as given at the top of the record (when the record was created) but the day of the month given from the middle (when the birth occurred).
Examples:
- https://www.familysearch.org/ark:/61903/1:1:6NHY-13DR (date recorded in FS as 21 Aug 1863; date of birth was 31 July but was recorded 1 August)
- https://www.familysearch.org/ark:/61903/1:1:6NHY-TY18 (date recorded in FS as 31 Aug 1863; date of birth was 31 July but was recorded 1 August)
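The sequence argument in points 1 and 2 (records sit in chronological order on the film, so an isolated 1873 among 1863 entries is suspect) can be sketched as a simple neighbor check. This is only an illustration, not how FamilySearch's indexing works: the function names, the `window` parameter, and the sample values are all hypothetical, and the input is assumed to be the indexed years (or month numbers) listed in film order.

```python
from collections import Counter


def majority(values):
    """Most common value in a list, or None if the list is empty."""
    return Counter(values).most_common(1)[0][0] if values else None


def flag_out_of_sequence(values, window=3):
    """Return positions whose value disagrees with its neighbors.

    A value is accepted if it matches the majority of the `window`
    records before it OR the majority of the `window` records after it,
    so a genuine change of year or month is not flagged at the boundary.
    """
    flagged = []
    for i, value in enumerate(values):
        before = values[max(0, i - window):i]
        after = values[i + 1:i + 1 + window]
        expected = {majority(before), majority(after)} - {None}
        if expected and value not in expected:
            flagged.append(i)
    return flagged


# Hypothetical indexed years in film order: one stray 1873 among 1863 records.
years = [1863, 1863, 1863, 1873, 1863, 1863, 1864, 1864, 1864]
print(flag_out_of_sequence(years))

# Hypothetical indexed months (1 = January, 6 = June) in film order.
months = [1, 1, 6, 1, 1, 2, 2, 2]
print(flag_out_of_sequence(months))
```

A check like this only flags candidates for human review. It would miss a run of several identical errors in a row, and it can misfire near the very start or end of a film where few neighbors are available.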
If I identify any other possible systemic types of errors I will let you know. But really impressed with the results so far; the handwriting interpretation isn't perfect but is quite good in most cases.
1 -
I appreciate your work to identify and categorize problems with the existing indexing. The errors you mention are significant, but probably not critical enough to request a rerun of the collection. The focus needs to be on enhancing future indexing accuracy. I agree with you that the results so far are favorable in comparison with some of the earlier CAI projects. Here are my thoughts on the points you presented:
- 1870s vs. 1860s - I don't know for sure, but it seems that engineering would have used chronological order in their programming if it were feasible. I wonder if they perceive too much variation. For instance, some civil registration offices might have combined records of births, marriages, and deaths by year, while others kept separate books for each event spanning multiple years. The other issue is that this is an inherent AI error in Italian records, given that “sessanta” and “settanta”, handwritten with a drop-down S, are difficult for OCR to differentiate.
- Likewise, gennaio and giugno are similar enough that I would expect AI to generate a substantial number of errors. I attempted to verify whether the error was consistent, but most of the time I found that AI got it right. I could find several instances of this error in a row, but never a whole month indexed wrong.
- It was evident that for civil registration AI had been programmed to look for the date of birth in the middle of the document and not at the top. As your examples show, it got it wrong several times but mostly in cases outside of the norm. In the past, I have seen many Italian records (from the Campania region) in which the word “ieri” was used in place of the birth date. I wonder how AI handles that. You noted that church records have five different dates associated with the birth, but it would be useful to have sufficient examples of errors to assess any possible improvements.
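One check that follows from the point about the two dates: a birth cannot be later than the date the record was written, so an indexed birth date that falls after the recording date is a sign that the day and month were taken from different parts of the document. The sketch below is a hypothetical illustration of that rule, not FamilySearch's logic; `check_birth_date` and its return shape are assumptions, and it presumes both dates have already been extracted.

```python
from datetime import date, timedelta


def check_birth_date(indexed_birth, recorded):
    """Return (is_plausible, suggested_date).

    A birth date on or before the recording date is plausible. Otherwise,
    suggest the same day number in the previous month, if that date exists
    and does not fall after the recording date.
    """
    if indexed_birth <= recorded:
        return True, None
    # Last day of the month before the indexed birth month.
    prev_month_end = indexed_birth.replace(day=1) - timedelta(days=1)
    try:
        candidate = prev_month_end.replace(day=indexed_birth.day)
    except ValueError:
        # The previous month has no such day (e.g. 31 June).
        candidate = None
    if candidate is not None and candidate > recorded:
        candidate = None
    return False, candidate


# Mirrors the pattern reported earlier in the thread: indexed as 31 Aug 1863,
# but the record was written on 1 August.
print(check_birth_date(date(1863, 8, 31), date(1863, 8, 1)))
```

The suggestion is only a candidate for review. A record that uses “ieri” instead of a written date would need separate handling, since there is no day number to carry over.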
My in-laws are Italian so I have a soft spot in my heart for these records and of course would like to see more accuracy. That is especially true in light of the recent new restrictions making it harder to verify the indexing.
Thank you so much for your efforts to help us improve our resources and make it easier for users to connect with their ancestors. If you are able to document a significant number of common errors, I am more than happy to bring them to the attention of the engineers.
1 -
@Cousin Vinny Do you find that most often the complete name is not indexed—i.e. Domenico Cella vs. Domenico Fortunato Ettore Cella? Also, have you noticed if the gender is ever indexed for the two principals on marriages? Do you think it is important? I realize it is not on the form except for comparsi/nubile and figlio/figlia.
0 -
I haven't dug into marriages yet. I think the model for 1818-1865 births probably pulls names from the baptism column in the margin, and sometimes the text of the record includes multiple given names that are not always included in the baptism recordation in the margin.
Also on gender, I note sometimes it's wrong on births but the records specifically indicate either "maschio/maschile" or "femmina" so that seems like it shouldn't be missed as often as it is. Is the program making a judgement on gender based on the given name?
0 -
@SerraNola for the indexed entries from State Archive records, the images are not accessible on FamilySearch and so the indexed entry cannot be edited by users. I'm seeing these errors on a daily basis; is there a path to actually get the errors corrected when they are noticed?
1

