Large pages split onto two or more images/batches
Which entries should be indexed when a large page has been imaged twice, once covering the upper part and once the lower part, and the two images have many entries in common? One image is in one batch and the other image is in another batch (it can be seen in the reference images). There may also be side-to-side overlaps. I have opened the following:
Image Name: 004007127_00502
Batch ID: MSGZ-1JP
This is a very large collection of difficult-to-read entries, and I'm not inclined to index lots of entries that have already been indexed from another image.
Answers
-
That batch only has 1 image (the blue bar shows "Image 1 of 1"), so all of its records are indexed from that one image. You can use Batch > Return Batch if you don't wish to index it.
1 -
I am aware that the batch has only one image. The adjacent reference image contains most of the information that is in this image and will presumably be indexed by someone else. If anyone indexes this full image, there will be duplicates. That seems unnecessary and confusing.
0 -
Oh sorry ... in shared batches we can't view the Reference Images ... usually the 'first image' would index only the records up to the one spanning the 'pages' ... so the index would pick up after that record on the next image. Yes, I would need the Reference Images to help ... maybe someone at FamilySearch will respond more particularly.
0 -
Hi @David Iles
Please look at the citation below from the General Indexing Guidelines (GIG) - the last section of your Project Instructions.
Per the first bullet: You should not index records that began on reference image -1. That was the job of the indexer of that batch, and he should have consulted your batch image (his +1 reference image) to complete those records - as part of his batch.
Per the second bullet: Likewise, you should consult reference image +1 for your batch to complete any records that continue from your batch image to reference image +1.
Here is that Guidance from the GIG, which applies unless your Project Instructions (all parts and Field Helps) instruct otherwise:
What to Do When Records Span 2 Images or to View Additional Images
- If the first record on an image begins on a previous image, don't index it. The record will be indexed as part of the previous batch. Start indexing at the first complete record.
- If the last record on an image continues to the next image, index the entire record, including what continues to the next image.
- To see the next image while continuing to index information for the current image, do the following:
1. In the top corner of the image window, in the vertical toolbar, click the Reference Images icon.
2. Below the screen on the right, click a thumbnail next to the image you are currently indexing.
3. Index the record while viewing both images at the same time.
4. To exit split screen mode, in the vertical toolbar, click the Reference Images icon.
0 -
This still does not quite address the question but I think I see the way forward. The general guidance relates to entries/records that span two images, not the case when the record is wholly on one image and wholly on another image of a different batch. So I should index only the records that are not also on the previous, referenced, image (by that I mean the image with the top part of the page). Only the extra records in the batch image of the bottom part of the page should be indexed. Likewise, whoever is indexing the image of the top part of the page should not continue to the bottom of the page on the referenced image, even though it is the ‘same’ page. (Your ‘per second bullet’ rather suggests the opposite and would lead to duplicate entries.) Perhaps the guidance not to index referenced pages should clarify this, although it’s presumably unusual.
0 -
- If a record starts on reference image -1 and continues to your batch image, per the GIG, you ignore it.
- If a record starts in your batch image and continues to reference image +1, per the GIG, you index it, bringing up reference image +1 to collect its information.
- If a record is wholly contained within your batch image, you index it and don't need to look at any reference images.
- That covers all the situations, I think.
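The three cases above can be sketched as a small decision function. This is only an illustration of the rule as stated in this thread, not anything taken from the indexing software; the names `Action` and `gig_action` and the two boolean flags are hypothetical.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip: indexed as part of the previous batch"
    INDEX = "index from the batch image alone"
    INDEX_WITH_NEXT = "index, consulting reference image +1"


def gig_action(starts_on_previous: bool, continues_to_next: bool) -> Action:
    """Apply the 'records spanning 2 images' rule to a single record."""
    if starts_on_previous:
        # The record began on reference image -1, so it is not ours.
        return Action.SKIP
    if continues_to_next:
        # The record starts here and runs on to reference image +1.
        return Action.INDEX_WITH_NEXT
    # The record is wholly contained within the batch image.
    return Action.INDEX


print(gig_action(starts_on_previous=False, continues_to_next=True).value)
```

The function only asks where a record starts and where it ends, so it has no branch for a record that is complete on two different images.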
1 -
That is clear but will create duplicate entries as records are seen, in entirety, on two images. I had thought it might be better to avoid duplication of time and effort. Presumably there are systems to eliminate duplicates later. BTW, I found the adjacent images when only searching for the year date. I will return this batch without indexing.
0 -
I don't see where the duplicates come from. This procedure is designed to prevent duplicates. The indexer of the reference image +1 batch won't do any records that started on your batch image, and you didn't do any records on your batch image that started on reference image -1. And the records wholly contained in your batch are indexed by you alone, not by either neighbor. No double-dipping. No duplicates.
1 -
One final attempt to explain. I have returned the batch, so the following numbers are approximate. The original page had four columns, listing christenings, weddings and burials. Each column had 60 entries/lines/records, making a total of 240 records on the page. The page was too big for a single image to be made, so one image was made of the top section, lines 1 to 45 of all four columns, and a second image was made of the bottom part, lines 16 to 60. I was given the latter image and might have indexed 180 records (each was complete, though difficult to read). Someone else would be, or has been, given the first image and would also have indexed 180 records. But that means that both of us would have indexed lines 16 to 45, creating 120 duplicates, even though we had strictly followed your guidelines. I don't wish to waste time and effort, so I have now stopped indexing.
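To make the arithmetic concrete, here is a minimal sketch in Python using the approximate figures above (4 columns, 60 lines, a top image covering lines 1 to 45 and a bottom image covering lines 16 to 60). The function name `overlap_summary` and its parameters are made up for illustration, and it assumes one record per line per column.

```python
def overlap_summary(columns, top_lines, bottom_lines):
    """Count records per image and records visible on both images.

    top_lines and bottom_lines are inclusive (first, last) line ranges.
    Assumes one record per line per column (an assumption for this sketch).
    """
    top = set(range(top_lines[0], top_lines[1] + 1))
    bottom = set(range(bottom_lines[0], bottom_lines[1] + 1))
    shared = top & bottom
    return {
        "page_total": len(top | bottom) * columns,
        "top_image": len(top) * columns,
        "bottom_image": len(bottom) * columns,
        "duplicates": len(shared) * columns,
        "bottom_only": len(bottom - top) * columns,
    }


summary = overlap_summary(columns=4, top_lines=(1, 45), bottom_lines=(16, 60))
print(summary)
```

With these figures each image holds 180 records, 120 of them appear on both images, and only 60 records (lines 46 to 60) are unique to the bottom image.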
0 -
I get it now David. Thanks for being patient and trying one last time. I didn't look closely enough at the reference images to catch that. Yes, the General Indexing Guidelines on this issue assume that the images themselves are discrete from one another. It doesn't account for portions of adjacent images being shared/overlapping.
On the readability issue: would the following have helped at all? Please be honest - I won't take offense if you're not impressed. You can probably make out most, if not all, of what is in the original image, but would the enhanced version perhaps have lessened your fatigue - especially if you could see both at the same time, as in the presentation of the data below?
For those who are following the conversations about Rob Latour's FREE A Viewer For Windows (AV4W), this was enhanced using the 3x3 "Base" Filter Profile with the Factor increased from 1 to 1.1. In the next release of AV4W, you will be able to press/release the middle mouse button to toggle between the original and enhanced versions, thus seeing exactly what the filter did to each character and word in place. Here is a link to AV4W's webpage. Please watch the video if you're interested. You can view your results in AV4W while you are indexing without interfering with the Web Indexing program, and control Brightness, Contrast, etc.
https://www.rlatour.com/av4w/index.html?ar4w
0 -
Message now understood - great! How can guidance be given to indexers and reviewers about this? I have just opened a similar example and I think that there were about 100 lines and 5 columns on the original page and about 65 lines on each image, giving an overlap of about 30 x 5 = 150 records.
On the viewer, I have downloaded it and yes, it might help a little in some cases, but I think that when the original is fuzzy it will always need careful scrutiny and 'best judgement'. There's no 'magic bullet'!
0 -
That's true about there being no magic bullet, and the human eye and brain are amazing at seeing through the clutter. There are other FREE, more advanced auto-detection techniques that adapt themselves to a particular image. I tried them on your images and they went overboard, I thought. But when a more advanced method is superior to what AV4W can do, its result can be imported into AV4W and used while indexing, peeking at the original data as needed. I've shared links to those programs as well.
On the issue of getting guidance, I hope one of the moderators follows up and weighs in. But I suspect the guidance will be to index what you see - meaning even if some of it is done again by indexers of neighboring batches. One problem may be that a Reviewer may not think beyond the GIG if you leave off the overlapping information. Everyone needs to be on the same page (pun intended). I guess the GIG could be updated to include that situation.
1