US1910Project
Answers
-
@Paul W Glad you found it. There's also a lengthy paper with several links, including some to YouTube, covering the greater project that I found after I had posted yesterday.
1 -
The morphing of the US1910Project continues..
As has been brought up, the US1910Project has proliferated as they have added the 1900 US census to this project: US1910Project and USCensusProject
Now there's a third entity involved. Have you ever heard of the ROC? (R)ecords (O)perations (C)enter. There are at least 14 of these centers operating in the United States. Missionaries are called to work there to help provide more indexed sources, update standardized locations, work on the 1950 US census AND....add African American families listed on the 1900 US census!! So, if you come across their work, the USERID is: Community Census Project
Now there are at least three names for this extraction program:
US1910Project
USCensusProject
Community Census Project (African American families)
BTW, my husband works at one of the ROCs and was showing me how "wonderful" the app Goldie Mae was. I waited quietly until he displayed how this program added the "missing children" from a 1900 US census record to an existing FS record of the parents that had been generated from a marriage record. Then I asked him to go look for a possible duplicate set created from the 1910 US census record. He was horrified to find he'd just added the parents and three of the existing children AGAIN! He spent almost 45 minutes cleaning up his new duplicates and adding the alternative spellings from the 1910 US census.
There is not enough commonality between the records from the 1900 and 1910 US census records and EXISTING FS records to prevent duplication! 99.999999% of the women in FS do not have their married names added as alternative names. There's NO way you can match any census record to FS records where the women are only listed by their maiden names.
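To illustrate the maiden-name problem, here is a minimal Python sketch of exact name matching. It is not how FamilySearch or any of the census projects actually match records (I don't know what they use). The record layout, the `name_key` and `find_match` helpers, the ID and the names (Mary Smith / Mary Jones) are all made up for the example.

```python
# Hypothetical sketch: exact-name matching between a census entry and tree records.
# The record layout and all names/IDs below are invented for illustration.

def name_key(given, surname):
    """Normalize a (given, surname) pair for comparison."""
    return (given.strip().lower(), surname.strip().lower())


def find_match(census_name, tree_people):
    """Return the first tree record whose primary or alternate name equals census_name."""
    target = name_key(*census_name)
    for person in tree_people:
        candidates = [person["name"]] + person.get("alternate_names", [])
        if any(name_key(*c) == target for c in candidates):
            return person
    return None


# Tree record created from a marriage record: wife listed by maiden name only.
tree = [{"id": "HYPO-001", "name": ("Mary", "Smith"), "alternate_names": []}]

# Census entry: the same woman, listed under her husband's surname.
census_entry = ("Mary", "Jones")

# No match is found, so a naive tool would create a duplicate person.
no_alt = find_match(census_entry, tree)

# Same tree record once the married name is added as an alternate name.
tree[0]["alternate_names"].append(("Mary", "Jones"))
with_alt = find_match(census_entry, tree)  # now finds the existing record
```

A real matcher would also need to weigh dates, places and spelling variants, but without the married name on the tree record there is nothing for even the simplest name comparison to latch onto.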
4 -
So just to clarify, the first two projects you listed are affiliated with RLL, and the third one is affiliated with ROC?
Also as an FYI, the Community Census Project has expanded to include Alaskan Natives.
1 -
UGH! I am seeing the "fruits" (think rotting and moldy) of the USCensusProject meddling with the US 1900 Census. I have found duplicate records that THEY created just a few weeks apart (Aug 2022 and Sep 2022) based on the 1900 and 1910 projects for a family. They duplicated both the mother (under her married name) and two children who are in both of the censuses. This is so frustrating I can't even begin to express it (although I know everyone here is feeling the same pain). Have they not learned ANYTHING from the 1910 mess not to just repeat it with the 1900 census?
1 -
According to Professor Joe of RLL, they will not stop doing what they're doing because they feel their project does more good than harm.
I would recommend that everyone email Professor Joe regularly with all the merges that you have done as a result of the USCensusProject.
1 -
I'm Amy Harris, the program coordinator for the bachelor's program in Family History-Genealogy at BYU (housed in the History Dept). I recently became aware of this thread.
As has been noted above, the concerns you have with the census projects are under the purview of FamilySearch and the Economics Dept at BYU, so I can't help you with that. However, I thought this group could be interested in the family history training we offer within the major/minor and at the Center for Family History and Genealogy (https://cfhg.byu.edu/).
The bachelor's program combines traditional liberal arts training in historical research and writing with technical training in genealogical research and writing. Many of our students are employed at the Center for Family History and Genealogy where they work on compiling genealogical databases that are free to the public. They work under the mentorship of family history faculty and staff - all of whom have advanced historical/genealogical training and professional genealogical credentials.
For the group on this thread, you might be most interested in the Nauvoo Community Project, the Script Tutorials, the Immigrant Ancestors Project, and the Early British Census Project. I invite you to explore the resources at the Center's website.
If you have questions about the BA program, the Center, or what the FH students are involved in, I'd be happy to answer those via email: amy.harris@byu.edu.
2 -
FamilySearch representatives seem very reluctant to get involved in posts relating to this issue. I don't think even one has presented details of the reputed positive benefits to FamilySearch of these projects.
If the positive data (some of which we probably would not be aware of) outweighs the negative (mass duplication of existing IDs) then why can't we receive an "official" response that justifies the current breaching of Family Tree guidelines (on checking for existing IDs first) by an organisation who should be setting a good example regarding adding names to Family Tree?
4 -
Amy is correct, BYU Record Linking Lab has a Facebook page you can check out for more info.
0 -
I can only agree that the USCensusProject is more trouble than good with my family data. Just today I merged:
Grace Henrietta King Person Merged November 21, 2022
Edith G Beardon Person Merged November 21, 2022
Edna Mae Bearden Person Merged November 21, 2022
USCensusProject created these duplicate entries on Nov 11, 2022.
Someone please stop these people!
2 -
These Census projects do not "Grow the Tree," they create a mess. I too have had to correct dozens and dozens of duplicate entries. Dumping these often erroneous and always incomplete fragments of data from a single census record into Family Tree and suggesting that this "builds the tree" is like suggesting that people tossing damaged 2 x 4's on top of a foundation are contributing to "building a house". It is worse than not helpful. Skilled personnel then have to take the time to clear away the mess before they can actually build. And then people come along using programs like Hope Chest to hunt for newly added "green temples" and dump 1,700,000 (actual number reported by one user) "names" into the temple system without doing any sourcing or duplicate searches. The result is a tremendous amount of wasted effort at every level - not to mention a fragmented and inaccurate tree which discourages the engagement of the very kind of skilled researchers, members and non-members alike, needed to clean up the messes. Do the Brethren really want FamilySearch Family Tree to be known as the garbage "One Tree" concept - because the only metrics that matter are the numbers? Ten random piles of 2 x 4's are better than 1 well-built home? Really?
4 -
If every time we encounter a census project mess, we message them and inform them of our displeasure, maybe it will stop. If they got one message for every duplicate tree fragment they created, they might understand just how much useless mess they are making. Keep the messages nice, and leave the mean words you really would like to say in your head.
I really don't like strangers messing with my family's records. They are stealing the fun of finding a new record and turning it into a duplicate merging nightmare. Attaching family records should be left only to relatives and even in-laws, but never to strangers.
2 -
They are now creating FS/FT records from the 1920 US census.
Yesterday I discovered my first 1920 census duplicate record from the RLL with a creation date of Nov. 8, 2022:
and YES! this person already exists in FS! They are doing no favors...just more time to be spent cleaning up the database.
2 -
Yes, they recently expanded to the 1920 census as well. Professor Joe said that they have some sort of new computer program that will cut down on creating duplicates. However, I don't see that the duplicates have slowed down. I regularly email him the duplicates I've had to merge. I told him that there is no substitute for good old-fashioned manual searching to see if an individual already has a record created. He's a good man who only wants every soul to be accounted for in FamilySearch. He and I just disagree on the methods to do so.
2 -
So - do we now have competing CensusProjects creating more duplicates?
1 -
I don't know what PID this is for so I can't verify. But if you look at the Sources, you will probably see a different Census (1910, 1920...) for each of the Sources that correspond to the various Census projects being added to the tree.
0 -
*Two competing projects creating duplicates which I have just merged. Not adding value.
5 -
Yes, I've just run into another duplicate branch in my family created by the Census Project. The names weren't even copied correctly - Eva in the census became Esther as part of the duplicate branch.
2 -
I've already gone through like, ten households at least, just in the past three days.
Anyway, a thought occurred to me: has anyone ever seen the USCensusProject account actually correctly link a census record to existing people in the tree? One would think that if they were following proper procedure (viz., searching to see if a person exists before creating a brand new person), at least some of the time that search would result in a match, but as far as I can tell, I've never seen that happen (other than as the result of a merged duplicate).
4 -
@JD Cowell They only create new Persons in families, and do not attach Sources to existing families in FT.
0 -
Can you please explain the purpose in only/always creating duplicate profiles that someone else (us!) has to merge/fix?
4 -
Have Community moderators given up on reading these posts? If not, please get a message through to the leader of the CommunityCensus Project to advise that, like the US1910 "version", the work on the 1911 England & Wales census is just creating more work for everyday users of Family Tree.
Take a look at https://www.familysearch.org/tree/person/details/GNN6-27W. Not a perfect example, as the earlier created ID does carry a different middle name, but it only took me a short time to discover this was a duplicate. Worse still, the spouse has been added (as has been reported in so many other similar cases) using the same surname as Samuel. It took me literally less than a minute to find her maiden surname was SMITH!
It's easy to blame the volunteers for creating all this mess, but the project leader should be held accountable for either lack of supervision of the participants, or for the project instructions being totally flawed. Do they really advise to check for duplicates - and not to add wives in their married names?
Oh, and non-standardized placenames are being ignored - just copied as they appear in the census index.
3 -
Paul, most adds I've seen have had the places standardized - that is the expectation. Maybe this one place in the census data was not able to resolve to a standardized place. Yes, and since it's a Census Source it takes work to go find the maiden name, and hopefully these get backfilled quickly. I don't have any involvement, other than collecting user feedback, so examples are great.
0 -
Example:
GNX6-BSG created from Eleanor Cole in household of John Cole, "England and Wales Census, 1911"
As the wife of John Cole, "Cole" is clearly her married name. That does not rule out the possibility that it was also her maiden name, but the chances are slim. I have deleted the surname from the profile.
Birthplace entered (presumably copied) as "London Lambeth, London" and the resulting red "Non-standardized Place" warning ignored. The same is true of all the other family members, although the birthplaces are different: Carmathen, Carmarthenshire; Pembroke Haverfordwest, Pembrokeshire (x2); and Morriston, Glamorganshire. In each case the correct standardised option is the first one on the list. So it's not one place in the census data that doesn't resolve to a standardised place. It's sloppy work, poor instructions or lack of supervision. Or more likely a combination of all three.
As has already been said by someone else, I have yet to find one entry by "CommunityCensus Project" done correctly.
2 -
Great to know you are still around, as your work on the GetSat site was so helpful to many of us, so your regular participation here is sadly missed.
As you can see, there is very little positive feedback here about these census projects (US1910, CommunityCensus, etc.). So, please try to direct these issues to whoever is running them, whether Professor Joe, or whoever.
Also, is FamilySearch management even aware of the damage being done here? It makes it very difficult to convince everyday users to take care in not creating duplicates when these projects are producing huge amounts of them.
Sam Sulser has told us there is a positive side, from which FS management is gaining useful data. She said she would get back to us on this, but we have still not been provided with any idea of the specific benefits being produced by these exercises. All most of us can see is the shoddy work, including a lack of checks and effort to ensure these new IDs are accurately titled and are for individuals not already on the Tree.
In my example of Samuel Tungate, I had no problem in finding his duplicate and his wife's maiden name (have volunteers been given FreeBMD as a straightforward source to discover these?), or in establishing the correct standard placename for the one sloppily copied from the index.
Most of us really care about the Family Tree project and are working hard to achieve as much accuracy as possible, as well as working on the (highly ambitious) aim of seeing there is just one ID per deceased person. We follow the FamilySearch instructions as far as possible, then along come these authorized side-projects, which are producing results that are completely counterproductive.
Surely someone can be convinced of the negative impact these projects are creating and, more importantly, put a halt on them until there is evidence they are being properly supervised and operate in line with Family Tree guidelines?
6 -
@Paul W Thanks Paul. I miss GetSat but do follow a few threads here on Communities, especially ones that impact the user experience, like new page roll-outs and data issues like those the computer-automated Census projects add to FT.
You have a pretty good handle on these projects and what management is thinking. For these Census projects we are at the mercy of Census record data, which is not as good in quality as, say, BMD-type records. And the maiden name is a typical problem with Census records. We are all caught in the balance of desiring greater tree data (persons, relationships and Sources), balancing quality and quantity, and, on top of that, desiring more users contributing and experiencing family history connections.
I can't say much more than has already been communicated. On the whole, Census duplication is a problem, but it occurs with all products adding to the tree, and I track duplication rates for each. The Census duplication is a bit different in that it touches some users' trees more than others, because of the locale and time period of the collection.
I have been tracking the concerns and data issues, and I voice my opinion to whoever will listen. The biggest help to me is to keep giving examples and how much work it takes users to fix the mess-ups. In the end good data leads to good user experience, bad data to bad user experience. Thanks.
2 -
GIGO --"In the end good data leads to good user experience, bad data to bad user experience." Your assumption that the indexed records are "good data" is the first problem here. There is NO substitute for research. No matter how many records the project adds, someone still has to do the research. That takes time.
I've not been posting comments about the over-the-top frustrating experience I've had with finding duplicates and the hours and hours I've spent correcting the Census Project data because notifying you or Joe Price about it takes so much time away from the research I'm doing.
A situation from this week stands out however, so indulge me while I share my frustration.
While I was doing more research on Frank Muck and Margaret Klekotta, I saw that another researcher (not a volunteer User ID) had added "Helen Habret" (GXGW-YH2) as a daughter in their family using the 1920 US Census as a source. The researcher did not look at the original census image, but only at the index. And the index is wrong! The original census image is clear: Helen is part of the family of George and Pauline Habret (sic) on the same page.
The US Census project added George (GJ48-DP3) and Pauline (GJ48-839) on 22 November 2022 without Helen in the family.
But wait! Those records for George and Pauline were already duplicates! Their Pennsylvania marriage record was used to create records for them and added to the FamilyTree database in 2013. It turns out that George's first name is Wojciech and the surname is spelled HABRAT. And just by doing a bit of research, I found out that George also went by "Adalbert Habrat". Oh, and Pauline's maiden name is "Rice" (this fact is more than a little bit important).
But there are more duplicates! On 19 April 2021 the USCensusProject added the family of "George Hobrop" (GC1S-JH6) and wife Pauline and family using the 1910 US Census.
Could the bad data get any worse? Yep! For 1910, you made up a person as a child of George & Pauline: "Ada Hobrop" (GC1S-KXQ). Although that name appears on the 1910 US Census as a daughter in their family, THE PERSON DOES NOT EXIST BECAUSE THE CENSUS IS WRONG. The entry for "Ada" is for their SON Walter James Habrat.
To recap:
*Data on the 1910 US Census is wrong
*Data on the 1920 US Census is wrong
*Data on the 1920 US Census is indexed incorrectly
*The CensusProject added a husband/wife twice and not once did the application identify that there were duplicates
*You missed a son (1910)
*You made up a daughter (1910)
*You missed a daughter (1920)
There is NO substitute for research.
4 -
The CensusProject added a husband/wife twice and not once did the application identify that there were duplicates
This.
The operator doesn't find the duplicates because the application doesn't standardize the census year and place, which is an additional complaint. From the video clips it appears the operator could standardize them, but is neither expected nor taught to do so. I imagine most project participants are actually engaged with their phones while the application on the desktop in front of them runs effectively unsupervised.
2 -
In case it helps with aggregating just how much effort cleaning up this project's byproducts takes, lately I've been starting each Person merge summary with "Another USCensusProject duplicate". Don't know if it's possible to run a search on that (probably not considering the size of the database).
If it would be more productive, I could start keeping a list of how many I've run across and sending those over... but I don't want to take even more time to do that if it's not going to make a difference.
3 -
I have been tracking Census duplication. Please provide the PID and, for bonus points, how much time you spent fixing it - either here in this thread or message me. Let me know here if you message me so I can check my messages. Thanks - - Joe
1 -
Duplicates entered by US Census Project
The surname inscribed on the headstone for the following family is GRONKOWSKI. Baptismal records for their children that are found in FS indexed sources have the same surname. The 1900 and 1910 U.S. Census records are guess-and-go spellings (and why is the project even adding new records using the 1900 US census?).
GFLY-T63 and GC82-9K9 (Friedrich Gronkowski--GV4Z-4WH)
GFLB-69V and GC82-9KW (Karolina née Poburski--GV48-ZGD)
GFLY-YF3 and GC82-2N5 (daughter Gertrude/Gretta)
GFLY-B39 and GC82-QYV (daughter Ida/Edith)
1