Is the user "TreeBuilding Project" taking the tree forward or wasting time?
Today user "TreeBuilding Project" added a source record index https://www.familysearch.org/ark:/61903/1:1:6KWF-ZSBL (which is from the "United States, Social Security Numerical Identification Files (NUMIDENT), 1936-2007") to profile GFS1-ZS6, Amelia Matthews.
Now, this is actually quite an interesting index record - it implies Amelia married 4 times (to Messrs Osborn, Leonard, Ness and Empson). Her profile already has the marriage to Roy Ness - but also a marriage to Glen Wrighter Harbaugh. So that's potentially 5 marriages implied by the existing profile and the newly attached source index.
And what has user "TreeBuilding Project" done with this (genuinely) intriguing information? For Amelia's own profile, they've added a dated custom event "SocialProgramApplication", an undated custom fact "Social Program Correspondence", a custom fact "Race" (value white) and an undated Custom Event "PreviousResidence" of "Seattle, King, Washington, United States".
They have ignored her death date of June 1994 which is there on the record.
They have done nothing with the newly implied spouses - although that could be a tangle.
They have created 2 custom events/facts ("SocialProgramApplication", and "Social Program Correspondence") that I would regard as genealogically irrelevant. (The source record created by the events isn't irrelevant - just the events.)
They've created an undated Custom Event "PreviousResidence" of "Seattle, King, Washington, United States" when the correct input should have been an undated (standard) Event of type "Residence" of "Seattle, King, Washington, United States". (This mistake I find appalling - creating a Custom Event instead of a Standard event is nonsensical).
I asked whether "TreeBuilding Project" is taking the tree forward or wasting time.
- The Custom Event "PreviousResidence" needs to be corrected to a standard "Residence", so that's wasting time for the people coming after.
- The creation of genealogically irrelevant events wasted the time of "TreeBuilding Project" (assuming it is a real person, which I beg leave to doubt).
- One big problem is that the death date has been ignored - since the source index containing the date of death has already been attached, what - if anything - will prompt a careful researcher to find that date? Potentially nothing - so more wasted time.
Then there are the implied husbands that have been ignored - to be honest, that could be a complex story and I would advocate leaving that to one side awaiting serious research into that family. Nonetheless, what will prompt such a person to look at the NUMIDENT source again? Has the opportunity for such a prompt been lost? I'd have added something to the Collaborate tab along the lines of "Apparently also married to …."
Answers
-
"TreeBuilding Project" has been attaching MANY NUMIDENT sources. I see his/her work almost daily in profiles on my following list.
It seems that this user may be part of a group of users we have discussed a few times who are, apparently, focused on the NUMIDENT record set. Since NUMIDENT is a US-centric record set, it's interesting that many users in the group have names that appear to be from another country.
I've even wondered if "TreeBuilding Project" may have multiple users because I've seen so much activity by that username.
Some relevant threads:
2 -
Thanks Adrian Bruce for your comments.
Your examples are noteworthy. Another couple of examples are Custom Events such as "Residence 1949 United States" from the 1950 Census, "Residence 1935 Same Place" from the 1940 Census or "Membership" or "Residence 1940 World" in connection with the LDS Worldwide Census. By the way, I totally agree with your comments about Custom Events for Social Program entries from NUMIDENT sources. I don't believe it is genealogically significant to know when someone applied for Social Security or made a Social Security Claim. Another consideration is the fact that Family Tree uses Custom Events in "computer generated" life sketches in the About page. I recently learned that my relative served in the "Northern Sudan Mission" in 1903 (in fact it was the Northern States Mission, but the place in the Custom Event had not been standardized). So one thing leads to another and sometimes the end result is less than desirable. Also, someone recently commented on how people copy and paste life sketch information into other websites, which I have also witnessed.
Sorry for the rant, just my thoughts.
DL Melville
2 -
"I don't believe it is genealogically significant to know when someone applied for Social Security or made a Social Security Claim."
I have to disagree with that statement. Knowing when someone applied, especially in the earlier years of the Social Security program, can give us good intel about when someone started working. Especially for women, that can be important to that person's history. And when the claim was made can also be significant, if it's long before or after standard retirement age.
4 -
I think that my opinion about the relevance of the applications is that the application event itself is irrelevant in a genealogical sense - but the generated source record can be immensely valuable for its implications.
Indeed, I am beginning to suspect that these NUMIDENT files were deemed by the whatever it is project, to be simple attachment jobs - whereas the implications of the recorded events are actually the exact opposite of simple attachments. They have, I think, all sorts of complex implications that can't be confirmed without a lot more work (such as several potential husbands). Not at all easy to do a complete job on the file.
2 -
I suppose, in the collaborative environment of the FSFT, one user/group attaching the basic record, while leaving the details to someone with more time or experience, may not be all bad. BUT, I worry that we'll see more bad choices/attachments when the project/user takes a simplistic view of same-name records, as in that O'Connor family in my earlier thread.
1 -
It should be pointed out that it's not the user making the choice about custom versus standard event: Source Linker takes it upon itself to make those decisions, and users cannot change it. They can only choose whether to add the conclusion or not.
In fact, about 90% of the complaint should be addressed to the Source Linker programmers (and/or whoever's in charge of the interface between NUMIDENT and Source Linker, or between indexed collections in general and Source Linker). The user is just blindly clicking through, accepting everything that Source Linker offers — which is a problem, but a well-known and highly-predictable one that the programmers should be designing their tool around, rather than exacerbating with dubious-value conclusions.
4 -
@Julia Szent-Györgyi - rightly or wrongly I have attempted to raise this in the New Source Linker Feedback group.
(I failed totally to link the new thread to this thread as a whole - the Community software insists that I really want to link to the last comment in the thread, not the thread as a whole. How stupid of me to want to link to the wrong thing… Sarcasm off.)
1 -
If "Tree Building Project" is a FamilySearch backed group of volunteers and not just a user with a deceiving user name, then part of the trouble with deciding whether the project is an advance or a detriment will be due to the fact that the vast majority of the volunteers might do excellent work while a couple of them are disastrous, just like with indexing.
If this is a FamilySearch effort, I would wish for a couple of things:
- A link to project instructions so we could know what the volunteers are supposed to be doing.
- That FamilySearch keeps track of what each individual volunteer does.
- The ability to message the project supervisors regarding any individual source attachment so that if there were glaring problems, they can go back to the volunteer that messed up and either give that person more training or, if a recurring problem, give that person a different assignment.
7 -
I did a quick Google search on Tree Building Project and found this:
https://www.familysearch.org/en/fieldops/roc/vroc-projects-building-the-family-tree
1 -
So it is a FamilySearch project. But it appears that no information about it is accessible to users of Family Tree, and no feedback button is available for users to express concerns about what effects the project is having on their relatives' profiles in Family Tree, in particular whether or not a particular volunteer is actually following the project instructions.
2 -
About 6 weeks ago, the issues with the TreeBuilding Project were raised in BYU LinkingLab's Facebook group.
(I don't have a FB account and was able to view the thread after clicking the X.)
https://www.facebook.com/photo.php?fbid=962744598985123&id=100057487747623&set=a.549755300284057
In the thread, a FS researcher was highlighting how doing careless record attachments in bulk creates a massive amount of work for researchers to clean up. Examples included how the TBP was interacting with existing orphan and duplicate profiles created by the USCensusProject, making those difficulties even more of a challenge to iron out.
BYU LinkingLab's response was that their work attaches millions of records that otherwise wouldn't be enjoyed by families.
This assertion by LinkingLab hints that their overdriven programs are what stands between those records and permanent obscurity. More clearly, their response fully ignores that these records were already being attached.
If misguided treebuilding and census projects had never existed, family history would not only proceed, the work would often be a much lighter burden than it is now.
2 -
There might, might be a place for these LinkingLab projects, one day. A variation of them, anyway.
Eventually we'll reach the future where all the low-hanging fruit has been plucked; where the difficulty of what remains is stalling progress. Here I can see the value in directing teams of experienced researchers to unearth and work with what remains.
Contrast that with projects that set mobs of inexperienced researchers against classes of records, with the result being an absolute mire of profiles created/dominated by beginner mistakes.
Do that on a massive enough scale and the experience for the researchers can be fairly described as continual and distressing or maddening.
2 -
Sometimes I think there could be a preliminary Wild West Tree into which these automated feeds could flow, with a 'promotion' process that only allowed changes into the main Tree automatically if very specific rules were met (e.g. don't allow anything that would reduce data quality scores, don't introduce anything that even could be a duplicate), with manual confirmation by an experienced user being required to accept/reject the rest. This could all apply to gedcom imports too.
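To make the idea a bit more concrete, here is a minimal Python sketch of the kind of 'promotion' check I have in mind. Everything in it is hypothetical: the field names, the quality score and the duplicate flag are assumptions for illustration only, not anything FamilySearch or its APIs actually expose.

```python
from dataclasses import dataclass


@dataclass
class ProposedChange:
    """One staged change from an automated feed (all fields hypothetical)."""
    profile_id: str           # identifier of the profile the change targets
    quality_delta: float      # assumed change in a data quality score
    possible_duplicate: bool  # True if the change could create a duplicate


def triage(change: ProposedChange) -> str:
    """Return 'auto_promote' only when both rules pass, else 'manual_review'."""
    if change.quality_delta < 0:
        # Rule 1: never auto-accept anything that lowers data quality.
        return "manual_review"
    if change.possible_duplicate:
        # Rule 2: never auto-accept anything that even could be a duplicate.
        return "manual_review"
    return "auto_promote"


def split_feed(changes):
    """Split staged changes into (auto-promoted, held for an experienced user)."""
    auto, review = [], []
    for change in changes:
        if triage(change) == "auto_promote":
            auto.append(change)
        else:
            review.append(change)
    return auto, review
```

The point of the design is that the default is manual review: a change only goes straight into the main Tree if it passes every rule, and anything doubtful waits for a human.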
1 -
Does anyone know if the various users under the name "treehelper" are part of the same project, another BYU project, or unrelated? The treehelper name is usually followed by an underscore, a letter, and a number. I've run into a few different ones.
In the case of at least one family in my research, I've had to repair the same family unit twice - once after one treehelper visited and again after another treehelper visited.
0 -
Just an FYI. If you see the names USCensusProject, Tree Building Project or Community Census Project these are AI (Artificial Intelligence) programs that are run on records. They are not actual people. It is a computer program that performs a variety of advanced functions such as reading written language and analyzing data. With the billions of records that need to be indexed, AI is the best way to read and analyze those records. Records that it has difficulty reading, e.g. a person's cursive handwriting, get kicked out for a human to review and attach. There are thousands of TRAINED volunteers that review and attach records daily to the public FamilySearch tree.
Regarding the NUMIDENT public records: these records contain a VAST amount of useful information. They list both parents and almost always have the mother's maiden name, which is sometimes difficult to find when no parents are on the record. They give the birth and death dates of the person and in most cases the city and state where they last lived.
0 -
@KAClark2 Thank you for your clear communication of the situation.
'Read and analyze' is great, automatically update the Tree less so, judging by the many comments and multiple threads here describing the corruption and time-wasting caused by the ministrations of these algorithms.
I think you will find that the automation of updates has a bad name within the FS user community (not just because of this - see also, for example, the many placename problems).
2 -
We seem to have conflicting information about the makeup of these projects.
4 -
@KAClark2 - the use of AI to read and analyse records is fine providing it's confined to creating Historical Records. It seemed to work well for the 1950 census for whichever company it was… It's the attaching to the FS FamilyTree that is causing issues.
You mentioned the NUMIDENT files - that was where I came in with my original post. I lost a little faith in the work done by whoever or whatever when they updated the profile in question, when the Death Date from the NUMIDENT file was ignored.
4 -
@Adrian Bruce1 Thanks for the comments. Yes I agree that the AI is not perfect. It is a program that scans and then translates the information. How accurate the result is depends on what the program can read. Yes there will be errors and mistakes when it reads documents and tries to match them up, just as humans make mistakes. We can only hope that the AI programs will improve as they 'learn' how to read and match the records.
I agree on the NUMIDENT information. The death date should have been changed/fixed. I tend to believe the birth and death dates from NUMIDENT and Military records before I count on a Census Record. I find those 2 record types are not always correct, but they are more accurate than other records. Keep in mind that the older the Census Record the less accurate it 'might' be. People simply did not keep track of when they were born or how old they were. It also depends on the Census Taker as to how much care they put into recording the correct information. So many variables can make records inaccurate or seem to be inaccurate.
I tend to take my time when attaching sources and records to make sure I have what I feel is the most accurate information possible. The more accurate the information, the more sources can be located to make a more robust profile.
Great ideas and input from everyone on this topic.
(name removed)
1 -
@KAClark2 Mod note: Community is a public online forum. For your privacy, your post was edited to remove a name that is not part of your username. Please see the Community Code of Conduct for more details.
0 -
@KAClark2 In search of some additional clarity, I'd like to revisit these statements again.
- "Just an FYI. If you see the names USCensusProject, Tree Building Project or Community Census Project these are AI (Artificial Intelligence) programs that are run on records. They are not actual people."
- "There are thousands of TRAINED volunteers that review and attach records daily to the public FamilySearch tree."
Number 1 is interesting but doesn't seem to bear much relevance to the overall topic, the final results of these projects. Most notable are the millions¹ of profiles that the USCensusProject left behind for volunteers to clean up - and the vast number of hours that volunteers are having to spend on that (for years to come).
Can you better explain the impact that AI had on USCP output?
For number 2, it isn't clear to me who is being referred to. I've not heard that BYU Record Linking Lab has thousands of volunteers². Certainly the leavings of the USCensusProject don't seem to fit output from trained volunteers. Which specific group is being discussed here?
¹ The actual # of profiles created by the USCP (1910p, etc) would be welcome info. So would the # of profiles attached to by the USCP.
² I could be wrong. Corrections are welcome.
4 -
One question I've never heard asked is this.
Is there a reason why these super-capable projects can't revisit records impacted by the USCP, etc and do the skilled work of properly reconciling records, profiles and families?
Resolving the issues that USCP left behind is an often complex task. Placing that entire burden on rank and file FS volunteers (experienced and not) doesn't seem like the best option.
That responsibility seems like a natural fit for thousands of trained project volunteers, however. I would argue they are the best possible fit to work on that.
What is the reason this massive effort isn't being handled by the folks best positioned to handle it?
4 -
At RootsTech there was a session on FS' data quality strategy, but only in-person attendees had any access to what was discussed, despite requests for more information.
I nonetheless find it difficult to believe that this large scale hit and run subversion of FT profiles and relationships can be in line with /any/ DQ strategy.
@KAClark2 can you comment on this?
3 -
I just looked at @KAClark2's profile and see she is a recent joinee. Nothing in her other comments indicates she does or does not have ties to BYU Linking Labs or knows any info beyond some project PR.
My questions are loaded with hard history and the correct recipients for them are folks with connections to the BYULL.
If I have helped overload an innocent bystander who is still finding their feet here - I apologize.
2 -
Well, it's started for me. Most of this afternoon has been spent merging recently created orphan profiles from CommunityCensus Project and CommunityCensusProject1 into existing profiles - including into profiles created by the USCensusProject.
I really, really wish it wasn't okay for these groups to heap hours and hours of needless extra work on us. I really, really wish I wasn't looking forward to more years of this.
Their pile of new problems is growing so fast I'm seeing it hours after creation. This is beyond depressing.
4 -
Just today I hit
TreeBuilding Project (also per the OP)
CommunityCensus Project
CommunityCensusProject1
Each left duplicate/orphan profiles and nothing else. I reached out to see if I can learn who/where they are.
Do you know of any others?
0 -
@No one in particular I've noticed a couple of weird accounts with what appear to be one-off usernames. I messaged one ("treehelper_a239"), which had a profile picture and location (Nigeria). At minimum I guess that could be more accountability than an anonymized project account that multiple people use, although I did not hear back from him (he created a single person 7 times and did not merge or delete the duplicates, and did not correctly set the birthdate on any of them). But that's not the only treehelper_### account I've seen.
1 -
I've been poking around in the BYU Linking Labs website
https://record-linking-lab.byu.edu/
I found a couple of interesting links:
https://www.facebook.com/BYUrecordlinkinglab/
I quote: 'The [Power Linker, see next link] tool helps attach 300,000 Numident records to the Family Tree each week.'
https://powerlinkerlite.rll.byu.edu/
This seemingly puts up a random BYULL-generated hint - note the 'Attach All' button, which I have not tried. Note also the complete lack of any context, guidance, or warning on the screen. (It also presents FT information on the screen without obliging you to log in to FS.)
It looks as if they have full access to the FT APIs (BYU are listed on the Solutions Gallery, though not in relation to this activity).
It would appear to me that they are developing their own record hinting tools (fine) and letting random volunteers use them to update the live FT data.
I do wonder how much of this is known to the FS authorities.
3 -
I had this same experience yesterday and the account sounds the same - but I'll have to go back and look up what follows the _ in the username. Thank you for bringing that to the table.
0 -
@No one in particular I queried the treehelper profiles sometime ago. See:
2