Is the user "TreeBuilding Project" taking the tree forward or wasting time?
Answers
-
Well, I have now had what feels like a very positive response from FS Support (despite the 'hopefully' and 'eventually'):
We have not forgotten about you even though we are somewhat slow at some things.
Your message has been forwarded to the FamilySearch team that will follow up with the BYU RLL.
FamilySearch does not control the RLL but there are hopefully some influences that will eventually make a difference in the issues that you have described.
I have said thank you very much and requested that the team concerned keep us posted. We'll see.
1 -
Rather less positive message overnight:
Your reply message is appreciated, but, unfortunately, there are so many links in the chain between you and your message to FamilySearch and the BYU Linking Lab, we do not have a pathway to "keep posted". We suspect that you may not see any changes real soon, but on the other hand, the Linking Lab is a very active place and we cannot predict what their response will be. If we do learn, though, we will let you know.
Suppressing my irritation at 'we do not have a pathway', I am inclined to leave it 2 months and then, if there is no visible progress and we have received no update, try an abuse report.
1 -
I'm sure it's not that many of us don't appreciate your efforts in highlighting the damage these projects have inflicted on Family Tree; rather, we have become so exasperated at the lack of recognition of that damage that we "gave up the will" long ago. So, whilst I admire your seeming indefatigability with regard to this issue, as long as FamilySearch management take a positive view of this work I doubt whether any amount of negative reports from us will help to improve the situation.
In theory, this work (in the form of "stand alone" projects) could have provided excellent databases, but - as we have discovered - linking the data directly to Family Tree has just caused us too much grief!
4 -
I had to stop reading at page 3 in this thread and restart at page 10, so I’ve missed a lot. However, having had some experience, it seems to me that what this whole discussion points to is that FS etc. have decided that people-type users are not getting the job done. We have too much experience to create the quantity of records/profiles that FS etc. desire for “production.”
The “solution” to the problem of “production” is now AI and all that that entails. Get the production, learn from the mistakes made and produce more AI programming to fix problems.
Hang the people-type users out to dry and move on down the line. This is now the way of big business, and after all, that is what we’re talking about here.
PS. I no longer have the desire to fight the infidels. It's better to hide in the bushes and wait.
0 -
@Tex Lawrence There seems to me to be a big difference in practice between FS' AI initiatives and BYU RLL's efforts.
Despite teething troubles, we are beginning to see FS' internal work having a real positive impact on FT data quality; their solutions don't update FT, but they do enable users to do a better job themselves. Full Text Search seems to me to be an unmitigated success; and from what I have read and seen, the Computer Assisted Indexing work is far better planned and implemented than the rather scattergun front window given to it by Get Involved would imply.
BYU have their own AI projects fed by FS Records and their indexes (Censuses, Numident). Initiatives like their Census Tree analytics tools are not a problem from my perspective. What is a problem is that they (possibly encouraged by FS, but we don't know this) at some point decided it was a good idea for teams of student volunteers, under what seem to me to be completely inadequate project controls, to update/create millions of FT profiles automatically using their AI models and the FS FT APIs.
3 -
A new twist - I spent most of yesterday untangling a treehelper mess. This time, instead of duplicates, at least 3 families, in 3 different locations, were conflated into a single family.
Sigh.
0 -
Oh dear, can't help thinking that's worse if anything (and not conducive to BYU's believably finding random people at state fairs on the Tree, either).
0 -
I haven't run into a blended family like that so far. I mostly find incorrectly attached records and/or duplicated individuals. I agree with Mandy. I think this activity of adding masses of people to the tree is not accomplishing the goal of both discovery and continued activity in the tree. People may "discover" their family member on the tree, but do they come back? I would bet the answer is no. Why should they? The work is done (or at least it appears to be done). And it's become clear, at least to us, what a mess the projects are creating.
0 -
I've just been doing my monthly check of BYU RLL's activities against the profiles in my newly refreshed analysis database. A warning that the TreeBuilding Project user appears to have started doing merges (e.g. see GLBB-XDY), presumably via the documented Merge API.
0 -
Today, the TreeBuilding Project user has been let loose on various World War II Draft Registration Cards collections. It has been creating MilitaryDraftRegistration custom events and Residence events. Because the Residence source information typically contains only the first component of the place (albeit that it is often the same as the first component of the custom event), these Residence events are being created with no standardized place.
This is having a significant impact on the Verify Places activity. Although that activity does not allow you to choose United States, that does not stop the activity from presenting these events in their hundreds for verification. This is going to discourage me from using Verify Places.
2 -
Surely the TB Project should be creating valid Residence events, that is, valid by the GEDCOM standard? In other words, the events should include everything, right through to "United States".
If most of the data is missing for those Residence events, they should not be allowed through. I call that incompetence, and not accidental either.
1 -
I suspect none of the APIs do that much checking of the content submitted (I have never seen any sort of 'invalid content' status shown as a possibility in the API documentation). And the level of oversight over the volunteer students doing the TreeBuilding Project scripting appears to be very limited.
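A minimal sketch of the kind of pre-submission check being asked for above, assuming the script holds each Residence event in memory before calling the API. The `ResidenceEvent` class, the `residence_problems` function, and the three-component threshold are all hypothetical illustrations; none of them comes from the FamilySearch APIs or from BYU RLL's actual scripts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResidenceEvent:
    """Hypothetical stand-in for a Residence event about to be submitted."""
    original_place: str
    standardized_place: Optional[str] = None
    date: Optional[str] = None


def residence_problems(event: ResidenceEvent, min_components: int = 3) -> List[str]:
    """Return the reasons an event should be held back (empty list = looks OK).

    min_components is an assumed threshold, not a FamilySearch rule.
    """
    problems = []
    # Split "Town, County, State, Country" into its non-empty components.
    parts = [p.strip() for p in event.original_place.split(",") if p.strip()]
    if len(parts) < min_components:
        problems.append("place has too few components")
    if not event.standardized_place:
        problems.append("no standardized place")
    if not event.date:
        problems.append("no date")
    return problems


# A bare street address with nothing else attached fails all three checks.
print(residence_problems(ResidenceEvent("1266 Main St")))
```

An event that returns a non-empty list could be queued for manual review instead of being written to the Tree.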
1 -
@MandyShaw1 - I can cope with the concept of the submitted residence not being a standardised place - after all, that's how everything was to start with. But the idea that the submitted residence is just something like "1266 Main St" with no indication of the town / city, is appalling and shows that the people concerned don't understand the first thing about what's supposed to be in the Tree. (Please note that if the fingers on keyboards have never been taught this, I can't blame them, it's a management issue).
3 -
@Adrian Bruce1 Precisely - the lack of management oversight has to be the worst thing about this whole mess, and the thing that both FamilySearch and BYU RLL are most at fault for. And, if these students are looking for careers in IT, they are going to need to learn better practice very quickly.
2 -
More on the meaningless Residence values for WW2 Draft Registration Cards.
In my analysis database there are 4 profiles that had these sources added by TreeBuilding Project on 6th September. All have these meaningless Residence addresses, of which all but one have the 'Non-Standardised Place' message showing.
However, that one (G6Z1-8FK) was subsequently (7th September) changed by the FamilySearch user to add a standardised place. This is something that appears to have happened to nearly 2,000 Residence events on profiles in my database (mostly untagged), with the updates occurring from June 2021 to date. Looks like there has been a FS bot running regularly against FT tidying up missing standardised places. Maybe BYU are assuming FS will clean up after them in this way, given that they can't be bothered to put an accurate value in place to start with (noting that full Residence address information is definitely present in the record persona metadata).
(Incidentally the only other Residence tags I can find in my database that were added by TreeBuilding Project are Numident ones, and they appear to have had sensible place names in place from the start.)
2 -
Some more general research on the TBP WW2 Draft Registration automation.
I have 97 of these in my analysis database, with the sources added (all to existing profiles) between 6th September and my latest refresh date of 21st September.
First I had a look at how well the source attachments actually matched the 97 profiles, using the relevant Record Hints as a basis.
1 of the attachments, on profile GXCD-278, is a complete mismatch (and the Record was already attached to the correct profile). There must have been a glitch in the script.
4 of the attachments correspond to existing Hints that someone had already marked Dismissed (the automation script could easily have checked this point).
17 of the attachments do not correspond to Hints at all. In all these cases either the names don't seem to fully match or the birth date is in the 19th century, but the ones I looked at appeared fairly valid.
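On the Dismissed hints point, a minimal sketch of the check the script could have made, assuming it can look up the current status of each Record Hint before attaching. The function name, the `"DISMISSED"` status string, and the placeholder IDs are hypothetical; this is not a description of how the hint data is actually exposed.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (profile ID, record persona ID)


def attachable(candidates: List[Pair], hint_status: Dict[Pair, str]) -> List[Pair]:
    """Keep only the pairs that no user has already marked as Dismissed.

    Pairs with no hint at all are kept; they would need separate scrutiny.
    """
    return [pair for pair in candidates if hint_status.get(pair) != "DISMISSED"]


# Placeholder IDs only, not real profiles or records.
candidates = [("PROFILE-A", "RECORD-1"), ("PROFILE-B", "RECORD-2")]
hint_status = {("PROFILE-A", "RECORD-1"): "DISMISSED"}
print(attachable(candidates, hint_status))
```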
More importantly the duff Residence issue remains on many profiles. On others an attempt has been made to standardise the place at attachment time, but using the first matching place produced by the Places API, which has given some really awful results. See below for details of the first 25 of the Residence Events in alphabetical order of PID.
It also occurs to me that all these Residence Events are flawed in another (and worse) way: none of them has a date, standardised or otherwise. 7 of the attachments didn't involve the addition of Residence events/tags; this appears to have happened where there was an existing dateless Residence event.
Do people think we can/should report this script as abusive?
| Profile | Record Persona | Residence Event Place: Original | Residence Event Place: Standardised (if any) | Other relevant Labels present (if any) | Comments |
|---|---|---|---|---|---|
| 2Z36-QNT | QGTQ-VY8L | Des Moines | | EVENT_PLACE_ORIG: Des Moines, Polk, Iowa, United States; EVENT_PLACE: Des Moines, Polk, Iowa, United States | no standardised place yet |
| 9KDX-VZ2 | QGF9-XC9L | Norwalk | | EVENT_PLACE_ORIG: Norwalk, Los Angeles, California, United States | no standardised place yet |
| 9SBM-WY2 | QKV9-9QTP | Kingman Reef, United States Minor Outlying Islands | Kingman Reef, United States Minor Outlying Islands | PR_RES_CITY_ORIG: Kingman; PR_RES_CITY: Kingman; EVENT_PLACE_ORIG: Kingman, Mohave, Arizona, United States; EVENT_CITY_ORIG: Kingman | standardised at attach time, but using first Kingman place. See existing EVENT_PLACE_ORIG |
| 9VH6-4PF | QGF4-4JS9 | Lindsay | | EVENT_PLACE_ORIG: Lindsay, Tulare, California, United States | no standardised place yet |
| G456-58M | QKVM-6P22 | Denver | | EVENT_PLACE_ORIG: Denver, Denver, Colorado, United States | no standardised place yet |
| G46Z-VYV | QVR2-RGHQ | Jonas Ridge, Burke, North Carolina, United States | Jonas Ridge, Burke, North Carolina, United States | PR_RES_CITY_ORIG: Jonas Ridge; EVENT_PLACE: Jonas Ridge Township, Burke, North Carolina, United States; EVENT_CITY_ORIG: Jonas Ridge | standardised at attach time, PR_RES_PLACE_ORIG appropriately used |
| G47Z-X29 | QGPN-BKLM | Kalispell | | EVENT_PLACE_ORIG: Kalispell, Flathead, Montana, United States; EVENT_PLACE: Kalispell, Flathead, Montana, United States | no standardised place yet |
| G49L-D4J | Q2SJ-H3F2 | Middletown | | EVENT_PLACE_ORIG: Middletown, Dauphin, Pennsylvania, United States | no standardised place yet |
| G5HX-4CF | QPZH-CYXK | Clinton | | EVENT_PLACE_ORIG: Clinton, Dewitt, Illinois, United States; EVENT_PLACE: Clinton, DeWitt, Illinois, United States | no standardised place yet |
| G5P5-FG4 | QKVM-2VFN | Fort Lupton | Fort Lupton, Weld, Colorado, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |
| G5W5-F76 | QP48-M2LS | Freeport | | EVENT_PLACE_ORIG: Freeport, Stevenson, Illinois, United States; EVENT_PLACE: Freeport, Stephenson, Illinois, United States | no standardised place yet |
| G5YC-K4N | XPJF-722 | Beaumont, Jefferson, Texas, United States | Beaumont, Jefferson, Texas, United States | PR_RES_PLACE_ORIG: Beaumont, , Texas; EVENT_PLACE_ORIG: Beaumont, , Texas, United States; EVENT_CITY_ORIG: Beaumont | standardised at attach time using first Beaumont Texas place match, which happens to look correct in this case |
| G6Z1-8FK | QPTM-4ZYW | Muskegon Heights | Muskegon Heights, Muskegon, Michigan, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |
| G72D-28Q | QL3N-MRL1 | Floresville | | EVENT_PLACE_ORIG: Floresville, Wilson, Texas, United States | no standardised place yet |
| G7BR-616 | 7QCM-9N6Z | Cranston, Providence, Rhode Island, United States | Cranston, Providence, Rhode Island, United States | PR_RES_PLACE_ORIG: Cranston, Rhode Island; PR_RES_PLACE: Cranston, Rhode Island; PR_RES_CITY_ORIG: Cranston; PR_RES_CITY: Cranston; EVENT_PLACE_ORIG: Cranston, , Rhode Island, United States | standardised at attach time matching existing EVENT_PLACE |
| G7HZ-VJK | QP8P-Q6YT | Derry | Derry Village, Derry, Rockingham, New Hampshire, British Colonial America | EVENT_PLACE_ORIG: Derry, Rockingham, New Hampshire, United States; EVENT_PLACE: Derry, Rockingham, New Hampshire, United States | standardised by FS bot, though apparently incorrectly |
| G7KB-1CT | QGF9-JJFQ | Stockton | | EVENT_PLACE_ORIG: Stockton, San Joaquin, California, United States | no standardised place yet |
| G7WJ-1H5 | Q2B8-B8QG | Elk City | | EVENT_PLACE_ORIG: Elk City, Beckham, Oklahoma, United States | no standardised place yet |
| G8VV-Y94 | QLXY-5BV3 | Springfield | | EVENT_PLACE_ORIG: Springfield, Greene, Missouri, United States | no standardised place yet |
| G9JP-B6Z | Q2CT-M5ZH | Meriden | | EVENT_PLACE_ORIG: Meriden, New Haven, Connecticut, United States | no standardised place yet |
| G9NR-PN7 | QLFM-DTWP | Salisbury, Guam | Salisbury, Guam | PR_RES_CITY_ORIG: Salisbury; PR_RES_CITY: Salisbury; EVENT_PLACE_ORIG: Salisbury, Chariton, Missouri, United States; EVENT_CITY_ORIG: Salisbury | standardised at attach time, but using first Salisbury place. See existing EVENT_PLACE_ORIG |
| G9YD-QDL | QP8G-WHHH | County Clare, Ireland | County Clare, Ireland | PR_RES_CITY_ORIG: Clare; PR_RES_CITY: Clare; EVENT_PLACE_ORIG: Clare, Clare, Michigan, United States; EVENT_PLACE: Clare, Clare, Michigan, United States | standardised at attach time, but using first Clare place. See existing EVENT_PLACE_ORIG |
| GB6Z-M7X | Q2C5-3M8R | Port St Joe | Port St. Joe, Gulf, Florida, United States | EVENT_PLACE_ORIG: Port St Joe, Gulf, Florida, United States | standardised by FS bot using near match to existing EVENT_PLACE_ORIG |
| GB82-XDS | Q2B4-V92D | Mangum | | EVENT_PLACE_ORIG: Mangum, Greer, Oklahoma, United States | no standardised place yet |
| GBV7-F8F | QG6D-LNY4 | Helix | Helix, Umatilla, Oregon, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |
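The "first matching place" problem in the Kingman, Salisbury and Clare rows could be avoided along these lines. This is a minimal sketch, assuming the script already has the list of candidate standardised places for the city name and the record's own EVENT_PLACE_ORIG value; `pick_place` and `components` are hypothetical names, and the candidate list in the example is illustrative rather than real Places API output.

```python
from typing import List, Optional


def components(place: str) -> List[str]:
    """Lower-cased, non-empty components of a comma-separated place name."""
    return [p.strip().lower() for p in place.split(",") if p.strip()]


def pick_place(candidates: List[str], event_place_orig: str) -> Optional[str]:
    """Return the candidate that agrees with the record's full place name.

    Returns None, rather than the first candidate, when nothing agrees,
    so that the event can be left for a person to standardise.
    """
    wanted = components(event_place_orig)
    for candidate in candidates:
        if components(candidate) == wanted:
            return candidate
    return None


# Illustrative candidate order: the unwanted place happens to come first.
candidates = [
    "Kingman Reef, United States Minor Outlying Islands",
    "Kingman, Mohave, Arizona, United States",
]
print(pick_place(candidates, "Kingman, Mohave, Arizona, United States"))
```

Exact matching is deliberately strict: near matches such as "Port St Joe" against "Port St. Joe" would come back as None and need a fuzzier comparison or a manual check.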
1 -
Sadly(?) I've never got the idea that incompetence counts as abuse… Especially if FS regards more data as success…
0 -
But this is not manual edit incompetence from inexpert/uninformed contributors, it is high volume stuff from a team who should know better ... and the 'u's added recently did count as abuse.
0 -
I think I may have found another angle.
I have been investigating the standards that apply to organisations whose applications make modifications to the Family Tree via the FS APIs, in particular this:
BYU RLL are as far as I can see the only people who update existing Family Tree profiles using the APIs without the change-by-change user confirmation that is clearly expected by this standard. As we know, they have also inserted many, many profiles into Family Tree, without user confirmation, where there was a high confidence match (i.e. what would become a FS-flagged duplicate) present already, which again is clearly not compliant with the standard.
(Incidentally it seems to me that the FT gedcom import mechanism does comply with this standard.)
I have asked Support to find out for us whether this standard applies to BYU’s projects, and, if not, why not?
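To make the two expectations described above concrete, here is a minimal sketch, assuming a script has a match score for the best existing candidate profile and some way of asking a person to confirm. The function name, the 0.9 threshold, and the `confirm` callback are all hypothetical; the standard's actual wording and thresholds are not reproduced here.

```python
from typing import Callable, Optional


def should_create_profile(
    best_match_score: Optional[float],
    confirm: Callable[[str], bool],
    threshold: float = 0.9,  # assumed cut-off for a "high confidence" match
) -> bool:
    """Decide whether a new profile may be created.

    Refuses outright when a high-confidence match already exists, and
    otherwise still requires an explicit confirmation from a person.
    """
    if best_match_score is not None and best_match_score >= threshold:
        return False  # would become a flagged duplicate
    return confirm("No strong match found. Create a new profile?")


# A strong existing match blocks creation even if the user would say yes.
print(should_create_profile(0.95, lambda prompt: True))
```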
Separate point: it's clear that none of the abuse criteria recently identified for us by a mod on
(though the relevant comments seem, disappointingly, to have been deleted) cover any of BYU RLL's activities in any way.
3 -
No news from Support as yet. I'll leave it another week, then try an abuse report anyway, on squeaky wheel principles - what harm can it do? I have to assume (given FT is not just one of FS' biggest assets but one of the largest collaborative data stores in the world) that there is someone in FS who cares about information governance enough to know this situation is unacceptable; maybe they just haven't heard our message yet?
2 -
Here's my proposed abuse report. Comments please - current plan is to submit it next Monday. People's favourite (?) examples would also be useful.
(Edit - on second thoughts I have removed all reference to oversight - it's obvious it's inadequate, but we can't prove it …)
Over the last few years the BYU Record Linking Lab (RLL, ) has added and modified high volumes of entries in the FS Family Tree.
This abuse report relates to the fact that these projects, which use the FamilySearch APIs via automated scripts, have resulted in very large numbers of FT data quality problems, in particular profile duplication and inappropriate source attachments.
The relevant user names are USCensusProject, CommunityCensus Project, and TreeBuilding Project.
We accept that inappropriate changes to the Tree are not normally seen as abuse. BYU RLL’s projects, however, do not work with the Family Tree as ordinary users do. This is a matter both of scale and of accountability.
One would expect the RLL team’s work to be of a high standard; in practice, however, teams of student volunteers use the APIs to create or change many profiles at a time (millions, in some cases),
- without complying with FS’ published API usage standards (see https://www.familysearch.org/developers/docs/guides/implementation-cert , and, in particular, ),
- without taking existing data (or indeed the needs of other users) into account, and
- frequently in a manner indicating lack of care in design, implementation, and/or testing (for examples, see https://community.familysearch.org/en/discussion/comment/608281/#Comment_608281).
Note that the student volunteers (see ) appear to have been asked to ‘grow the tree in any way [they] could’.
BYU RLL’s stated approach (https://community.familysearch.org/en/discussion/comment/573042/#Comment_573042) is to automate many of the inserts and changes to a certain point, with the profiles then being left, silently, and often in an internally inconsistent state, for manual editing by other volunteers. (One widespread example is the use of married surnames when creating Census wife profiles.) This ‘halfway house’ approach was clearly never realistic as an overall solution, given the vast numbers of profiles involved.
We have made considerable efforts, going back to September 2024, to raise these matters with BYU RLL themselves; via FS Support; and (as recommended by Support) via Suggest an Idea. Despite some initially promising interactions, we have realistically made no progress at all.
We can provide detailed examples as required.
4 -
Re the abuse report, after sleeping on it I've concluded it would be better to add concrete examples and any helpful summary statistics in the initial submission, given we undoubtedly have only one chance to get our message across. I'll draft something asap and post it here for any comments.
Once we have the final text, I propose to also copy it, for information, to Support. If we still get nowhere we can widen the circulation, e.g. DQS team.
Incidentally, we still have no idea which FS team is BYU's liaison on this and approves their ideas. One of Joe's comments in this thread makes it clear that such approval is part of the process (at a high/conceptual level only, I have to assume). I think our failure to identify, and therefore to communicate with, these people might be worth mentioning in the report.
2 -
I agree. I think concrete examples would be good. I ran across several examples in just the last week alone - merging at least 12 duplicates created by the users/projects mentioned, poor data entry for locations and names, etc. Unfortunately, I am working on an active research case and had to spend the time correcting everything in order to use FamilySearch to its best advantage - relevant record hinting on properly created profiles. If I run across any more, I will take note so I can provide them.
1 -
Thanks @melanes, I am working on a draft with evidence included and will post it later today for review and also so that people can plug in more and/or better examples.
0 -
Just a quick update to say this is taking a lot longer than I hoped.
2


