Is the user "TreeBuilding Project" taking the tree forward or wasting time?

Answers

  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    July 31

    Well, I have now had what feels like a very positive response from FS Support (despite the 'hopefully' and 'eventually'):

    We have not forgotten about you even though we are somewhat slow at some things.

    Your message has been forwarded to the FamilySearch team that will follow up with the BYU RLL.

    FamilySearch does not control the RLL but there are hopefully some influences that will eventually make a difference in the issues that you have described.

    I have said thank you very much and requested that the team concerned keep us posted. We'll see.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    August 1

    Rather less positive message overnight:

    Your reply message is appreciated, but, unfortunately, there are so many links in the chain between you and your message to FamilySearch and the BYU Linking Lab, we do not have a pathway to "keep posted". We suspect that you may not see any changes real soon, but on the other hand, the Linking Lab is a very active place and we cannot predict what their response will be. If we do learn, though, we will let you know.

    Suppressing my irritation at 'we do not have a pathway', I am inclined to leave it 2 months and then, if there is no visible progress and we have received no update, try an abuse report.

    1
  • Paul W
    Paul W ✭✭✭✭✭
    August 2 edited August 2

    @MandyShaw1

    I'm sure it's not that many of us don't appreciate your efforts in highlighting the damage these projects have inflicted on Family Tree; rather, we have become so exasperated at the lack of recognition of that damage that we "gave up the will" long ago. So, whilst I admire your seeming indefatigability with regard to this issue, as long as FamilySearch management take a positive view of this work I doubt whether any amount of our negative reports will be of help in improving the situation.

    In theory, this work (in the form of "stand alone" projects) could have provided excellent databases, but - as we have discovered - linking the data directly to Family Tree has just caused us too much grief!

    4
  • Tex Lawrence
    Tex Lawrence ✭
    August 10

    I had to stop reading at page 3 in this thread and restart at page 10, so I’ve missed a lot. However, having had some experience, it seems to me that what this whole discussion points to is that FS etc. has decided that people-type users are not getting the job done. We have too much experience to create the quantity of records/profiles that FS etc. desires for “production.”

    The “solution” to the problem of “production” is now AI and all that that entails. Get the production, learn from the mistakes made and produce more AI programming to fix problems.

    Hang the people-type users out to dry and move on down the line. This is now the way of big business, and after all, that is what we’re talking about here.

    PS. I no longer have the desire to fight the infidels. It's better to hide in the bushes and wait.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    August 14

    @Tex Lawrence There seems to me to be a big difference in practice between FS' AI initiatives and BYU RLL's efforts.

    Despite teething troubles, we are beginning to see FS' internal work having a real positive impact on FT data quality; their solutions don't update FT, but they do enable users to do a better job themselves. Full Text Search seems to me to be an unmitigated success; and from what I have read and seen, the Computer Assisted Indexing work is far better planned and implemented than the rather scattergun front window given to it by Get Involved would imply.

    BYU have their own AI projects fed by FS Records and their indexes (Censuses, Numident). Initiatives like their Census Tree analytics tools are not a problem from my perspective. What is a problem is that they (possibly encouraged by FS, but we don't know this) at some point decided it was a good idea for teams of student volunteers, under what seem to me to be completely inadequate project controls, to update/create millions of FT profiles automatically using their AI models and the FS FT APIs.

    3
  • Áine Ní Donnghaile
    Áine Ní Donnghaile ✭✭✭✭✭
    August 22

    A new twist - I spent most of yesterday untangling a treehelper mess. This time, instead of duplicates, at least 3 families, in 3 different locations, were conflated into a single family.

    Sigh.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    August 22

    Oh dear, can't help thinking that's worse if anything (and not conducive to BYU's believably finding random people at state fairs on the Tree, either).

    0
  • melanes
    melanes ✭✭✭
    August 22

    I haven't run into a blended family like that so far. I mostly find incorrectly attached records and/or duplicated individuals. I agree with Mandy. I think this activity of adding masses of people to the tree is not accomplishing the goal of both discovery and continued activity in the tree. People may "discover" their family member on the tree, but do they come back? I would bet the answer is no. Why should they? The work is done (or it at least appears to be done). And it's become clear, at least to us, what a mess the projects are creating.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 7

    I've just been doing my monthly check of BYU RLL's activities against the profiles in my newly refreshed analysis database. A warning that the TreeBuilding Project user appears to have started doing merges (e.g. see GLBB-XDY), presumably via the documented Merge API.

    0
  • JulianBrown38
    JulianBrown38 ✭✭✭
    September 8

    Today, the TreeBuilding Project user has been let loose on various World War II Draft Registration Cards collections. It has been creating MilitaryDraftRegistration custom events and Residence events. Because the Residence source information typically contains only the first component of the place (albeit that it is often the same as the first component of the custom event), these Residence events are being created with no standardized place.

    This is having a significant impact on the Verify Places activity. Although that activity does not allow you to choose United States, that does not stop the activity from presenting these events in their hundreds for verification. This is going to discourage me from using Verify Places.

    2
  • Adrian Bruce1
    Adrian Bruce1 ✭✭✭✭✭
    September 8

    Surely the TB Project should be creating valid Residence events, that is, valid by the GEDCOM standard? In other words, the events should include everything, right through to "United States".

    If most of the data is missing for those Residence events, they should not be allowed through. I call that incompetence, and not accidental either.
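    A minimal sketch of the kind of pre-submission check being asked for here. The function name, the threshold of three place levels, and the required "United States" suffix are all assumptions made for illustration; none of them is taken from the GEDCOM standard or from the FamilySearch APIs.

```python
def looks_complete(place: str, min_parts: int = 3, country: str = "United States") -> bool:
    """Hypothetical check: does a place string carry enough levels to be usable?

    Splits on commas, ignores empty components (e.g. "Beaumont, , Texas"),
    and requires at least `min_parts` levels ending in `country`.
    """
    parts = [p.strip() for p in place.split(",")]
    parts = [p for p in parts if p]
    return len(parts) >= min_parts and parts[-1] == country


# A bare town name or street address would be held back rather than submitted.
for candidate in ["Des Moines", "1266 Main St", "Des Moines, Polk, Iowa, United States"]:
    status = "ok" if looks_complete(candidate) else "hold back"
    print(f"{candidate!r}: {status}")
```

    A script could use a check like this to skip creating the Residence event, or to fall back to a fuller place string from the record, instead of submitting the first component alone.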

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 8

    I suspect none of the APIs do that much checking of the content submitted (I have never seen any sort of 'invalid content' status shown as a possibility in the API documentation). And the level of oversight over the volunteer students doing the TreeBuilding Project scripting appears to be very limited.

    1
  • Adrian Bruce1
    Adrian Bruce1 ✭✭✭✭✭
    September 8

    @MandyShaw1 - I can cope with the concept of the submitted residence not being a standardised place - after all, that's how everything was to start with. But the idea that the submitted residence is just something like "1266 Main St" with no indication of the town / city, is appalling and shows that the people concerned don't understand the first thing about what's supposed to be in the Tree. (Please note that if the fingers on keyboards have never been taught this, I can't blame them, it's a management issue).

    3
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 8 edited September 8

    @Adrian Bruce1 Precisely - the lack of management oversight has to be the worst thing about this whole mess, and the thing that both FamilySearch and BYU RLL are most at fault for. And, if these students are looking for careers in IT, they are going to need to learn better practice very quickly.

    2
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 13 edited September 13

    More on the meaningless Residence values for WW2 Draft Registration Cards.

    In my analysis database there are 4 profiles that had these sources added by TreeBuilding Project on 6th September. All have these meaningless Residence addresses, of which all but one have the 'Non-Standardised Place' message showing.

    However that one (G6Z1-8FK) was subsequently (7th September) changed by the FamilySearch user to add a standardised place. This is something that appears to have happened to nearly 2,000 Residence events on profiles in my database (mostly untagged), with the updates occurring from June 2021 up to date. Looks like there has been a FS bot running regularly against FT tidying up missing standardised places. Maybe BYU are assuming FS will clean up after them in this way, given that they can't be bothered to put an accurate value in place to start with (noting that full Residence address information is definitely present in the record persona metadata).

    (Incidentally the only other Residence tags I can find in my database that were added by TreeBuilding Project are Numident ones, and they appear to have had sensible place names in place from the start.)

    2
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 25 edited September 25

    Some more general research on the TBP WW2 Draft Registration automation.

    I have 97 of these in my analysis database, with the sources added (all to existing profiles) between 6th September and my latest refresh date of 21st September.

    First I had a look at how well the source attachments actually matched the 97 profiles, using the relevant Record Hints as a basis.

    1 of the attachments, on profile GXCD-278, is a complete mismatch (and the Record was already attached to the correct profile). There must have been a glitch in the script.

    4 of the attachments correspond to existing Hints that someone had already marked Dismissed (the automation script could easily have checked this point).

    17 of the attachments do not correspond to Hints at all. In all these cases either the names don't seem to fully match or the birth date is in the 19th century, but the ones I looked at appeared fairly valid.

    More importantly the duff Residence issue remains on many profiles. On others an attempt has been made to standardise the place at attachment time, but using the first matching place produced by the Places API, which has given some really awful results. See below for details of the first 25 of the Residence Events in alphabetical order of PID.

    It occurs to me also that all these Residence Events are also (and worse) flawed in not having any dates on them, standardised or otherwise. 7 of the attachments didn't involve the addition of Residence events/tags; this appears to have happened where there was an existing dateless Residence event.

    Do people think we can/should report this script as abusive?

    | Profile | Record Persona | Residence Event Place: Original | Residence Event Place: Standardised (if any) | Other relevant Labels present (if any) | Comments |
    |---|---|---|---|---|---|
    | 2Z36-QNT | QGTQ-VY8L | Des Moines | | EVENT_PLACE_ORIG: Des Moines, Polk, Iowa, United States; EVENT_PLACE: Des Moines, Polk, Iowa, United States | no standardised place yet |
    | 9KDX-VZ2 | QGF9-XC9L | Norwalk | | EVENT_PLACE_ORIG: Norwalk, Los Angeles, California, United States | no standardised place yet |
    | 9SBM-WY2 | QKV9-9QTP | Kingman Reef, United States Minor Outlying Islands | Kingman Reef, United States Minor Outlying Islands | PR_RES_CITY_ORIG: Kingman; PR_RES_CITY: Kingman; EVENT_PLACE_ORIG: Kingman, Mohave, Arizona, United States; EVENT_CITY_ORIG: Kingman | standardised at attach time, but using first Kingman place. See existing EVENT_PLACE_ORIG |
    | 9VH6-4PF | QGF4-4JS9 | Lindsay | | EVENT_PLACE_ORIG: Lindsay, Tulare, California, United States | no standardised place yet |
    | G456-58M | QKVM-6P22 | Denver | | EVENT_PLACE_ORIG: Denver, Denver, Colorado, United States | no standardised place yet |
    | G46Z-VYV | QVR2-RGHQ | Jonas Ridge, Burke, North Carolina, United States | Jonas Ridge, Burke, North Carolina, United States | PR_RES_CITY_ORIG: Jonas Ridge; EVENT_PLACE: Jonas Ridge Township, Burke, North Carolina, United States; EVENT_CITY_ORIG: Jonas Ridge | standardised at attach time, PR_RES_PLACE_ORIG appropriately used |
    | G47Z-X29 | QGPN-BKLM | Kalispell | | EVENT_PLACE_ORIG: Kalispell, Flathead, Montana, United States; EVENT_PLACE: Kalispell, Flathead, Montana, United States | no standardised place yet |
    | G49L-D4J | Q2SJ-H3F2 | Middletown | | EVENT_PLACE_ORIG: Middletown, Dauphin, Pennsylvania, United States | no standardised place yet |
    | G5HX-4CF | QPZH-CYXK | Clinton | | EVENT_PLACE_ORIG: Clinton, Dewitt, Illinois, United States; EVENT_PLACE: Clinton, DeWitt, Illinois, United States | no standardised place yet |
    | G5P5-FG4 | QKVM-2VFN | Fort Lupton | Fort Lupton, Weld, Colorado, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |
    | G5W5-F76 | QP48-M2LS | Freeport | | EVENT_PLACE_ORIG: Freeport, Stevenson, Illinois, United States; EVENT_PLACE: Freeport, Stephenson, Illinois, United States | no standardised place yet |
    | G5YC-K4N | XPJF-722 | Beaumont, Jefferson, Texas, United States | Beaumont, Jefferson, Texas, United States | PR_RES_PLACE_ORIG: Beaumont, , Texas; EVENT_PLACE_ORIG: Beaumont, , Texas, United States; EVENT_CITY_ORIG: Beaumont | standardised at attach time using first Beaumont Texas place match, which happens to look correct in this case |
    | G6Z1-8FK | QPTM-4ZYW | Muskegon Heights | Muskegon Heights, Muskegon, Michigan, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |
    | G72D-28Q | QL3N-MRL1 | Floresville | | EVENT_PLACE_ORIG: Floresville, Wilson, Texas, United States | no standardised place yet |
    | G7BR-616 | 7QCM-9N6Z | Cranston, Providence, Rhode Island, United States | Cranston, Providence, Rhode Island, United States | PR_RES_PLACE_ORIG: Cranston, Rhode Island; PR_RES_PLACE: Cranston, Rhode Island; PR_RES_CITY_ORIG: Cranston; PR_RES_CITY: Cranston; EVENT_PLACE_ORIG: Cranston, , Rhode Island, United States | standardised at attach time matching existing EVENT_PLACE |
    | G7HZ-VJK | QP8P-Q6YT | Derry | Derry Village, Derry, Rockingham, New Hampshire, British Colonial America | EVENT_PLACE_ORIG: Derry, Rockingham, New Hampshire, United States; EVENT_PLACE: Derry, Rockingham, New Hampshire, United States | standardised by FS bot, though apparently incorrectly |
    | G7KB-1CT | QGF9-JJFQ | Stockton | | EVENT_PLACE_ORIG: Stockton, San Joaquin, California, United States | no standardised place yet |
    | G7WJ-1H5 | Q2B8-B8QG | Elk City | | EVENT_PLACE_ORIG: Elk City, Beckham, Oklahoma, United States | no standardised place yet |
    | G8VV-Y94 | QLXY-5BV3 | Springfield | | EVENT_PLACE_ORIG: Springfield, Greene, Missouri, United States | no standardised place yet |
    | G9JP-B6Z | Q2CT-M5ZH | Meriden | | EVENT_PLACE_ORIG: Meriden, New Haven, Connecticut, United States | no standardised place yet |
    | G9NR-PN7 | QLFM-DTWP | Salisbury, Guam | Salisbury, Guam | PR_RES_CITY_ORIG: Salisbury; PR_RES_CITY: Salisbury; EVENT_PLACE_ORIG: Salisbury, Chariton, Missouri, United States; EVENT_CITY_ORIG: Salisbury | standardised at attach time, but using first Salisbury place. See existing EVENT_PLACE_ORIG |
    | G9YD-QDL | QP8G-WHHH | County Clare, Ireland | County Clare, Ireland | PR_RES_CITY_ORIG: Clare; PR_RES_CITY: Clare; EVENT_PLACE_ORIG: Clare, Clare, Michigan, United States; EVENT_PLACE: Clare, Clare, Michigan, United States | standardised at attach time, but using first Clare place. See existing EVENT_PLACE_ORIG |
    | GB6Z-M7X | Q2C5-3M8R | Port St Joe | Port St. Joe, Gulf, Florida, United States | EVENT_PLACE_ORIG: Port St Joe, Gulf, Florida, United States | standardised by FS bot using near match to existing EVENT_PLACE_ORIG |
    | GB82-XDS | Q2B4-V92D | Mangum | | EVENT_PLACE_ORIG: Mangum, Greer, Oklahoma, United States | no standardised place yet |
    | GBV7-F8F | QG6D-LNY4 | Helix | Helix, Umatilla, Oregon, United States | | standardised by FS bot using existing EVENT_PLACE_ORIG |

    1
  • Adrian Bruce1
    Adrian Bruce1 ✭✭✭✭✭
    September 25

    Sadly(?) I've never got the idea that incompetence counts as abuse… Especially if FS regards more data as success…

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    September 25

    But this is not manual edit incompetence from inexpert/uninformed contributors, it is high volume stuff from a team who should know better ... and the 'u's added recently did count as abuse.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 8 edited October 8

    I think I may have found another angle.

    I have been investigating the standards that apply to organisations whose applications make modifications to the Family Tree via the FS APIs, in particular this: https://developers.familysearch.org/main/docs/contributing-to-the-familysearch-family-tree

    BYU RLL are as far as I can see the only people who update existing Family Tree profiles using the APIs without the change-by-change user confirmation that is clearly expected by this standard. As we know, they have also inserted many, many profiles into Family Tree, without user confirmation, where there was a high confidence match (i.e. what would become a FS-flagged duplicate) present already, which again is clearly not compliant with the standard.

    (Incidentally it seems to me that the FT gedcom import mechanism does comply with this standard.)

    I have asked Support to find out for us whether this standard applies to BYU’s projects, and, if not, why not?

    Separate point: it's clear that none of the abuse criteria recently identified for us by a mod on

    https://community.familysearch.org/en/discussion/181805/a-bunch-of-generated-empty-profiles-all-with-name-u

    (though the relevant comments seem, disappointingly, to have been deleted) cover any of BYU RLL's activities in any way.

    3
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 15

    No news from Support as yet. I'll leave it another week, then try an abuse report anyway, on squeaky wheel principles - what harm can it do? I have to assume (given FT is not just one of FS' biggest assets but one of the largest collaborative data stores in the world) that there is someone in FS who cares about information governance enough to know this situation is unacceptable, maybe they just haven't heard our message yet?

    2
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 27 edited October 27

    Here's my proposed abuse report. Comments please - current plan is to submit it next Monday. People's favourite (?) examples would also be useful.

    (Edit - on second thoughts I have removed all reference to oversight - it's obvious it's inadequate, but we can't prove it …)

    Over the last few years the BYU Record Linking Lab (RLL, https://record-linking-lab.byu.edu/ ) has added and modified high volumes of entries in the FS Family Tree.

    This abuse report relates to the fact that these projects, which use the FamilySearch APIs via automated scripts, have resulted in very large numbers of FT data quality problems, in particular profile duplication and inappropriate source attachments.

    The relevant user names are USCensusProject, CommunityCensus Project, and TreeBuilding Project.

    We accept that inappropriate changes to the Tree are not normally seen as abuse. BYU RLL’s projects, however, do not work with the Family Tree as ordinary users do. This is a matter both of scale and of accountability.

    One would expect the RLL team’s work to be of a high standard; in practice, however, teams of student volunteers use the APIs to create or change many profiles at a time (millions, in some cases),

    • without complying with FS’ published API usage standards (see https://www.familysearch.org/developers/docs/guides/implementation-cert , and, in particular, https://developers.familysearch.org/main/docs/contributing-to-the-familysearch-family-tree ),
    • without taking existing data (or indeed the needs of other users) into account, and
    • frequently in a manner indicating lack of care in design, implementation, and/or testing (for examples, see https://community.familysearch.org/en/discussion/comment/608281/#Comment_608281).

    Note that the student volunteers (see https://universe.byu.edu/2023/02/27/byu-professor-connects-the-human-family-with-research-lab/ ) appear to have been asked to ‘grow the tree in any way [they] could’.

    BYU RLL’s stated approach (https://community.familysearch.org/en/discussion/comment/573042/#Comment_573042) is to automate many of the inserts and changes to a certain point, with the profiles then being left, silently, and often in an internally inconsistent state, for manual editing by other volunteers. (One widespread example is the use of married surnames when creating Census wife profiles.) This ‘halfway house’ approach was clearly never realistic as an overall solution, given the vast numbers of profiles involved.

    We have made considerable efforts, going back to September 2024, to raise these matters with BYU RLL themselves; via FS Support; and (as recommended by Support) via Suggest an Idea. Despite some initially promising interactions, we have realistically made no progress at all.

    We can provide detailed examples as required.

    4
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 28 edited October 28

    Re the abuse report, after sleeping on it I've concluded it would be better to add concrete examples and any helpful summary statistics in the initial submission, given we undoubtedly have only one chance to get our message across. I'll draft something asap and post it here for any comments.

    Once we have the final text, I propose to also copy it, for information, to Support. If we still get nowhere we can widen the circulation, e.g. DQS team.

    Incidentally, we still have no idea which FS team is BYU's liaison on this and approves their ideas. One of Joe's comments in this thread makes it clear that such approval is part of the process (at a high/conceptual level only, I have to assume). I think our failure to identify, and therefore to communicate with, these people might be worth mentioning on the report.

    2
  • melanes
    melanes ✭✭✭
    October 29

    I agree. I think concrete examples would be good. I ran across several examples in just the last week alone: merging at least 12 duplicates created by the users/projects mentioned, poor data entry for locations and names, etc. Unfortunately, I am working on an active research case and had to spend the time correcting everything in order to use FamilySearch to its best advantage (relevant record hinting on properly created profiles). If I run across any more, I will take note so I can provide them.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 30

    Thanks @melanes, I am working on a draft with evidence included and will post it later today for review and also so that people can plug in more and/or better examples.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 21
    https://community.familysearch.org/en/discussion/comment/610817#Comment_610817

    Just a quick update to say this is taking a lot longer than I hoped.

    2