
Is the user "TreeBuilding Project" taking the tree forward or wasting time?

Answers

  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 25, 2024 edited October 25, 2024

    @Paul W agreed on all counts (and thank you), though BMD records in E&W post 1837 do seem more logical and structured (even if a bit lacking in detail) when compared with the US.

    I'm just doing some more analysis (also using the BYU RLL Census Tree Dec 2023 snapshot which I found in the Computer-Generated Trees section of FS Genealogies) to look at these delays in more detail. More on this soon.

    0
  • melanes
    melanes ✭✭✭
    October 27, 2024

    I know that RootsTech is a few months away. If anyone is planning to attend, it would be worth the effort to try and talk to FamilySearch staff in person. Clearly the emails are easy for them to ignore or set aside as not that big of a deal. Joe clearly has connections with FamilySearch. I'm assuming that FamilySearch had to have approved these projects because the BYU RLL are using scripts (maybe APIs); otherwise the activity would be flagged by systems engineers. The registering of dozens of "volunteer" accounts on a regular basis would normally be flagged as sock puppets and yet they are allowed to exist. It also appears that FamilySearch is not reviewing these types of projects on a regular basis, both for outcomes and data integrity, or is not communicating about it. This kind of activity, despite not originating from FamilySearch, reflects poorly on an otherwise amazing service and platform.

    And while these projects are full of good intentions, they are being done against good standards of practice established by genealogy organizations around the world. It's easy to become enthusiastic about advances in technology and assume that technology will make the work easier. We are seeing just the opposite.

    3
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 28, 2024 edited October 28, 2024

    @melanes BYU has applications (though not re this functionality) listed on the partner solutions directory, which would I guess give them formal access to the update APIs; and anyway I get the impression that there is a long and close relationship between FS and BYU, with the latter doing a lot of the research and providing academic underpinning and recognition.

    I would love to go to RootsTech (for lots of reasons, not just bending people's ears about BYU RLL and data quality - the sessions I joined online last time were excellent) but as I live in England it's not realistic.

    Completely agree with your 'technology' point - just because you /can/ do it doesn't mean you /should/. FS, in my honest opinion, really need to keep far more of a beady eye on this.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 30, 2024

    Further email sent to FS Support Europe:

    'In response to your points below:

    ‘We at FamilySearch Support Europe have no specific access to anyone in the BYU Record Linking team beyond the email we have already supplied … Perhaps you do not fully understand how the projects work …’

    You clearly know considerably more than we have ever been told - surely you must have FS contact(s) inside or outside the Support team who would be able to help us?

    To be clear, the married name point is just one of the issues we have encountered, many of which relate to communication rather than to any functionality or information matter.

    Based on the extensive discussions on FamilySearch Community, plus some detailed data analysis and some use of publicly available BYU RLL tools, I have put together a problem summary document which will hopefully help you identify a contact for us: see Numident summary.pdf (please let me know if you have any trouble using this link).

    Many thanks for your responsiveness and help.'

    (I've also updated the summary document, as linked to in the above email text, quite a bit - as always, comments gratefully received.)

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 30, 2024 edited October 30, 2024

    Here are some census married name vs maiden name statistics:

    | Description | Number | Comments |
    |---|---|---|
    | Total profiles in analysis database that are set as Wife on at least one attached census record persona | 6882 | |
    | Of which created by USCensusProject or CommunityCensus Project | 551 | the below values refer only to these |
    | With at least one husband present | 544 | |
    | Husband present and same surname | 440 | potentially require work |
    | Husband present and different surname | 104 | potentially correctly set to maiden name |
    | With at least one set of parents present | 52 | clear need to identify parents for as many as possible of the others |
    | With at least one duplicate flagged | 50 | clear need to either merge or state not a match |
    | Husband present and same surname, father present and same surname | 0 | so no-one is known to have married person with same surname |
    | Husband present and same surname, father present and different surname | 1 | potentially fixable married name -> maiden name |
    | Husband present and same surname, duplicate flagged and same surname | 9 | may have married person with same surname but needs checking |
    | Husband present and same surname, duplicate flagged and different surname | 28 | potentially fixable married name -> maiden name |
    | Husband present and different surname, father present and same surname | 51 | some minor spelling differences |
    | Not changed by anyone since 1 hour after creation | 278 | so just over half of the 551 have not been touched |
    | Not changed by anyone since creation in 2021 | 47 | |
    | Not changed by anyone since creation in 2022 | 163 | |
    | Not changed by anyone since creation in 2023 | 67 | |
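
For anyone wanting to try a similar breakdown on their own data, the surname comparisons behind the table could be sketched roughly as follows. This is only an illustration under stated assumptions, not the queries actually used: the `WifeProfile` fields, the category labels and the sample rows are all invented, and matching here is exact (case-insensitive) rather than fuzzy.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class WifeProfile:
    # Hypothetical fields standing in for whatever an analysis database holds.
    surname: str
    husband_surname: Optional[str] = None  # None means no husband attached
    father_surname: Optional[str] = None   # None means no parents attached


def same_surname(a: str, b: str) -> bool:
    """Exact comparison ignoring case and surrounding spaces (no fuzzy matching)."""
    return a.strip().casefold() == b.strip().casefold()


def classify(profile: WifeProfile) -> str:
    """Place a 'census wife' profile into one of the broad table categories."""
    if profile.husband_surname is None:
        return "no husband present"
    if not same_surname(profile.surname, profile.husband_surname):
        return "different from husband (possibly maiden name)"
    if profile.father_surname is None:
        return "same as husband (potentially requires work)"
    if same_surname(profile.surname, profile.father_surname):
        return "same as husband and father (needs checking)"
    return "same as husband, different from father (fixable)"


# Invented sample rows, purely to exercise each branch.
sample = [
    WifeProfile("Tinkham", husband_surname="Tinkham"),
    WifeProfile("Tinkham", husband_surname="Tinkham", father_surname="Brown"),
    WifeProfile("Brown", husband_surname="Tinkham", father_surname="Brown"),
    WifeProfile("Tinkham"),
]

counts = Counter(classify(p) for p in sample)
for label, number in counts.items():
    print(f"{label}: {number}")
```

A real analysis would also need the duplicate-flag and last-changed columns, which are left out here to keep the sketch short.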

    I have also been looking at the Dec 2023 Census Tree in Genealogies (https://www.familysearch.org/search/genealogies/submission/10000130/MMJZ-JR7). This could definitely be referenced to plug more of the maiden name gaps. Its algorithms link census records for different years, and in some cases, while the linking looks accurate, the records are attached in FT to profiles with different surnames that aren't necessarily flagged as duplicates.

    The Genealogies-provided source display for each Census Tree entry is really useful. See https://www.familysearch.org/service/gen/sforge/hints-view.html?personId=/ark:/61903/2:5:7JSN-NDY for an example (which does demonstrate the obvious dodginess of some of the linking). It helpfully looks up the source on FT for you and gives you a link to whatever it finds (and the profile links are all up-to-date, too). So a thank you to BYU RLL for providing this data set.

    I might have a go at fixing the profiles referenced in my table above myself sometime, in which case I will post what I find.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    October 31, 2024 edited October 31, 2024

    Census Tree statistics following from previous comment:

    | Description | Number | Comments |
    |---|---|---|
    | Tinkham surnames among the 551 profiles from previous comment | 42 | FS fuzzy surname matching |
    | Surname matches husband's surname exactly | 32 | |
    | Of these 32, Census-Tree-identified duplicates with different surnames | 4 | |
    | Ditto with no parents or FS duplicate flags present on BYU RLL-created profile | 4 | demonstrating usefulness of Census Tree in obtaining hints |
    | Apparently genuine duplicates with different surnames (manual investigation) | 3 | potentially fixable married name -> maiden name (via merge) |

    I shall definitely be trying the Census Tree in future when I can't identify a maiden name and the dates make it sensible.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 1, 2024

    Well, I am distinctly underwhelmed to report that FS Support Europe have responded as follows:

    'As we said in our latest email we at FamilySearch Support Europe have no specific access to anyone in the BYU Record Linking team beyond the email we have already supplied. If you are not receiving replies, we suggest you contact BYU more generally via [the main BYU website].'

    I propose to write back pointing out that (as clearly indicated by my summary document) there are 2 problems, a) BYU's damaging updates, and b) FS' lack of control over them, and that therefore, wherever the conversation with BYU may go, we still need to escalate this within FS.

    Meanwhile I will try the main BYU enquiries channel also.

    Anyone got any better ideas? (Does someone want to give Report Abuse a try? I am not optimistic, but you never know until you try, and one of you might have a lot more clout than me.)

    0
  • Paul W
    Paul W ✭✭✭✭✭
    November 1, 2024 edited November 1, 2024

    @MandyShaw1

    Whilst I (and I'm sure many other FT users) are supportive of the efforts you are making on the issue, it was stated from early on (by Professor Joe) that his projects have the full backing of FamilySearch management. As we have found with many other issues that have caused us concern in the past, there is no direct channel of communication with FS engineers, let alone management, so we are completely in the hands of support staff when wanting any issue escalated to that level. I am not optimistic about that happening here, I'm afraid.

    1
  • Áine Ní Donnghaile
    Áine Ní Donnghaile ✭✭✭✭✭
    November 1, 2024

    @Paul W I have to agree here. As I stated on another thread, the Report Abuse channel tests our patience even on issues that are proven to be abuse. I foresee the chance of snowballs in a hot place in this instance.

    0
  • JD Cowell
    JD Cowell ✭✭
    November 1, 2024 edited November 1, 2024

    Back in 2022 when I first noticed how much of a problem these projects created, I reported several profiles for abuse. I got these responses from Data Administration:

    9/8/2022 "Thank you for bringing this to our attention. We have passed the feedback along to those over the project." (received twice when reporting two different duplicates of the same person)

    9/14/2022 "Thank you for bringing this to our attention. Appropriate action will be taken."

    10/3/2022: (received twice when reporting a husband and wife) "We have reviewed a record that you reported in Family Tree as containing inappropriate content or abuse and have determined that this situation does not qualify as abuse.

    Types of inappropriate content to report might include the following:

    Offensive or abusive language or content
    Information that might harm or embarrass living relatives
    Links to external web pages with inappropriate content
    Solicitations for businesses or research services
    Harassment
    Political statement
    Copyright infringement

    Please do not use the Report Abuse feature to report inaccurate information about individuals or families, such as incorrect names or dates, or to request that the record be deleted or corrected. To correct these errors, work with the other contributors by using the discussions or internal messaging features, or use the Help feature in Family Tree to report your concerns."

    Responses like the latter make me extremely cynical about whether Data Administration has any interest at all in addressing these issues. Clearly they were lying to me when they said "appropriate action will be taken", as the problem is ongoing, and I have my doubts about whether they actually passed the feedback along to the project in the first place, although it's also clear that these projects ignore this type of feedback on a regular basis.

    2
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 1, 2024

    Thank you all for your input. I do entirely see all your points.

    It's sadly entirely clear that FS doesn't rein in any of BYU RLL's activities - what I am asking myself is, are the relevant FS people aware of the full current implications of activities to which they have at some point in the past given the green light? Which is why I am inclined to have one more go at escalating this, followed, if necessary, by submitting a formal complaint.

    It feels to me as if BYU RLL could achieve many of their objectives in a much less damaging way if they only communicated and collaborated properly with the rest of the FT community. That is something FS could make happen, I'd have thought.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 2, 2024

    I decided to re-send my previous email (of Sunday 13th October) to BYU RLL 'in case it had got lost in transit'. I will give them a few days to respond.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 3, 2024

    I have submitted my thoughts on how duplicate handling could be improved for automated mechanisms to Suggest an Idea, given that Ideas do now seem to be 'sticking', even if most of them aren't visible to (most?) Community users.

    https://community.familysearch.org/en/discussion/168945/handling-of-duplicates-by-automated-family-tree-profile-creation-mechanisms
    0
  • Áine Ní Donnghaile
    Áine Ní Donnghaile ✭✭✭✭✭
    November 3, 2024

    Since recent Suggestions have all disappeared, I hope you kept a copy. I believe the only ones that have "stuck" were posted during the weekend hours when staff is not on duty.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 3, 2024

    The text is in my 'numident summary' document (and is also quoted in the comment above, in fact).

    The Idea I submitted in September never got saved at all as far as I could see; the more recent ones show Permission Problem after a bit, but at least they appear to have been saved somewhere.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 4, 2024

    Hot news folks, we have had an answer from Joe to the email I sent twice to BYU RLL. As follows.

    'I'm responding to your email to rll@byu.edu. Normally, those emails forward directly to my BYU email but I'm not sure why they didn't with your emails on Oct 13th and Nov 2nd. I just noticed them by chance while checking on something else.

    Maybe what might help best is to just summarize your main concerns about the approach that we are taking and I can work with FamilySearch to find the best solution. I want to be respectful of your concerns and I will do all I can to respond and adjust our approach. 

    We would also be open to ideas for how you would ensure that everyone in the Numident (or other important datasets) are on the Family Tree. We don't want anyone to be missed. There are still 1.5 million families in the Numident where no one in the family seems to have a direct match to the Family Tree (based on the FamilySearch match files). If you would like to propose a way to add those families to the Family Tree, I would be certainly open to your ideas and your help.'

    I think this is really quite positive. I will start drafting a response (and post it here before sending it, obviously), but all thoughts very welcome.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 9, 2024 edited November 9, 2024

    Here's my proposed response to Joe's first point (second one to follow). Comments please!

    Point 1:

    Maybe what might help best is to just summarize your main concerns about the approach that we are taking and I can work with FamilySearch to find the best solution. I want to be respectful of your concerns and I will do all I can to respond and adjust our approach.

    Communication

    There appear to be no communication channels in place between BYU RLL and the rest of the FT user base for either of the following:

    Publicising the aims and workings of BYU RLL projects to the wider FT user base (LDS and non-LDS), and setting expectations.

    Reporting and resolution of issues, including non-profile-specific concerns such as those listed here.

    Accountability

    BYU RLL projects don’t currently demonstrate the expected collaborative approach to editing the Family Tree:

    BYU RLL needs to respond meaningfully and in a timely manner to messages from other FT users.

    Appropriate reason statements need to be put on all changes made by BYU RLL.

    Failure to take current FT information into account

    Those using the Power Linker, Source Linker via BYU RLL hint emails, etc., don’t necessarily understand the impact a resulting change would have on the FT before they make or approve it:

    Just because a match looks great in the Power Linker (or indeed the Source Linker) does not mean that there is no contra-indication somewhere else in the FT profile's data, in its Research Helps (or Profile Quality Score guidance), or in an alert.

    Starting from a BYU RLL-chosen candidate match will always mean that the user has had no chance to judge whether this is the best match available to existing FT data.

    Timing problems and best practice

    We understand from FS Support that it requires multiple interventions to implement some BYU RLL changes, often with considerable delay between the steps.

    These situations may leave a profile in a state that does not reflect genealogy best practice, potentially for a long time (in my analysis database, over 50% of the ‘census wife’ profiles created by BYU RLL using married names have not been touched since - including many profiles created in 2021 and 2022).

    Such pending work is in no way communicated to other FT users, resulting in confusion and time-wasting (while FT reason statements and notes/Alerts are available to help with this, they do not appear to be used).

    Additionally, changes that may have been made by others in between BYU RLL visits are frequently not taken into account by BYU RLL’s activities.

    Handling of duplicates by BYU RLL profile creation

    We understand that BYU RLL uses FS’ standard duplicate flagging algorithm in its decision making.

    It appears that FS tunes the standard duplicate algorithm to minimise the flagging of duplicates that aren’t actually matches, i.e. to minimise false positives.

    But BYU RLL’s profile creation needs to be able to minimise false negatives, thereby protecting both the integrity of FT and the experience of its other users.

    The provision and use of a differently configured duplicate flagging algorithm therefore seems appropriate.

    A pre-create 'check for duplication' API (or the option to ‘reject if duplicate’ on the Create Person API) would also be beneficial.
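
To make the false positive / false negative trade-off and the 'reject if duplicate' idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `Person`, `match_score`, `create_person` and the threshold values are invented, the scoring uses Python's standard `difflib` rather than anything FS actually runs, and none of this reflects the real Create Person API.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import List


@dataclass
class Person:
    # Hypothetical, heavily simplified stand-in for a Family Tree profile.
    name: str
    birth_year: int
    alternate_names: List[str] = field(default_factory=list)


class DuplicateRejected(Exception):
    """Raised when the pre-create check finds a likely duplicate."""


def name_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def match_score(candidate: Person, existing: Person) -> float:
    """Best name similarity across all known names; 0 if birth years are far apart."""
    if abs(candidate.birth_year - existing.birth_year) > 2:
        return 0.0
    names_a = [candidate.name, *candidate.alternate_names]
    names_b = [existing.name, *existing.alternate_names]
    return max(name_similarity(a, b) for a in names_a for b in names_b)


def create_person(tree: List[Person], candidate: Person,
                  threshold: float, reject_if_duplicate: bool = True) -> Person:
    """Add the candidate unless an existing profile scores at or above the threshold.

    A high threshold flags little (few false positives, more false negatives);
    a low threshold flags more (fewer false negatives, more false positives).
    """
    matches = [p for p in tree if match_score(candidate, p) >= threshold]
    if matches and reject_if_duplicate:
        raise DuplicateRejected(f"{len(matches)} possible duplicate(s) for {candidate.name}")
    tree.append(candidate)
    return candidate


tree = [Person("Mary Smith", 1900)]
married = Person("Mary Jones", 1900)  # same woman under a married name, say
try:
    create_person(tree, married, threshold=0.5)  # lenient setting for bulk creation
except DuplicateRejected as exc:
    print("Held for review:", exc)
```

With a strict threshold such as 0.9 the same candidate would be created without complaint, which is the false negative case. Supplying an alternate name (for example `alternate_names=["Mary Smith"]`) lets even the strict setting catch it.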

    3
  • melanes
    melanes ✭✭✭
    November 11, 2024 edited November 11, 2024

    @mandyshaw1 I think it is a well written response. I have an underlying philosophical problem with the lab and the work they are doing, but it's probably best not to argue that. Mentioning genealogical best practices is a good start.

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 11, 2024 edited November 11, 2024

    Thanks @melanes, & also to the 'likers'.

    Extra para to go on the end of the 'duplicates' section:

    Even the standard duplicate algorithm will work better the more information it is given. The timing problems mentioned in the previous section may well, therefore, lead to unnecessary delays in identifying duplicates. One obvious example is that profile creation from Numident records appears initially to ignore the frequent treasure trove of alternate names present on the record as it follows the individual through time.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 13, 2024 edited November 13, 2024

    Proposed response to Joe's second point follows. I'm planning on sending both halves off to him tomorrow afternoon unless anyone has any objection.

    Point 2:

    We would also be open to ideas for how you would ensure that everyone in the Numident (or other important datasets) are on the Family Tree. We don't want anyone to be missed. There are still 1.5 million families in the Numident where no one in the family seems to have a direct match to the Family Tree (based on the FamilySearch match files). If you would like to propose a way to add those families to the Family Tree, I would be certainly open to your ideas and your help.

    My initial thoughts on this follow (and may or may not add anything!)

    I assume that:

    You have already created these 1.5 million FT profiles, and that they are currently in small family groups (‘triplets’ etc.) – as you know, we are seeing this a lot.

    You are looking for merge opportunities that would link your family group accurately to the main Tree.

    I did an experiment using some of the 40 triplets created by TreeBuilding Project that I found within my analysis database.

    I successfully used the following methods to build up the evidence needed to identify potential linkages (which would obviously require detailed review before any action was taken):

    1. Flagged duplicates. The standard algorithm minimises false positives, but they remain the best place to start.

    2. ‘Research Helps’ that are already attached to other profiles (Research Helps are also accessible to certified partner solutions as webhints).

    3. ‘Similar Records’ on the Numident that are already attached to other profiles.

    4. Additional information from the Numident metadata.

    5. ‘Find Similar People’, with judicious use of Exact/wildcards.

    6. Lookups on the Census Tree in FS Genealogies (for possible links that FS’ algorithms can’t see).

    7. Find a Grave entries (for possible links between profiles, though obviously to be taken with a pinch of salt).

    8. ‘Research Helps’ not already attached to other profiles (for potential additional evidence).

    My examples are here: Numident evidence collection examples.pdf

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 13, 2024 edited November 13, 2024

    This thread has regained its correct date on my Bookmarks and on the main Discussions list!

    (Edit, having checked re this comment) …. or perhaps not.

    0
  • Áine Ní Donnghaile
    Áine Ní Donnghaile ✭✭✭✭✭
    November 13, 2024 edited November 13, 2024

    It's on Page 1 of Recent Discussions, just barely, on my 34 inch monitor:

    image.png

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 13, 2024 edited November 13, 2024

    Yup, back to its old tricks, my Bookmarks say it was last changed by you at 1.05pm (GMT).

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 14, 2024

    The more I think about this the more I realise that bulk insert activities, if they are ever to work in a non-disruptive way, have to have more APIs available to them; just as one example, allowing identification of any attached profile(s) on a given record persona. (Plus the 'duplicate precheck' I have banged on about previously.)

    I can see these connections in my analysis database because I have already pulled the data, in particular re one specific family who happen to have meaningful but manageable amounts of data, and can therefore match this information at the database level, but that's not in any way a usable mechanism for actual bulk inserts.
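
As a rough idea of what 'matching at the database level' means here, the lookup from record persona to attached profile(s) could look something like the sketch below. The IDs, the row layout and the function names are all invented for illustration; a real bulk process would need an API for this rather than a local copy of the data.

```python
from collections import defaultdict

# Hypothetical rows from a local analysis database:
# (record persona ID, Family Tree profile ID). All IDs are made up.
attachments = [
    ("persona-A", "PROFILE-1"),
    ("persona-A", "PROFILE-2"),
    ("persona-B", "PROFILE-3"),
]


def build_index(rows):
    """Map each record persona to the set of profiles it is attached to."""
    index = defaultdict(set)
    for persona_id, profile_id in rows:
        index[persona_id].add(profile_id)
    return index


def profiles_for(index, persona_id):
    """Profiles already attached to a persona (empty list if none)."""
    return sorted(index.get(persona_id, ()))


def shared_personas(index):
    """Personas attached to more than one profile: candidates for duplicate review."""
    return {pid: sorted(profs) for pid, profs in index.items() if len(profs) > 1}


index = build_index(attachments)
print(profiles_for(index, "persona-A"))
print(shared_personas(index))
```

A bulk insert could consult such a lookup before creating a profile: if the persona it is about to use is already attached somewhere, that existing profile is the obvious first candidate to review.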

    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    November 17, 2024

    Here's the response I finally sent to Joe: BYU RLL response.pdf

    I'll keep you all posted.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    December 8, 2024

    Fyi I have had no response bar a quick reply from Joe confirming that what I had sent was what he wanted.

    1
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    December 13, 2024

    This thread is relevant and insightful:

    https://community.familysearch.org/en/discussion/170589/record-linking-labs-5-a-day-project-tips
    0
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    January 5

    Has this problem quietened down, or are people just no longer reporting it?

    I don't propose to chase Joe until early March, to give him a chance to discuss (and hopefully action) our submission internally.

    0
  • melanes
    melanes ✭✭✭
    January 6

    I am still finding examples from work done by users called TreeBuilding and whatnot as recently as October/November 2024. I just haven't taken the time to report it here. I cleaned up a mess just yesterday involving multiple duplicates in the same family because the volunteers don't understand the context in which they are working.

    3
  • MandyShaw1
    MandyShaw1 ✭✭✭✭✭
    January 6

    Thanks @melanes.

    0