Quality Score is misleading and counterproductive

TeresaWilson95 · December 2, 2024

In all honesty, the Quality Score is misleading and counterproductive.

I'm working on two families in my tree where the relationships are totally messed up: married to the wrong people (in one instance, mother-in-law attached as wife), not the correct children, children attached to wrong spouse (in multiple marriages), two different spouses with the same name (but not the same person), to name a few major problems. And yet, these records have a "High Quality" score.

It doesn't matter how "complete the data, tagging or consistency" is judged to be by the algorithm; if the relationships are not correct, then the tree is largely ineffectual.

Not to mention, someone could see a record as "High Quality" and assume there are no problems. Or worse, justify why they undo corrections. (I'm waiting for that to happen on one of these trees. My cousin and I spent months researching our 4th (his 3rd) GG parents, who've been "married" to the wrong spouse for who knows how long. I even documented the evidence in Memories using the records in Sources and spent a couple of days cleaning up the families on FamilyTree. Someone comes along and deletes it all, justifying their changes on one record and telling me I didn't know what I was talking about. This would be the same individual who would use the "High Quality" rating to justify their "belief" that their view of the family is correct, regardless of the evidence.)

My vote is to do away with it completely. Some of the info will never be available for individuals before 1900 due to lack of record keeping or loss of records. Some of us are only lucky enough to know the state where births/deaths/marriages took place before 1900. And in the 20th century, vital records are not publicly available in some states.

To me it's more important to have the relationships correct.

Anne986 · December 2, 2024

I would also prefer to eliminate the Quality Score or at least make its appearance optional.

Julia Szent-Györgyi · December 2, 2024

Given the problems even within just the rather narrow timeframe and geography that this rating scheme has currently been applied to, I see nothing but trouble ahead.

Yes, there's a "dismiss" option — but it's in the user-interface equivalent of a closed box in the back of a drawer that's hidden behind two cabinet doors. And dismissing the non-error on one family member does nothing to the same non-error on other affected family members, meaning the user has to open the cabinet, pull out the drawer, open the box, and write up why the computer is wrong again. The ratings will thus divide FS's users into two groups: those who blithely accept the computer's word, and proceed to propagate every misidentification, conflation, index error, misstandardized location, and other problem; and those whose blood pressure skyrockets every time they look at FS, and therefore start instinctively staying away.

And all of that leaves aside the other end: as Teresa said, what about all the incorrectly connected, nonsensical families? What good is a profile that perfectly matches the Ellis Island index, the draft registrations, and the SSDI, when the person in question never left Europe? What meaning is there to a "high" rating on someone's supposed-ancestor when the attached birth-and-baptism is for a child who died as an infant?

I agree fully with Teresa: this score thing is misleading and highly counterproductive. Please turn it off by default, or at bare minimum, allow users to opt out of the frustration.

Michael J Roueche · December 3, 2024

I do not think the quality rating as currently used is effective, accurate or helpful. I believe it could also be harmful as people seek to increase the rating when no other accurate records exist to date.

Yesterday I was updating a family where people had added two people who did not belong to the family. The quality rating for both was high. I also note that well-researched and documented entries receive a medium score because no FamilySearch indexed record is attached.

Have you considered using AI to search each entry to determine whether the research and attachments are good and accurate? Maybe base a score on that.

Right now it's just a check-the-box rating, and it's easy to check boxes and still be totally wrong.

Adrian Bruce1 · December 3, 2024

@Michael J Roueche suggested "… Have you considered using AI to search each entry to determine whether the research and attachments are good and accurate. Maybe base a score on that. … "

Please no. Since AI is well known for hallucinations (apparently, yes, they are called that) it seems a little strange to suggest checking the quality of human research by running a check using a facility known to produce errors.

Adrian Bruce1 · December 3, 2024

"… the Quality Score is misleading and counterproductive. … "

Sorry, I can't agree. In the first place, the term "counterproductive" needs to be justified with actual examples of researchers chasing quality scores and creating or inserting information that is wrong because of that chase. Otherwise, I'm sorry but "counterproductive" is just anecdotal.

"Misleading" - I think that is potentially, partially true. To a large extent that's because it's not clear to me that we all agree what the quality score is a reflection of.

One possibility is that it is a measure of the absolute quality of the data. Another is that it's a measure of how well the researcher has done.

Take a reasonably specific example: someone born in England & Wales during the 1800s. If they are born after mid-1837, their birth date should be available, and let's say for the purposes of this argument that a specific birth date should appear to get a high quality rating.

If they are born before mid-1837, then there's only a small chance that their birth date is available.

If a pre-1837 profile has no birth date at all, then should we give it a high quality rating simply because such dates aren't normally available? I would say no - the absolute quality of the data of the profile with no birth date should not change depending on the date, i.e. depending on whether civil registration was in place or not. If a birthless profile is only medium quality after 1837, it should be medium quality before 1837 as well.

The argument that "I shouldn't be penalised because the data isn't there in the first place" is, to my mind, irrelevant to a measure of the absolute quality. (After all, there are parish registers which do include exact birth dates pre-1837 - surely we should be able to recognise the higher quality of such data?)

If, on the other hand, we believe that the quality should be a measure of how well the researcher has done… Well, I think that's a seriously misguided idea which would lead to people chasing "gold stars"… To say nothing of how one could possibly set a metric for doing well. You'd need really detailed knowledge to know that someone baptised pre-1837 in Wybunbury cannot be expected to have a birth date whereas someone baptised pre-1837 in Witton in the same county should have a birth date because that was one of the rare registers with birth dates in.

Yes - I do think there are issues still to be resolved - such as when and how to dismiss DQ problems. And we need to make it clear to people that a profile that is high-quality need not be right. But on that score, I'm pretty clear that I'd start by looking for missing or problematic data first and I'd get that from the DQ scores. Only once I've kicked the tires in that manner would I start looking for deeper problems.

Julia Szent-Györgyi · December 4, 2024

The problem is, data by itself doesn't have a quality. It doesn't matter what birthdate is or isn't entered on a profile: that's just a date, or lack of one. What matters is whether that date or blank actually applies to the person or not — and there is no algorithm yet devised that can judge that. At all, never mind badly.

Adrian Bruce1 · December 4, 2024

@Julia Szent-Györgyi - true. So maybe it needs to be rebranded as a Data Completeness score?

Julia Szent-Györgyi · December 4, 2024

@Adrian Bruce1, hmm, maybe labeling it "completeness" would help (although I continue to question the genealogical value of a burial date when you have a death date).

Regarding justifying "counterproductive": it certainly has been that for me, this week. I've been avoiding even looking at the Tree, never mind working in it, because of the added frustration of that stupid grade up in the top corner.

kathryngz · December 6, 2024

I am finding the quality score very helpful and I'm grateful to the FamilySearch engineers who developed it. It's a new feature, so of course there are going to be issues and bugs to be resolved. But I love how it alerts me to potential problems. I also love how FamilySearch is willing to hear our feedback and improve things.

I spend a huge percentage of my time cleaning up messes in Family Tree: bad merges, wrong relationships, corrupted lines across generations. Other users have messed up good data I've put in the tree. I don't have hard evidence yet, but I strongly suspect the quality score will reduce the damage being done to my high-quality profiles.

MandyShaw1 · December 6, 2024

To me, thinking about it, data quality is something that is judged (and judgeable) by its absence, not its presence. So we should be looking at the individual data quality issues DQS has identified on profiles, effectively as extra Research Helps, not chasing gold stars. I agree that the algorithm's being limited to checking the metadata on FS Record sources is unhelpful and that the relevant UI needs clarification.
