Find-By-Name results contradicted by Profile results
Answers
-
Adrian Bruce said: Annoying to think with 20/20 hindsight that these issues would be massively reduced if FS had stuck to the GEDCOM idea of an Address item and a Place-Name item. Putting all the decorations (like the church name) in the Address would result in a Standard Place name set that wouldn't be wrecked by a thousand different ways of entering a church name.
-
Lundgren said: I can tell you what I do, but that is just my preference and not how it should be done.
When I add people, I TRY to find the standardized place. If the standardized place is close enough (the birth farm is missing but the place above it is present), I will just use the place above it. If it isn't close enough, then I put what I know is right in the value and ignore the standardized place. (I don't put street addresses, or dates or sentences in the place, that just doesn't work for anything.) I will then go to the places research and submit my place. I will also add more details to the comments if I want them preserved.
In time, the system will grow into the correct values as it keeps improving. As the search system will get the new places in time, those will eventually standardize correctly. Until that happens, they will at least standardize out consistently and be locatable by using the same text.
The search and hints are not driven just by the places. Those are a part, but names and relationships matter greatly as well. When all of it is taken into view, search and hints both work pretty well with good places that don't standardize correctly.
-
Juli said: Transliteration is a different subject than translation, though. Going from Roman letters to any of the three different Japanese writing systems gets Real Complicated Real Fast, but luckily the other direction is (relatively) easy, because the writing corresponds to a single sequence of sounds, and each transliteration system assigns one letter or letter set to each of those sounds.
-
Adrian Bruce said: Thanks Lundgren - when I was trying to be slightly more positive about it, I was starting to think along the lines of what you do for place-names.
I think that the missing element in the way that this has all been presented to the world (insert cynical comment about the way that FS does or doesn't present things to the world) ... the missing element is the continuing need to improve the standard places to include stuff from Historical Records and from whatever is necessary for Vital events (at least). I'd previously never seen any need to add extra place-names for churches, etc, because I'd just decorate the standard name of the town or village. But now, to (eventually) avoid mess-ups like the appearance of St. Peter's in Stockport vice St. Peter's in Elworth, it's clear that I need to submit those extra place-names.
And if I do decorate a place-name (if) I suspect that I need to make sure that the suggested standard at the top of the list is the right one - so "Nantwich Road, Crewe, Cheshire..." is probably not a good idea if the top suggestion is "Nantwich, Cheshire..." More work than before...
-
Jeff Wiseman said: Adrian, that road name before the city is a COMMON problem that confuses the standards matching algorithm. In Ohio there are many "Pikes" that run from town to town. Chillicothe Pike may be in a town that is not even in the county where Chillicothe is located, but putting the name in always causes Chillicothe (the town) to rise to the top of the list, even though a correct town, county, and state are ALREADY included in the entered place name.
HOWEVER, the algorithm seems to be smart enough, that if there is a NUMBER in the street or Pike name, it will not be treated as a possible city. This is nice, although I find myself concocting display names that will always put the correct standard at the top of the list since I have frequently seen that attached standard lost due to FS "improvements" and others just "tweaking" the display name not noticing that it blew away the assigned standard and replaced it with the "top of the list" value.
Also you use the term of "decorating" which is interesting. I consider something like "Saint Margaret Cemetery, Chillicothe, Ross, Ohio" no less of a Place Name than "Chillicothe, Ross, Ohio, United States" is. The cemetery name *IS* part of the place name. Why shouldn't it be included in the place name field?
So it is NOT a "decorated" place name, it *IS* a place name. However, it could arguably be referred to as something like an "extended standard" name (otherwise known as a "Standardized" name).
-
Lundgren said: Agreed. In this case, we handle all of this via localization and not active translation or transliteration.
We have dates represented as gedcomx dates (you can find the spec for that online). Easy. We have code that can convert any date to any supported browser language.
We also have code to convert place text into place IDs (place standards). We use that same code to convert the place ID to a human-readable string, ideally in the same language as the browser language.
If the full place ID has a translation defined in the browser language, it is used. If not, the default value for the place ID is used. Humans build the mapping from place ID to place text for each language ahead of time. We just reference it and return it based on what is available for the browser language.
You can also see place text that is partially localized, at a county or state level for example, but not at the city level.
-
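The lookup Lundgren describes (use the translation for the browser language if one exists, otherwise the default text for the place ID) might look roughly like the sketch below. This is only an illustration: the table, the place IDs and the function name are all invented here, and nothing in it is taken from the actual FamilySearch code.

```python
from typing import Optional

# Hypothetical table: place ID -> {language code: display text}.
# The IDs and the structure are invented for this illustration only.
PLACE_NAMES = {
    "place-1": {"default": "Budapest, Hungary", "hu": "Budapest, Magyarország"},
    "place-2": {"default": "Nantwich, Cheshire, England"},
}


def localized_place_text(place_id: str, browser_lang: str) -> Optional[str]:
    """Return the text for the browser language, else the default text."""
    names = PLACE_NAMES.get(place_id)
    if names is None:
        return None  # unknown place ID: nothing to display
    # Fall back to the default when no translation was prepared ahead of time.
    return names.get(browser_lang, names["default"])
```

A browser set to Hungarian would get the Hungarian text for "place-1" but the default text for "place-2", since no translation was prepared for it.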
Paul said: After reading the comments posted here, I must say I'm very pleased I've chosen never to include a full address in the display name! If I want to record specifically that a relative died at 123 High Street wherever, I just add that to the box beneath the Vital. That way, I can see the exact place just as clearly as if I had entered it in the place field AND not have any worries about how it will affect any standardisation process. Okay, users SHOULD be able to choose the display name of their choice, but if the current algorithm is proving inconsistent in how those entries are handled, I see no advantage in squeezing a long address (do users add postal/zip codes, too?) into that field when there is plenty of room for it to be displayed immediately below.
I'm sure some will say, "But you shouldn't show that sort of detail as a 'reason statement'", but I feel we have to use the system to meet circumstances, so if the current algorithm isn't working "as it should" in relation to our inputs, just find a way around things - at least until the problem is sorted.
(I appreciate it suits the needs of many users to have a full address as a display name, but now we don't have to click on "Edit" (except in the beta version, as I just found!) to see the reason statement comments, I find it suits me to add full address detail there, rather than in the birthplace field.)
-
Tom Huber said: This is one area where it would be nice to tag a note to an entry in Vitals, Family, or Other sections. That way we could set the title of the note to "Died at (full address)."
-
Juli said: If I want to record a full street address from a source, I add a residence. This has a description field where I can put the number and street and postal code. The city/state/country goes in the place field, the date of the source goes in the date field, and a short reference to the source ("from draft card") goes in the reason box.
But Lundgren's description of how the tree search basically throws away our efforts in choosing the correct standard is ...worrisome. Why exactly are we bothering?
-
Paul said: Please ignore my suggestion about the beta version function being different with regards to the Detail View display. Juli made the point (in another thread) that I had this switched off when in Beta.
-
Juli said: I've been mulling Adrian's question of "to decorate or not to decorate". Most of the time, I find it unnecessary or more trouble than it's worth: most of the places I deal with had one church or one registrar's office, so specifying the town is completely sufficient. Even if there were more churches, I generally just give those details in the sources. ("FamilySearch Film NNNN image M of P (BP, Calvin-square Reformed): baptism of XY, 1893".)
For cities like Budapest, it'd be nice if the district names and numbers were in the places database, but they're confusing (there are entire websites dedicated to trying to keep track of how Budapest's districts have changed over the years) and there doesn't seem to be any consensus on how to format things -- Roman numeral or Arabic? Before the city name or after? Just the number, or also one of its names? (Some of the latter are in the database, along with a random selection of cemeteries and squares and neighborhoods, but the district numbers tend to just confuse things.)
I keep coming back, though, to how discarding the standard and starting over from the text is Just Wrong.
Example: Budapest district 22 is made up of several formerly-independent places, including Nagytétény (also often just plain Tétény) and Budatétény (earlier Kistétény). There was also a Tétény in Moson county; it's now Tadten, Austria. Now suppose I had a source for an event that simply said "Tétény, Hungary", and suppose further that I had narrowed this down to one of the places that would become district 22, but I didn't know which one. Knowing what I thought I knew about the place standardization process, I could just enter "Tétény, Hungary" for the display, but choose "Nagytétény, Pest-Pilis-Solt-Kiskun, Hungary" as the closest standard, right? (It's the third choice out of four in the drop-down.) Except apparently for Find's purposes, it'd be putting the event in Austria, because Mosontétény is the first item on the list.
I'm really just left wondering: why do we even bother?
-
Jeff Wiseman said: Are you talking about the incorrectly labeled "Reason This Information Is Correct" box that should properly be named "Reason You Are Making This Specific Change"?
If you are, then anything you put into that field will likely disappear into the dregs of the Change History Log the next time anyone touches the vital.
Street addresses, cemetery names, hospital names, or church names are all LOCATIONS and belong in LOCATION fields. They are not REASONS and are illogical to place into REASON fields (even when they are labeled as the wrong Reason type).
Also, vital locations do not always correspond to residences. A Christening may be in a church. A Death may be in a Hospital (or in a mine in the case of some of my ancestors). None of those are residences.
Hijacking fields intended for other purposes to use for your own (different) purposes inevitably will create side effects for you.
But if those fields are schizophrenic in how they are treated within the site software, well, you're pretty much hosed :-)
-
Juli said: I never put data in the reason box, just references to same. (As I said: "from draft card".) The reason I use "Residence" is that it has a _description_ field. I put the address in the _description_ field. It may not use the word "location", but a description of a residence _is_ a location field. I'm not hijacking anything.
-
Paul said: To be honest, the full address does not interest me at all. So, in practice, I do not add it anywhere - my screenshot was just an example of an option I created today. Nevertheless, I admit it was probably bad advice to offer this as an option to anyone not wishing to lose this detail. However, saying, "anything you put into that field will likely disappear into the dregs of the Change History Log the next time anyone touches the vital" surely applies to entering anything, anywhere on the person page. Open-edit means the full address in the place field is also likely to disappear if another user wants to format the box another way. Everything we input is subject to change, according to the whims of other users, so again your question below ("why do we even bother") can be true of any of our entries.
-
Adrian Bruce said: Lundgren said (snipping massively) "Because of this [volatility], the search and hinting systems do not currently use the standardized place ids created when a person is entered. They use the text version of the places the user adds and standardize it with a static long-term supported version of the place standards."
The basic issue with this re-standardisation appears to me to be that the automatic, no-human-intervention re-standardisation has to choose what would have been item 1 in the drop down list in the GUI (assuming that the GUI's data matches the long-term data). (Not sure if date comes into the reckoning in this re-standardisation process).
But when the standard that the user chose was number 2, 3, 4, or whatever out of the drop down list (as can easily happen) then, as Juli and others point out above, the automatic, no-human-intervention re-standardisation will come up with the wrong answer, with potentially all sorts of problems for people searching or using hints and not reading the profile.
I have to take the volatility issue as read but wonder if there's a way to work round it for some stuff?
Let's suppose that the automatic, no-human-intervention re-standardisation could be modified to use both the text version of the place that the user added (as now) and the standardised version of the place-name as chosen by the user. (No idea if that should be the machine readable key or the text - probably the key, if possible, given the number of duplicate names of different types).
Revised processing would be to take the user-standardised version of the place-name; see whether it's in the static long-term supported version of the place standards; and if it is - use that.
(All sorts of assumptions there about whether the keys are the same in both the live and the long-term systems)
If the user-standardised version of the place-name is not in the static long-term supported version of the place standards, or if there isn't a user-standardised version at all - then proceed on current logic by re-standardizing using the text version of the places the user added.
The above would always come up with a long-term supported version of the place-name. It would also reduce those instances where the automatic, no-human-intervention re-standardisation place-name is different from the user's version.
For instance, in my example that started this off, assuming that Elworth is in the long-term data, it would standardise on Elworth like I did, because it uses my Elworth.
The issue with my proposal is that I have no idea whether the data that I suggest should be used, gets through to the point where it's needed.
-
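Adrian's revised processing above can be written down as a short sketch. It rests on the same assumptions he flags himself: that the user's chosen standard reaches this step at all, and that place keys are the same in the live and long-term systems. Every name below (the function, its parameters, the place keys) is hypothetical.

```python
from typing import Callable, Optional, Set


def restandardize(
    place_text: str,
    user_place_id: Optional[str],
    long_term_ids: Set[str],
    standardize_from_text: Callable[[str], str],
) -> str:
    """Prefer the user's chosen standard when the long-term data knows it."""
    if user_place_id is not None and user_place_id in long_term_ids:
        return user_place_id
    # No user standard, or one the long-term data lacks: current logic,
    # i.e. re-standardize from the text the user typed.
    return standardize_from_text(place_text)


# Illustration only: a stand-in for the automatic process that always
# returns what would have been item 1 in the drop-down list.
def top_of_list(text: str) -> str:
    return "st-peters-stockport"


long_term = {"st-peters-stockport", "st-peters-elworth"}
text = "St. Peter's, Elworth, Cheshire, England"
chosen = restandardize(text, "st-peters-elworth", long_term, top_of_list)
fallback = restandardize(text, None, long_term, top_of_list)
```

Here `chosen` keeps the Elworth key because the user picked it and it exists in the long-term set, while `fallback` behaves as the system does now.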
Adrian Bruce said: Re "you use the term 'decorating' which is interesting". If you find it interesting, I shall take that as a compliment!
Yes I'm not sure of its exact origins but I seem to have developed the habit of using that term to describe the process that I go through to extend (or whatever) the text from the standard name. Usually it is a prefix but I think that I did once successfully stick "Co" for "County" into the middle of one of those absurdly repetitious place-names like (fictitious example) "Orange, Orange, Virginia, USA"
Yes, it is a place-name in its own right. It's just not a standard place-name.
-
Juli said: An exact street address can help find elusive census records, for example. (Steve Morse has a utility for converting addresses into enumeration districts.)
-
Jordi Kloosterboer said: Haha FamilySearch doesn't have it (yet) but Utrecht, Utrecht, Utrecht, Netherlands is an example :P town/city within a municipality within a province all the same name
-
Jeff Wiseman said: And it can help discern the difference between people with similar names living in the same neighborhood around the same time.
-
Lundgren said: I believe the reason to bother is that you want the data recorded in the tree correctly.
Eventually, the standards will catch up. Today, you probably aren't searching the tree for the people that you add that frequently, but when you are, based on names, dates AND places, you will probably locate the person you are after.
If not, let us know and we can look into what may be happening.
Hinting also uses names, dates, places, as well as relationships. All of those parts are used together to determine if a record is a hint.
If the place IDs from the tree are calculated one way (by a human picking from somewhere in the list) and the places in the records are calculated using a different method, then they will not match. Using the same method ensures that two identical text places will translate to identical place IDs.
It is also worth noting, that the systems do not pick JUST the top one in the list and use it. Based on internal thresholds, if there isn't a clear match, each non-deterministic place is indexed using multiple possible place IDs.
It is not limited to just one place. If you see a pull down that has 50 places in it, the system may index it on the top 5 depending on the confidence of the place results being returned from standards.
If the "right" one is the 3rd one down, then it will probably be used in the indexes, as well as the top two that may not be as good, and the 4th one down that is really not so good.
Since the records are not perfect, and the tree data is not perfect, the system has to do multiple things to try to compensate and return the best possible results.
Hopefully that helps dispel a little despair.
-
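To make the "index on several candidates when there is no clear match" idea concrete, here is a rough sketch. The thread does not say what the internal thresholds are, so the scores, cut-offs and limit of five below are invented purely for illustration, as are the function and parameter names.

```python
from typing import List, Tuple


def ids_to_index(
    candidates: List[Tuple[str, float]],
    clear_match: float = 0.9,  # invented threshold for a "clear" match
    min_score: float = 0.5,    # invented floor for a usable candidate
    max_ids: int = 5,          # "the top 5" from the post, as an upper limit
) -> List[str]:
    """Pick the place IDs to index for one piece of place text."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    if not ranked:
        return []
    if ranked[0][1] >= clear_match:
        return [ranked[0][0]]  # clear match: index on that one place only
    # Ambiguous text: keep every reasonable candidate, up to the limit.
    kept = [pid for pid, score in ranked if score >= min_score][:max_ids]
    return kept or [ranked[0][0]]
```

With candidates scored 0.7, 0.65, 0.6 and 0.2, the first three would all be indexed, so a "right" answer sitting third in the list is still findable.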
Adrian Bruce said: "I believe the reason to bother is that you want the data recorded in the tree correctly".
Well, no, the data for Charles Charity is recorded correctly in the tree. It's finding it that's the issue because the results from menu option FamilyTree / Find / Find By name give the wrong answer in that they specifically state that the Charles Charity in the tree is buried in Stockport, whereas I know that he lived in Elworth at the other side of the county.
To give a use case of how this might happen:
- Imagine that I don't already know that "my" Charles Charity is in FamilyTree;
- Suppose I find an image of his grave (I think it's on BillionGraves) or I find the grave in Elworth;
- I decide I want to see if he's in FS FamilyTree so use menu option FamilyTree / Find / Find By name plus rough date of birth to search for him;
- Back comes the results list that is the first image I posted at the top of this thread. There are several guys called Charles Charity - I discard the first because he's in totally the wrong counties. I discard the 2nd because, although he's buried in Cheshire, he's buried in Stockport, the other side of the county from Elworth, where I know he's buried. Etc, etc.
- I therefore come to the conclusion that "my" Charles Charity is not in FS FamilyTree, so I go on to create a profile for "this" Charles Charity, which, unbeknown to me, is a dupe.
"if there isn't a clear match, each non-deterministic place is indexed using multiple possible place IDs ... the system may index it on the top 5 depending ..."
That's good to know - however the results list in FamilyTree / Find / Find By name only shows one (As Far As I Know!!).
"Eventually, the standards will catch up."
But will they? Because I thought I knew that I could happily prefix Standard Place-Names with church-names, cemetery names, etc, I for one was not even thinking about asking the Standards team to enter the individual places. Now I really don't know what I should do - there are 9 Church of England churches in my home town with registers, plus a Roman Catholic church plus 11 or more non-conformist churches with registers, plus 3 cemeteries. Big job.
I could treat the church names like you do many of your farm names and leave them out.
Alternatively - if I do ask the Team to enter those churches into the standard place-names it will - potentially - mess up others using place-name prefixes until they also ask for them to be entered because (say) someone might enter "Daresbury, Cheshire, England" with a prefix of "All Saints" and carefully standardise it to "Daresbury, Cheshire, England". But the background long-term standardisation might decide the manually input "All Saints, Daresbury, Cheshire, England" matches the (theoretical) long-term name "All Saints, Crewe, Cheshire, England" instead.
I'm reasonably happy with the idea that there is a long-term version of the place-name data - I've been there with questions of synching changes in data. Most of what you told me is done seems to be pretty reasonable. It's the problem with the incorrect results in option FamilyTree / Find / Find By Name that is the issue.
As I said above, I wonder if the user-standardised version of the place-name can be used instead of the auto-generated one provided that it's in the static long-term supported version of the place standards.
-
Lundgren said: I think that church names should be treated like cemetery names and should be added. I would submit the names that you are aware of and let the standards team use their methods for determining what gets added and what doesn't.
I would like to be able to see both the "localized" version as well as the original version in the search results.
For documents or tree people with entries in character sets that I may not understand, that would be even more helpful.
I can make suggestions in this area, but I am not on the UI team and don't get to decide what goes on the results page. (My team can provide data, but we don't pick what gets shown on the screen.)
-
Adrian Bruce said: Yes I guess that the original user input text for the place name would be good for solving the issue. Let's hope that the UI team realize that there's an issue and can find something useful in this thread to fix it.
-
Paul said: Probably going (as usual) rather off-topic here, but isn't the whole issue around standardizing of place names seriously flawed? It's only by accident that I have found the reason a displayed place name looks like nowhere that exists (or ever has) is that the "hidden" so-called standardized place bears little or no resemblance to the displayed one. The "system" (or another user) has standardized it (if you are fortunate) with just the county name (e.g. Toytown, Essex, England is standardised as Essex, England), but equally possible is the displayed name (and it doesn't always have to look so silly) has been standardized as somewhere thousands of miles away (e.g. Essex, Vermont, United States).
I have found the problem particularly troubling when it comes to place names beginning with the name of a church - e.g. All Saints or St Peter's. I suggest users take a random look at some of these. I have often thought, "Oh, I wouldn't have expected that to be standardized correctly." The truth has been that IT HASN'T!
Don't let the lack of a data warning flag lull you into believing the display name is serving any useful purpose with regards to any FamilySearch algorithm helping you find matches for your relatives. As far as the programming / search algorithm is concerned, you are looking for an individual born possibly thousands of miles from that location.
And you haven't ever wondered if there might be any other reasons (excepting the main subject of this topic) why "the system" can find no sources relating to your great-great uncle George?
-
Lundgren said: There is certainly room for errors in the system. And it doesn't deal with just Tree data.
Tree, currently has a user interacting with the system to determine place names. (Which interaction doesn't always yield perfect results.) This hasn't always been the case and there is lots of data in the tree that hasn't seen human interaction in quite a while. Some of it may even have been imported from third party systems with different standardization systems that provided nothing but a text value for the place.
Another of the systems is Historic Records. That has data that at times is ambiguous with no clear answer in the record information. For example, take a US census record from the state of Arkansas. On that record, they asked the person where they were born and recorded the country or state. If the person living in Arkansas answers that they were born in Georgia, the enumerator wrote down Georgia and moved on. What Georgia was that? Georgia USA, or the country of Georgia? There is no clear answer. We must index on both. (There are many places where this same thing happens.)
Additionally we have uploaded genealogies. There is no standard around that data at all. It comes in as raw text strings for places with no control guaranteed around what is placed in any fields, and no user interaction possible at standardization time. People overload the place fields with all sorts of data.
If the place of Essex is not a perfect match on the CT or the UK location, we will index on both and search on both. This increases the recall, albeit at the cost of precision. Showing you only the localized standardized version must fail in some cases.
-
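The recall-versus-precision trade-off from indexing an ambiguous place under every plausible ID can be seen in a toy inverted index. The record IDs and place IDs below are made up, and this is a sketch of the general technique rather than a description of how FamilySearch actually stores its indexes.

```python
from collections import defaultdict
from typing import Dict, Set


def build_index(records: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each place ID to the records indexed under it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for record_id, place_ids in records.items():
        for pid in place_ids:
            index[pid].add(record_id)
    return index


def search(index: Dict[str, Set[str]], query_place_ids: Set[str]) -> Set[str]:
    """Return every record indexed under any of the query's place IDs."""
    hits: Set[str] = set()
    for pid in query_place_ids:
        hits |= index.get(pid, set())
    return hits


# "Born in Georgia" on a census: the record is indexed under both readings.
records = {
    "census-1": {"georgia-us-state", "georgia-country"},  # ambiguous text
    "census-2": {"georgia-country"},                      # unambiguous
}
index = build_index(records)
us_hits = search(index, {"georgia-us-state"})
country_hits = search(index, {"georgia-country"})
```

A search for either Georgia finds "census-1", which is the gain in recall. The cost in precision is that one of those two hits is necessarily the wrong Georgia for that person.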
Paul said: An interesting example I came across this evening. My ancestors lived at a place named Danby in Yorkshire. There is no correct standard place name for this in Family Tree. The first option in the drop-down menu is Danby by Middleham: there is no place of this name. The second option is Danby by Castleton: again, no place officially known as this. Indexing projects have led to Danby being standardized as either of these two effectively non-existent places.
I have to be honest and say there is a Danby HALL near Middleham - about 50 miles west of the village / parish of Danby. There is also a village of Castleton, a couple of miles from Danby, but however did the two incorrect place names come to find their way into the standards database, whereas the correct place name has been completely omitted?
See https://www.genuki.org.uk/big/eng/YKS.... Even the parish's aka "Danby in Cleveland" is not mentioned here, as the official name of the parish has always been just "Danby". (Email sent to placefeedback@familysearch.org)
This discussion has been closed.