Input request for searching FamilySearch records from a tree person.
Lundgren said: We are currently evaluating the search parameters that are used when you run a historic record search from a tree person.
We are doing this by taking persons from the tree, gathering all their information on the tree as well as the IDs of the familysearch records that are attached.
We then take the data from the tree and formulate requests that we then run against record search.
We score the request based on the number of the attached records that are returned in the results.
The query that returns the highest number of attached records across all of the tree IDs will be the query that we recommend the Tree Web and Mobile teams use when querying the historic records from a person in the tree.
We already have gathered a significant sample set of tree ids that we are using to evaluate searches.
If you would like to provide some tree-ids that you would like us to include in our sample set, please reply in this thread with the tree id[s] of the person[s] you would like included.
A sample Tree ID could be: LC32-HZ6
You could also supply a link to a tree person like this:
https://www.familysearch.org/tree/per...
We will use IDs, but aren't going to search out names/places/dates to find an ID.
In the future we may build more specific search from the tree functionality, but for now, we are looking to create the best uniform search to recommend to the teams that create the user interfaces.
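The scoring described above could be sketched roughly as follows. This is only an illustration: the function names, the data layout, and the idea of each query variant as a callable that returns record IDs are assumptions, not the team's actual code or the record search API.

```python
def attached_hits(returned_ids, attached_ids):
    """Count how many of a person's attached record IDs a search returned."""
    return len(set(returned_ids) & set(attached_ids))


def score_variants(variants, sample):
    """Total the attached-record hits for each query variant over the sample.

    variants: dict of name -> callable(person) returning record IDs
              (a hypothetical stand-in for a real record-search call).
    sample:   list of {"person": ..., "attached": [...]} entries.
    """
    return {
        name: sum(
            attached_hits(run_search(entry["person"]), entry["attached"])
            for entry in sample
        )
        for name, run_search in variants.items()
    }


def best_variant(variants, sample):
    """Return the variant name with the highest total, plus all totals."""
    totals = score_variants(variants, sample)
    return max(totals, key=totals.get), totals
```

Results that are not among the attached records add nothing to the score here, so a variant that returns many unrelated records gains no advantage from them.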
gasmodels said: I am by no means an expert on searching, but as a user the number of attached sources is not a good measure of the correctness of a record. If you look at John Butterworth LZ2M-ZZ9 you will see he has 85 sources attached, has 4 spouses and 35 children attached, and there are 4 additional possible duplicates. The record has not evolved to the IOUS stage, but it is clearly an over-merged record, and searching using this record would not be a useful example on which to base search engine characteristics. It is obvious the hinting and possible duplicates features have been used by individuals who have not done any research to create this record. All I am doing is suggesting that some care needs to be taken in the selection of Test or Example records for the results to be truly meaningful.
Juli said: I think this fact is partly why Lundgren is asking for us to supply some PIDs: ones with lots of _good_ source attachments may help the algorithm work better, and maybe possibly will eventually make IOUSs less likely.
(Unfortunately, I don't think I can help much, as about 90% of my sources are unindexed, and of the indexed ones, about 50% have one or more errors in the indexing.)
Paul said: Lundgren
Firstly, I would like to express my appreciation that you are reaching out to the "ordinary user" for help in developing this feature.
However, I would be grateful if you could be a little more specific regarding examples of IDs that would be most useful for your purposes. Otherwise, I'm afraid we might easily provide samples that will not be helpful in this exercise.
The first point I would like you to confirm is whether you are talking about our searches from the person page - specifically by clicking on the FamilySearch logo, which takes us to the Search page. When a search is conducted by this method, certain data are obviously carried across, depending on which fields the user has already completed.
If I now make a few general points, I believe you will be able to respond by confirming whether these will be considered as part of any enhancement, or whether I am completely off-track and all you are really looking for (in us providing the requested IDs) is "A, B & C".
Will any future change, after analysing various IDs:
(1) Possibly lead to the "Exact match" facility being withdrawn?
(2) Lead to "United Kingdom" (for example) no longer being carried across, which currently produces more results than there would be if it did not appear in the Place Name fields?
I suppose my problem is getting my head around what you hope to gain from this exercise. Another thought that comes to mind - in comparing, say, the number of sources attached to different IDs - is the fact that some of my ancestors will not have a single source to attach from the available FamilySearch collections, whereas others will have a huge number to choose from.
Then there is the factor of whether (as in gasmodels' example) these 85 sources genuinely relate to the ID concerned or if they relate to multiple individuals of similar identity.
Finally, there is the issue that search results will vary according to whether the user's account is public or a member's one. (LDS members will presumably see results from collections that are available to them, but not the general public. To confuse things further, sources attached to some of my IDs would no longer be found in my searches as certain collections have since been withdrawn from public accounts.)
I confess I am deliberately "rambling on" to try to ensure any examples provided are completely relevant to your purposes, rather than being totally irrelevant to the exercise!
Sorry about this, but I can't stop thinking about my main problem of other users attaching sources not relating to the ID in question, either due to their interpretation of the (current) search results that are offered, or through Record Hints.
But my general confusion concerns how you are going to evaluate the information you receive when the factors (say in number of sources, rightly or wrongly attached) are so varied and complex.
Forgive me if I am making the issue / your request far too complicated!
Paul said: The problem is, what is a "good" source attachment and how would any analysis be able to differentiate between sources correctly or incorrectly attached? There is also the factor of sources that have been detached by another user. That is partly why I added the long post addressed to Lundgren. Without knowing exactly the sort of IDs he is requesting, we could be helping to provide meaningless information, which would lead to an incorrect analysis of the data.
Lundgren said: I'm sorry I wasn't clear enough.
I have seen some passion on this site from people recommending what parameters should be passed from the tree to the familysearch historic record search for the initial search.
Some like the mobile options better than the website options.
Some have suggested we pass every detail from the tree person to the record search.
Some want us to pass every detail of every place.
Since the mobile app and the website are not consistent, we are doing a study to determine what the best parameters are for a wide range of tree persons.
We are going to take all of the tree-ids we identify and run the many different searches across all of them.
Sample questions we are going to answer based on the results from the searches are:
- Should we pass everything in the birth place fields to the record search, or just the top two or three levels?
- Should we pass the birth/residence/marriage/death or just the birth information?
- Should we pass parents, spouse, children or just the primary person?
- Should we pass just the event year, or the year ±1, ±2, or ±5 years?
The search that returns the highest number of attached sources will be considered the best.
Based on the results we will recommend that the tree website and the mobile applications use the search parameters we determine to be the best.
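To illustrate how the sample questions above turn into "many different searches", here is a rough sketch that enumerates every combination of the options mentioned. The option names and values are assumptions made for the example; the thread does not say how the real study encodes them.

```python
from itertools import product

# Hypothetical encodings of the options listed in the questions above.
PLACE_LEVELS = ("all", 2, 3)          # whole birth place, or top two or three levels
EVENTS = (
    ("birth",),
    ("birth", "residence", "marriage", "death"),
)
RELATIVES = (
    (),                               # primary person only
    ("parents", "spouse", "children"),
)
YEAR_SPREADS = (0, 1, 2, 5)           # exact year, or plus/minus 1, 2, 5 years


def build_variants():
    """Return one parameter set per combination of the options."""
    return [
        {"place_levels": pl, "events": ev, "relatives": rel, "year_spread": ys}
        for pl, ev, rel, ys in product(PLACE_LEVELS, EVENTS, RELATIVES, YEAR_SPREADS)
    ]


def year_window(year, spread):
    """Turn an event year and a spread into an inclusive (from, to) range."""
    return (year - spread, year + spread)
```

Each parameter set would then be run against every tree ID in the sample and scored by how many attached records come back.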
This effort will not result in the removal of features from the familysearch record search system.
We have already selected a significant sample set.
We would like to give you the opportunity to participate and know that we are considering your specific cases.
We would like to let you add IDs for tree people that have good sources attached.
If you have or know of persons on the tree that have good, indexed records attached to them as sources, that you would like us to consider in this evaluation, please post them here.
Thank you!
Lundgren said: I'm sure I haven't answered every question in this post below. Hopefully I have gotten the major ones though.
At this stage, we are not going to evaluate the quality of the records attached, we are trusting that our users have done that before they attached them. We understand that there are probably some that are poor as you indicated. We have also selected from our own ancestors that we know have good attachments. You can help us improve the data set by adding IDs that you know are good and skipping ones you know are not.
The ID I chose as an example above is not in our sample set, just a well-known historic figure.
Paul said: Lundgren
Thank you for your response. I must admit I am rather surprised you are using this method to determine any changes on what "passes across". Indeed, I would have thought that existing knowledge of certain, known factors would have led to a more reasoned approach to any forthcoming changes. For example, in relation to criteria you mention above:
(1) We know the adding of "United Kingdom" to its constituent countries (some years ago) led to a far wider number of results being offered, many being unlikely matches.
(2) The event year always needs to be in a range - to allow for inaccuracy in the records: e.g. the 1841 England & Wales census has ages "rounded down" by up to 4 years, and census and death records in general often provide an age that is at least 1 or 2 years adrift.
(3) It is a known factor that the existing practice of birth, marriage and death all coming across inhibits the production of relevant results. Most experienced users already know they must delete some of this detail to get more accurate / specific results. For example, if I am searching for a marriage, it is best to remove the birth & death results carried over to the Search page. If I am searching for a death, I usually leave a range of birth dates (as these are prioritised) but delete the birthplace, as leaving this will almost certainly provide me with "Nil" results.
What I am trying to illustrate is the fact that experienced users know only too well the pitfalls of the present feature, especially whereby birth, marriage & death inputs are carried across - and know the various workarounds in order to refine the initial results produced. The really careless user will take anything at face value and end up attaching anything to anybody. (No exaggeration, not even same first name / surname!)
I honestly don't see how the factors you mention will be highlighted by any analysis that involves good, indexed records being attached. In themselves, these are likely to contain inaccurate detail relating to the individual and will only have been attached after careful analysis by the user - or sometimes by there being no other close match in the source details to anyone else who lived on the planet!
Sorry to sound so negative, but I believe (as evidenced by the Record Hint / Possible Duplicate algorithms) it is always going to be near impossible to find the fine balance between providing too much information / too many results and too little. This can lead either, on the one hand, to an accurate source being missed altogether or, on the other, to its being attached to the wrong ID.
My best wishes in any attempt to try to achieve your desired goal. Whatever method you use is going to prove extremely challenging.
Jordi Kloosterboer said: Here are some IDs for Dutch people with 10+ sources:
L4HN-WNL
LZXP-W95
LWYC-YG6
LBBT-FMF
KZW3-BVJ
LYB5-FRC
KL15-QJP
Let me know if these profiles are sufficient, if I should find more like them, or if I should find different ones.
Tom Huber said: There is a strong case against passing all details as search parameters. The results could easily equal zero finds.
As it is, when I use the profile FamilySearch Search function, I often edit down the information, since I am usually looking for certain information, and not a general blanket "everything."
But that's me and likely not the case with many other users.
Tom Huber said: Hm. In other words, go for the majority of users. I don't mind having too many parameters and it is easy enough for me to delete them.
Lundgren said: Thank you for your comments.
Because there is so much variation in the tree and the records, we are doing an exhaustive comparison of searches rather than targeting it on what we believe we already know.
This will give us a greater degree of confidence that the choice we make is the best choice based on the data set we are using.
We have the scenarios you mentioned in 1,2,3 all covered and more.
Our first pass showed us that neither of the two methods being used by mobile or web user interfaces provides the best experience.
We are now exploring additional options that are not being done by either of the user interfaces including exploring search options that are not yet available in the user interfaces.
If you have any tree people that you feel have good records attached to them that you would like included in our data set, please provide them!
Thank you!
Lundgren said: These are great! If you have more that you would like to share, please do!
Tom Huber said: Pieter Claesen 9312-XFX is problematical in searching (either historical records or using the Find function in the tree). The biggest problem is that seven years before he died, he had to select a fixed name for a surname -- the British didn't like patronymics -- He selected Wyckoff (actually a variation of the name -- see his record and the listing of those who participated in the pledge of allegiance).
I just did a quick check using the FamilySearch in Search Records from Pieter's Profile and the result that came back was confusing at best and not related in any way, even though Pieter had a daughter named Annetje.
Something to look at...
Lundgren said: There also seem to be many near duplicates/variations in the alternate names....
Jordi Kloosterboer said: Okay, here are some more:
LZGH-KBP
L62K-5Q1
LZDH-NG2
LZF8-F93
LCRB-BF7
9MQ1-R8N
K63S-JYQ
L6FQ-LYW
KCLN-H24
L61Z-WZ1
terry blair said: Lundgren,
I'm still a little confused here in what you are asking for. Obviously the head of a big family located in the US in the period of time when birth, death, marriage, and census records are available will have many more sources possibly attached to him than someone born in 1700 in a location where the court house was destroyed. So, just the number of attached sources will provide --- ?
Lundgren said: Thank you!
Lundgren said: Jordi Kloosterboer provided a perfect example of what we would like.
Good tree ids that have good records that are attached that you would like added to our "truth" set of tree persons that we will use for baseline search testing.
We will take the info from the tree as well as the attached record ids and see how many of the attached records a single search can return.
If there are only three historic records in our system for a tree person, we want our default search to return as many of them as possible.
If there are 20 sources for the person, we want our default search to return as many of those 20 as possible.
Thank you!
Paul said: Lundgren
Hope these might be of help - just picked at random, but all having 10+ sources attached.
M5MY-2R4
LBMS-D2D
MMTW-HTV
MNJQ-SGM
LDFC-YZG
MM99-Z4V
L4CK-8SN
MQQ7-RB5
Jordi Kloosterboer said: Tom, just so you know, ij and y are basically the same in Dutch, which that person seems to have spoken, especially since he lived in Nieuw Nederland. I pretty much always convert y to ij and do not show both as alt names since, if I did, I would explode the alt names section with names that all look like each other but with slight spelling variations. (And I pretty much always delete long lists of names with slight spelling variations that I find on my relatives from the Netherlands.) I also assume the search knows that 'IJ' is treated as a single letter in Dutch and corresponds to 'Y'? However, if it is the Americanized/anglicized version of the Dutch, then I may leave the y version as then it is a different language. You may choose to change some of the alt names based on this or not. But that is my take on it.
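As an illustration of the ij/y point, a name comparison could fold both spellings into one key before matching. This is only a sketch: `dutch_name_key` is a made-up helper, and nothing in this thread confirms whether the FamilySearch search does anything like it.

```python
import unicodedata


def dutch_name_key(name):
    """Build a comparison key that treats Dutch 'ij' and 'y' as the same letter.

    NFKC normalization also expands the single-character ligatures
    (U+0132, U+0133) into plain 'IJ' / 'ij' before folding.
    """
    key = unicodedata.normalize("NFKC", name).casefold()
    return key.replace("ij", "y")


def same_dutch_name(a, b):
    """True when two spellings differ only by case or by ij versus y."""
    return dutch_name_key(a) == dutch_name_key(b)
```

With a key like this, a spelling such as "Wijckoff" (an illustrative variant, not taken from the record) would compare equal to "Wyckoff" without both having to be listed as alternate names.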
gasmodels said: Lundgren
Here are a couple of records from the Isle of Man
LL9J-W2Z
LLMT-R4B
KGZ1-F3J
These are interesting because they are from the Isle of Man and it is sometimes very difficult to find the records because of the way the historical records are indexed. The first two each have over 50 sources attached, the 3rd one only 9 -- all of them have 3 christening records for different collections but the current search engine easily finds only two of the records. Hope you find them useful.
David Newton said: LHTP-5KM
MFG3-MGP
MNMY-MSJ
LWDY-QZL
These are direct ancestors of mine from various bits of England.
One point about the search that I would make is that it is too dependent on the place of birth. That's fine if the person stayed in the same area for their whole life. However, even if they moved to the next county over, that makes finding death records and indeed marriage records much more difficult. Most marriage and death records don't have birth location information in them but do have birth year information in them. Therefore search results for marriage and death records should have the date much more heavily weighted than the place. Geographical proximity should also be weighted more than jurisdictional similarity, so two records five miles apart but in different jurisdictions should be grouped together more than two records 30 miles apart but in the same jurisdiction.
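The proximity idea could be sketched like this, ranking candidate records by great-circle distance from a known place and ignoring jurisdiction entirely. The record layout and function names are assumptions for illustration only, and this says nothing about how the current search actually ranks places.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def rank_by_distance(origin, records):
    """Sort records (dicts with 'lat' and 'lon') nearest-first from origin.

    Jurisdiction is deliberately not consulted, so a nearby record in a
    neighbouring county sorts ahead of a distant one in the same county.
    """
    lat0, lon0 = origin
    return sorted(
        records,
        key=lambda r: haversine_km(lat0, lon0, r["lat"], r["lon"]),
    )
```

A real ranking would presumably combine this with the date weighting suggested above rather than use distance alone.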
Tom Huber said: In this particular case, the names were recorded in Friesen on the British allegiance document, not Dutch, though I suspect the two are very similar.
I'm not a language guru, so I really don't know, but one of the purposes is for the search routines in Find to locate what otherwise may not be readily apparent. So I've listed both.
I stopped with what I had entered because I kept finding more spelling variations over the course of Pieter's life.
Tom Huber said: By the way, I appreciate the input, Jordi.
Jordi Kloosterboer said: No problem
Lundgren said: Thank you for your input. All of the supplied IDs have been added to our data set.
This discussion has been closed.