General database query
Hello,
I'm trying to get statistical information about the most common names (first names and surnames) by age and country.
I think FamilySearch would be an excellent source for these statistics, but as far as I can tell there's no actual way to get them. Does anyone know of a way to get all this data?
Best Answer
-
I would imagine you would need to submit a formal request for assistance with this, since ordinary users do not have anything resembling this sort of reporting/analytics access. @Ashlee C. can you help here please?
1
Answers
-
I would have thought censuses were a good place to start. I'd say you were probably better off applying to the authorities in the individual countries to see if this summary information is available (and potentially for more recent censuses than are publicly available at detail level). The FS research wiki may well help identify those authorities. I definitely wouldn't use the FS Family Tree in any way: it is too inaccurate, incomplete, and skewed towards those families that are either LDS or happen to have been researched, and anyway we as users have no real analytics access to the FT (or any other FS) database.
3 -
Censuses or birth registration indexes, I'd suggest. (Lots of countries either don't have censuses or have destroyed them after using them for statistical purposes - Australia is an example of the latter).
Purely in the interest of helping you firm up on your ideas - what do you mean by "age"?
If there was a way to access a snapshot of the UK (say) today, age would enable you to analyse when names were used and therefore in what proportion. However, there is no such snapshot (or rather, there is: it's the census, which can't be released until 100 years have passed).
If you look at birth indexes and still want to analyse when names were used, then you'd need to look at the year births were registered.
Just as an incidental, I'm fairly certain that the General Register Offices of England & Wales, and of Scotland do produce annual analyses of name usage for births - this enables newspapers to track the emerging and declining popularity of names as prompted by fashion. How many administrations do similar, I've no idea…
2 -
For the U.S., the Social Security Administration has (extensive) given name data sorted by birth year, and they make it all available to the public. For surnames, the Census Bureau is the place to look.
As Adrian pointed out, many countries destroy census enumerations after tabulation — but that tabulation often does include name statistics. I don't know whether there's any website that attempts to collect such data in one place, but there are sites that claim to offer worldwide name statistics; perhaps you could start with some of those.
I agree with Mandy that FS's Family Tree would not be a good source of statistics: it would be highly skewed toward "Mormon" and "famous", and the counts would be highly inaccurate due to duplicate profiles and misspellings.
2 -
Thank you all for your replies, but that's not exactly what I'm looking for…
To answer the question of what I mean by "age": I apologise, I meant by year. I'm attempting to build a database of the most frequent names by period, in this case periods of 20 years.
Since I'm looking for these frequencies over time, censuses are too limited: they cover only a few centuries, and my goal is to go back to at least the 10th century, by country/area/location.
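For illustration only, the 20-year grouping described here could be sketched in Python roughly as follows. The `(name, year, country)` record layout, the function names and the sample rows are all assumptions made up for this example; they do not reflect any FamilySearch export or API.

```python
from collections import Counter, defaultdict


def period_start(year, width=20):
    """Return the first year of the `width`-year period containing `year`."""
    return (year // width) * width


def top_names_by_period(records, width=20, top_n=3):
    """Count names per (country, period) and return the most common ones.

    `records` is an iterable of (name, year, country) tuples (assumed layout).
    """
    counts = defaultdict(Counter)
    for name, year, country in records:
        counts[(country, period_start(year, width))][name] += 1
    return {key: counter.most_common(top_n) for key, counter in counts.items()}


# Made-up sample rows, for illustration only.
sample = [
    ("John", 1543, "England"),
    ("John", 1551, "England"),
    ("Mary", 1547, "England"),
    ("Jean", 1545, "France"),
    ("Jean", 1561, "France"),
]
result = top_names_by_period(sample)
for (country, start), names in sorted(result.items()):
    print(country, f"{start}-{start + 19}", names)
```

Any source of dated, located name records could feed a function like this; the hard part raised elsewhere in the thread is obtaining such records, not the counting.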
I know Family Tree may not be the best source, but it's the most complete one I know of that could get me to my goal, which is why I'm interested in getting the data from this website.
0 -
I honestly would not consider the data on FT to be of sufficiently high quality, /especially/ pre census information, to meet what sound like your needs. I wonder if church or other religious records might be an option for some countries/periods of time?
1 -
Honestly, there are SO many duplicate profiles in the FSFT, with more being created daily, that the data would not be reliable.
1 -
@Paul11102 - re "my goal is to have them since at least the 10th century and by country/area/location."
Never mind the mechanism, you need to seriously reconsider the scope and feasibility of your objectives. Parish Registers didn't begin in England & Wales until 1538 - and I don't think England & Wales was particularly behind the times. (See https://en.wikipedia.org/wiki/Parish_register)
There is, for the vast majority of English & Welsh folks, virtually no evidence of their names before parish registers. Yes, there are name sources before 1538, such as Manorial records, but they are few and far between. Since they aren't online but in various archives across the country, it would be a workload probably equivalent to a college dissertation to do just England & Wales. Plus you need the statistical tools to normalise the English & Welsh data versus the French (say) in order to gauge relative frequencies of Jean and John, say.
As you go further back, the surviving records with names get statistically skewed towards those people who were important enough to be found on charters, etc. Plenty of (upper class) Norman names - very few Anglo-Saxon names after the Norman Conquest.
I urge you to study the types of source documents first, identify how they match up to your overall ambition, where they are held, understand their limitations and only then decide how you could extract the data.
3 -
Yes! I'm aware of all the issues involved, but once again, I've found that this website is by far the most complete database for achieving my goal.
This is not scientific research; it's a hobby project to get name statistics by year and region.
Therefore, the precision and rigour of the database are not critical; the objective is to have an estimate.
Cognate names are not a concern for me either, since I will have the frequency of Jean in France and John in England, but also the frequency of Jean in England and John in France.
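As a rough sketch of that comparison, raw counts can be turned into per-country shares so that countries with very different numbers of surviving records remain comparable. The `(name, country)` layout, the `name_shares` function and the sample counts below are invented for illustration and are not real statistics.

```python
from collections import Counter


def name_shares(records):
    """Return {country: {name: share}}; shares within a country sum to 1.

    `records` is an iterable of (name, country) tuples (assumed layout).
    """
    per_country = {}
    for name, country in records:
        per_country.setdefault(country, Counter())[name] += 1
    shares = {}
    for country, counter in per_country.items():
        total = sum(counter.values())
        shares[country] = {name: n / total for name, n in counter.items()}
    return shares


# Made-up counts, for illustration only.
sample = (
    [("John", "England")] * 3
    + [("Jean", "England")]
    + [("Jean", "France")] * 4
    + [("John", "France")]
)
shares = name_shares(sample)
for country, table in shares.items():
    print(country, table)
```

Shares only describe the records that happen to be in the sample, so any skew in which records survive or were indexed carries straight through to the result.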
So once again, my question is: is it possible to query the whole FamilySearch database?
0 -
The last I read, I think something less than 15% of the billions of records on FamilySearch are indexed to make them searchable by name.
There is a pilot program for full-text search, but that currently covers only a few sets of records in a couple of locations.
1 -
If I might throw my hat into the ring with a minor warning: The list of all names that have been presented in digital records is not ideal. There are some names that have been entered into the digital record that do not actually exist as real names. Example: If you are able to obtain a list of all surnames in the G.R.O. registers for England and Wales, you will find that some, but actually very few, are corruptions of real surnames. If you then use this to test surnames that appear in England and Wales censuses or church records, you will find a lot of surnames that do not match. You will be able to select those that are worthy of further scrutiny, and with practice, they might be correctable. Expanding this to other parts of the world will be a huge undertaking.
To summarize: There's a whole load of stuff in there that's not valid. You need to be prepared for that.
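A minimal sketch of the check described above, assuming a trusted reference list is already in hand: each surname is tested against the list, and anything that does not match is flagged along with its closest reference spellings for manual review. The `check_surnames` function, the 0.75 similarity cutoff and the tiny sample lists are hypothetical; `difflib.get_close_matches` is from the Python standard library.

```python
import difflib


def check_surnames(candidates, reference, cutoff=0.75):
    """Split candidates into exact matches and suspects with suggestions.

    Returns (valid, suspects), where suspects maps each unmatched name to a
    list of up to three similar reference surnames (possibly empty).
    """
    known = set(reference)
    ref_list = sorted(known)
    valid, suspects = [], {}
    for name in candidates:
        if name in known:
            valid.append(name)
        else:
            suspects[name] = difflib.get_close_matches(
                name, ref_list, n=3, cutoff=cutoff
            )
    return valid, suspects


# Tiny made-up lists, for illustration only.
reference = ["Smith", "Taylor", "Wright"]
candidates = ["Smith", "Smyth", "Tayler", "Xqzv"]
valid, suspects = check_surnames(candidates, reference)
print(valid)
print(suspects)
```

A suggestion is only a prompt for further scrutiny: a flagged name may be a genuine variant rather than a transcription error, and an empty suggestion list does not prove the name is invalid.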
5