Please Allow Upper-Level ASCII Characters in Source Citations, Locations, and Name Fields
When building a source citation in FamilySearch Family Tree (either through Add Source or the Source Box), the program only allows plain text entries. The guidance for making acceptable source citations given at BYU, BYU-I, and in the FamilySearch Wiki refers the researcher to the Chicago Manual of Style and Evidence Explained for proper formatting. Both of those guides require the use of italics. As mentioned above, FamilySearch only allows plain text, so italics are prohibited. Additionally, citations of works from other countries will often contain homoglyphs such as umlauts, etc. The current FS/FT program also prohibits these characters in source citations, although they are allowed in name and place fields. Italics and homoglyphs are contained in the Upper-Level ASCII character set. Please revise the program to allow for italics, underlines, and homoglyphs in source citations.
This is not a new request. I brought it up to Ron Tanner a couple of years ago at one of his Facebook (now YouTube) webinars but to no avail. Concerning upper-level ASCII in citations, Helen Schatvet Ullmann and “GetSatisfaction” brought it up in 2018 on this site. That request languished without comment from FamilySearch and is now in “Legacy.” (Whatever that means.) Concerning upper-level ASCII in name fields, Linda Barney brought it up last August on this site. It generated four comments to date, but nothing pro or con from the developers. It would be nice to have their input so we know we are not just venting here.
Comments
GetSatisfaction was the predecessor to this forum. The "Legacy" status of those threads simply means they are read-only.
We are told that FS Staff read the comments in this forum, but they never respond. If you're seeking a reply to a question or suggestion you should contact FamilySearch Support.
0 -
I use the additional Norwegian characters Æ, Ø, and Å all the time in sources and can easily put other additional letters and diacritics in them as well, such as Ñ, À, Í, Ś, and ß. This is in titles, citations, URLs, notes, wherever they are needed. What specific characters are you having trouble with?
0 -
Thinking more about your question, I'm not sure I'm understanding things correctly. The only definition I can find for "homoglyph" is characters that look the same, at least in certain typefaces, but have different meanings, such as lower case L and upper case I, or zero and upper case letter O, or the Greek, Cyrillic, and Latin A characters. As you use the term, do you actually mean the citations can't use non-Roman character sets? Can you give a specific example of something you can type here but can't type in a citation?
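To make the distinction concrete, here is a small Python sketch (my own illustration, nothing to do with how FamilySearch works internally) that prints the code points and official Unicode names of three look-alike capital letters. The list name `lookalikes` is just something I made up for the example.

```python
import unicodedata

# Three capitals that look the same in many typefaces but are different characters.
# Escapes are used so there is no doubt which character is which.
lookalikes = ["A", "\u0391", "\u0410"]  # Latin, Greek, Cyrillic

for ch in lookalikes:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# A letter with a diacritic is simply its own character, not a look-alike of anything.
u_umlaut = "\u00fc"
print(f"U+{ord(u_umlaut):04X}  {unicodedata.name(u_umlaut)}")
```

The first three are what "homoglyph" normally means; the last one is just an ordinary accented letter.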
0 -
Italics is controlled by font. Simple text type documents tend to use a single font only. You would need HTML formatted text (multi-font) to support that and HTML coding is more complex and takes up more space than simple text. There are reasons to use simple text for some fields. For example, the "Collaboration Notes" only use simple text. That is partly because they can be synced to other platforms and carried via GEDCOM files, most of which do not support multiple-font and other formatting styles for their text. I'm not sure that allowing formatting controls in those areas would be such a good idea. I for one would not want it because it would foul up my syncing of sources between my records and those in FS.
Note that FS uses italics in the index source citations that they create, so there might be a double standard here :-)
However, regarding the character set, if you are talking about what used to be called "extended ASCII", the site supports far more than that as it uses Unicode fonts (over 65,000 characters depending on the font used). In addition to the zero-spaced characters that you mention (umlauts, circumflex, etc.), it also supports things such as ligatures, so if you cannot get at those characters in the notes and descriptions of source citations, then something is wrong.
I just checked and I can insert these special characters into both the Description and Notes fields of a source citation. This part shouldn't be an issue.
But as far as multi-font and format style control, you have to write your citations the same way that everyone had to do years ago when they were using typewriters (e.g., double quotes were used instead of italics, etc.)
:-)
0 -
Echoing Gordon and Jeff, I have no trouble whatsoever typing citations on FS in Hungarian (which uses áéíóöőúüűÁÉÍÓÖŐÚÜŰ along with the 52 alphabetic characters on a standard American keyboard).
It seems that you have conflated or confused several different parts of the text entry and display question. In particular, I'm not sure what you think ASCII means. (From Britannica: the American Standard Code For Information Interchange is a "standard data-transmission code that is used by smaller and less-powerful computers".) FamilySearch's website uses Unicode for character encoding, which currently has more than 140,000 characters. (ASCII is the first 128 of those; even with all of its later extensions, it's a very small fraction of Unicode.)
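If it helps, here is a rough Python sketch of the "first 128" point. The helper names `is_ascii` and `fits_latin1` are my own inventions for illustration, and Latin-1 is used only as one common example of an 8-bit "extended ASCII" set.

```python
def is_ascii(ch: str) -> bool:
    """True if the character is one of the 128 ASCII code points."""
    return ord(ch) < 128


def fits_latin1(ch: str) -> bool:
    """True if the character exists in Latin-1, one of the old 8-bit extensions."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


for ch in "Erdős":
    print(
        ch,
        f"U+{ord(ch):04X}",
        "ASCII" if is_ascii(ch) else "beyond ASCII",
        "in Latin-1" if fits_latin1(ch) else "not in Latin-1",
        ch.encode("utf-8"),
    )
```

The Hungarian ő falls outside both ASCII and Latin-1, yet a Unicode encoding such as UTF-8 handles it without any special treatment.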
Formatting such as italics and bold is completely separate from character encoding. Yes, source citations in Family Tree do not allow for such formatting, but this is because its function of defining/labeling data is done more directly: instead of italicizing text to indicate that it's a title, you write it in a field labeled Title. There are shortcomings in the source-generation processes on FamilySearch, but its basic setup separating text data from function labels is completely in line with current "best practice" beliefs/guidelines.
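As a minimal sketch of that separation, assuming a made-up `Citation` class (this is not FamilySearch's actual data model, and the field values are placeholders): the title is stored as plain text in its own field, and whether it shows up in italics or in quotation marks is decided only when it is displayed.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Citation:
    # Hypothetical fields; each piece of plain text is labeled by what it is.
    author: str
    title: str
    year: int

    def as_plain_text(self) -> str:
        """Typewriter-style rendering: quotation marks stand in for italics."""
        return f'{self.author}, "{self.title}" ({self.year})'

    def as_html(self) -> str:
        """Rich rendering: the display layer italicizes whatever is in the title field."""
        return f"{escape(self.author)}, <i>{escape(self.title)}</i> ({self.year})"


example = Citation(author="Example Author", title="Example Title", year=1900)
print(example.as_plain_text())
print(example.as_html())
```

The stored data is identical in both cases, which is why plain-text fields can sync to GEDCOM or other platforms without carrying formatting codes along.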
0 -
As I've been thinking through the day, I have to wonder if the problem is that FamilySearch doesn't really have any clear instruction on how to access the other character sets that reside in the "upper-level" reaches. I've never tried this before in source citations, but checked today and found that the source citations fields, all of them, accept quite nicely whatever my keyboard can produce. And my Mac provides a multitude of keyboards that the operating system makes very easy to switch between including:
Russian: ершы шы кфтвщь ршеештп еру лунищфкв
Sanskrit: ूपगे गे सदी ीोल्दस ूांू
Urdu: ھعرع یس ا لہت ہف راندہم تعشت
I can use all of these in Sources and they save and display just fine. So this is not a problem FamilySearch needs to solve. It is just learning what your computer can or can't do.
0 -
Here is a screen shot from my Source Box of a source I created.
-1 -
That's right, Gordon. But the Mac has always had a fairly consistent way over the years of handling fonts and extended character sets. On the PC it is a bit different: depending on the OS version you are using, and even the application you are using at the moment, the handling of these things can vary from PC to PC, OS to OS, and app to app.
0 -
Gordon and Jeff are getting into yet another part of the complex text-entry-and-display equation, namely the text entry part: how do you generate a character that isn't on your keyboard? Websites have absolutely no control over this; expecting FamilySearch to do something about it is like expecting your doctor's office to plow the roads so you can make it to your appointment in a snowstorm.
0 -
Another aspect that keeps coming up, but is really not relevant to what I think the question might actually be, is fonts.
The use of specific fonts to augment the available glyphs is outdated and never worked well in the first place: it doesn't matter what dingbat (yes, that's the technical term) the Wingdings font sticks in place of a C (a thumbs-up: 👍︎) if the person viewing your file doesn't have the Wingdings font installed; he's either going to get a different dingbat (such as "index right" 👉︎ in Wingdings2), or a letter C.
Unicode solves the problem by assigning a unique code to unique glyphs, which are described rather than prescribed. For example, the font that my text editor uses has a thumbs-up glyph with the hand pointing the other way (wrist on the left) than what this comment box uses (wrist on the right). Most fonts do not have glyphs for all 140,000+ Unicode characters, so browsers and other programs/apps that display text generally have a hierarchy of places to look for a glyph corresponding to any particular code: main font first, if that fails, then some other font in the same family, if that fails, then a generalist/completist font, and so on, with the eventual option of giving up and displaying an empty box or question mark or whatever.
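The lookup order can be sketched as a toy model in Python. Everything here is hypothetical: the font names and their coverage sets are invented, and real browsers do this with far more machinery.

```python
# Toy "fonts": each is paired with the set of characters it has glyphs for (made-up coverage).
FONT_CHAIN = [
    ("MainFont", set("ABCabc")),
    ("FamilyFallback", set("ABCabc") | {"\u0151"}),                 # adds ő
    ("CompletistFont", set("ABCabc") | {"\u0151", "\U0001F44D"}),   # adds thumbs-up
]


def pick_font(ch, chain=FONT_CHAIN):
    """Return the name of the first font in the chain that covers ch, or None."""
    for name, coverage in chain:
        if ch in coverage:
            return name
    return None


# The last test character (U+0950) is deliberately covered by none of the toy fonts.
for ch in ["a", "\u0151", "\U0001F44D", "\u0950"]:
    print(f"U+{ord(ch):04X}", "->", pick_font(ch) or "no glyph: show an empty box")
```

The character code stays the same at every step; only the font that ends up drawing it changes.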
Websites have some control over the font that text displays in. Mostly they do this by choosing popular fonts that nearly everyone has installed, because this makes it less likely that someone's browser will need to substitute a different font. The popular fonts all have fairly thorough coverage of the alphabetic parts of Unicode, but of course there's variation in what's popular in China or Japan versus the U.S. or Russia.
While character encoding and font are closely intertwined, they're not the same thing. Character encoding is the scheme by which computer codes designate glyphs. Unicode is the most thorough and complete of these, but there are still websites and programs out there (such as Legacy, the [inexplicably, to me] popular family tree application) that do not support it. Incompatible or mismatched character encoding causes gibberish displays (called "mojibake", based on a Japanese term: https://en.wikipedia.org/wiki/Mojibake).
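For anyone curious what a mismatch looks like, here is a short Python sketch. The surname is just a sample, and Latin-1 is one example of an older encoding a non-Unicode program might assume.

```python
original = "Müller"

# Text is saved as UTF-8 bytes, but a program wrongly reads those bytes as Latin-1.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # MÃ¼ller

# Reversing the wrong step recovers the text, as long as nothing was lost along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # Müller
```

The two-byte UTF-8 form of ü gets shown as two separate Latin-1 characters, which is the classic mojibake pattern.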
0