Is it possible to make proportional oldstyle figures the default in an OpenType font?
And is there a way to make that work in «non-design» software (for example, MS Word)?
Thank you very much guys.
Juan Pablo del Peral
It is. Just have a look at Microsoft's ClearType fonts, which ship with Vista. :)
You can do whatever you want. The guidelines mention that tabular lining figures are typically the default, but that is not assumed. If proportional oldstyle figures make sense as the default for your font, go ahead and do it.
The way to do it is to put the oldstyle proportional glyphs in the normal zero–nine slots, and put the others in alternate slots. Then write your feature code for onum, tnum, lnum, and pnum to select the proper glyphs accordingly. (There is no "default number style" feature tag; when none of the four number-style features is active, the layout application simply uses whatever you have put in the zero–nine slots.)
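A minimal sketch of what that feature code might look like, in AFDKO feature-file syntax, assuming the oldstyle proportional figures occupy the default zero–nine slots and using hypothetical .LP/.LT/.OT suffixes for the lining-proportional, lining-tabular, and oldstyle-tabular variants (only four figures shown for brevity):

```fea
# Sketch only: defaults (zero-nine) are oldstyle proportional here.
@FIG    = [zero    one    two    three];     # ...and so on through nine
@FIG_LP = [zero.LP one.LP two.LP three.LP];  # lining proportional
@FIG_LT = [zero.LT one.LT two.LT three.LT];  # lining tabular
@FIG_OT = [zero.OT one.OT two.OT three.OT];  # oldstyle tabular

feature lnum {               # Lining Figures
    sub @FIG    by @FIG_LP;
    sub @FIG_OT by @FIG_LT;
} lnum;

feature onum {               # Oldstyle Figures
    sub @FIG_LP by @FIG;
    sub @FIG_LT by @FIG_OT;
} onum;

feature tnum {               # Tabular Figures
    sub @FIG    by @FIG_OT;
    sub @FIG_LP by @FIG_LT;
} tnum;

feature pnum {               # Proportional Figures
    sub @FIG_OT by @FIG;
    sub @FIG_LT by @FIG_LP;
} pnum;
```

The paired rules in each feature handle the interaction between style and spacing selection (e.g. lnum followed by tnum first maps defaults to .LP, then .LP to .LT), since lookups apply in the order they are defined.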
Karsten, all of the ClearType fonts except Consolas were originally designed to have proportional oldstyle numerals as defaults. But when Office decided that they wanted Calibri and Cambria to be new default fonts in Word, they requested that we change the encoding so that the lining tabular numerals became default. [This, by the way, messed up some of the superior numeral lookups and other things, currently being fixed.]
Thank you very much!
I still have a question. What would the naming of those glyphs be? I mean: for each character you have four glyphs. For example, for 1 you have one, oneoldstyle, unif644 and unif6dc. (At least that's how it is in Caslon Pro.)
If I use the one slot for the oneoldstyle glyph, how should I use the other slots?
Thank you again and sorry about this «english».
Ignore those names in Caslon Pro. Adobe isn't using that approach any more.
All the numeral glyph names should be parseable back to the default numerals. Other than that, you can name variants with whatever suffixes make sense to you. I use a two-letter suffix for numerals, one for style and one for spacing:
one (default numeral)
one.OP (oldstyle proportional)
one.OT (oldstyle tabular)
one.LP (lining proportional)
one.LT (lining tabular)
one.SP (smallcap-lining proportional)
one.ST (smallcap-lining tabular)
Adam Twardoch favours using OpenType feature tags as suffixes, something like:
one (default numeral)
one.onumpnum (oldstyle proportional)
one.onumtnum (oldstyle tabular)
one.lnumpnum (lining proportional)
one.lnumtnum (lining tabular)
one.smcppnum (smallcap-lining proportional)
one.smcptnum (smallcap-lining tabular)
Thank you very much for the information. I was very confused but now I understand all.
See you guys and thankx again!
"...but now I understand all."
Cool! So you know that we're all working in a box where every side presents bad options, as do the top and bottom.
That is to say, OSes and apps are slow in their acceptance of OpenType, so founders, and not users, are the only ones capable of setting the defaults for case.
That is to say that unicode treats lowercase letters like one of the family, but lowercase numbers, like an orphan. (everybody! make up names!)
That is to say, the tools for making fonts are held back by limits on the lengths of strings, making John's names for things as short as possible.
That is to say, what Adam's amazingly repetitive naming is about is the inability to organize properly, as his names are neither short (like John's, for ease on the designer) nor to the point (such that a user could read them).
Isn't it cool how few smart, purely logical and user-friendly options we have?
Or capable of setting a number of things, all the proper domain of users, not founders. I would particularly like to see the user be able to add to the exception kerning list while leaving the basic class kerning untouched, and save this in the font, not in some external program that always has to be read in.
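For reference, the distinction being drawn here, in AFDKO feature-file terms, is between class kerning and exception pairs; a user-added exception would simply be another specific pair that overrides the class value. A sketch with illustrative glyph classes and values:

```fea
feature kern {
    @ROUND_LEFT = [O C G Q];          # hypothetical left-side class
    @A_FORMS    = [A Aacute Agrave];  # hypothetical right-side class

    # Exception pairs (listed first, so they take precedence):
    pos Q A -45;                      # a specific refinement for one pair

    # Basic class kerning, as shipped by the foundry:
    pos @ROUND_LEFT @A_FORMS -30;     # illustrative value
} kern;
```

The proposal above amounts to letting the user append rows to the exception-pair section while leaving the class rules untouched.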
By and large, I agree with all you've said (leaving Adam's naming system out of it). Perhaps in OpenType, version 2?
"...OpenType, version 2"
...if there is a Google term for a search that finds only 3 results, you may accurately use it. :-)
I don't think there are technical limitations precluding handling kerning in that way. But since very few font vendors' licenses would allow such modification of the original font, I don't see that such a proposal is likely to be met with much excitement from application creators.
I'm well aware that Adobe is one of the few foundries that does allow user-modified kerning, and more. The "more" issues can wait, but I feel compelled to point out that the user can modify the kerning, one instance at a time, in most any layout program making any pretensions to "setting type."
The point of having kerning in the font is so the compositors do not have to repeat their work every time a certain letter/symbol sequence occurs, for the bulk of material to be set (i.e., text).
I have been kerning type for over 20 years, and still cannot get things right on my first attempt. I use "first attempt" to cover several passes and laser printouts. I find I have to see the type as it appears in the final printed form, more than once, and revisit the kerning several times after seeing ink on paper.
While some may argue that I am just inordinately slow in arriving at passable kerning, I disagree. Moreover, I rather doubt that any foundry can get the kerning right before releasing a font. Nor would most of us typesetters expect that. In the photocomp days, it was expected that the type shop would make up their own kerning tables. One of the many things I admire about Matthew Carter is that he both understands and acknowledges the difference between type designers and type users.
Since the typesetter can change the kerning one instance at a time, and can make up any accented character out of the several glyphs in the font, it makes no sense to prohibit them from making these changes permanently, in the font itself.
I agree font licenses are necessary. But in their present constitution, they make the product virtually unusable for a significant segment of type users. I don't have all that much influence in my sector of the market, but when asked, I always recommend against a designer using a font from a foundry where the license prohibits a use we all know is likely to occur.
This is a forum frequented more by type designers than by type users. But consider this: with rare exceptions, we do not frame & hang a font specimen sheet in a museum. If the licenses stray too far from the conventions of using type, both the type designers and the type users are in trouble.
Note: I made an edit; this post may have originally been made before Chris's post.
I can't imagine a typeface vendor wanting to deal with tech support issues on any altered font even if the EULA allowed modification.
I can certainly understand the need for more cooperation on support of fonts and OpenType features among OS developers, application developers, and typeface developers. Naming conventions and menu hierarchies could be agreed upon by all parties and followed as a standard. The four-font linked-family maximum is archaic at this point in time. Lack of support for OpenType features is unacceptable by now as well. Font developers should not need to invent workarounds to get fonts to function as expected in non-compliant applications. And kerning should not be buried where users need to dig for the option to turn it on.
I would particularly like to see the user be able to add to the exception kerning list while leaving the basic class kerning untouched, and save this in the font, not in some external program that always has to be read in.
I think that an external file is the best way to deal with this. It clearly separates the data as provided by the originator of the file from the modifications.
What would be really great is something like Adobe's SING glyphlet. This one, however, would not hold outlines, but GPOS/GSUB tables with features, to be used instead of same-named features present in the font, or as additional features if not present in the font. It would also hold some metadata taken from the font it refers to, especially names and version, since the glyph set, and thus the GIDs to which OT layout tables refer, may change as fonts get updated.
The advantage: It is not bound to any particular application. The real challenge would be to establish something like a standard which then would be adopted by OS and application developers. Hello Thomas!
Can Glyphlets provide substitutes for this or that glyph of a specific OT font (rather than additional ones)? This would make it much easier for designers to do adjustments to fonts for individual projects -- without need to mess around with fonts.
Maybe a Glyphlet's metadata could even name a Stylistic Set feature by which one Glyphlet file's glyphs are applied. Or allow specifying an alternative font name, so font + Glyphlet's alternate glyphs are regarded and presented as a separate font/style in the font/style menus.
Leaving the EULA matters aside, eh, if only it were as easy as it was with Type 1, when you could modify/extend the kerning via the AFM, which took precedence over the PFM data... The beauty of obsolete font formats!
But using an external file for adding such things to OTF brings additional mess when installing fonts. You have to keep the set of related files together, which kills the advantage of the single-file OT...
'Mess' seems too harsh to me. If the additional file is named like the original font, plus an extra suffix, both would be sorted next to each other. This wasn't the case with many T1 fonts whose PS file and suitcase often had very different file names. So, only advantages. :D
Not to forget -- an additional file is better than setting up kerning tables in a layout application (XPress), which are then bound to that application; and if you migrate from one computer to another, you had better not forget the corresponding files... Search this most interesting discussion for the phrase "I had a several hundred XPress kerning files before I switched to ID" (near the end, and don't miss the following posts).
The existence of two files makes it very clear that there is the original font (Don't touch! EULA!) and there are refinements and additions which the typographer thinks are necessary. It's not just two files. It's a message.
[Later addition: Like Glyphlets can be embedded in a document, 'Featurelets' might be too.]
The real challenge would be to establish something like a standard which then would be adopted by OS and application developers.
Yes. And especially since if the type designers included something like this with their fonts (as with .AFMs), they'd likely include it in the EULA.
The existence of two files makes it very clear that there is the original font (Don't touch! EULA!) and there are refinements and additions which the typographer thinks are necessary.
If the licensing agreement defeats the purpose of a font, it is the source problem. Font developers/foundries need to be protected from theft, be that stealing a font for resale under another name, or simply not purchasing enough copies for the computers being used.
I believe that the "no modifications" rule was aimed at preventing the first; that is, just doing a little work on the font & then reselling it. But that is absurd; just ban the reselling of the font, or of any *derivatives* of it (if that's the right word).
There is also the question of damaging the font maker's reputation by some truly awful kerning or glyph modification. But these can be done in the applications program, too. Sadly, anybody can destroy the look of a font right now, without breaking the EULA.
I look at the kerning that comes with a font as an extension of getting the sidebearings right. It isn't the final solution, it is just a nice starting point for good character fit. A typesetter can use it. But a compositor (who is also a typographer) can take over and add what makes the fonts work for him/her.
I should clarify that I am all for users being able to modify kerning of their fonts - I just don't think doing it in the original font is practical overall. I agree with Karsten and others that this sort of thing is best handled in an outboard file of some sort.
David: That is to say that unicode treats lowercase letters like one of the family, but lowercase numbers, like an orphan. (everybody! make up names!)
There's an unfortunate tendency to chastise the people responsible for character encoding standards for not understanding how typographers look at text, when that is not their task or responsibility. One might as well chastise typographers for not understanding how encoded character sets work.
Unicode is a plain text encoding, which means that -- with idiosyncrasies and exceptions to accommodate conflicting principles of inclusion, e.g. backwards compatibility with pre-existing standards, which is in part responsible for the proliferation of script-specific numerals (European, Arabic, Persian, Devanagari, etc.) -- Unicode seeks to encode whatever is necessary to preserve the semantic content of text, not everything necessary to its display. The display of text is something that happens, as they say, at a higher level, for which the text encoding provides a foundation as one of the many functions it must serve. That is, it is a mistake to assume, just because we are typographers, that providing for the correct typographic display of text is the principal, or even one of the most important, purposes of a character encoding standard.
Now, the distinction of case in our bicameral orthographies frequently carries a semantic distinction, for instance in the capitalisation of proper nouns in English and, more dramatically, in the capitalisation of all nouns in German. That is semantic information that should not be lost in a plain text encoding; ergo, the upper and lowercase letters are separately encoded even though in many other ways they are functionally identical, notably in representing the same phonemes (from which it should be obvious that case distinction is more semantically meaningful for displaying text than it is for voice synthesis). We can summarise this by saying that Latin alphabetic letters are orthographically bicameral.
Numerals, on the other hand, are not only not orthographically bicameral, they are not even alphabetic. All numerals are ideographic: they do not represent sounds, they represent quantitative and combinatory concepts. There is never any semantic distinction between a cap-lining numeral 2 and an 'oldstyle' numeral 2: they always represent the same semantic content, and ergo there is no need to preserve a merely visual distinction between them in plain text. There are many good reasons why one would want to preserve a visual distinction in various kinds of typeset text, but that is quite rightly considered a display issue, not an encoding issue.
So the questions regarding how font developers, application developers and users might conspire to select appropriate styles of numerals for specific documents, and for specific uses within those documents, are all questions of layout feature mapping and user interface. The situation, for developers and users, is far from perfect -- see the complaints I have made in the past regarding Adobe's OT numerals style selection model, which precludes having anything other than oldstyle or lining numerals as the default style (no three-quarter height or hybrid numerals, for example) because there is no way to access the proportional or tabular spacing features independently of style selection. But no part of this problem is due to Unicode's sensible encoding of a single set of European numerals on the basis of their semantic content.
We've had this discussion here before, at which point I argued that it was presumptive, and problematic to typography for Unicode to claim the semantic distinction of uppercase/lowercase casing as a quality of encoding, rather than leaving it to layout.
that is not their ... responsibility.
On the contrary, in as much as their actions impact upon typography, they have a responsibility towards it.
Nick, you were wrong then and you are wrong now. Once again: case distinction has a semantic value in bicameral orthographies ergo it needs to be maintained at the plain text level. This isn't a negotiable position or an opinion, it is a necessary feature of a text encoding that sets out to do what Unicode sets out to do. And I just spent several paragraphs explaining why the decision to encode a single set of European numerals based on their semantic value and not on their appearance does not have an impact on typography: what has an impact on typography is what is built on top of that semantic foundation, which is display, which is not plain text, which is text layout services, which is fonts, and which is ultimately a user interface issue.
And again: it is a mistake to assume, just because we are typographers, that providing for the correct typographic display of text is the principal or even one of the most important purposes of a character encoding standard. The majority of what gets done with the majority of digitally encoded text involves operations in invisibility, i.e. the text isn't displayed at all. How much of the billions of 'pages' of text indexed by Google ever gets displayed? How often? To how many people? That's just one example of something that requires immense material and processing resources that is completely text-centric and yet only comparatively rarely involves display when, or if, someone wants to search for and read a particular piece of text. And that processing is a heck of a lot easier if one doesn't have dozens of encoded variants with the same semantic value. So if you want to talk about things on which Unicode has a direct impact, you might consider all the things that people do with text in addition to displaying it.
that processing is a heck of a lot easier if one doesn’t have dozens of encoded variants with the same semantic value.
Google searches ignore the semantic value of casing. Doesn't this example support non-bicameral encoding as being more efficient?
My glyphname suggestions have a very simple background. The OpenType specification offers a well-documented list of four-letter feature tags:
In this documentation, the user will be able to find out that "sups" stands for Superscript, "smcp" for Small Caps, "swsh" for Swash and "ss02" for Stylistic Set 2. The user will be able to read the concise descriptions of what those features are all about.
Since this stuff is and will continue to be widely documented, why not use those four-letter tags as glyphname suffixes as well? This way, it will be clear that "a.smcp" stands for an "smcp" i.e. small-cap variant of "a", and "ampersand.ss02" stands for a "ss02" i.e. stylistic set 2 variant of "ampersand", that "copyright.sups" is the superscript version of the copyright sign and "one.onum_tnum" is the oldstyle tabular variant of "one".
Moreover, smart software should be then able to automatically realize "Aha! The designer has put the ampersand.ss02 glyph in the font so I should build the 'ss02' feature code that will map ampersand to ampersand.ss02".
This way, the glyphnames can become building recipes for OpenType Layout feature code -- no manual coding required.
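For example, software following that convention could emit feature code like the following from the glyph name alone (a sketch in AFDKO feature-file syntax):

```fea
# Derived automatically from the presence of 'ampersand.ss02' in the font:
feature ss02 {   # Stylistic Set 2
    sub ampersand by ampersand.ss02;
} ss02;
```

Every glyph carrying the same suffix would contribute one substitution rule to the corresponding feature.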
Nick: Doesn’t this example support non-bicameral encoding as being more efficient?
To clarify: there are two contentions here. David is implying that Unicode should have encoded 'lowercase numerals', and you are implying that Unicode should not have encoded the Latin script as bicameral at all.
I cited things like indexing and searching as reasons why it is not a good idea to encode multiple variants of things with the same semantics, i.e. why encoding 'lowercase numerals' is not a good idea, because there is no semantic distinction between a lining 5 and an oldstyle 5 (semantic distinction in the context of character encoding meaning specifically semantic distinction that needs to be preserved at the plain text level; obviously there are higher levels of semantic or quasi-semantic significance that might be made in particular use of lining or oldstyle numerals as a function of typographic articulation).
I had already pointed to the fact that upper- and lowercase letters do have orthographically determined semantic differences that need to be preserved in plain text, and ergo need to be separately encoded. Since this need precedes any consideration of things like index and search efficiency, the Google example does not have any significance for the character encoding model.
Adam: Moreover, smart software should be then able to automatically realize “Aha! The designer has put the ampersand.ss02 glyph in the font so I should build the ’ss02’ feature code that will map ampersand to ampersand.ss02”.
This way, the glyphnames can become building recipes for OpenType Layout feature code — no manual coding required.
Yes, although the same functionality could be achieved with any other suffix pattern: all you would need is a mapping of naming conventions to features.
Sergey has been considering adding a similar kind of functionality to VOLT, only in this case as a means of allowing the use of wildcards in lookup definitions. So, for instance, one could code the entire 'smcp' feature with a single line:
* -> *.smcp
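For comparison, AFDKO feature-file syntax has no such wildcard; the closest equivalents are explicit per-glyph rules, or a class rule whose classes are generated by the compiler or an external script:

```fea
feature smcp {
    # Written out per glyph...
    sub a by a.smcp;
    sub b by b.smcp;
    # ...or as a single class-to-class rule, with the classes
    # (hypothetical names) built elsewhere:
    # sub @LOWERCASE by @LOWERCASE_SMCP;
} smcp;
```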
you are implying that Unicode should not have encoded the Latin script as bicameral at all.
Perhaps there could have been a way of structuring Unicode to better accommodate the semantic structure of typography.
Unicode is a plain text encoding, so its structure is defined by the needs of a plain text encoding (along with certain adjustments for backwards compatibility). There is sometimes more than one way to skin the plain text cat, but I'm not sure any of them 'accommodates the semantic structure of typography' better than another. Why? Because the semantic structure of typography is hugely variable, much of typography isn't semantic at all but aesthetic, and it isn't easy to disentangle the semantic and the aesthetic in typography. Can you give me an example, for instance, of a situation in which the distinction between lining numerals and oldstyle numerals is properly semantic, i.e. where a different meaning is implied?

Your phrase 'semantic structure' suggests something like a grammar, but while there are multiple articulatory and organisational conventions observed within typography, I'm not sure that there is anything that functions quite like a grammar. Further, I don't think what is typically signaled in those conventions is semantics, properly understood. For example, consider the use of a different size and style or weight of type to signal a subhead in a document. This is an organisational convention, specific to the document but also part of a broader set of conventions for signaling subheads that has developed within a particular tradition of publication design. [Note also that the concept of a subhead is not typographic, it is editorial; it is only signaled typographically.]

The typographic treatment of the subhead helps to visually articulate the structure of the document. It tells the reader that this piece of text is a different kind of information. And this is an important job. But it doesn't actually affect the meaning of the text, so I'm not inclined to call it semantics. The idea of articulation is, once again, helpful. When we speak, how we speak helps a listener to understand our meaning, but this is extra-semantic in a linguistic sense.
A tone of voice might indicate that we are being ironic, and that what we mean is actually the opposite of the words we are saying; but that irony itself is resting on top of the actual semantics of those words and their organisation: the semantics belong to the words, not to how we say them. We can play off the semantics verbally or visually, in speech or typography, but only because the semantics are independent of that play.
John: "There’s an unfortunate tendency to chastise the people responsible for character encoding standards for not understanding how typographers look at text"
Oh? Thinking only as a user, is why I added to this thread. Normal users lack the tendency to chastise the people responsible for character encoding standards, only because they have typographers instead.
"How much of the billions of ’pages’ of text indexed by Google ever gets displayed?"
"All of it that 'can get' displayed, will be displayed with you in mind," is the proper answer from a typographer to a user. Thinking only as a user, if text remains invisible, it doesn't matter what it looks like, unless it effects my credit rating.
Adam:"...why not use those four-letter tags as glyphname suffixes as well?"
Well, do you see all kinds of goofy abbreviations in the menus of applications programs?
Remember....TNRI09. Before I was even a user thought I, 'they' have to fix this soon. They didn't fix it soon, but rather, eventually. And, slowly the evil ninnyhood of that abbreviation system, dies away.
Thinking only as a user, I added my voice of concern to the four-letter names for features being presented to users, and that seems to be, by general consent, on the way out the door, in favor of names users can deal with, right away, straight from founder to user, no translation, like font names...
Thinking only as a user, I don't want abbreviations for font names, feature names, or glyph names, that only connect mentally back to somebody's production solution, showing up on my screen.
So, no matter how Unicode does "what Unicode set out to do", it has encoded uppercase and lowercase alphabets, not upper and lowercase figures, and driven a wedge into typography that confounds users enough — Let's not punish them more by calling them funny names.
The different typographic ways that the same letter may be displayed comprise a semantic system which is used by type designers, font developers, application engineers, and graphic designers both professional and amateur.
The semantics of type are not the same as the semantics of "plain text" as it is understood by linguists and programmers, who are more at home, and in command of the digital system, than typographers.
As stakeholders, typographers and type designers haven't involved themselves much, understandably, because few have the necessary skill sets.
The way things have been done in the digital era does have a certain efficiency, but by blowing off the messy business of typographic semantics into the realm of display and "aesthetics", it has created huge difficulties for the type industry. Here are some examples of the way that linguists and programmers have advanced typography, but failed to meet the more discriminating needs of typographers, privileging their own conception of what is necessary in text--
1) ASCII encoding: plenty of mathematical symbols, but no "curly quotes" or fi and fl ligatures.
2) The 4-member font family: this defines a bicameral concept of typographic weight that continues to plague foundries developing multi-weight families for cross-platform use. It restricts typographic design in the broad arena: a contemporary sans serif like Verdana should have had Thin, Light, and Extra Bold weights. With an incremental range of weights, modernist layout is possible, but this has been severely discouraged on the internet. Even today, the ClearType fonts offer users (both typographers and readers) bicameral weighting, not range weighting. This is one hundred years after the introduction of multi-weight type families, and fifty years after the famous Univers chart, and after 20 years of making digital typography, Microsoft still hasn't caught up and offered users this basic feature of modern typography.
3) Separate Unicode numbers for upper and lower case letters. By coding both, things have been made extremely complicated for font and application developers, and typographers. Small capitals, unicase, figure styles, superiors and inferiors--all these typographically meaningful features of sophisticated typography are awkward to twist one's head around when there are separate encodings for upper and lower case letters. The result is different implementations between different layout applications, and from different foundries, which doesn't help advancing the cause.
4) Looking to the future, I suspect that any standards that are adopted for downloadable web fonts will have similar difficulties.
"4) Looking to the future, I suspect that any standards that are adopted for downloadable web fonts will have similar difficulties."
Do, have. Will is the future tense. :)
Standards always seem to be the carrot on the string, being chased by one and all but never attainable.
David: “All of it that ’can get’ displayed, will be displayed with you in mind,” is the proper answer from a typographer to a user.
I completely agree. But the business of providing that display is necessarily something that has to happen above the plain text level, i.e. display should be kept as much out of the character encoding as possible (but not any further out). And I'm not just saying that for the benefit of character encoding and text processing: I'm saying it for the sake of display too. It's a church and state matter: the separation is intended to be mutually beneficial, not to protect one from the other.
So, no matter how Unicode does “what Unicode set out to do”, it has encoded uppercase and lowercase alphabets, not upper and lowercase figures, and driven a wedge into typography that confounds users enough
While I agree with the analysis of 'oldstyle' numerals as being, structurally, akin to lowercase letters, I don't recall ever having seen them referred to as 'lowercase numerals' in any book on typography. I don't think this is a wedge that Unicode has delivered (in any case, the unitary encoding of numerals predates Unicode by a long time); I think different numeral styles have been poorly analysed for a long time, and poorly analysed by most typographers.
But it seems very obvious to me that there are two perfectly legitimate ways in which to analyse numeral styles, one in which their visual distinctiveness is considered significant, and one in which their semantic lack of distinctiveness is considered significant. Not surprisingly, in the context of plain text encoding, the lack of semantic distinction is the more significant feature, because plain text encoding is concerned with semantics.
Nick: The different typographic ways that the same letter may be displayed comprise a semantic system which is used by type designers, font developers, application engineers, and graphic designers both professional and amateur.
As I wrote above: 'Can you give me an example, for instance, of a situation in which the distinction between lining numerals and oldstyle numerals is properly semantic, i.e. where a different meaning is implied?'
As I tried to explain, the kind of 'semantics' that can be expressed in typographic variation are semantics of information type, as in my subhead example. As I also tried to explain, these do not constitute 'a semantic system', because they are too numerous and variable. One can say that there are organisational and articulatory conventions, which may be systematisable in a variety of ways producing a multiplicity of systems: such is the great richness of typography in its many historical and local expressions. But the semantic foundation of all typography is linguistic semantics, the semantics of words and grammar that constitute text. So it should neither surprise nor dismay us that the foundation of computer typography -- in which for the first time the semantic content of text and the visual display of text are so closely linked -- is likewise the linguistic semantics, i.e. plain text encoding.
What should surprise and dismay us is that the controls for manipulating the typographic display of text are still so crude, and that anyone thinks these limitations can or should be addressed at the character encoding level.
I don’t recall ever having seen them referred to as ’lowercase numerals’ in any book on typography.
Actually, yes, by Paul Rand.
But this has more to do with when to use them than to suggest they are semantically different.
the semantic foundation of all typography is linguistic semantics
John, you've explained very well, many times, that your concept of semantics is linguistically centred.
these [typographic variations] do not constitute ’a semantic system’, because they are too numerous and variable.
Typography is not a semantic system? That doesn't seem right. The various parts work as a whole, and have meaning in relation to one another. They are, to a large and growing extent, independent of language and script.
What should surprise and dismay us is that ... anyone thinks these limitations can or should be addressed at the character encoding level.
One would indeed be foolish to imagine that Unicode could be fundamentally modified at this point in time, but one shouldn't be deterred from contemplating its shortcomings.
Can you give me an example, for instance, of a situation in which the distinction between lining numerals and oldstyle numerals is properly semantic, i.e. where a different meaning is implied?
Jonathan Hoefler designs three-quarter lining figures as the default for a news text face (Mercury) because he considers old-style figures to be too "twee". He makes this meaning quite clear to prospective users by including it in his foundry's description of the typeface.
Semanticity isn't black and white. Everything about every expression of language probably has some semanticity to it, if only for the fact of not having chosen alternative possibilities. But I think there's a sense in which it's fair to say that the difference between "a" and "b" has more semanticity than the difference between "a" and "A", which has more semanticity than the difference between "a" and "a", which in turn has more semanticity than the difference between a Fournier "a" and a Garamond "a". The Unicode people have chosen to draw the line after the first two in the case of bicameral alphabets. One can debate whether that was the right place to draw the line - a strong case could be made that capitalization is a stylistic feature whose semanticity is additive rather than fundamental to the text - but obviously a line has to be drawn somewhere for practical purposes.
It does make things look more frustrating, though, when the standard isn't internally consistent in applying its own rules. A lot of Unicode's shortcomings exist because it really has two different, partially incompatible, purposes. It isn't just a plaintext encoding system for all languages; it's also intended to unify all the pre-existing encoding standards (and be congruent with ISO 10646). As a result, lots of things are in it that shouldn't be according to its own criteria, because somebody had encoded them already - like the fi and fl ligatures. I suppose the prevailing attitude of trying to do the least damage by prioritizing backwards compatibility above adherence to rules is sensible, but it does confuse people who think their ligatures - or whatever other presentational form that didn't make the cutoff - ought to be encoded too.
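The fi and fl ligatures are a concrete case: Unicode carries them only as compatibility characters, and normalization folds them back to their plain-letter semantics. A quick check with Python's standard unicodedata module:

```python
import unicodedata

fi = "\ufb01"  # U+FB01 LATIN SMALL LIGATURE FI, a compatibility character

print(unicodedata.name(fi))               # LATIN SMALL LIGATURE FI
print(unicodedata.decomposition(fi))      # <compat> 0066 0069
print(unicodedata.normalize("NFKC", fi))  # "fi" -- folds back to plain f + i
```

So the standard itself declares the ligature to be nothing more than a presentational variant of "f" + "i", while still having to encode it for backwards compatibility.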
it does confuse people who think their ligatures - or whatever other presentational form that didn’t make the cutoff - ought to be encoded too.
Even if you're not confused, it's difficult to work with.
Here is the OpenType panel in TextEdit, IMO one of the most intelligent OT interfaces.
Even here we are left with the mysterious "No Change"--an admission that the application doesn't know what case the font's default figures are.
Note the clear presentation of typographic semantics; if nothing else, the designer has been forced to confront the issue of what exactly is the relationship between old-style and lining figures, and has decided that it is casing, which seems to make a lot of sense. Sure, we always said, "old style figures are like lower case, with ascenders and descenders," but that didn't mean we conceived of lining figures as upper case. The impediment was type technologies which more often than not had lower case letters married to lining numbers, as default. And in the early days of DTP, with SC&osf, it was even stranger, as if old-style figures were primarily part of the small cap case. So we are making progress towards a more "proper" semantics of typography.
Of course, there are ways other than character encoding to assign case to numerals.
"Can glyphlets provide *substitutes* for this or that glyph of a specific OT font (rather than additional ones)?"
Yes, they certainly can.
"I believe that the 'no modifications' rule was aimed at preventing the first; that is, just doing a little work on the font & then reselling it."
I don't think that is entirely accurate. I think the most common two reasons for the "no modifications" rule are:
1) Tech support concerns, not wanting to deal with supporting modified fonts. (Adobe deals with this by telling people they can make mods, but they're on their own for tech support once they do so. Kind of like the computer hack that voids your warranty.)
2) Simple possessiveness around the design, not wanting it messed up by some bozo who doesn't know what they're doing. Yes, people can mess it up just in their regular applications, but they can do even more damage with a font editor. :/
Nick: John, you’ve explained very well, many times, that your concept of semantics is linguistically centred.
Semantics is a subcategory of linguistics, and almost all other uses of the term are derived from this and essentially metaphorical, in the same way that many uses of the word 'language' are metaphorical, e.g. 'language of painting', 'language of music', etc. Metaphors are useful and important things, but one shouldn't make the mistake of thinking that because a metaphor works on one level that it works on all levels or, to put it another way, that because something is like language in some respects that it is like language in all respects and actually functions like language.
Jonathan Hoefler designs three-quarter lining figures as the default for a news text face (Mercury) because he considers old-style figures to be too “twee”. He makes this meaning quite clear to prospective users by including it in his foundry’s description of the typeface.
I asked for an example of an oldstyle numeral meaning something other than a lining numeral; that is, meaning something other than e.g. five. That is semantics. Thinking that a particular style of numerals is too 'twee' for a particular use is not semantics. That isn't a question of meaning but of association and judgement, which may be conventional, subcultural or idiosyncratic as the case may be. If I were to disagree with Jonathan's judgement of the appropriateness of oldstyle numerals in Mercury, would that imply anything at all about the meaning of the numerals? Would 5 suddenly stop meaning five? Would 5+5 equal something other than 10? Of course it wouldn't.
Further, even if one applies a metaphorical use of 'semantics' to include this kind of associative judgement, i.e. if one calls this a kind of 'meaning', there is still nothing there that would be a proper subject for a plain text character encoding, since that kind of 'meaning' is not the kind of meaning that needs to be preserved in plain text; indeed, it is the kind of 'meaning' that presumes styled text. If one were to determine that there should be an encoding that preserves all the possibly associative, emotional, or aesthetic aspects of typography, then the distinction between plain text and styled text would be pointless.
Here is the OpenType panel in TextEdit, IMO one of the most intelligent OT interfaces...
Thank you, Nick, for demonstrating the point I have been trying to make: this is a user interface issue, not an encoding issue. Here's a better radio button interface:
In this case, it doesn't matter a rat's arse what numerals are the default encoded glyphs in the font cmap, because the numeral style and spacing are always controlled by the user. The appropriate OTL features are active according to the user settings, and these can be set for all new documents.
*Lining can also mean smallcap lining if such variants are available in the font, since the numeral features can interact with the smallcap features. You could even have petite-cap lining numerals if you want, and even oldstyle numerals ranging around the smallcap or petite-cap heights.
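The two-axis model behind such a radio-button interface is simple enough to sketch: style and spacing are independent choices, each mapping to one of the four registered OpenType feature tags. The function name and its argument values below are hypothetical; only the four tags themselves are real registered features:

```python
def numeral_features(style, spacing):
    """Map the two radio-button axes of a numeral UI to OpenType
    feature tags. 'style' is 'oldstyle' or 'lining'; 'spacing' is
    'proportional' or 'tabular'. Returns the tags the layout engine
    should activate, regardless of what the font's defaults are."""
    style_tag = {"oldstyle": "onum", "lining": "lnum"}[style]
    spacing_tag = {"proportional": "pnum", "tabular": "tnum"}[spacing]
    return [style_tag, spacing_tag]

print(numeral_features("oldstyle", "tabular"))  # ['onum', 'tnum']
```

Because every combination of settings activates an explicit pair of features, the application never needs to know which style the font encodes by default.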
Semantics is a subcategory of linguistics, and almost all other uses of the term are derived from this and essentially metaphorical
It's true that typography is almost meaningless without text, but it doesn't follow that its system of meaning is a metaphor for language. Both sides of the brain are used in reading; the side concerned with language is somewhat better organized--or perhaps just more conducive to formal exegesis.
If I were to disagree with Jonathan’s judgement of the appropriateness of oldstyle numerals in Mercury, would that imply anything at all about the meaning of the numerals? Would 5 suddenly stop meaning five?
What the distinction between old-style and lining numerals means to typographers has no bearing on the meaning of 5. But it is meaning nonetheless, and important to font users, as opposed to readers.
that kind of ’meaning’ is not the kind of meaning that needs to be preserved in plain text;
Right, that was agreed earlier in the thread. And taking a harder line one may ask, does capitalization belong in plain text? A hypothetical question, as it's well ensconced by now, but worth asking to remind oneself that what's most logical from a linguistic perspective doesn't always make things easy for typography.
Here’s a better radio button interface
This is actually what it looks like when a font is loaded:
[X] No change
[X] No change
Because, as I noted earlier, how is the application to know what the default figure style is?
Although it's not coded in the glyphs, couldn't a layout application determine the status of the default figures, by noting which classes they appear in?
In Germany capitalization can be very important in plain text, as John already mentioned; Even more dramatically in the polite form of address.
And yes, I love examples:
"Ich kann sie nicht leiden!" = "I don't like her!",
"Ich kann Sie nicht leiden!" = "I don't like you!"
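Christoph's pair makes the point neatly: the grammatical distinction lives in the case of a single letter, survives any plain-text round trip precisely because case is encoded, and is erased only by an all-caps rendering:

```python
# German 3rd-person "sie" (her/them) vs. polite "Sie" (you):
# distinct in plain text because case is encoded...
assert "Ich kann sie nicht leiden!" != "Ich kann Sie nicht leiden!"

# ...but an all-caps setting collapses the distinction entirely:
assert "sie".upper() == "Sie".upper() == "SIE"
```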
I asked for an example of an oldstyle numeral meaning something other than a lining numeral; that is, meaning something other than e.g. five. That is semantics.
Last week, I saw an annual report with such a distinction:
The designer used osf and lf side by side, same font, same size, on the same page/line: osf were used for dates (of years etc.) and other text, while the lining figures were reserved for figures (i.e. amounts of money). Looked quite good and perspicuous (though unfamiliar at first).
Florian, but still 5 means 5, right!?
But 5 years is not the same as 5 Euros. It can be a difference of syntax.
Florian, that's an interesting example. The style of numeral still doesn't affect the semantics of the numeral, but it is used as a unit indicator. The same thing could, of course, be done with any other differentiation in styling (bold, italic, etc.), and this sort of thing isn't uncommon. It is part of what I call organisational or articulatory typography, in this case as an aspect of information design. But still the semantics of the numerals themselves are unchanged, and this is what plain text encodes.
Nick: And taking a harder line one may ask, does capitalization belong in plain text?
Yes, because it is semantically important in the linguistic sense in the orthographies of many languages. If people only used capital letters in algorithmically determinable situations, then one might argue that it would be unnecessary to encode them in plain text, but that is not the case: capital letters are used to indicate different types of words, and that is important semantic information that needs to be preserved. Do you want to tell the Germans that they can't indicate nouns in plain text? Do you want to be nick instead of Nick?
Christoph, that's a great German example: a nice case (excuse the pun) of capitalisation in the orthography providing more information than the spoken language.
capital letters are used to indicate different types of words, and that is important semantic information that needs to be preserved.
Important, but not fundamental. Germans, for instance, are comfortable with all-cap text.
The way we work at the moment operates in a three-part mode, with plain text, rich text, and document-specific text.
And now there is XML, which takes textualization even further (and is considered a "language"!)
Given these different degrees of encoding, might it not have been better, for the system as a whole, and for typography in particular, to assign capitalization to rich text?
If red colored figures in a table of figures can mean a negative balance and black positive, why not the cited example of lining figures for currency and oldstyle for dates? It is an indicator of content no matter how you categorize it.
Chris, I didn't say that the distinction between numeral styles could not be used to indicate different kinds of information, in the same way that colours or other styling such as bold and italic may. This is one of the things that typography does. But indicating different kinds of information, such as units of quantity, doesn't constitute a change in the semantics of the numerals. Now the question of positive or negative numbers is an interesting one, since 5 is indeed semantically different from -5, so if you represent -5 as 5 in some particular styling, e.g. a red 5, to distinguish it from 5, then you are indeed using style to indicate a semantic difference. And it is a very dangerous thing to do. You better hope that whoever is preparing documents in which negative numbers are presented as positive numbers only in red is working from a source text in which either those negative numbers are encoded as e.g. -5 or are in some way tagged as negative numbers. And you better hope that no one who needs to cite that document is cutting and pasting from an electronic version in which the plain text distinction between 5 and -5 has been lost.
Using numeral style to indicate such a semantic difference is much more ambiguous than using colour or other stylings, though. How quickly and reliably can you tell the difference between an oldstyle 8 and a lining 8 in isolation?
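The cut-and-paste hazard is easy to demonstrate. Model the styled document as (text, colour) pairs (a toy representation for illustration, not any real API), and watch the sign vanish when only the plain text survives the trip:

```python
# Toy model of a balance column where red styling means negative.
styled_cells = [("5", "black"), ("5", "red")]  # reads as +5 and -5 on screen

# Copying to plain text keeps only the characters:
plain = [text for text, colour in styled_cells]
# plain == ["5", "5"] -- the negative number is now indistinguishable

# Encoding the sign in the characters, by contrast, survives the trip;
# the red styling becomes redundant rather than semantic:
encoded_cells = [("-5", "red"), ("5", "black")]
plain_encoded = [text for text, colour in encoded_cells]
# plain_encoded == ["-5", "5"] -- the distinction is preserved
```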
Nick: Important, but not fundamental. Germans, for instance, are comfortable with all-cap text.
All-caps text is anomalous in many orthographies -- Greek, anyone? -- it is a special case used in exceptional situations, and it comes with caveats including the fact that it may obscure information that is part of the normal written language and introduce ambiguities of meaning. The Germans, like the Greeks, put up with these caveats because they like the look of monumental capitals as much as anyone, but like other users of bicameral scripts -- excepting some concrete poets, avant-gardists, teenagers and wankers -- what they don't want is all-lowercase text.