The enduring unhelpfulness of “font size”
I have to mix serif, sans, and mono face fonts all the time. It’s quite common in technical writing. But it’s also witheringly difficult to get right:

That was Times New Roman, Arial, and Courier New from the ever relied upon “corefonts” all at “20 pt”. These are so widely used that they have come to be the very definition of serif, sans-serif, and monospaced. Which is sad, because they are completely useless together: each of those bits of text was written at the same “size”, but clearly they are not the same size. There is no resemblance between the different faces. The weights are completely different. And the text itself is radically different sizes. So much for being able to “just pick a bunch of fonts at the same size”.
It turns out that the root problem is that the point size of a font is, for historical reasons, something that isn’t directly measurable from the appearance of a single line of text. It was the body size, the size of the metal bits that were used to compose a line of text. Unfortunately there may be more (or less) spacing between lines, so the body size is not necessarily the distance between consecutive baselines. From the good ship Wikipedia’s article on typefaces, “when specified in typographic sizes the height of an em-square, an invisible box which is typically a bit larger than the distance from the tallest ascender to the lowest descender, is scaled to equal the specified size.”
Ok, so let’s see how that looks here:

Sure enough, the height of the em-square here is bigger than the ascender plus descender. It’s also wider than the letter M (traditionally an em was the width of a capital M). The em-square happens to match the spacing between baselines, though whether that’s the font or the render engine is hard to say.
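To put some numbers on that scaling, here is a minimal sketch in Python. It assumes the usual digital-font arrangement where glyph measurements are stored in design units on an em-square of `units_per_em` units; the metric values below are invented for an imaginary face, not taken from any real font.

```python
def units_to_points(value_units: float, units_per_em: int, font_size_pt: float) -> float:
    """Scale a length in font design units to points at the given font size."""
    return value_units * font_size_pt / units_per_em


# Hypothetical metrics for an imaginary face, in design units.
UNITS_PER_EM = 1000
ASCENDER = 720      # above the baseline
DESCENDER = -210    # below the baseline, hence negative
CAP_HEIGHT = 660
X_HEIGHT = 450

FONT_SIZE_PT = 20.0

# The em-square is, by definition, scaled to the requested font size.
em_square = units_to_points(UNITS_PER_EM, UNITS_PER_EM, FONT_SIZE_PT)
# Everything else is whatever fraction of the em the designer chose.
glyph_extent = units_to_points(ASCENDER - DESCENDER, UNITS_PER_EM, FONT_SIZE_PT)
cap_height = units_to_points(CAP_HEIGHT, UNITS_PER_EM, FONT_SIZE_PT)
x_height = units_to_points(X_HEIGHT, UNITS_PER_EM, FONT_SIZE_PT)

print(f"em-square:            {em_square:.1f} pt")
print(f"ascender + descender: {glyph_extent:.1f} pt")
print(f"cap-height:           {cap_height:.1f} pt")
print(f"x-height:             {x_height:.1f} pt")
```

The only number pinned to “20 pt” is the em-square. The visible letter heights depend entirely on the per-font ratios, which is why two faces at the same nominal size can look so different.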
In any event, the size of the font, here shown to be the size of an em-square, has no relationship to the height of the letters or the proportions between upper case letters and lower case ones:

I first learned about all this because the corefonts, from Microsoft, are cost-free to use, but not libre for changes or redistribution and thus cannot be shipped in Linux distributions like Debian or Fedora. As a replacement, Red Hat obtained and then generously released the Liberation font family. They were advertised as being a libre substitute for the corefonts which were “metric compatible”. Silly me, of course, assumed that they meant aesthetically compatible with each other, but of course they meant what they said — that they were individually metric compatible with the corresponding proprietary Microsoft corefont. Fair enough; they were meant as a “drop in” replacement so your presentations and word docs wouldn’t get screwed up if you switched to Linux. I have my doubts about even that, but in any event as you can see here it didn’t help matters in the relative size department:

There it’s Liberation Serif, Liberation Sans, and Liberation Mono. The problem is that again font metrics aren’t calibrated to correspond to each other: the font size (which as we’ve seen is the size of the em-square) has no relation to the cap-height, the x-height, or the ratio between them.
At best, you can hope that the height of the capital letters will be the same, but even that’s a stretch. Needless to say this is chaos for the font libraries to have to deal with; you see mis-sized text all the time in desktop applications; and meanwhile authors of web browsers haven’t got a chance when some things are specified in points, others in pixels, and relative % sizes appear from time to time just to completely drive your poor renderer to the wall. Worst of all, we poor users trying to write documents end up with nothing but a muddy mess when attempting something which should be simple. I just want text that looks right!
Looking at these examples, you realize that what you want isn’t the capital letter height to be aligned (who cares!) but for the lower case letters to be aligned. In typography this is called the “x-height”, the size of the lower case letter x.
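If you know each font's x-height as a fraction of its em-square, the size needed to line up the lower case letters is simple arithmetic. The following is only a sketch: the `FontMetrics` class, the `matched_size` helper, and all the metric values are made up for illustration, and the eye remains the final judge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontMetrics:
    name: str
    units_per_em: int
    x_height: int  # in design units

    @property
    def x_ratio(self) -> float:
        """x-height as a fraction of the em-square."""
        return self.x_height / self.units_per_em


def matched_size(base: FontMetrics, base_size_pt: float, other: FontMetrics) -> float:
    """Font size for `other` whose x-height equals that of `base` at `base_size_pt`."""
    return base_size_pt * base.x_ratio / other.x_ratio


# Hypothetical faces; the numbers are illustrative, not measured from real fonts.
serif = FontMetrics("Example Serif", units_per_em=1000, x_height=430)
sans = FontMetrics("Example Sans", units_per_em=2048, x_height=1062)
mono = FontMetrics("Example Mono", units_per_em=1000, x_height=500)

BODY_SIZE_PT = 11.0
for face in (sans, mono):
    size = matched_size(serif, BODY_SIZE_PT, face)
    print(f"{face.name}: set at {size:.2f} pt to match {serif.name} at {BODY_SIZE_PT} pt")
```

Note that the serif face keeps its nominal size and the other two are adjusted to it, so the body text stays in charge.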
Screen writing
We happen to have a font family that was designed to look good together on screen:

That’s Deja Vu Serif (that almost no one seems to know about; try it as your Document font!), Deja Vu Sans (which has excellent Unicode coverage and as the default GNOME Application font is thus very widely used), and Deja Vu Sans Mono (the complementary monospaced version, also very nice, and a good choice for your terminals and source code editor). You can see the relative sizes are almost in sync; in fact, on a computer monitor at 10pt at 90-120 dpi, they are aligned, because these are screen fonts designed for low resolution display:

More to the point, they are clearly a family that goes together. That’s hard to come by, and as a family they are fantastic on screen. [Droid Serif, Droid Sans, Droid Mono are also nice as a family and are fine on small device screens with even lower density, but for a proper laptop or desktop screen they're a bit lacking in refinement].
But these are all screen fonts, and, frankly, just don’t look right printed at 1200 dpi on paper — look at the glyph spacing and the blockiness of the serifs. To be expected; Deja Vu (based on Bitstream Vera) was designed for readability at low res; things like crossbars on ts and fs and the ear on the g have to be placed so that they work in the limited number of pixels available to draw each glyph. The whats? Found a really nice illustration of typeface anatomy (it’s 1920×1200 though).
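To get a feel for just how few pixels a screen font has to work with, here is a rough calculation. It assumes the common convention of 72 points to the inch, and the x-height ratio is a made-up value rather than one measured from Deja Vu.

```python
POINTS_PER_INCH = 72  # the desktop publishing point; an assumption about the toolchain


def points_to_pixels(size_pt: float, dpi: float) -> float:
    """Convert a length in points to device pixels (or printer dots) at a given resolution."""
    return size_pt * dpi / POINTS_PER_INCH


FONT_SIZE_PT = 10.0
X_HEIGHT_RATIO = 0.5  # hypothetical x-height as a fraction of the em-square

for label, dpi in (("screen", 96), ("laser printer", 1200)):
    em_px = points_to_pixels(FONT_SIZE_PT, dpi)
    x_px = em_px * X_HEIGHT_RATIO
    print(f"{label:>13} at {dpi:4d} dpi: em = {em_px:6.1f}, x-height = {x_px:5.1f}")
```

With only a handful of pixels for the whole x-height on screen, every crossbar has to land on a pixel row, whereas the printer has dozens of dots for the same letter and can afford far subtler shapes.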
Printing for fun and profit
So we’re left trying to find printer fonts that go well together. They don’t have to be from the same typeface family, or originate from the same foundry. They just have to look right.
One of the things that pisses me off about conventional word processors is having to specify fonts for bloody everything. Every time you turn around you have to specify the font for a span of text, a new paragraph… want to create a new style for something? Pick your font. Again. And meanwhile you get a list with every last font installed on the system.
First things first. You don’t need eighteen fonts. You need about three.
A beautiful serif font for your normal text. A good one will have excellent Unicode coverage and will handle italics and bold well as a matter of course.
Next you need a monospaced font for program source fragments, code literals, and other computer-y stuff.
And finally you need a sans-serif font as a tool to indicate things like proper names.
And that’s all you should need to pick!
Of course once we’ve selected a sans font and a mono font we have to figure out different “sizes” for each such that they correspond with the serif font we’ve picked, otherwise we’ll end up with the mishmash we’ve seen above. But once that’s done, we shouldn’t have to do any more font work. In particular, there’s nothing worse than the occasional line being taller than the rest just because there happens to be a superscript or some other font present. No. There should only be one line height (leading) and now that you’ve picked a serif font, you want it to control that line spacing. You know, font size.
How do you get the sizes for sans and mono figured out? Ultimately you’ve got to print to paper and make sure you’re happy with the aesthetics of your choices, but as you are picking fonts, why not get a preview of the three fonts together so you can converge the x-heights? I’ve been working for a while on a document editor called Quill and Parchment whose rendering engine configuration offers something along these lines:

Once you’ve converged the heights of the Sans and Mono selections, you’re done! The renderer can figure out everything for you from there.

Ah, there we go :)
Here we have Linux Libertine as the serif font, Liberation Sans for sans-serif spans, and Inconsolata for monospaced text. And now you can mix and match these as necessary and you know they’ll fit together. You don’t have to specify the metrics and everything else just because you happen to put a filename in the flow of your text. You tell Quill that the span of text is a filename. The default Parchment render engine is configured to use the mono font in an italicized style. And then it Just Does The Right Thing™.
The font choices themselves aren’t the point. Getting the relative sizes aligned is; I could have achieved this with just the Liberation family or even the unfortunately ubiquitous corefonts. The set above is just the defaults suggested out of the box. I think they’re pretty good together, but ultimately the whole point is that you can and will make your own choices. To assess the suitability of fonts, both individually and to see whether they work together, you really do have to print out to paper. Laser printers don’t bleed ink like moveable type presses did, but it’s not until you look at the printed page that you can decide whether your choices are good ones. Screen previews, including the illustrations here, don’t do high resolution fonts justice. Not at all.
But you shouldn’t have to go through the hassle of selecting a suitable font and figuring out what size it should be every time you create a different semantic markup for a piece of text. Nope. That’s up to the renderer. Funny how the “styles” mechanisms in WYSIWYG word processors — and web browsers — seem to fall short in all this.
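The idea of choosing three faces once and letting semantic markup pick among them can be sketched in a few lines. To be clear, this is not Quill's or Parchment's actual configuration or API: the `Face`, `Style`, and `style_for` names, the span kinds other than the filename example, and the two adjusted sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    family: str
    size_pt: float


@dataclass(frozen=True)
class Style:
    face: Face
    italic: bool = False


# Chosen once. The sans and mono sizes are hypothetical values standing in for
# whatever aligns their x-heights with the serif face.
SERIF = Face("Linux Libertine", 11.0)
SANS = Face("Liberation Sans", 9.2)
MONO = Face("Inconsolata", 9.8)

# The author marks up meaning; the renderer owns the mapping to fonts.
STYLES = {
    "normal": Style(SERIF),
    "filename": Style(MONO, italic=True),
    "literal": Style(MONO),
    "proper-name": Style(SANS),
}


def style_for(kind: str) -> Style:
    """Look up the style for a semantic span, falling back to normal text."""
    return STYLES.get(kind, STYLES["normal"])


print(style_for("filename"))
```

Adding a new kind of markup then means adding one entry to the table, with no font dialog in sight.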
Anyway, next time you’re writing a document and battling fonts trying to make them look good together, try ignoring “font size” and instead zoom in and see if you can align the x-heights. You might be pleased with the result.
AfC
