java-gnome 4.0.20 released

This post is an extract of the release note from the NEWS file which you can read online.


java-gnome 4.0.20 (11 Jul 2011)

This will be the last release in the 4.0 series. It is meant only as an aid to porting over the API bump between 4.0 and 4.1; if your code builds against 4.0.20 without reference to any deprecated classes or methods then you can be fairly certain it will build against 4.1.1 when you finally get a system that has GTK 3.0 and the other GNOME 3 libraries on it.


AfC

Using tinc VPN

We’ve been doing some work where we really needed “direct” machine to machine access between a number of staff and their local file servers. The obvious way to approach this sort of thing is to use a Virtual Private Network technology, but which one?

There are a lot of VPN solutions out there. Quite a number of proprietary ones, and of course the usual contingent of “it’s-free-except-that-then-you-have-to-pay-for-it”. In both cases, why anyone would trust the integrity of code they can’t review is quite beyond me.

We’ve used OpenVPN for some of our enterprise clients, and it’s quite robust. Its model excels at giving remote users access to resources on the corporate network. Technically it is implemented by each user getting a point-to-point connection on an internal network (something along the lines of a 10.0.1.0/30) between the user’s remote machine and a gateway server, and then adding routes on the client’s system to the corporate IP range (ie good old 192.168.1.0/24). That’s fine so long as all the servers on the corporate network have the gateway as their default route; then reply packets to 10.0.1.2 or whatever will just go to default and be sent back down the rabbit hole. It gets messy with things like Postgres if your remote developers need access to the databases; in the configs you do need to add eg 10.0.1.0/24 to the list of networks that the database will accept connections from.
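To make that addressing concrete, here is a minimal sketch in Python using the standard ipaddress module. The addresses are the illustrative ones from the paragraph above; the allow-list is a hypothetical stand-in for whatever access list your database keeps, not actual Postgres configuration syntax.

```python
import ipaddress

# Point-to-point tunnel network handed to one remote user (example from the text).
tunnel = ipaddress.ip_network("10.0.1.0/30")
corporate = ipaddress.ip_network("192.168.1.0/24")

# A /30 has exactly two usable addresses: one for the gateway, one for the client.
tunnel_hosts = [str(h) for h in tunnel.hosts()]


def accepted(source, allowed_networks):
    """Return True if the source address falls inside any allowed network."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in allowed_networks)


# Hypothetical allow-list: a database that only trusts the corporate range
# refuses the remote developer, whose packets arrive from 10.0.1.2.
before = accepted("10.0.1.2", [corporate])
after = accepted("10.0.1.2", [corporate, ipaddress.ip_network("10.0.1.0/24")])

print(tunnel_hosts, before, after)
```

The remote client only gets in once the tunnel range has been added alongside the corporate one, which is the extra bit of configuration mentioned above.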

Anyway, that’s all fairly reasonable, and you can set up the client side from NetworkManager (install Debian package network-manager-openvpn-gnome) which is really important too. Makes a good remote access solution.

Peer to Peer

But for our current work, we needed something less centralized. We’re not trying to grant connectivity to a remote corporate network; we’re trying to set up a private network in the old-fashioned frame-relay sense of the word — actually join several remote networks together.

Traditional VPN solutions route all the traffic through the secure central node. If you’ve got one system in NSW and another in Victoria, but the remote access gateway is in California, then despite the fact that the two edges are likely less than 50 ms away direct path, all your traffic is going across the Pacific and back. That’s stupid.

A major complication for all of us was that everyone is (of course) stuck behind NAT. Lots of developers, all working remotely, really don’t need to send all their screen casts, voice conferences, and file transfer traffic into the central corporate network just to come all the way out again.

The 1990s approach to NAT implies a central point that everyone converges to as a means of getting their packets across the port address translation boundary. Things have come a long way since then; the rise of peer-to-peer file sharing and dealing with the challenges of internet telephony has also helped a great deal. Firewalls are more supportive and protocols have evolved in the ongoing attempt to deal with the problem.

Meet tinc

So the landscape is different today, and tinc takes advantage of this. According to their goals page, tinc is a “secure, scalable, stable and reliable, easy to configure, and flexible” peer-to-peer VPN. Uh huh. Because of its peer-to-peer nature, once two edges become aware of each other and have exchanged credentials, they can start sending traffic directly to each other rather than through the intermediary.

$ ping 172.16.50.2
PING 172.16.50.2 (172.16.50.2) 56(84) bytes of data.
64 bytes from 172.16.50.2: icmp_req=1 ttl=64 time=374 ms
64 bytes from 172.16.50.2: icmp_req=2 ttl=64 time=179 ms
64 bytes from 172.16.50.2: icmp_req=3 ttl=64 time=202 ms
64 bytes from 172.16.50.2: icmp_req=4 ttl=64 time=41.6 ms
64 bytes from 172.16.50.2: icmp_req=5 ttl=64 time=45.4 ms
64 bytes from 172.16.50.2: icmp_req=6 ttl=64 time=51.3 ms
64 bytes from 172.16.50.2: icmp_req=7 ttl=64 time=43.3 ms
64 bytes from 172.16.50.2: icmp_req=8 ttl=64 time=42.3 ms
64 bytes from 172.16.50.2: icmp_req=9 ttl=64 time=44.2 ms
...
$

This is with the tincd daemons freshly restarted on each endpoint. The first packet clearly initiates edge discovery, key exchange, and setup of the tunnels. It, and the next two packets, are passed across the Pacific to the central node. Ok, fine. But after that, the tunnel setup completes, and both edge nodes have been informed of the peer’s network addresses and start communicating directly. Nice.

See for yourself

Watching the logs under the hood confirms this. If you run tincd in the foreground then you can specify a debug level on the command line; I find “3” a good setting for testing:

# tincd -n private -D -d3
tincd 1.0.13 (May 16 2010 21:09:47) starting, debug level 3
/dev/net/tun is a Linux tun/tap device (tun mode)
Executing script tinc-up
Listening on 0.0.0.0 port 655
Ready
Trying to connect to einstein (1.2.3.4 port 655)
Trying to connect to newton (5.6.7.8 port 655)
...

If you give it SIGINT by pressing Ctrl+C then it’ll switch itself up to the exceedingly verbose debug level 5, which is rather cool. SIGQUIT terminates, which you can send with Ctrl+\. If you’re not running in the foreground (which of course you’d only be doing in testing),

# tincd -n private -kINT

does the trick. Quite handy, actually.

Performance is respectable indeed; copying a 2.8 MB file across the Pacific,

$ scp video.mpeg joe@einstein.sfo.example.com:/var/tmp

gave an average of 31.625 seconds over a number of runs. Doing the same copy but sending it over the secure tunnel by addressing the remote machine by its private address,

$ scp video.mpeg joe@172.16.50.1:/var/tmp

came in at an average of 32.525 seconds. Call it 3% overhead; that’s certainly tolerable.
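For the record, the arithmetic behind that figure, as a small Python sketch. The two timings are the averages quoted above; the function name is just mine.

```python
# Average wall-clock times reported above, in seconds.
direct_seconds = 31.625   # scp to the public address
tunnel_seconds = 32.525   # scp to the private (tinc) address


def overhead_percent(baseline, measured):
    """Relative slowdown of `measured` versus `baseline`, as a percentage."""
    return (measured - baseline) / baseline * 100.0


overhead = overhead_percent(direct_seconds, tunnel_seconds)
print(f"{overhead:.1f}% overhead")
```

That works out to a shade under 3%, for what is admittedly a small sample of runs.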

Setup

Despite my talking above about joining remote networks, an important and common subcase is merely joining various remote machines, especially when those machines are both behind NAT boundaries. That’s our in-house use case.

The tinc documentation is fairly comprehensive, and there are a few HOWTOs out there. There are a few gotchas, though, so without a whole lot of elaboration I wanted to post some sample config files to make it easier for you to bootstrap if you’re interested in trying this (install Debian package tinc).

tinc has a notion of network names; you can (and should) organize your files under one such. For this post I’ve labelled it the incredibly original “private”. Note that when you specify host names here they are not DNS hostnames; they are just symbolic names for use in control signalling between the tinc daemons. Flexibility = Complexity. What else is new. Obviously you’d probably use hostnames anyway but administration of the tinc network doesn’t need to be co-ordinated with people naming their laptop my-fluffy-bunny or some damn thing. Anyway, on the system labelled hawking I have:

hawking:/etc/tinc/private/tinc.conf

    Name = hawking
    AddressFamily = ipv4
    ConnectTo = einstein
    ConnectTo = newton
    Interface = tun0

Note that I’ve got an Interface statement there, not a Device one. That’s a bit at odds with what the documentation said but it’s what I needed to make it all work. Only one ConnectTo is actually necessary, but I’ve got one server in California that is reliably up and one in Victoria that is not so I just threw both in there. That’s what your tincd is going to (compulsively) try to establish tunnels to.

hawking:/etc/tinc/private/hosts/hawking

    Subnet = 172.16.50.31/32

Somewhat confusingly, you need a “hosts” entry for yourself. Above is what you start with. Each host also needs a keypair which you can generate with:

# tincd -n private -K4096

with /etc/tinc/private/rsa_key.priv getting the private key and the public key being appended to the hosts/hawking file:

    Subnet = 172.16.50.31/32

    -----BEGIN RSA PUBLIC KEY-----
    MIICCgKCAgEAzSd5V91X6r3NB3Syh2FV8/JC2M7cx6o2OKbVzP6X5SFPI1lEH1AD
    7SfIlQF4TE++X8RcpJaBi4KjMS/Ul36Tuk75eKA18aNTBoVqH/ytY0BipQvJ6TUd
    BEkCjYrOUHFYOQn8MxQzziG6nk9tvhTWS0yKCNbd68e5i9uyKOem3R/pJsd/Kh9V
    wdVB51Wxs1Sv07OYmGYyRmGWh450wBNEmQfPHmM60Yh6uoQNJ0Ef41k1ZcswWcfO
    0jp9EOvbW/ZCdBW6teIYZ3GMuMB/cFj0Dw2fx6dHNHZVZrPcivt0cuOG8L4jNoHj
    HQUGuzMrpDN8N1ymM/eDlx+kBFYreKiEYGoWWqlZPNoY+bCekMrNf6Sr9bBwbj23
    xmY1jf6v1LkxGtOi4wWJfbU4xaMnquIRQe6FtB4LHp29l2SYWcpZnjuLcZ4ZoZLQ
    WK4bb0bUCAI/eYb19JRnfKEwS9MhYaQhZLWAJ3xyOt9u/Kk9KV7vWApxR1f5e2KT
    77A446eQU5aedm8nBDbd+WHqTdklAQ7SdRyYmbD8PoXBd3DGP6dFiURVTy8Wn4gz
    Bn7PMI3zmhfCMtwq/3A/xfyjQY3qesGCmKUwTno3fhv1DScS0rS9TRxZfyxlaOB1
    qjtlU79VhI0UKlha2Fv4XLshQ5dYEutpatpij0NzPYlwiQFphFQKStsCAwEAAQ==
    -----END RSA PUBLIC KEY-----

These are the public identifiers of your system and indeed the remote system in your ConnectTo statement must have a copy of this in its hosts/ directory. For nets of servers we maintain them in Bazaar and share them around using Puppet. Central distribution brings its own vulnerabilities and hassles; for very small team nets we just share around a tarball :).

You don’t need the /32, it turns out, but I left it in here to show you that tincd is effectively trading around network route advertisements, not host addresses.

hawking:/etc/tinc/private/tinc-up

    #!/bin/sh
    ifconfig $INTERFACE 172.16.50.31 netmask 255.255.255.0

This gets run when the net comes up. You can do all kinds of interesting things here, but the really magic part is assigning a broader /24 network mask than that given the interface in the hosts/hawking file. That means this interface is the route to the network as a whole (not just to a single-attached host on the other side of a point-to-point tunnel, which is what OpenVPN does, leaving the default gateway to sort it all out). Lots of other ways to wire it of course, but one /24 in RFC 1918 land is more than enough. I’ve even heard of some people using Avahi link-local networking to do the addressing.

I could have hard coded tun0 there, I suppose, but they supply some environment variables. Much better.
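If the difference between the /32 in hosts/hawking and the /24 in tinc-up seems subtle, this small Python sketch (standard ipaddress module, addresses taken from the files above) shows what each one covers. It only illustrates the address arithmetic; it says nothing about how tincd itself makes routing decisions.

```python
import ipaddress

# What hosts/hawking advertises to the rest of the tinc net: a single address.
advertised = ipaddress.ip_network("172.16.50.31/32")

# What tinc-up configures on the local interface: same address, wider mask.
local = ipaddress.ip_interface("172.16.50.31/255.255.255.0")

einstein = ipaddress.ip_address("172.16.50.1")

# The /32 only covers hawking itself ...
print(einstein in advertised)

# ... but the /24 on the interface means every 172.16.50.x destination
# is considered reachable via the tunnel device.
print(local.network, einstein in local.network)
```

So the advertisement says "I am .31" while the interface mask says "everything in 172.16.50.0/24 is out this way".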

Now for the California node:

einstein:/etc/tinc/private/tinc.conf

    Name = einstein
    AddressFamily = ipv4
    Interface = tun0
    Device = /dev/net/tun

For that one I did need a Device entry. Not sure what’s up there; it’s a server running Stable, so could just be older kernel interfaces. Doesn’t matter.

Note again though that the tinc.conf file doesn’t have a public IP in it or anything. Bit unexpected, but hey. It turns up in the hosts files:

einstein:/etc/tinc/private/hosts/einstein

    Address = 1.2.3.4
    Subnet = 172.16.50.1/32

    -----BEGIN RSA PUBLIC KEY-----
    MIICCgKCAgEAqh/4Pmxy5fXZh/O7NkvebFK0OP+YD8Ph7JvK8RsUn75FY3DXjCCg
    VNRR+kRhnVoKVJcIAuvW7Tbs4fovWELOJbbUbKea8G+HANCgOY5F0rkJVtIAcTCL
    Jg1OelAfhF6yHV4vVgcawafWiMF2CtprveHomCnOwCbGuTDwTBqaUBZ9IOLzU2bx
    ArVA2No9Ks+xaaeSHejYoii3+WT58HUccntmIYkcdBa0uKZSis1XLUwdT7Evr1Ew
    K54RyMMEPC0MUziYZhAA0Qqpz79EzLXAGgQeuFxLjPoW/NbAD0PEBmsdmI5odprp
    t9Tx11v/UuhK2fszYKjM+DF2pYxxrKlOyus58zx5KKJQjjrzazrru5Ny0DNf/E6Y
    uB2kUtt7TCmoZg2CLAbIkyGJEiK+Wy2x2mabGDgicIs422XVslz2EODSI3qqF+f6
    gu+h/vYvjZxglYrL0SxTRV7wkUc+o9OVXMMYPazgPIkwnBeLrEhGL8GS4wDIYu4G
    E89m9UBE0fhVPJyw4QSfdeJZ4PgpJk6SG/7koVsJqr9EZOLp53K7ipnPylUKaRLD
    mcarvoDO6ybCuHUVUsLuzZZStSG8JEEe/8jb/Ex7UNBzJ14Nglqtu0aUZ/tzkrdS
    nPFFhdIwlUctM7sWKVfBugEkWjs3sR+XRVsCjxMrpZX0lXzcw9vhu60CAwEAAQ==
    -----END RSA PUBLIC KEY-----

This file must be on every system in the net that has a ConnectTo pointing at it; it’s how the edges know where to call. So the same file is copied to hawking:

hawking:/etc/tinc/private/hosts/einstein

    Address = 1.2.3.4
    Subnet = 172.16.50.1/32

    -----BEGIN RSA PUBLIC KEY-----
    MIICCgKCAgEAqh/4Pmxy5fXZh/O7NkvebFK0OP+YD8Ph7JvK8RsUn75FY3DXjCCg
    VNRR+kRhnVoKVJcIAuvW7Tbs4fovWELOJbbUbKea8G+HANCgOY5F0rkJVtIAcTCL
    Jg1OelAfhF6yHV4vVgcawafWiMF2CtprveHomCnOwCbGuTDwTBqaUBZ9IOLzU2bx
    ArVA2No9Ks+xaaeSHejYoii3+WT58HUccntmIYkcdBa0uKZSis1XLUwdT7Evr1Ew
    K54RyMMEPC0MUziYZhAA0Qqpz79EzLXAGgQeuFxLjPoW/NbAD0PEBmsdmI5odprp
    t9Tx11v/UuhK2fszYKjM+DF2pYxxrKlOyus58zx5KKJQjjrzazrru5Ny0DNf/E6Y
    uB2kUtt7TCmoZg2CLAbIkyGJEiK+Wy2x2mabGDgicIs422XVslz2EODSI3qqF+f6
    gu+h/vYvjZxglYrL0SxTRV7wkUc+o9OVXMMYPazgPIkwnBeLrEhGL8GS4wDIYu4G
    E89m9UBE0fhVPJyw4QSfdeJZ4PgpJk6SG/7koVsJqr9EZOLp53K7ipnPylUKaRLD
    mcarvoDO6ybCuHUVUsLuzZZStSG8JEEe/8jb/Ex7UNBzJ14Nglqtu0aUZ/tzkrdS
    nPFFhdIwlUctM7sWKVfBugEkWjs3sR+XRVsCjxMrpZX0lXzcw9vhu60CAwEAAQ==
    -----END RSA PUBLIC KEY-----

Ok, you get the idea with the public keys, but I wanted to emphasize the point that it’s the same file. This is what you need to share around to establish the trust relationship and to tell E.T. where to phone home.
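Since a missing hosts/ file for a ConnectTo target is an easy mistake to make, here is a hedged Python sketch of a sanity check. It is my own helper, not part of tinc, and the parsing is deliberately simplistic: it only understands "Key = value" lines and assumes variable names are case-insensitive.

```python
def connect_targets(tinc_conf_text):
    """Extract the node names from ConnectTo lines of a tinc.conf."""
    targets = []
    for line in tinc_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "connectto":
            targets.append(value.strip())
    return targets


def missing_host_files(tinc_conf_text, host_file_names):
    """Names we ConnectTo for which no hosts/ file is present."""
    present = set(host_file_names)
    return [name for name in connect_targets(tinc_conf_text) if name not in present]


conf = """\
Name = hawking
AddressFamily = ipv4
ConnectTo = einstein
ConnectTo = newton
Interface = tun0
"""

# Pretend hosts/ holds only these two files; in practice you would list
# the contents of /etc/tinc/private/hosts instead.
print(missing_host_files(conf, ["hawking", "einstein"]))
```

With the example tinc.conf from hawking and only two host files in place, the check reports newton as the one still to be copied over.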

The Address entry in the hosts/einstein files spread around is what tells edge nodes which have been configured to ConnectTo to einstein where the real public IP address is. You can use DNS names here, and could play dynamic DNS games if you have to (sure, further decentralizing, but). If you have a few machines capable of being full time central supernodes then you’ll have much better resiliency.

You do not, however, need to share a hosts/ file for every other node on the net! If laptop penrose is already connected in to einstein and has been assigned 172.16.50.142 say, and hawking joins einstein and tries to ping .142, the central node einstein will facilitate a key exchange even though neither hawking nor penrose have each other’s keys, and then get out of the way. Awesome.

And finally, this all works over further distributed topologies. When new nodes join, the new edges and their subnets are advertised around to the rest of the net. So if central nodes einstein and curie are already talking, and sakharov joins curie, then traffic from our hawking will reach sakharov via einstein and curie, and in fairly short order they will have handled key exchange and stepped out of the way, and hawking will be communicating with sakharov direct peer to peer. Brilliant.

Nothing stopping you from sharing around (or centrally managing out-of-band) the hosts/ files with the Subnet declarations and the public keys, of course; it’ll save a few round trips during initial key exchange. Up to you how you manage the trust relationships and initial key distribution.

For completeness,

einstein:/etc/tinc/private/tinc-up

    #!/bin/sh
    ifconfig $INTERFACE 172.16.50.1 netmask 255.255.255.0

No surprises there.

Applications

Using tinc to cross arbitrary NAT boundaries has turned out to be supremely useful. I have successfully used this from within my office, over 3G UMTS mobile broadband, at internet cafes around Australia, in airport lounges in the States, and even from beach-side resorts in Thailand. In all cases I was able to join the private network topology. In fact, I now just leave tincd running as a system daemon on my laptop. When I need to talk to one of the file servers, I ping, and it’s there.

One surprising benefit was in getting voice-over-Jabber running again. We had some horrible regressions with audio quality during the Maverick release series of Ubuntu Linux. At one point in our diagnostics we found that the STUN algorithms for local and remote candidate IP detection were preferentially choosing localhost virtual bridges with lower route metrics than the default gateway resulting in routing loops. We brought up tinc and since both parties were on 172.16.50.x, Empathy and Jingle chose those as the “best” network choice. Packet loss problems vanished and the audio quality really improved (it didn’t finally get resolved until we got a Natty base system, tore out the Unity stuff, and got GNOME 3 and Empathy 3.0 on board via ppa:gnome3-team/gnome3 but that’s a separate issue). And as a side-effect we’ve got some ice on our voice channel. Excellent.

I’ve since read about a number of other interesting applications. A frequent use case is not needing encryption. While most people would interpret the “private” in virtual private network as meaning “secure”, in the old days it just meant a custom routing and network topology layered over whatever the underlying physical transport was. One crew running a large farm of servers on cloud provided infrastructure struggled to enable their various distributed nodes to find and talk to each other. So they disabled the encryption layer but used tinc as a means to do IP-over-IP tunnelling, giving their sys admins a stable set of (private) addresses with which to talk to the machines. They gave a talk at FOSDEM [their slides here] about it.

Also at FOSDEM was a talk by the “Fair VPN” effort, who are looking at improving efficiency of the network when the number of nodes scales into the thousands. Some nodes are “closer” than others so presumably they should be used preferentially; you don’t really need to discover information about every other node in the network on joining, and so on. The fact that they were able to use tinc as a research platform for this is fascinating and a nice bit of kudos.

Next steps

So I’m pretty pleased with tinc, obviously. We’ve had a very positive experience, and I wanted to put a word in. If you’re involved in network engineering or security hardening, then I’m sure they’d welcome your interest.

It would be outstandingly cool if we could work out a NetworkManager plugin to set this up on demand, but that can wait for tinc version 1.1 or 2.0. I gather they’re working on making the key exchange and configuration easier; what I showed above is obviously well thought out and flexible, but there’s no denying it’s a bit cumbersome; there are a fair number of little knobs that need to be just right. A fire-and-forget daemon cross-product with some form of automatic addressing would be brilliant. But on the other hand, when you put network and security in the same sentence there’s a limit to how much you want to happen without any direct influence over the process. As it stands now tinc strikes a good balance there, and is entirely suitable for an environment managed by properly competent sysadmins.

AfC

Updates

  1. Turns out I was wrong about needing the Interface statement. After Dan’s post I tried it without one and tincd worked fine. Then I remembered why I’d done it that way — without an Interface statement the network interface was named for the tinc net label, private in this case. Preferring tun0, I went back to manually forcing it for my own personal aesthetic benefit.

Force Pidgin online

The Network Manager 0.9 series has made some changes which break current Pidgin. After I installed network-manager 0.8.999 Pidgin won’t connect, stalling with “waiting for network connection”.

Turns out there is a workaround in Pidgin: you can force it to ignore what it thinks network availability is by running it as:

$ pidgin -f

There’s no GUI way in gnome-shell to edit a launcher at the moment, fine; the old “edit menus” trick didn’t seem to work either. So to do that manually:

$ cp /usr/share/applications/pidgin.desktop ~/.local/share/applications/
$ vi ~/.local/share/applications/pidgin.desktop

And change the Exec line to:

Exec=pidgin -f

It won’t take effect until the desktop recaches things. Reload gnome-shell by typing “r” in the Alt+F2 run dialog and you’ll be on your way.

I’m sure upstream will catch up with the Network Manager changes in due course but I can live without network availability detection for now and this gets me back online.

I’ve been using Empathy for instant messaging for a long time, but I still love Pidgin for IRC. Go figure. So two clients it is.

AfC

java-gnome 4.0.19 released

This post is an extract of the release note from the NEWS file which you can read online … or in the sources from Bazaar.


java-gnome 4.0.19 (14 Feb 2011)

What do you mean that’s not the font I asked for?

This release includes some minor feature enhancements.

Preliminary coverage of Pango’s Font object. Font is Pango’s abstraction describing a typeface, and is what is actually loaded. We’ve exposed the methods that allow you to find out what was actually loaded for a given FontDescription request. You do this with Context’s loadFont() and then Font’s describe(). Thanks to Behdad Esfahbod for explaining how all this works.

Exposed a few utility functions, including one to find out if your program is running in a terminal or from the Desktop directly.

GTK improvements

Further improved some corner cases involved in using Actions, and now you can make them with named Icons.

There are some odd corner cases, especially with TextView, where idle handlers need to run before you have the calculations you need ready to query. One workaround appears to be letting the main loop cycle, so we’ve exposed Gtk.mainIterationDo() and the Gtk.eventsPending() which wraps it.

Build improvements

Building java-gnome on Mandriva now works! Thanks to Liam Quin for helping QA the top level configure script.


You can download java-gnome’s sources from ftp.gnome.org, or easily checkout a branch from ‘mainline’:

$ bzr checkout bzr://research.operationaldynamics.com/bzr/java-gnome/mainline java-gnome

though if you’re going to do that you’re best off following the instructions in the HACKING guidelines.

AfC

Loading extensions into SQLite

It’s fairly common in financial systems to represent monetary amounts as cents rather than trying to represent dollars and cents with floating point [and all the rounding errors and inaccuracy that come from using base 2 floating point to represent concrete base 10 numbers]. Call it fixed point if you like. Anyway, in an SQLite database I’m working with, the schema represents amounts as integer numbers of cents.

  3995

is

$39.95

I found myself constantly having to convert a number of cents to a number with two decimal places. Would you believe I came up with this?

SELECT
date(timestamp),
CASE
WHEN amount = 0 THEN
    '0.00'
WHEN amount % 10 = 0 THEN
    CAST (amount / 100.0 AS TEXT) || '0'
ELSE
    CAST (amount / 100.0 AS TEXT)
END,
description
FROM ...

I mean, cool that you can do that with SQL, but yikes. What I really wanted was a SQL function that would just do the conversion from cents to money, all nicely formatted.

SELECT date(timestamp), money(amount), description FROM ...

SQLite has a very straightforward extension mechanism. Admittedly I already had a grip on the basics of the SQLite C API because I’d written a hyper-thin Java wrapper for it, but writing an extension in C to define a new scalar SQL function was pretty simple.

Implement function

You create a function with a call to sqlite3_create_function() and end up implementing it by specifying pointers to functions with the signature void (*xFunc)(sqlite3_context*, int, sqlite3_value**). Alright, should be easy:

static void 
convert
(
        sqlite3_context* context,
        int argc,
        sqlite3_value** argv
)
{
        ...

That’s all good. You get the argument passed with one of the sqlite3_value_*() functions:

        num = sqlite3_value_int(argv[0]);

do something with it, say:

        dollars = num / 100;
        pennies = num % 100;

        snprintf(str, 11, "%7d.%02d", dollars, pennies);

and send it back with one of the sqlite3_result_*() functions:

        sqlite3_result_text(context, str, -1, SQLITE_TRANSIENT);
}

Easy enough API to use. I should add, the main SQLite C API is lovely — you get a database connection, you prepare a statement, then you step through the results. Of course, that’s what you do with any database engine in any language, but doing it from C against SQLite is really clean.

To register my function I just have to name it, say how many parameters it has, and pass a function pointer to it, like this:

        sqlite3_create_function(db, "money", 1, ..., convert, NULL, NULL);

Some details omitted there, but you get the idea. Simple enough. Except for one thing. Where do you make that call from? Some sort of entry point, presumably, but what? And meanwhile, where do I get that sqlite3*, the database connection, from?

Loading extension

Hunting around, there’s a “load extension” function. Ok, that’s promising, and it says that the default entry point that will be looked for is “sqlite3_extension_init” but it’s not entirely obvious from the documentation for sqlite3_load_extension() what (if anything) the signature for the entry point is. The usual void (*)(void)?

void
sqlite3_extension_init
()
{
        sqlite3_create_function(db, "money", ...);
}

That’s not going to cut it. Where do I get db from?

Well, I put two and two together and guessed that actually the entry point function is invoked with a pointer to the open database connection on the call stack, because, well, you need one. And sure enough it seems to work. Eventually I found the actual signature described on the reference page for sqlite3_auto_extension(), one of the few times it was really hard to find something.

int
sqlite3_extension_init
(
        sqlite3* db
)
{
        sqlite3_create_function(db, "money", ...);
        return SQLITE_OK;
}

There we go. Compile it up into a nice little shared library and you’re on your way.

$ gcc $CFLAGS -fPIC -shared -o money.so money.c -lsqlite3

Last bit, get it loaded. Programmatically you’d use that load extension function, but if you’re working from SQLite’s command line interface, you use the .load instruction. The catch here is that although the sqlite3 interface does filename completion, this doesn’t work:

$ sqlite3
sqlite> .load m<TAB>
sqlite> .load money.so
Error: money.so: cannot open shared object file: No such file or directory
sqlite>

Bah Humbug. It seems to be looking on the LD_LIBRARY_PATH. Ok, fair enough, but a bit annoying that it’d complete a file it can’t load. Whatever:

sqlite> .load ./money.so
sqlite>

There. Now we can try it:

sqlite> SELECT money(3995);
39.95
sqlite>

Nice!

That’s obviously a lot of work just to coerce two decimal places, but it also means in future I can do custom things like currency symbols, thousands separators, etc. Cool. I’m not about to write actual reports in SQL, but this is really handy for debugging and exploring the schema I’m working with.

Incidentally, there’s no .unload so if you change your sources and recompile you’ve got to .quit sqlite3, restart, and .load again.
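If you just want to play with the idea without compiling anything, the same kind of scalar function can be sketched in Python, whose standard sqlite3 module lets you register a function on a connection with create_function(). This is a swapped-in illustration, not the C extension described above. The sign handling is my addition: the C fragment as shown would, I believe, misprint negative amounts, because C's % yields a negative remainder for a negative operand.

```python
import sqlite3


def money(cents):
    """Format an integer number of cents as a decimal string, e.g. 3995 -> '39.95'."""
    if cents is None:
        return None  # propagate SQL NULL unchanged
    sign = "-" if cents < 0 else ""
    dollars, pennies = divmod(abs(cents), 100)
    return f"{sign}{dollars}.{pennies:02d}"


conn = sqlite3.connect(":memory:")

# Name, number of arguments, and the callable: the same three things the
# C registration call needs.
conn.create_function("money", 1, money)

row = conn.execute("SELECT money(3995), money(-5), money(0)").fetchone()
print(row)
```

The function only exists on the connection it was registered with, so there is nothing to load or unload; that makes it handy for poking at a schema from a script, though it is no help inside the sqlite3 command line tool.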

AfC

Postscript

The extension API is actually more powerful than this — you can override the inherited behaviour of SQLite functions themselves. You do that by including sqlite3ext.h and placing the following macros:

#include <sqlite3ext.h>
SQLITE_EXTENSION_INIT1

int
sqlite3_extension_init
(
        sqlite3* db,
        const char** err,
        const sqlite3_api_routines* api
)
{
        SQLITE_EXTENSION_INIT2(api)
        ...

which is more correct than what I was able to get away with above. Hooray for the lack of type safety of dynamic symbols. There’s a wiki page that describes the correct usage further.

The enduring unhelpfulness of “font size”

I have to mix serif, sans, and mono face fonts all the time. It’s quite common in technical writing. But it’s also witheringly difficult to get right:

[Image: MS core fonts]

That was Times New Roman, Arial, and Courier New from the ever relied upon “corefonts” all at “20 pt”. These are so widely used that they have come to be the very definition of serif, sans-serif, and monospaced. Which is sad, because they are completely useless together: each of those bits of text was written at the same “size”, but clearly they are not. There is no resemblance between the different faces. The weights are completely different. And the text itself is radically different sizes. So much for being able to “just pick a bunch of fonts at the same size”.

It turns out that the root cause problem is that the point size of a font is, for historical reasons, something that isn’t directly measurable from the appearance of a single line of text. It was the body size, the size of the metal bits that were used to compose a line of text. Unfortunately there may be more (or less) spacing between lines, so the body size is not necessarily the distance between consecutive baselines. From the good ship Wikipedia’s article on typefaces, “when specified in typographic sizes the height of an em-square, an invisible box which is typically a bit larger than the distance from the tallest ascender to the lowest descender, is scaled to equal the specified size.”

Ok, so let’s see how that looks here:

[Image: point size illustration]

Sure enough, the height of the em-square here is bigger than the ascender plus descender. It’s also wider than the letter M (traditionally an em was the width of a capital M). The em-square happens to match the spacing between baselines, though whether that’s the font or the render engine is hard to say.

In any event, the size of the font, here shown to be the size of an em-square, has no relationship to the height of the letters or the proportions between upper case letters and lower case ones:

[Image: metrics illustration]

I first learned about all this because the corefonts, from Microsoft, are cost-free to use, but not libre for changes or redistribution and thus cannot be shipped in Linux distributions like Debian or Fedora. As a replacement, Red Hat obtained and then generously released the Liberation font family. They were advertised as being a libre substitute for the corefonts which were “metric compatible”. Silly me, of course, assumed that they meant aesthetically compatible with each other, but of course they meant what they said — that they were individually metric compatible with the corresponding proprietary Microsoft corefont. Fair enough; they were meant as a “drop in” replacement so your presentations and word docs wouldn’t get screwed up if you switched to Linux. I have my doubts about even that, but in any event as you can see here it didn’t help matters in the relative size department:

[Image: Liberation replacement fonts]

There it’s Liberation Serif, Liberation Sans, and Liberation Mono. The problem is that again font metrics aren’t calibrated to correspond to each other: the font size (which as we’ve seen is the size of the em-square) has no relation to the cap-height, the x-height, or the ratio between them.

At best, you can hope that the height of the capital letters will be the same, but even that’s a stretch. Needless to say this is chaos for the font libraries to have to deal with; you see mis-sized text all the time in desktop applications; and meanwhile authors of web browsers haven’t got a chance when some things are specified in points, others in pixels, and relative % sizes appearing from time to time just to completely drive your poor renderer to the wall. Worst of all, we poor users trying to write documents achieve nothing but a muddy mess trying to achieve something which should be simple. I just want text that looks right!

Looking at these examples, you realize that what you want isn’t the capital letter height to be aligned (who cares!) but for the lower case letters to be aligned. In typography this is called the “x-height”, the size of the lower case letter x.

Screen writing

We happen to have a font family that was designed to look good together on screen:

[Image: Deja Vu font family]

That’s Deja Vu Serif (that almost no one seems to know about; try it as your Document font!), Deja Vu Sans (which has excellent Unicode coverage and as the default GNOME Application font is thus very widely used), and Deja Vu Sans Mono (the complementary monospaced version, also very nice, and a good choice for your terminals and source code editor). You can see the relative sizes are almost in sync; in fact, on a computer monitor at 10pt at 90-120 dpi, they are aligned, because these are screen fonts designed for low resolution display:

Deja Vu on screen

More to the point, they are clearly a family that goes together. That’s hard to come by, and as a family they are fantastic on screen. [Droid Serif, Droid Sans, Droid Mono are also nice as a family and are fine on small device screens with even lower density, but for a proper laptop or desktop screen they're a bit lacking in refinement].

But these are all screen fonts, and, frankly, just don’t look right printed at 1200 dpi on paper — look at the glyph spacing and the blockiness of the serifs. To be expected; Deja Vu (based on Bitstream Vera) was designed for readability at low res; things like crossbars on ts and fs and the ear on the g have to be placed so that they work in the limited number of pixels available to draw each glyph. The whats? Found a really nice illustration of typeface anatomy (it’s 1920×1200 though).

Printing for fun and profit

So we’re left trying to find printer fonts that go well together. They don’t have to be from the same typeface family, or originate from the same foundry. They just have to look right.

One of the things that pisses me off about conventional word processors is having to specify fonts for bloody everything. Every time you turn around you have to specify the font for a span of text, a new paragraph… want to create a new style for something? Pick your font. Again. And meanwhile you get a list with every last font installed on the system.

First things first. You don’t need eighteen fonts. You need about three.

  • A beautiful serif font for your normal text. A good one will have excellent Unicode coverage and will handle italics and bold well as a matter of course.

  • Next you need a monospaced font for program source fragments, code literals, and other computer-y stuff.

  • And finally you need a sans-serif font as a tool to indicate things like proper names.

And that’s all you should need to pick!

Of course once we’ve selected a sans font and a mono font we have to figure out different “sizes” for each such that they correspond with the serif font we’ve picked, otherwise we’ll end up with the mishmash we’ve seen above. But once that’s done, we shouldn’t have to do any more font work. In particular, there’s nothing worse than the occasional line being taller than the rest just because there happens to be a superscript or some other font present. No. There should only be one line height (leading) and now that you’ve picked a serif font, you want it to control that line spacing. You know, font size.

How do you get the sizes for sans and mono figured out? Ultimately you’ve got to print to paper and make sure you’re happy with the aesthetics of your choices, but as you are picking fonts, why not get a preview of the three fonts together so you can converge the x-heights? I’ve been working for a while on a document editor called Quill and Parchment whose rendering engine configuration offers something along these lines:

FontHeightDisplay widget

Once you’ve converged the heights of the Sans and Mono selections, you’re done! The renderer can figure out everything for you from there.

Parchment defaults with Linux Libertine and Inconsolata

Ah, there we go :)

Here we have Linux Libertine as the serif font, Liberation Sans for sans-serif spans, and Inconsolata for monospaced text. And now you can mix and match these as necessary and you know they’ll fit together. You don’t have to specify the metrics and everything else just because you happen to put a filename in the flow of your text. You tell Quill that the span of text is a filename. The default Parchment render engine is configured to use the mono font in an italicized style. And then it Just Does The Right Thing™

The font choices themselves aren’t the point. Getting the relative sizes aligned is; I could have achieved this with just the Liberation family or even the unfortunately ubiquitous Corefonts. The set above are just the defaults suggested out of the box. I think they’re pretty good together, but ultimately the whole point is that you can and will make your own choices. To assess the suitability of fonts (both individually, and to see whether they are compatible working together) you really do have to print out to paper. Laser printers don’t bleed ink like moveable type presses did, but it’s not until you look at the printed page that you can decide whether your choices are good ones. Screen previews, including the illustrations here, don’t do high resolution fonts justice. Not at all.

But you shouldn’t have to go through the hassle of selecting a suitable font and figuring out what size it should be every time you create a different semantic markup for a piece of text. Nope. That’s up to the renderer. Funny how the “styles” mechanisms in WYSIWYG word processors — and web browsers — seem to fall short in all this.

Anyway, next time you’re writing a document and battling fonts trying to make them look good together, try ignoring “font size” and instead zoom in and see if you can align the x-heights. You might be pleased with the result.
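The arithmetic behind converging x-heights is simple enough to sketch. Assume you have measured each font's x-height as a fraction of its em-square; the size that lines a second font up with your serif choice is then just a ratio. The ratios below are invented for illustration and are not measurements of any real font, and XHeightMatcher is a hypothetical name, not something from Quill or Parchment.

```java
/**
 * Sketch: pick sizes for the sans and mono fonts so that their x-heights
 * match the x-height of the serif font at its chosen size.
 *
 * XHeightMatcher is a hypothetical helper. The ratios used in main() are
 * made-up illustrative numbers, NOT measurements of real fonts.
 */
final class XHeightMatcher {
    /**
     * Size at which a font whose x-height is otherRatio of its em-square
     * has the same x-height as the reference font at referenceSize.
     */
    static double matchedSize(double referenceSize, double referenceRatio, double otherRatio) {
        if (referenceSize <= 0 || referenceRatio <= 0 || otherRatio <= 0) {
            throw new IllegalArgumentException("sizes and ratios must be positive");
        }
        // x-height = size * ratio, so solve
        // referenceSize * referenceRatio = size * otherRatio for size.
        return referenceSize * referenceRatio / otherRatio;
    }

    public static void main(String[] args) {
        double serifSize = 10.0;  // the serif font controls the line spacing
        double serifRatio = 0.43; // hypothetical x-height / em-square
        double sansRatio = 0.52;  // hypothetical
        double monoRatio = 0.46;  // hypothetical

        System.out.printf("sans: %.2f%n", matchedSize(serifSize, serifRatio, sansRatio));
        System.out.printf("mono: %.2f%n", matchedSize(serifSize, serifRatio, monoRatio));
    }
}
```

A font with a relatively tall x-height ends up at a smaller nominal size than the serif it sits beside, which is exactly why setting all three to "10pt" looks wrong.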

AfC

Use commas in FontDescriptions

Tip: always use a comma ',' when constructing a new Pango FontDescription. Otherwise, if your font family has a space in its name (as most do), the font name parser may mistakenly assume the last word of the font name is a size or variant or…

The error I was getting was that

desc = new FontDescription("Times New Roman");

was giving me "Times New" — which, since it doesn’t exist, was resulting in a fallback to the system default font "Deja Vu Sans". Shit. The fix was to use the comma separator even though I wasn’t specifying weight or size in the string:

desc = new FontDescription("Times New Roman,");

Of course, if I read the documentation I wrote years ago for our binding of pango_font_description_from_string() I’d see that I said:

The trailing ‘,’ on the family list is optional but a good idea when a font family you’re targeting includes a space in its name

so apparently I already knew this. Grr. Turns out if I’d just used setters I would have been ok too:

desc = new FontDescription();
desc.setFamily("Times New Roman");
desc.setSize(16.0);

and so on. But if you’re using the useful pango-view command line tool, be aware that the --font option is a font description string that’ll be parsed and you’ll need the comma.
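If you do want to keep building description strings, one way to avoid forgetting the comma is to never write the string by hand. This is a minimal sketch, not anything shipped in java-gnome: FontStrings and describe() are hypothetical names, and the only thing it relies on is the rule above that the family list should be terminated with a comma.

```java
/**
 * Sketch: build a Pango font description string that always terminates the
 * family name with a comma, so the last word of the family can't be taken
 * for a style or size. FontStrings and describe() are hypothetical names.
 */
final class FontStrings {
    /** Trim the family name and make sure it ends with exactly one comma. */
    private static String terminated(String family) {
        String name = family.trim();
        if (name.isEmpty()) {
            throw new IllegalArgumentException("font family must not be empty");
        }
        return name.endsWith(",") ? name : name + ",";
    }

    /** Family only, e.g. "Times New Roman" becomes "Times New Roman,". */
    static String describe(String family) {
        return terminated(family);
    }

    /** Family plus a size in points, e.g. "Times New Roman, 16". */
    static String describe(String family, int points) {
        if (points <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        return terminated(family) + " " + points;
    }

    public static void main(String[] args) {
        System.out.println(describe("Times New Roman"));
        System.out.println(describe("Times New Roman", 16));
    }
}
```

The result can be handed to new FontDescription(...) or used as the value of pango-view's --font option.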

Don’t worry. I’m not using corefonts for anything important :). Just needed it to make a screenshot.

AfC

java-gnome 4.0.18 released

This post is an extract of the release note from the NEWS file which you can read online … or in the sources from Bazaar.


java-gnome 4.0.18 (23 Dec 2010)

My compressed original is better than your uncompressed copy

This was a bug fix release. A serious crasher was occurring when you requested the underlying [org.gnome.gdk] Window backing a Widget, as is often necessary before popping up context menus. Thanks to Kenneth Prugh and Guillaume Mazoyer for their help in duplicating and isolating the problem.

Better image rendering

While we’re at it, we’ve merged work in progress offering coverage of the librsvg Scalable Vector Graphics loader. This allows you to draw an SVG image as a vector graphic to Cairo (which itself works in vector form, of course), and is a substantial improvement over just loading the .svg with gdk-pixbuf (which rasterizes the graphic to a bitmap first, of course). Load the image with Handle, then draw it with Context’s showHandle().

We’ve also added coverage of Cairo Surface’s new setMimeType(), which allows you to embed the original [ie JPEG, or to a lesser extent PNG] image in PDF output rather than just the decoded, rasterized, and very huge bitmap image that Cairo uses on screen and would otherwise have used in PDF and SVG output. So 100 kB JPEGs stay JPEGs instead of turning into 12 MB bitmaps. Yeay.
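To see where a figure like 12 MB comes from, here is a back-of-the-envelope sketch. It assumes the decoded image is held at 4 bytes per pixel (as Cairo's 32-bit image formats are), and the 2000×1500 dimensions are a made-up example rather than anything from the release; BitmapSize is a hypothetical name.

```java
/**
 * Sketch: in-memory size of a decoded image at 4 bytes per pixel.
 * BitmapSize is a hypothetical helper, not part of java-gnome.
 */
final class BitmapSize {
    /** Bytes needed to hold width x height pixels at 4 bytes each. */
    static long decodedBytes(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        // Use long arithmetic so large images don't overflow an int.
        return 4L * width * height;
    }

    public static void main(String[] args) {
        // Hypothetical photo dimensions, chosen only for illustration.
        long bytes = decodedBytes(2000, 1500);
        System.out.println(bytes + " bytes decoded");
    }
}
```

The compressed JPEG is a small fraction of that, which is the whole argument for embedding the original.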

java-gnome now depends on Cairo 1.10 and librsvg 2.32.


You can download java-gnome’s sources from ftp.gnome.org, or easily checkout a branch from ‘mainline’:

$ bzr checkout bzr://research.operationaldynamics.com/bzr/java-gnome/mainline java-gnome

though if you’re going to do that you’re best off following the instructions in the HACKING guidelines.

AfC

Which dictionary?

Here’s an interesting problem:

You have a proper name or other term your language’s dictionary doesn’t know about. You Add the word to your dictionary. Then you share a document with someone else. That word will be marked “mispelt” for them.

Well that sucks. After all, you, the author, went to the trouble of annotating that the word was correct! But if you’re using Enchant, say, and the program uses Dictionary’s add(), the word ends up in your custom word list at ~/.config/enchant/en_CA.dic or so. And that of course doesn’t get sent anywhere when you give someone else your document. Presumably that’s where Ignore comes in, but astonishingly most applications treat ignore as “ignore it for this session” and when you restart your word processor unbelievably the red squiggly is back and you’ve got to ignore it again!

So really, Add should mean “add to this document’s dictionary”, right? Which implies each document has its own dictionary. Ok, we can figure that out; Enchant has something called “personal word lists” via Enchant.requestPersonalWordList() which ought to make things simple.

Except then there’s the question of the next document you write with the same proper name; you’ll be back to having to add your name. Again. So maybe the right thing to do is to add it to both the putative document word list as well as the user’s word list. THAT seems a bit presumptive, though, and doesn’t actually help the next document which you share with someone else. Hm.

If the document’s word list is in an easy to find location (ie, right beside the document) then it would be easy to copy it from one work to the next. That has possibilities.
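As a rough sketch of what that might look like, assuming a made-up convention where the word list sits beside the document under the same name with a .dic extension. Nothing here is Enchant's API: DocumentWords, wordListFor() and the naming convention are all hypothetical, and the real dictionary lookup is left out entirely.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

/**
 * Sketch: a per-document word list consulted alongside the user's own list.
 * DocumentWords and the ".dic beside the document" convention are
 * hypothetical; this does not use or imitate Enchant's API.
 */
final class DocumentWords {
    private final Set<String> documentWords = new HashSet<>();
    private final Set<String> userWords;

    DocumentWords(Set<String> userWords) {
        this.userWords = new HashSet<>(userWords);
    }

    /** Assumed convention: "Report.xml" keeps its words in "Report.dic" beside it. */
    static Path wordListFor(Path document) {
        String name = document.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String stem = (dot > 0) ? name.substring(0, dot) : name;
        return document.resolveSibling(stem + ".dic");
    }

    /** "Add" means add to this document's list, which travels with the document. */
    void add(String word) {
        documentWords.add(word);
    }

    /** A word is acceptable if either the document or the user has vouched for it. */
    boolean isKnown(String word) {
        return documentWords.contains(word) || userWords.contains(word);
    }

    public static void main(String[] args) {
        DocumentWords words = new DocumentWords(new HashSet<>());
        words.add("Vreixo");
        System.out.println(wordListFor(Paths.get("work", "Report.xml")));
        System.out.println(words.isKnown("Vreixo"));
    }
}
```

Copying the word list from one work to the next is then just a file copy, and whoever you send the document to gets the author's annotations along with it.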

AfC

java-gnome 4.0.17 released

One of the lovely things that happens in open source is the opportunity to work with people from around the world. I was in Spain last month and took the time to go a bit off the beaten track up to A Coruña in Galicia where Vreixo Formoso Lopes lives. Vreixo was one of the original architects of java-gnome and one of the most prolific contributors to ensuring the internals and memory management work right. It was a pleasure to finally meet him!

The rest of this post is an extract of the release note from the NEWS file which you can read online … or in the sources from Bazaar.


java-gnome 4.0.17 (18 Nov 2010)

All dictionaries are equal. But some dictionaries are more equal than others!

After some 6 months of development, this release includes substantial improvements across the library. Thanks to Guillaume Mazoyer, Michael Culbertson, Douglas Goulart, Vreixo Formoso, Mauro Galli, Thijs Leibbrand, and Andrew Cowie for their contributions to the library, and also to Yaakov Selkowitz, and Alexander Boström for their updates to the build system.

Enchant Dictionaries

Improve the utility of the Enchant library by exposing functionality to test whether a dictionary exists for a given “language tag”, and to list all available dictionaries. Add speciality functions to the Internationalization class facilitating the translation of language and country names so you can present the list of available languages properly translated in the user’s language.

GTK improvements

Introduce Icon as a strongly typed class to wrap “named icons” available in an icon theme, complementing the previous coverage of “stock icons” provided by the Stock class. Add methods to DataColumn, TreeModel, Image, and Entry making these available. Also in TreeView land, Vreixo Formoso contributed a change to make DataColumnReference generic, noting that this was his “one great irritation” with java-gnome. Itch scratched, apparently. :)

A fair bit of work went into polishing coverage in various classes. We now have coverage for Adjustment’s various properties (necessary if you want to drive a scroll bar around yourself without using one built into a ScrolledWindow). We’ve also introduced a new signal in the Assistant. You can now define the behaviour of an Assistant using the ForwardPage signal with the setForwardPageCallback() method. It can help you to skip pages when you need to. When going back, the Assistant will also skip the previously skipped page. java-gnome now supports GTK+ 2.20 and introduces the new Spinner widget that can be used to indicate activity when the amount of progress is unknown. We added coverage for another utility function, this time the one that escapes text in strings so that it can be safely included when Pango markup is being used. And meanwhile if you need to ensure whatever has been copied to the clipboard is available after your application terminates, you can call Clipboard’s store().

Thread safety

Fixed a fairly serious bug in the interaction between the memory management code and the thread safety mechanism. Amazing we got away with this one so long, really. Thanks to Vreixo Formoso for helping with analysis of the crash dumps, confirming the diagnosis, and double checking the proposed solution. The problem only showed up if you were making extensive use of something like TextViews which (internal to GTK) did its drawing in a background idle handler.

Also fixed a crasher that turned up if your cursor theme didn’t have a certain named cursor. ENOTGNOME, but anyway.

More drawing

The Cairo graphics library continues to be a joy to use and we continue to make minor improvements to our coverage as people use it more. In particular, based on help from Benjamin Otte and others we’ve refined the way you create a Context in a Widget.ExposeEvent, improving efficiency and taking advantage of some of the underlying support functions more effectively.

Looking ahead

With GTK 3.0 coming closer to reality, we’re keeping close track of the activity there. GTK 3.0 is a pretty vast API and ABI break from 2.x with some fairly major changes to the way Widget sizing works, along with an overhaul of the drawing system. We’ll be updating java-gnome to meet these changes in the months to come.


You can download java-gnome’s sources from ftp.gnome.org, or easily checkout a branch from ‘mainline’:

$ bzr checkout bzr://research.operationaldynamics.com/bzr/java-gnome/mainline java-gnome

though if you’re going to do that you’re best off following the instructions in the HACKING guidelines.

AfC