http-streams 0.5.0 released

I’ve done some internal work on my http-streams package. Quite a number of bug fixes, which I’m pleased about, but two significant qualitative improvements as well.

First we have rewritten the “chunked” transfer encoding logic. The existing code would accept chunks from the server and feed them up to the user as received. The problem with this is that the server is the one deciding the chunk size, which means you can end up being handed multi-megabyte ByteStrings. Not exactly streaming I/O. So I’ve reworked that logic so that it yields pieces of at most 32 kB until it has iterated through the supplied chunk, then moves on to the next. A slight increase in internal code complexity, but much smoother streaming behaviour for people using the library.
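The re-chunking itself boils down to splitting an oversized ByteString into bounded pieces before yielding them downstream. A minimal sketch of that splitting step (the standalone function and its name `pieces` are mine, not the library’s actual internals):

```haskell
import qualified Data.ByteString as S

-- Maximum size of a piece we hand to the user: 32 kB.
maxPiece :: Int
maxPiece = 32 * 1024

-- Split a (possibly multi-megabyte) chunk from the server into
-- pieces no larger than maxPiece, to be yielded one at a time.
pieces :: S.ByteString -> [S.ByteString]
pieces b
    | S.null b  = []
    | otherwise = let (x, rest) = S.splitAt maxPiece b
                  in  x : pieces rest
```

In the library proper this happens inside the chunked-decoding InputStream rather than as a list, but the bound is the same.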

Secondly I’ve brought in the highly tuned HTTP header parsing code from Gregory Collins’s new snap-server. Our parser was already pretty fast, but this gave us a 13% performance improvement. Nice.

We changed the types in the openConnection functions; Hostname and Port are ByteString and Word16 now, so there’s an API version bump to 0.5.0. Literals will continue to work so most people shouldn’t be affected.


http-streams 0.4.0 released

Quick update to http-streams, making a requested API change to the signature of the buildRequest function as well as pushing out some bug fixes and performance improvements.

You no longer need to pass the Connection object when composing a Request, meaning you can prepare it before opening the connection to the target web server. The required HTTP 1.1 Host: header is added when sendRequest is called, at the point the request is written to the server. If you need to see the value of the Host: field that will be sent (i.e. when debugging) you can call the getHostname function.
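In other words, under the new API the Request can be composed before any connection exists; a sketch (hostname and path here are illustrative, not a real endpoint):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.Http.Client

main :: IO ()
main = do
    -- The Request is composed up front, with no Connection in sight.
    q <- buildRequest $ do
        http GET "/status"
        setAccept "text/plain"

    -- Only now do we open the connection; the Host: header is
    -- filled in when sendRequest writes the request out.
    c <- openConnection "www.example.com" 80
    sendRequest c q emptyBody
    receiveResponse c debugHandler
    closeConnection c
```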

I’ve added an “API Change Log” to the README file on GitHub, and the blog post introducing http-streams has been updated to reflect the signature change.

Thanks to Jeffrey Chu for his contributions and to Gregory Collins for his advice on performance improvement; this second release is about 9% faster than the original.


An HTTP client in Haskell using io-streams

An HTTP client

I’m pleased to announce http-streams, an HTTP client library for Haskell, using the Snap Framework’s new io-streams library to handle the streaming I/O.

Back and there again

I’ve been doing a lot of work lately using Haskell to reprocess data from various back-end web services and then present fragments of that information in the specific form needed to drive client-side visualizations. Nothing unusual about that; on one edge of your program you have a web server, and on the other you’re making onward calls to further servers. Another project involves meshes of agents talking to other agents; again, nothing extreme, just a server daemon responding to requests and in turn making its own requests of others. Fairly common in any environment built on (and in turn offering) RESTful APIs.

I’m doing my HTTP server work with the fantastic Snap Framework; it’s a lightweight and decently performing web server library with a nice API. To go with that I needed a web client library, but there the choices are less inspiring.

Those working in Yesod have the powerful http-conduit package, but I didn’t find it all that easy to work with. I soon found myself writing a wrapper around it just so I could use types and an API that made more sense to me.

Because I was happily writing web apps using Snap, I thought it would be cool to write a client library that would use the same types. After much discussion with Gregory Collins and others, it became clear that trying to reuse the Request and Response types from snap-core wasn’t going to be possible. But there was a significant amount of code in Snap’s test suite, notably almost an entire HTTP client implementation. Having used Snap.Test to build test code for some of my own web service APIs, I knew there was some useful material there, and that gave me a useful starting point.

Streaming I/O

One of the exciting things about Haskell is the collaborative way that boundaries are pushed. From the beginnings in iteratee and enumerator, the development of streaming I/O libraries such as conduit and pipes has been phenomenal.

The Snap web server made heavy use of the original iteratee/enumerator paradigm; when I talked to some of the contributors in #snapframework about whether they were planning to upgrade to one of the newer streaming I/O libraries, I discovered from Greg that he and Gabriel were quietly working on a re-write of the internals of the server, based on their experiences doing heavy I/O in production.

This new library is io-streams, aimed at being a pragmatic implementation of some of the impressive theoretical work from the other streaming libraries. io-streams’s design makes the assumption that you’re working in … IO, which seems to have allowed them to make some significant optimizations. The API is really clean, and my early benchmarks were promising indeed.

That was when I realized that being compatible with Snap was less about the Request and Response types and far more about being able to smoothly pass through request and response bodies — in other words, tightly integrating with the streaming I/O library used to power the web server.

http-streams, then, is an HTTP client library built to leverage and in turn expose an API based on the capabilities of io-streams.

A simple example

We’ll make a GET request of (which is just a tiny web app which returns the current UTC time). The basic http-streams API is pretty straightforward:

11 {-# LANGUAGE OverloadedStrings #-}
24 import System.IO.Streams (InputStream, OutputStream, stdout)
25 import qualified System.IO.Streams as Streams
26 import Network.Http.Client
28 main :: IO ()
29 main = do
30     c <- openConnection "" 58080
32     q <- buildRequest $ do
33         http GET "/time"
34         setAccept "text/plain"
36     sendRequest c q emptyBody
38     receiveResponse c (\p i -> do
39         Streams.connect i stdout)
41     closeConnection c

which results in

Sun 24 Feb 13, 11:57:10.765Z

Open connection

Going through that in a bit more detail, given that single import and some code running in IO, we start by opening a connection to the appropriate host and port:

30     c <- openConnection "" 58080

Create Request object

Then you can build up the request you need:

32     q <- buildRequest $ do
33         http GET "/time"
34         setAccept "text/plain"

that happens in a nice little state monad called RequestBuilder, with a number of simple functions to set various headers.
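The other setters follow the same pattern; a sketch with a few of them (treat the exact combination as illustrative, and the request-id header value is made up):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.Http.Client

buildJson :: IO Request
buildJson =
    buildRequest $ do
        http POST "/resource"
        setContentType "application/json"
        setAccept "application/json"
        -- arbitrary headers can be set by name
        setHeader "X-Request-Id" "42"
```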

Having built the Request object, we can have a look at what the outbound request will look like over the wire, if you’re interested. Doing:

35     putStr $ show q

would have printed out:

GET /time HTTP/1.1
User-Agent: http-streams/0.3.0
Accept-Encoding: gzip
Accept: text/plain

Send request

Making the request is a simple call to sendRequest. It takes the Connection, a Request object, and a function of type

   (OutputStream Builder -> IO α)

which is where we start seeing the System.IO.Streams types from io-streams. If you’re doing a PUT or POST you write a function where you are handed the OutputStream and can write whatever content you want to it. Here, however, we’re just doing a normal GET request which by definition has no request body so we can use emptyBody, a predefined function of that type which simply returns without sending any body content. So:

36     sendRequest c q emptyBody

gets us what we want. If we were doing a PUT or POST with a request body, we’d write to the OutputStream in our body function. It’s an OutputStream of Builders as a fairly significant optimization: the library will end up chunking and sending over an underlying OutputStream ByteString which is wrapped around the socket, but building up the ByteString(s) first in a Builder reduces allocation overhead when smacking together all the small strings that the request headers are composed of. Taken together, it often means a request will go out in a single sendto(2) system call.
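For completeness, a body function for a PUT might look like this sketch (the payload is invented; fromByteString comes from the blaze-builder package that io-streams uses for Builder):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Blaze.ByteString.Builder (Builder, fromByteString)
import System.IO.Streams (OutputStream)
import qualified System.IO.Streams as Streams

-- An entity-body function of the type sendRequest expects:
-- we are handed the OutputStream and write our content to it.
sendBody :: OutputStream Builder -> IO ()
sendBody o =
    Streams.write (Just (fromByteString "Hello, server!\n")) o
```

You’d then call `sendRequest c q sendBody` in place of `emptyBody`.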

Read response

To read the reply from the server you make a call to receiveResponse. Like sendRequest, you pass the Connection and a function to handle the entity body, this time one which will read the response bytes. Its type is

   (Response -> InputStream ByteString -> IO β)

This is where things get interesting. We can use the Response object to find out the status code of the response, read various headers, and deal with the reply accordingly. Perhaps all we care about is the status code:

42 statusHandler :: Response -> InputStream ByteString -> IO ()
43 statusHandler p i = do
44     case getStatusCode p of
45         200 -> return ()
46         _   -> error "Bad server!"

The response body is available through the InputStream, which is where we take advantage of the streaming I/O coming down from the server. For instance, if you didn’t trust the server’s Content-Length header and wanted to count the length of the response yourself:

42 countHandler :: Response -> InputStream ByteString -> IO Int
43 countHandler p i1 = do
44     go 0 i1
45   where
46     go !acc i = do
47         xm <- Streams.read i
48         case xm of
49             Just x  -> go (acc + S.length x) i
50             Nothing -> return acc

Ok, that’s pretty contrived, but it shows the basic idea: when you read from an InputStream a, you get a sequence of Maybe a; when you get Nothing the input is finished. Realistic usage of io-streams is a bit more idiomatic; the library offers a large range of functions for manipulating streams, many of which are wrappers to build up more refined streams from lower-level raw ones. In this case, we could do the counting trick using countInput, which gives you an action to tell you how many bytes it saw:

42 countHandler2 p i1 = do
43     (i2, getCount) <- Streams.countInput i1
45     Streams.skipToEof i2
47     len <- getCount
48     return len

For our example, however, we don’t need anything nearly so fancy; you can of course use the in-line lambda function we showed originally. If you also wanted to spit the response headers out to stdout, Response also has a useful Show instance.

38     receiveResponse c (\p i -> do
39         putStr $ show p
40         Streams.connect i stdout)

which is, incidentally, exactly what the predefined debugHandler function does:

38     receiveResponse c debugHandler

either way, when we run this code, it will print out:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Sun, 24 Feb 2013 11:57:10 GMT
Server: Snap/
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Type: text/plain

Sun 24 Feb 13, 11:57:10.765Z

Obviously you don’t normally need to print the headers like that, but they can certainly be useful for testing.

Close connection

Finally we close the connection to the server:

41     closeConnection c

And that’s it!

More advanced modes of operation are supported. You can reuse the same connection, of course, and you can also pipeline requests [sending a series of requests followed by reading the corresponding responses in order]. And meanwhile the library goes to some trouble to make sure you don’t violate the invariants of HTTP; you can’t read more bytes than the response contains, but if you read less than the length of the response, the remainder of the response will be consumed for you.
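Reusing a connection is just a matter of sending and receiving more than once before closing; a sketch (hostname and paths invented):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.Http.Client
import System.IO.Streams (stdout)
import qualified System.IO.Streams as Streams

twoRequests :: IO ()
twoRequests = do
    c <- openConnection "www.example.com" 80

    q1 <- buildRequest $ http GET "/first"
    sendRequest c q1 emptyBody
    receiveResponse c (\_ i -> Streams.connect i stdout)

    -- Same connection, second request/response pair.
    q2 <- buildRequest $ http GET "/second"
    sendRequest c q2 emptyBody
    receiveResponse c (\_ i -> Streams.connect i stdout)

    closeConnection c
```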

Don’t forget to use the conveniences before you go

The above is simple, and if you need to refine anything about the request then you’re encouraged to use the underlying API directly. However, as often as not you just need to make a request of a URL and grab the response. Ok:

61 main :: IO ()
62 main = do
63     x <- get "" concatHandler
64     S.putStrLn x

The get function is just a wrapper around composing a GET request using the basic API, and concatHandler is a utility handler that takes the entire response body and returns it as a single ByteString — which somewhat defeats the purpose of “streaming” I/O, but often that’s all you want.

There are put and post convenience functions as well. They take a function for specifying the request body and a handler function for the response. For example:

66     put "" (fileBody "fozzie.jpg") handler

this time using fileBody, another of the pre-defined entity body functions.

Finally, for the ever-present not-going-to-die-anytime-soon application/x-www-form-urlencoded POST request — everyone’s favourite — we have postForm:

67     postForm "" [("name","Kermit"),("role","Stagehand")] handler

Secure connections

I’ve also completely neglected to mention until now SSL support and error handling. Secure connections are supported using openssl; if you’re working in the convenience API you can just request an https:// URL as shown above, while in the underlying API you call openConnectionSSL instead of openConnection. As for error handling, a major feature of io-streams is that you leverage the existing Control.Exception mechanisms from base; the short version is that you can just wrap bracket around the whole thing for any exception handling you might need. That’s what the convenience functions do, and there’s a withConnection function which automates this for you if you want.
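With withConnection, the earlier example becomes exception-safe without any explicit bracket of our own; a sketch (hostname invented):

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Network.Http.Client
import System.IO.Streams (stdout)
import qualified System.IO.Streams as Streams

main :: IO ()
main =
    withConnection (openConnection "www.example.com" 80) $ \c -> do
        q <- buildRequest $ do
            http GET "/time"
            setAccept "text/plain"
        sendRequest c q emptyBody
        receiveResponse c (\_ i -> Streams.connect i stdout)
        -- no closeConnection needed: withConnection cleans up,
        -- even if an exception is thrown
```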


I’m pretty happy with the http-streams API at this point, and it’s essentially feature complete. A fair bit of profiling has been done, and the code is reasonably sound. Benchmarks against other HTTP clients are favourable.

After a few years working in Haskell this is my first go at implementing a library as opposed to just writing applications. There’s a lot I’ve had to learn about writing good library code, and I’ve really appreciated working with Gregory Collins as we’ve fleshed out this API together. Thanks also to Erik de Castro Lopo, Joey Hess, Johan Tibell, and Herbert Valerio Riedel for their review and comments.

You can find the API documentation for Network.Http.Client here (until Hackage generates the docs) and the source code at GitHub.



  1. Code snippets updated to reflect API change made to buildRequest as of v0.4.0. You no longer need to pass the Connection object when building a Request.

Defining OS specific code in Haskell

I’ve got an ugly piece of Haskell code:

213 baselineContextSSL :: IO SSLContext
214 baselineContextSSL = do
215     ctx <- SSL.context    -- a completely blank context
216     contextSetDefaultCiphers ctx
217 #if defined __MACOSX__
218     contextSetVerificationMode ctx VerifyNone
219 #elif defined __WIN32__
220     contextSetVerificationMode ctx VerifyNone
221 #else
222     contextSetCADirectory ctx "/etc/ssl/certs"
223     contextSetVerificationMode ctx $
224         VerifyPeer True True Nothing
225 #endif
226     return ctx

this being necessary because the non-free operating systems don’t store their X.509 certificates in a place that openssl can reliably discover them. This sounds eminently solvable at lower levels, but that’s not really my immediate problem; after all, this sort of thing is what #ifdefs are for. The problem is getting an appropriate symbol defined based on what OS you’re building on.

I naively assumed there would be __LINUX__ and __MACOSX__ and __WIN32__ macros already defined by GHC because, well, that’s just the sort of wishful thinking that powers the universe.

When I asked the haskell-cafe mailing list for suggestions, Krzysztof Skrzętnicki said that I could use in my project’s .cabal file. Nice, but problematic because you’re not always building using Cabal; you might be working in ghci, you might be using a proper Makefile to build your code, etc. Then Henk-Jan van Tuyl pointed out that you can get at the Cabal logic care of Distribution.System. Hey, that’s cool! But that would imply depending on and linking the Cabal library into your production binary. That’s bad enough, but the even bigger objection is that binaries aren’t portable, so what’s the point of having a binary that — at runtime! — asks what operating system it’s on? No; I’d rather find that out at build time and then let the C pre-processor include only the relevant code.

This feels simple and an appropriate use of CPP; even the symbol names look just about like what I would have expected (stackoverflow said so, must be true). Just need to get the right symbol defined at build time. But how?

Build Types

Running cabal install one sees all kinds of packages building, and I’d definitely noticed some interesting things happening: some packages fire off what is obviously an autoconf-generated ./configure script; others seem to use ghci or runghc to dynamically interpret a small Haskell program. So it’s obviously do-able, but as is often the case with Haskell it’s not immediately apparent where to get started.

Lots of libraries available on Hackage come with a top-level Setup.hs. Whenever I’d looked in one all I’d seen is:

  1 import Distribution.Simple
  2 main = defaultMain

which rather rapidly gave me the impression that this was a legacy of older modules, since running:

$ cabal configure
$ cabal build
$ cabal install

on a project without a Setup.hs apparently just Does The Right Thing™.

It turns out there’s a reason for this. In a project’s .cabal file, there’s a field build-type that everyone seems to define, and of course we’re told to just set this to “Simple”:

 27 build-type:          Simple

what else would it be? Well, the answer to that is that “Simple” is not the default; “Custom” is (really? weird). And a custom build is one where Cabal will compile and invoke Setup.hs when cabal configure is called.


When you look in the documentation of the Cabal library (note, this is different from the cabal-install package which makes the cabal executable we end up running) Distribution.Simple indeed has defaultMain but it has friends. The interesting one is defaultMainWithHooks which takes this monster as its argument; sure enough, there are pre-conf, post-conf, pre-build, post-build, and so on; each one is a function which you can easily override.

 20 main :: IO ()
 21 main = defaultMainWithHooks $ simpleUserHooks {
 22        postConf = configure
 23     }
 25 configure :: Args -> ConfigFlags -> PackageDescription -> LocalBuildInfo -> IO ()
 26 configure _ _ _ _ = do
 27     ...

yay for functions as first-class objects. From there it was a simple matter to write some code in my configure function to call Distribution.System’s buildOS and write out a config.h file with the necessary #define I wanted:

  1 #define __LINUX__
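The hook that writes that file can be quite small; a sketch of the idea (the function name is mine, the actual Setup.hs in the repository differs, and writing to the project root is an assumption):

```haskell
import Distribution.System (OS (..), buildOS)

-- Map the build platform to the CPP symbol we want defined,
-- then write config.h for the pre-processor to find.
writePlatformHeader :: IO ()
writePlatformHeader = do
    let symbol = case buildOS of
            OSX     -> "__MACOSX__"
            Windows -> "__WIN32__"
            _       -> "__LINUX__"
    writeFile "config.h" ("#define " ++ symbol ++ "\n")
```

You’d call this from the configure function hooked into postConf above.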

Include Paths

We’re not quite done yet. As soon as you want to #include something, you have to start caring about include paths. It would appear the compiler, by default, looks in the same directory as the file it is compiling. Fair enough, but I don’t really want to put config.h somewhere deep in the src/Network/Http/ tree; I want to put it in the project’s top level directory, commonly known as ., also known as “where I’m editing and running everything from”. So you have to add a -I"." option to ghc invocations in your Makefiles, and your .cabal file needs to be told in its own way:

 61 library
 62   include-dirs:      .

and as for ghci, it turns out you can put a .ghci in your sources:

  1 :set -XOverloadedStrings
  2 :set +m
  3 :set -isrc:tests
  4 :set -I.

and if you put that in your project root directory, running ghci there will work without having to specify all that tedious nonsense on the command line.

The final catch is that you have to be very specific about where you put the #include directive in your source file. Put it at the top? Won’t work. After the pragmas? You’d think. Following the module statement? Nope. It would appear that it strictly has to go after the imports and before any real code. Line 65:

 47 import Data.Monoid (Monoid (..), (<>))
 48 import qualified Data.Text as T
 49 import qualified Data.Text.Encoding as T
 50 import Data.Typeable (Typeable)
 51 import GHC.Exts
 52 import GHC.Word (Word8 (..))
 53 import Network.URI (URI (..), URIAuth (..), parseURI)
 65 #include "config.h"
 67 type URL = ByteString
 69 --
 70 -- | Given a URL, work out whether it is normal or secure, and then
 71 -- open the connection to the webserver including setting the
 72 -- appropriate default port if one was not specified in the URL. This
 73 -- is what powers the convenience API, but you may find it useful in
 74 -- composing your own similar functions.
 75 --
 76 establishConnection :: URL -> IO (Connection)
 77 establishConnection r' = do
 78     ...

You get the idea.


Several people wrote to discourage this practice, arguing that conditional code is the wrong approach to portability. I disagree, but you may well have a simple piece of code being run dynamically that would do well enough just making the choice at runtime; I’d be more comfortable with that if the OS algebraic data type was in base somewhere; linking Cabal in seems rather heavy. Others tried to say that needing to do this at all is openssl’s fault and that I should be using something else. Perhaps, and I don’t doubt that we’ll give tls a try at some point. But for now, openssl is battle-tested crypto and the hsopenssl package is a nice language binding and heavily used in production.

Meanwhile I think I’ve come up with a nice technique for defining things to drive conditional compilation. You can see the complete Setup.hs I wrote here; it figures out which platform you’re on and writes the .h file accordingly. If you have need to do simple portability conditionals, you might give it a try.


Integrating Vim and GPG

Quite frequently, I need to take a quick textual note, but when the content is sensitive, even just transiently, well, some things shouldn’t be left around on disk in plain text. Now before you pipe up with “but I encrypt my home directory” keep in mind that that only protects data against being read in the event your machine is stolen; if something gets onto your system while it’s powered up and you’re logged in, the file is there to read.

So for a while my workflow there has been the following rather tedious sequence:

$ vi document.txt
$ gpg --encrypt --armour 
    -o document.asc document.txt
$ rm document.txt

and later on, to view or edit the file,

$ gpg --decrypt -o document.txt document.asc 
$ view document.txt
$ rm document.txt

(yes yes, I could use default behaviour for a few things there, but GPG has a bad habit of doing things that you’re not expecting; applying the principle of least surprise seems a reasonable defensive measure, but fine, ok

$ gpg < document.asc

indeed works. Pedants, the lot of you).

Obviously this is tedious, and worse, error prone; don’t be overwriting the wrong file, now. Far more serious, you have the plain text file sitting around while you’re working on it, which from an operational security standpoint is completely unacceptable.

vim plugin

I began to wonder if there was a better way of doing this, and sure enough, via the voluminous Vim website I eventually found my way to this delightful gem by James McCoy.

Since it might not be obvious, to install it you can do the following: grab a copy of the code,

$ cd ~/src/
$ mkdir vim-gnupg
$ cd vim-gnupg/
$ git clone git:// github
$ cd github/
$ cd plugin/
$ ls

Where you will see one gnupg.vim. To make Vim use it, you need to put it somewhere Vim will see it, so symlink it into your home directory:

$ mkdir ~/.vim
$ mkdir ~/.vim/plugin
$ cd ~/.vim/plugin/
$ ln -s ~/src/vim-gnupg/github/plugin/gnupg.vim .

Of course have a look at what’s in that file; this is crypto and it’s important to have confidence that the implementation is sane. Turns out that the gnupg.vim plugin is “just” Vim configuration commands, though there are some pretty amazing contortions. People give Emacs a bad rap for complexity, but whoa. :). The fact you can do all that in Vim is, er, staggering.

Anyway, after all that, it Just Works™. I give my filename a .asc suffix, and ta-da:

$ vi document.asc

the plugin decrypts, lets me edit clear text in memory, and then re-encrypts before writing back to disk. Nice! For a new file, it prompts for the target address (which is one’s own email for personal use) and then it’s on its way. [If you’re instead using symmetrical encryption, I see no way around creating an empty file with gpg first, but other than that, it works as you’d expect]. Doing all of this on a GNOME 3 system, you already have a gpg-agent running, so you get all the sexy entry dialogs and proper passphrase caching.

I’m hoping a few people in-the-know will have a look at this and vet that the plugin is doing the right thing, but all in all this seems a rather promising solution for quickly editing encrypted files.

Now if we can just convince Gedit to do the same.


java-gnome 4.1.2 released

This post is an extract of the release note from the NEWS file which you can read online … or in the sources from Bazaar.

java-gnome 4.1.2 (30 Aug 2012)

Applications don’t stand idly by.

After a bit of a break, we’re back with a second release in the 4.1 series covering GNOME 3 and its libraries.

Application for Unique

The significant change in this release is the introduction of GtkApplication, the new mechanism providing for unique instances of applications. This replaces the use of libunique for this purpose, which GNOME has deprecated and asked us to remove.

Thanks to Guillaume Mazoyer for having done the grunt work figuring out how the underlying GApplication mechanism worked. Our coverage begins in the Application class.

Idle time

The new Application coverage doesn’t work with java-gnome’s multi-thread safety, because GTK itself is no longer going to be thread safe. This is a huge step backward, but has been coming for a while, and despite our intense disappointment about it all, java-gnome will now be like every other GUI toolkit out there: not thread safe.

If you’re working from another thread and need to update your GTK widgets, you must do so from within the main loop. To get there, you add an idle handler which will get a callback from the main thread at some future point. We’ve exposed that as Glib.idleAdd(); you put your callback in an instance of the Handler interface.

As with signal handlers, you have to be careful to return from your callback as soon as possible; you’re blocking the main loop while that code is running.

Miscellaneous improvements

Other than this, we’ve accumulated a number of fixes and improvements over the past months. Improvements to radio buttons, coverage of GtkSwitch, fixes to Assistant, preliminary treatment of StyleContext, and improvements to SourceView, FileChooser, and more. Compliments to Guillaume Mazoyer, Georgios Migdos, and Alexander Boström for their contributions.

java-gnome builds correctly when using Java 7. The minimum supported version of the runtime is Java 6. This release depends on GTK 3.4.


You can download java-gnome’s sources from, or easily check out a branch from mainline:

$ bzr checkout bzr:// java-gnome

though if you’re going to do that you’re best off following the instructions in the HACKING guidelines.


Testing RESTful APIs the not-quite-as-hard way

Last week I wrote briefly about using wget and curl to test RESTful interfaces. A couple of people at CERN wrote in to suggest I look at a tool they were quite happy with, called httpie.

I’m impressed. It seems to strike a lovely balance between expressiveness and simplicity. What’s especially brilliant is that it’s written for the common case of needing to customize headers and set specific parameters; you can do it straight off the command line. For what I was doing last week:

$ http GET http://localhost:8000/muppet/6 Accept:application/json

sets the Accept header in your request; sending data is unbelievably easy. Want to post to a form? -f gets you url encoding, and meanwhile you just set parameters on the command line:

$ http -f POST http://localhost:8000/ name="Kermit" title="le Frog"
Accept: */*
Accept-Encoding: gzip
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Host: localhost:8000
User-Agent: HTTPie/0.2.7



If you’re sending JSON it does things like set the Content-Type and Accept headers to what they should be by simply specifying -j (which sensibly is the default if you POST or PUT and have name=value pairs). And, -v gets you both request and response headers; if you’re testing at this level you usually want to see both. Good show.

$ http -v -j GET http://localhost:8000/muppet/6
GET /muppet/6 HTTP/1.1
Accept: application/json
Accept-Encoding: gzip
Host: localhost:8000
User-Agent: HTTPie/0.2.7

HTTP/1.1 200 OK
Cache-Control: max-age=42
Content-Encoding: gzip
Content-Type: application/json
Date: Thu, 09 Aug 2012 03:52:27 GMT
Server: Snap/0.9.1
Transfer-Encoding: chunked
Vary: Accept-Encoding

{
    "name": "Fozzie",
    "title": "Bear"
}

Speaking of bears, I’m afraid to say it turned out to be quite the bear getting httpie installed on Ubuntu. I had to backport pygments, requests, python-oauthlib, and pycrypto from Quantal to Precise, and meanwhile the httpie package in Quantal was only 0.1.6; upstream is at 0.2.7 and moving at a rapid clip. I finally managed to get through dependency hell; if you want to try httpie you can add my network tools PPA as ppa:afcowie/network. I had to make one change to httpie: the default compression header in python-requests is

Accept-Encoding: identity, deflate, compress, gzip

which is a bit silly; for one thing, if the server isn’t willing to use any of the encodings then it’ll just respond with a normal uncompressed entity, so you don’t need identity. More importantly, listing deflate and compress before gzip is inadvisable; some servers interpret the order encodings are specified in as an order of preference, and lord knows the intersecting set of servers and clients that actually get deflate right is vanishingly small. So,

Accept-Encoding: gzip

seems more than sufficient as a default; you can always change it on the command line if you have to for testing. Full documentation is at github; that said, once it’s installed, http --help will tell you everything you’d like to know.


Testing RESTful APIs the hard way

RESTful APIs tend to be written for use by other programs, but sometimes you just want to do some testing from the command line. This has a surprising number of gotchas; using curl or wget is harder than it should be.


Wget is the old standby, right? Does everything you’d ever want it to. Bit of minor tweaking to get it not to blab on stdout about what it’s resolving (-q) and meanwhile telling it to just print the entity retrieved to stdout rather than saving it to a file (-O -) is easy enough. Finally, I generally like to see the response headers from the server I’m talking to (-S) so as to check that caching and entity tags are being set correctly:

$ wget -S -q -O -
  HTTP/1.1 200 OK
  Transfer-Encoding: chunked
  Date: Fri, 27 Jul 2012 04:49:17 GMT
  Content-Type: text/plain
Hello world

So far so good.

The thing is, when doing RESTful work all you’re really doing is just exercising the HTTP spec, admittedly somewhat adroitly. So you need to be able to indicate things like the media type you’re looking for. Strangely, there’s no command line option offered by Wget for that; you have to specify the header manually:

$ wget -S -q --header "Accept: application/json" -O -
  HTTP/1.1 200 OK
  Date: Fri, 27 Jul 2012 04:55:50 GMT
  Cache-Control: max-age=42
  Content-Type: application/json
  Content-Length: 27
{
    "text": "Hello world"
}

Cumbersome, but that’s what we wanted. Great.

Now to update. This web service wants you to use HTTP PUT to change an existing resource. So we’ll just figure out how to do that. Reading the man page. Hm, nope; Wget is a downloader. Ok, that’s what it said it was, but I’d really come to think of it as a general purpose tool; it does support sending form data up in a POST request with its --post-file option. I figured PUT would just be lurking in a hidden corner. Silly me.


Ok, how about Curl? Doing a GET is dead easy. Turn on response headers for diagnostic purposes (-i); Curl writes to stdout by default, so:

$ curl -i http://localhost:8000/resource/1
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Fri, 27 Jul 2012 05:11:00 GMT
Content-Type: text/plain

Hello world

but yup, we’ve still got to mess around manually supplying the MIME type we want; at least the option (-H) is a bit tighter:

$ curl -i -H "Accept: application/json" http://localhost:8000/resource/1
HTTP/1.1 200 OK
Date: Fri, 27 Jul 2012 05:12:32 GMT
Cache-Control: max-age=42
Content-Type: application/json
Content-Length: 27

{
  "text": "Hello world"
}

Good start. Ok, what about our update? It’s not obvious from the curl man page, but to PUT data with curl you have to manually specify the HTTP method to be used (-X) and then (it turns out) use the same -d parameter as you would if you were transmitting with POST:

$ curl -X PUT -d name=value http://localhost:8000/resource/1

That’s nice, except that when you’re PUTting you generally are not sending “application/x-www-form-urlencoded” name/value pairs; you’re sending actual content. You can tell Curl to pull from a file:

$ curl -X PUT -d @filename.json http://localhost:8000/resource/1

or, (at last), from stdin like you’d actually expect of a proper command line program:

$ curl -X PUT -d @- http://localhost:8000/resource/1
You are here.
And then you aren't.

That was great, except that I found all my newlines getting stripped! I looked in the database, and the content was:

You are here. And then you aren't.


After writing some tests server-side to make sure it wasn’t my code or the datastore at fault, along with finally resorting to hexdump -C to find out what was going on, I finally discovered that my trusty \n weren’t being stripped; they were being converted to \r. Yes, that’s right: mind-numbingly, Curl performs newline conversion by default. Why oh why would it do that?

Anyway, it turns out that -d is short for --data-ascii; the workaround is to use --data-binary:

$ curl -X PUT --data-binary @- http://localhost:8000/resource/1

“oh,” he says, underwhelmed. But it gets better; for reasons I don’t yet understand, Curl gets confused by EOF (as indicated by typing Ctrl+D in the terminal). Not sure what’s up with that, but trusty 40-year-old cat knows what to do, so use it as a front end:

$ cat | curl -X PUT --data-binary @- http://localhost:8000/resource/1

The other thing missing is the MIME type you’re sending; if for example you’re sending a representation in JSON, you’ll need a header saying so:

$ cat filename.json |
    curl -X PUT --data-binary @- \
    -H "Content-Type: application/json" \
    http://localhost:8000/resource/1

where filename.json contains:

{
  "text": "Goodbye cruel world"
}

which is all a bit tedious. Needless to say, I’ve stuck that in a shell script called (with utmost respect to libwww-perl) PUT, taking the content type as an argument:

$ ./PUT "application/json" http://localhost:8000/resource/1 < filename.json
HTTP/1.1 204 Updated
Server: Snap/0.9.1
Date: Fri, 27 Jul 2012 04:53:53 GMT
Cache-Control: no-cache
Content-Length: 0

Ah, that’s more like it.
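For the curious, the wrapper really only needs to glue the pieces above together. The actual script isn’t shown here, so this is a sketch written as a shell function; the argument order (content type, then URL) matches the transcript above, and everything else is my own assumption:

```shell
# Sketch of the PUT helper described above (not the real script).
# Usage: PUT <content-type> <url>, with the body read from stdin.
PUT() {
    type="$1"
    url="$2"

    # cat up front sidesteps curl's confusion over EOF at the terminal;
    # --data-binary keeps newlines intact; -i echoes the response headers.
    cat | curl -i -X PUT \
        --data-binary @- \
        -H "Content-Type: ${type}" \
        "${url}"
}
```

Dropped into a file with a #!/bin/sh line, it’s invoked exactly as in the transcript: ./PUT "application/json" http://localhost:8000/resource/1 < filename.json.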


Complaining about GNOME is a new national sport

Bloody hell. GNOME hackers, can someone sit Linus down and get him sorted so he stops whinging?

I mean, we all know GNOME 3 and its Shell have some rough edges. Given that computer users since the stone age have been averse to change, it’s not surprising that people complained about GNOME 3.x being different than GNOME 2.x (actually, more to the point, being different than Windows 95. How dare they?). Even though we believe in what we’re doing, we’re up against it for having shipped a desktop that imposes workflow and user experience changes on the aforementioned change-averse hypothetical user.

[BC comic strip]

Surely, however, the negative PR impact of Linus constantly complaining about how he’s having such a hard time using GNOME exceeds what it would cost the GNOME Foundation to get somebody over to the Linux Foundation to help him out? Oh well, too late now.

Meanwhile, I certainly do agree that it is completely useless if the first thing you see on 3.4 release day is “your web browser version isn’t new enough”. It’s not just Fedora; running Epiphany 3.4 here on a current Ubuntu system, same problem. On the other hand, if you add ppa:gnome3-team/gnome3 and ppa:webupd8team/gnome3 to a system running the current Ubuntu release you can completely ditch Unity. You get an up-to-date GNOME 3.4 that works great, and thanks to webupd8 packaging extensions, you get a fair degree of customization over the experience.
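For anyone wanting to replicate that, the PPA dance is the standard add-apt-repository one (what you end up with depends, of course, on what the PPAs are carrying at the time):

```shell
# Add the two PPAs mentioned above, then pull in GNOME 3.4.
sudo add-apt-repository ppa:gnome3-team/gnome3
sudo add-apt-repository ppa:webupd8team/gnome3
sudo apt-get update
sudo apt-get dist-upgrade
```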

Yeah, there is still lots of room for improvement, and of course there are design decisions that make you scratch your head, but come on, it’s not all bad.


Upgrading to Precise

The latest release of Ubuntu, version 12.04 aka Precise, has a lot of updates we’ve been waiting on for a while — GNOME 3.4, Haskell 7.4.1, and a huge stack of bugfixes. On the desktop side, quite a number of Linux kernel vs X video modes vs suspend glitches have gone away. That’s fantastic. During most of Oneiric, my laptop was freezing and needing a hard reset at least once a day. Tedious. So I’m quite pleased to report that running Precise, Linux 3.2, gdm, and GNOME 3.4, things are vastly more stable.

Getting upgraded to Precise, however, has not been a pleasant experience.

First we’ve had unattended-upgrades overwriting any configuration stating “no automatic upgrades”. The number of non-technical friends who were set to “security updates only” calling in wondering why a “big upgrade” happened and now their computers don’t work has been staggering. Needless to say we nuked unattended-upgrades from all of our systems in a hurry, but for those people it was already too late.

Several desktop upgrades failed half-way through because dpkg suddenly had unresolved symbol errors. Fortunately I was able to work out the missing library binary and manually copy it in from another machine, which was enough to get the package system working. Hardly auspicious.

Server side was fraught with difficulty. You cannot yet upgrade from Lucid to Precise. It breaks horribly.

E: Could not perform immediate configuration on 'python-minimal'. Please
see man 5 apt.conf under APT::Immediate-Configure for details. (2)

Brutal. I tried working around it on one system by manually using dpkg, but that just led me into recursive dependency hell:

# cd /var/cache/apt/archives
# dpkg -r libc6-i686
# dpkg -i libc6_2.15-0ubuntu10_i386.deb
# dpkg -i libc-bin_2.15-0ubuntu10_i386.deb
# dpkg -i multiarch-support_2.15-0ubuntu10_i386.deb
# dpkg -i xz-utils_5.1.1alpha+20110809-3_i386.deb
# dpkg -i liblzma5_5.1.1alpha+20110809-3_i386.deb
# dpkg -i dpkg_1.16.1.2ubuntu7_i386.deb
# apt-get dist-upgrade

Huh. That actually worked on one system. But not on another; it still slammed into the python-minimal failure. For that machine I couldn’t mess around, so I gave up and did a re-install from scratch. That’s not always feasible and certainly isn’t desirable; if I wanted to be blowing systems away all the time and re-installing them I’d be running Red Hat.

Anyway, I then located this bug about being unable to upgrade (what the hell kind of QA did these people do before “releasing”?) where, very helpfully, Stefano Rivera suggested a magic incantation that gets you past this:

# apt-get install -o APT::Immediate-Configure=false -f apt python-minimal
# apt-get dist-upgrade

(I had tried something very close to this, but didn’t think of doing both apt and python-minimal. Also, it hadn’t occurred to me to use -f. Ahh. For some reason one always sees apt-get -f install on its own, never apt-get -f install whatever-package-name.)