Defining OS specific code in Haskell

I’ve got an ugly piece of Haskell code:

baselineContextSSL :: IO SSLContext
baselineContextSSL = do
    ctx <- SSL.context    -- a completely blank context
    contextSetDefaultCiphers ctx
#if defined __MACOSX__
    contextSetVerificationMode ctx VerifyNone
#elif defined __WIN32__
    contextSetVerificationMode ctx VerifyNone
#else
    contextSetCADirectory ctx "/etc/ssl/certs"
    contextSetVerificationMode ctx $
        VerifyPeer True True Nothing
#endif
    return ctx

this being necessary because the non-free operating systems don’t store their X.509 certificates in a place where openssl can reliably discover them. This sounds eminently solvable at lower levels, but that’s not really my immediate problem; after all, this sort of thing is what #ifdefs are for. The problem is getting an appropriate symbol defined based on which OS you’re building on.

I naively assumed there would be __LINUX__ and __MACOSX__ and __WIN32__ macros already defined by GHC because, well, that’s just the sort of wishful thinking that powers the universe.

When I asked the haskell-cafe mailing list for suggestions, Krzysztof Skrzętnicki said that I could use a conditional in my project’s .cabal file. Nice, but problematic because you’re not always building using Cabal; you might be working in ghci, you might be using a proper Makefile to build your code, etc. Then Henk-Jan van Tuyl pointed out that you can get at the Cabal logic care of Distribution.System. Hey, that’s cool! But that would imply depending on and linking the Cabal library into your production binary. That’s bad enough, but the even bigger objection is that binaries aren’t portable, so what’s the point of having a binary that asks (at runtime!) what operating system it’s on? No; I’d rather find that out at build time and then let the C pre-processor include only the relevant code.

This feels like a simple and appropriate use of CPP; even the symbol names look just about like what I would have expected (stackoverflow said so, must be true). I just need to get the right symbol defined at build time. But how?

Build Types

Running cabal install one sees all kinds of packages building and I’d definitely noticed some interesting things happen; some packages fire off what is obviously an autoconf generated ./configure script; others seem to use ghci or runghc to dynamically interpret a small Haskell program. So it’s obviously do-able, but as is often the case with Haskell it’s not immediately apparent where to get started.

Lots of libraries available on Hackage come with a top-level Setup.hs. Whenever I’d looked in one all I’d seen is:

import Distribution.Simple
main = defaultMain

which rather rapidly gave me the impression that this was a legacy of older modules, since running:

$ cabal configure
$ cabal build
$ cabal install

on a project without a Setup.hs apparently just Does The Right Thing™.

It turns out there’s a reason for this. In a project’s .cabal file, there’s a field build-type that everyone seems to define, and of course we’re told to just set this to “Simple”:

build-type:          Simple

what else would it be? Well, the answer to that is that “Simple” is not the default; “Custom” is (really? weird). And a custom build is one where Cabal will compile and invoke Setup.hs when cabal configure is called.

Ahh.

When you look in the documentation of the Cabal library (note, this is different from the cabal-install package which makes the cabal executable we end up running), Distribution.Simple indeed has defaultMain, but it has friends. The interesting one is defaultMainWithHooks, which takes a monster of a record, UserHooks, as its argument; sure enough, there are pre-conf, post-conf, pre-build, post-build, and so on; each one is a function which you can easily override.

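import Distribution.PackageDescription (PackageDescription)
import Distribution.Simple
import Distribution.Simple.LocalBuildInfo (LocalBuildInfo)
import Distribution.Simple.Setup (ConfigFlags)

main :: IO ()
main = defaultMainWithHooks $ simpleUserHooks {
       postConf = configure
    }

-- Runs once Cabal's own configure step has finished.
configure :: Args -> ConfigFlags -> PackageDescription -> LocalBuildInfo -> IO ()
configure _ _ _ _ = do
    ...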

Yay for functions as first-class objects. From there it was a simple matter to write some code in my configure function to call buildOS from Distribution.System and write out a config.h file with the necessary #define I wanted:

#define __LINUX__
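
A minimal sketch of what such a hook could look like is below. It is not the Setup.hs linked from this post: the names symbolFor and writeConfigHeader are made up for illustration, and falling back to __LINUX__ for every OS that isn't OS X or Windows is an assumption that simply mirrors the #else branch in the SSL code above.

```haskell
import Distribution.PackageDescription (PackageDescription)
import Distribution.Simple
import Distribution.Simple.LocalBuildInfo (LocalBuildInfo)
import Distribution.Simple.Setup (ConfigFlags)
import Distribution.System (OS (..), buildOS)

main :: IO ()
main = defaultMainWithHooks $ simpleUserHooks
    { postConf = writeConfigHeader
    }

-- Map Cabal's notion of the build OS onto the symbols used in this post.
-- Hypothetical helper: anything that is not OS X or Windows is treated
-- like Linux, which is an assumption rather than a rule.
symbolFor :: OS -> String
symbolFor OSX     = "__MACOSX__"
symbolFor Windows = "__WIN32__"
symbolFor _       = "__LINUX__"

-- Post-configure hook: write config.h into the current directory, which
-- is the project root when cabal runs Setup.hs.
writeConfigHeader :: Args -> ConfigFlags -> PackageDescription -> LocalBuildInfo -> IO ()
writeConfigHeader _ _ _ _ =
    writeFile "config.h" ("#define " ++ symbolFor buildOS ++ "\n")
```

Keeping the OS-to-symbol mapping in a pure function separate from the hook makes it easy to poke at in ghci without running a whole configure step.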

Include Paths

We’re not quite done yet. As soon as you want to #include something, you have to start caring about include paths. It would appear the compiler, by default, looks in the same directory as the file it is compiling. Fair enough, but I don’t really want to put config.h somewhere deep in the src/Network/Http/ tree; I want to put it in the project’s top level directory, commonly known as ., also known as “where I’m editing and running everything from”. So you have to add a -I"." option to ghc invocations in your Makefiles, and your .cabal file needs to be told in its own way:

library
  include-dirs:      .

and as for ghci, it turns out you can put a .ghci in your sources:

:set -XOverloadedStrings
:set +m
:set -isrc:tests
:set -I.

and if you put that in your project root directory, running ghci there will work without having to specify all that tedious nonsense on the command line.

The final catch is that you have to be very specific about where you put the #include directive in your source file. Put it at the top? Won’t work. After the pragmas? You’d think. Following the module statement? Nope. It would appear that it strictly has to go after the imports and before any real code. Line 65:

 47 import Data.Monoid (Monoid (..), (<>))
 48 import qualified Data.Text as T
 49 import qualified Data.Text.Encoding as T
 50 import Data.Typeable (Typeable)
 51 import GHC.Exts
 52 import GHC.Word (Word8 (..))
 53 import Network.URI (URI (..), URIAuth (..), parseURI)
 64 
 65 #include "config.h"
 66 
 67 type URL = ByteString
 68 
 69 --
 70 -- | Given a URL, work out whether it is normal or secure, and then
 71 -- open the connection to the webserver including setting the
 72 -- appropriate default port if one was not specified in the URL. This
 73 -- is what powers the convenience API, but you may find it useful in
 74 -- composing your own similar functions.
 75 --
 76 establishConnection :: URL -> IO (Connection)
 77 establishConnection r' = do
 78     ...

You get the idea.

Choices

Several people wrote to discourage this practice, arguing that conditional code is the wrong approach to portability. I disagree, but you may well have a simple piece of code being run dynamically that would do well enough just making the choice at runtime; I’d be more comfortable with that if the OS algebraic data type was in base somewhere; linking Cabal in seems rather heavy. Others tried to say that needing to do this at all is openssl’s fault and that I should be using something else. Perhaps, and I don’t doubt that we’ll give tls a try at some point. But for now, openssl is battle-tested crypto, and the HsOpenSSL package is a nice language binding that is heavily used in production.
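
For comparison, the runtime choice doesn't strictly require Cabal. base's System.Info exports os, though only as a plain String and not as an algebraic data type. A rough sketch of the runtime version, assuming the same policy as the SSL code above, could look like this; the Verification type and verificationFor are hypothetical names, and the strings "darwin" and "mingw32" are what GHC reports on OS X and Windows respectively.

```haskell
import System.Info (os)

-- Hypothetical stand-in for the decision the #if block makes at build time.
data Verification
    = SkipVerification          -- no usable certificate store to point at
    | VerifyAgainst FilePath    -- verify peers against this CA directory
    deriving (Eq, Show)

-- Decide from the OS name string; anything unrecognised is treated like
-- Linux, mirroring the #else branch.
verificationFor :: String -> Verification
verificationFor "darwin"  = SkipVerification
verificationFor "mingw32" = SkipVerification
verificationFor _         = VerifyAgainst "/etc/ssl/certs"

main :: IO ()
main = print (verificationFor os)
```

Matching on strings is exactly the sort of thing that makes the build-time approach more appealing: a typo in "mingw32" compiles happily and silently lands in the fallback branch.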

Meanwhile, I think I’ve come up with a nice technique for defining things to drive conditional compilation. You can see the complete Setup.hs I wrote here; it figures out which platform you’re on and writes the .h file accordingly. If you need simple portability conditionals, you might give it a try.

AfC