Open source mindset meets e-books

I’ve had a Kindle for a while now.  It is invaluable when one spends hours upon hours in specialist waiting rooms, Emergency Departments, blood transfusions, chemotherapy, etc.  But one thing really bugs me: I can’t fix typographical errors, and I can’t send my patches upstream (OK, two things).

Many of the books I read on my Kindle contain typographical errors, usually the kind you would associate with poorly trained OCR, e.g. “bum” for “burn” or “back” for “book”.  This even happens in books bought and paid for via Amazon.  (From which I deduce that many publishing houses still require dead tree manuscripts from authors, and then scan them redundantly, instead of simply accepting an electronic copy of the text the author worked on for the last year or two, but that’s a different rant.)

Sometimes the flow of the story allows my eye to just skip over these glitches, but some are so jarring, so egregious, that my immersion in the story evaporates and I’m looking at black marks on white pseudo-paper.  Wrong black marks.

If this were open source software, I’d download the source, fix the bug, and email a patch upstream.  Now, with documents, what you see is so often the source, for any reasonable definition, and my OSS expectations are that I can choose the “Correct…” menu item, and fix that sucker.  I then have the option of (selfishly) “Save Locally” or (generously) “Send Correction Upstream”.  The increasing emergence of DRM-free e-books makes this even easier.

Ever since this idea occurred to me, the conspicuous lack of such a feature has become increasingly irritating.  I’ve just re-read the six Safehold books by David Weber, and I can’t wait for the next one.  However, every one of them had appalling OCR problems and formatting errors, which I’d like to fix.  And I can, but there is no mechanism to share the fixes.  Now, in the OSS world, we can already cope with “may not distribute changed versions” projects, by providing patch sets against known versions.
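
For illustration, here is a minimal sketch of what a patch against a known version could look like for a book, assuming the text can be extracted as plain UTF-8.  The function name make_patch, the file name book.txt and the sample sentences are all made up for the example; only Python’s standard difflib and hashlib modules are used.  The hash in the header pins the patch to one exact edition, so only the changed lines travel, not the whole book.

```python
import difflib
import hashlib


def make_patch(original: str, corrected: str, name: str = "book.txt") -> str:
    """Return a unified diff whose header carries the SHA-256 of the original text.

    `name` is a hypothetical label, not a real file on disk.
    """
    digest = hashlib.sha256(original.encode("utf-8")).hexdigest()
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        corrected.splitlines(keepends=True),
        fromfile=f"{name} (sha256 {digest})",
        tofile=f"{name} (corrected)",
    )
    return "".join(diff)


# Made-up sample text containing the two OCR errors mentioned earlier.
original = "The fire began to bum.\nHe closed the back and slept.\n"
corrected = "The fire began to burn.\nHe closed the book and slept.\n"

patch = make_patch(original, corrected)
print(patch)
```

Anyone holding the same edition could check the hash and apply the diff; anyone holding a different edition would know at once that the patch is not for them.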

It’s a solved problem.

But book publishers are decades behind OSS state-of-the-art: they are shite at version tracking and version control, and they appear to be oblivious to crowdsourcing repairs to their products, let alone having mechanisms in place.  (Not even an email address provided in the end papers that you can send corrections to.)

(And then there’s the whole “content vs presentation” debacle that HTML solved in the 1990s.  It would be sweet to be able to toss the publisher’s stupid paragraph formatting and replace it with something my aged tradition-expecting eyes can consume more easily.  A related rant, not the point of this one.)

So now I’m sorely tempted to hack my Kindle to drop an open-source book reader on it, one that would allow me to edit typographical errors in situ.  Feeding them “upstream” is also problematic: I’d also have to hack Calibre to look for new patches (upstream and downstream) when I “sync” my Kindle with my Calibre library.  And where do I then share my fixes?  Especially if the originals didn’t come… um… directly from the publisher?  Or changes to books out of copyright, but not yet in Project Gutenberg?  Alternatively, rather than bricking my Kindle, it may be a good use for a cheap 7-inch Android tablet.

I don’t recall anything resembling editing-as-you-read for Project Gutenberg.  That is, something integrated into the e-book reader itself, sending fixes to URIs and/or email addresses given within the e-book itself.  The Distributed Proofreaders project may have some lessons for us, too.
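
To make the idea concrete, a reader could bundle each fix into a small report before sending it wherever the e-book says corrections should go.  What follows is only a sketch under my own assumptions: the function build_correction and every field name in the report are invented for the example, and I know of no existing format or reader that works this way.  It stops at building the report; the sending part is left out.

```python
import hashlib
import json


def build_correction(book_text: str, offset: int, wrong: str, right: str) -> str:
    """Build a JSON correction report for one typo (hypothetical format)."""
    # Refuse to report a typo that is not actually at the stated position.
    if book_text[offset:offset + len(wrong)] != wrong:
        raise ValueError("text at offset does not match the reported error")
    report = {
        # Identifies the exact edition the correction was made against.
        "book_sha256": hashlib.sha256(book_text.encode("utf-8")).hexdigest(),
        "offset": offset,
        "original": wrong,
        "replacement": right,
        # A little surrounding text so a human can review the fix.
        "context": book_text[max(0, offset - 20):offset + len(wrong) + 20],
    }
    return json.dumps(report, indent=2)


# Made-up sample sentence with an OCR-style error.
text = "The fire began to bum."
report = build_correction(text, text.index("bum"), "bum", "burn")
print(report)
```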

Is there such an OSS project?  Could an existing OSS project be re-purposed?  Enquiring minds need to know.