At WebVivant Press, we use Adobe InDesign CS3 for producing both print books and ePub format e-books. InDesign (ID) is a great piece of software for laying out magazines and books. And for creating e-books it’s … well, broken.
When you use the ‘Export to Digital Editions’ option from InDesign, the resulting ePub file (generally) opens fine in Adobe Digital Editions (ADE). However, if you look at the file info from within ADE, you’ll probably see the message: ‘The document appears to have minor errors that might cause it to be displayed incorrectly’. This is a message you come to fear greatly when creating e-books.
So this post is about how we dealt with these ‘minor errors’.
Now, it’s possible that CS4 does a much better job. But Adobe’s prices are such that upgrading would soak up a lot of book profits. And I suspect we’d still end up tweaking the e-book files by hand. (If there are any InDesign CS4 users out there who can tell us how well it deals with the issues outlined below, we’d love to hear from you.)
Running into problems
I actually encountered these minor errors the first time I output an ePub file from InDesign. The book wouldn’t load into ADE. It turned out that the ‘minor’ errors were, in fact, fatal. Then I discovered that removing a copyright symbol from the metadata of the ID file fixed the problem. This shouldn’t be so: ePub files created with the normal, default UTF-8 encoding are perfectly capable of including the copyright symbol, using the &copy; HTML entity. But removing it seemed to work and I got on with my life.
Then I noticed that, while the files would load, there was still a warning about ‘minor errors’. Also, ADE showed the publisher as ‘unknown’. ID has no metadata field where you can enter the publisher’s name.
I think the problem is that CS3 is simply out of date and is producing files to an older, obsolete standard. Whatever the source of the problems, I decided to fix them by hand.
Opening the ePub file
These notes are based on the ePub file we created recently for a free e-book – Make Do & Cook: Savvy Shopping [update: not currently available].
As I discussed in a previous post, an ePub file is just a zip file. You can unzip it and play with the contents. The tweaks I discuss here are those I used to fix ID CS3-produced files, but may also help with files produced by other means.
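To see the ‘just a zip file’ point for yourself without unpacking anything, here is a minimal Python sketch that lists what’s inside an ePub using the standard zipfile module. The function name and the ‘book.epub’ filename are my own placeholders, not anything InDesign or ADE produces.

```python
import zipfile


def list_epub_contents(epub_path):
    """Return the names of the files stored inside an ePub (zip) archive."""
    if not zipfile.is_zipfile(epub_path):
        raise ValueError(f"{epub_path} is not a zip archive")
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


# Example with a hypothetical filename:
# for name in list_epub_contents("book.epub"):
#     print(name)
```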
One way of testing corrected ePub files is simply to load them into ADE or some other e-book reader and see if there are any complaints.
A much better method is to use epubcheck, a free utility that tests your ePub file for compliance with the current standard. It’s not without its own issues: epubcheck has a habit of issuing annoyingly vague error messages. For example, you might be told that a section of the file has ‘missing elements’ but are given no clue as to what these are (presumably, epubcheck knows because it’s recognised that they’re missing).
It is possible to run epubcheck as a web-based app, but the easiest method is to run it from the command line. It’s Java-based, so should run on pretty much any platform. As discussed in Zipping ePub files, I’ve created a short shell script for zipping the various files back into an ePub package, so my workflow is:
1. Unzip the ePub file created by InDesign.
2. Edit the content.opf and toc.ncx files.
3. Zip up the files into an ePub package again using a single command. This leaves the unzipped files still available if further editing is needed.
4. Test using epubcheck. If this fails, go to 2. If it succeeds, you’re done.
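The last two steps of that workflow (re-zipping and testing) could be sketched in Python roughly as follows. This is not the shell script mentioned above, just an illustration of the same idea. It relies on the ePub container rule that the mimetype file goes into the archive first and uncompressed. The jar name ‘epubcheck.jar’ is an assumption (the real filename depends on the version you download), as is the idea that epubcheck reports failure through a non-zero exit status. All the function names are mine.

```python
import subprocess
import zipfile
from pathlib import Path


def zip_epub(source_dir, epub_path):
    """Zip an unpacked ePub directory, leaving the source files in place."""
    source = Path(source_dir)
    output = Path(epub_path).resolve()
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype file is added first and stored without compression.
        archive.write(source / "mimetype", "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for path in sorted(source.rglob("*")):
            name = path.relative_to(source).as_posix()
            # Skip directories, the mimetype file, and the output file itself.
            if not path.is_file() or name == "mimetype" or path.resolve() == output:
                continue
            archive.write(path, name, compress_type=zipfile.ZIP_DEFLATED)


def epubcheck_command(epub_path, jar_path="epubcheck.jar"):
    """Build the command line; jar_path is an assumed location for the jar."""
    return ["java", "-jar", str(jar_path), str(epub_path)]


def run_epubcheck(epub_path, jar_path="epubcheck.jar"):
    """Run epubcheck and return (passed, report text).

    Assumes a non-zero exit status means the check failed.
    """
    result = subprocess.run(epubcheck_command(epub_path, jar_path),
                            capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


# Example with hypothetical paths:
# zip_epub("unzipped", "book.epub")
# passed, report = run_epubcheck("book.epub")
# print(report)
```

If the check fails, the unzipped directory is still there, so you can edit the files and run both functions again.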
So, if you’re ready to start playing, copy your ePub file into a new directory all by itself (easier to see what you’re doing this way) and unzip it. Inside, you’ll find a file called ‘mimetype’ and two sub-directories – META-INF and OEBPS. (If the ePub file was created by some method other than outputting from InDesign CS3, the file structure, and even some filenames, may be different, but the principles will be the same.)
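If you’d rather script the unzipping, something along these lines should do it. It’s only a sketch: it extracts the archive into a directory of your choosing and reports which of the three entries described above are missing. The list of expected entries reflects what InDesign CS3 produces for us; files from other tools may legitimately differ. The names used here are placeholders of my own.

```python
import zipfile
from pathlib import Path

# Top-level entries found in an ePub exported from InDesign CS3.
EXPECTED_ENTRIES = ("mimetype", "META-INF", "OEBPS")


def unpack_epub(epub_path, target_dir):
    """Extract an ePub into target_dir; return the expected entries not found."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(target)
    return [name for name in EXPECTED_ENTRIES if not (target / name).exists()]


# Example with hypothetical paths:
# missing = unpack_epub("book.epub", "unzipped")
# print("Missing entries:", missing or "none")
```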
Fixing ID CS3’s ePub shortcomings involves editing two files in the OEBPS directory – content.opf and toc.ncx. In the next two parts, we’ll look at editing those files and also take a quick look at the CSS file.
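Before editing anything, it can help to confirm what content.opf actually contains. As a rough, read-only sketch, the snippet below looks for a publisher entry, assuming it is recorded as a Dublin Core publisher element (the form the OPF 2.0 specification describes). It searches the whole file rather than one fixed location, since I can’t vouch for exactly where ID CS3 places its metadata. The function name and path are hypothetical.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used for metadata elements in OPF files.
DC_NS = "http://purl.org/dc/elements/1.1/"


def find_publishers(opf_path):
    """Return the text of every Dublin Core publisher element in the OPF file."""
    root = ET.parse(opf_path).getroot()
    return [(element.text or "").strip()
            for element in root.iter(f"{{{DC_NS}}}publisher")]


# Example with a hypothetical path:
# publishers = find_publishers("unzipped/OEBPS/content.opf")
# print(publishers or "No publisher element found")
```

An empty list would be consistent with ADE reporting the publisher as ‘unknown’.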
Resources
- Open Packaging Format (OPF) 2.0 v1.0: Recommended Specification September 11, 2007 – essential reading if you want to understand the format.
- Epub Format Construction Guide by Harrison Ainsworth – a good overview of the file formats.
- Specifications for the Digital Talking Book – in spite of the title, has some excellent general information, especially on things like the toc.ncx file.
- epubcheck – invaluable Java-based tool for checking your ePub files, even if its error messages can be infuriatingly ambiguous or misleading at times.