Calibre is an ebook management application. It comes with a nice ebook reader too, which I use all the time. [Calibre]
Calibre is also the most common ePub generator. Its format converters are robust and battle-hardened.
This post is a record of what I actually did to make the ePub for Libra Shrugged. There’s almost certainly bits I could have done better some other way, and a lot of bits where I got way too deep into technical twiddling just because I could.
(Comments suggesting using LaTeX instead will get you slapped over the Internet.)
If you don’t understand any of the technical detail here, don’t worry about it — there must surely be better ways than this.
Since this is on computers, there are some gotchas — specifically, that ePub is an absolute shower of a format, and you will be editing XHTML by hand if you want to get onto Apple Books and the other minor ebook stores.
The good news is that Calibre has a pretty good XHTML editor — right-click on book title, “Edit book”. The bad news is that you’ll need it.
If you’ve lived your life wrong enough that you’re hand-editing ePub XHTML, you should probably install epubcheck. There’s an online version, but I just installed the Java .jar file locally, which is much faster. [EPub Validator; GitHub]
The developer of Calibre considers epubcheck broken, which it is, and wrong about the ePub specification, which it is. Unfortunately, Apple Books requires your book to pass epubcheck anyway, with no errors or warnings. [Apple]
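Running a local copy of epubcheck is a one-liner. A minimal sketch, assuming the jar has been saved as epubcheck.jar in the current directory; the check_epub helper name and the JAVA and EPUBCHECK_JAR override variables are made up for this example and are not part of epubcheck:

```shell
# Validate one ePub with a local epubcheck jar.
# JAVA and EPUBCHECK_JAR are hypothetical overrides for this sketch;
# by default it runs "java -jar epubcheck.jar <file>".
check_epub() {
  "${JAVA:-java}" -jar "${EPUBCHECK_JAR:-epubcheck.jar}" "$1"
}
```

Call it as check_epub book.epub (a placeholder filename) and read everything it reports, warnings included, since Apple Books wants none of either.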
At one point I unzipped the ePub into separate XHTML files. This let me hand-tweak the files directly in vim, then add them back to the ePub using zip -f (freshen). Nobody who isn’t me should expect to have to do this sort of thing, but I’m a control addict.
(I’m using zip -f and not just making a zip file of the separate files because that way, the mimetype file stays both uncompressed and first in the zip file — if it isn’t, epubcheck complains. ePub is weird and annoying.)
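For the curious, the rule behind this: an ePub is a zip whose first entry must be a stored (uncompressed) file named mimetype containing application/epub+zip. When that holds and the entry carries no zip extra field, the name sits at byte offset 30 and the contents at offset 38, so a quick sanity check needs nothing more than dd. A rough sketch; the epub_mimetype_ok name is invented for illustration, and epubcheck remains the real test:

```shell
# Return 0 if the file starts with an uncompressed "mimetype" entry
# whose contents are application/epub+zip, non-zero otherwise.
# Assumes the entry has no zip extra field (offsets 30 and 38).
epub_mimetype_ok() {
  name=$(dd if="$1" bs=1 skip=30 count=8 2>/dev/null)
  body=$(dd if="$1" bs=1 skip=38 count=20 2>/dev/null)
  [ "$name" = "mimetype" ] && [ "$body" = "application/epub+zip" ]
}
```

Running epub_mimetype_ok book.epub && echo ok after each zip -f is a cheap way to confirm the freshen did not disturb the first entry.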
I use Xubuntu. Unfortunately, Ubuntu 20.04 has a broken version of Calibre that can’t possibly start or work — Ubuntu pulled a development version from Debian, nobody noticed before release time that it literally didn’t work at all, and now the broken version’s stuck in place for the next five years. [Launchpad; ubuntu-devel mailing list]
(The broken version still has a functional ebook-viewer.)
If you have Windows or Mac, just download the latest version (5.4.2 as I write this) from Calibre and use that. [Calibre]
If you’re like me and insist on using Linux, you can run the Linux install instructions without running it as root. I did the isolated install per the Linux download page: [Calibre]
wget -nv -O- https://download.calibre-ebook.com/linux-installer.sh | sh /dev/stdin install_dir=~/calibre-bin isolated=y
(Calibre doesn’t offer distro packages, because the author has had so many bug reports from broken distro versions that he tells users to get the official binary instead.)
After installing Calibre to my home directory in this way, I start it from a terminal.
Convert your DOCX in Calibre
I wrote both books in LibreOffice in its native ODT format. Calibre’s conversion of LibreOffice ODT files is much better in 2020 than it was for Attack of the 50 Foot Blockchain in 2017.
But I wanted clickable indexes — so I saved the book file in LibreOffice as DOCX, and sent that to Calibre for conversion.
This is the easy part. Click the “Add book” button to import the DOCX, then right-click on the book name and choose Convert books → Convert individually.
Choose the following options:
- Metadata: Output format: EPUB. Enter Title, Author(s), Tags. Add the cover image.
- Page setup: Output profile: Generic e-ink HD. Input profile: Default profile.
- DOCX input: Do not add a page after every endnote.
- EPUB output: Flatten EPUB file structure; ePub version 3
Then click “OK” to generate your ePub!
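The same conversion can probably be scripted with Calibre’s ebook-convert command-line tool. This is a sketch of an equivalent, not what was actually done for the book: the option names are the command-line counterparts of the GUI settings above as far as I can tell, so check ebook-convert --help before trusting them. The convert_docx helper and the EBOOK_CONVERT override variable are invented for this example.

```shell
# Convert a DOCX to ePub 3 with roughly the GUI options listed above.
# Arguments: input.docx output.epub title author cover-image
# EBOOK_CONVERT is a hypothetical override, useful when Calibre lives
# in an isolated install directory rather than on PATH.
convert_docx() {
  "${EBOOK_CONVERT:-ebook-convert}" "$1" "$2" \
    --title "$3" --authors "$4" --cover "$5" \
    --input-profile default \
    --output-profile generic_eink_hd \
    --docx-no-pagebreaks-between-notes \
    --epub-flatten \
    --epub-version 3
}
```

With the isolated install, something like EBOOK_CONVERT=~/calibre-bin/calibre/ebook-convert would point it at the right binary, assuming that is where the installer put it.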
I wanted the ePub to finish with the back cover image from the paperback. So I added the image in the Calibre editor, which called it images/image.jpeg, and added the following code near the end of the final XHTML file:
<div><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="100%" height="100%" viewBox="0 0 1640 2550"><image width="1640" height="2550" xlink:href="images/image.jpeg"/></svg></div>
I also needed to declare my use of SVG in content.opf by adding properties="svg" for the XHTML file it was in:
<item id="id11" href="index_split_031.xhtml" media-type="application/xhtml+xml" properties="svg"/>
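To find out which files need that declaration, a grep over the unzipped XHTML is enough. A small sketch; list_svg_files is a made-up helper name, and it assumes the flattened layout where every .xhtml file sits in one directory:

```shell
# List the XHTML files in a directory (default: current) that contain
# an <svg> element. Each one needs properties="svg" on its <item>
# entry in content.opf.
list_svg_files() {
  grep -l '<svg' "${1:-.}"/*.xhtml
}
```

If an item already has a properties attribute, the values are space-separated, so svg gets added alongside whatever is there.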
In LibreOffice and Word, if you want multiple references to a single footnote or endnote, you create the footnote or endnote and then you add cross-references to it. These are clunky and inconvenient, but they work fine in the original program, and the links work in a PDF.
If you convert DOCX to ePub in Calibre, the cross-reference becomes a link — not to the desired footnote, but to the spot in the text where the footnote or endnote of that number is.
If you have both footnotes and endnotes, this may not even be the right footnote or endnote, because Calibre mixes your footnotes and endnotes together into a single list — then doesn’t renumber cross-references to them.
If you convert ODT to ePub in Calibre, the cross-reference becomes just plain text with the wrong footnote or endnote number.
The answer is to edit the XHTML. I had to go through my fifteen cross-references and cut-and-paste in the XHTML from the correct endnote in place of the erroneous cross-reference anchor.
Alphabetical index entries are plain text, not clickable links, in both LibreOffice and Word. This is obviously silly, which is why making index entries into clickable links is an open feature request in LibreOffice. [Document Foundation]
Q. What’s duller than indexing?
A. Indexing a second time, to get the ePub index right.
You might correctly note that indexes are superfluous in ebooks, which have a search function — but professionally-published nonfiction ePubs tend to have indexes with page numbers and links. And having an index does look professional as hell. (And in self-publishing, you need every advantage you can get.)
Calibre will import an index from ODT as … plain text. It shows page numbers, without hyperlinks — which is doubly useless. So don’t import from ODT if you want an index.
Calibre will import an index from DOCX, and construct hyperlinks from it! It’ll use its own linking, not the page numbers. The links all work, but the result looks visually like an HTML conversion error.
So if you want page numbers, but you also want working links: import your book as DOCX, and you get to edit the XHTML directly again. You’ll need a copy of the index with page numbers, ’cos you’re going to need to put every single page number into your XHTML by hand.
This will also force you to closely proofread your index, so … good?
Calibre creates ePub 3.2 books with .html filenames, but epubcheck requires .xhtml filenames. I fixed this with shell scripts applied to the unzipped files.
(If you just cut’n’paste these lines without understanding what I did here, you may wreck your book files, and have to start over with exporting to ePub.)
# Rewrite links inside each of the 32 split files (000 to 031)
for j in $(seq -f %03g 0 31); do
  for i in $(seq -f %03g 0 31); do
    sed -i "s/index_split_$i\.html/index_split_$i.xhtml/g" "./index_split_$j.html"
  done
done
# Rewrite the same links in the table of contents and the manifest
for j in toc.ncx nav.xhtml content.opf; do
  for i in $(seq -f %03g 0 31); do
    sed -i "s/index_split_$i\.html/index_split_$i.xhtml/g" "$j"
  done
done
Then zip -f to freshen the files into the ePub.
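Before zipping, it is worth confirming that no old-style references survived the rewrite. A minimal sketch using grep; find_stale_html_refs is a hypothetical helper name, and the pattern assumes the index_split_NNN naming shown above:

```shell
# Print any remaining references to the old .html names, with line
# numbers. Exits non-zero when nothing is found, which is the good
# outcome here.
find_stale_html_refs() {
  grep -n 'index_split_[0-9]*\.html' "$@"
}
```

Run it as find_stale_html_refs index_split_* toc.ncx nav.xhtml content.opf; silence means every link now points at an .xhtml name.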
<li> in headings
Calibre wraps every heading and subheading in <ol><li>…</li></ol>. Every ePub reader seems to handle this fine, except FBReader, my favoured ebook reader on Android, which displays a “1.” before each header.
The actual XHTML looks something like:
<ol class="list_"> <li id="id_RefHeading___Toc28800_897132658" value="2" class="block_10"><b class="calibre5">Introduction: Taking over the money</b></li></ol>
Solution: after you’ve unzipped the files, go through and remove every <ol></ol>, convert the <li></li> to <p></p> and remove the value= attribute from the <p> or else epubcheck complains.
Alternate or additional solution: check stylesheet.css for display: list-item; on styles that shouldn’t have it, and replace those with display: block; .
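The first approach, stripping the list markup, can be roughed out with sed rather than done by hand. This is a sketch under narrow assumptions: it only matches the exact shape shown above, with the whole <ol>…</ol> on a single line, a value= attribute present, and at most one heading per line. The unlist_headings name is made up for this example.

```shell
# Rewrite Calibre's one-item heading lists as paragraphs:
# <ol class="list_"><li ATTRS value="N" ATTRS>...</li></ol>
# becomes <p ATTRS ATTRS>...</p>, dropping the value= attribute.
# Reads the named files, or standard input when none are given.
unlist_headings() {
  sed -E 's#<ol class="list_">[[:space:]]*<li([^>]*) value="[0-9]+"([^>]*)>(.*)</li>[[:space:]]*</ol>#<p\1\2>\3</p>#' "$@"
}
```

It writes to standard output and leaves the input alone, so you can diff the result against the original before replacing anything. Anything that does not fit the pattern passes through unchanged and still needs fixing by hand.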
If you wrote the book in a particular font, the index generated from a DOCX may be in whatever Calibre thinks is a good default font — and this default font may show up elsewhere. The quickest way to fix this is to edit stylesheet.css and remove the wrong font.
Remove back-arrows from footnotes
Calibre puts a back arrow ← character on every footnote or endnote. This renders fine on most ePub readers, but fails on some old ones. I removed it entirely from the file containing the endnotes. I think the endnotes also look better without the arrows.
If you use Calibre’s ebook-viewer, it’ll add a file called META-INF/calibre_bookmarks.txt to your ePub. Remove this or epubcheck will complain.
Amazon provides Kindle Previewer for Windows or Mac. It doesn’t work in Wine, so I put it in my Windows 10 VM under VirtualBox — you can just download Windows 10 and run it unactivated. [Amazon; Microsoft Windows 10]
Look over every page of your ePub extremely carefully — this is precisely what Amazon will make of your ePub.
I also checked in ebook-viewer and FBReader. You should check in whichever ePub readers you personally use.
Draft2Digital requires that you not have the following:
- “Competitor Links: The content contains links to sales channels that are in direct competition with the chosen sales channels.”
- “Competitor Reference: The content contains references to sales channels that are in direct competition with the chosen sales channels.”
This means that Apple doesn’t like links to Amazon, or even mentioning it. The only such link was in the bit at the end advertising Attack, so, fine — I removed that line.
Smashwords didn’t want page numbers on the table of contents, so I removed those for the Smashwords upload. They didn’t fuss about links to Amazon, though.
Why should I bother to do all of this?
(You probably shouldn’t. I’m just like this.)
An ePub that passes epubcheck with no errors or warnings is a thing of joy! Probably.
A more robust file will work on more ebook readers, and your customers will be happier.
But mostly, you’ll bother doing this if you can tap your inner reserves of extreme fussiness and perfectionism and wanting to make your beautiful literary baby as well-presented as possible. That works too.
Also, you probably have to be a huge nerd. But at least the book will be pretty and work everywhere.