Prior to OS X 10.9 (Mavericks), books could be managed and copied onto an iPad with iTunes. It was a bit fiddly, but it did manage to track files in their original location (so I could keep a backup directory of all my books to move between machines), and with a bit of prompting (one had to add the book to iTunes, then go find it in the "Books" tab of the device and add it) it would copy books onto the iPad. At least it worked in a predictable manner, even if that manner was not ideal.

In OS X 10.9 (Mavericks) this functionality has been replaced by iBooks.app, which is much more user hostile. At first I thought that the Books option remained in iTunes (the tab was still present), but I found out this morning that the tab only existed to tell you that the functionality had been moved to iBooks (click "OK" to acknowledge you've been told, and the tab vanishes).

Since I'd just bought a new book, I naively opened the iBooks.app, declined to sign into Cloud, and then watched in horror as iBooks stole all my books. It said "moving NNN books" -- and I hoped it was just importing references to them from iTunes (like iTunes had). But no, it physically removed the books from the location I was keeping them and hid them away inside its own database. To do that without even asking is incredibly user hostile (at least iTunes has a preference: Advanced -> Copy files to iTunes Media folder when adding to library).

From some searching I found the books in:

~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books

which is the kind of directory you'd easily forget to copy over to a new computer. So I set about replicating the files back to where they were minutes ago (but leaving a copy/link for iBooks to avoid it doing this all over again).

Restoring the PDFs

For the PDF files, the problem is relatively simple: they retain their original names, and I just need to know where to put them back -- and fortunately many of my books are PDFs:

cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
ln f11-magazine-0* ~/misc/books/photography/f11-magazine/
ln Stark_Magazine* ~/misc/books/photography/stark-magazine/
ln *GIMP* ~/misc/books/photography/gimp-magazine/
...

which took a while (I have 125 PDFs sync'd to my iPad), but was just a bunch of rote work. It helped to do:

cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
ls *.pdf | egrep -v 'f11-magazine-|Stark|GIMP|....'

or:

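cd ~/misc/books
# -n 1 runs basename once per file (given exactly two arguments, basename
# would treat the second one as a suffix to strip)
find . -type f -name "*.pdf" -print0 | xargs -0 -n 1 basename | tee /tmp/got

cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
# -F -x: match whole names literally rather than as regular expressions
ls *.pdf | grep -vxFf /tmp/got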

to figure out what still had to be sorted out. I also referred to a known-good backup copy of the directory to make sure I didn't miss anything, or put something back in the wrong place. (I could have just restored from backup, but with 10G of books to restore -- because many of them are very photo heavy -- I didn't want to duplicate 10G of data :-( )
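Another way to see what is left, offered as a sketch rather than what I did at the time: since the restore uses hard links, any file in the iBooks directory whose link count is still 1 has not been linked anywhere else yet. The function name unlinked_books is made up for illustration.

```shell
# List regular files matching a pattern whose hard-link count is still 1,
# i.e. files that have not yet been linked back to a permanent location.
# unlinked_books is a hypothetical helper name, not an existing tool.
unlinked_books() {
  dir=$1       # directory to inspect (e.g. the iBooks Books directory)
  pattern=$2   # shell glob, quoted so that find expands it, e.g. '*.pdf'
  find "$dir" -maxdepth 1 -type f -name "$pattern" -links 1
}

# Example (prints one path per line):
# unlinked_books ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books '*.pdf'
```

This only says whether a file has some other link, not whether that link is in the right place, so it complements the name comparison above rather than replacing it.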

The crazy thing is that iBooks.app could have done this itself. It didn't have to move the books from outside the iTunes library into its own library; it could have just linked to the files that were outside the library if it didn't want to remember the paths.

In the end I had a handful of PDF files which "didn't have a home", that I think I must have added on the iPad or via email. None of them seemed important enough to try to find a "proper" location to store them, so I left them just in the iBooks dungeon.

To verify that I'd got everything I did a "dry run" rsync (-avn) from a known good copy into ~/misc/books and checked it didn't show any missing/misplaced PDF files. (Searching on an old computer mostly showed the leftover PDFs as "located in the iTunes library", which tends to confirm my theory of how they ended up there. Those "located in the iTunes library" legitimately could have been moved into the iBooks library -- it's just all the rest that I have a problem with!)

Restoring ibooks/epub

That just left the *.ibooks and *.epub files, both of which were much more difficult, because iBooks had "helpfully" renamed the files to long hexadecimal strings (similar to what happens with iPad/iPhone backups).

Restoring those involves doing a "match by content" instead, or in this case "match by checksum". The "ibooks" ones were easier, as I only had a few:

# On machine with full original copy
find ~/misc/books -name "*ibooks" -print0 | xargs -0 md5 | tee /tmp/original

# On machine with iBooks
scp -p GOODHOST:/tmp/original /tmp/original
cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
md5 *.ibooks | tee /tmp/stolen

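# Restore *.ibooks back to original location
# (md5 prints "MD5 (filename) = checksum"; the hex-named files have no spaces,
# so the checksum is the fourth space-separated field)
for CHECKSUM in $(cut -f 4 -d ' ' /tmp/stolen); do
  # Pull the filename out from between the parentheses in each list
  SRC=$(grep "${CHECKSUM}" /tmp/stolen   | sed 's/^.*(//; s/).*$//;')
  DST=$(grep "${CHECKSUM}" /tmp/original | sed 's/^.*(//; s/).*$//;')
  if [ -n "${SRC}" ] && [ -n "${DST}" ]; then
    echo ln "${SRC}" "${DST}"
  else
    echo "Cannot find ${CHECKSUM}"
  fi
done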

When you're happy with the result, either remove the "echo" before the link command, or pipe it through sh -x (only if your filenames have no spaces in them!).

I've put that script in my util directory as ibooks-relink since I'm guessing I might need it again, with an extension that runs it in "echo" mode if there are no arguments and "for real" if there is an argument, eg ibooks-relink go.
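For illustration, a script along those lines might look roughly like the sketch below. It is not the exact script from my util directory: the function name relink and its argument order are made up here, and it assumes both lists are in the "MD5 (filename) = checksum" format that md5 prints. It parses each line with shell parameter expansion so that filenames with spaces survive, and it must be run from the iBooks directory because the names in the "stolen" list are relative.

```shell
# Sketch only: relink and its arguments are hypothetical names.
# Usage: relink STOLEN_LIST ORIGINAL_LIST [go]
#   without "go": print the ln commands that would be run (dry run)
#   with "go":    actually create the hard links
relink() {
  stolen=$1      # checksum list made in the iBooks directory
  original=$2    # checksum list made from the original book layout
  mode=${3:-}    # empty = dry run, "go" = really link

  while IFS= read -r line; do
    sum=${line##* = }                      # text after the last " = "
    src=${line#*\(}; src=${src%\) = *}     # name between "(" and ") = "
    # First line in the original list that carries the same checksum
    match=$(grep -F -- " = ${sum}" "$original" | head -n 1)
    if [ -z "$match" ]; then
      echo "Cannot find ${sum} (${src})"
      continue
    fi
    dst=${match#*\(}; dst=${dst%\) = *}
    if [ "$mode" = "go" ]; then
      ln "$src" "$dst"      # fails loudly if the destination already exists
    else
      printf 'ln "%s" "%s"\n' "$src" "$dst"
    fi
  done < "$stolen"
}

# Example:
# relink /tmp/stolen /tmp/original       # verify output
# relink /tmp/stolen /tmp/original go    # actually restore files
```

If the same book is stored in two places in the original layout, only the first match is linked; the second would need to be handled by hand.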

The "epub" files are conceptually the same, but because I have more of them it required more manual checking before I was ready to let it restore the files. Viz:

# On machine with full original copy
find ~/misc/books -name "*epub" -print0 | xargs -0 md5 | tee /tmp/original

# On machine with iBooks
scp -p GOODHOST:/tmp/original /tmp/original
cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
md5 *.epub | tee /tmp/stolen

# Restore *.epub files back to original location
ibooks-relink          # Verify output
ibooks-relink go       # Actually restore files

After all of that work an rsync from a known good location revealed no missing files. Without a recent backup this would have been much more frustrating to sort out.

There was one mystery epub file (D404F992EA4A7D972D8B6AC1791BD17A.epub with checksum 81b903c4502e1cc766a2d8ac2364a3b5) still left over after this, and I still wanted to figure out what it was.

Fortunately epub files are zip files, so one can do:

unzip -p D404F992EA4A7D972D8B6AC1791BD17A.epub iTunesMetadata.plist | less

to get some information about what the file is. And:

unzip -v D404F992EA4A7D972D8B6AC1791BD17A.epub | awk '{print $7, $8; }'

to get a list of file checksums and filenames. That let me determine that it was a slightly modified version of another "epub" file, where I have the original separately. My guess is that perhaps this is another file that I originally added on the iPad and then downloaded the book again for reference. (Fortunately it's from a bookseller that allows downloading repeatedly, so I could always get another copy if required.)

So the misfeature of iBooks.app turned a "5 minute" task of adding another book to my iPad into a 2 hour data recovery exercise, requiring me to manually do something Apple could have just done itself in the beginning (ie link all the files). Thanks Apple.

Adding new books

After all of this, I found I still had to add my new book into iBooks. Interestingly it appears adding new books does not steal the books; it copies them (ie, they're new inodes with a duplicate copy of the file). So the pain described above appears to be the result of a very poorly implemented "Move books from iTunes" option, combined with a terrible user interface choice to use that poor implementation automatically without prompting :-(

To add new books to iBooks, and keep a copy in the original location and not duplicate files taking up twice the space, the best option seems to be:

  • Put the book onto the Desktop for ease of reference

  • Open /Applications/iBooks.app

  • Use File -> Add to Library... in iBooks to add the book from the Desktop

  • Go to the iBooks hidden folder, and link it to the permanent "purchased books" location:

    cd ~/Library/Containers/com.apple.BKAgentService/Data/Documents/iBooks/Books
    ln "$BOOKFILE" ~/misc/books/....
    
  • Remove the copy on the Desktop (perhaps after verifying the checksums of the files).

(There is of course a risk with not keeping a copy, in that iBooks might decide to modify the file. But with the volume of books I have, it's a risk I'm willing to take to avoid doubling the storage requirements.)
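The last two steps of that list could be wrapped up roughly as follows. This is only a sketch: link_and_clean is a made-up name, and it assumes iBooks stored the file unmodified (which appears to hold for PDFs) and that the permanent location is on the same filesystem as the iBooks directory, since hard links cannot cross filesystems. It uses cmp for the comparison instead of eyeballing checksums.

```shell
# Hypothetical helper: hard link the copy iBooks made into the permanent
# location, then delete the Desktop copy only if it is byte-for-byte identical.
link_and_clean() {
  ibooks_copy=$1    # file inside the iBooks Books directory
  desktop_copy=$2   # the file that was added from the Desktop
  dest=$3           # permanent location (same filesystem as ibooks_copy)

  ln "$ibooks_copy" "$dest" || return 1
  if cmp -s "$dest" "$desktop_copy"; then
    rm "$desktop_copy"
  else
    echo "Contents differ, keeping $desktop_copy" >&2
    return 1
  fi
}
```

If the contents differ, the Desktop copy is left alone so nothing is lost, although the new link is left in place too and would need checking by hand.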

After adding the book to iBooks.app it then appeared in the "Books" tab of iTunes, and I could select it for sync onto my iPad. The Collections I had on the iPad didn't make it into iBooks.app, but fortunately the Collections also remained on the iPad. (Some day when I have more time, if I use iBooks.app as more than a conduit to get books onto the iPad, I might replicate the Collections from my iPad into iBooks.app by hand.)

ETA, 2014-05-30: It appears that newly added epub files are automatically unpacked into directories, and thus no longer match the checksum of the original file, so this trick won't work for them. It does still appear to work for PDF files, which are the main books that I put on my iPad (the other text-based books I tend to read on a dedicated book reader). The hard link approach has also worked so far for the epub files that were imported from iTunes, saving a fair amount of disk space.