After migrating to a new laptop I found that I'd made several duplicates of some fairly large (but important) data during the migration, to make sure I did not lose it. But now that everything is copied across, and I've been using the new laptop for a couple of months, the drive is looking rather full (90% full is a steady state for hard drives!). So I wanted to tidy up some of the duplication and recover some space.
An obvious thing to do with "similar, but not quite identical" directory trees is to turn all the common files into hard links to each other. There are specialised tools (like dupfiles) for doing this, and I've written my own in the past to deal with various file renames (like iPhone/iPad backups, which use hashed filenames). But for the simple case of identical filenames in an identical structure, there is an easy solution: rsync --link-dest=... (which can also be used to make a Poor Man's Time Machine; another backup example).
An example (using something similar to my actual case):
cd /bkup/
mv photos photos.old
mkdir photos
rsync -av --link-dest=/photos photos.old/ photos/
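What this does is: for every file in /bkup/photos.old that is also present, unchanged, in /photos (same relative path, size and modification time; with -a the permissions and ownership must match too), it hard links in the file from /photos. And for every other file it copies the file from /bkup/photos.old. Both trees need to be on the same filesystem for the hard links to work. If the two are mostly common (eg, one is an older copy of the other, as was my case), most of the files will end up hard linked. You can check the quantity of hard links with: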
cd /bkup
find photos -type f -links 1 | wc -l
find photos -type f | wc -l
The first number tells you how many files are "one of a kind" (ie, not hard linked), and the second tells you the total number of files. Ideally you'll find that, eg, 95% of the files are now hard links.
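To avoid doing the division by hand, the two counts can be wrapped in a small helper. This is only a sketch: the function name linked_percent is made up for illustration, and it assumes no file names contain newlines (each file is counted as one line of find output).

```shell
# Report what share of the regular files under a directory have more
# than one hard link. linked_percent is an illustrative name only.
linked_percent() {
    dir=$1
    # tr strips the padding that some wc implementations add.
    total=$(find "$dir" -type f | wc -l | tr -d ' ')
    linked=$(find "$dir" -type f -links +1 | wc -l | tr -d ' ')
    if [ "$total" -eq 0 ]; then
        echo "no files found under $dir"
        return 1
    fi
    # Integer arithmetic, rounded down.
    echo "$((linked * 100 / total))% of $total files are hard linked"
}

# Example: linked_percent /bkup/photos
```

Note that -links +1 counts any file with more than one name, wherever the other names live, so files that were already hard linked for some other reason are included too.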
When you're happy the hard links are in place, you can then double check that the same files are found in the old version and the new version:
cd /bkup
diff -r photos/ photos.old/
and if that doesn't show any differences, you should be fine to remove the old version:
cd /bkup
rm -r photos.old
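If you would rather not run the rm by hand, the check and the removal can be tied together so the old tree only goes away when diff finds nothing. A minimal sketch; remove_if_identical is a hypothetical helper name, and it relies only on diff -r exiting with status 0 when the two trees match.

```shell
# Remove the old tree only when diff -r reports no differences.
# remove_if_identical is an illustrative name, not a standard tool.
remove_if_identical() {
    new=$1
    old=$2
    if diff -r "$new" "$old" >/dev/null; then
        rm -r "$old"
        echo "removed $old"
    else
        # diff exits non-zero on differences or errors; keep everything.
        echo "differences found; keeping $old" >&2
        return 1
    fi
}

# Example: cd /bkup && remove_if_identical photos photos.old
```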
At this point you'd expect to recover the space used by the one off copies that were in /bkup/photos (which became /bkup/photos.old), since they are no longer separate copies -- just references to files that already existed on the hard drive.
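The hard link behaviour this relies on can be seen in a throwaway directory. The sketch below uses made-up file names in a temporary directory: two names share one copy of the data, and removing one name just lowers the link count.

```shell
# Two names for one file: removing one name leaves the data in place.
tmp=$(mktemp -d)
echo "photo data" > "$tmp/original"
ln "$tmp/original" "$tmp/copy"      # second name, same data on disk

# ls -l shows the link count in its second column.
links_before=$(ls -l "$tmp/original" | awk '{print $2}')

rm "$tmp/copy"                      # removes one name only
links_after=$(ls -l "$tmp/original" | awk '{print $2}')
contents=$(cat "$tmp/original")

echo "links before: $links_before, after: $links_after, data: $contents"
rm -r "$tmp"
```

The data is only released once the last name pointing at it is gone, which is why a full separate copy in photos.old should free its space on removal while the hard linked files in the new photos cost almost nothing extra.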
But if you're running OS X 10.9 (Mavericks; or, it seems, 10.7 -- Lion -- or later) then the magic space recovery does not happen as immediately as it does on other systems. Emptying the trash, logging out, or rebooting, to try to free up references to the now deleted files, does not meaningfully help. Which is surprising.
The answer to this surprise seems to be Time Machine Local Snapshots (they show as "white" ticks in the Time Machine view, as opposed to the "purple" ticks in the Time Machine view for backups on an external drive). These Local Snapshots are references to files held on the local hard drive (and apparently mounted as a loopback mount on /Volumes/MobileBackups). The effect is as if there are still copies of the files that you have deleted -- so they keep taking up space on the hard drive. (There are other possible causes of lost storage space too.)
It is possible to see how much space is being consumed by these local backups by going to: Apple -> About This Mac -> More Info... -> Storage. That is the Storage tab of the more detailed "About This Mac" window. (Do not go into "System Report..." on the Overview tab, no matter how tempting it looks, as that is only a hardware breakdown -- not a usage breakdown.)
In the resulting graphic, the "Backups" section on the main hard drive (usually "Macintosh HD", and/or "Flash Storage") is the Local Backups.
In my case these "Local Snapshots" are currently taking up about 20% of the drive! I assume that if I had looked just before deleting /bkup/photos.old, they would have been taking up much less, and after deleting /bkup/photos.old they would have jumped up by almost the size of /bkup/photos.old.
Time Machine will keep these local snapshots while there is at least 20% free space on the local drive. Once the free space drops below 20%, it will start removing older snapshots. And if it falls below 10% free, then removing snapshots will be given a higher priority. In addition the local snapshots are consolidated regularly down to one per day (after a day or so), and then down to one per week. In my case there are backups held back a full month on the local drive. (In /Volumes/MobileBackups/Backups.backupdb/${HOSTNAME} there are directories named YYYY-MM-DD-HHMMSS after when the snapshot was taken, which helps identify how new or old they are.)
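Because the YYYY-MM-DD-HHMMSS names sort in date order, a plain sort is enough to find the oldest and newest snapshot. A rough sketch, assuming the directory layout described above; snapshot_range is a hypothetical helper name, and anything that does not match the timestamp pattern is ignored.

```shell
# Summarise the snapshot directories found in a Backups.backupdb
# host directory. snapshot_range is an illustrative name only.
snapshot_range() {
    dir=$1
    # Keep only names shaped like YYYY-MM-DD-HHMMSS; sorting them as
    # text also sorts them by date.
    names=$(ls "$dir" 2>/dev/null \
        | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}$' \
        | sort) || true
    if [ -z "$names" ]; then
        echo "no snapshots found in $dir" >&2
        return 1
    fi
    oldest=$(printf '%s\n' "$names" | head -n 1)
    newest=$(printf '%s\n' "$names" | tail -n 1)
    count=$(printf '%s\n' "$names" | wc -l | tr -d ' ')
    echo "$count snapshots, oldest $oldest, newest $newest"
}

# Example, using the path from the text (it only exists while local
# snapshots are enabled):
# snapshot_range "/Volumes/MobileBackups/Backups.backupdb/${HOSTNAME}"
```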
Eventually the last reference to a deleted file will be removed, at which point the free space will magically go up. But it is most likely that this will happen either after about a month, or when the space is needed by something else (the local snapshots act like a weak reference) -- in which case there may not be much change in free space.
So it turns out the steady state of an OS X Mavericks system drive really is 80-90% full!
(It is possible to turn off local snapshots manually from the command line, using tmutil, but on a laptop they're actually quite useful so I've left them on. I just need to remember to check the disk usage of snapshots before wondering why deleting lots of files has not freed up much space :-) )
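As a reminder of where the drive sits relative to those thresholds, here is a rough sketch that turns the free space percentage into the behaviour described above. The function names are made up, the 20% and 10% figures are simply the ones quoted earlier in this post, and what Time Machine actually does may differ.

```shell
# Percentage of free space on the filesystem holding the given path.
# free_percent and snapshot_policy are illustrative names only.
free_percent() {
    # df -P prints one line per filesystem; column 5 is the used
    # capacity, eg "90%".
    used=$(df -P "${1:-/}" | awk 'NR==2 {sub(/%/, "", $5); print $5}')
    echo $((100 - used))
}

# Map a free space percentage onto the thresholds quoted in the text.
snapshot_policy() {
    free=$1
    if [ "$free" -ge 20 ]; then
        echo "snapshots kept"
    elif [ "$free" -ge 10 ]; then
        echo "older snapshots removed"
    else
        echo "snapshot removal prioritised"
    fi
}

# Example: snapshot_policy "$(free_percent /)"
```

On a drive that is 90% full, free_percent would give 10, which lands in the "older snapshots removed" band.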