Debian 7.0 ("Wheezy") was originally released about four years ago, in May 2013; the last point release (7.11) was released a year ago, in June 2016. While Debian 7.0 ("Wheezy") has benefited from the Debian Long Term Support with a further two years of support -- until 2018-05-31 -- the software in the release is now pretty old, particularly software relating to TLS (Transport Layer Security) where the most recent version supported by Debian Wheezy is now the oldest still reasonably usable on the Internet. (The Long Term Support also covered only a few platforms -- but they were the most commonly used platforms including x86 and amd64.)

More recently Debian released Debian 8.0 ("Jessie"), originally a couple of years ago in May 2015 (with the latest update, Debian 8.8, released last month, in May 2017). Debian are also planning on releasing Debian Stretch (presumably as Debian 9.0) in mid-June 2017 -- in a couple of weeks. This means that Debian Stretch is still a "testing" distribution, which does not have security support, but all going according to plan it will be released later this month (June 2017) and will have security support after the release -- for several years (between the normal security support and the likely Debian Long Term Support).

Due to a combination of lack of spare time last year, and the Debian LTS providing some additional breathing room to schedule updates, I still have a few "legacy" Debian installations currently running Debian Wheezy (7.11). At this point it does not make much sense to upgrade them to Debian Jessie (itself likely to go into Long Term Support in about a year), so I have decided to upgrade these systems from Debian Wheezy (7.11) through Debian Jessie (8.8) and straight on to Debian Stretch (currently "testing", but hopefully soon 9.0). My plan is to start with the systems least reliant on immediate security support -- ie, those that are not exposed to the Internet directly. I have done this before, going from Ubuntu Lucid (10.04) to Ubuntu Trusty (14.04) in two larger steps, both of which were Ubuntu LTS distributions.

Most of these older "still Debian Wheezy" systems were originally much older Debian installs that have already been incrementally upgraded several times. For the two hosts that I looked at this week, the oldest one was originally installed as Debian Sarge, and the newest one was originally installed as Debian Etch, as far as I can tell -- although both have been re-homed on new hardware since the original installs. From memory the Debian Sarge install ended up being a Debian Sarge install only due to the way that two older hosts were merged together some years ago -- some parts of that install date back to even older Debian versions, around Debian Slink, first released in 1999. So there are 10-15 years of legacy install decisions there, as well as both systems having a number of additional packages installed for specific long-discarded tasks that create additional clutter (such is the disadvantage of the traditional Unix "one big system" approach, versus the modern approach of many VMs or containers). While I do have plans to gradually break the remaining needed services out into separate, automatically built, VMs or containers, that is clearly not going to happen overnight :-)

The first step in planning such an upgrade is to look at the release notes -- both the Debian 8 (Jessie) release notes and the Debian 9 (Stretch) release notes.

The upgrade instructions are relatively boilerplate (prepare for an upgrade, check system status, change apt sources, minimal package updates then full package updates) but do contain hints as to possible upgrade problems with specific packages and how to work around them.

The "issues to be aware of" contain a lot of compatibility hints of things which may break as a result of the upgrade. In particular Debian 8 (Jessie) brings:

  • Apache 2.4, which both has a significantly different configuration syntax and only includes configuration files ending in .conf (breaking, eg, virtual host files named after just the domain name); the Squid proxy configuration syntax also changes (see the Squid 3.2, 3.3, and 3.4 release notes, particularly the Helper Name Changes).
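
    For example, a minimal sketch of adapting a Wheezy-era virtual host file so Apache 2.4 will still include it (the example.com file name here is hypothetical; use whatever is actually in sites-available):

    # Apache 2.4 only includes sites-enabled/*.conf, so rename the old
    # vhost file and re-enable it under the new name
    sudo a2dissite example.com
    sudo mv /etc/apache2/sites-available/example.com \
            /etc/apache2/sites-available/example.com.conf
    sudo a2ensite example.com.conf
    sudo apache2ctl configtest && sudo service apache2 reload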

  • systemd (in the form of systemd-sysv) by default, which potentially breaks local init changes (or custom scripts), and halt no longer powering off by default -- that behaviour apparently being declared "a bug that was never fixed" in the old init scripts, after many many years of it working that way. It got documented, but that is about it. (IMHO the only use of "halt but do not power off" is in systems like Juniper JunOS where a key on the console can be used on the halted system to cause it to boot again in the case of accidental halts; it is not clear that actually works with systemd. systemd itself has of course been rather controversial, eventually leading to Devuan Jessie 1.0, which is basically Debian Jessie without systemd. While I am not really a fan of many of systemd's technical decisions, the adoption by most of the major Linux distributions makes interaction with it inevitable, so I am not going out of my way to avoid it on these machines.)
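
    In practice this just means being explicit about what you want (these are standard systemd commands, nothing site-specific):

    sudo systemctl halt        # stop the OS but leave the power on
    sudo systemctl poweroff    # stop the OS and power off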

  • The "nobody" user (and others) will have their shell changed to /usr/sbin/nologin -- which mostly affects running commands like:

    sudo su -c /path/to/command nobody
    

    Those commands instead need to be run as:

    sudo su -s /bin/bash -c /path/to/command nobody
    

    Alternatively you can choose to decline the change for just the nobody user -- the upgrade tool asks about each user's shell change during an interactive upgrade if your debconf question priority is medium or lower. In my case nobody was the last user shell change mentioned.

  • systemd will start, fsck, and mount both / and /usr (if it is a separate device) during the initramfs. In particular this means that if they are RAID (md) or LVM volumes they need to be started by the time the initramfs runs, or be startable by the initramfs. There also seem to be some races around this startup, which may mean that not everything starts correctly; at least once I got dumped into the systemd rescue shell, and had to run "vgchange -a y" for systemd, wait for everything to be automatically mounted, and then tell it to continue booting (exit), but on one boot it came up correctly by itself so it is definitely a race. (See, eg, Debian bug #758808, Debian bug #774882, and Debian bug #782793. The latter reports a fix in lvm2 2.02.126-3 which is not in Debian Jessie, but is in Debian Stretch, so I did not try too hard to fix this in Debian Jessie before moving on. The main system I experienced this on booted correctly, first time, on Debian Stretch, and continued to reboot automatically, whereas on Debian Jessie it needed manual attention pretty much every boot.)
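
    For reference, the manual recovery that worked from the systemd rescue shell was roughly (a sketch from memory):

    # at the systemd rescue/emergency shell prompt:
    vgchange -a y    # activate all LVM volume groups by hand
    # wait for systemd to notice and mount the remaining file systems, then:
    exit             # leave the rescue shell and continue booting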

Debian 9 (Stretch) seems to be bringing:

  • Restrictions around separate /usr (it must be mounted by initramfs if it is separate; but the default Debian Stretch initramfs will do this)

  • net-tools (arp, ifconfig, netstat, route, etc) are deprecated (and not installed by default) in favour of the iproute2 (ip ...) commands. That is a problem for cross-platform finger macros that have worked for 20-30 years... so I suspect net-tools will be a common optional package for quite a while yet :-)
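
    A rough mapping for retraining those finger macros (the iproute2/ss commands are standard; the option equivalents are only approximate):

    ip address show    # replaces: ifconfig -a
    ip route show      # replaces: route -n
    ip neigh show      # replaces: arp -n
    ss -tlnp           # replaces: netstat -tlnp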

  • A warning that a Debian 8.8 (Jessie) or Debian 9 (Stretch) kernel is needed for compatibility with the PIE (Position Independent Executable) compile mode for executables in Debian 9 (Stretch), and thus it is extra important to (a) install all Debian 8 (Jessie) updates and reboot before upgrading to Debian 9 (Stretch), and (b) reboot very soon after upgrading to Debian 9 (Stretch). This also affects, eg, the output of file -- reporting shared object rather than executable (because the executables are now compiled more like shared libraries, for security reasons). (Position independent code (PIC) is also somewhat slower on register-limited machines like 32-bit x86 -- but gcc 5.0+ contains some performance improvements for PIC which apparently help reduce the penalty. This is probably a good argument to prefer amd64 -- 64-bit mode -- for new installs. And even the x86 support is i686 or higher only; Debian Jessie is the last release to support i586 class CPUs.)
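
    For example, checking a binary with file after the upgrade (the exact wording varies by architecture; the key change is "shared object" versus "executable"):

    file /bin/ls
    # on a PIE build this reports "... shared object ..." where a non-PIE
    # build would have reported "... executable ..."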

  • SSH v1, and older ciphers, are disabled in OpenSSH (although it appears Debian Stretch will have a version where they can still be turned back on; the next OpenSSH release is going to remove SSH v1 support entirely, and it is already removed from the development tree). Also ssh root password login is disabled on upgrade. These ssh changes are a particular upgrade risk -- one would want to be extra sure of having an out-of-band console to reach any newly upgraded machine before rebooting it.
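
    Before rebooting a newly upgraded machine it is also worth confirming the effective sshd settings (and that a key-based or non-root login works); for example:

    sudo sshd -t                               # syntax-check the new configuration
    sudo sshd -T | grep -i permitrootlogin     # show the effective root login policy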

  • Changes around apt package pinning calculations (although it would be best to remove all pins and alternative package repositories during the upgrade anyway).

  • The Debian FTP Servers are going away which means that ftp URLs should be changed to http -- the ftp.CC.debian.org names seem likely to remain for the foreseeable future for use with http.
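
    Changing the sources is just a matter of editing the URL scheme in /etc/apt/sources.list; the ftp.nz.debian.org mirror below is only an example, keep whatever mirror is already configured:

    # before:  deb ftp://ftp.nz.debian.org/debian stretch main
    # after:   deb http://ftp.nz.debian.org/debian stretch main
    sudo sed -i 's,ftp://,http://,g' /etc/apt/sources.list
    sudo apt-get update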

I have listed some notes on issues experienced below for future reference, and will update this list with anything else I find as I upgrade more of the remaining legacy installs over the next few months.

Debian 7 (Wheezy) to Debian 8 (Jessie)

  • webkitgtk (libwebkitgtk-1.0-common) has limited security support. To track down why this is needed:

    apt-cache rdepends libwebkitgtk-1.0-common
    

    which turns up libwebkitgtk-1.0-0, which is used by a bunch of packages. To find the installed packages that need it:

    apt-cache rdepends --installed libwebkitgtk-1.0-0
    

    which gives libproxy0 and libcairo2, and repeating that pattern shows that many installed packages depend on libcairo2. Ultimately iceweasel / firefox-esr is one of the key triggering packages (but not the only one). I chose to ignore this until getting to Debian Stretch -- and once on Debian Stretch I will enable backports to keep firefox-esr relatively up to date.

  • console-tools has been removed, due to being unmaintained upstream, which is relatively unimportant for my systems, which are mostly VMs (with only a serial console) or okay with the default Linux kernel console. (The other packages removed on upgrade appear to just be, eg, old versions of gcc, perl, or other packages replaced by newer versions with a new name.)

  • /etc/default/snmpd changed, which removes custom options and also disables the mteTrigger and mteTriggerConf features. The main reason for the change seems to be to put the PID file into /run/snmpd.pid instead of /var/run/snmpd.pid. /etc/snmp/snmpd.conf also changes by default, which will probably need to be merged by hand.
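
    A sketch of that merge process (the .dpkg-dist suffix applies if you kept the local file at the conffile prompt; if you took the maintainer's version, diff against the .dpkg-old copy instead):

    diff -u /etc/default/snmpd /etc/default/snmpd.dpkg-dist
    diff -u /etc/snmp/snmpd.conf /etc/snmp/snmpd.conf.dpkg-dist
    # merge the relevant changes by hand, then:
    sudo service snmpd restart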

    On SNMP restart a bunch of errors appeared:

    Error: Line 278: Parse error in chip name
    Error: Line 283: Label statement before first chip statement
    Error: Line 284: Label statement before first chip statement
    Error: Line 285: Label statement before first chip statement
    Error: Line 286: Label statement before first chip statement
    Error: Line 287: Label statement before first chip statement
    Error: Line 288: Label statement before first chip statement
    Error: Line 289: Label statement before first chip statement
    Error: Line 322: Compute statement before first chip statement
    Error: Line 323: Compute statement before first chip statement
    Error: Line 324: Compute statement before first chip statement
    Error: Line 325: Compute statement before first chip statement
    Error: Line 1073: Parse error in chip name
    Error: Line 1094: Parse error in chip name
    Error: Line 1104: Parse error in chip name
    Error: Line 1114: Parse error in chip name
    Error: Line 1124: Parse error in chip name
    

    but snmpd apparently started again. The line numbers are too high to be /etc/snmp/snmpd.conf, and as bug report #722224 notes, the filename is not mentioned. An upstream mailing list message implies it relates to the lm_sensors support, and the same issue happened on upgrade from SLES 11.2 to 11.3. The discussion in the SLES thread pointed at hyphens in chip names in /etc/sensors.conf being the root cause.

    As a first step, I removed libsensors3 which was no longer required:

    apt-get purge libsensors3
    

    That appeared to be sufficient to remove the problematic file, and then:

    service snmpd stop
    service snmpd start
    service snmpd restart
    

    all ran without producing that error. My assumption is that the old /etc/sensors.conf was from a much older install, and no longer in the preferred location or format. (For the first upgrade where I encountered it, the machine was now a VM, so lm-sensors reading "hardware" sensors was not particularly relevant.)

  • libsnmp15 was removed, but not purged. The only remaining file was /etc/snmp/snmp.conf (note not the daemon configuration, but the client configuration), which contained:

    #
    # As the snmp packages come without MIB files due to license reasons, loading
    # of MIBs is disabled by default. If you added the MIBs you can reenable
    # loading them by commenting out the following line.
    mibs :
    

    on default systems to prevent the SNMP MIBs from being loaded. Typically one would want to enable SNMP MIB usage, and thus get names of things rather than just long numeric OID strings. snmp-mibs-downloader appears to still exist in Debian 8 (Jessie), but it is in non-free.

    The snmp client package did not seem to be installed, so I installed it manually along with snmp-mibs-downloader:

    sudo apt-get install snmp snmp-mibs-downloader
    

    which caused that package, rather than libsnmp15, to own the /etc/snmp/snmp.conf configuration file -- which makes more sense. After that I could purge both libsnmp15 and console-tools:

    sudo apt-get purge libsnmp15 console-tools
    

    (console-tools was an easy choice to purge as I had not actively used its configuration previously, and thus could be pretty sure that none of it was necessary.)

    To actually use the MIBs one needs to comment out the "mibs :" line in /etc/snmp/snmp.conf manually, as per the instructions in the file.
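
    That can be done with a quick edit, eg:

    # comment out the "mibs :" line so the downloaded MIBs are loaded
    sudo sed -i 's/^mibs :/# mibs :/' /etc/snmp/snmp.conf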

  • Fortunately it appeared I did not have any locally modified init scripts which needed to be ported. The suggested check is:

    dpkg-query --show -f'${Conffiles}' | sed 's, /,\n/,g' | \
       grep /etc/init.d | awk 'NF,OFS="  " {print $2, $1}' | \
       md5sum --quiet -c
    

    and while the first system I upgraded had one custom-written init script, it was for an old tool which no longer mattered, so I just left it to be ignored.

    I did have problems with the rsync daemon, as listed below.

  • Some "dummy" transitional packages were installed, which I removed:

    sudo apt-get purge module-init-tools iproute
    

    (replaced by udev/kmod and iproute2 respectively). The ttf-dejavu packages also showed up as "dummy" transitional packages but owned a lot of files so I left them alone for now.

  • Watching the system console revealed the errors:

    systemd-logind[4235]: Failed to enable subscription: Launch helper exited with unknown return code 1
    systemd-logind[4235]: Failed to fully start up daemon: Input/output error
    

    which some users have reported when being unable to boot their system, although in my case it happened before rebooting so possibly was caused by a mix of systemd and non-systemd things running.

    systemctl --failed reports:

    Failed to get D-Bus connection: Unknown error -1
    

    as in that error report, possibly due to the wrong dbus running; the running dbus in this system is from the Debian 7 (Wheezy) install, and the systemd/dbus interaction changed a lot after that. (For complicated design choice reasons, historically dbus could not be restarted, so changing it requires rebooting.)

    The system did reboot properly (although it appeared to force a check of the root disk), so I assume this was a transitional update issue.

  • There were quite a few old Debian 7 (Wheezy) libraries, which I found with:

    dpkg -l | grep deb7
    

    that seemed no longer to be required, so I removed them manually. (Technically that only finds packages with security updates within Debian Wheezy, but those seem the most likely to be problematic to leave lying around.)

    At one point after the upgrade apt-get offered a large selection of packages to autoremove, but after some other tidy up and rebooting it no longer showed any packages to autoremove; it is unclear what caused that change in the report. I eventually found the list in my scrollback and pasted the contents into /tmp/notrequired, then did:

    for PKG in $(cat /tmp/notrequired); do echo $PKG; done | tee /tmp/notrequired.list
    dpkg -l | grep -f /tmp/notrequired.list
    

    to list the ones that were still installed. Since this included the libwebkitgtk-1.0-common and libwebkitgtk-1.0-0 packages mentioned above, I did:

    sudo apt-get purge libwebkitgtk-1.0-common libwebkitgtk-1.0-0
    

    to remove those. Then I went through the remainder of the list, and removed anything marked "transitional" or otherwise apparently no longer necessary to this machine (eg, where there was a newer version of the same library installed). This was fairly boring rote cleanup, but given my plan to upgrade straight to Debian 9 (Stretch) it seemed worth starting with a system as tidy as possible.

    I left installed the ones that seemed like I might have installed them deliberately (eg, -perl modules) for some non-packaged tool, just to be on the safe side.

  • I found yet more transitional packages to remove with:

    dpkg -l | grep -i transitional
    

    and removed them with:

    sudo apt-get purge iceweasel mailx mktemp netcat sysvinit
    

    after using "dpkg -L PACKAGE" to check that they contained only documentation; sysvinit contained a couple of helper tools (init and telinit) but their functionality has been replaced by separate systemd programs (eg systemctl) so I removed those too.

    Because netcat is useful, I manually installed the real package that the transitional package had depended on, to ensure it stayed marked as an installed package:

    sudo apt-get install netcat-traditional
    

    While it appeared that multiarch-support should also be removable as a no-longer-required transitional package, since it was listed as transitional and contained only manpages, in practice attempts to remove it resulted in libc6 wanting to be removed too, which would rapidly lead to a broken system. (On my system the first attempt failed on gnuplot, which was individually fixable by installing, eg, gnuplot-nox explicitly and removing the gnuplot meta package, but since removing multiarch-support led to removing libc6 I did not end up going down that path.)

    For consistency I also needed to run aptitude and interactively tell aptitude about these decisions.

  • After all this tidying up, I found nothing was listening on the rsync port (tcp/873) any longer. Historically I had run the rsync daemon using /etc/init.d/rsync, which still existed, and still belonged to the rsync package.

    sudo service rsync start
    

    did work, to start the rsync daemon, but it did not start at boot. Debian Bug #764616 provided the hint that:

    sudo systemctl enable rsync
    

    was needed to enable it starting at boot. As Tobias Frost noted on Debian Bug #764616 this appears to be a regression from Debian Wheezy. It appears the bug eventually got fixed in rsync package 3.1.2-1, but that did not get backported to Debian Jessie (which has 3.1.1-3) so I guess the regression remains for everyone to trip over :-( If I were not already planning on upgrading to Debian Stretch I might have suggested backporting the fix.

  • inn2 (for UseNet) is no longer provided without LFS (Large File Support) on 32-bit (x86); only the LFS-enabled inn2 package is supported, and it has a different on-disk database format (64-bit pointers rather than 32-bit pointers). The upgrade is not automatic (due to the incompatible database format), so you have to touch /etc/news/convert-inn-data and then retry the inn2 upgrade:

    You are trying to upgrade inn2 on a 32-bit system where an old inn2 package
    without Large File Support is currently installed.
    
    Since INN 2.5.4, Debian has stopped providing a 32-bit inn2 package and a
    LFS-enabled inn2-lfs package and now only this LFS-enabled inn2 package is
    supported.
    
    This will require rebuilding the history index and the overview database,
    but the postinst script will attempt to do it for you.
    
    [...]
    
    Please create an empty /etc/news/convert-inn-data file and then try again
    upgrading inn2 if you want to proceed.
    

    Because this aborts the package installation, it causes apt-get dist-upgrade to fail, which leaves the system in a partially upgraded, messy state. For systems with inn2 installed on 32-bit this is probably the biggest upgrade risk.

    To try moving forward:

    sudo touch /etc/news/convert-inn-data
    sudo apt-get -f install
    

    All going well the partly installed packages will be fixed up, then:

    [ ok ] Stopping news server: innd.
    Deleting the old overview database, please wait...
    Rebuilding the overview database, please wait...
    

    will run (which will probably take many minutes on most non-trivial inn2 installs; in my case these are old inn2 installs, which have been hardly used for years, but do have a lot of retained posts, as a historical archive). You can watch the progress of the intermediate files needed for the overview database being built with:

    watch ls -l /var/spool/news/incoming/tmp/
    watch ls -l /var/spool/news/overview/
    

    in other windows, but otherwise there is no real indication of progress or how close you are to completion. The "/usr/lib/news/bin/makehistory -F -O -x" process that is used in rebuilding the overview file is basically IO bound, but also moderately heavy on CPU. (The history file index itself, in /var/lib/news/history.* seems to rebuild fairly quickly; it appears to be the overview files that take a very long time, due to the need to re-read all the articles.)

    It may also help to know where makehistory is up to reading, eg:

    MKHISTPID=$(ps axuwww | awk '$11 ~ /makehistory/ && $12 ~ /-F/ { print $2; }')
    sudo watch ls -l "/proc/${MKHISTPID}/fd"
    

    which will at least give some idea which news articles are being scanned. (As far as I can tell one temporary file is created per UseNet group, which is then merged into the overview history; the merge phase is quick, but the article scan is pretty slow. Beware the articles are apparently scanned in inode order rather than strictly numerical order, which makes it harder to tell group progress -- but at least you can tell which group it is on.)

    On one of my older news servers, with pretty slow disk IO, rebuilding the overview file took a couple of hours of wall clock time. It is slow even given the disk bandwidth, because it makes many small read transactions. This is for about 9 million articles, mostly in a few groups where a lot of history was retained, including single groups with 250k-350k articles -- and thus stored in a single directory by inn2, on ext4 (but probably without directory indexes, due to being created as ext2/ext3).

    Note that all of this delay blocks the rest of the system upgrade, because the rebuild is done in the post-install script -- and the updated package will bail out of the install if you do not let it do that rebuild. Given the time required it seems like a less disruptive upgrade approach could have been chosen, particularly as the issue is not mentioned at all, as far as I can see, in the "Issues to be aware of for Jessie" page. My inclination for the next one would be to hold inn2, upgrade everything else first, then come back to upgrading inn2 and anything held back because of it.
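
    A sketch of that hold-and-defer approach for next time (apt-mark hold is the standard mechanism; nothing here is specific to these systems):

    sudo apt-mark hold inn2          # keep the old inn2 during the main upgrade
    sudo apt-get dist-upgrade        # upgrade everything else first
    # ... later, when there is time for the long overview rebuild:
    sudo touch /etc/news/convert-inn-data
    sudo apt-mark unhold inn2
    sudo apt-get dist-upgrade        # now upgrade inn2 and anything held back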

    Some searching turned up the suggestion of enabling ext4 dir_index handling to speed up access to larger directories:

    sudo service inn2 stop
    sudo umount /dev/r1/news
    sudo tune2fs -O dir_index,uninit_bg /dev/r1/news
    sudo tune2fs -l /dev/r1/news
    sudo e2fsck -fD /dev/r1/news
    sudo mount /dev/r1/news
    sudo service inn2 start
    

    I apparently did not do this on the previous OS upgrade to avoid locking myself out of using earlier OS kernels; but these ext4 features have been supported for many years now.

    In hindsight this turned out to be a bad choice, causing a lot more work. It is unclear if the file system was already broken, or if changing these options and doing partial fscks broke it :-( At minimum I would suggest doing an e2fsck -f /dev/r1/news before changing any options, to at least know whether the file system is good before the options are changed.

    In my case when I first tried this change I also set "-O uninit_bg" since it was mentioned in the online hints, and then after the first e2fsck, tried to do one more "e2fsck -f /dev/r1/news" to be sure the file system was okay before mounting it again. But apparently parts of the file system need to be initialised by a kernel thread when "uninit_bg" is set.

    I ended up with a number of reports like:

    Inode 8650758, i_size is 5254144, should be 6232064.  Fix? yes
    Inode 8650758, i_blocks is 10378, should be 10314.  Fix? yes
    

    followed by a huge number of reports like:

    Pass 2: Checking directory structure
    Directory inode 8650758 has an unallocated block #5098.  Allocate? yes
    Directory inode 8650758 has an unallocated block #5099.  Allocate? yes
    Directory inode 8650758 has an unallocated block #5100.  Allocate? yes
    Directory inode 8650758 has an unallocated block #5101.  Allocate? yes
    

    which were too numerous to allocate by hand (although I did try saying "yes" to a few), and they could not be fixed automatically (eg, not fixable by "sudo e2fsck -pf /dev/r1/news").

    It is unclear if this was caused by "-O uninit_bg", or some earlier issue on the file system (this older hardware has not been entirely stable), or whether there was some need for more background initialisation to happen which I interrupted by mounting the disk, then unmounting it, and then deciding to check it again.

    Since the file system could still be mounted, I tried making a new partition and using tar to copy everything off it before trying to repair it. But the tar copy also produced many, many kernel messages like:

    Jun 11 19:12:10 HOSTNAME kernel: [24027.265835] EXT4-fs error (device dm-3): __ext4_read_dirblock:874: 
    inode #9570798: block 6216: comm tar: Directory hole found
    

    and in general the copy proceeded extremely slowly (way way below the disk bandwidth). So I gave up on trying to make a tar copy first, as it seemed like it would take all night with no certainty of completing. I assume these holes are the same "unallocated blocks" that fsck complained about.

    Given that the news spool was mostly many-year-old articles which I had not looked at in years, I instead used dd to make a bitwise copy of the partition:

    dd if=/dev/r1/news of=/dev/r1/news_backup bs=32768
    

    which ran at something approaching the underlying disk speed, and at least gives me a "broken" copy to try a second repair on if I find a better answer later.

    Running a non-interactive "no change" fsck:

    e2fsck -nf /dev/r1/news
    

    indicated the scope of the problem was pretty huge, with both many unallocated block reports as above, and also many errors like:

    Problem in HTREE directory inode 8650758: block #1060 has invalid depth (2)
    Problem in HTREE directory inode 8650758: block #1060 has bad max hash
    Problem in HTREE directory inode 8650758: block #1060 not referenced
    

    which I assume indicate dir_index directories that did not get properly indexed, as well as a whole bunch of files that would end up in lost+found. So the file system was pretty messed up.

    Figuring backing out might help, I turned dir_index off again:

    tune2fs -O ^dir_index /dev/r1/news
    tune2fs -l /dev/r1/news
    

    There were still a lot of errors when checking with e2fsck -nf /dev/r1/news, but at least some of them were directories with the INDEX_FL flag set on a filesystem without htree support, so it seemed like letting fsck fix those would avoid a bunch of the later errors.

    So as a last ditch attempt, no longer really caring about the old UseNet articles (and knowing they are probably on the previous version of this host's disks anyway), I tried:

    e2fsck -yf /dev/r1/news
    

    and that did at least result in fewer errors/corrections, but it did throw a lot of things in lost+found :-(

    I ran e2fsck -f /dev/r1/news again to see if it had fixed everything there was to fix, and at least it did come up clean this time. On mounting the file system, there were 7000 articles in lost+found, out of several million on the file system. So I suppose it could have been worse. Grepping through them, they appear to have been from four Newsgroups (presumably the four inodes originally reported as having problems), and all are ones I do not really care about any longer. inn2 still started, so I declared success at this point.

    At some point perhaps I should have another go at enabling dir_index, but definitely not during a system upgrade!

  • python2.6 and related packages, and squid (2.x; replaced by squid3) needed to be removed before db5.1-util could be upgraded. They are apparently linked via libdb5.1, which is not provided in Debian Jessie, and which db5.1-util declares broken unless it is a newer version than was in Debian Wheezy. In Debian Jessie only the binary tools are provided, and apt offers to uninstall them as an unneeded package.
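
    The removals themselves were straightforward; a sketch (the package names below are typical Wheezy ones, so check what dpkg -l actually reports before purging):

    dpkg -l | grep -E 'python2\.6|squid|libdb5\.1'    # see what is actually installed
    sudo apt-get purge python2.6 python2.6-minimal squid squid-common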

    Also netatalk is in Debian Wheezy and depends on libdb5.1, but is not in Debian Jessie at all. This surprised other people too, and netatalk seems to be back in Debian Stretch. But it is still netatalk 2.x, rather than netatalk 3.x which has been released for years; someone has attempted to update the netatalk package to netatalk 3.1, but that effort also seems to have been abandoned for the last couple of years. (Because I was upgrading through to Debian Stretch, I chose to leave the Debian Wheezy version of netatalk, and libdb5.1 from Debian Wheezy, installed until after the upgrade to Debian Stretch.)

Debian 8 (Jessie) to Debian 9 (Stretch)

  • Purged the now removed packages:

    # dpkg -l | awk '/^rc/ { print $2 }'
    fonts-droid
    libcwidget3:i386
    libmagickcore-6.q16-2:i386
    libmagickwand-6.q16-2:i386
    libproxy1:i386
    libsigc++-2.0-0c2a:i386
    libtag1-vanilla:i386
    perl-modules
    #
    

    with:

    sudo apt-get purge $(dpkg -l | awk '/^rc/ { print $2 }')
    

    to clear out the old configuration files.

  • Checked changes in /etc/default/grub:

    diff /etc/default/grub.ucf-dist /etc/default/grub
    

    and updated grub using update-grub.
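
    ie, once happy with the merged /etc/default/grub:

    sudo update-grub    # regenerate /boot/grub/grub.cfg from the new defaults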

  • Checked changes in /etc/ssh/sshd_config:

    grep -v "^#" /etc/ssh/sshd_config.ucf-old | grep '[a-z]'
    grep -v "^#" /etc/ssh/sshd_config | grep '[a-z]'
    

    and checked that the now commented-out lines are the defaults. Then I checked that sshd stops/starts/restarts with the new configuration:

    sudo service ssh stop
    sudo service ssh start
    sudo service ssh restart
    

    and that ssh logins work after the upgrade.

  • The isc-dhcp-server service failed to start because it wanted to start both the IPv4 and IPv6 servers, and the previous configuration (and indeed the network) only had IPv4 configured:

    dhcpd[15518]: No subnet6 declaration for eth0
    

    Looking further back in the log I saw:

    isc-dhcp-server[15473]: Launching both IPv4 and IPv6 servers [...]
    

    with the hint "(please configure INTERFACES in /etc/default/isc-dhcp-server if you only want one or the other)".

    Setting INTERFACES in /etc/default/isc-dhcp-server currently works to avoid starting the IPv6 server, but it results in a warning:

    DHCPv4 interfaces are no longer set by the INTERFACES variable in
    /etc/default/isc-dhcp-server.  Please use INTERFACESv4 instead.
    Migrating automatically for now, but this will go away in the future.
    

    so I edited /etc/default/isc-dhcp-server and changed it to set INTERFACESv4 instead of INTERFACES.
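
    eg, something like this in /etc/default/isc-dhcp-server (eth0 is just the interface on this particular host):

    # serve DHCPv4 only, on the internal interface
    INTERFACESv4="eth0"
    INTERFACESv6=""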

    After that:

    sudo service isc-dhcp-server stop
    sudo service isc-dhcp-server start
    sudo service isc-dhcp-server restart
    

    worked without error, and syslog reported:

    isc-dhcp-server[15710]: Launching IPv4 server only.
    isc-dhcp-server[15710]: Starting ISC DHCPv4 server: dhcpd.
    
  • The /etc/rsyslog.conf has changed somewhat, particularly around the syntax for loading modules. Lines like:

    $ModLoad imuxsock # provides support for local system logging
    

    have changed to:

    module(load="imuxsock") # provides support for local system logging
    

    I used diff /etc/rsyslog.conf /etc/rsyslog.conf.dpkg-dist to find these changes and merged them by hand. I also removed any old commented out sections no longer present in the new file, but kept my own custom changes (for centralised syslog).

    Then tested with:

    sudo service rsyslog stop
    sudo service rsyslog start
    sudo service rsyslog restart
    
  • This time, even after reboot, apt-get reported a whole bunch of unneeded packages, so I ran:

    sudo apt-get --purge autoremove
    

    to clean them up.

  • An aptitude search:

    aptitude search '~i(!~ODebian)'
    

    from the Debian Stretch Release Notes on Checking system status provided a hint on finding packages which used to be provided, but are no longer present in Debian. I went through the list by hand and manually purged anything which was clearly an older package that had been replaced (eg old cpp and gcc packages) or was no longer required. There were a few that I did still need, so I have left those installed -- but it would be better to find a newer Debian packaged replacement to ensure there are updates (eg, vncserver).

  • Removing the Debian 8 (Jessie) kernel:

    sudo apt-get purge linux-image-3.16.0-4-686-pae
    

    gave the information that the libc6-i686 library package was no longer needed, as in Debian 9 (Stretch) it is just a transitional package, so I did:

    sudo apt-get --purge autoremove
    

    to clean that up. (I tried removing the multiarch-support "transitional" package again at this point, but there were still a few packages with unmet dependencies without it, including gnuplot, libinput10, libreadline7, etc, so it looks like this "transitional" package is going to be with us for a while yet.)

  • update-initramfs reported a wrong UUID for resuming (presumably due to the swap having been reinitialised at some point):

    update-initramfs: Generating /boot/initrd.img-4.9.0-3-686-pae
    W: initramfs-tools configuration sets RESUME=UUID=22dfb0a9-839a-4ed2-b20b-7cfafaa3713f
    W: but no matching swap device is available.
    I: The initramfs will attempt to resume from /dev/vdb1
    I: (UUID=717eb7a5-b49c-4409-9ad2-eb2383957e77)
    I: Set the RESUME variable to override this.
    

    which I tracked down to the config in /etc/initramfs-tools/conf.d/resume, which contains only that one line.

    To get rid of the warning I updated the UUID in /etc/initramfs-tools/conf.d/resume to match the new auto-detected one, and tested that worked by running:

    sudo update-initramfs -u
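
    For reference, rewriting that single-line file can be as simple as (using the UUID that update-initramfs itself suggested above):

    echo 'RESUME=UUID=717eb7a5-b49c-4409-9ad2-eb2383957e77' | \
        sudo tee /etc/initramfs-tools/conf.d/resume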
    
  • The log was being spammed with:

    console-kit-daemon[775]: missing action
    console-kit-daemon[775]: GLib-CRITICAL: Source ID 6214 was not found when attempting to remove it
    console-kit-daemon[775]: console-kit-daemon[775]: GLib-CRITICAL: Source ID 6214 was not found when attempting to remove it
    

    messages. Based on the hint that consolekit is not necessary since Debian Jessie in the majority of cases, and knowing almost all logins to this server are via ssh, I followed the instructions in that message to remove consolekit:

    sudo apt-get purge consolekit libck-connector0 libpam-ck-connector
    

    to silence those messages. (This may possibly be a Debian 8 (Jessie) related tidy up, but I did not discover it until after upgrading to Debian 9 (Stretch).)

  • A local internal (ancient, Debian Woody vintage) apt repository no longer works:

    W: The repository 'URL' does not have a Release file.
    N: Data from such a repository can't be authenticated and is therefore potentially dangerous to use.
    N: See apt-secure(8) manpage for repository creation and user configuration details.
    

    Since the one needed local package was already installed long ago, I just commented that repository out in /etc/apt/sources.list. (The process for building apt repositories has changed considerably in the last 10-15 years.)

  • After upgrading and rebooting, on one old (upgraded many times) system systemd-journald and rsyslogd were running flat out after boot, and lpd was running regularly. Between them they were spamming the /var/log/syslog file with:

    lpd[847]: select: Bad file descriptor
    

    lines, many, many, many times a second. I stopped lpd with:

    sudo service lpd stop
    

    after which the system load returned to normal and the log lines stopped. The lpd in this case was provided by the lpr package:

    ewen@HOST:~$ dpkg -S /usr/sbin/lpd
    lpr: /usr/sbin/lpd
    ewen@HOST:~$
    

    and it did not seem to have changed much since the Debian Jessie lpr package -- Debian Wheezy had 1:2008.05.17+nmu1, Debian Jessie had 1:2008.05.17.1, and Debian Stretch has 1:2008.05.17.2. According to the Debian Changelog the only difference between Debian Jessie and Debian Stretch is that Debian Stretch's version was updated to later Debian packaging standards.

    Searching on the Internet did not turn up anyone else reporting the same issue in lpr.

    Doing:

    sudo service lpd start
    

    again a while after boot did not produce the same symptoms, so for now I have left it running.

    However some investigation in /etc/printcap revealed that this system had not been used for printing for quite some time, as its only printer entries referred to printers that had been taken out of service a couple of years earlier. So if the problem reoccurs I may just remove the lpr package completely.

    ETA, 2017-07-14: This happened again after another (unplanned) reboot (caused by multiple brownouts getting through the inexpensive UPS). Because I did not notice in time, it then filled up / with a 4.5GB /var/log/lpr.log, with endless messages of:

    Jul 14 06:25:25 tv lpd[844]: select: Bad file descriptor
    Jul 14 06:25:25 tv lpd[844]: select: Bad file descriptor
    Jul 14 06:25:25 tv lpd[844]: select: Bad file descriptor
    

    so since I had not used the printing functionality on this machine in a long time, I ended up just removing it completely:

    sudo cp /dev/null /var/log/lpr.log
    sudo cp -p /etc/printcap /etc/printcap-old-2017-07-14
    sudo apt-get purge lpr
    sudo logrotate -f /etc/logrotate.d/rsyslog
    sudo logrotate -f /etc/logrotate.d/rsyslog
    

    which seemed more time efficient than trying to debug which file descriptor it was complaining about (my guess is perhaps one which systemd closed for lpd that the previous init system did not, but I have not investigated that in detail). I kept a copy of /etc/printcap in case I do want to try to restore the printing functionality (or debug it later), but most likely I would just set up printing from scratch.

    The two (forced) log rotates were to force compression of the other copies of the 4GB of log messages (in /var/log/syslog, which rotates daily by default, and /var/log/messages which rotates weekly by default), having removed /var/log/lpr.log which was another 4.5GB. Unsurprisingly they compress quite well given the logs were spammed with a single message -- but even compressed they are still about 13MB.

After fixing up those upgrade issues the first upgraded system seems to have been running properly on Debian 9 (Stretch) for the last few days, including helping publish this blog post :-)

ETA, 2017-06-11: Updates, particularly around inn2 upgrade issues.

ETA, 2017-06-17: Updates on boot issues in jessie, fixed by stretch.