Fundamental Interconnectedness

This is the occasional blog of Ewen McNeill. It is also available on LiveJournal as ewen_mcneill, and Dreamwidth as ewen_mcneill_feed.

Repairing the Amstrad CPC6128 floppy drives

The Amstrad CPC6128 (and the Amstrad CPC664 for about 6 months before it) was supplied with a relatively rare 3" (not 3.5") single sided floppy drive. The 3" drives were used in a few other computers, including the Amstrad PCW82xx range and some Sinclair Spectrum computers (Amstrad bought Sinclair's computer assets in the mid-1980s).

Thanks to a generous donation by a LCA2018 attendee, I now have two Amstrad CPC6128 units -- my original one, and a second one that had spent 20 years sitting in someone's garage (it was very dusty and needed a lot of cleaning!).

Because the 3" drives are relatively rare, the 3" floppy discs (CF2) are also fairly rare and rather expensive, and broken drives need to be repaired rather than replaced -- the drives essentially only exist in 30+ year old machines, so any replacement is likely to need repair itself. The 3" floppy discs were over NZ$10 each new in the 1980s, and used 3" floppy discs are still around NZ$10 each second hand in 2018; I bought five (5) "new old stock" -- ie, unopened shrink wrap -- 3" discs earlier this year for NZ$75, which seemed about the going rate.

Beyond a general clean of the floppy drive, to remove 30+ years of accumulated dust and grime, the stepper motor worm drive also needs lubricating, and inevitably the floppy drive belt needs replacing. The floppy drive belt runs between the capstan connected to the spindle motor (near the back of the drive) and the drive spindle (in the centre!) to spin the disc; it is made of thin (about 0.5mm) rubber, and over time it dries out, goes tacky, and breaks (like most rubber).

DataServe Retro, in the United Kingdom provides an excellent service supplying replacement parts/kits for servicing the drives, as well as detailed instructions on servicing the drives. A few other people have also posted their instructions on repairing the drive, and there is also a video demonstration of the most common problem with the Amstrad CPC6128 drive, and replacing the drive belt to repair it. (See also repairing an Amstrad FD1 external drive, and replacing the Amstrad CPC6128 drive write protect sense pin; also for completeness an Amstrad CPC464 overview from the same site.)

I had previously tried to replace the drive belt in my original Amstrad CPC6128 in 2004, with a drive belt purchased locally based on trying to measure the (broken) original belt -- and it almost worked, but only if the drive was already spun up to speed, so it required a lot of "R"(etry) attempts to do anything. That unreliability was the main reason why I did not try to copy the 3" discs back in 2004 when I imaged a lot of other floppy discs.

Given that I now had two Amstrad CPC6128s, neither with a working disc drive, I decided it was worth trying to replace both drive belts. Due to the distance and international postage costs, I bought the Amstrad CPC6128 drive servicing kit and a twin pack of drive belts (for a total of three drive belts) as well as a spare write protect pin just in case (which turned out not to be needed; I believe one of my drives does use the write protect sensing pin, but I was very careful not to turn the drives over during the belt replacement, having been forewarned of problems, so it appears the existing pin stayed in place).

I also bought some second hand CF2 3" discs, and another set of Amstrad CPC6128 original system discs (CP/M Plus, CP/M 2.2, utilities, etc) just in case my originals were unreadable due to age. (It appears they are now sold out of second hand CF2 3" discs.)

In both cases, replacing the drive belt was fiddly, but possible. I had two different 3" drive models (archived version) to work on in the two systems (assembled about 9-12 months apart), but the same drive belt worked for both of them, despite the fairly different internal geometry. Based on the pictures I believe the drive in my original Amstrad CPC6128 is an EME-150A (archived version; the original model, manufactured by Matsushita); the drive in the Amstrad CPC6128 I was given was labelled as an EME-156 (archived version; which does use a mechanical sensing pin for write protect; in hindsight, I did see the sense switch, but I did not see the pin itself -- fortunately I was very careful not to turn the drive upside down while it was open, and writing to the discs still works, so apparently the pin is still in place).

My original Amstrad CPC6128

The locally obtained (2004) replacement (wrong size) drive belt, in my original Amstrad CPC6128, was fortunately still intact, so I had very little cleaning to do on that drive and it was a fairly quick replacement. The drive appeared to work properly immediately after reassembling it with the correct drive belt replacement, which was extremely pleasing -- I was able to copy the spare system discs onto one of the spare floppies immediately without errors.

Comparing the locally obtained drive belt from 2004 with the exact replacement part from DataServe Retro, I found that my guessed replacement part was 10mm too long (circumference) and 0.5mm too wide. I assume my original drive belt had stretched in addition to breaking, before I measured it, causing me to overestimate the required length of the drive belt (I had ordered a 72mm x 0.5mm x 3mm flat belt in 2004; I believe the 72mm is a notional diameter, assuming the belt is laid out as a circle. That dimension probably relates to the manufacturing process.)

As best I can tell from measuring it, the exact replacement belt is about 70mm x 0.5mm x 2.5mm -- about 2mm smaller in diameter, which works out to 7-10mm shorter in circumference (it may be a notional 69mm, but I think it is 70mm; I measured it as about 220mm circumference). That explains why my guessed replacement drive belt in 2004 sort of worked -- it was slipping a bit, but if the disc was up to speed it was almost close enough to keep it spinning. (Ironically, if I had realised at the time I could have ordered the slightly shorter belt; I suspect either the 2.5mm or 3mm width might work, as the capstan is pretty wide, but the 2.5mm width fits the spindle better. Since 2004 the local distributor of drive belts I used appears to have gone out of business, or at least taken their ordering website down.)
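As a sanity check on those sizes (my own arithmetic, not from any belt datasheet; the assumption is that the quoted size is a notional laid-flat diameter, and circumference is π times that diameter): a 70mm belt is about 220mm around, matching what I measured, and the 2mm difference in notional diameter is only about 6mm of circumference -- near the lower end of my 7-10mm estimate above.

```python
import math

# Notional laid-flat diameters (mm) of the two belts; assumption:
# circumference = pi * notional diameter.
guessed_2004 = 72    # the belt I ordered locally in 2004
exact_part = 70      # the exact replacement from DataServe Retro

for d in (guessed_2004, exact_part):
    print(f"{d}mm notional diameter -> {math.pi * d:.0f}mm circumference")

print(f"difference: about {math.pi * (guessed_2004 - exact_part):.0f}mm")
```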

Second (donated) Amstrad CPC6128

The second Amstrad CPC6128 -- very dusty from 20 years in a garage -- was more difficult to get working properly. The drive belt had snapped, but fortunately just in one place, so it was fairly easy to clean the capstan and replace the drive belt. However, getting the head stepper motor/worm drive working properly was rather more difficult, and after a small amount of use the drive started squealing loudly at times when the disc was spinning. I ended up:

  • taking the drive apart a second time, and cleaning it again, including taking the drive belt off again and cleaning the capstan/spindle a second time;

  • lubricating the stepper motor worm drive a second time;

  • lubricating the top mounting point pressure plate of the drive spindle and the bottom of the spindle mount point, as well as the capstan itself;

  • exercising the drive a lot to encourage the stepper motor worm drive to pick up the lubrication added.

Initially after the first cleaning/drive belt replacement the head stepper motor was totally stuck. It took multiple very noisy attempts to use the drive before it managed to break free of its stiction and get back to track 0 so "cat" would work. After that it took several more attempts to get it to even step to the second track reliably, and many more attempts to get to the point where it could format/verify/copy an entire disc and sound like it was working properly.

After several hours of effort it appears the second drive is pretty much working reliably, but I do not yet trust it as much as my original Amstrad CPC6128 drive given how long it spent accumulating dust in a garage.

For cleaning I just used isopropyl alcohol (fairly easily obtained from electronics supply stores) and cotton buds. For lubrication I used Abloy Lock Oil, mostly because I already had it and it is designed not to pick up additional dust in use (DataServe Retro recommend a grease like vaseline on the stepper motor worm drive, but I did not have any at hand). I guess time will tell whether I made the right choice of lubrication (the main thing I would stress is to use small amounts of lubricant -- the last thing you want is excess lubricant spun out over the floppy discs during use!).

Exercising the Amstrad CPC6128 floppy drive


From BASIC the main way to exercise the floppy drive is to try loading or saving BASIC programs from the floppy disc. To do this I used one of the second hand CF2 3" discs that I had bought, so I could test with a disc that I knew did not contain anything important. (While it would have been possible to attempt to read a disc with existing data on it, I was reluctant to trust such a disc to the "not yet working properly" drive, so I stuck with simple cat/load/save commands on a known reformatted disc.)

My initial attempts with the dusty Amstrad CPC6128 were:

  • get a disc catalogue -- with cat -- to force the drive to read from track 0 (and thus pull the drive head back to track 0)

  • write a tiny BASIC program (eg, a few lines) and then save that to the disc under multiple names:

    save "test1
    save "test2
    save "test3

    and so on. It appears the fourth or fifth copy of a 1KiB program saved involves stepping to the second track, which proved problematic for a while (and at one point it appeared to write track 1 over track 0, which caused "cat" to no longer work on that disc -- eventually fixed by reformatting that side of the disc after I had the drive working better; see below).

  • try loading BASIC programs back off the disc (eg, reset and load the tiny program I had written).

Doing this repeatedly got the drive to the point where it could move the head enough to read in the CP/M boot disc (which it was not able to do at first). Then I was able to exercise the drive more from CP/M.

From CP/M Plus

On the Amstrad CPC6128, the only official way to format or copy a whole disc is from CP/M Plus, which needs to be booted from disc -- the first system disc supplied with the Amstrad CPC6128 (or in my case also bought as a second hand copy). It took me some time to find/remember how to format discs or copy them, and it does not appear to be very well documented online, so I am recording the instructions from the Amstrad CPC6128 manual -- Chapter 1 Pages 39-42 -- here for future reference (by me, or someone else).

To boot up CP/M Plus, insert side 1 of the system discs, and then type:

|cpm

where the "|" introduces an RSX command -- basically an extension command for AMSDOS, the disc system for the built-in Amstrad CPC BASIC; there were about a dozen RSX commands built in to the system ROMs, and lots of third party ROMs, discs, peripherals, etc, defined their own RSX commands.

Typing "|cpm" causes the Amstrad CPC6128 to read CP/M off the disc in the first disc drive -- which is expected to be a "System Disc", ie, one with boot sectors and a copy of the CP/M system image. Actually booting CP/M Plus involves reading about 30KiB off the disc, which requires reading multiple tracks -- so on the rather dirty Amstrad CPC6128 drive it took multiple attempts and other drive exercise/cleaning before it would seek well enough to boot CP/M Plus.

Once CP/M Plus has booted to the prompt announcing it is drive A, it is possible to work with whole floppy discs by running:

disckit3
which is a menu-based program for copying, formatting, and verifying floppy discs. It can work either with one disc drive (in which case copying discs involves changing discs several times) or with two disc drives (if you have an external drive).

From the disckit3 menu, you have the choices of:

  • f7: Copy disc

  • f4: Format disc

  • f1: Verify disc

  • f0: Exit from program

These are selected using the function keys at the right hand side of the keyboard, even though the menu displayed on screen shows just 7 / 4 / 1 / 0 -- you just have to know to use the function keys :-( Invalid keys result in a system bell beep as the only feedback, so it was initially very confusing until I found the relevant manual pages and realised the function keys were required.

For exercising the drive (ie, the dirty problematic Amstrad CPC6128 drive) the best options are:

  • Verify disc (which will try to read the whole disc); and

  • Format disc (which will reformat each track of the disc)

both of which involve seeking to every track on the disc. I did both about a dozen times on the problematic drive, and eventually -- after a second cleaning -- it seemed to start moving freely enough to work properly. (I believe "Verify disc" only checks that it can read each sector/track of the disc, without actually checking the contents beyond the sector checksums; but for forcing the drive head to step it is ideal if you have an already formatted disc to test with.)

When formatting a disc you have three choices of format:

  • f9: System format

  • f6: Data format

  • f3: Vendor format

  • .: Exit menu (back to the top menu)

The system, data, and vendor formats use different sector numbering, but have the same native capacity -- 180KiB per side. The system format has a copy of CP/M's boot information on it (and when formatting a system disc it will first ask for an existing system disc to read the required system tracks); the vendor format is the same as the system format, but without the system tracks initialised (just the space reserved). The data format makes most of the native space available for storing data (178KiB usable, with 2KiB reserved for directory information; by comparison the system and vendor formats have 169KiB usable).
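Those capacity figures can be reconstructed from the disc geometry -- 40 tracks of 9 × 512-byte sectors per side -- plus the 2KiB directory and, for the system/vendor formats, two reserved tracks. This is my arithmetic, assuming the usual AMSDOS layout, not figures from the manual:

```python
SECTOR_BYTES = 512
SECTORS_PER_TRACK = 9
TRACKS = 40

native = TRACKS * SECTORS_PER_TRACK * SECTOR_BYTES   # bytes per side
directory = 2 * 1024                                 # 64 entries x 32 bytes
reserved = 2 * SECTORS_PER_TRACK * SECTOR_BYTES      # 2 system tracks

print(native // 1024)                           # 180 (KiB native capacity)
print((native - directory) // 1024)             # 178 (KiB usable, data format)
print((native - reserved - directory) // 1024)  # 169 (KiB usable, system/vendor)
```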

Once I had exercised the problematic drive enough that it appeared to be working reliably, I then copied the original CP/M Plus system disc that I had purchased onto one of the second hand CF2 3" reformatted discs, so that I could use that instead of the originals (this is also recommended early on in the Amstrad CPC6128 manual).

To do this, boot CP/M Plus and start disckit3 (as above), then:

  • Insert the CP/M Plus system disc (if it is not already inserted)

  • Press f7 to Copy a disc

  • Press "Y" to confirm you want to start copying the disc

  • On a single drive system, disckit3 will now read the first 15 tracks into RAM (67 KiB, which interestingly is larger than the 61KiB CP/M TPA, so I assume it is also buffering some of the data in the second 64KiB RAM bank...)

  • When prompted to insert the disc for WRITE, remove the original and insert the blank disc to be overwritten, then press a key to continue. It will write out the first 15 tracks to the new disc.

  • After the first 15 tracks are written, it will prompt you to insert the disc for READ, and then when you press a key to continue, it will read the next 15 tracks into RAM, and prompt you to insert the disc for WRITE. When you swap discs and press a key to continue the next 15 tracks are written out.

  • On the final pass, you are requested to insert the disc to READ, and when you press a key the final 10 tracks are read into memory, and then written out to the copy after you insert the disc to WRITE and press a key.
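The pass sizes make sense given the RAM available: 15 tracks of 9 × 512-byte sectors is 67.5KiB per pass, matching the roughly 67KiB disckit3 reads into memory, and 15 + 15 + 10 covers all 40 tracks. A quick check (my arithmetic, not from the manual):

```python
TRACK_BYTES = 9 * 512        # one track: 9 sectors of 512 bytes each
passes = [15, 15, 10]        # tracks handled in each read/write pass

assert sum(passes) == 40     # all 40 tracks of one side copied
for tracks in passes:
    print(f"{tracks:2d} tracks -> {tracks * TRACK_BYTES / 1024:.1f} KiB buffered")
```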

If you are doing this on a single drive machine it is very good practice to write protect the READ disc, and write enable the WRITE disc, to minimise the chances of inserting the wrong disc at the wrong time in the six disc swaps. (The program does check that you removed/inserted a disc, but obviously cannot easily check which disc you inserted!)

With a second hand disc and second hand drive, using the "Verify disc" function at the main menu after copying the disc, to check that the copy is readable, is probably also a very good idea.

(From memory with two disc drives connected, it will simply copy all tracks from A: to B: without needing further user intervention beyond putting the discs in the right drives and confirming you want to overwrite the destination disc. No disc swaps required, which means the chances of inserting the wrong disc at the wrong time are pretty low; but I would still suggest write protecting the source disc just to be safe!)

Alternative disc formatting

While there are no user-accessible routines to format a disc from AMSDOS, there are some hidden RSX functions, callable from machine code, to format a disc using a combination of Seek Track and Format Track. (They cannot be called directly from AMSDOS, as their names are non-ASCII characters.)

While hunting for a way to reformat floppy discs on the Amstrad CPC6128 (partly to exercise the drive, partly because non-working seeking had caused track 1's information to overwrite track 0 on one of my test blank discs), I came across a blog post providing a BASIC listing that loads an RSX to format a disc from AMSDOS (archived version). It is a fairly short assembly program, with a HEX loader in BASIC, small enough that it could be typed in by hand if necessary. I have not used it, as I eventually got CP/M Plus to boot even on the partly working drive, and so used disckit3 from the CP/M Plus system discs instead. But it could be useful for systems without any working system discs. (The blog post includes a simple, uncommented, disassembly of the source code, which does not confirm it is using the hidden RSX functions, but given the short length that seems likely. There is also a format.dsk linked from this CPC forum thread, apparently including source, which I assume is likely the same one.)

Of course, given that the Amstrad CPC has no memory protection or IO controller at all, with care (to avoid anything else accessing the drive) it would also be possible to send raw seek/format commands to the NEC 765 floppy controller directly. I suspect that would also require some machine code assistance, to issue IN and OUT instructions (see also the Z80 Assembly Guide and other Z80 Programming Information; Locomotive BASIC has PEEK and POKE to access memory, but nothing to access the IO bus). I assume direct commands to the NEC 765 floppy controller are how the various alternative disc format tools worked.
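For the curious, a sketch of what "raw commands" would look like. On the CPC the uPD765's main status register and data register are (I believe) at I/O addresses &FB7E and &FB7F, and each command is a short byte sequence written to the data register. This Python fragment just assembles a couple of those sequences -- the port addresses and command encodings are from my recollection of the uPD765 datasheet, so treat them as assumptions to verify:

```python
FDC_STATUS = 0xFB7E   # uPD765 main status register port on the CPC (assumption)
FDC_DATA = 0xFB7F     # uPD765 data register port on the CPC (assumption)

def seek(drive, head, cylinder):
    """uPD765 SEEK command (0x0F): step the head to a given cylinder."""
    return [0x0F, (head << 2) | (drive & 3), cylinder]

def recalibrate(drive):
    """uPD765 RECALIBRATE command (0x07): seek back to track 0."""
    return [0x07, drive & 3]

# A Z80 routine would poll FDC_STATUS until the controller is ready for
# the next byte, then OUT each byte in turn to FDC_DATA.
print(seek(0, 0, 5))    # [15, 0, 5]
print(recalibrate(0))   # [7, 0]
```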

Posted Sun Mar 18 17:04:18 2018 Tags:

Today, "Do Not Reply" at the Inland Revenue Department (IRD) got in touch (again) to let me know that they were updating the look of their website for filing GST returns:

What’s changing?

Here’s a summary of the myIR changes:

* The ‘My GST’ section will change to ‘My business’.

* The design, layout and tab labels will change to make it
  clearer and easier for you to find your way around and use our
  online services, as well as give you assurance of where you are
  in the process. You’ll notice we’ve:

  * Created more whitespace

  * Increased the text size

  * Changed the tabs on the first page of the new ‘My business’
    section.  They will change to: Accounts, Submitted (where
    you can see any saved drafts, submitted or processed returns),
    Correspondence (messages or letters previously under ‘Activity
    centre’), Registration details and Logons

  * Changed ‘Quick links’ to be 'I want to...' 

* Continue to file, pay and amend your GST.

and advises "Information about these new changes is also available on our website." (archived version); the description of the changes is also online (archived version), and it does include a few non-GST related changes.

The email goes on to advise that there will be four and a half days of downtime for the IRD website to make these changes:

Some of our services will be unavailable while we make these changes

In order to make these important changes some of our key services
will be unavailable between the afternoon of Thursday 12 April
and the morning of Tuesday 17 April. During this time you will
not be able to access myIR Secure Online Services or contact
us through our contact centres. Any secure mail messages saved
as drafts and any draft returns within myIR will be deleted as
part of this process. Please therefore check your secure mail
messages and submit any draft returns before Thursday 12 April.

Those dates are confirmed on the IRD website (archived version), and repeated on a second IRD web page (archived version) whose URL hints that they are treating this as a fork lift upgrade.

As someone who works in the IT infrastructure industry, and follows the technical blogs of major online sites, that strikes me as an outage window at least four days longer than is common in modern IT practice, particularly for an upgrade that is described to end users as changing the names/ordering of some tabs on a website, and updating the layout. (Many high tech online sites do regular A/B testing of site layout changes all the time, in production, without any interruption.) Even major banking web sites, which do periodically announce outages, seem to manage to do their work with only 12-24 hour maintenance windows.

It is also conventional to schedule an extremely long maintenance window over a public holiday long weekend -- such as Easter, coming up in a couple of weeks -- rather than on an ordinary weekend, where a four and a half day outage inevitably ends up crossing at least two business days (thus requiring a lot of business process to manage being effectively "closed for business" on a regular business day).

To add to the fun, this apparently major upgrade -- four and a half days, presumably ending in a mainframe being wheeled out the door -- is being done immediately after the end of the financial year for most New Zealanders and New Zealand businesses. If it is, as it appears, a bigger upgrade than "changed the names of some tabs" and "updated the layout", then one would hope that IRD is very very certain that the new version works perfectly; major upgrades are usually scheduled for a quieter time of year where possible, for good reasons!

I guess I will not be filing my GST return in the middle of April....

Posted Thu Mar 15 11:11:55 2018 Tags:


Almost 32 years ago (on 1986-04-30), after much begging, my parents bought an Amstrad CPC6128 with an Amstrad GT65 Green Screen Monitor from Porterfield Computers Ltd in Wellington. Together with a word processor (Tasword 6128), the Pitman Typing Tutor, and two 3" Floppy Discs (narrower than the much more common 3.5" floppy discs), this all cost NZ$1500.65 -- in 1986 dollars (over NZ$3500 in 2017 dollars, according to the Reserve Bank Inflation Calculator; also packing slip from Grandstand). An Amstrad DMP-2000 printer came a week later, for NZ$695.00 (again 1986 dollars; over NZ$1600 in 2017 dollars), and about a year later an Amstrad FD1 3" external second drive for NZ$299.00 (in 1987; around NZ$700 in 2017 dollars).

For a middle class family in New Zealand in the 1980s that was quite a bit of money. It changed my life forever. While it was originally a "family" computer, it fairly rapidly became my computer, because I was the one using it all the time.

I learnt to touch type on the Amstrad CPC, learnt to program on the Amstrad CPC, and even learnt about computer hardware from expanding the Amstrad CPC6128. For about 6-7 years the Amstrad CPC6128 was my "everyday" computer -- a long time for any computer, and an especially long time when the personal computer market was changing so rapidly. Over the years I added two Dk'tronics 256KiB Memory Expansion modules (for a total of 576KiB -- bank switched on an 8-bit microprocessor! -- as the first 64KiB of the expansion overlaid the second 64KiB on the Amstrad CPC6128), an Amstrad Serial Interface, an external 5.25" floppy drive and even an MFM hard drive via a SCSI controller board and a home built SCSI adapter.

The Amstrad Serial Interface (and the poor software with it, which could not even keep up with 2400bps, let alone display IBM PC extended characters and formatting) led me to write EwenTerm, the largest Z80 Assembler program that I wrote (and one I used daily for several years to keep up with multiple BBSes) -- which also changed my life forever. EwenTerm's source was also the basis of Glenn Wilton's BBS Terminal, developed by another Wellington, New Zealand local. Glenn took the source I wrote and polished it into a more useful communication program, including file transfers (EwenTerm development basically stopped once I had reliable "ANSI terminal" functionality working).

(For several years I had "EwenTerm" online with a URL that suggested it was "ansiterm" -- a URL I had chosen because the program did not really have a name and was written to be an ANSI terminal. More recently this has led to some confusion, as there was an AnsiTerm by Paul Martin, which was well reviewed in Amstrad Computer User Issue 4 on pages 48 and 49, but seems to be a different program, independently developed at about the same time. Unfortunately I appear to have added to the confusion by responding to an email from Kevin Thacker saying "I would like to contact Ewen McNeill concerning his program for the Amstrad CPC called 'Ansiterm'." with the link to my software, resulting in an AnsiTerm page on the CPC that links to both my EwenTerm source and the review about Paul Martin's AnsiTerm. I have emailed Kevin Thacker again to try to get this corrected, and it appears to be corrected there too now.)

The hard drive leads us to our story today. Around 1989 I bought a second hand ST-506 interface MFM hard drive to add to my Amstrad CPC6128. It was 10MB, and even second hand cost a few hundred dollars. The Amstrad CPC6128 did not have a hard drive interface (although various third parties created some later), but I eventually found out about SCSI to MFM Controller cards and ordered one of those second hand (from the USA). Then to link the Amstrad CPC6128 to the SCSI 1 interface I built a simple adapter card that plugged into the Amstrad CPC6128 expansion interface. On a 4MHz Z80 CPU reaching even the 3.5MB/s to 5MB/s of SCSI 1 was a challenge, but due to the buffering in the SCSI to MFM controller, it worked although probably not as fast as the controller and drive were capable of working.

I used the hard drive on the Amstrad CPC6128, under CP/M Plus along with the two Dk'tronics RAM expansion modules, and CP/M BIOS extensions written by Bevan Arps (then of Christchurch, New Zealand) for a couple of years, from around 1991 to 1993. Eventually the call of IBM PC Compatible hardware became too strong, and I put together the first of many PC Compatible machines -- from second hand parts -- which gradually supplanted my Amstrad CPC6128 as my "daily computer".

Floppy disc backup

In the middle of 1993 I made a backup of my Amstrad CPC6128 hard drive, onto floppy disks, and then on 1994-03-31 (date I wrote on the box!) the Amstrad CPC was packed away in boxes. I got those boxes out about 14 years ago -- in 2004 -- and made an attempt at transferring some of the floppy disks onto a PC, but between issues with reading the floppy disks (due to age), and difficulty reading the contents of the floppy disks (due to CP/M file formats) that project gradually got overtaken by events and sat on the hard drive of one of my Linux machines for a decade.

Several recent events encouraged me to take another look, including randomly being given a second Amstrad CPC6128 at a conference earlier this year, as well as being contacted by Kevin Thacker about AnsiTerm, and Jason Scott's repeated encouragement to Close the Air Gap and get things online (someone "closing the air gap" gave us the Walnut Creek CP/M CDROM from 1994 back again -- I bought a copy of it in 1994 when it came out, but somewhere in the last decade I put it "somewhere safe" and cannot remember where that is... :-( ).

Fortunately the backup of my Amstrad CPC6128 hard drive was one of the sets of disks where I managed to copy all the backup disks that I could find into disk images back in 2004, so the "air gap" had already been closed -- they just had not been unpacked. (Unfortunately I could find only discs 1/11 to 11/15, suggesting that 4 discs are missing -- but I no longer remember if there ended up being 11 discs total or 15 discs total. It appears the disks may have been re-used from an earlier backup, which perhaps did require 15 disks.)

Unfortunately, the backups were on 5.25" double sided floppy discs in an extended disc format for NigDos, developed in New Zealand, with CP/M Plus support released on WACCI PD Disc 7. There is also a NigDos 2.24 ROM image available, but it has no documentation. It seemed like extracting the data from the backup disks would be non-trivial.

It turned out that WACCI PD Disk 7 can be downloaded (Side A CRC A2CDE9B5; Side B CRC 12FEA3FC; two 180KiB disk images). The disk image checksums can be checked with crc32:

ewen@linux:~$ crc32 *
a2cde9b5    WACCI-PD-Disc-07-Side-A.dsk
12fea3fc    WACCI-PD-Disc-07-Side-B.dsk

which match the ones listed on the download page.
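For anyone without the crc32 utility handy, the same CRC-32 check can be done from Python's standard zlib module (a sketch; the filename is just an example):

```python
import zlib

def file_crc32(path):
    """CRC-32 of a file, formatted like the output of the crc32(1) tool."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large disk images do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return f"{crc & 0xFFFFFFFF:08x}"

# e.g. file_crc32("WACCI-PD-Disc-07-Side-A.dsk") should return "a2cde9b5"
```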

So I thought it was worth reading those disks to see if I could find out any more information about the disk format used.

Side quest: Accessing Amstrad CPC .dsk files

Which means that we get to find out (again!) how to read files from Amstrad CPC .dsk images on Linux. There seem to be a few options, including SamDisk, libdsk's dskconv, dskinfo, and cpcxfs (all described below).

As it turns out the .dsk images downloaded above were created with SamDisk, and appear to be unreadable with anything other than SamDisk. So in order to extract them we need to use SamDisk to convert them. (I did try converting them with dskconv from libdsk, but it also was unable to read them :-( I am unclear how SamDisk extended the extended disk format, but the extension appears to confuse other tools...).

However SamDisk works with full disk images, rather than the file system within the disk image, so it cannot extract the files from the disk image. Which means that we need to use SamDisk to convert the disk image to something that we can read with another tool. The output formats of SamDisk are relatively limited -- it will write .dsk files, but it appears only in its Extended format, which is what we already have.

After conversion to a .raw file, we can then use cpmtools to read files out of the disk image.


To build SamDisk:

git clone https://github.com/simonowen/samdisk
cd samdisk
sudo apt-get install cmake
cmake -DCMAKE_BUILD_TYPE=Release .
make

which after a while, will give us a samdisk binary, that can be installed with:

sudo make install

(The build takes a couple of minutes, as SamDisk is written in C++, which compiles relatively slowly.)

After installation, we can test that it recognises the disk images with:

samdisk dir WACCI-PD-Disc-07-Side-A.dsk

and then convert them to .raw disk images (ie, no sector headers) which cpmtools can read with:

samdisk copy WACCI-PD-Disc-07-Side-A.dsk WACCI-PD-Disc-07-Side-A.raw
samdisk copy WACCI-PD-Disc-07-Side-B.dsk WACCI-PD-Disc-07-Side-B.raw

On these files SamDisk complains that "source missing 27 sectors from 43/1/9/512 regular format", but this appears to be because the .dsk headers indicate 43 tracks while the actual data contains only the standard 40 tracks, so I ignored it.
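The warning is at least self-consistent: the header claims 43 tracks, the data holds the standard 40, and 3 tracks of 9 sectors each is exactly the 27 "missing" sectors (my arithmetic):

```python
header_tracks = 43       # tracks declared in the .dsk header
actual_tracks = 40       # standard 40-track data actually present
sectors_per_track = 9    # from the "43/1/9/512" geometry in the warning

missing = (header_tracks - actual_tracks) * sectors_per_track
print(missing)  # 27 -- matches SamDisk's complaint
```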


cpmtools has been packaged in Debian Linux for years, so once we have a .raw image file, we can just use the packaged version directly and specify the implicit disk format of the sectors. To do this:

sudo apt-get install cpmtools
cpmls -f cpcdata WACCI-PD-Disc-07-Side-A.raw
cpmls -f cpcdata WACCI-PD-Disc-07-Side-B.raw
(mkdir side-a && cd side-a && cpmcp -f cpcdata ../WACCI-PD-Disc-07-Side-A.raw 0:* .)
(mkdir side-b && cd side-b && cpmcp -f cpcdata ../WACCI-PD-Disc-07-Side-B.raw 0:* .)
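If the stock cpcdata definition had not matched, cpmtools formats are described by entries in its diskdefs file, and a custom entry can be added for unusual formats. As an illustration only -- this is roughly what I understand the stock cpcdata entry to look like (check the diskdefs file shipped with your cpmtools package before relying on these numbers):

```
diskdef cpcdata
  seclen 512      # 512-byte sectors
  tracks 40       # 40 tracks per side
  sectrk 9        # 9 sectors per track
  blocksize 1024  # 1KiB allocation blocks
  maxdir 64       # 64 directory entries (2KiB)
  boottrk 0       # no reserved boot tracks in the data format
  os 2.2          # CP/M 2.2 style directory
end
```

A similar hand-written entry might be one route to reading the NigDos extended format backups, if the geometry can be worked out.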

Other tools

Ultimately SamDisk and cpmtools was the only combination that worked with the *.dsk files that I had downloaded (imaged by SamDisk). But I did try several of the other tools listed above before determining this, and recorded how to build them, so I am keeping those details below for future reference. They do appear to work with other .dsk images, presumably those imaged earlier or with different tools.


To build dskinfo:

cd dskinfo
mv makefile Makefile
rm -f dskinfo *.o
make

which builds a single dskinfo binary in the source directory (and the archive ships with a pre-built version, built in 2011).

dskinfo usage is simple:

dskinfo disk_image.dsk

dskinfo outputs verbose (!) information about the sectors in the .dsk file, but also confirms that it is an "Extended CPCEMU style" .dsk file image. However the tool just outputs metadata; it does not actually read the data out of the disk.


Encouraged by that I built cpcxfs:

cd cpcxfs/src
mv makefile.lnx Makefile
make clean
make

(There are lots of warnings, mostly about const safeness and signedness safeness, but it does build.)

cpcxfs usage is either via the command line or interactive (if run without a command it goes into interactive mode). To use from the command line:

  • List files on disk: cpcxfs DISK_IMAGE.dsk -d

  • Get a file from the disk: cpcxfs DISK_IMAGE.dsk -g ...

  • Put a file on the disk: cpcxfs DISK_IMAGE.dsk -p ...

  • Get multiple files from the disk: cpcxfs DISK_IMAGE.dsk -mg ...

  • Put multiple files on the disk: cpcxfs DISK_IMAGE.dsk -mp ...

(-mg and -mp take CP/M style wildcards, eg, *.*; -g and -p take single filenames.)

Unfortunately (a) cpcxfs also does not support the "Extended CPCEMU style" .dsk format, at least as used in the WACCI PD Disk 7 .dsk files created by SamDisk that I downloaded, and (b) it complains it cannot open the disk image, even for a read-only directory listing, if the file is write protected.

The lack of support for (all? some?) Extended CPCEMU style disk images is particularly unfortunate as Kevin Thacker (who released cpcxfs) is one of the creators of the Amstrad CPCEMU Extended Disk Format.

That format information helped me confirm that the WACCI PD .dsk images I had downloaded were created with SamDisk under Microsoft Windows, because "SAMdisk130107" appears in the .dsk image header. Given that these disks were public domain disks, the use of the "Extended" .dsk format appears to be accidental (the "Extended" .dsk format was intended for copy protected disks), but one needs to work with what one can find. And unfortunately those .dsk images appear to be the only ones available online. (SamDisk supports a lot of formats.)


To build iDSK:

tar -xzf iDSK.0.13-src.tgz
cd iDSK.0.13/iDSK
./configure --prefix=/usr/local
make clean    # the downloaded archive includes .o files
make

Then the resulting binary is in src/iDSK, which you could copy somewhere on your PATH manually, or install with sudo make install (which appears to contain a lot of shell magic to do the same thing).

iDSK is written in French, including French help, but the commands are fairly obvious from context:

  • List files on disk: iDSK disk_image.dsk -l

  • Import file: iDSK -i file.bin -t 1 -s disk_image.dsk

  • Export file: iDSK -g file.bas -s disk_image.dsk

where the "type" (-t) is 0 for ASCII and 1 for binary.

Unfortunately iDSK did not support the Amstrad/Spectrum Extended .DSK data files used by the WACCI PD Disk files I downloaded, so it was not useful to me here; and since the output is in French (which I do not read very well) I am more likely to use other tools.


To build libdsk:

tar -xzf libdsk-1.5.8.tar.gz
cd libdsk-1.5.8
automake --add-missing
./configure --prefix=/usr/local
make

That then gives a library, and a series of tools:

  • dskconv: disk image conversion (new in 1.5.x)

  • dskdump: sector level copy from disk/image to another disk/image

  • dskform: format a floppy disk/image

  • dskid: identify a floppy disk/image

  • dskscan: scan a floppy disk for sectors

  • dsktrans: transfer from one disk/image to another disk/image

  • dskutil: a sector editor

which can be installed into /usr/local/ by running:

sudo make install

To actually run them on modern Linux it is necessary to ensure they can find the shared library:

echo /usr/local/lib | sudo tee /etc/
sudo ldconfig

After that, in theory we can use dskconv (from the libdsk 1.5.x series) to convert from the Extended .dsk image file to a standard .dsk image file. There is no man page for dskconv yet, so the help is just output by running the raw command:

ewen@linux:/var/tmp/amstrad$ dskconv
      dskconv {options} in-image out-image

Options are:
-itype <type>   type of input disc image
-otype <type>   type of output disc image
                'dskconv -types' lists valid types.
-format         Force a specified format name
                'dskconv -formats' lists valid formats.

Default in-image type is autodetect.
Default out-image type is LDBS.

eg: dskconv diag800b._01 diag800b.ldbs
    dskconv -otype raw diag800b._01 diag800b.ufi

The supported image file types are:

ewen@linux:/var/tmp/amstrad$ dskconv -types
Disk image types supported:

   remote     : Remote LibDsk instance
   rcpmfs     : Reverse CP/MFS driver
   floppy     : Linux floppy driver
   dsk        : CPCEMU .DSK driver
   edsk       : Extended .DSK driver
   apridisk   : APRIDISK file driver
   copyqm     : CopyQM file driver
   tele       : TeleDisk file driver
   ldbs       : LibDsk block store
   qrst       : Quick Release Sector Transfer
   imd        : IMD file driver
   ydsk       : YAZE YDSK driver
   raw        : Raw file driver (alternate sides)
   rawoo      : Raw file driver (out and out)
   rawob      : Raw file driver (out and back)
   myz80      : MYZ80 hard drive driver
   simh       : SIMH disc image driver
   nanowasp   : NanoWasp image file driver
   logical    : Raw file logical sector order
   jv3        : JV3 file driver
   cfi        : CFI file driver

And we could in theory convert a disk image with:

mkdir edsk dsk
# Put source files in edsk
dskconv -otype dsk edsk/WACCI-PD-Disc-07-Side-A.dsk dsk/WACCI-PD-Disc-07-Side-A.dsk

but unfortunately libdsk also reports the disk image is unreadable :-(

At this point I looked harder at SamDisk and discovered that there was source for a development version on GitHub, so I built that and used SamDisk and cpmtools to extract the files (as described above). It appears that SamDisk (and maybe some emulators) support an "Extended Extended" version of the .dsk format, and relatively few other tools support that specific format.


More searching turned up DiskImager, which looks like it should be able to visualise the extended .dsk files, and is written using the Lazarus FreePascal system (a Delphi-compatible free IDE). It does not support file extraction directly, but it appears it can convert between Extended and Standard .dsk file formats. Because of the need to install a custom development environment I decided to skip past this one for now (but it does look very useful for preservation work). Because I have not tried it I do not know if it supports the same Extended .dsk format variations as SamDisk.


Other searching turned up dsktools, originally written by Andreas Micklei and released on SourceForge and Berlios (now gone). The most recent version of dsktools is on GitHub.

To build dsktools:

git clone
cd dsktools/dsktools
make clean     # Ignore errors of files not existing
make

(Due to the way that the source was rescued from Berlios, both the git repository and the directory are called dsktools, so there is effectively a second level in the git repository.)

This gives two tools, dskread and dskwrite, which can be installed in /usr/local/bin with:

sudo make install

Usage is simply dskread or dskwrite, but they will only read from /dev/fd0 and write to /dev/fd0 respectively, so they are not useful for working with already imaged .dsk files :-(

Back to the main quest: reading the NigDos backups

Nigdos disk format

As mentioned above, I previously imaged the NigDos backup floppies on Linux with a simple C program (readnigdos -- readnigdos.c, and Makefile) that read raw sectors using the Linux floppy driver and output the raw sector data to a file, in the order I believed was in use.

"format.doc" and "extdisc.doc" from the WACCI-PD-Disc-07-Side-B extracted above (with SamDisk and cpmtools) give information on the NigDos formats:

The Extra Formats you can create are as follows:

        Name:                          Usable Space:

     V  Vendor Format                      169K
     D  Data Format                        178K
     P  PCW A Drive Format                 173K
     B  Nigdos Big Format                  208K
     T  Nigdos Two Side Format             416K
     C  CPM Big Format                     416K
     L  CPM Large Format                   716K
     H  CPM Huge Format                    816K

and that reminded me that the most likely format was the "Nigdos Big Format". But it provided no information on the precise layout of the disk format.

From my earlier guesses when I wrote readnigdos (14 years ago!), I believe the Nigdos Big Format is:

  • 512 byte sectors (size_ind=2 in CP/M terminology)

  • 10 sectors per track, numbered 0x91 to 0x9a (following the Amstrad CPC convention of using the sector number as a hint of the format)

  • 42 tracks per side (numbered 0 through 41)

  • 2 sides (0 and 1)

  • Tracks ordered "out then out", ie all of side 0 is used and then all of side 1 is used, ie:

    Track 0, Side 0
    Track 1, Side 0
    Track 2, Side 0
    ...
    Track 41, Side 0
    Track 0, Side 1
    Track 1, Side 1
    Track 2, Side 1
    ...
    Track 41, Side 1

I actually tried asking Bevan Arps if he remembered the track layout, but after 25 years he was guessing as much as I was. Bevan's thought was that it was probably alternating sides -- ie Track 0, Side 0 then Track 0, Side 1, to reduce seeking, and that probably would have been my first guess too for the same reason. However it turns out that both were supported -- "NigDos 420K Double Sided" was "all side 0 then all side 1" (in order, as above), with sectors 0x91 to 0x9a, and "CPM 420K Double Sided" was "interleaved side 0 and side 1" as we first guessed, with sectors 0xa1 to 0xaa (see end of blog post for some more detail).
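
The two orderings can be sketched as a logical-track to cylinder/head mapping (a sketch, assuming the 42 cylinders per side described above; the function names are mine):

```shell
tracks_per_side=42

# NigDos 420K: "out and out" -- all of head 0, then all of head 1.
nigdos_track() {
  if [ "$1" -lt "$tracks_per_side" ]; then
    echo "cylinder $1 head 0"
  else
    echo "cylinder $(( $1 - tracks_per_side )) head 1"
  fi
}

# CPM 420K: interleaved -- alternate heads on each cylinder.
cpm_track() {
  echo "cylinder $(( $1 / 2 )) head $(( $1 % 2 ))"
}

nigdos_track 43   # cylinder 1 head 1
cpm_track 43      # cylinder 21 head 1
```

So the same logical track lands on quite different parts of the disk in the two formats, which is why guessing the wrong layout scrambles the file system.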

Having actually installed cpmtools (above), it turned out that one of its disk definitions (in /etc/cpmtools/diskdefs) included the Nigdos format:

diskdef nigdos
  seclen 512
  # NigDos double sided disk format, 42 tracks * 2 sides
  tracks 84
  sectrk 10
  blocksize 2048
  maxdir 128
  skew 1
  boottrk 0
  # this format wastes half of the directory entry
  logicalextents 1
  os 3

which seemed to match what I had detected (512 bytes * 10 sectors/track * 42 tracks/side * 2 sides = 430080 bytes; 512 bytes * 10 sectors/track * 84 tracks = 430080 bytes). That means that, assuming the tracks are in the right order in the disk image file, cpmtools should be able to extract the files out of the backups. (And I would assume the same holds for both "NigDos 420K" and "CPM 420K", provided the tracks off the floppy disk end up in the image file in logical order rather than physical order.)
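
That arithmetic also gives a quick sanity check on the image files before handing them to cpmtools (a sketch; the *.nigdos glob matches my image file naming below):

```shell
# A NigDos 420K image should be exactly 430080 bytes.
expected=$(( 512 * 10 * 42 * 2 ))
for img in *.nigdos; do
  [ -e "$img" ] || continue        # glob may match nothing
  actual=$(wc -c < "$img")
  if [ "$actual" -eq "$expected" ]; then
    echo "$img: $actual bytes, OK"
  else
    echo "$img: $actual bytes, expected $expected"
  fi
done
```

A wrong size usually means a truncated read or a different format, and cpmtools will produce garbage rather than an error in that case.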

I believe the comment that "this format wastes half of the directory entry" relates to using 16-bit block numbers, rather than the 8-bit block numbers which would be sufficient for a 420KiB floppy disk using 2KiB blocks. This appears to have happened for backwards compatibility, and compatibility with the larger formats (eg, 716KiB and 816KiB above) which had more than 256 2KiB blocks and thus did need the 16-bit block numbers. In practice there were sufficiently many directory entries, and sufficiently little space, that running out of directory entries was not a problem; it is just an odd data recovery quirk to be aware of later.
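
The arithmetic behind that (a sketch; the 16 allocation bytes per directory entry is the standard CP/M directory layout):

```shell
# Each CP/M directory entry has 16 bytes of block pointers.
echo "8-bit block numbers:  $(( 16 * 2 ))KiB of data per entry"
echo "16-bit block numbers: $(( (16 / 2) * 2 ))KiB of data per entry"
# A 420K disk has only 210 blocks, so 8-bit numbers would have sufficed:
echo "blocks on disk: $(( 430080 / 2048 ))"
```

With 16-bit block numbers only 8 pointers fit per entry instead of 16, halving the data each entry can describe, hence "wastes half of the directory entry".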

Reading the backup disk images

According to the notes from the disk labels (which fortunately I also transcribed back in 2004) the backups contained:

  • User 0: disks 1-5

  • User 1: disks 5-9

  • User 2: disk 9

  • User 3: disk 9-10

  • User 4: disk 10-11 (and implicitly 12/13, but I could not find them back in 2004 when I imaged the backup floppy discs)

This tends to suggest that some files will be missing from the backup :-( (At least unless I eventually find the missing floppies, and they are still readable 25 years later.) However the last disk is not full (and still contains "deleted" data) so possibly these were old backup disks that were overwritten but the labels never updated to indicate that fewer disks were required.

Anyway I could list the files on the backup disks with:

for DISK_IMAGE in *.nigdos; do
    cpmls -f nigdos -F "${DISK_IMAGE}"
done

which gave sensible output, although only about 400 files in total. However 11 disks at 420KiB each works out to about 4.5MiB, which is roughly the portion of the hard drive I made available (because some information from the hard drive tables needed to be held in RAM, and holding more in RAM took up too much of the Amstrad CPC RAM space -- so I only ever made 5MiB available). Certainly the disk images seemed to all have 370-400KiB used.

I believe that cpmls will by default list the files in all users areas. That, plus scanning the CP/M directories in the "hexdump -C" output suggested that all the files on the backups were in User 0 of the floppy disks.

So I extracted all the files into a specific directory per disk with:

for DISK in $(seq -f "%02.0f" 1 11); do
    (mkdir -p "${DISK}" && cd "${DISK}" &&
     cpmcp -f nigdos -p ../"cpc-hd-backup-1993-06-19-disk-${DISK}-of-15.nigdos" "0:*" .)
done

and then zipped that into a more modern archive for future reference:

zip -9r [01]*

Looking through the backup it appears that it is mostly CP/M programs, which were the main thing I would have been likely to store on the hard drive rather than the floppy drive. But there are a few other gems included in there.

It appears that the files have been extracted properly, but the only easy way to tell is by reading text files, and looking for files stored with checksums (.ark/.arc, .zoo, .zip, etc).

Reading .ark/.arc files (the original System Enhancement Associates archive format) on Linux is possible with nomarch:

sudo apt-get install nomarch
nomarch -l ARCHIVE    # List files
nomarch -t ARCHIVE    # Test checksums of files in archive

Confirming the NigDos Double Sided Format (sectors 0x91 to 0x9a)

One of the gems from looking around was the source code to Bevan Arps's "format", "extdisc", etc, that he had apparently sent to me in 1989. So I sent that back to him. It seemed only fair :-)

The "extdisc" source file contained the definitions of the extra floppy disk formats:

DataFormat      equ   0
Block2K         equ 128
Sectors10       equ  32
Tracks8082      equ  64
NigDos          equ  16
Interlace       equ   4
TwoSideRev      equ   8
TwoSideNorm     equ  12
Reserve1Track   equ   1
Reserve2Track   equ   2

; Format Of Data

;               byte FirstSectorNumber
;               byte Flags
;               word DiscSpace


; NigDos 420K Double Sided

                byte &91
                byte DataFormat+Sectors10+NigDos+TwoSideNorm+Block2K
                word &d1

; CPM 420K Double Sided

                byte &a1
                byte DataFormat+Sectors10+Interlace+Block2K
                word &d1

The "SetFormat" routine in that source ends up rotating the flags values twice ("rra; rra" in Z80 assembly), and then storing them in the Amstrad Extended Disk Parameter Block (XDPB) as the "sidedness" field -- which is called "SID" in the source code.

As described in the XDPB documentation:

DEFB        sidedness
                        ;Bits 0-1:  0 => Single sided
                        ;           1 => Double sided, flip sides
                        ;              ie track   0 is cylinder   0 head 0
                        ;                 track   1 is cylinder   0 head 1
                        ;                 track   2 is cylinder   1 head 0
                        ;               ...
                        ;                 track n-1 is cylinder n/2 head 0
                        ;                 track   n is cylinder n/2 head 1
                        ;           2 => Double sided, up and over
                        ;              ie track   0 is cylinder 0 head 0
                        ;                 track   1 is cylinder 1 head 0
                        ;                 track   2 is cylinder 2 head 0
                        ;               ...
                        ;                 track n-2 is cylinder 2 head 1
                        ;                 track n-1 is cylinder 1 head 1
                        ;                 track   n is cylinder 0 head 1

which means:

  • "4" (Interlace) turns into "1" when rotated right twice, and so is interleaved/interlaced;

  • "8" (TwoSideRev) turns into "2" when rotated right twice, and so is up/over/back

  • "12" (TwoSideNorm) turns into "3" when rotated twice, and is specially handled in the patched "TranslateTrack in the source code, to implement "out and out", the format that I had guessed (it skips over the Interleave routine, and then skips over the "Up and Over" track number reversal logic, resulting in just "track = track - tracks_per_side").

This confirms my guesses that the disk format that I had uses all of side 0 in order, then all of side 1 in order, which means my readnigdos.c implements the right logic for the sector values it is reading.
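
The rotate-right-twice step can be sketched numerically (for these particular flag values a plain shift is equivalent to "rra; rra", since the low two bits are zero):

```shell
# Flags 4 (Interlace), 8 (TwoSideRev) and 12 (TwoSideNorm) become
# XDPB sidedness values 1, 2 and 3 respectively.
for flags in 4 8 12; do
  echo "flags $flags -> sidedness $(( (flags >> 2) & 3 ))"
done
```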


Overall this was a long adventure (14+ years if you count from when the floppy disks were read in; 25+ years from when they were written), but I appear to have successfully recovered about 4.5MB of data backed up from my CP/M hard drive, to explore later. I have also confirmed the format of the "NigDos 420K Two Sided" floppy disks that I created, which gives me confidence to recover data from the other floppy disk images I made in 2004. And I have several tools to extract files from Amstrad CPC .dsk files produced in various eras with various tools. Overall a very successful quest, but rather time consuming!

ETA, 2018-03-05: History of Amstrad CPC464 creation -- the first model, that led to the Amstrad CPC6128 a year or so later. (Archived Version Page 1 and Page 2)

ETA, 2018-03-11: Added date Amstrad CPC was packed away.

Posted Sat Mar 3 21:25:47 2018


Last year (2017) I bought a Numato Mimas v2 with the intention of running MicroPython on it. After a bit of guess work and some assistance on IRC, I managed to get MicroPython running on Mimas V2 (Spartan-6) FPGA.

A lot has changed in the 9 months since that original blog post, including:

So it seemed worth summarising the current build process for both FPGA MicroPython (FμPy) on the Numato Mimas v2 and also the Digilent Arty A7 in one place.

The process below has been tested on an Ubuntu 16.04 LTS x86_64 system, as with the previous build process last year.

USB Mode Switch udev rules

The Digilent Arty A7 benefits from the timvideos USB Mode Switch udev rules, so it is useful to install these before starting if you will be targeting the Arty A7:

git clone
cd HDMI2USB-mode-switch/udev
make install
sudo udevadm control --reload-rules

And also to ensure that your user has permission to access the relevant device files:

sudo adduser $USER video
sudo adduser $USER dialout
sudo reboot

(it may be sufficient to log out/log in again to pick up the relevant additional group permissions, but rebooting will ensure udev, etc, also have the updated configuration files active).

(The Numato Mimas v2 has a physical switch to move instead of relying on USB mode switching, so the above udev rules should not be needed if you are only working with a Numato Mimas v2.)

Downloading the build environment repository

To get the top level repository used for building, run, eg:

cd /src
git clone

(If you check it out in a different directory use that directory instead of cd /src/litex-buildenv below.)

This repository is common to both the Numato Mimas v2 (Spartan-6) and the Digilent Arty A7 (Artix-7), and contains both several git submodules referencing other git repositories (some slightly different between the two platforms) and a lot of conda install instructions to install relevant tools (again some slightly different between the two platforms).

Numato Mimas v2

The Numato Mimas v2 is based around a Xilinx Spartan-6 FPGA, which uses the Xilinx ISE WebPACK proprietary FPGA synthesis software. The zero-cost "ISE WebPACK" license is sufficient for this process.

These instructions assume you have:


To get started building FPGA MicroPython for the Numato Mimas v2, start in the timvideos/litex-buildenv cloned above:

cd /src/litex-buildenv

and then configure the build targets:



Once this is done, download the other bits of the build environments required for the Numato Mimas v2:


And then when that finishes, enter the build environment ready to build gateware and firmware for the Numato Mimas v2:

source scripts/

All going well you should now have a prompt which begins with "(LX P=mimasv2 F=micropython)" (since the lm32 CPU is the default, as is the base TARGET). If anything is reported missing you may need to run scripts/ again, or ensure that you have the Xilinx ISE WebPACK installed and reachable from /opt/Xilinx.

Once you have completed scripts/ one time, you can then skip that step in later sessions, just doing:

cd /src/litex-buildenv
source scripts/

to get started in a new terminal session.

Building the gateware and micropython for the Numato Mimas v2

To build the required gateware (FPGA configuration, including a lm32 soft CPU) and micropython run:

make gateware

and after several minutes (depending on the speed of your build machine; the ISE WebPACK synthesis is very CPU intensive) you should have both a gateware/top.bin file, and also a software/micropython/firmware.bin file, inside build/mimasv2_base_lm32:

(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$ ls -l build/mimasv2_base_lm32/gateware/top.bin
-rw-rw-r-- 1 ewen ewen 340884 Jan 17 12:32 build/mimasv2_base_lm32/gateware/top.bin
(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$
(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$ ls -l build/mimasv2_base_lm32/software/micropython/firmware.bin
-rwxrwxr-x 1 ewen ewen 167960 Jan 17 12:32 build/mimasv2_base_lm32/software/micropython/firmware.bin
(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$

Loading the gateware and micropython onto the Numato Mimas v2

On the Numato Mimas v2, the gateware (hardware configuration, including the lm32 soft CPU), BIOS (boot loader, etc), and an application ("firmware", eg micropython) all need to be loaded together in one large combined image file.

This combined image file is loaded in the programming mode of the Mimas v2, which is controlled by the slide switch SW7 near the power and USB connectors.

To load the combined gateware/BIOS/firmware (application) image:

  • Slide the Mimas v2 SW7 switch to "programming" mode (switch in position closest to the USB connector), and connect the USB cable between the Mimas v2 and the build machine if it is not already connected

  • Run:

    make image-flash

    to load the combined image file. This loads over a 19200 bps serial link, so it takes many seconds to load.

You should see something like:

(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$ make image-flash
python $(which /dev/ttyACM0 build/mimasv2_base_lm32//image-gateware+bios+micropython.bin
* Numato Lab Mimas V2 Configuration Tool *
Micron M25P16 SPI Flash detected
Loading file build/mimasv2_base_lm32//image-gateware+bios+micropython.bin...
Erasing flash sectors...
Writing to flash 100% complete...
Verifying flash contents...
Flash verification successful...
Booting FPGA...
(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$

Then when it is installed, to reach the micropython REPL:

  • Slide the Mimas v2 SW7 switch to "operation" mode (switch in position furthest from the USB connector)

  • Run:

    make firmware-connect

    and press enter a few times to get the MicroPython REPL prompt (or press the SW6 lm32 CPU reset on the Mimas v2 to see the boot prompts).

This looks something like:

(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$ make firmware-connect
flterm --port=/dev/ttyACM0 --speed=19200
[FLTERM] Starting...

LiteX SoC BIOS (lm32)
(c) Copyright 2012-2018 Enjoy-Digital
(c) Copyright 2007-2018 M-Labs Limited
Built Jan 17 2018 12:29:35

BIOS CRC passed (5338fc86)
Initializing SDRAM...
Memtest OK
Booting from serial...
Press Q or ESC to abort boot completely.
Booting from flash...
Loading 167960 bytes from flash...
Executing booted program at 0x40000000
MicroPython v1.8.7-465-g7a4245d on 2018-01-17; litex with lm32

when the SW6 button was pressed after hitting enter a few times.

The advantage of this approach is that both the lm32 soft CPU gateware and micropython are stored in the SPI flash on the Numato Mimas v2, and thus will start automatically when power is applied, so the only steps needed to start interacting with the micropython REPL are:

cd /src/litex-buildenv
source scripts/
make firmware-connect

and then hitting enter a few times to bring up the micropython REPL prompt.

The disadvantage is that the "make image-flash" step can take a very long time to upload the combined image.

Testing micropython updates with serialboot

Because "make image-flash" takes such a long time, you may prefer to test interactive changes to the micropython application by loading them via serial boot, as then only the micropython application needs to be transferred with each change. To do this:

  • Slide the Mimas v2 SW7 switch to "operation" mode (switch in position furthest from the USB connector)

  • Run:

    make firmware-load

    to prepare to serialboot the lm32 software CPU into micropython.

  • Press the Mimas v2 lm32 "reset" switch, which is SW6 (near the SD card connector) to reset the software CPU and kick off a serialboot.

You should see something like:

(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$ make firmware-load
flterm --port=/dev/ttyACM0 --kernel=build/mimasv2_base_lm32//software/micropython/firmware.bin --speed=19200
[FLTERM] Starting...

LiteX SoC BIOS (lm32)
(c) Copyright 2012-2018 Enjoy-Digital
(c) Copyright 2007-2018 M-Labs Limited
Built Jan 17 2018 12:29:35

BIOS CRC passed (5338fc86)
Initializing SDRAM...
Memtest OK
Booting from serial...
Press Q or ESC to abort boot completely.
[FLTERM] Received firmware download request from the device.
[FLTERM] Uploading kernel (167960 bytes)...
[FLTERM] Upload complete (1.6KB/s).
[FLTERM] Booting the device.
[FLTERM] Done.
Executing booted program at 0x40000000
MicroPython v1.8.7-465-g7a4245d on 2018-01-17; litex with lm32
>>> print("Hello World!")
Hello World!
(LX P=mimasv2 F=micropython) ewen@parthenon:/src/litex-buildenv$

The serialboot of micropython will still take many seconds, as loading about 170KB at 19.2kbps is relatively slow! But at least you do not need to load the gateware as well.
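
The load time is easy to estimate (a rough sketch: async serial moves about one byte per 10 bit times, counting start and stop bits, and ignoring protocol overhead):

```shell
size=167960                     # firmware.bin size from above
echo "~$(( size / (19200 / 10) )) seconds at 19200 bps"
```

ie, roughly a minute and a half just for the micropython application.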

ETA, 2018-02-10: An earlier version of this blog post suggested using "make gateware-flash" and then "make firmware-load" as a quicker method; unfortunately (a) "make gateware-flash" does not include the BIOS or firmware (application), and (b) make firmware-load relies on the BIOS to work, so that combination only works if the Mimas v2 board in question happens to have been previously programmed with "make image-flash", so the BIOS is already on the board. (In which case "make gateware-flash" is redundant unless the hardware definition has changed, and still fits in the gap left for it in the flash.)

Digilent Arty A7

The Digilent Arty A7 is based around a Xilinx Artix-7 FPGA, which uses the Xilinx Vivado HL WebPACK proprietary FPGA synthesis software. The zero-cost "Vivado HL WebPACK" license is sufficient for this process.

These instructions assume you have:

  • a Digilent Arty A7 Artix-7 based FPGA board

  • a USB A to USB Micro cable, to connect the Digilent Arty A7 to the Ubuntu 16.04 LTS build system

  • the Xilinx Vivado HL WebPACK synthesis tool installed, reachable from /opt/Xilinx (eg, via a symlink) with at least:

    • Design Tools / Vivado Design Suite / Vivado

    • Devices / Production Devices / 7 Series / Artix-7

    features installed; the other features are not needed and do not need to be installed.


To get started building FPGA MicroPython for the Digilent Arty A7, start in the timvideos/litex-buildenv cloned above:

cd /src/litex-buildenv

and then configure the build targets:



Once this is done, download the other bits of the build environment required for the Digilent Arty A7:


(If you have already done this for the Numato Mimas v2 above, then most of the extras required will already be installed, but there are a few board-specific items, so run scripts/ again to be sure.)

When that finishes, enter the build environment ready to build gateware and firmware for the Digilent Arty A7:

source scripts/

All going well you should now have a prompt which begins with "(LX P=arty T=base F=micropython)" (since the lm32 CPU is the default CPU). If anything is reported missing you may need to run scripts/ again, or ensure that you have the Xilinx Vivado HL WebPACK installed and reachable from /opt/Xilinx.

Once you have completed scripts/ one time, you can then skip that step in later sessions, just doing:

cd /src/litex-buildenv
source scripts/

to get started in a new terminal session.

Building the gateware and micropython for the Digilent Arty A7

To build the required gateware (FPGA configuration, including a lm32 soft CPU) and micropython run:

make gateware

and after several minutes (depending on the speed of your build machine; the Vivado HL WebPACK synthesis is very CPU intensive) you should have both a gateware/top.bin file, and also a software/micropython/firmware.bin file, inside build/arty_base_lm32:

(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$ ls -l build/arty_base_lm32/gateware/top.bin
-rw-rw-r-- 1 ewen ewen 2192012 Jan 17 16:06 build/arty_base_lm32/gateware/top.bin
(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$ ls -l build/arty_base_lm32/software/micropython/firmware.bin
-rwxrwxr-x 1 ewen ewen 167816 Jan 17 16:07 build/arty_base_lm32/software/micropython/firmware.bin
(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$

Loading the gateware and micropython onto the Digilent Arty A7

There are two stages to getting micropython running on the Digilent Arty A7: loading the gateware (hardware configuration, including the lm32 soft CPU), and loading the micropython firmware.

On the Digilent Arty A7 these two steps can be done one after another, thanks to the USB mode switching:

To load the gateware:

  • Run:

    make gateware-load

    to load the hardware configuration. This loads over the JTAG path to the Digilent Arty A7, so loads in a few seconds (much faster than the equivalent step with the Numato Mimas v2).

You should see something like:

(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$ make gateware-load
openocd -f board/digilent_arty.cfg -c "init; pld load 0 build/arty_base_lm32//gateware/top.bit; exit"
Open On-Chip Debugger 0.10.0+dev-00267-gf7836bbc7-dirty (2018-01-15-07:40)
Licensed under GNU GPL v2
For bug reports, read
none separate
adapter speed: 10000 kHz
Info : auto-selecting first available session transport "jtag". To override use 'transport select <transport>'.
Info : ftdi: if you experience problems at higher adapter clocks, try the command "ftdi_tdo_sample_edge falling"
Info : clock speed 10000 kHz
Info : JTAG tap: xc7.tap tap/device found: 0x0362d093 (mfg: 0x049 (Xilinx), part: 0x362d, ver: 0x0)
Info : Listening on port 3333 for gdb connections
loaded file build/arty_base_lm32//gateware/top.bit to pld device 0 in 1s 833353us
(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$

To load micropython via serialboot:

  • Run:

    make firmware-load

    to prepare to serialboot the lm32 software CPU into micropython.

  • You may (or may not) then need to press the red "reset" button on the Digilent Arty A7 board (in the corner diagonally opposite the power connector) to get it to start serial booting (or if you have just powered the board on and loaded the gateware, it might be sitting waiting to serial boot).

You should see something like:

(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$ make firmware-load
flterm --port=/dev/ttyUSB1 --kernel=build/arty_base_lm32//software/micropython/firmware.bin --speed=115200
[FLTERM] Starting...

LiteX SoC BIOS (lm32)
(c) Copyright 2012-2018 Enjoy-Digital
(c) Copyright 2007-2018 M-Labs Limited
Built Jan 17 2018 16:03:05

BIOS CRC passed (df2e52a7)
Initializing SDRAM...
Memtest OK
Booting from serial...
Press Q or ESC to abort boot completely.
[FLTERM] Received firmware download request from the device.
[FLTERM] Uploading kernel (167816 bytes)...
[FLTERM] Upload complete (7.7KB/s).
[FLTERM] Booting the device.
[FLTERM] Done.
Executing booted program at 0x40000000
MicroPython v1.8.7-465-g7a4245d on 2018-01-17; litex with lm32
>>> print("Hello World!")
Hello World!
(LX P=arty T=base F=micropython) ewen@parthenon:/src/litex-buildenv$

The serialboot of micropython will take several seconds, as loading about 170KB even at 115.2kbps is slow (but a lot faster than loading at 19.2kbps on the Mimas v2!).
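For a sense of where that time goes, the theoretical best case can be sketched with a little arithmetic (assuming standard 8N1 serial framing, ie 10 wire bits per byte; the observed 7.7KB/s is below the theoretical ~11.25KB/s due to protocol overhead):

```python
# Theoretical serial transfer time, assuming 8N1 framing:
# 8 data bits + 1 start bit + 1 stop bit = 10 wire bits per byte.
FIRMWARE_BYTES = 167816  # size of firmware.bin above

def transfer_seconds(baud_rate, payload_bytes=FIRMWARE_BYTES):
    bytes_per_second = baud_rate / 10.0
    return payload_bytes / bytes_per_second

print("115200 bps: %.1f s best case" % transfer_seconds(115200))  # ~14.6 s
print(" 19200 bps: %.1f s best case" % transfer_seconds(19200))   # ~87.4 s
```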

Avoiding serialbooting

Unlike the Numato Mimas v2, there does not appear to be an easy way to avoid serialbooting micropython on the Arty A7; but fortunately the Arty A7 serialboot (at 115.2kbps) is much faster than the Mimas v2 serialboot (at 19.2kbps).

However if the Arty A7 is already powered on with the gateware and micropython loaded, you can reconnect to the micropython REPL by running:

cd /src/litex-buildenv
source scripts/
make firmware-connect

and then hitting enter a few times to bring up the micropython REPL prompt.

Note that the Digilent Arty A7 does not store the gateware configuration in a SPI flash (unlike the Numato Mimas v2), so if the Arty A7 is power cycled you will need to go through the:

cd /src/litex-buildenv
source scripts/
make gateware-load
make firmware-load

cycle from the beginning, most likely including hitting the Arty A7 reset button to initiate serial booting. Fortunately this only takes a few tens of seconds to do.

ETA, 2018-01-19: Swapped instructions to using make firmware-load, now that it supports alternate firmware on Mimas v2 and Arty A7, as it is much easier to type :-)

Posted Wed Jan 17 18:29:01 2018 Tags:


On modern storage media, and storage subsystems, file system alignment to "larger than 512 byte" boundaries is increasingly important for achieving good write performance, and on some media for avoiding excessive wear on the underlying media (due to additional write amplification). For about the last 8 years, the Linux kernel has supported a "/sys/block/*/alignment_offset" metric, which indicates the number of bytes needed to get a particular layer back into alignment with the underlying storage media, and for the last few years Linux distributions have included tools that attempt to do automatic alignment where possible. That helps newer systems, particularly new installations, but cannot automatically fix older systems.

I had one older system (originally installed at least 15 years ago, and "grandfather's axe" upgraded through various versions of hardware), that ended up having both:

  • 4KB (4096 byte) physical sectors (on a pair of WDC WD20EFRX-68A 2TB drives)

    ewen@linux:~$ cat /sys/block/sda/device/model 
    WDC WD20EFRX-68A
    ewen@linux:~$ cat /sys/block/sda/queue/physical_block_size 
    4096
    ewen@linux:~$ cat /sys/block/sda/queue/logical_block_size 
    512

    although curiously the second apparently identical drive detects with 512-byte physical sectors, at least at present:

    ewen@tv:~$ cat /sys/block/sdb/device/model 
    WDC WD20EFRX-68A
    ewen@tv:~$ cat /sys/block/sdb/queue/physical_block_size 
    512
    ewen@tv:~$ cat /sys/block/sdb/queue/logical_block_size 
    512

    for reasons I do not understand (both drives were purchased approximately the same time, as far as I can recall)

  • 1990s-style partition layout, with partitions starting on a "cylinder" boundary:

    (parted) unit s
    (parted) print
    Model: ATA WDC WD20EFRX-68A (scsi)
    Disk /dev/sda: 3907029168s
    Sector size (logical/physical): 512B/4096B
    Partition Table: msdos
    Disk Flags:
    Number  Start        End          Size         Type      File system  Flags
     1      63s          498014s      497952s      primary   ext4         raid
     2      498015s      4498199s     4000185s     primary                raid
     3      4498200s     8498384s     4000185s     primary                raid
     4      8498385s     3907024064s  3898525680s  extended
     5      8498448s     72501344s    64002897s    logical                raid
     6      72501408s    584508959s   512007552s   logical                raid
     7      584509023s   1096516574s  512007552s   logical                raid
     8      1096516638s  1608524189s  512007552s   logical                raid
     9      1608524253s  2120531804s  512007552s   logical                raid
    10      2120531868s  2632539419s  512007552s   logical                raid
    11      2632539483s  3144547034s  512007552s   logical                raid
    12      3144547098s  3656554649s  512007552s   logical                raid
    13      3656554713s  3907024064s  250469352s   logical                raid
  • Linux MD RAID-1 and LVM with no adjustments for the partition offsets to the physical block boundaries (due to being created with old tools), and

  • Linux file systems (ext4, xfs) created with no adjustments to the physical block boundaries (due to being created with old tools)

I knew that this misalignment was happening at the time I swapped in the newer (2TB) disks a few years ago, but did not have time to try to manually figure out the correct method to align all the layers, so I decided to just accept the lower performance at the time. (Fortunately being magnetic storage rather than SSDs, there was not an additional risk of excessive drive wear caused by the misalignment.)
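The size of that misalignment is easy to compute from a partition's start sector and the physical sector size; a minimal sketch (sector 63 is the start of the old first partition above):

```python
# Distance (in bytes) from a partition's start to the previous
# physical-sector boundary, for 512-byte logical sectors on a drive
# with 4096-byte physical sectors.  0 means naturally aligned.
LOGICAL_BYTES = 512
PHYSICAL_BYTES = 4096

def misalignment_bytes(start_sector):
    return (start_sector * LOGICAL_BYTES) % PHYSICAL_BYTES

print(misalignment_bytes(63))    # old cylinder-aligned first partition: 3584
print(misalignment_bytes(2048))  # 1 MiB aligned start: 0
```

Every write to a partition misaligned like this risks straddling two physical sectors, hence the read-modify-write penalty.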

After upgrading to a modern Debian Linux version, including a new kernel, this misalignment was made more visible again, including in kernel messages on every boot:

device-mapper: table: 253:2: adding target device (start sect 511967232 len 24903680) caused an alignment inconsistency
device-mapper: table: 253:4: adding target device (start sect 511967232 len 511967232) caused an alignment inconsistency
device-mapper: table: 253:4: adding target device (start sect 1023934464 len 511967232) caused an alignment inconsistency
device-mapper: table: 253:4: adding target device (start sect 1535901696 len 36962304) caused an alignment inconsistency
device-mapper: table: 253:4: adding target device (start sect 1572864000 len 209715200) caused an alignment inconsistency
device-mapper: table: 253:5: adding target device (start sect 34078720 len 39321600) caused an alignment inconsistency

so I planned to eventually re-align the partitions on the underlying drives to match the modern "optimal" conventions (ie, start partitions at 1 MiB boundaries). I finally got time to do that realignment over the Christmas/New Year period, during a "staycation" that let me be around periodically for all the steps required.

Overall process

The system in question had two drives (both 2TB), in RAID-1 for redundancy. Since I did not want to lose the RAID redundancy during the process my general approach was:

  • Obtain a 2TB external drive

  • Partition the 2TB external drive with 1 MiB aligned partitions matching the desired partition layout, marked as "raid" partitions

  • Extend the Linux MD RAID-1 to cover three drives, including the 2TB external drive, and wait for the RAID arrays to resync.

  • Then for each drive to be repartitioned, remove the drive from the RAID-1 sets (leaving the other original drive and the external drive), repartition it optimally, and re-add the drive back into the RAID-1 sets and wait for the RAID arrays to resync (then repeat for the other drive).

  • Remove the external 2TB drive from all the RAID sets

  • Reboot the system to ensure it detected the original two drives as now aligned.

This process took about 2 days to complete, most of which was waiting for the 2TB of RAID arrays to sync onto the external drive, as the system in question had only USB-2 (not USB-3), and thus copies onto the external drive went at about 30MB/s and took 18-20 hours. The last stage of repartitioning and resyncing the original drives went much faster, as copies onto those drives went at over 120MB/s (reasonable for SATA-1.5Gbps connected drives: the drives are SATA-6Gbps capable, but the host controller is only SATA 1.5Gbps).
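Those elapsed times line up with simple throughput arithmetic; a back-of-envelope sketch using the observed transfer rates:

```python
# Rough full-resync time for a nominal 2TB drive at the observed rates.
DRIVE_BYTES = 2 * 10**12  # nominal 2TB

def sync_hours(rate_mb_per_second):
    seconds = DRIVE_BYTES / (rate_mb_per_second * 10**6)
    return seconds / 3600.0

print("USB-2 @  30MB/s: %.1f hours" % sync_hours(30))   # ~18.5 hours
print("SATA  @ 120MB/s: %.1f hours" % sync_hours(120))  # ~4.6 hours
```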

NOTE: If you are going to attempt to follow this process yourself, I strongly recommend that you have a separate backup of the system -- other than on the RAID disks you are modifying -- as accidentally removing or overwriting the wrong thing at the wrong time during this process could easily lead to a difficult to recover system or permanent data loss.

Partition alignment

Modern parted is capable of creating "optimal" aligned partitions if you give it the "-a optimal" flag when you start it up. However, due to the age of this system I needed to recreate a MBR partition table with 13 partitions on it -- necessitating several logical partitions -- which come with their own alignment challenges (it turns out that Extended Boot Record logical partitions require a linked list of partition records between the logical partitions, thus requiring some space between each partition); on a modern system using a GUID Partition Table avoids most of these challenges. (Some choose not to align the extended partition, as there is no user data in the extended partition itself so only the logical partition alignment matters; but this is only a minor help if you have multiple logical partitions.)

After some thought I concluded that my desired partitioning had:

  • The first partition starting at 1MiB (2048s)

  • Every other partition starting on a MiB boundary

  • Every subsequent partition as close as possible to the previous one

  • All partitions at least a multiple of the physical sector size (4KiB)

  • All partitions larger than the original partitions on the disk, so that the RAID-1 resync would trivially work

  • Minimise wasted "rounding up" space

  • The last partition on the disks absorbing all of the reductions in disk space needed to meet the other considerations

Unfortunately the need for multiple "logical" partitions, the desire to start every partition on a MiB boundary, parted most easily supporting partitions specified in whole MiB, and the desire not to waste space between partitions, ended up conflicting with the need to have an Extended Boot Record between every pair of logical partitions. I could either accept losing 1 MiB between each logical partition -- to hold the 512 byte EBR and have everything start/end on a MiB boundary -- or use more advanced methods to describe to parted what I needed. I chose the more advanced methods.
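The arithmetic behind the "more advanced methods" is simply to end each logical partition 4 logical sectors (one 4KiB physical sector) before the MiB boundary where the next logical partition starts, leaving that physical sector free for the EBR; a sketch:

```python
SECTORS_PER_MIB = 2048  # 1 MiB / 512-byte logical sectors

def logical_partition_end(next_start_mib):
    """End sector for a logical partition: 4 logical sectors (one 4KiB
    physical sector) before the next logical partition's MiB boundary,
    leaving room for the EBR in between."""
    return next_start_mib * SECTORS_PER_MIB - 4

print(logical_partition_end(35406))   # 72511484s, the first hand-calculated end
print(logical_partition_end(285410))  # 584519676s, the next partition's end
```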

Creating the first part of the partition table was pretty easy (here /dev/sdc was the 2TB external drive; but I followed the same partitioning on the original drives when I got to rebuilding them):

ewen@linux:~$ sudo parted -a optimal /dev/sdc
(parted) mklabel msdos
Warning: The existing disk label on /dev/sdc will be destroyed and all data on
this disk will be lost. Do you want to continue?
Yes/No? yes
(parted) quit

ewen@linux:~$ sudo parted -a optimal /dev/sdc
(parted) unit s
(parted) print
Model: WD Elements 25A2 (scsi)
Disk /dev/sdc: 3906963456s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start  End  Size  Type  File system  Flags

(parted) mkpart primary 1MiB 245MiB
(parted) set 1 raid on
(parted) mkpart primary 245MiB 2199MiB
(parted) set 2 raid on
(parted) mkpart primary 2199MiB 4153MiB
(parted) set 3 raid on
(parted) mkpart extended 4153MiB 100%

This gave a drive with three primary partitions starting on MiB boundaries, and an extended partition which covered the remainder (majority) of the drive. Each primary partition was marked as a RAID partition.

After that it got more complicated. I needed to start each logical partition on a MiB boundary, and then finish each one just before the next MiB boundary, to allow room for the Extended Boot Record to sit in between. For the first logical partition I chose to sacrifice 1 MiB, and simply start it on the next MiB boundary; but for the end position I needed to figure out "4KiB less than the next MiB" (ie, one physical sector) so as to leave room for the EBR while still starting the next logical partition on a MiB boundary -- with minimal wasted space.

I calculated the first one by hand, as it needed a unique size, and specified it in sectors (ie, 512-byte units -- logical sectors):

(parted) mkpart logical 4154MiB 72511484s      # 35406MiB - 4 sectors
(parted) set 5 raid on

then for most of the rest, they were all the same size, and so the pattern was quite repetitive. To solve this I wrote a trivial, hacky Python script to generate the right parted commands:


base = 35406    # MiB: start of the first repeated logical partition
inc  = 250004   # MiB: size of each repeated logical partition

for i in range(7):
  start = base + (i * inc)
  end   = base + ((i+1) * inc)
  last  = (end * 2 * 1024) - 4   # end sector: one 4KiB physical sector short of "end" MiB
  print("mkpart logical {0:7d}MiB {1:10d}s    # {2:7d}MiB - 4 sectors".format(start, last, end))

and then fed that output to parted:

(parted) mkpart logical   35406MiB  584519676s    #  285410MiB - 4 sectors
(parted) mkpart logical  285410MiB 1096527868s    #  535414MiB - 4 sectors
(parted) mkpart logical  535414MiB 1608536060s    #  785418MiB - 4 sectors
(parted) mkpart logical  785418MiB 2120544252s    # 1035422MiB - 4 sectors
(parted) mkpart logical 1035422MiB 2632552444s    # 1285426MiB - 4 sectors
(parted) mkpart logical 1285426MiB 3144560636s    # 1535430MiB - 4 sectors
(parted) mkpart logical 1535430MiB 3656568828s    # 1785434MiB - 4 sectors

to create all the consistently sized partitions. (This would have been easier if parted supported start/size values, which is what is actually stored in the MBR/EBR -- but it requires start/end values, which need more manual calculation; that seems like a poor UI to me.)

After that I could create the final partition to use the remainder of the disk, which is trivial to specify:

(parted) mkpart logical 1785434MiB 100%

and then mark all the other partitions as "raid" partitions:

(parted) set 6 raid on
(parted) set 7 raid on
(parted) set 8 raid on
(parted) set 9 raid on
(parted) set 10 raid on
(parted) set 11 raid on
(parted) set 12 raid on
(parted) set 13 raid on

which gave me a final partition table of:

(parted) unit s
(parted) print
Model: WD Elements 25A2 (scsi)
Disk /dev/sdc: 3906963456s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags:

Number  Start        End          Size         Type      File system  Flags
 1      2048s        501759s      499712s      primary                raid, lba
 2      501760s      4503551s     4001792s     primary                raid, lba
 3      4503552s     8505343s     4001792s     primary                raid, lba
 4      8505344s     3906963455s  3898458112s  extended               lba
 5      8507392s     72511484s    64004093s    logical                raid, lba
 6      72511488s    584519676s   512008189s   logical                raid, lba
 7      584519680s   1096527868s  512008189s   logical                raid, lba
 8      1096527872s  1608536060s  512008189s   logical                raid, lba
 9      1608536064s  2120544252s  512008189s   logical                raid, lba
10      2120544256s  2632552444s  512008189s   logical                raid, lba
11      2632552448s  3144560636s  512008189s   logical                raid, lba
12      3144560640s  3656568828s  512008189s   logical                raid, lba
13      3656568832s  3906963455s  250394624s   logical                raid, lba


As a final double check I also used the parted "align-check" to check the alignment (and manually divided each starting sector by 2048 to ensure it was on a MiB boundary -- 2048 = 2 * 1024, as the values are in 512-byte sectors):

(parted) align-check optimal 1
1 aligned
(parted) align-check optimal 2
2 aligned
(parted) align-check optimal 3
3 aligned
(parted) align-check optimal 4
4 aligned
(parted) align-check optimal 5
5 aligned
(parted) align-check optimal 6
6 aligned
(parted) align-check optimal 7
7 aligned
(parted) align-check optimal 8
8 aligned
(parted) align-check optimal 9
9 aligned
(parted) align-check optimal 10
10 aligned
(parted) align-check optimal 11
11 aligned
(parted) align-check optimal 12
12 aligned
(parted) align-check optimal 13
13 aligned

And then exited to work with this final partition table:

(parted) quit
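
The manual "divide each start sector by 2048" check can also be scripted; a sketch using the start sectors from the final partition table above:

```python
# Start sectors from the final partition table above; a partition start
# is MiB-aligned iff it is divisible by 2048 (2048 * 512 bytes == 1 MiB).
starts = [2048, 501760, 4503552, 8505344, 8507392, 72511488,
          584519680, 1096527872, 1608536064, 2120544256,
          2632552448, 3144560640, 3656568832]

for number, start in enumerate(starts, 1):
    assert start % 2048 == 0, "partition %d is misaligned" % number
print("all %d partitions start on a MiB boundary" % len(starts))
```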

For more on partition alignment, particularly with the MD / LVM layers as well, see Thomas Krenn's post on Partition Alignment, and a great set of slides on partition / MD / LVM alignment. Of note, both Linux MD RAID-1 metadata (1.2) and LVM Physical Volume metadata will take up some space at the start of the partition if you accept the modern defaults.

For Linux MD RAID-1, metadata 1.2 is at the start of the partition, and the data begins at the "Data Offset" within the partition. (Linux MD RAID metadata 0.9 is at the end of the disk, so there is no offset, which is sometimes useful, including for /boot partitions.) You can see the offset in use by examining the individual RAID-1 elements: on metadata 1.2 RAID sets there is a "Data Offset" value reported, which is typically 1 MiB (2048 * 512-byte sectors):

ewen@linux:~$ sudo mdadm -E /dev/sda2 | egrep "Version|Offset"
        Version : 1.2
    Data Offset : 2048 sectors
   Super Offset : 8 sectors

although RAID sets created with more modern mdadm tools might have larger offsets (possibly for bitmaps to speed up resync?):

ewen@linux:~$ sudo mdadm -E /dev/sda13 | egrep "Version|Offset"
        Version : 1.2
    Data Offset : 131072 sectors
   Super Offset : 8 sectors

These result in unused space in the MD RAID-1 elements which can be seen:

ewen@linux:~$ sudo mdadm -E /dev/sda2 | egrep "Unused"
   Unused Space : before=1960 sectors, after=1632 sectors
ewen@linux:~$ sudo mdadm -E /dev/sda13 | egrep "Unused"
   Unused Space : before=130984 sectors, after=65536 sectors

although in this case the unused space at the end is most likely due to rounding up the partition sizes from those in the originally created RAID array. (The --data-offset is computed automatically, but may be overridden from the command line when the array is created -- on a per-member-device basis. But presumably if the data offset is too small, various metadata -- such as bitmaps -- cannot be stored.)

By default, it appears that modern LVM will reserve 192 KiB at the start of its physical volumes (PV) for metadata, which can be seen by checking the "pe_start" value:

ewen@linux:~$ sudo pvs -o +pe_start /dev/md26
  PV         VG Fmt  Attr PSize   PFree 1st PE 
  /dev/md26  r1 lvm2 a--  244.12g    0  192.00k
ewen@linux:~$ sudo pvs -o +pe_start /dev/md32
  PV         VG Fmt  Attr PSize   PFree   1st PE 
  /dev/md32     lvm2 ---  244.14g 244.14g 192.00k

and controlled at pvcreate time with the --metadatasize and --dataalignment values (as well as an optional manual override of the offset).

Fortunately these values (1MiB == 2048s, 64MiB == 131072s, 192 KiB) are all multiples of 4 KiB, so providing you are only aligning to 4 KiB boundaries you do not need to worry about additional alignment options if the underlying partitions are aligned. But if you need to align to, eg, larger SSD erase blocks or larger hardware RAID stripes, you may need to adjust the MD and LVM alignment options as well to avoid leaving the underlying file system misaligned. (If you are using modern Linux tools with all software layers -- RAID, LVM, etc -- then the alignment_offset values will probably help ensure the defaults produce good alignment; if there are any hardware layers you will need to provide additional information to ensure the best alignment.)
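That "everything is a multiple of 4 KiB" observation is easy to verify; a sketch using the MD data offsets and LVM pe_start values seen above:

```python
# The metadata offsets seen above, converted to bytes; each should be a
# whole multiple of the 4KiB physical sector size.
FOUR_KIB = 4096

offsets_bytes = {
    "MD data offset, 2048s":   2048 * 512,    # 1 MiB
    "MD data offset, 131072s": 131072 * 512,  # 64 MiB
    "LVM pe_start, 192KiB":    192 * 1024,
}

for name, offset in sorted(offsets_bytes.items()):
    assert offset % FOUR_KIB == 0
    print("%s = %d bytes (4KiB multiple)" % (name, offset))
```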

RAID rebuild

Having created a third (external 2TB) disk with suitably aligned partitions I could then move on to resyncing all the RAID arrays 3 times (once onto the external drive, and then once onto each of the internal drives). A useful Debian User post outlined the process for extending the RAID-1 array onto a third disk, and then removing the third disk again, which provided the basis of my approach.

The first step was to extend all but the last RAID array onto the new disk (the last one needed special treatment as it was getting smaller, but fortunately it did not have any data on it yet). Growing onto the third disk is fairly simple:

sudo mdadm --grow /dev/md21 --level=1 --raid-devices=3 --add /dev/sdc1
sudo mdadm --grow /dev/md22 --level=1 --raid-devices=3 --add /dev/sdc2
sudo mdadm --grow /dev/md23 --level=1 --raid-devices=3 --add /dev/sdc3
sudo mdadm --grow /dev/md25 --level=1 --raid-devices=3 --add /dev/sdc5
sudo mdadm --grow /dev/md26 --level=1 --raid-devices=3 --add /dev/sdc6
sudo mdadm --grow /dev/md27 --level=1 --raid-devices=3 --add /dev/sdc7
sudo mdadm --grow /dev/md28 --level=1 --raid-devices=3 --add /dev/sdc8
sudo mdadm --grow /dev/md29 --level=1 --raid-devices=3 --add /dev/sdc9
sudo mdadm --grow /dev/md30 --level=1 --raid-devices=3 --add /dev/sdc10
sudo mdadm --grow /dev/md31 --level=1 --raid-devices=3 --add /dev/sdc11
sudo mdadm --grow /dev/md32 --level=1 --raid-devices=3 --add /dev/sdc12

although as mentioned above it did take a long time (most of a calendar day) due to the external drive being connected via USB-2, and thus limited to about 30MB/s.

The second step was to destroy the RAID array for the last partition on the disks, discarding all data on it (fortunately none in my case), as that partition had to get smaller as described above. If you have important data on that last RAID array you will need to copy it somewhere else before proceeding.

ewen@linux:~$ head -4 /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md33 : active (auto-read-only) raid1 sdb13[0]
      125233580 blocks super 1.2 [2/1] [U_]

ewen@linux:~$ sudo mdadm --stop /dev/md33
[sudo] password for ewen:
mdadm: stopped /dev/md33
ewen@linux:~$ sudo mdadm --remove /dev/md33
ewen@linux:~$ grep md33 /proc/mdstat
ewen@linux:~$ sudo mdadm --zero-superblock /dev/sda13
ewen@linux:~$ sudo mdadm --zero-superblock /dev/sdb13

After this, check the output of /proc/mdstat to ensure that all the remaining RAID sets are happy, and show three active disks ("UUU") -- the two original disks, and the temporary external disk. If you are not sure everything is perfectly prepared, sort out the remaining issues before proceeding, as the next step will break the first original drive out of the RAID-1 arrays.

The third step, when everything is ready, is to remove the first original drive from the RAID-1 arrays:

sudo mdadm /dev/md21 --fail /dev/sda1  --remove /dev/sda1
sudo mdadm /dev/md22 --fail /dev/sda2  --remove /dev/sda2
sudo mdadm /dev/md23 --fail /dev/sda3  --remove /dev/sda3
sudo mdadm /dev/md25 --fail /dev/sda5  --remove /dev/sda5
sudo mdadm /dev/md26 --fail /dev/sda6  --remove /dev/sda6
sudo mdadm /dev/md27 --fail /dev/sda7  --remove /dev/sda7
sudo mdadm /dev/md28 --fail /dev/sda8  --remove /dev/sda8
sudo mdadm /dev/md29 --fail /dev/sda9  --remove /dev/sda9
sudo mdadm /dev/md30 --fail /dev/sda10 --remove /dev/sda10
sudo mdadm /dev/md31 --fail /dev/sda11 --remove /dev/sda11
sudo mdadm /dev/md32 --fail /dev/sda12 --remove /dev/sda12

and then repartition the drive following the instructions above (ie, to be identical to the 2TB external drive, other than the size of the final partition).

When the partitioning is complete, run:

ewen@linux:~$ sudo partprobe -d -s /dev/sda
/dev/sda: msdos partitions 1 2 3 4 <5 6 7 8 9 10 11 12 13>
ewen@linux:~$ sudo partprobe  -s /dev/sda
/dev/sda: msdos partitions 1 2 3 4 <5 6 7 8 9 10 11 12 13>

to ensure the new partitions are recognised, and then also compare the output of:

ewen@linux:~$ sudo parted -a optimal /dev/sda unit s print
Model: ATA WDC WD20EFRX-68A (scsi)
Disk /dev/sda: 3907029168s
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:

Number  Start        End          Size         Type      File system  Flags
 1      2048s        501759s      499712s      primary                raid
 2      501760s      4503551s     4001792s     primary                raid
 3      4503552s     8505343s     4001792s     primary                raid
 4      8505344s     3907028991s  3898523648s  extended               lba
 5      8507392s     72511484s    64004093s    logical                raid
 6      72511488s    584519676s   512008189s   logical                raid
 7      584519680s   1096527868s  512008189s   logical                raid
 8      1096527872s  1608536060s  512008189s   logical                raid
 9      1608536064s  2120544252s  512008189s   logical                raid
10      2120544256s  2632552444s  512008189s   logical                raid
11      2632552448s  3144560636s  512008189s   logical                raid
12      3144560640s  3656568828s  512008189s   logical                raid
13      3656568832s  3907028991s  250460160s   logical                raid


with the start/size sectors recognised by Linux as active:

ewen@linux:/sys/block/sda$ for PART in 1 2 3 5 6 7 8 9 10 11 12 13; do echo "sda${PART}: " $(cat "sda${PART}/start") $(cat "sda${PART}/size"); done
sda1:  2048 499712
sda2:  501760 4001792
sda3:  4503552 4001792
sda5:  8507392 64004093
sda6:  72511488 512008189
sda7:  584519680 512008189
sda8:  1096527872 512008189
sda9:  1608536064 512008189
sda10:  2120544256 512008189
sda11:  2632552448 512008189
sda12:  3144560640 512008189
sda13:  3656568832 250460160

to ensure that Linux will copy onto the new partitions, not old locations on the disk.

The fourth step is to add the original first drive back into the RAID-1 sets, and wait for them to all resync:

sudo mdadm --manage /dev/md21 --add /dev/sda1
sudo mdadm --manage /dev/md22 --add /dev/sda2
sudo mdadm --manage /dev/md23 --add /dev/sda3
sudo mdadm --manage /dev/md25 --add /dev/sda5
sudo mdadm --manage /dev/md26 --add /dev/sda6
sudo mdadm --manage /dev/md27 --add /dev/sda7
sudo mdadm --manage /dev/md28 --add /dev/sda8
sudo mdadm --manage /dev/md29 --add /dev/sda9
sudo mdadm --manage /dev/md30 --add /dev/sda10
sudo mdadm --manage /dev/md31 --add /dev/sda11
sudo mdadm --manage /dev/md32 --add /dev/sda12

which in my case took about 6 hours.

Once this is done, the same steps can be repeated to remove the /dev/sdb* partitions, repartition the /dev/sdb drive, re-check the partitions are correctly recognised, and then re-add the /dev/sdb* partitions into the RAID sets.

Of note, when I started adding the /dev/sda* partitions back in after repartitioning, I got warnings saying:

ewen@linux:/sys/block$ sudo dmesg -T | grep misaligned
[Mon Jan  1 09:22:20 2018] md21: Warning: Device sda1 is misaligned
[Mon Jan  1 09:22:35 2018] md22: Warning: Device sda2 is misaligned
[Mon Jan  1 10:07:26 2018] md27: Warning: Device sda7 is misaligned
[Mon Jan  1 10:49:12 2018] md28: Warning: Device sda8 is misaligned
[Mon Jan  1 10:49:21 2018] md29: Warning: Device sda9 is misaligned
[Mon Jan  1 11:30:04 2018] md30: Warning: Device sda10 is misaligned
[Mon Jan  1 12:26:56 2018] md31: Warning: Device sda11 is misaligned
[Mon Jan  1 12:45:08 2018] md32: Warning: Device sda12 is misaligned

and when I went checking I found that the "alignment_offset" values had been set to "-1" in the affected cases:

ewen@tv:/sys/block$ grep . md*/alignment_offset

Those alignment_offset values should normally be the number of bytes to adjust by to achieve alignment again -- I saw values like 3072, 3584, etc, in them prior to aligning the underlying physical partitions properly -- and "0" indicates that the layer is naturally aligned already.

After some hunting it turned out that "-1" was a special magic value meaning basically "alignment is impossible":

 *    Returns 0 if the top and bottom queue_limits are compatible.  The
 *    top device's block sizes and alignment offsets may be adjusted to
 *    ensure alignment with the bottom device. If no compatible sizes
 *    and alignments exist, -1 is returned and the resulting top
 *    queue_limits will have the misaligned flag set to indicate that
 *    the alignment_offset is undefined.

My conclusion was that, because the RAID-1 sets had stayed active throughout, the previous alignment offset of the /dev/sda* partitions was non-zero, and the RAID-1 sets were attempting to find an alignment offset for the new /dev/sda* partitions that would both match the physical sectors and use those same offsets -- but no valid offset matched both the old and new /dev/sda* partition offsets, hence it failed. So I chose to ignore those warnings and carry on.

Once both original drives had been repartitioned and resync'd, the next step was to recreate the /dev/md33 RAID partition again, on the smaller partitions:

ewen@tv:~$ sudo mdadm --zero-superblock /dev/sda13
ewen@tv:~$ sudo mdadm --zero-superblock /dev/sdb13
ewen@tv:~$ sudo mdadm --zero-superblock /dev/sdc13
ewen@tv:~$ sudo mdadm --create /dev/md33 --level=1 --raid-devices=3 --chunk=4M /dev/sda13 /dev/sdb13 /dev/sdc13
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
Continue creating array? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md33 started.

(because I was not booting from that partition, metadata 1.2 was fine, and gave more options -- this one was created with recovery bitmaps).

Note that in this case I chose to create the RAID-1 set including three drives, because the external 2TB drive was slightly smaller, and I wanted the option of later resync'ing it onto that drive as an offsite backup.

At this point it is useful to update /etc/mdadm/mdadm.conf with the new UUID of the new RAID set, to ensure that it stays in sync and RAID arrays can be auto-started.

When that new RAID set completed resync'ing, I then removed the 2TB external drive from all the RAID sets, and set them back to "2-way" RAID to avoid the RAID sets sitting there partly failed:

sudo mdadm /dev/md21 --fail /dev/sdc1  --remove /dev/sdc1
sudo mdadm /dev/md22 --fail /dev/sdc2  --remove /dev/sdc2
sudo mdadm /dev/md23 --fail /dev/sdc3  --remove /dev/sdc3
sudo mdadm /dev/md25 --fail /dev/sdc5  --remove /dev/sdc5
sudo mdadm /dev/md26 --fail /dev/sdc6  --remove /dev/sdc6
sudo mdadm /dev/md27 --fail /dev/sdc7  --remove /dev/sdc7
sudo mdadm /dev/md28 --fail /dev/sdc8  --remove /dev/sdc8
sudo mdadm /dev/md29 --fail /dev/sdc9  --remove /dev/sdc9
sudo mdadm /dev/md30 --fail /dev/sdc10 --remove /dev/sdc10
sudo mdadm /dev/md31 --fail /dev/sdc11 --remove /dev/sdc11
sudo mdadm /dev/md32 --fail /dev/sdc12 --remove /dev/sdc12
sudo mdadm /dev/md33 --fail /dev/sdc13 --remove /dev/sdc13

sudo mdadm --grow /dev/md21 --raid-devices=2
sudo mdadm --grow /dev/md22 --raid-devices=2
sudo mdadm --grow /dev/md23 --raid-devices=2
sudo mdadm --grow /dev/md25 --raid-devices=2
sudo mdadm --grow /dev/md26 --raid-devices=2
sudo mdadm --grow /dev/md27 --raid-devices=2
sudo mdadm --grow /dev/md28 --raid-devices=2
sudo mdadm --grow /dev/md29 --raid-devices=2
sudo mdadm --grow /dev/md30 --raid-devices=2
sudo mdadm --grow /dev/md31 --raid-devices=2
sudo mdadm --grow /dev/md32 --raid-devices=2
sudo mdadm --grow /dev/md33 --raid-devices=2

and then checked for any remaining missing drive references or "sdc" references:

ewen@linux:~$ cat /proc/mdstat | grep "_"
ewen@linux:~$ cat /proc/mdstat | grep "sdc"

Then I unplugged the 2TB external drive, to keep for now as an offline backup.

To make sure that the system still booted, I reinstalled grub and updated the initramfs to pick up the new RAID UUIDs:

ewen@linux:~$ sudo grub-install /dev/sda
Installing for i386-pc platform.
Installation finished. No error reported.
ewen@linux:~$ sudo grub-install /dev/sdb
Installing for i386-pc platform.
Installation finished. No error reported.
ewen@linux:~$ sudo update-initramfs -u
update-initramfs: Generating /boot/initrd.img-4.9.0-4-686-pae
ewen@linux:~$ sudo update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.9.0-4-686-pae
Found initrd image: /boot/initrd.img-4.9.0-4-686-pae
Found linux image: /boot/vmlinuz-4.9.0-3-686-pae
Found initrd image: /boot/initrd.img-4.9.0-3-686-pae
Found linux image: /boot/vmlinuz-3.16.0-0.bpo.4-686-pae
Found initrd image: /boot/initrd.img-3.16.0-0.bpo.4-686-pae
Found memtest86 image: /memtest86.bin
Found memtest86+ image: /memtest86+.bin
Found memtest86+ multiboot image: /memtest86+_multiboot.bin

and then rebooted the system to make sure it could boot cleanly by itself. Fortunately it rebooted automatically without any issues!

After reboot I checked for reports of misalignment:

ewen@linux:~$ uptime
 11:53:28 up 4 min,  1 user,  load average: 0.05, 0.42, 0.25
ewen@linux:~$ sudo dmesg -T | grep -i misaligned
ewen@linux:~$ sudo dmesg -T | grep alignment
ewen@linux:~$ sudo dmesg -T | grep inconsistency

and was pleased to find that none were reported. I also checked all the alignment_offset values, and was pleased to see all of those were now "0" -- ie "naturally aligned" (in this case to the 4KiB physical sector boundaries):

ewen@linux:~$ cat /sys/block/sda/sda*/alignment_offset
ewen@linux:~$ cat /sys/block/sdb/sdb*/alignment_offset
ewen@linux:~$ cat /sys/block/md*/alignment_offset
ewen@linux:~$ cat /sys/block/dm*/alignment_offset

It is too soon to tell if this has any actual practical benefits in performance due to improving the alignment. But not being reminded that I "did it wrong" several years ago when putting the disks in -- due to the fdisk partition defaults at the time being wrong for 4 KiB physical sector disks -- seems worth the effort anyway.
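For reference, the alignment issue is easy to check by hand: with 512-byte logical sectors, a partition is 4 KiB aligned when its start sector is a multiple of 8. A small sketch, using the classic fdisk default start sectors:

```shell
# 4096 / 512 = 8 logical sectors per 4 KiB physical sector
is_aligned() {
  if [ $(( $1 % 8 )) -eq 0 ]; then echo aligned; else echo misaligned; fi
}
is_aligned 2048   # modern fdisk default start sector
is_aligned 63     # old fdisk default, misaligned on 4 KiB drives
```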

Posted Tue Jan 2 14:13:30 2018 Tags:


When I upgraded my Vodafone cable connection to a Vodafone FibreX installation, it came with a Huawei HG659 Home Gateway supplied as part of the connection, including some IPv6 support and 802.11ac WiFi. While I was not particularly keen on using a "telco CPE" as my home edge device (amongst other things they have a general reputation for being poorly secured), it was faster than anything else I had at the time, so I planned to use it until I had a specific reason to need something else to guide the next purchase.

That specific reason came along when I wanted a proper Network "DMZ" for my home network to host some development systems (I often work from home). The Huawei HG659 supports NAT pinholes routed to an internal system on the (single) LAN, but otherwise does not provide any real DMZ network isolation.

I purchased a Mikrotik RB750Gr3 -- nicknamed the Mikrotik "hEX" -- to be my replacement home edge router. They are around NZ$120 from the main local Mikrotik reseller, making them a fairly cost effective home router for someone needing extra flexibility. The Mikrotik RB750Gr3 has a dual-core 880MHz MIPS CPU, 256MB of RAM, 16MB of flash, and 5 Gigabit Ethernet (copper 1000Base-T) interfaces (via a single Ethernet switch chip). It is packaged in an indoor case, and while it is capable of PoE via ether1, in a home environment it would typically be powered by the supplied 24V DC plug pack.

(The other common "advanced home user" replacement seems to be a Ubiquiti device like the UniFi Security Gateway which can be configured for VLAN tagging, or the Ubiquiti EdgeRouter. I chose the Mikrotik because I am very familiar with them from previous jobs and client sites, and for single devices much prefer direct configuration to forced management via a "Security Controller" as is required by the Ubiquiti UniFi line. Something like the Ubiquiti EdgeRouter X 5-port, also a similar price in New Zealand, and apparently configurable from the command line, may well work just as well as the Mikrotik RB750Gr3 for this purpose -- I have not tried it myself.)

For my home Vodafone FibreX configuration I wanted:

  • A "WAN" connection to the Vodafone supplied TechniColor TC4400VDF DOCSIS 3.1 cable modem

  • Multiple LAN ports switched together

  • A "DMZ" interface with its own routing firewall rules, that was separated from both the WAN and LAN

  • Equal support for both IPv4 and IPv6 (Vodafone FibreX has provided IPv6 by default since shortly before my connection was converted to FibreX)

  • The ability to use IPv6 to expose devices in the DMZ, at least to known external locations, without needing to use IPv4 NAT or a VPN

I also wanted to continue to use the Vodafone-supplied Huawei HG659 home gateway for its 802.11ac WiFi, since it is the only 802.11ac AP in my house at present; I achieved that by simply disconnecting the WAN interface of the Huawei HG659 but leaving the LAN interface connected to the Mikrotik RB750Gr3 -- and changing the LAN address of the Huawei HG659 away from my LAN default gateway address. This means the Huawei HG659 acts as an 802.11ac WiFi to Gigabit Ethernet bridge (and also allows using the Huawei HG659 LAN ports as an additional Ethernet switch, which is handy as it is on my "comms" UPS). Presumably the Huawei HG659 is vulnerable to the Wifi KRACK replay attacks, but Vodafone issued a Huawei HG659 firmware update in August 2017, available for download via the Huawei HG659 user guide page, so maybe there will be another update available later in the year to fix more issues. In my configuration (without the WAN port connected) Vodafone will not be able to upgrade my Huawei HG659 remotely, but they do give instructions on manually installing the upgrade.


The configuration I chose is based on a GeekZone Forum example of Mikrotik configuration for Vodafone FibreX, a Mikrotik Wiki guide to "Securing Your Router", and a Mikrotik IPv6 Home Example, as well as a lot of experience configuring Mikrotik devices as routers over the years.

To do the initial configuration I used a reimplementation of the Mikrotik mac-telnet feature, which works directly over the Ethernet connection without requiring an IP address to be configured. Looking now, I see there is a later fork of the mac-telnet reimplementation with more features, as well as at least one other older independent implementation. (These mac-telnet reimplementations are much easier to use on Linux / OS X than trying to install WINE to get the Windows Mikrotik MAC-Telnet running -- although that does work too, and I have done it in the past.) Of note, there is also a Wireshark dissector for the MAC-Telnet protocol, and a reverse engineered packet description of the MAC-Telnet protocol -- it looks like it uses Ethernet broadcast frames with IPv4 UDP-like packets to/from port 20561.

Interface layout

The Mikrotik RB750Gr3 is a 5-port Gigabit Ethernet (1000Base-T) device, with all ports connected to an internal Ethernet switch chip. For my use case I wanted:

  • ether1 as the WAN interface, connected to the Vodafone supplied TechniColor TC4400VDF cable modem

  • ether2, ether3, and ether4 as LAN interfaces, with fast switching amongst them; and

  • ether5 as the DMZ interface

so that is the configuration used below. The Ethernet switch chip functionality ("master-port") is used for the LAN interfaces, which then all appear as ether2 as far as the rest of the configuration is concerned; the other two (ether1 for WAN; ether5 for DMZ) are stand-alone interfaces.

The Vodafone FibreX configuration -- unlike earlier Vodafone cable modem gateways, but like many UFB connections -- uses VLAN tagging, with VLAN 10, on the WAN connection. I assume Vodafone do this to have a fairly consistent configuration on their Huawei HG659 devices, which they use across multiple different connection types. Unlike many UFB connections, which still use PPPoE, the Vodafone FibreX connection continues to use what is now called "IPoE" -- IP over Ethernet, without additional layers like PPPoE (which itself is IP over PPP over Ethernet).

This means that there is an additional logical interface

  • VLAN 10 on ether1, which I have called "fibrex" in this configuration

to which all the IP level configuration is attached. Raw (untagged) ether1 is only used to reach the management IP of the TechniColor TC4400VDF cable modem (on -- and then only to check that it is alive, as the default page provides basically no information, and is password protected with unspecified passwords). (Sadly there is no equivalent of the Motorola SB5100 modem light status page.)

Interface configuration

To implement this interface layout, label the Ethernet interfaces, and join ether3 and ether4 to ether2 via the Ethernet switch:

/int ethernet set ether1 comment="WAN"
/int ethernet set ether2 comment="LAN"
/int ethernet set ether3 master-port=ether2 comment="LAN (switched)"
/int ethernet set ether4 master-port=ether2 comment="LAN (switched)"
/int ethernet set ether5 comment="DMZ"

then add the fibrex VLAN interface:

/int vlan add name=fibrex interface=ether1 vlan-id=10 comment="VLAN 10 on ether1"

to hold the IP configuration facing the Vodafone FibreX connection.

Mikrotik base configuration

  • Upgrade to a recent Mikrotik RouterOS; that was 6.40.3 when I did my install, but 6.40.4 or later is recommended now as 6.40.4 includes the Wifi KRACK improvements, as well as several IPv6 related fixes.

  • Set the system name to something to help you identify it, and enable IPv6 functionality:

    /system identity set name=MY-rb750gr3
    /system package enable ipv6
    /system reboot

    (a reboot is required to get the IPv6 modules running).

  • Set an admin password, and (optionally) add your own admin-level account:

    /user add copy-from=admin name=ME comment="MY FULL NAME"

    then log in with the new account, and set its password (eg, with /password).

  • Disable unnecessary services:

    /ip service print
    /ip service set telnet disabled=yes
    /ip service set ftp disabled=yes
    /ip service set www disabled=yes
    /ip service set api disabled=yes
    /ip service set winbox disabled=yes
    /ip service set api-ssl disabled=yes

    then check what is left enabled:

    /ip service print where disabled=no
  • Restrict ssh access to known internal networks and trusted IPs:

    /ip service set ssh address=A.B.C.D/24,E.F.G.H/32
    /ip ssh set strong-crypto=yes
    /ip ssh print
  • Disable Mikrotik WinBox server (since I do not use it; the client only runs on Windows / WINE):

    /tool mac-server mac-winbox set [find] disabled=yes
    /tool mac-server mac-winbox print
  • Permit mac-telnet only from internal interfaces, by overriding default and changing default access to disabled:

    /tool mac-server add interface=ether2 disabled=no
    /tool mac-server add interface=ether3 disabled=no
    /tool mac-server add interface=ether4 disabled=no
    /tool mac-server add interface=ether5 disabled=no
    /tool mac-server print
    /tool mac-server set 0 disabled=yes
    /tool mac-server print

    and disable the MAC-based "ping" functionality completely:

    /tool mac-server ping set enabled=no
    /tool mac-server ping print
  • Turn off Mikrotik Neighbor Discovery on external interfaces:

    /ip neighbor discovery print
    /ip neighbor discovery set ether1 discover=no
    /ip neighbor discovery set fibrex discover=no
    /ip neighbor discovery print
  • Disable other extraneous services:

    /ip dns set allow-remote-requests=no
    /ip proxy set enabled=no
    /ip socks set enabled=no
    /ip upnp set enabled=no
    /ip cloud set ddns-enabled=no update-time=no
    /tool bandwidth-server set enabled=no

    (several of those default to off, but it is good to be sure they are turned off when unneeded).

  • Disable IPv6 Neighbor Discovery by default (we will enable it specifically on internal interfaces later):

    /ipv6 nd set [find interface=all] disabled=yes
    /ipv6 nd print

    (Vodafone FibreX requires DHCPv6 to obtain IPv6 addresses for the WAN interface, and an IPv6 pool for the internal interfaces.)

IPv4 interface configuration

The Vodafone FibreX configuration expects you to use DHCPv4 to obtain the IP address, and by default hands out short-life leases (about 10 minutes) from a dynamic pool. It is possible to request a static IP address, but that is delivered as a static DHCPv4 lease, so DHCPv4 is still required. (I have requested a static IPv4 address because I often work from home, and needed access added for my IPv4 address on several clients' firewalls.)
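Those short leases are less alarming than they sound: DHCPv4 clients renew at T1, which defaults to 50% of the lease time (RFC 2131), so a 10 minute lease just means a quiet unicast renewal roughly every five minutes:

```shell
# T1 (renewal time) defaults to half the lease duration
lease=600             # seconds, ie a 10 minute lease
t1=$(( lease / 2 ))
echo "renew after ${t1}s"
```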

  • Configure the WAN interface, which needs to do DHCPv4 on the fibrex (VLAN 10 on ether1) interface:

    /ip dhcp-client add interface=fibrex add-default-route=yes use-peer-dns=no use-peer-ntp=no disabled=no

The internal addressing needs to use IPv4 "Site Local" (RFC1918) addresses, due to the practical exhaustion of the IPv4 global address pool about 10 years ago :-( I would strongly recommend picking a less common RFC1918 address -- not,, or any other vendor default -- to avoid future confusion.

  • Configure the LAN interface with a chosen RFC1918 address:

    /ip addr add interface=ether2 address=A.B.C.D/24 comment="LAN"
  • Configure the DMZ interface with another RFC1918 address (since "home" connections come with only a single IPv4 address :-( ):

    /ip addr add interface=ether5 address=E.F.G.H/24 comment="DMZ"

At this point the Mikrotik should route IPv4 traffic properly between the LAN and the DMZ -- but connections out to the Internet will fail due to the aforementioned IPv4 address exhaustion meaning that NAT is required -- see below for NAT configuration in the firewall section.

IPv6 interface configuration

The Vodafone FibreX provision of IPv6 is a dynamic IPv6 /56 delivered via DHCPv6; there is no option for static DHCPv6 leases, and the leases appear to be tied to the router's MAC address (so changing routers will result in completely new addresses). However the DHCPv6 leases are quite long (about 2 weeks), and renewal does appear to work, so the DHCPv6 addresses should be fairly consistent.

For IPv6 we need to request both an IPv6 address for the router's WAN interface and a pool (/56) of IPv6 addresses to use for allocating internal IP addresses. This is because IPv6 address allocation is designed to provide connectivity-based IP addresses, to minimise the size of the routing table. (There is a range of IPv6 Unique Local Addresses which are roughly equivalent to IPv4 RFC1918 addresses as "Site Local" addresses -- but they are not intended for use with Global Internet Routing, nor is NAT expected to be used with IPv6; instead end devices are expected to have multiple IPv6 addresses.)
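For a sense of scale, a delegated /56 leaves 8 bits of subnet ID above the /64 boundary used for SLAAC subnets, ie 256 possible internal /64 networks -- far more than the handful of internal interfaces here:

```shell
# subnet bits between the /56 delegation and the /64 subnet size
n=$(( 1 << (64 - 56) ))
echo "$n"    # number of /64 subnets in a /56
```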

  • Request the IPv6 address and pool from the fibrex interface:

    /ipv6 dhcp-client add interface=fibrex pool-name=fibrex-pool add-default-route=yes use-peer-dns=no request=address,prefix

Once we have the fibrex-pool we can then assign internal interfaces out of that pool:

  • The LAN first:

    /ipv6 address add interface=ether2 from-pool=fibrex-pool advertise=yes
    /ipv6 firewall address-list add list=ipv6_lan \
          address=[/ipv6 address get [/ipv6 address find interface=ether2 from-pool=fibrex-pool] address]
  • And then the DMZ:

    /ipv6 address add interface=ether5 from-pool=fibrex-pool advertise=yes
    /ipv6 firewall address-list add list=ipv6_dmz \
          address=[/ipv6 address get [/ipv6 address find interface=ether5 from-pool=fibrex-pool] address]

Because these addresses are dynamic (drawn from a pool, which could change), we add them into "/ipv6 firewall address-list" entries to make them easier to use in firewall rules. (We could arrange for scripts to be run each time the pool changes, and thus these IPs change, but in practice in the last few weeks they have been very stable, so I have not yet automated updating the firewall address-lists on pool change.)

The "advertise=yes" makes the address eligible for the Mikrotik to advertise it for SLAAC, which I have previously found worked best on my network (due to the Huawei HG659 DHCPv6 handing out duplicate addresses :-( ). This also avoids the need to set up stateful DHCPv6 on the internal interfaces.
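With SLAAC the router only announces the /64 prefix; each host builds its own interface ID. The classic scheme is modified EUI-64 (many modern hosts use privacy or stable-privacy addresses instead): flip the universal/local bit in the MAC's first octet and insert ff:fe in the middle. A sketch, using an example MAC address:

```shell
# Modified EUI-64: MAC 00:0c:42:aa:bb:cc -> interface ID 020c:42ff:feaa:bbcc
mac="00:0c:42:aa:bb:cc"
IFS=: read -r a b c d e f <<EOF
$mac
EOF
a=$(printf '%02x' $(( 0x$a ^ 2 )))   # flip the universal/local bit
iid="${a}${b}:${c}ff:fe${d}:${e}${f}"
echo "$iid"
```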

To actually enable Neighbor Discovery Router Announcements (for SLAAC) on these interfaces, because we disabled it globally above, we need to configure Neighbor Discovery for these internal interfaces:

  • On the LAN:

    /ipv6 nd add interface=ether2 disabled=no ra-interval=3m20s-10m \
          ra-delay=3s mtu=unspecified reachable-time=unspecified \
          retransmit-interval=unspecified ra-lifetime=30m hop-limit=unspecified \
          advertise-mac-address=yes advertise-dns=no \
          managed-address-configuration=no other-configuration=no comment="LAN (SLAAC)"
  • On the DMZ:

    /ipv6 nd add interface=ether5 disabled=no ra-interval=3m20s-10m \
        ra-delay=3s mtu=unspecified reachable-time=unspecified \
        retransmit-interval=unspecified ra-lifetime=30m hop-limit=unspecified \
        advertise-mac-address=yes advertise-dns=no \
        managed-address-configuration=no other-configuration=no comment="DMZ (SLAAC)"

Internet edge firewalling

The firewall configuration becomes fairly complex because we have three routing interfaces (WAN = Internet; LAN; DMZ), as well as the Mikrotik itself, and two network protocols (IPv4 and IPv6) which have completely separate addresses and firewall rules. This means that we need firewall rules to cover:

  • LAN to Internet

  • DMZ to Internet

  • LAN to DMZ

  • DMZ to LAN

  • Internet to LAN

  • Internet to DMZ

for both IPv4 and IPv6. Some of those can be very generic policies (eg, "Internet to LAN" should not allow any "unexpected" traffic; "LAN to Internet" may be okay allowing pretty much everything out), but others need a fair amount of detail.

In addition the IP addresses of the WAN interface are notionally dynamic for both IPv4 and IPv6, and the LAN and DMZ interface ranges are also dynamic for IPv6 (due to being auto-assigned out of DHCPv6 provided pools). And IPv4 Internet access requires NAT, due to home connections being provided with only a single IPv4 address to share amongst several internal devices.

Since IPv4 and IPv6 are essentially completely independent, they are covered separately below.

IPv4 Firewalling

IPv4 Address Lists

The easiest way to obtain flexibility in Mikrotik firewall rule sets is to make extensive use of the "/ip firewall address-list" facility to attach names to groups of IPv4 addresses -- and then use only those names (rather than literal IPv4 addresses) in the rule set as much as possible.

We start with definitions of the internal interfaces:

/ip firewall address-list add list=ipv4_lan address=A.B.C.D/24
/ip firewall address-list add list=ipv4_dmz address=E.F.G.H/24

which should match the IPv4 subnets used for the interface definitions above (a similar auto-define approach could be used to set the IPv4 address-lists as was used with the IPv6 address lists, but since the IPv4 internal addresses are fixed it does not seem necessary).

Another useful address list is a list of addresses which can externally manage the Mikrotik (eg, a work address for when you need to get into your home connection):

/ip firewall address-list add list=ipv4_ext_mgmt address=G.H.I.J comment="EXPLANATION"

repeat as needed to add multiple addresses; using, eg, a DNS name or site/company name in the comments helps with figuring out which one is which later on when they inevitably need to be updated.

It is also useful to define a "bogon" address list of addresses which should not appear on the Internet -- this IPv4 list is taken from RFC6890:

/ip firewall address-list
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=Multicast list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment=RFC6890 list=ipv4_bogon
add address= comment="6to4 relay Anycast [RFC 3068]" list=ipv4_bogon
/

(Note the trailing "/" to return the Mikrotik context back to the top level.)

IPv4 NAT to the Internet

Once we have done that, we can define the NAT rules needed for Internet access:

/ip firewall nat add chain=srcnat action=masquerade out-interface=fibrex src-address-list=ipv4_lan
/ip firewall nat add chain=srcnat action=masquerade out-interface=fibrex src-address-list=ipv4_dmz

which specifies that any IPv4 traffic from the LAN or DMZ address ranges, allowed out by the firewall rules to the Internet, will be sent out using the current IP of the fibrex interface. The use of "action=masquerade" means it should automatically adapt if the IPv4 external address ever changes.

IPv4 ICMP filtering

Over the years IPv4 ICMP has acquired a number of special case uses which probably should not be used on the modern Internet, so it can be useful to be more specific about which ICMP types are required. We can do this by creating an ipv4_icmp filter chain that whitelists the expected types and blocks all other ICMP.

This list should not be considered exhaustive, but is probably the minimum required for a functioning IPv4 connection:

/ip firewall filter
add chain=ipv4_icmp protocol=icmp icmp-options=0:0 action=accept comment="echo reply"
add chain=ipv4_icmp protocol=icmp icmp-options=3:0 action=accept comment="net unreachable"
add chain=ipv4_icmp protocol=icmp icmp-options=3:1 action=accept comment="host unreachable"
add chain=ipv4_icmp protocol=icmp icmp-options=3:4 action=accept comment="host unreachable fragmentation required"
add chain=ipv4_icmp protocol=icmp icmp-options=4:0 action=accept comment="allow source quench"
add chain=ipv4_icmp protocol=icmp icmp-options=8:0 action=accept comment="allow echo request"
add chain=ipv4_icmp protocol=icmp icmp-options=11:0 action=accept comment="allow time exceed"
add chain=ipv4_icmp protocol=icmp icmp-options=12:0 action=accept comment="allow parameter bad"
add chain=ipv4_icmp protocol=icmp action=drop comment="deny all other types"
/

(Again note the trailing "/" to reset the Mikrotik context to the top level.)

IPv4 Input/Output from the Mikrotik

The Mikrotik firewalling has an "input" chain for traffic to the Mikrotik itself, an "output" chain for traffic from the Mikrotik itself, and a "forward" chain for traffic originating outside the Mikrotik destined for somewhere outside the Mikrotik.

Having defined all the above address lists and helper filters, we can now define the "input" and "output" chains. These need to allow DHCPv4 (RFC 2131), as well as management traffic from known locations -- and block unexpected traffic from external locations.

The IPv4 input filter:

/ip firewall filter
add chain=input action=accept connection-state=established,related
add chain=input action=jump   protocol=icmp jump-target=ipv4_icmp
add chain=input action=accept protocol=udp in-interface=fibrex src-port=67 dst-port=68 comment="IPv4 DHCP"
add chain=input action=accept in-interface=fibrex src-address-list=ipv4_ext_mgmt
add chain=input action=accept in-interface=ether2 src-address-list=ipv4_lan
add chain=input action=accept in-interface=ether5 src-address-list=ipv4_dmz
add chain=input action=drop   in-interface=fibrex
add chain=input action=drop   in-interface=ether1
add chain=input action=reject

and the IPv4 output filter:

/ip firewall filter
add chain=output action=accept connection-state=established,related
add chain=output action=jump   protocol=icmp jump-target=ipv4_icmp
add chain=output action=accept protocol=udp out-interface=fibrex src-port=68 dst-port=67
add chain=output action=accept protocol=udp port=123 dst-address-list=ipv4_ntp_servers comment="NTP"
add chain=output action=reject out-interface=fibrex
add chain=output action=reject out-interface=ether1
add chain=output comment="Mikrotik Beacons to LAN" \
    dst-address= out-interface=ether2 port=5678 protocol=udp
add chain=output action=reject out-interface=ether2 log=yes log-prefix="To LAN"
add chain=output comment="Mikrotik Beacons to DMZ" \
    dst-address= out-interface=ether5 port=5678 protocol=udp
add chain=output action=reject out-interface=ether5 log=yes log-prefix="To DMZ"
add chain=output action=drop

Where is the IPv4 broadcast address.

Of note, IPv4 ICMP is filtered in both directions via the "known good" ICMPv4 whitelist defined above, DHCPv4 is allowed in both directions on the fibrex VLAN tagged interface, and traffic from management sources to the Mikrotik is permitted (including, by choice, all the internal addresses -- that could be locked down further if desired).

Traffic from the external interfaces is simply dropped without logging (because the Internet is filled with constant scanning), but traffic to the Internet is logged to help debug missing rules. Traffic to internal interfaces is also logged to help determine (a) if there are missing rules and (b) if anything is trying to reach into the internal network.

IPv4 traffic through the Mikrotik

We can fairly easily define default policies for traffic between the three interfaces:

  • Anything from the Internet to an internal interface should be blocked unless it is part of a connection established outbound, or a specific rule allowing traffic to the DMZ

  • Anything to the Internet should be allowed by default from known IPs

  • LAN to DMZ traffic should be allowed by default, but may be more filtered later

  • DMZ to LAN traffic should be limited, initially just ICMPv4 (but maybe later, eg, DNS and logging)

This means that only two of these cases need special treatment, LAN to DMZ, and DMZ to LAN:

/ip firewall filter
add chain=ipv4_lan_to_dmz action=accept

/ip firewall filter
add chain=ipv4_dmz_to_lan action=jump jump-target=ipv4_icmp
add chain=ipv4_dmz_to_lan action=reject

And then we can define some policies for traffic arriving on the LAN:

/ip firewall filter
add chain=ipv4_lan_out action=jump   src-address-list=ipv4_lan dst-address-list=ipv4_dmz in-interface=ether2 out-interface=ether5 jump-target=ipv4_lan_to_dmz
add chain=ipv4_lan_out action=accept src-address-list=ipv4_lan dst-address-list=!ipv4_bogon in-interface=ether2 out-interface=fibrex
add chain=ipv4_lan_out action=reject

and for traffic arriving on the DMZ:

/ip firewall filter
add chain=ipv4_dmz_out action=jump   src-address-list=ipv4_dmz dst-address-list=ipv4_lan in-interface=ether5 out-interface=ether2 jump-target=ipv4_dmz_to_lan
add chain=ipv4_dmz_out action=accept src-address-list=ipv4_dmz dst-address-list=!ipv4_bogon in-interface=ether5 out-interface=fibrex
add chain=ipv4_dmz_out action=reject

which use those LAN/DMZ policies for traffic between the LAN and DMZ interfaces, and the bogon list to filter traffic out to the Internet. "Everything else" unexpected is rejected; but in practice there should not be anything else.

Once those are defined, we can define a general IPv4 forwarding policy which hooks all of these together, and adds blocks for unexpected inbound traffic:

/ip firewall filter
add chain=forward action=fasttrack-connection connection-state=established,related comment="FastTrack (if possible)"
add chain=forward action=accept               connection-state=established,related comment="Other Established, Related"
add chain=forward action=drop                 connection-state=invalid comment="Drop invalid" log=yes log-prefix=Invalid
add chain=forward action=jump in-interface=ether2 jump-target=ipv4_lan_out
add chain=forward action=jump in-interface=ether5 jump-target=ipv4_dmz_out
add chain=forward action=drop in-interface=fibrex connection-nat-state=!dstnat connection-state=new comment="Inbound non-NAT" log=yes log-prefix=!NAT
add chain=forward action=drop in-interface=fibrex
add chain=forward action=drop in-interface=ether1
add chain=forward action=reject

and then our IPv4 firewall policy is complete, if fairly minimal.

The main thing I anticipate adding over time is some DMZ to LAN pinholes, maybe some Internet to DMZ pinholes (using IPv4 Destination NAT) and perhaps some further lock down of the LAN to DMZ traffic. Since those are all in their own rule sets they should be fairly easy to modify.

IPv6 Firewalling

IPv6 Firewalling is completely separate from IPv4 Firewalling on the Mikrotik (and many devices), due to using completely separate IP addresses, but by using "firewall address-lists" the shape of the firewall rules can (and arguably should) look very similar.

IPv6 address lists

We can define external management addresses:

/ipv6 firewall address-list add list=ipv6_ext_mgmt address=AAAA:BBBB:CCC:DDDD:EEEE:FFFF:GGGG:HHHH comment="DESCRIPTION"

to go along with the ipv6_lan and ipv6_dmz definitions that we calculated above (when defining the IPv6 IP addresses).

We can also add some helper address lists for known IPv6 types:

/ipv6 firewall address-list add list=ipv6_link_local address=fe80::/16
/ipv6 firewall address-list add list=ipv6_multicast  address=ff02::/16

and addresses which should not appear on the Internet:

/ipv6 firewall address-list add list=ipv6_bogons address=fc00::/7 comment="IPv6 Unique Local Addresses"

(in this case IPv6 ULA addresses mentioned above, which are site local).
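As a quick sanity check of that bogon entry: fc00::/7 covers every address whose first seven bits are 1111110, ie any IPv6 address whose first two hex digits are fc or fd. A trivial shell sketch of that test:

```shell
# fc00::/7 spans fc00:: through fdff:...: first byte 0xfc or 0xfd
classify() {
  case "$1" in
    [fF][cCdD]*) echo ULA ;;
    *) echo "not ULA" ;;
  esac
}
classify fd12:3456:789a::1   # a ULA address
classify 2001:db8::1         # a (documentation) global address
```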

IPv6 ICMP filtering

ICMPv6 is even more critical to IPv6 than ICMPv4 is to IPv4, so we need to be careful with filtering; there are also fewer "tried 20 years ago, do not use now" ICMPv6 types. However to match the pattern of firewall rules between IPv4 and IPv6, I also defined an ICMPv6 whitelist. This should definitely be considered the minimum and will almost certainly need expanding over time; hence the "accept but log" at the end before the "default drop" -- thus accepting everything, but tracking "unexpected" traffic.

/ipv6 firewall filter
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=1:0-255 comment="Destination Unreachable"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=2:0-255 comment="Packet Too Big"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=3:0-255 comment="Time Exceeded"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=4:0-255 comment="Parameter Problem"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=128:0-255 comment="Echo Request"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=129:0-255 comment="Echo Reply"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=132:0-255 comment="Multicast Listener Done"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=133:0-255 comment="Router Solicitation (NDP)"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=134:0-255 comment="Router Announcement (NDP)"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=135:0-255 comment="Neighbor Solicitation (NDP)"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=136:0-255 comment="Neighbor Announcement (NDP)"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=137:0-255 comment="Neighbor Redirect (NDP)"
add chain=ipv6_icmp action=accept protocol=icmpv6 icmp-options=143:0-255 comment="Version 2 Multicast Listener Report"
add chain=ipv6_icmp action=accept protocol=icmpv6 log=yes log-prefix=ICMPv6
add chain=ipv6_icmp action=drop

We can also add another filter for "just ping" to be used in more specific scenarios:

/ipv6 firewall filter
add chain=ipv6_ping action=accept protocol=icmpv6 icmp-options=128:0-255 comment="Echo Request"
add chain=ipv6_ping action=accept protocol=icmpv6 icmp-options=129:0-255 comment="Echo Reply"

IPv6 Input/Output from the Mikrotik

Having done that, we can define somewhat longer lists of traffic to/from the Mikrotik itself. Note that due to the extensive use of IPv6 Link Local addresses for key functions it is important that we allow those on each interface (the same addresses are used on each interface, with an interface specific route). We also need to allow IPv6 Link Local Multicast for the same reason. Like IPv4 we obviously need to allow DHCP, since that is how we get the addresses, but DHCPv6 is a different protocol on different UDP ports from DHCPv4 on IPv4.

Then we have an IPv6 input rule set looking similar to the IPv4 one:

/ipv6 firewall filter
add chain=input action=accept connection-state=established,related
add chain=input action=jump   protocol=icmpv6 jump-target=ipv6_icmp
add action=accept chain=input comment="DHCPv6 Replies" \
    dst-address-list=ipv6_link_local dst-port=546 in-interface=fibrex \
    protocol=udp src-address-list=ipv6_link_local src-port=547
add chain=input action=accept in-interface=fibrex src-address-list=ipv6_ext_mgmt
add chain=input action=accept in-interface=ether2 src-address-list=ipv6_lan
add chain=input action=accept in-interface=ether2 src-address-list=ipv6_link_local
add chain=input action=accept in-interface=ether2 src-address-list=ipv6_multicast
add chain=input action=accept in-interface=ether5 src-address-list=ipv6_dmz
add chain=input action=accept in-interface=ether5 src-address-list=ipv6_link_local
add chain=input action=accept in-interface=ether5 src-address-list=ipv6_multicast
add chain=input action=drop   in-interface=fibrex
add chain=input action=drop   in-interface=ether1
add chain=input action=reject

and a similar looking output rule set:

/ipv6 firewall filter
add chain=output action=accept connection-state=established,related
add chain=output action=jump   protocol=icmpv6 jump-target=ipv6_icmp
add action=accept chain=output comment=DHCPv6 dst-address=ff02::1:2/128 \
    dst-port=547 out-interface=fibrex protocol=udp \
    src-address-list=ipv6_link_local src-port=546
add chain=output action=accept out-interface=fibrex protocol=udp dst-port=68-69 comment="DHCP"
add chain=output action=reject out-interface=fibrex
add chain=output action=reject out-interface=ether1
add chain=output comment="Mikrotik Beacons to LAN" dst-address=ff02::1/128 \
    out-interface=ether2 port=5678 protocol=udp
add chain=output action=reject out-interface=ether2 log=yes log-prefix="To LAN"
add chain=output comment="Mikrotik Beacons to DMZ" dst-address=ff02::1/128 \
    out-interface=ether5 port=5678 protocol=udp
add chain=output action=reject out-interface=ether5 log=yes log-prefix="To DMZ"
add chain=output action=drop

The ff02::1/128 address is the "all nodes on link" address, basically the IPv6 equivalent of an IPv4 broadcast on a LAN segment; IPv6 does not have broadcast addresses as such.
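
As a quick sanity check, the all-nodes address can be pinged from the Mikrotik to see which hosts respond on a segment; because link-local scope is per-interface, the interface needs to be specified (using ether2 as an example):

/ping ff02::1 interface=ether2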

IPv6 traffic through the Mikrotik

The IPv6 firewall for traffic through the Mikrotik is like the IPv4 firewall for traffic through the Mikrotik, but simpler because it does not require any NAT -- we have globally unique addresses everywhere (I have chosen not to use IPv6 Unique Local Addresses at this time).

The main difference is that we can also receive traffic from the Internet direct to those global addresses -- and for now I have chosen to allow ICMPv6 Ping and nothing else, to help with debugging routing and other issues.

So we have IPv6 firewall chains for traffic from LAN to DMZ and DMZ to LAN:

/ipv6 firewall filter
add chain=ipv6_lan_to_dmz action=accept

/ipv6 firewall filter
add chain=ipv6_dmz_to_lan action=jump protocol=icmpv6 jump-target=ipv6_ping
add chain=ipv6_dmz_to_lan action=reject

which are then used by IPv6 firewall chains for traffic originating on the LAN and DMZ:

/ipv6 firewall filter
add chain=ipv6_lan_out action=jump   src-address-list=ipv6_lan \
    dst-address-list=ipv6_dmz in-interface=ether2 out-interface=ether5 \
    jump-target=ipv6_lan_to_dmz
add chain=ipv6_lan_out action=accept src-address-list=ipv6_lan \
    dst-address-list=!ipv6_bogons in-interface=ether2 out-interface=fibrex
add chain=ipv6_lan_out action=reject

/ipv6 firewall filter
add chain=ipv6_dmz_out action=jump   src-address-list=ipv6_dmz \
    dst-address-list=ipv6_lan in-interface=ether5 out-interface=ether2 \
    jump-target=ipv6_dmz_to_lan
add chain=ipv6_dmz_out action=accept src-address-list=ipv6_dmz \
    dst-address-list=!ipv6_bogons in-interface=ether5 out-interface=fibrex
add chain=ipv6_dmz_out action=reject

to both handle LAN to DMZ and LAN to Internet -- and DMZ to LAN and DMZ to Internet -- traffic.

Then we have some IPv6 inbound firewall rules to handle Internet originated traffic:

/ipv6 firewall filter
add chain=ipv6_lan_in action=jump src-address-list=!ipv6_bogons \
    dst-address-list=ipv6_lan in-interface=fibrex out-interface=ether2 \
    protocol=icmpv6 jump-target=ipv6_ping
add chain=ipv6_lan_in action=reject

/ipv6 firewall filter
add chain=ipv6_dmz_in action=jump src-address-list=!ipv6_bogons \
    dst-address-list=ipv6_dmz in-interface=fibrex out-interface=ether5 \
    protocol=icmpv6 jump-target=ipv6_ping
add chain=ipv6_dmz_in action=reject

(If I were starting again I might have called these ipv6_to_lan and ipv6_to_dmz, and the "out" ones ipv6_from_lan and ipv6_from_dmz; but I wanted to be consistent with the already defined above IPv4 firewall chain names, and the Mikrotik makes it non-trivial to change firewall chain names.)

Once all of the above is defined, we can define a general IPv6 "forward" policy that hooks into all these other chains as required:

/ipv6 firewall filter
add chain=forward action=accept connection-state=established,related comment="Other Established, Related"
add chain=forward action=drop   connection-state=invalid comment="Drop invalid" log=yes log-prefix=Invalid
add chain=forward action=jump in-interface=ether2 jump-target=ipv6_lan_out
add chain=forward action=jump in-interface=ether5 jump-target=ipv6_dmz_out
add chain=forward action=jump in-interface=fibrex out-interface=ether2 jump-target=ipv6_lan_in
add chain=forward action=jump in-interface=fibrex out-interface=ether5 jump-target=ipv6_dmz_in
add chain=forward action=drop in-interface=fibrex
add chain=forward action=drop in-interface=ether1
add chain=forward action=reject

Then the basic IPv6 firewall should be complete, ready to be extended over time. If the IPv6 addresses change then the address-lists will need some tweaking, but in theory the rules themselves should be fairly static.
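
The assembled IPv6 rule set, and its per-rule packet counters, can be reviewed at any point with:

/ipv6 firewall filter print
/ipv6 firewall filter print stats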

Other firewall related configuration

The IPv4 firewall state timeouts are relatively short by default in some cases so it can help to extend these:

/ip firewall connection tracking set tcp-fin-wait-timeout=5m \
    tcp-close-wait-timeout=5m tcp-last-ack-timeout=5m \
    tcp-time-wait-timeout=5m tcp-close-timeout=5m

and ideally we would do the same for IPv6, but there are no IPv6-specific firewall connection tracking options; it is unclear whether the IPv4 settings also apply to IPv6, or whether the IPv6 connection tracking times are simply not exposed.
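
The current IPv4 connection tracking settings, and the live connection table, can be inspected to confirm the new timeouts took effect:

/ip firewall connection tracking print
/ip firewall connection print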

Over time other firewall rules can be added to allow, eg,

  • NTP for time synchronisation via the WAN interface (to known NTP servers):

    /system ntp client set primary-ntp=A.B.C.D enabled=yes
    /ip firewall address-list add list=ipv4_ntp_servers address=A.B.C.D

    which hooks into the IPv4 output firewall rule set above.

  • Allow the DMZ to use the LAN DNS server:

    /ip firewall address-list add list=ipv4_lan_dns_server address=E.F.G.H
    /ip firewall filter print where chain=ipv4_dmz_to_lan
    /ip firewall filter add chain=ipv4_dmz_to_lan protocol=udp dst-port=53 \
        dst-address-list=ipv4_lan_dns_server comment="DNS (UDP)" place-before=NN
    /ip firewall filter add chain=ipv4_dmz_to_lan protocol=tcp dst-port=53 \
        dst-address-list=ipv4_lan_dns_server comment="DNS (TCP)" place-before=NN

    which will need appropriate place-before=NN entries to put it into the right location in the rules.


With all of this set up, it should be possible to plug ether1 of the Mikrotik into the Vodafone TechniColor TC4400VDF in place of the Huawei HG659. Then power cycle the Vodafone TechniColor TC4400VDF to force it to forget the internal MAC addresses, and let the network know to expect a new connection. Once the TechniColor TC4400VDF boots, the Mikrotik should be able to get IPv4 and IPv6 addresses via DHCPv4 and DHCPv6. You can inspect the DHCP state with:

/ip dhcp-client print
/ip dhcp-client print detail

/ipv6 dhcp-client print
/ipv6 dhcp-client print detail

and in both cases you are looking for a status of "bound", and some appropriate IP addresses, and an appropriate lease expiry time (minutes for the IPv4 address; weeks for the IPv6 addresses).

For IPv6 you can also inspect the IPv6 pool allocated, and what was assigned to the LAN and DMZ interfaces:

/ipv6 pool print
/ipv6 address print detail where global and interface=ether2
/ipv6 address print detail where global and interface=ether5

Providing the addresses used from the IPv6 pool are at the very start of the pool (ie, first allocations) they should be fairly stable over reboots of the Mikrotik (on each boot it will revert to the start of the pool). The addresses on those interfaces should be compared with the address lists for the LAN / DMZ IPv6 addresses if there are issues with IPv6 reachability:

/ipv6 firewall address-list print where list=ipv6_lan
/ipv6 firewall address-list print where list=ipv6_dmz

and if they are out of sync, use the commands shown in the IPv6 address definition sections to update the address-lists with the current values.
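
For example, resetting the LAN list by hand would look something like this (with a placeholder prefix in place of the real delegated one):

/ipv6 firewall address-list remove [find list=ipv6_lan]
/ipv6 firewall address-list add list=ipv6_lan address=2407:xxxx:xxxx:4800::/64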

This configuration has worked fairly reliably for me for the last month. The main issues have been:

  • Issues with DHCPv4 and DHCPv6, which eventually led to fairly broadly allowing DHCPv4 and DHCPv6 on the WAN interface (I think what was happening was that the state relating to the DHCP request was timing out and the reply was being ignored; DHCP uses different addresses at different stages depending on whether or not it already has an IP address); and

  • Multiple issues with the Vodafone FibreX headend going away (the Vodafone TechniColor TC4400VDF showing no uplink/downlink), particularly around three weeks ago (where it went away twice within 30 minutes on each of 2 days; I think Vodafone were doing some sort of maintenance, but it is unclear exactly what -- and it happened in the middle of the work day rather than overnight).

I have also changed the LAN address of the Vodafone supplied Huawei HG659 to a different IPv4 IP and disabled unneeded functionality:

  • IPv6 RA (Router Advertisements)

  • IPv6 DHCPv6

  • UPnP

so that it does not interfere, and then continued to use it as just an access point. It does complain it is "not connected to the Internet" (as the WAN interface is not connected, so its DHCP requests are failing) but otherwise it seems to work fine as just an access point.

Handling IPv6 DHCPv6 client/pool address changes

ETA 2017-11-12: After a few more weeks, it has turned out that the Vodafone "Dynamic but Stable" IPv6 addresses do end up changing often enough to be annoying (the change breaks the IPv6 firewalling, which breaks IPv6 for LAN clients, which causes delays in connecting :-( ). It also appears that the previous 2-week leases might have been reduced to a somewhat more sensible "several hours" lease time.

To handle this I have put some effort into Mikrotik scripting to track the changing LAN/DMZ IPv6 address ranges. Ideally this would happen when the IPv6 addresses themselves changed, but I cannot find a scripting hook on "/ipv6 address" or "/ipv6 pool" to use. The next best thing is to hook into the "/ipv6 dhcp-client" scripting features, and run a script when the DHCPv6 addresses are acquired, applied or removed. But since the IPv6 pool updates and IPv6 address updates from those pools might happen asynchronously, we need a bit of a delay before trying to update the "/ipv6 firewall address-list" entries -- I've chosen around 30 seconds as likely to be sufficient. Sadly there is not an easy way to schedule a script to "run in 30 seconds" (cf at on Unix systems); the best option seems to be to enable/disable a scheduler event that runs every 30 seconds, as a "one shot" run.

So the process is:

  • "/ipv6 dhcp-client ... script=..." which runs a "/system script" that enables the "update filters every 30 seconds" scheduler event.

  • In around 30 seconds, that script launches and (a) runs the script that will update the IPv6 address-lists, and (b) disables the "update filters every 30 seconds" scheduler event.

  • For some more robustness there is also another hourly scheduler event (which stays enabled) which also runs the same script to update the IPv6 address lists; hourly seemed often enough to minimise the "wrong IP" pain, while still keeping resource usage fairly low (amongst other things we are rewriting the config each time!)

The individual per-interface address list updates are simply the commands given earlier to set the "/ipv6 firewall address-list ..." entries, preceded by a command to clear the existing address-list entries (to avoid them accumulating months of old history!).

The basic per-interface scripts are:

/system script
add name=ipv6-update-lan-range owner=ewen policy=read,write \
    source="/ipv6 firewall address-list remove [/ipv6 firewall \
            address-list find list=ipv6_lan]; \
    /ipv6 firewall address-list add list=ipv6_lan address=[/ipv6 \
          address get [/ipv6 address find interface=ether2 \
          from-pool=fibrex-pool] address]"

/system script
add name=ipv6-update-dmz-range owner=ewen policy=read,write \
    source="/ipv6 firewall address-list remove [/ipv6 firewall \
           address-list find list=ipv6_dmz]; \
    /ipv6 firewall address-list add list=ipv6_dmz address=[/ipv6 \
          address get [/ipv6 address find interface=ether5 \
          from-pool=fibrex-pool] address]"

(Note the use of ";" between the two commands to separate them; the alternative is to embed CR (\r) and NL (\n) characters into the script.)
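
For illustration, the same LAN script written with embedded CR/NL escapes instead of ";" would look something like (a hypothetical variant, not the one actually installed):

/system script
add name=ipv6-update-lan-range owner=ewen policy=read,write \
    source="/ipv6 firewall address-list remove [/ipv6 firewall \
            address-list find list=ipv6_lan]\r\n/ipv6 firewall \
            address-list add list=ipv6_lan address=[/ipv6 \
            address get [/ipv6 address find interface=ether2 \
            from-pool=fibrex-pool] address]"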

Having done that, for convenience we combine them into one script which calls both:

/system script add name=ipv6-update-filters owner=ewen policy=read,write \
    source="/system script run ipv6-update-lan-range; \
            /system script run ipv6-update-dmz-range"

Then we can schedule that top level script hourly:

/system scheduler add name="ipv6-update-filters" \
        on-event="ipv6-update-filters" interval=1h

as a background precaution.

To make it run "on demand" for IPv6 DHCPv6 client changes, we need to create a delayed one-shot variant. To do this we make a place holder "once" script:

/system script add name="ipv6-update-filters-once" policy=read,write \
        source="/system script run ipv6-update-filters"

that initially just runs the top level script. Then we schedule that to run every 30 seconds, but leave it disabled:

/system scheduler add name="ipv6-update-filters-once" \
        on-event="ipv6-update-filters-once" disabled=yes interval=30s

and update the script that is run to disable the scheduler event that started it, so that enabling the scheduler event will result in a "once" run:

/system script set [/system script find name=ipv6-update-filters-once] \
        source="/system script run ipv6-update-filters; \
                /system scheduler set disabled=yes [/system scheduler \
                        find name=ipv6-update-filters-once]" \
        comment="Oneshot schedulable ipv6-update-filters"

and finally we create a script to enable that event when required:

/system script add name="ipv6-update-filters-in-30-seconds" \
        policy="read,write" source="/system scheduler set disabled=no \
                [/system scheduler find name=ipv6-update-filters-once]" \
        comment="Enable 'once' IPv6 Filter Update"

which we can test by hand with:

/system script run ipv6-update-filters-in-30-seconds

and then watch it with:

/system scheduler print
/system scheduler print
/ipv6 firewall address-list print

and we should see the "once" scheduler event get enabled, then after a while show that it has run once (run-count is 1), and be disabled again. Looking at the "/ipv6 firewall address-list print" should then show the updated addresses.

Once we are sure that it works, we can then hook it up to the IPv6 DHCPv6 client with:

/ipv6 dhcp-client set [/ipv6 dhcp-client find interface=fibrex] \
      script="/system script run ipv6-update-filters-in-30-seconds"

which in theory will run the "one-shot" filter update about 30 seconds after the DHCP change (and since it is idempotent, and via a scheduler event with its own event timing, it should not get run repeatedly very often -- and if it does, it should still work out okay). The hourly event remains as a backup.

Ideally there would be an easier way to express this "address-list contains the network of this interface" policy than the kludge described above, particularly with IPv6 address-lists where the underlying addresses are likely to change regularly for many users (IPv4 mostly avoids this problem by NAT and masquerade just tracking the changing IP). But hopefully IPv6 address changes will now require less manual intervention.

Of note, both of the linked Mikrotik examples collections are useful hints as to the range of things that can be done with Mikrotik Scripting. It takes a bit of creativity to express what you want, but the scripting language is reasonably full featured.

Posted Mon Oct 23 14:47:36 2017 Tags:

Earlier this week "Do Not Reply" let me know, in an email titled "Return Available", that it was time to file my company GST return:

You have GST returns for period ending 30 September 2017, due
30 October 2017, now available for filing for the following IRD numbers:

So this weekend I finished up the data entry and calculated the data needed to file the GST return, as usual. (Dear IRD, if you are listening, perhaps "Do Not Reply" is not the ideal sender for official correspondence? Maybe you could consider, eg, "IRD Notification Service"? Also "Return Available" seems like a confusing way to say "please file your GST return this month". Just saying.)

Of note for understanding what transpires below, I was forced to register for "MyIR" a couple of years ago to request IRD provide a Tax Residency Certificate; other countries have information, but IRD only provide a guide to determining tax residency, and needed the concept of a Tax Residency Certificate explained to them, including the fields required by their Double Taxation Treaty partners.

Because of that "MyIR" registration, I am now forced to file GST returns online (once you have registered, filing on paper is no longer an option). Previously the online filing has been relatively simple, but this weekend while the filing went okay, trying to exit out of the "MyGST" part of the "MyIR" website of the Inland Revenue Department turned into a comedy of errors:

  1. The "Log Off" button in the "MyGST" site, something you would hope would be regularly tested, failed to work. It tries to access (via Javascript obscurity):

    which seems a plausible enough URL, but actually ends up with:

    Secure Connection Failed
    The connection to the server was reset while the page was loading.

    every time I tried. (The "Logout" link on the "MyIR" site, also loading via Javascript, went to a different site, but did actually work; it is unclear if logging out of "MyIR" also logs you out of "MyGST", as they are presented as separate websites.)

  2. Since a working "Log Off" function seemed important to a site that holds sensitive, potentially confidential, information I tried to report the issue. Conveniently the "MyGST" site has a handy "Send us a message" link on its front page, so I attempted to use that. However I found:

    • It will not accept ".txt" attachments (to illustrate the problem): "File Type .txt is not allowed" with no indication of why it is not allowed. (I assume "not on the whitelist", but that raises the questions (a) why?! and (b) what "File Type"s are allowed. Experimentally I determined that PNG and PDF were allowed.)

    • There is no option to contact about the website, only "something else".

    • When you "Submit" the message you've written, the website simply returns to the "MyGST" home page with no indication whether or not the message was sent, where you might see the sent message, and no copy of the sent message emailed to you. (I tried twice; same result both times.)

    So that did not seem very promising.

    For the record, I eventually found -- much later -- that you can check if the message has been sent by:

    • Going to the "Activity Centre" tab of "MyGST"

    • Clicking on the "More..." button next to the "Messages" heading

    • Clicking on the "Outbox" tab of that mailbox

    and you will see your messages there, and can click on each one to view them. (Which showed that each of my two attempts had apparently been sent twice, despite the website not informing me it had done so; oops. It is unclear to me how they ended up each being sent twice; I did not, eg, click through a "resend POST data" dialogue.)

  3. When it was unclear if "Send us a message" in "MyGST" worked, I thought the next best option would be to go back to the "MyIR" site, and use "Secure mail" which is IRD's preferred means of contact (as I found out when, eg, trying to get a Tax Residency Certificate a couple of years ago). Unfortunately when I attempted to use that I found:

    • There is no option to choose "Website" or "GST" from the form at all, so I had to send an "All Other" / "All Other" message;

    • There was no option to add attachments to the message, so I could not include the screenshots/error output; and

    • When I submitted that message, I got a generic 404 error!

      which told me:

      Contact us
      Page not available
      The page you are trying to access is not available on our website.
      If you have reached this page by following a link on our website
      rather than using a bookmark, please take a moment to e-mail the
      General comments form with the details of the page you were trying
      to access.

    The "MyIR" "Secure Mail" feature does have an obvious "Sent" tab, so in another window I was quickly able to check that it had not in fact been sent. At this point I assumed I was 0 for 3 -- not a great batting average.

  4. Still, the 404 page did offer a link to the General Comments page:

    so it seemed worth reporting the accumulating list of problems. That "General Comments" page is (naturally) very general, but:

    • "Website" is not a category they have anticipated receiving comments about (so "Other" it is again); and

    • Your choices for response are:

      • No response required

      • In writing by mail

      • Over the phone

      • In writing by fax

      And that is it: no option to ask for a response by email. But if your 1990s fax machine is still hooked up and working then IRD is ready to respond to your online communication with your preferred option! (It appears that, based on your choice here, the second stage of the form requires you to enter different values; but "In writing by mail" does not even collect a postcode!)

      In fairness, the second stage of the form also allowed an optional email address to be entered -- which I did -- so possibly they might treat one of the above as "by email"; it is just not at all obvious to the user.

    • The box for entering comments was 40 characters wide by 4 characters deep -- there are programmable calculators with a larger display! (In fairness Firefox on OS X at least does allow resizing this; but nothing says "we hope you do not have much to say" like allowing an old-Tweet length worth of text to be visible on the screen at once.)

    Anyway, undeterred by all of this, I reported in brief the three problems I had encountered so far: (1) "MyGST" Log Off function broken; (2) "MyGST" "Send us a message" function apparently not working; (3) "MyIR" "Secure Mail" sending resulting in a 404.

    That one was successful, giving me a "comment sent" confirmation page, although without any tracking number or other identifier (the closest to an identifier is "Your request was sent on Sunday 8 October 2017 at 14:40"). Sadly my neatly laid out bullet point list of issues encountered was turned into a single line of terribly formatted run on text; it appears they were serious about people keeping their comments to old-Tweet length!

  5. After this experience I was surprised to find that the only working thing -- the General Comments Form -- offered me a chance to:

    Send feedback about this form

    Since I seemed to be on a yak shaving mission to use every feedback form on the site, who could resist?! I (successfully!) offered them anonymous feedback that:

    • In 2017, offering "response by email" might be a useful update;

    • Perhaps "In writing by fax" could be retired;

    • 40x4 character comment forms are... rather small and difficult to use.

    Only I had to do so much more tersely because the "Online Form Feedback" comment field was itself 40x4 characters.

On the plus side:

  • I did manage to file my GST return

  • Eventually if one is patient enough, one does get auto-logged out of the "MyIR" site, so maybe one does get auto-logged out of the "MyGST" site as well;

  • Apparently I did manage to report the original "MyGST" "Log Off" problem after all (and hopefully someone at IRD can merge those into a single ticket, rather than having four people investigating the problem).

Now to actually pay my GST amount due.

If IRD do respond with anything useful I will add an update to this post to record that, eg, some of the above issues have been fixed. At least two of them (the "MyGST" "Log Off" issue and the "MyIR" "Secure Mail" sending) seem likely to be encountered by other users and fixed.

ETA 2017-10-17: IRD responded to my second contact attempt (in MyGST) with:

"""Good Afternoon Ewen.

Thank you for your email on the 8th October 2017.

seeing you are having issues with the online service please
contact us on 0800 227 770.

As this service doesn't deal with these issues, This service
is for web message responses for GST accounts. We have forwarded
this message to be directed to our Technical Services as this
is a case for them."""

which, at one week to reply, is much better than their estimated reply time. I have assumed that "forwarded [...] to our Technical Services [...]" will be sufficient to get the original reports in front of someone who might be able to actually investigate/fix them, and not done anything further (calling an 0800 number for (frontline) "technical support" seems unlikely to end well over such a technical issue).

The "MyGST" "Log Off" functionality is still broken though. The "MyIR" logout functionality is slow, but does eventually work.

However going back to an earlier GST page after using the "MyIR" log out functionality, and reloading, still shows I am in my "MyGST" account, and I can access pages in "MyGST" that I previously had not viewed in this session. So it appears the two logoff functions are separate -- even though they are controlled by a single "logon" screen. By contrast, trying to go to a "MyIR" page does correctly show the login screen again.

So we learn that logging off "MyGST" separately is important, and that it is still broken (at least in Firefox 52 LTS on OS X 10.11, and Safari 11 on OS X 10.11; both retested today, 2017-10-17).

Posted Sun Oct 8 16:47:59 2017 Tags:

I have a Huawei HG659 Home Gateway supplied as part of my Vodafone FibreX installation. Out of the box, since early 2017, the Vodafone FibreX / Huawei HG659 combination has natively provided IPv6 support. This automagically works on modern mac OS:

ewen@osx:~$ ping6
PING6(56=40+8+8 bytes) 2407:7000:9b0e:4856:b971:8973:3fe3:1a51 --> 2404:6800:4006:804::200e
16 bytes from 2404:6800:4006:804::200e, icmp_seq=0 hlim=57 time=46.574 ms
16 bytes from 2404:6800:4006:804::200e, icmp_seq=1 hlim=57 time=43.953 ms

and Linux:

ewen@linux:~$ ping6
PING (2404:6800:4006:804::200e) 56 data bytes
64 bytes from (2404:6800:4006:804::200e): icmp_seq=1 ttl=57 time=44.8 ms
64 bytes from (2404:6800:4006:804::200e): icmp_seq=2 ttl=57 time=43.8 ms

to provide global IPv6 connectivity, for a single internal VLAN, without having to do anything else.

Vodafone delegates a /56 prefix to each customer, which in theory means that it should be possible to further sub-delegate that within our own network for multiple subnets -- most IPv6 features will work down to /64 subnets. I think the /56 is being provided via DHCPv6 Prefix Delegation (see RFC3633 and RFC3769; see also OpenStack Prefix Delegation discussion).
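
A /56 contains 256 possible /64 subnets (2^(64-56) = 256), so in principle there is plenty of room for multiple internal subnets. Had the prefix delegation below worked, assigning a /64 out of the delegated pool to an internal interface would look something like this sketch (using ether2 as a hypothetical internal interface; advertise=yes enables router advertisements for the prefix):

/ipv6 address add address=::/64 from-pool=ipv6-local \
      interface=ether2 advertise=yes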

Recently I started looking at whether I could configure an internal Mikrotik router to route dynamically-obtained IPv6 prefixes from the Huawei HG659's /56 pool, to create a separate -- more isolated -- internal subnet. A very useful Mikrotik IPv6 Home Example provided the Mikrotik configuration required, although I did have to update it slightly for later Mikrotik versions (tested with RouterOS 6.40.1).

Enable IPv6 features on the Mikrotik if they are not already enabled:

/system package enable ipv6
/system package print

If the "print" shows you an "X" with a note that it will be enabled after reboot, then also reboot the Mikrotik at this point:

/system reboot

After that, you should have an IPv6 Link Local Address on the active interface, which you can see with:

[admin@naos-rb951-2n] > /ipv6 addr print
Flags: X - disabled, I - invalid, D - dynamic, G - global, L - link-local
 #    ADDRESS                                     FROM-... INTERFACE        ADV
 0 DL fe80::d6ca:6dff:fe50:6c44/64                         ether1           no
[admin@naos-rb951-2n] >

(The IPv6 Link Local addresses are recognisable as being in fe80::/64, and on the Mikrotik will show as "DL" -- dynamically assigned, link local.)

Once that is working configure the Mikrotik IPv6 DHCPv6 client to request a Prefix Delegation with:

/ipv6 dhcp-client add interface=ether1 pool-name=ipv6-local \
      add-default-route=yes use-peer-dns=yes request=prefix

Unfortunately when I tried that, it never succeeded in getting an answer from the Huawei HG659. Instead the status was stuck in "searching":

[admin@naos-rb951-2n] > /ipv6 dhcp-client print detail
Flags: D - dynamic, X - disabled, I - invalid
 0    interface=ether1 status=searching... duid="0x00030001d4ca6d506c44"
      dhcp-server-v6=:: request=prefix add-default-route=yes use-peer-dns=yes
      pool-name="ipv6-local" pool-prefix-length=64 prefix-hint=::/0
[admin@naos-rb951-2n] >

which makes me think that while the Huawei HG659 appears to be able to request an IPv6 prefix delegation (with a DHCPv6 client) it does not appear to provide a DHCPv6 server that is capable of prefix delegation, which rather defeats the purpose of having a /56 delegated :-(

Deleting the client entry and instead requesting both an address and a prefix:

/ipv6 dhcp-client remove 0
/ipv6 dhcp-client add interface=ether1 pool-name=ipv6-local \
      add-default-route=yes use-peer-dns=yes request=address,prefix

which appears to be the syntax to request both an interface address and a prefix delegation, did not work any better, still getting stuck with a status of "searching...":

[admin@naos-rb951-2n] > /ipv6 dhcp-client print  detail
Flags: D - dynamic, X - disabled, I - invalid
 0    interface=ether1 status=searching... duid="0x00030001d4ca6d506c44"
      dhcp-server-v6=:: request=address,prefix add-default-route=yes
      use-peer-dns=yes pool-name="ipv6-local" pool-prefix-length=64
[admin@naos-rb951-2n] >

If I delete that and just request an address:

/ipv6 dhcp-client remove 0
/ipv6 dhcp-client add interface=ether1 pool-name=ipv6-local \
      add-default-route=yes use-peer-dns=yes request=address

then the DHCPv6 request does succeed very quickly:

[admin@naos-rb951-2n] > /ipv6 dhcp-client print
Flags: D - dynamic, X - disabled, I - invalid
 #    INTERFACE                     STATUS        REQUEST
 0    ether1                        bound         address
[admin@naos-rb951-2n] >

and there is an additional IPv6 address visible for that interface:

[admin@naos-rb951-2n] > /ipv6 addr print
Flags: X - disabled, I - invalid, D - dynamic, G - global, L - link-local
 #    ADDRESS                                     FROM-... INTERFACE        ADV
 0 DL fe80::d6ca:6dff:fe50:6c44/64                         ether1           no
 1 IDG ;;; duplicate address detected
      2407:xxxx:xxxx:4800::2/64                            ether1           no
[admin@naos-rb951-2n] >

Unfortunately the "I" flag and the "duplicate address detected" comment are both very bad signs -- that the address supplied by DHCPv6 is unusable. When I look around other devices on my network I find that they too have that address, including my main OS X 10.11 laptop:

ewen@ashram:~$ ifconfig -a | grep -B 10 ::2 | egrep "^en|::2"
        inet6 2407:xxxx:xxxx:4800::2 prefixlen 128 dynamic

and another OS X 10.11 laptop:

ewen@mandir:~$ ifconfig -a | grep -B 9 ::2 | egrep "^en|::2"
        inet6 2407:xxxx:xxxx:4800::2 prefixlen 128 duplicated dynamic

which implies that the Huawei HG659 DHCPv6 server is handing out the same (::2) address to multiple clients (possibly all clients?!), and only the first client to make the request has a reasonable chance of working (in theory the others will discover via Duplicate Address Detection (RFC4862) that the address is already in use, and invalidate it, to allow the first client to keep working).

From all of this I conclude that the Huawei HG659 DHCPv6 server will basically only work in a useful fashion for a single DHCPv6 client that wants a single address -- so it is almost useless. In particular the DHCPv6 server does not appear to provide a way to make use of parts of the IPv6 /56 delegation provided by Vodafone.

Yet IPv6 global transit does work from multiple OS X and Linux devices on my home network -- so they are clearly not (solely) reliant on IPv6 DHCPv6 working properly.

The reason they have working IPv6 transit is that OS X and Linux will also do SLAAC -- Stateless Address Auto-Configuration (RFC4862) -- to obtain an IPv6 address and default route. SLAAC uses the IPv6 Neighbor Discovery Protocol (RFC4861) to determine the IPv6 address prefix (/64), and a Modified EUI-64 algorithm (described in RFC5342 section 2.2) to determine the IPv6 address suffix (64 bits).
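
The Modified EUI-64 step can be sketched in a few lines of Python; the MAC address used here (d4:ca:6d:50:6c:44, visible in the Mikrotik's DUID earlier) reproduces the fe80:: link-local address from the /ipv6 addr print output above:

```python
def eui64_suffix(mac: str) -> str:
    """Modified EUI-64: flip the Universal/Local bit, insert ff:fe in the middle."""
    octets = [int(part, 16) for part in mac.split(":")]
    octets[0] ^= 0x02                               # invert the U/L bit (eg 0x68 -> 0x6a)
    full = octets[:3] + [0xFF, 0xFE] + octets[3:]   # 48-bit MAC -> 64-bit identifier
    return ":".join(f"{full[i] << 8 | full[i + 1]:x}" for i in range(0, 8, 2))

# The Mikrotik's MAC (d4:ca:6d:50:6c:44) gives the link-local address seen earlier:
print("fe80::" + eui64_suffix("d4:ca:6d:50:6c:44"))   # fe80::d6ca:6dff:fe50:6c44
```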

Providing the Huawei HG659 is configured to send IPv6 RA ("Router Advertisement") messages (Home Interface -> LAN Interface -> RA Settings -> Enable RA is ticked), then SLAAC should work. There are two other settings:

  • "RA mode": automatic / manual. In automatic mode it appears to pick a prefix from the /56 that the IPv6 DHCPv6 Prefix Delegation client obtained from Vodafone -- apparently the "56" prefix (at least in my case), for no really obvious reason. In manual mode you can specify a prefix, but that does not seem very useful when the larger prefix you have is dynamically allocated....

  • "ULA mode": disable / automatic / manual. This controls the delegation of IPv6 Unique Local Addresses (RFC4193), which are site-local addresses in the fc00::/7 block. By default it is set to "automatic" which appears to result in the Huawei HG659 picking a prefix block at random (as indicated by a fd00::/8 address). "manual" allows manual specification of the block to use, and "disable" I assume turns off this feature.

Together these four features (IPv6 Link Local Addresses, IPv6 DHCPv6, IPv6 SLAAC, RFC4193 Unique Local Addresses) explain most of the IPv6 addresses that I see on my OS X client machines. For instance (some of the globally unique /56 prefix replaced with xxxx:xxxx, and the last three octets of the SLAAC addresses replaced by yy:yyyy for privacy):

ewen@ashram:~$ ifconfig en6 | egrep "^en|inet6" 
        inet6 fe80::6a5b:35ff:feyy:yyyy%en6 prefixlen 64 scopeid 0x4
        inet6 2407:xxxx:xxxx:4856:6a5b:35ff:feyy:yyyy prefixlen 64 autoconf
        inet6 2407:xxxx:xxxx:4856:b971:8973:3fe3:1a51 prefixlen 64 autoconf temporary
        inet6 fd50:1d9:5e3e:8300:6a5b:35ff:feyy:yyyy prefixlen 64 autoconf
        inet6 fd50:1d9:5e3e:8300:e010:afec:6457:2850 prefixlen 64 autoconf temporary
        inet6 2407:xxxx:xxxx:4800::2 prefixlen 128 dynamic

In this list:

  • The fe80::6a5b:35ff:feyy:yyyy%en6 address is the IPv6 Link Local Address, derived from the prefix fe80::/64 and an EUI-64 suffix derived from the interface MAC address (as described in RFC2373). It is approximately the first 3 octets of the MAC address, then ff:fe, then the last 3 octets of the MAC address -- but the Universal/Local bit of the MAC address is inverted in IPv6, so as to make ::1, ::2 style hand-created addresses end up automatically marked as "local". (While this seems clever, with perfect hindsight it would perhaps have been better if the IEEE MAC address Universal/Local flag was a Local/Universal flag with the bit values inverted, for the same reason... and perhaps better positioned in the bit pattern.) In this case 0x68 in the MAC address becomes 0x6a:

    ewen@ashram:~$ perl -le 'printf("%08b\n", 0x68);'
    01101000
    ewen@ashram:~$ perl -le 'printf("%08b\n", 0x6a);'
    01101010

    by setting this additional (7th from the left) bit.

  • The 2407:xxxx:xxxx:4856:6a5b:35ff:feyy:yyyy address is the globally routable IPv6 SLAAC address, derived from the SLAAC /64 prefix obtained from the IPv6 Router Advertisement packets and the EUI-64 suffix as described above (where the SLAAC /64 prefix provided by the Huawei HG659 itself came from an IPv6 DHCPv6 Prefix Delegation request made by the Huawei HG659). This address is recognisable by the "autoconf" flag indicating SLAAC, and the non-fd prefix.

  • The fd50:1d9:5e3e:8300:6a5b:35ff:feyy:yyyy address is the Unique Local Address (RFC4193), derived from a randomly generated prefix in fd00::/8 and the EUI-64 suffix as described above. This address is recognisable by the "autoconf" flag indicating SLAAC, and the fd prefix. (See also "3 Ways to Ruin Your Future Network with IPv6 Unique Local Addresses" Part 1 and Part 2 -- basically by re-introducing all the pain of NAT to IPv6, as well as all the pain of "everyone uses the same site-local prefixes".)

  • The 2407:xxxx:xxxx:4800::2 address is obtained from the Huawei HG659 DHCPv6 server, and consists of the first /64 in the /56 that the Huawei HG659 DHCPv6 client obtained via Prefix Delegation, and a DHCP assigned suffix, starting with ::2 (where I think the Huawei HG659 itself is ::1, but it does not respond to ICMP with that address). This address is recognisable by the "dynamic" flag indicating DHCPv6.

    Unfortunately as described above the Huawei HG659 DHCPv6 DHCP server is broken (at least in Huawei HG659 firmware version V100R001C206B020), and mistakenly hands out the same DHCP assigned suffix to multiple clients. This means that only the lucky first DHCPv6 client on the network will have a working DHCPv6 address. (It also appears, as described above, that it does not support DHCPv6 Prefix Delegation.)

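These categories can be distinguished programmatically; a short sketch using Python's ipaddress module (the classify helper is illustrative, not part of any tool mentioned here):

```python
import ipaddress

def classify(addr: str) -> str:
    """Bucket an IPv6 address into the categories discussed above."""
    a = ipaddress.IPv6Address(addr.split("%")[0])    # strip any %en6 zone id
    if a.is_link_local:
        return "link-local"
    if a in ipaddress.IPv6Network("fc00::/7"):       # RFC4193 Unique Local
        return "unique local"
    if a.is_global:
        return "global"
    return "other"

print(classify("fe80::d6ca:6dff:fe50:6c44%en6"))     # link-local
print(classify("fd50:1d9:5e3e:8300::1"))             # unique local
```
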
That explains all but two of the IPv6 addresses listed. The remaining two have the "temporary" flag:

ewen@ashram:~$ ifconfig en6 | egrep "^en|inet6" | egrep "^en6|temporary"
         inet6 2407:xxxx:xxxx:4856:b971:8973:3fe3:1a51 prefixlen 64 autoconf temporary
         inet6 fd50:1d9:5e3e:8300:e010:afec:6457:2850 prefixlen 64 autoconf temporary

and those are even more special. IPv6 Temporary Addresses are created to reduce the ability to track the same device across multiple locations through the SLAAC EUI-64 suffix -- which, being predictably derived from the MAC address, will stay the same across multiple SLAAC prefixes. Mac OS X (since OS X 10.7 -- Lion) and Microsoft Windows (since Windows Vista) will generate, and use them, by default.

The relevant RFC is RFC4941 which defines "Privacy Extensions for Stateless Address Autoconfiguration in IPv6". Basically it defines a method to create additional ("temporary") IPv6 addresses, following a method like IPv6 SLAAC, which are not derived from a permanent identifier like the ethernet MAC address -- instead an IPv6 suffix is randomly generated and used in place of the EUI-64 suffix. Amusingly the suggested algorithm appears to be old enough to use the (now widely deprecated) MD5 hash algorithm as part of the derivation steps. (These temporary/"Privacy" addresses are supported on many modern OSes.)
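
The RFC4941 derivation itself can be sketched roughly as follows (a simplification: real implementations persist the history value across reboots and retry on duplicates):

```python
import hashlib
import os

def rfc4941_temporary_iid(history: bytes, eui64_iid: bytes):
    """Sketch of the RFC4941 derivation: MD5 over the previous history value
    concatenated with the (EUI-64) interface identifier."""
    digest = hashlib.md5(history + eui64_iid).digest()
    iid = bytearray(digest[:8])     # leftmost 64 bits become the temporary suffix
    iid[0] &= ~0x02                 # clear the U/L bit: not a universal identifier
    return bytes(iid), digest[8:]   # rightmost 64 bits seed the next derivation

# eg, a first-time derivation from random seeds
iid, next_history = rfc4941_temporary_iid(os.urandom(8), os.urandom(8))
```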

These RFC4941 "temporary" addresses normally have a shorter lifetime, which can be seen on OS X with "ifconfig -L":

ewen@ashram:~$ ifconfig -L en6 | egrep "^en6|temporary"
        inet6 2407:xxxx:xxxx:4856:b971:8973:3fe3:1a51 prefixlen 64 autoconf temporary pltime 3440 vltime 7040
        inet6 fd50:1d9:5e3e:8300:e010:afec:6457:2850 prefixlen 64 autoconf temporary pltime 3440 vltime 7040

but on the system I checked both the temporary and the permanent SLAAC addresses had the same pltime/vltime values, which I assume are derived from the SLAAC validity times. The "pltime" is the "preferred" lifetime, and the "vltime" is the "valid" lifetime; I think that after the preferred lifetime an attempt will be made to renew the address, and after the valid lifetime the address will be expired (assuming it is not renewed/replaced before then).

It appears that in macOS 10.12 (Sierra) and later, even the non-temporary IPv6 addresses no longer use the EUI-64 approach to derive the address suffix from the MAC address -- which means the "permanent" addresses also changed between 10.11 and 10.12. I do not currently have a macOS 10.12 (Sierra) system to test this on. I found a claim these are RFC 3972 "Cryptographically Generated Addresses", but there does not seem to be much evidence for the exact algorithm used. (There are also suggestions that this is an implementation of RFC7217 "Semantically Opaque Interface Identifiers" which effectively make the IPv6 suffix also depend on the IPv6 prefix. Ie, the resulting address would be stable given the same Prefix, but different for each prefix. See also an IPv6 on OS X Hardening Guide -- from 2015, so probably somewhat out of date now.)
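
If it is RFC7217, the idea can be sketched as below -- note this is only illustrative: RFC7217 leaves the choice of pseudo-random function to the implementation, so SHA-256 and these exact inputs are my assumptions:

```python
import hashlib

def stable_opaque_iid(prefix: bytes, if_name: bytes, secret_key: bytes,
                      dad_counter: int = 0) -> bytes:
    """Sketch of the RFC7217 idea: hash the prefix into the suffix, so the
    address is stable on a given network but different on each network."""
    data = prefix + if_name + bytes([dad_counter]) + secret_key
    iid = bytearray(hashlib.sha256(data).digest()[:8])
    iid[0] &= ~0x02        # mark the identifier as locally generated
    return bytes(iid)

secret = b"per-host secret kept in stable storage"   # illustrative only
iid_a = stable_opaque_iid(b"\x24\x07" * 4, b"en6", secret)
iid_b = stable_opaque_iid(b"\xfd\x50" * 4, b"en6", secret)
assert iid_a != iid_b     # different prefix, different suffix
```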

Returning to the problem I started with, configuring a Mikrotik for IPv6, I found that the Mikrotik could have an interface address configured with SLAAC, by setting:

/ipv6 settings set accept-router-advertisements=yes

or:

/ipv6 settings set accept-router-advertisements=yes-if-forwarding-disabled forward=no

(see Mikrotik IPv6 Settings), but at least on 6.40.1 the resulting IPv6 SLAAC address is still not visible anywhere in the configuration, even after a reboot. (Bouncing the interface, or rebooting -- /system reboot -- is required to initiate SLAAC.)

You can check the address did get assigned properly by pinging it from another SLAAC configured system, with the EUI-64 derived suffix:

ewen@ashram:~$ ping6 -c 2 2407:xxxx:xxxx:4856:d6ca:6dff:feyy:yyyy
PING6(56=40+8+8 bytes) 2407:xxxx:xxxx:4856:6a5b:35ff:fe88:8f6e --> 2407:7000:9b0e:4856:d6ca:6dff:feyy:yyyy
16 bytes from 2407:xxxx:xxxx:4856:d6ca:6dff:feyy:yyyy, icmp_seq=0 hlim=255 time=0.440 ms
16 bytes from 2407:xxxx:xxxx:4856:d6ca:6dff:feyy:yyyy, icmp_seq=1 hlim=255 time=0.504 ms

--- 2407:xxxx:xxxx:4856:d6ca:6dff:feyy:yyyy ping6 statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.440/0.472/0.504/0.032 ms

In the default IPv6 settings:

[admin@naos-rb951-2n] > /ipv6 settings print
                       forward: yes
              accept-redirects: yes-if-forwarding-disabled
  accept-router-advertisements: yes-if-forwarding-disabled
          max-neighbor-entries: 8192
[admin@naos-rb951-2n] >

then IPv6 SLAAC will not be performed; but with either of the settings above (after a reboot: /system reboot) SLAAC will be performed.

Other than the UI/display issue, this is consistent with the idea that the WAN interface of a router should be assignable using SLAAC, but not entirely consistent with the documentation which says SLAAC cannot be used on routers. It is just that to be useful routers typically need IP addresses for multiple interfaces, and the only way to meaningfully obtain those is either IPv6 DHCPv6 Prefix Delegation -- or static configuration.

Since the Huawei HG659 appears not to provide a usable IPv6 DHCPv6 server there is no way to get DHCPv6 Prefix Delegation working internally, which means my best option will be to replace Huawei HG659 with something else as the "Home Gateway" connected to the Technicolor TC4400VDF modem. Generally people seem to be using a Mikrotik RB750Gr3 (a "hEX") for which there is a general Mikrotik with Vodafone setup guide available. It is a small 5 * GigE router capable of up to 2 Gbps throughput in ideal conditions (by contrast the old Mikrotik RB951-2n that I had lying around to test with is only 5 * 10/100 interfaces, so slower than my FibreX connection let alone my home networking).

In theory the Mikrotik IPv6 support includes both DHCPv6 Prefix Delegation in the client and server, including on-delegating smaller prefixes. Which should mean that if a Mikrotik RB750Gr3 were directly connected to the Technicolor TC4400VDF cable modem it could handle all my requirements, including creating isolated subnets in IPv4 and IPv6. (The Huawei HG659 supports the typical home "DMZ" device by NAT'ing all IPv4 traffic to a specific internal IP, but it is not very isolated unless you NAT to another firewall like the Mikrotik and then forward from there to the isolated subnet -- and I would really prefer to avoid double NAT. That Huawei HG659 DMZ support also appears to be IPv4 only, and it does not appear to support static IPv4 routes on the LAN interface either -- the static routing functions only allow you to choose a WAN interface.)

Since I seem to have hit up against the limits of the Huawei HG659, my "interim" use of the supplied Huawei HG659 appears to be coming to an end. In the meantime I have turned off the DHCPv6 server on the Huawei HG659 (Home Network -> Lan Interface -> IPv6 DHCP Server -> IPv6 DHCP Server should be unticked).

For the record, the Mikrotik MAC Telnet reimplementation appears to work quite well on OS X 10.11, providing you already know the MAC address you want to reach (eg, from the sticker on the outside of the Mikrotik). That helps a lot with reconfiguration of the Mikrotik for a new purpose, without relying on a Microsoft Windows system or WINE.

Posted Sun Aug 20 17:42:10 2017 Tags:

KeePassXC (source, wiki) is a password manager forked from KeePassX which is a Linux/Unix port of the Windows KeePass Password Safe. KeePassXC was started because of concern about the relatively slow integration of community code into KeePassX -- ie it is a "Community" fork with more maintainers. KeePassXC seems to have been making regular releases in 2017, with the most recent (KeePassXC 2.2.0) adding Yubikey 2FA support for unlocking databases. KeePassXC also provides builds for Linux, macOS, and Windows, including package builds for several Linux distributions (eg an unofficial Debian/Ubuntu community package build, built from the deb package source with full build instructions).

For macOS / OS X there is a KeePassXC 2.2.0 for macOS binary bundle, and KeePassXC 2.2.0 for macOS sha256 digest. They are GitHub "release" downloads, which are served off Amazon S3. KeePassXC provide instructions on verifying the SHA256 Digest and GPG signature. To verify the SHA256 digest:

  • wget

  • wget

  • Check the SHA256 digest matches:

    ewen@ashram:~/Desktop$ shasum -a 256 -c KeePassXC-2.2.0.dmg.digest
    KeePassXC-2.2.0.dmg: OK
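
What shasum -a 256 -c does behind the scenes can be sketched in Python (the helper name and file names are illustrative):

```python
import hashlib

def verify_sha256(path: str, expected_hex: str) -> bool:
    """Hash the file in chunks and compare against the published digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex.strip().lower()

# eg, verify_sha256("KeePassXC-2.2.0.dmg", "<digest from KeePassXC-2.2.0.dmg.digest>")
```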

To verify the GPG signature of the release:

  • wget

  • wget (which is stored inside the website repository)

  • gpg --import keepassxc_master_signing_key.asc

  • gpg --recv-keys 0xBF5A669F2272CF4324C1FDA8CFB4C2166397D0D2 (alternatively or in addition; in theory it should report it is unchanged)

    ewen@ashram:~/Desktop$ gpg --recv-keys 0xBF5A669F2272CF4324C1FDA8CFB4C2166397D0D2
    gpg: requesting key 6397D0D2 from hkps server
    gpg: key 6397D0D2: "KeePassXC Release <>" not changed
    gpg: Total number processed: 1
    gpg:              unchanged: 1

  • Compare the fingerprint on the website with the output of "gpg --fingerprint 0xBF5A669F2272CF4324C1FDA8CFB4C2166397D0D2":

    ewen@ashram:~/Desktop$ gpg --fingerprint 0xBF5A669F2272CF4324C1FDA8CFB4C2166397D0D2
    pub   4096R/6397D0D2 2017-01-03
          Key fingerprint = BF5A 669F 2272 CF43 24C1  FDA8 CFB4 C216 6397 D0D2
    uid                  KeePassXC Release <>
    sub   2048R/A26FD9C4 2017-01-03 [expires: 2019-01-03]
    sub   2048R/FB5A2517 2017-01-03 [expires: 2019-01-03]
    sub   2048R/B59076A8 2017-01-03 [expires: 2019-01-03]

    to check that the GPG key retrieved is the expected one.

  • Compare the GPG signature of the release:

    ewen@ashram:~/Desktop$ gpg --verify KeePassXC-2.2.0.dmg.sig
    gpg: assuming signed data in `KeePassXC-2.2.0.dmg'
    gpg: Signature made Mon 26 Jun 11:55:34 2017 NZST using RSA key ID B59076A8
    gpg: Good signature from "KeePassXC Release <>"
    gpg: WARNING: This key is not certified with a trusted signature!
    gpg:          There is no indication that the signature belongs to the owner.
    Primary key fingerprint: BF5A 669F 2272 CF43 24C1  FDA8 CFB4 C216 6397 D0D2
         Subkey fingerprint: C1E4 CBA3 AD78 D3AF D894  F9E0 B7A6 6F03 B590 76A8

    at which point if you trust the key you downloaded is supposed to be signing the code you intend to run, the verification is complete. (There are some signatures on the signing key, but I did not try to track down a GPG signed path from my key to the signing keys, as the fingerprint verification seemed sufficient.)

In addition for Windows and OS X, KeePassXC raised funds for an AuthentiCode code signing certificate earlier this year. When signed, this results in a "known publisher", which avoids the Windows and OS X warnings about running "untrusted" code, and acts as a second verification of the intended code running. It is not clear whether the .dmg or the .app on OS X is signed at present, as "codesign -dv ..." reports both the .dmg file and the .app as not signed (note that it is possible to use an Authenticode Code Signing Certificate with OS X's Signing Tools). My guess is maybe the KeePassXC developers focused on Windows executable signing first (and Apple executables normally need to be signed by a key signed by Apple anyway).

Having verified the downloaded binary package, on OS X it can be installed in the usual manner by mounting the .dmg file, and dragging the .app to somewhere in /Applications. There is a link to /Applications in the .dmg file, but without the clever folder background art that some .dmg files have, it is less obvious that you are intended to drag the .app into /Applications to install. (However there is no included installer, so the obvious alternative is "drag'n'drop" to install.)

Once installed, run it to start. Create a new password database and give it at least a long master password, then save the database (with the updated master password). After the database is created it is possible to re-open the relevant database with the usual:


thanks to the application association with the .kdbx file extension. This makes it easier to manage multiple databases. When opened in this way the application will prompt for the master password of the specific database immediately (with the other known databases available as tabs).

KeePassXC YubiKey Support

KeePassXC YubiKey support is via the YubiKey HMAC-SHA1 Challenge-Response authentication, where the YubiKey mixes a shared secret with a challenge token to create a response token. This method was chosen for the KeePassXC YubiKey support because it provides a deterministic response without, eg, needing to reliably track counters or deal with gaps in monotonically increasing values, such as is needed with U2F -- Universal 2nd Factor. This trades a reduction in security (due to just relying on a shared secret) for robustness (eg, not getting permanently locked out of the password database due to the YubiKey's counter having moved on to a newer value than the password database), and ease of use (eg, not having to activate the YubiKey at both open and close of a database; the KeePassXC ticket #127 contains some useful discussion of the tradeoffs with authentication modes needing counters; pwsafe also uses YubiKey Challenge-Response mode, presumably for similar reasons).
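
The Challenge-Response computation itself is just HMAC-SHA1 of the challenge, keyed with the shared secret; a minimal sketch:

```python
import hashlib
import hmac
import os

def yubikey_response(secret: bytes, challenge: bytes) -> bytes:
    """What the YubiKey computes in HMAC-SHA1 Challenge-Response mode."""
    return hmac.new(secret, challenge, hashlib.sha1).digest()   # 20-byte response

secret = os.urandom(20)       # the 20-byte secret programmed into each YubiKey
challenge = os.urandom(64)    # a 64-byte challenge ("Fixed 64-byte input" mode)
# Deterministic: any key programmed with the same secret answers identically,
# which is why spare keys are interchangeable and no counter state is needed.
assert yubikey_response(secret, challenge) == yubikey_response(secret, challenge)
```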

The design chosen seems similar to KeeChallenge, a plugin for KeePass2 (source) to support YubiKey authentication for the Windows KeePass. There is a good setup guide to Securing KeePass with a Second Factor describing how to set up the YubiKey and KeeChallenge, which seems broadly transferable to using the similar KeePassXC YubiKey Challenge-Response feature. (A third party YubiKey Handbook contains an example of configuring the Challenge-Response mode from the command line for a slightly different purpose.)

By contrast, the Windows KeePass built in support is OATH-HOTP authentication (see also KeePass and YubiKey), which does not seem to be supported on KeePassXC -- some people also note OTP 2nd Factor provides authentication not encryption which may limit the extra protection in the case of a local database. HOTP also uses a shared key and a counter so suffers from similar shared secret risks as the Challenge Response mechanism, as well as robustness risks in needing to track the counter value -- one guide to the OATH-HOTP mode warns about keeping OTP recovery codes to get back in again after being locked out due to the counter getting out of sync. See also HOTP and TOTP details; HOTP hashes a secret key and a counter, whereas TOTP hashes a secret key and the time, which means it is easier to accidentally get out of sync with HOTP. TOTP seems to be more widely deployed in client-server situations, presumably because it is self-recovering given a reasonably accurate time source.
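
For comparison, the OATH-HOTP computation (RFC4226) can be sketched as below; note the counter input, which is the state that can drift out of sync (TOTP simply replaces the counter with the current time divided into 30-second steps):

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC4226 HOTP: HMAC-SHA1 over the 8-byte counter, then dynamic truncation."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                                 # low nibble picks offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# First value from the RFC4226 Appendix D test vectors:
print(hotp(b"12345678901234567890", 0))   # 755224
```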

Configuring a YubiKey to support Challenge-Response HMAC-SHA1

To configure one or more YubiKeys to support Challenge-Response you need to:

  • Install the YubiKey Personalisation Tool from the Apple App Store; it is a zero cost App, but obviously will not be very useful without a YubiKey or two. (The YubiKey Personalisation Tool is also available for other platforms, and in a command line version.)

  • Run the YubiKey Personalisation Tool.

  • Plug in a suitable YubiKey, eg, YubiKey 4; the cheaper YubiKey U2F security key does not have sufficient functionality. (Curiously the first time that I plugged a new YubiKey 4 in, the Keyboard Assistant in OS X 10.11 (El Capitan) wanted to identify it as a keyboard, which seems to be a known problem -- apparently one can just kill the dialog, but I ended up touching the YubiKey, then manually selecting an ANSI keyboard, which also seems to be a valid approach. See also the YubiKey User Guide examples for mac OS X.)

  • Having done that, the YubiKey Personalisation Tool should show "YubiKey is inserted", details of the programming status, serial number, and firmware version, and a list of the features supported.

  • Change to the "Challenge Response" tab, and click on the "HMAC-SHA1" button.

  • Select "Configuration Slot 2" (if you overwrite Configuration Slot 1 then YubiKey Cloud will not work, and that apparently is not recoverable, so using Slot 2 is best unless you are certain you will never need YubiKey Cloud; out of the factory only Configuration Slot 1 is programmed).

  • Assuming you have multiple YubiKeys (and you should, to allow recovery if you lose one or it stops functioning) tick "Program Multiple YubiKeys" at the top, and choose "Same Secret for all Keys" from the dropdown, so that all the keys share the same secret (ie, they are interchangeable for this Challenge-Response HMAC-SHA1 mode).

  • You probably want to tick "Require user input (button press)", to make it harder for a remote attacker to activate the Challenge-Response functionality.

  • Select "Fixed 64-byte input" for the HMAC-SHA1 mode (required by KeeChallenge for KeePass; unclear if it is required for KeePassXC but selecting it did work).

  • Click on the "Generate" button to generate a random 20-byte value in hex.

  • Record a copy of the 20-byte value somewhere safe, as it will be needed to program an additional/replacement YubiKey with the same secret later (unlike KeeChallenge it is not needed to set up KeePassXC; instead KeePassXC will simply ask the YubiKey to run through the Challenge-Response algorithm as part of the configuration process, not caring about the secret key used, only caring about getting repeatable results).

    Beware the dialog box seems to be only wide enough to display 19 of the bytes (not 20), and not resizeable, so you have to scroll in the input box to see all the bytes :-( Make sure you get all 20 bytes, or you will be left trying to guess the first or last byte later on. (And make sure you keep the copy of the shared secret secure, as anyone with that shared secret can program a working YubiKey that will be functionally identical to your own. Printing it out and storing it somewhere safe would be better than storing it in plain text on the computers you are using KeePassXC on... and storing it inside KeePassXC creates a catch-22 situation!)

  • Double check your settings, then click on "Write Configuration" to store the secret key out to the attached YubiKey.

  • The YubiKey Personalisation Tool will want to write a "log" file (actually a .csv file), which will also contain the secret key, so make sure you keep that log safe, or securely delete it.

  • Pull out the first YubiKey, and insert the next one. You should see a "YubiKey is removed" message then a "YubiKey is inserted" message. Click on "Write Configuration" for the next one. Repeat until you have programmed all the YubiKeys you want to be interchangeable for the Challenge-Response HMAC-SHA1 algorithm. (Two, kept separately, seems like the useful minimum, and three may well make sense.)

Configuring a KeePassXC to use password and YubiKey authentication

  • Insert one of the programmed YubiKeys

  • Open KeePassXC on an existing password database (or create a new one), and authenticate to it.

  • Go to Database -> Change Master Key.

  • Enter your Password twice (ie, so that the Password will be set back to the same password)

  • Tick "Challenge Response" as well (so that the Password and "Challenge Response" are both ticked)

  • An option like "YubiKey (nnnnnnn) Challenge Response - Slot 2 - Press" should appear in the drop down list

  • Click the "OK" button

  • Save the password database

  • When prompted press the button on your YubiKey (which will allow it to use the YubiKey Challenge Response secret to update the database).

Accessing the KeePassXC database with password and YubiKey authentication

To test that this has worked, close KeePassXC (or at least lock the database), then open KeePassXC again. You will get a prompt for access credentials as usual, without any options ticked.

Verify that you can open the database using both the password and the YubiKey Challenge-Response, by typing in the password and ticking "Challenge Response" (after checking it detected the YubiKey) and then clicking on "OK". When prompted, click the button on your YubiKey, and the database should open. (KeePassXC seems to recognise that the Challenge-Response is needed if you have opened the database with the YubiKey and the YubiKey is present; but you will need to remember to also enter the password each time you authenticate. At least it will auto-select the Password as soon as you type one in. The first time around opening a specific database is just one additional box to tick, which is fairly easy to remember particularly if you use the same combination -- password and YubiKey Challenge-Response -- on all your databases.)

You can confirm that both the password and the YubiKey Challenge Response are required, by trying to authenticate just using the Password (enter Password, untick "Challenge Response", press OK), and by trying to authenticate just using the YubiKey (tick "Challenge Response", untick Password, press OK). In both cases it should tell you "Unable to open database". (The "Wrong key or database file is corrupt" message really means "insufficient authentication to recover the database encryption key" in this case; it could perhaps more accurately say "could not decrypt master key" here, perhaps with a suggestion to check the authentication details provided.)

If you have programmed multiple YubiKeys with the same Challenge-Response shared secret (and hopefully you have programmed at least two), be sure to check opening the database with each YubiKey to verify that they are programmed identically and thus are interchangeable for opening the password database. It should open identically with each key (because they were all programmed with the same secret, and thus the Challenge Response values are identical).

If you have multiple databases that you want to protect with the YubiKey Challenge-Response method, you will need to go through the Database -> Change Master Key steps and verification steps for each one. It probably makes sense to change them all at the same time, to avoid having to try to remember which ones need the YubiKey and which ones do not.

Usability of KeePassXC with Password and YubiKey authentication

Once you have configured KeePassXC for Password and YubiKey authentication, and opened the database at least once using the YubiKey, the usability is fairly good. Use:


to open a specific KeePassXC password database directly, and KeePassXC will launch with a window to authenticate to that password database. So long as one of the appropriate YubiKeys is plugged in, after a short delay (less time than it takes to type in your password) the YubiKey will be detected, and Challenge-Response selected. Then you just type in your password as usual (which auto-selects "Password" as well), hit enter (which auto-OKs the dialog), and touch your YubiKey when prompted.

One side effect of configuring your KeePassXC databases like this is that they are not able to be opened in other KeePass related tools, except maybe the Windows KeePass with the KeeChallenge plugin (which uses a similar method; I have not tested that). For desktop use, KeePassXC should work pretty much everywhere that is likely to be useful (modern Windows, modern macOS / OS X, modern Linux), as should the YubiKey, so desktop portability is fairly good. But, for instance, MiniKeePass (source, on the iOS App Store) will not be able to open the password database. Amongst other reasons, while the "camera connection kit" can be used to link a YubiKey to an iOS device, the YubiKey iOS HowTo points out that U2F, OATH-TOTP and Challenge-Response functionality will not work (and I found suggestions on the Internet this only worked with older iOS versions).

If access from a mobile device is important, then you may want to divide your passwords amongst multiple KeePass databases: a "more secure" one including the YubiKey Challenge-Response and a "less secure" one that only requires a password for compatibility. For instance it might make sense to store "low risk" website passwords in their own database protected only by a relatively short master password, and synchronise that database for use by MiniKeePass (using the DropBox app). But keep higher security/higher risk passwords protected by password and YubiKey Challenge-Response and only accessible from a desktop application (and not synchronised via DropBox to reduce exposure of the database itself).

It also looks like, in the UI, it should be possible to configure KeePassXC to require only the YubiKey Challenge-Response (no password), simply by changing the master key and specifying only YubiKey Challenge-Response. Since the Challenge-Response shared secret is fairly short (20 bytes, so 160 bits), the security rests entirely on that shared key, and the algorithm is known, so that too would be a relatively low-security form of authentication. Again, for "low value" passwords like random website logins with no real risk, it might offer a more secure way to store per-website random passwords than reusing the same password on each website. But the combination of password and YubiKey Challenge-Response would be preferable for most password databases over the YubiKey Challenge-Response alone, even if the password itself was fairly short (eg under 16 characters).

ETA, 2018-01-11: KeePassXC has an official Ubuntu PPA which includes packages for all currently supported Ubuntu versions, including Ubuntu 16.04 LTS. To install:

sudo add-apt-repository ppa:phoerious/keepassxc
sudo apt-get update
sudo apt-get install keepassxc
Posted Sun Jul 23 15:34:28 2017 Tags:

Apple's Time Machine software, included with macOS for about the last 10 years, is a service to automatically back up a computer to one or more external drives or machines. Once configured it pretty much looks after itself, usually keeping hourly/daily/weekly snapshots for sensible periods of time. It can even rotate the snapshots amongst multiple targets to give multiple backups -- although it really wants to see every drive around once a week, otherwise it starts to regularly complain about no backups to a given drive, even when there are several other working backups. (Which makes it a poor choice for offline, offsite backups which are not brought back onsite again frequently; full disk clones are better for that use case.)

More recent versions of Time Machine include local snapshots, which are copies saved to the internal drive in between Time Machine snapshots to an external target -- for instance when that external target is not available. This is quite useful functionality on, eg, a laptop that is not always on its home network or connected to the external Time Machine drive. These local snapshots do take up some space on the internal drive, but Time Machine will try to ensure there is at least 10% free space on the internal drive and aim for 20% free space (below that Time Machine local snapshots are usually cycled out fairly quickly, particularly if you do something that needs more disk space).
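As a rough cross-check of those thresholds, the current free-space percentage of the root volume can be read from POSIX `df -P` output (a sketch; the 10% figure is Time Machine's floor as described above):

```shell
#!/bin/sh
# Field 5 of "df -P" is Capacity (percent used); free = 100 - used.
used=$(df -P / | awk 'NR==2 { sub(/%/, "", $5); print $5 }')
free=$((100 - used))
echo "free space on /: ${free}%"

if [ "$free" -lt 10 ]; then
    echo "below Time Machine's 10% floor: local snapshots will be purged aggressively"
fi
```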

On my older MacBook Pro, the internal SSD (large, but not gigantic, for the time when it was bought, years ago) has been "nearly full" for a long time, so I have been regularly looking for things taking up space that do not need to be on the internal drive. In one of these explorations I found that while Time Machine's main local snapshot directory was tiny:

ewen@ashram:~$ sudo du -sm /.MobileBackups
1       /.MobileBackups

as expected with an almost full drive causing the snapshots to be expired rapidly, there was another parallel directory which was surprisingly big:

ewen@ashram:~$ sudo du -sm /.MobileBackups.trash/
21448   /.MobileBackups.trash/

(21.5GB -- approximately 2-3 times the free space on the drive). When I looked in /.MobileBackups.trash/ I found a bunch of old snapshots from 2014 and 2016, some of which were many gigabytes each:

root@ashram:/.MobileBackups.trash# du -sm *
2468    Computer
412     MobileBackups_2016-10-22-214323
16824   MobileBackups_2016-10-24-163201
1746    MobileBackups_2016-10-26-084240
1       MobileBackups_2016-12-18-144553
1       MobileBackups_2017-02-05-125225
1       MobileBackups_2017-05-18-180448
root@ashram:/.MobileBackups.trash# du -sm Computer/*
1480    Computer/2014-06-08-213847
58      Computer/2014-06-15-122559
156     Computer/2014-06-15-162406
166     Computer/2014-06-29-183344
608     Computer/2014-07-06-151454
3       Computer/2016-10-22-174000
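Ranking that du output makes the worst offenders obvious. A portable sketch of the same idea, run against a throwaway directory standing in for /.MobileBackups.trash:

```shell
#!/bin/sh
# Rank subdirectories by size, largest first. On the real machine this
# would be: sudo du -sm /.MobileBackups.trash/* | sort -rn | head
trash=$(mktemp -d)
mkdir -p "$trash/big" "$trash/small"
dd if=/dev/zero of="$trash/big/file" bs=1024 count=2048 2>/dev/null   # ~2MB
dd if=/dev/zero of="$trash/small/file" bs=1024 count=64 2>/dev/null   # ~64KB

du -sm "$trash"/* | sort -rn | head    # "big" sorts first
```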

Some searching online indicated that this was a fairly common problem (there are many other similar reports). As best I can tell what is supposed to happen is:

  • /.MobileBackups is automatically managed by Time Machine to store local snapshots, and they are automatically expired as needed to try to keep the free disk space at least above 10%.

  • /.MobileBackups.trash appears if for some reason Time Machine cannot remove a particular local snapshot or needs to start again (eg a local snapshot was not able to complete); in that case Time Machine will move the snapshot out of the main /.MobileBackups directory into the /.MobileBackups.trash directory. The idea is that eventually whatever is locking the files in the snapshot, preventing them from being deleted, will be cleared, eg, by a reboot, and then /.MobileBackups.trash will get cleaned up. This is part of the reason reboots are suggested as part of the resolution for Time Machine issues.

However there appear to be some scenarios where it is impossible to remove /.MobileBackups.trash, which just leads to it gradually accumulating over time. Some people report hundreds of gigabytes used there. Because /.MobileBackups.trash is not the main Time Machine Local Snapshots directory, it shows up as "Other" in the OS X Storage Report -- rather than "Backups". And of course if it cannot be deleted, it will not be automatically removed to make space when you need more space on the drive :-(

Searching for /.MobileBackups.trash in /var/log/system.log turned up the hint that Time Machine was trying to remove the directory, but being rejected:

Jul 18 16:31:36 ashram[852]: Failed to delete
/.MobileBackups.trash, error: Error Domain=NSCocoaErrorDomain
Code=513 "“.MobileBackups.trash” couldn’t be removed because you
don’t have permission to access it."
UserInfo={NSFilePath=/.MobileBackups.trash, NSUserStringVariant=(
), NSUnderlyingError=0x7feb82514860 {ErrorDomain=NSPOSIXErrorDomain
Code=1 "Operation not permitted"}}

(plus lots of "audit warning" messages about the drive being nearly full, which was the problem I first started with). There are some other references to that failure on OS X 10.11 (El Capitan), which I am running on the affected machine.
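For reference, the search itself is just a grep over the system log; sketched here against a stand-in log file so it can be run anywhere (the message text is taken from the excerpt above):

```shell
#!/bin/sh
# On the affected machine: grep 'MobileBackups.trash' /var/log/system.log
log=$(mktemp)
printf '%s\n' \
    'Jul 18 16:31:36 ashram[852]: Failed to delete /.MobileBackups.trash' \
    'Jul 18 16:31:40 ashram kernel[0]: unrelated message' > "$log"

grep 'MobileBackups.trash' "$log"    # prints only the failure line
```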

Based on various online hints I tried:

  • Forcing a full Time Machine backup to an external drive, which is supposed to cause it to clean up the drives (it did do a cleanup, but it was not able to remove /.MobileBackups.trash).

  • Disabling the Time Machine local snapshots:

    sudo tmutil disablelocal

    which is supposed to remove the /.MobileBackups and /.MobileBackups.trash directories; it did remove /.MobileBackups but could not remove /.MobileBackups.trash.

  • Emptying the Finder Trash (no difference to /.MobileBackups.trash)

  • Waiting a while to see if it got automatically removed (nope!)

  • Forcing a full Time Machine backup to an external drive, now that the local Time Machine snapshots are turned off. That took ages to get through the prepare stage (the better part of an hour), suggesting it was rescanning everything... but it did not reduce the space usage in /.MobileBackups.trash in the slightest.

Since I had not affected /.MobileBackups.trash at all, I then did some more research into possible causes for why the directory might not be removable. I found a reference suggesting file flags might be an issue, but searching for the schg and uchg flags did not turn up anything:

sudo find /.MobileBackups.trash/ -flags +schg
sudo find /.MobileBackups.trash/ -flags +uchg

(uchg is the "user immutable" flag; schg is the "system immutable" flag). There are also extended attributes (xattrs, which I have used previously to avoid accidental movement of directories in my home directory), which should be visible as "+" (attributes) or "@" (permissions) when doing "ls -l" -- but in some quick hunting around I was not seeing those either (eg sudo ls -leO@ CANDIDATE_DIR).

I did explicitly try removing the immutable flags recursively:

sudo chflags -f -R nouchg /.MobileBackups.trash
sudo chflags -f -R noschg /.MobileBackups.trash

but that made no obvious difference.

Next, after finding a helpful guide to reclaiming space from Time Machine Local snapshots I ensured that the Local Snapshots were off, then rebooted the system:

sudo tmutil disablelocal

followed by Apple -> Restart... In theory that is supposed to free up the /.MobileBackups.trash snapshots for deletion, and then delete them -- at least when you do another Time Machine backup -- so I forced one of those after the system came back up again. No luck, /.MobileBackups.trash was the same as before.

After seeing reports that /.MobileBackups.trash could be safely removed manually, and having (a) two full recent Time Machine snapshots and (b) just rebooted with the Time Machine Local Snapshots turned off, I decided it was worth trying to manually remove /.MobileBackups.trash. I did:

sudo rm -rf "/.MobileBackups.trash"

with the double quotes included to try to reduce the footgun potential of typos (rm -rf / is something you very rarely want to do, especially by accident!).
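An even more defensive variant wraps the delete in a guard that refuses to run unless the target is exactly the expected path, so an empty or mistyped variable aborts instead of deleting the wrong thing (a sketch; the rm itself is left commented out here):

```shell
#!/bin/sh
target="/.MobileBackups.trash"

# Only ever delete the one directory we mean to delete; anything else
# (including an empty variable, which would make rm -rf catastrophic)
# aborts instead of running rm.
case "$target" in
    /.MobileBackups.trash)
        echo "would run: rm -rf \"$target\""
        # rm -rf "$target"    # uncomment on the real machine
        ;;
    *)
        echo "refusing to delete unexpected path: '$target'" >&2
        exit 1
        ;;
esac
```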

That was able to remove most of the files, by continuing when it had errors, but still left hundreds of files and directories that it reported being unable to remove:

ewen@ashram:~$ sudo rm -rf "/.MobileBackups.trash"
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/Library/Preferences/SystemConfiguration: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/Library/Preferences: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/Library: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/private/var/db: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/private/var: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume/private: Operation not permitted
rm: /.MobileBackups.trash/MobileBackups_2016-10-22-214323/Computer/2016-10-22-182406/Volume: Directory not empty

At least most of the disk space was reclaimed, with just 45MB left:

ewen@ashram:~$ sudo du -sm /.MobileBackups.trash/
45      /.MobileBackups.trash/

In order to get back to a useful state I then moved that directory out of the way:

sudo mv /.MobileBackups.trash /var/tmp/mobilebackups-trash-undeleteable-2017-07-18

and rebooted my machine again to ensure everything was in a fresh start state.

When the system came back up again, I tried removing various parts of /var/tmp/mobilebackups-trash-undeleteable-2017-07-18 with no more success. Since the problem had followed the files rather than the location I figured there had to be something about the files which prevented them from being removed. So I did some more research.

The most obvious is the Time Machine Safety Net, which is special protection around the Time Machine snapshots to deal with the fact that they create hard links to directories (to conserve inodes, I assume), which can confuse rm. The recommended approach is to use "tmutil delete", but while it will accept a full path, doing something like:

tmutil delete /var/tmp/mobilebackups-trash-undeleteable-2017-07-18/MobileBackups_2016-10-22-214323

will just fail with a report that it is an "Invalid deletion target":

ewen@ashram:/var/tmp$ sudo tmutil delete /var/tmp/mobilebackups-trash-undeleteable-2017-07-18/MobileBackups_2016-10-22-214323
/private/var/tmp/mobilebackups-trash-undeleteable-2017-07-18/MobileBackups_2016-10-22-214323: Invalid deletion target (error 22)
Total deleted: 0B

and nothing will be deleted. My guess is that it at least tries to ensure that it is inside a Time Machine backup directory.
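That guess is easy to mimic in shell: Time Machine backups on external drives live under a Backups.backupdb directory, so a path check along these lines (an assumption about tmutil's behaviour, not its actual code) would reject anything that had been moved into /var/tmp:

```shell
#!/bin/sh
# Presumed shape of tmutil's sanity check (an assumption): a valid
# deletion target sits somewhere inside a .../Backups.backupdb/... tree.
is_tm_target() {
    case "$1" in
        */Backups.backupdb/*) return 0 ;;
        *) return 1 ;;
    esac
}

is_tm_target "/Volumes/Backup/Backups.backupdb/ashram/2016-10-22-182406" \
    && echo "plausible deletion target"
is_tm_target "/var/tmp/mobilebackups-trash-undeleteable-2017-07-18" \
    || echo "rejected: not inside a Time Machine backup tree"
```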

Another suggested approach is to use Finder to delete the directory, as it has hooks into the extra cleanup magic required, so I did:

open /var/tmp

and then highlighted mobilebackups-trash-undeleteable-2017-07-18 and tried to permanently delete it with Alt-Cmd-Delete. After a confirmation prompt, and some file counting, that failed with:

The operation can't be completed because an unexpected error occurred (error code -8072).

deleting nothing. Explicitly changing the problem directories to be owned by me:

sudo chown -R ewen:staff /var/tmp/mobilebackups-trash-undeleteable-2017-07-18

also failed to change anything.

There is an even lower level technique to bypass the Time Machine Safety Net, using a helper bypass tool, which on OS X 10.11 (El Capitan) is in "/System/Library/Extensions/TMSafetyNet.kext/Contents/Helpers/bypass". However running the rm with the bypass tool did not get me any further forward:

cd /var/tmp
sudo /System/Library/Extensions/TMSafetyNet.kext/Contents/Helpers/bypass rm -rf mobilebackups-trash-undeleteable-2017-07-18

failed with the same errors, leaving the whole 45MB still present. (From what I can tell online, using the bypass tool is fairly safe if you are removing all the Time Machine snapshots, but can leave very incomplete snapshots if you merely try to remove some of them -- due precisely to the directory hard links that are the reason the Time Machine Safety Net exists in the first place. Proceed with caution if you are not trying to delete everything!)

More hunting for why root could not remove files turned up the OS X 10.11+ (El Capitan onwards) System Integrity Protection, which adds quite a few restrictions to what root can do. In particular the theory was that the files had a restricted flag on them, which means that only restricted processes, signed by Apple, would be able to modify them.

That left me with the options of either trying to move the files back somewhere that "tmutil delete" might be willing to deal with, or trying to override System Integrity Protection for long enough to remove the files. Since Time Machine had failed to delete the files, apparently for months or years, I chose to go with the more brute force approach of overriding System Integrity Protection for a while so that I could clean up.

The only way to override System Integrity Protection is to boot into System Recovery mode, and run "csrutil disable", then reboot again to access the drive with System Integrity Protection disabled. To do this:

  • Apple -> Restart...

  • Hold down Cmd-R when the system chimes for restarting, and/or the Apple Logo appears; you have started a Recovery Boot if the background stays black rather than showing a color backdrop prompting for your password

  • When Recovery mode boots up, use Utilities -> Terminal to start a terminal.

  • In the Terminal window, run:

     csrutil disable
  • Reboot the system again from the menus

When the normal boot completes and you log in, you are running without System Integrity Protection enabled -- the foot gun is now on automatic!

Having done that, OS X was happy to let me delete the leftover trash:

ewen@ashram:/var/tmp$ sudo du -sm mobilebackups-trash-undeleteable-2017-07-18/
45      mobilebackups-trash-undeleteable-2017-07-18/
ewen@ashram:/var/tmp$ sudo rm -rf mobilebackups-trash-undeleteable-2017-07-18
ewen@ashram:/var/tmp$ ls mob*
ls: mob*: No such file or directory

so I had finally solved the problem I started with, leaving no "undeleteable" files around for later. My guess is that those snapshots happened to run at a time that captured files with restricted flags on them, which then could not be removed (at least once Time Machine had thrown them out of /.MobileBackups and into /.MobileBackups.trash). But it seems unfortunate that the log messages could not have provided more useful instructions.

All that was left was to put the system back to normal:

  • Boot into recovery mode again (Apple -> Restart...; hold down Cmd-R at the chime/Apple logo)

  • Inside Recovery Mode, re-enable System Integrity Protection, with:

    csrutil enable

    inside Utilities -> Terminal.

  • Reboot the system again from the menus.

At this point System Integrity Protection is operating normally, which you can confirm at any time with the "csrutil status" command:

ewen@ashram:~$ csrutil status
System Integrity Protection status: enabled.

(changes to the status can be made only in Recovery Mode).

Finally, re-enable Time Machine local snapshots, because they are a useful feature on a mobile device:

sudo tmutil enablelocal

and then force the first local snapshot to be made now to get the process off to an immediate start:

sudo tmutil snapshot

At which point you should have /.MobileBackups with a snapshot or two inside it:

root@ashram:~# ls -l /.MobileBackups/Computer/
total 8
-rw-r--r--  1 root  wheel  263 18 Jul 17:37 .mtm.private.plist
drwxr-xr-x@ 3 root  wheel  102 18 Jul 17:37 2017-07-18-173719
drwxr-xr-x@ 3 root  wheel  102 18 Jul 17:37 2017-07-18-173758

and if you look in the Time Machine Preferences Window you should see the line that it will create "Local snapshots as space permits".

Quite the adventure! But my system now has about three times as much free disk space as it did previously, which was definitely worth the effort.

Posted Wed Jul 19 17:55:22 2017 Tags: