Fundamental Interconnectedness

This is the occasional blog of Ewen McNeill. It is also available on LiveJournal as ewen_mcneill, and Dreamwidth as ewen_mcneill_feed.

This past week there has been a lot of hype about CVE-2016-10229 which seems to have been one of those "just a bug" bugs that later turned out to be exploitable. The description:

udp.c in the Linux kernel before 4.5 allows remote attackers to
execute arbitrary code via UDP traffic that triggers an unsafe
second checksum calculation during execution of a recv system call
with the MSG_PEEK flag.

implies that Linux versions before Linux 4.5 are vulnerable, which seems to have led to misleading things like Security Focus listing dozens of Linux versions as vulnerable.

But according to the author of the patch, "Whoever said that linux [before] 4.5 was vulnerable made a mistake", and only kernels which had Linux kernel git commit 89c22d8c3b278212eef6a8cc66b570bc840a6f5a backported need the fix, which is in Linux kernel git commit 197c949e7798fbf28cfadc69d9ca0c2abbf93191. The fix was created in late 2015, and applied to the main Linux git repository in early 2016.

Debian patched CVE-2016-10229 before there was any CVE assigned, as a result of Debian Bug #808293 where UDP in IPv6 did not always work correctly. The fix was released in, eg, Debian Linux kernel 3.2.73-2+deb7u2 (for Debian Squeeze):

ewen@debian-squeeze:~$ zgrep -A 18 3.2.73-2+deb7u2 /usr/share/doc/linux-image-3.2.0-4-686-pae/changelog.Debian.gz | egrep "udp|808293|-- |^ *$"

  * udp: properly support MSG_PEEK with truncated buffers
    (Closes: #808293, regression in 3.2.72)

 -- Ben Hutchings [ omitted...]  Sat, 02 Jan 2016 03:31:22 +0000

in January 2016, which means that Debian Squeeze has not been vulnerable since very early 2016.
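The same changelog check can be scripted. This is only a minimal Python sketch: the function name is made up for illustration, and it assumes the fix is listed under the same one-line summary that appears in the Debian changelog above.

```python
import gzip

# One-line summary of the fix, as it appears in the Debian changelog above.
FIX_SUMMARY = "udp: properly support MSG_PEEK with truncated buffers"


def changelog_mentions_fix(path, needle=FIX_SUMMARY):
    """Return True if any line of the gzipped changelog contains needle."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        return any(needle in line for line in handle)


# Example call (path taken from the zgrep command above):
# changelog_mentions_fix(
#     "/usr/share/doc/linux-image-3.2.0-4-686-pae/changelog.Debian.gz")
```

A False result does not by itself mean a kernel is vulnerable: kernels that never backported the original change will not list the fix either.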

Ubuntu also patched CVE-2016-10229 before there was any CVE assigned, as a result of Ubuntu Bug #1527902, which was reported with different symptoms but references the Debian Bug and the net-next patch that got committed above. For Ubuntu 14.04 the patch was released in 3.13.0-79.123, which is so long ago that the installed changelog.Debian.gz does not even include that release. The full Linux Trusty kernel changelog does not have a date for it, but it must have been released at least by Monday 2016-02-22 when 3.13.0-80.124 (the next release) was released. So Ubuntu has also been fixed since early 2016.

Red Hat Linux never included the CVE-2016-10229 bug, due to not backporting the vulnerable code, so it has never been vulnerable. And it appears that Debian and Ubuntu were vulnerable for only a few Linux kernel releases before realising they had a regression and fixing it.

At this point it would be difficult to be running a modern server Linux distribution that has not been fixed against CVE-2016-10229 for over a year, assuming you ever install patches. Which means no rush-patching is required. (Rather like last month's Microsoft MS17-010 SMB fixes, which turned out to patch the bugs in the Shadow Brokers Release that were not already patched, and were released weeks before the Shadow Brokers Release. Pro Tip: Stop using SMB1!)

So why the hype now? As best I can tell it is because Android only just patched CVE-2016-10229 this month, and called it out as a security issue whereas no one else had. That plus the imprecise CVE-2016-10229 description "udp.c in the Linux kernel before 4.5 allows remote attackers to execute arbitrary code via UDP" seems to have caused all the noise.

It probably did not help that the Register, Reddit, and Hacker News describe it as patched "earlier this year", or "in Jan/Feb 2017" or "a while ago", without pointing out that it has been patched for around 14-15 months (early 2016, weeks after being introduced) in most non-Android locations. Plus of course there is the brokenness of the Android security update ecosystem (most handsets are patched via a chain of Google, phone manufacturer and/or telco -- and many fixes do not make it through that chain to devices in real world use -- which leads to a lot of non-patchable devices).

Sometimes Linus Torvalds's "So I personally consider security bugs to be just "normal bugs"" does pay off; this bug was mostly fixed as a regression (except by Android who were a year late to the party). But it seems like the lack of CVE identifiers being back-tagged onto older bugs that were fixed, combined with a lack of research by journalists, leads to more hype when the security risks (rather than just regressions) are later realised.

At least CVE-2016-10229 did not have a vanity website.

Posted Mon Apr 17 11:35:25 2017

Recently I ordered a Synology DS216+ II Linux based NAS with two 6TB WD60EFRX (WD Red NAS) drives, as an "end of (business) year" special. I had been considering buying a NAS for a while as I have lots of data collected over years from many different computers scattered over lots of drives (including several copies of that data), and having a definitive central copy of that data would make things a lot easier. My other hope is to finally get rid of the external drive attached to my main workstation (which has been full for a while anyway), as that is the loudest thing near my work area (at least when it spins up; and the drive spin up causes annoying disk IO pauses even on things that should in theory just need the internal SSD).

I went with Synology because I have friends who have used them for years, and know that I can get a ssh connection into them to check things. In addition the data recovery options for getting data off the disks elsewhere are pretty good -- it is Linux mdadm and lvm under the hood. The DS216+ II happened to be one on sale, and the bundle turned out to be not that much more expensive (on sale) than buying a DS216j and the drives separately -- so the better RAM and CPU specifications seemed worth the small extra cost, and hot-swappable drives are also a useful addition (the DS216j requires opening the case with a screwdriver).

The single Gigabit Ethernet of both models was not a major limitation for me, as my use case is basically "single user", and each of the client machines also has only Gigabit Ethernet (or less); it is very rare I'm using more than one of those client machines at a time. (Besides, the roughly 100MB/s maximum of a single Gigabit Ethernet is still faster than the USB2 speed of older drive attachments, around 48 MB/s at best given the 480 Mbps signalling rate -- and, eg, the external drive on my main desktop is USB2 attached due to that being what is available on the Apple Thunderbolt Cinema Display monitor I have.) The 6TB WD Red NAS drives were basically chosen based on price/capacity being reasonable, and expecting to only use 3-4TB in the immediate future. (Only WD Red NAS drives were available in the bundle, but I would probably have chosen them anyway.)
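To put those two link speeds in perspective, here is a rough Python sketch of how long a bulk copy would take at each rate. The 3.5 figure is just an illustrative amount of data in decimal terabytes, and the rates are the approximate best-case numbers quoted above, not measurements.

```python
def transfer_hours(size_tb, rate_mb_per_s):
    """Hours to move size_tb (decimal terabytes) at rate_mb_per_s (MB/s)."""
    size_mb = size_tb * 1_000_000  # 1 TB = 1,000,000 MB in decimal units
    return size_mb / rate_mb_per_s / 3600


# Best-case rates from the text; real transfers will be slower.
for label, rate in (("Gigabit Ethernet", 100), ("USB2", 48)):
    print(f"{label}: about {transfer_hours(3.5, rate):.1f} hours for 3.5 TB")
```

Either way an initial bulk copy is an overnight job, but the network link roughly halves it.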

Because the DS216+ was ordered as a bundle it arrived with the drives pre-installed, and a note attached to check that they were still properly inserted. It also appears to have been delivered with DSM (DiskStation Manager) pre-installed on the drives -- DSM 6.1-15047 to be precise -- which means that I did not have to go through some of the setup steps. But it also meant that it had been preinstalled with some defaults that I did not necessarily want -- so I chose to delete the disk volume and start again (given that volumes apparently cannot be shrunk, and I do want to leave space for more than one volume at this stage).

Out of the box, the DS216+ found an IP address with DHCP, and then was reachable on http://IP:5000/ and also on http://diskstation.local:5000/ -- the latter being found by Multicast DNS (mDNS)/Bonjour. The default username was admin, and it appears if you do not complete all the setup the default password is no password (ie, enter admin and then just press enter for the password).

My first setup step was to assign a static DHCP lease for the DS216+ MAC address, and a DNS name, so that I could more easily find it (nas01 in my local domain). The only way I could find to persuade the DS216+ to switch over to the new IP address was to force it to restart ("person" icon -> Restart).

Once that was done, it seemed worth updating to the latest DSM, which is currently 6.1-15047-2 and appears to just have some bug fixes for 6.1-15047. To do that in the DSM interface (http://nas01:5000/) go to the Control Panel -> Update & Restore, and it should tell you that a new DSM is available and offer to download it. Clicking on Download will download the software off the Internet, and when that finishes clicking on "Update Now" will install the update. After the "are you sure you want to do this now" prompt, warning you that it will restart the DS216+, the update will start and then the DS216+ will restart. It said it would take up to 10 minutes, but actually took about 2 minutes (presumably at least in part due to being a minor software update).

The other "attention needed" task was an update in the Package Center, which is Synology's "app store". It needed me to agree to the Package Center Terms of Service, and then I could see there was an update to the "File Station" application which I assume is in the default install. I also updated that at this point (by clicking on "Update", which seemed to do everything fairly transparently).

At this point it also seemed useful to create a user for myself, and set the "admin" password to something longer than an empty string. Both are done in the Control Panel -> User area. There are a lot of options in the new user creation (around volume access, and quotas), but I left them all at the default other than putting my user into the administrators group so that it could be used via ssh.

With the user/passwords set up, I could ssh into the DS216+ (since ssh seemed to be on by default):

ssh nas01

and look around at how things were set up out of the box.

The DS216+ has a Linux 3.10 kernel:

ewen@nas01:/$ uname -a
Linux nas01 3.10.102 #15047 SMP Thu Feb 23 02:23:28 CST 2017 x86_64 GNU/Linux synology_braswell_216+II

with a dual core Intel N3060 CPU:

ewen@nas01:/$ grep "model name" /proc/cpuinfo
model name  : Intel(R) Celeron(R) CPU  N3060  @ 1.60GHz
model name  : Intel(R) Celeron(R) CPU  N3060  @ 1.60GHz

The two physical hard drives appear as SATA ("SCSI") disks, along with what looks like a third internal disk:

ewen@nas01:/$ cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: WDC      Model: WD60EFRX-68L0BN1         Rev: 82.0
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: WDC      Model: WD60EFRX-68L0BN1         Rev: 82.0
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi3 Channel: 00 Id: 00 Lun: 00
  Vendor: Synology Model: DiskStation              Rev: PMAP
  Type:   Direct-Access                    ANSI  SCSI revision: 06

On the first two disks there are three Linux MD RAID partitions:

ewen@nas01:/$ sudo fdisk -l /dev/sda
Disk /dev/sda: 5.5 TiB, 6001175126016 bytes, 11721045168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: FB4736A9-5AAF-4D25-905D-97A8A8035FC2

Device       Start         End     Sectors  Size Type
/dev/sda1     2048     4982527     4980480  2.4G Linux RAID
/dev/sda2  4982528     9176831     4194304    2G Linux RAID
/dev/sda5  9453280 11720838239 11711384960  5.5T Linux RAID

ewen@nas01:/$ sudo fdisk -l /dev/sdb
Disk /dev/sdb: 5.5 TiB, 6001175126016 bytes, 11721045168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 1322083B-9F26-47B5-825A-56C09FAB9C39

Device       Start         End     Sectors  Size Type
/dev/sdb1     2048     4982527     4980480  2.4G Linux RAID
/dev/sdb2  4982528     9176831     4194304    2G Linux RAID
/dev/sdb5  9453280 11720838239 11711384960  5.5T Linux RAID

which are then joined together into three Linux MD software RAID arrays, using RAID 1 (mirroring):

ewen@nas01:/$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid1 sda5[0] sdb5[1]
      5855691456 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]

unused devices: <none>
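For anyone wanting to check this layout from a script rather than by eye, here is a minimal Python sketch that parses the /proc/mdstat text shown above. The regular expressions are my own assumption, written against exactly this "active raidN" layout, so inactive or degraded arrays may need more handling.

```python
import re

# Copy of the /proc/mdstat output shown above.
MDSTAT_SAMPLE = """\
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid1 sda5[0] sdb5[1]
      5855691456 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]

unused devices: <none>
"""

ARRAY_RE = re.compile(r"^(md\d+) : (\w+) (\w+) (.+)$")
BLOCKS_RE = re.compile(r"^\s+(\d+) blocks")


def parse_mdstat(text):
    """Return {array: {"state", "level", "members", "kib"}} from mdstat text."""
    arrays = {}
    current = None
    for line in text.splitlines():
        match = ARRAY_RE.match(line)
        if match:
            name, state, level, members = match.groups()
            # Strip the "[N]" role suffix from each member device.
            devices = [member.split("[")[0] for member in members.split()]
            current = arrays[name] = {
                "state": state, "level": level, "members": devices, "kib": None,
            }
            continue
        match = BLOCKS_RE.match(line)
        if match and current is not None:
            current["kib"] = int(match.group(1))  # mdstat blocks are 1 KiB
    return arrays


for name, info in sorted(parse_mdstat(MDSTAT_SAMPLE).items()):
    size_gib = info["kib"] / 2**20
    print(f"{name}: {info['level']} on {', '.join(info['members'])}, {size_gib:.2f} GiB")
```

On the NAS itself the sample string would be replaced by the contents of /proc/mdstat.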

The first is used for the root file system:

ewen@nas01:/$ mount | grep md0
/dev/md0 on / type ext4 (rw,relatime,journal_checksum,barrier,data=ordered)

The second is used as a swap volume:

ewen@nas01:/$ grep md1 /proc/swaps
/dev/md1                                partition   2097084 0   -1

and the third is used for LVM:

ewen@nas01:/$ sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/md2
  VG Name               vg1000
  PV Size               5.45 TiB / not usable 704.00 KiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              1429612
  Free PE               0
  Allocated PE          1429612
  PV UUID               mcSYoC-774T-T6Qj-bk1g-juLe-bqfi-cPRBCS


By default there is one volume group:

ewen@nas01:/$ sudo vgdisplay
  --- Volume group ---
  VG Name               vg1000
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               5.45 TiB
  PE Size               4.00 MiB
  Total PE              1429612
  Alloc PE / Size       1429612 / 5.45 TiB
  Free  PE / Size       0 / 0
  VG UUID               Qw9A2i-F3aQ-txow-XUIk-OP6o-pVCf-sIsz1g


with a single volume in it:

ewen@nas01:/$ sudo lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg1000/lv
  LV Name                lv
  VG Name                vg1000
  LV UUID                KRcrco-cOGl-gdOt-GVJ7-IWvc-jogO-ZqyA4G
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 1
  LV Size                5.45 TiB
  Current LE             1429612
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     4096
  Block device           253:0


I believe this is the result of going through the default setup process and choosing a "quick" volume -- resulting in a single volume on RAID. This appears to result in the single data RAID 1, with a single LVM volume group and logical volume -- which it is not possible to shrink, or to turn into a multi-volume setup without adding hard drives, which obviously is not possible in a two drive chassis.

After some reading, my aim is a SHR (Synology Hybrid RAID)/RAID 1 disk group, with about a 3.5TB disk volume for the initial storage, and the rest left for future use (either expanding the existing volume or, eg, presenting as iSCSI LUNs). In the case of a two drive system Synology Hybrid RAID is basically just a way to say "RAID 1", but possibly having it recorded on disk that way would allow transferring the disks to a larger (more drive bays) unit later on.

That 3.5TB layout is chosen knowing that the recommended Time Machine Server setup is to use a share out of a common volume, with a disk quota to limit the maximum disk usage -- rather than a separate volume, which was my original idea. (The DS216+ can also create a file-backed iSCSI LUN, but the performance is probably not as good, so I would rather keep my options open to have more than one volume.)

The DS216+ II (unlike the DS216j) will support btrfs as a local file system (on wikipedia), which is a Linux file system that has been "in development" for about 10 years, designed to compete with the ZFS file system originally developed by Sun Microsystems. Historically btrfs has been fairly untrusted (with multiple people reporting data loss in the early years), but it has been the default file system for SLES 12 since 2014, and it is also now the default file system for the DS216+. Apparently btrfs is also heavily used at Facebook. The stability of btrfs appears to depend on the features you need, with much of the core file system functionality being listed as "OK" in recent kernels -- which is around Linux 4.9 at present, about 4 years newer than the Linux 3.10 kernel, presumably with many patches, running on the DS216+. (Hopefully missing some or all of those 4 years of development does not cause btrfs stability issues...)

Since the btrfs metadata and data checksums seem useful in a large file system, and the snapshot functionality might be useful, I decided to stick with the Synology DS216+ default of btrfs. Hopefully the older Linux kernel (and thus older btrfs code) does not bite me! (The "quotas for shared folders" are also potentially useful, eg, for the Time Machine Server use case.)

Given that I could find no way to (a) shrink a volume, or (b) convert a volume to a disk group (without adding disks, which I cannot do), my next step was to delete the pre-configured, empty, volume so that I could start the disk layout again. To do this go to the main menu -> Storage Manager -> Volume, and choose to remove the volume.

There were two confirmation prompts -- one to remove the volume, and one "are you sure" warning that data will be deleted, and services will restart. Finally it asked for the account password before continuing, which is a useful verification step for such a destructive action (although you do have to remember which user you used to log in, and thus which password applies -- there does not seem to be anything displaying the logged in user).

The removal process is very thorough -- after removal there is no LVM configuration left on the system, and the md2 RAID array is removed as well:

ewen@nas01:/$ sudo lvdisplay
ewen@nas01:/$ sudo vgdisplay
ewen@nas01:/$ sudo pvdisplay
ewen@nas01:/$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]

unused devices: <none>

so you are effectively back to a bare system, which amongst other things will mean that the RAID array gets rebuilt from scratch. (I had sort of hoped to avoid that for time reasons -- but at least forcing it to be rebuilt will also force a check of reading/writing the disks, which is a useful step prior to trusting it with "real" data.)

Once you are back to an empty system, it is possible to go back through the volume creation wizard and choose "custom" and "multi-volume", but I chose to explicitly create the Disk Group first, by going to Storage Manager -> Disk Group, and agreeing to use the two disks that it found. There was a warning that all data on the disks would be erased, and then I could choose the desired RAID mode -- I chose Synology Hybrid RAID (SHR) to leave my options open, as discussed above. I also chose to perform the optional disk check given that these are new drives which I have not tested before. Finally it wanted a description for the disk group, which I have called "shr1". (An example with pictures.)

Once that was applied (which took a few seconds as described in the wizard) there was a new md2 RAID array on the disks, which was rebuilding:

ewen@nas01:/$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid1 sdb5[1] sda5[0]
      5855691456 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  0.0% (2592768/5855691456) finish=790.1min speed=123465K/sec

md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]

unused devices: <none>

as well as new LVM physical volumes and volume groups:

ewen@nas01:/$ sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/md2
  VG Name               vg1
  PV Size               5.45 TiB / not usable 704.00 KiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              1429612
  Free PE               1429609
  Allocated PE          3
  PV UUID               l03e6f-X3Wa-zGsW-a6yo-3NKG-5YI9-5ghHit

ewen@nas01:/$ sudo vgdisplay
  --- Volume group ---
  VG Name               vg1
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  2
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                1
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               5.45 TiB
  PE Size               4.00 MiB
  Total PE              1429612
  Alloc PE / Size       3 / 12.00 MiB
  Free  PE / Size       1429609 / 5.45 TiB
  VG UUID               RjMnEQ-IKst-3N2V-3vJb-s8GE-15RO-qQOdOc


And to my surprise there was even a small LVM logical volume:

ewen@nas01:/$ sudo lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg1/syno_vg_reserved_area
  LV Name                syno_vg_reserved_area
  VG Name                vg1
  LV UUID                4IdgrT-c5A6-3IOo-6Tq6-3rej-9nL9-i2SQou
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 0
  LV Size                12.00 MiB
  Current LE             3
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     384
  Block device           253:0


That syno_vg_reserved_area volume seems to appear in other installs too, but I do not know what it is used for (other than perhaps as a marker that there is a "real" Disk Group and multiple volumes).

Since even once the MD RAID 1 rebuild picked up to full speed:

ewen@nas01:/$ cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid1 sdb5[1] sda5[0]
      5855691456 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  0.6% (40464000/5855691456) finish=585.6min speed=165497K/sec

md1 : active raid1 sda2[0] sdb2[1]
      2097088 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      2490176 blocks [2/2] [UU]

unused devices: <none>

it was going to take about 10 hours to finish the rebuild, I left the DS216+ to its own devices overnight before carrying on.
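The finish= estimate in the /proc/mdstat snapshots above seems to be nothing more than the remaining blocks divided by the current speed. A small Python sketch of that arithmetic, with the figures copied from the two snapshots (the function name is just for illustration):

```python
def resync_minutes_left(done_kib, total_kib, speed_kib_per_s):
    """Estimate minutes remaining for an MD resync from /proc/mdstat figures."""
    remaining_kib = total_kib - done_kib
    return remaining_kib / speed_kib_per_s / 60


# (done, total, speed) from the 0.0% and 0.6% snapshots above.
early = resync_minutes_left(2592768, 5855691456, 123465)
later = resync_minutes_left(40464000, 5855691456, 165497)
print(f"early estimate: {early:.1f} min, later estimate: {later:.1f} min")
```

These come out at 790.1 and 585.6 minutes, matching the finish= values reported by the kernel, so the estimate only improves when the resync speed does.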

As an aside, since there is an implicit "Disk Group" (RAID set, LVM volume group) even in the "One Volume" case, it is not obvious to me why the Synology DSM chose to also delete the original "Disk Group" (RAID set) when the single volume was deleted -- it could have just dropped the logical volume, and left the RAID alone, saving a lot of disk IO. Possibly the quick setup should more explicitly create a Disk Group, so an easier transition becomes an obvious option, rather than retaining what appears to be two distinct code paths.

By the next morning the RAID array had rebuilt. I then forced an extended SMART disk check on each disk in turn by going to Storage Manager -> HDD/SSD, highlighting the disk in question, and clicking on "Health Info", then setting up the test in the "S.M.A.R.T Test" tab. Each Extended Disk Test took about 11 hours, which I left running while doing other things. I did them approximately one at a time, so that the DS216+ RAID array could still be somewhat responsive -- but ended up with a slight overlap as I started the second one just before going to bed, and the first one had not quite finished by then. (It turns out that I got a bonus second extended disk check on the first disk, because there is a Smart Test scheduled to run once a week on all disks starting at 22:00 on Saturday -- and that must have kicked in on the first disk minutes after the one I manually started in the morning finished, but of course by then the manual one on the second disk was already running.)

The results of the S.M.A.R.T tests are visible in the "History" tab of the "Health Info" page for each drive (in Storage Manager -> HDD/SSD), and I also checked them via the ssh connection:

ewen@nas01:/$ sudo smartctl -d ata -l selftest /dev/sda
smartctl 6.5 (build date Feb 14 2017) [x86_64-linux-3.10.102] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke,

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       105         -
# 2  Extended offline    Completed without error       00%        92         -
# 3  Short offline       Completed without error       00%        63         -
# 4  Extended offline    Completed without error       00%        42         -

ewen@nas01:/$ sudo smartctl -d ata -l selftest /dev/sdb
smartctl 6.5 (build date Feb 14 2017) [x86_64-linux-3.10.102] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke,

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%       105         -
# 2  Short offline       Completed without error       00%        63         -
# 3  Extended offline    Completed without error       00%        42         -


just to be sure I knew where to find them later. (That also reveals that there was an Extended test and a short test done before the drives were shipped to me; presumably by the distributor of the "DS216+ and drives" bundle.)
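If you wanted to check those self-test results automatically (eg, from a cron job), something like this Python sketch would do. It assumes the column layout shown in the smartctl output above, and the regular expression is my own guess at that layout rather than anything smartctl documents.

```python
import re

# Self-test log entries for /dev/sda, copied from the smartctl output above.
SELFTEST_SAMPLE = """\
# 1  Extended offline    Completed without error       00%       105         -
# 2  Extended offline    Completed without error       00%        92         -
# 3  Short offline       Completed without error       00%        63         -
# 4  Extended offline    Completed without error       00%        42         -
"""

# Columns are separated by runs of two or more spaces in this layout.
ENTRY_RE = re.compile(
    r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)\s*$"
)


def parse_selftest_log(text):
    """Return a list of dicts, one per self-test entry found in text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue  # banner, header, or blank line
        num, test, status, remaining, hours, lba = match.groups()
        entries.append({
            "num": int(num),
            "test": test,
            "status": status,
            "remaining_percent": int(remaining),
            "hours": int(hours),
            "first_error_lba": None if lba == "-" else lba,
        })
    return entries


failed = [entry for entry in parse_selftest_log(SELFTEST_SAMPLE)
          if entry["status"] != "Completed without error"]
print("all self-tests passed" if not failed else f"problems: {failed}")
```

The "hours" column is the drive's power-on lifetime when the test ran, which is how the pre-delivery tests at 42 and 63 hours can be told apart from mine.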

Once that was done, I created a new Volume to hold the 3.5TB of data that I had in mind originally, leaving the remaining space for future expansion. Since there was already a manually created Disk Group, the Storage Manager -> Volume -> Create process automatically selected a Custom setup (and Quick was greyed out). It also automatically selected Multiple Volumes on RAID (and Single Volume on RAID was greyed out), and "Choose an existing Disk Group" (with "Create a new Disk Group" being greyed out) since there are only two disks in the DS216+, both used in the Disk Group created above.

It told me there was 5.45TB available (really TiB: a 6 TB drive is about 5.46 TiB), which is about right for "6" TB drives less some overhead for the DSM software install (about 4.5GB AFAICT -- 2.4GB for root on md0 and 2GB for swap on md1). As described above I chose btrfs for the disk volume, and then 3584 GB (3.5 * 1024) for the size (out of a maximum 5585 GB available, so leaving roughly 2TB free for later use). For the description I used "Shared data on SHR1" (it appears to be used only within the web interface and editable later). After applying the changes there was roughly 3.36 TiB available in the volume (with 58.7MB used by the system -- I assume file system structure) -- and a /dev/vg1/volume_1 volume created in the LVM:

ewen@nas01:/$ sudo lvdisplay
  --- Logical volume ---
  LV Path                /dev/vg1/syno_vg_reserved_area
  LV Name                syno_vg_reserved_area
  VG Name                vg1
  LV UUID                4IdgrT-c5A6-3IOo-6Tq6-3rej-9nL9-i2SQou
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 0
  LV Size                12.00 MiB
  Current LE             3
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     384
  Block device           253:0

  --- Logical volume ---
  LV Path                /dev/vg1/volume_1
  LV Name                volume_1
  VG Name                vg1
  LV UUID                J9FKic-QYdA-mTCK-W01z-dO7V-GDk6-JD41mC
  LV Write Access        read/write
  LV Creation host, time ,
  LV Status              available
  # open                 1
  LV Size                3.50 TiB
  Current LE             917504
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     4096
  Block device           253:1


which shows a 3.5TiB volume. There is 1.95TiB left:

ewen@nas01:/$ sudo vgdisplay
  --- Volume group ---
  VG Name               vg1
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               1
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               5.45 TiB
  PE Size               4.00 MiB
  Total PE              1429612
  Alloc PE / Size       917507 / 3.50 TiB
  Free  PE / Size       512105 / 1.95 TiB
  VG UUID               RjMnEQ-IKst-3N2V-3vJb-s8GE-15RO-qQOdOc


for future expansion (either of that volume, or creating new volumes).
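The sizes reported by LVM are all whole numbers of 4 MiB physical extents, so the figures above can be cross-checked with a few lines of Python. The extent counts are copied from the lvdisplay and vgdisplay output; the helper name is just for illustration.

```python
PE_SIZE_MIB = 4  # "PE Size 4.00 MiB" in the vgdisplay output


def extents_to_tib(extents, pe_size_mib=PE_SIZE_MIB):
    """Convert a count of LVM physical extents to TiB."""
    return extents * pe_size_mib / (1024 * 1024)


total_pe = 1429612     # Total PE in vg1
volume_1_le = 917504   # Current LE of volume_1
reserved_le = 3        # Current LE of syno_vg_reserved_area
free_pe = 512105       # Free PE in vg1

# Every extent is either allocated to one of the two volumes, or free.
assert volume_1_le + reserved_le + free_pe == total_pe

print(f"volume group: {extents_to_tib(total_pe):.2f} TiB")
print(f"volume_1:     {extents_to_tib(volume_1_le):.2f} TiB")
print(f"free:         {extents_to_tib(free_pe):.2f} TiB")
```

The 3584 GB entered in the web interface works out to exactly 917504 extents, which suggests DSM is counting in binary (GiB) units there.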

The new volume was automatically mounted on /volume1:

ewen@nas01:/$ mount | grep vg1-volume_1
/dev/mapper/vg1-volume_1 on /volume1 type btrfs (rw,relatime,synoacl,nospace_cache,flushoncommit_threshold=1000,metadata_ratio=50)

ready to be used (eg, by creating shares).

From here I was ready to create shares for the various data that I wanted to store, which I will do over time. It appears that thanks to choosing btrfs I can have quotas on the shares as well as the users, which may be useful for things like Time Machine backups.

ETA, 2017-04-23: Some additional file sharing setup:

  • In Control Panel -> File Services -> SMB/AFP/NFS -> SMB -> Advanced Settings, change the Maximum SMB protocol to "SMB3" and the Minimum SMB protocol to "SMB2" (Pro Tip: Stop using SMB1!)

  • Also in Control Panel -> File Services -> SMB/AFP/NFS -> SMB -> Advanced Settings, tick "Allow symbolic links within shared folders"

  • In Control Panel -> File Services -> SMB/AFP/NFS -> NFS, tick "Enable NFS" to simplify automatic mounting from Linux systems without passwords. Also tick "Enable NFSv4 support" to allow NFSv4 mounting, which allows more flexibility around authentication and UID/GID mapping than earlier NFS versions (earlier NFS versions basically assumed you had a way to enforce the same UID/GID enterprise wide, via NIS, LDAP or similar).

Once that is done, new file shares can be created in Control Panel -> Shared Folder -> Create. With btrfs you also get an Advanced -> "Enable advanced data integrity protection" which seems to be on by default, and is useful to have enabled. If you do not want a #recycle directory in your share it is best to untick the "Enable Recycle Bin" option on the first page (that seems most useful on shares intended for Microsoft Windows Systems, and an annoying top level directory anywhere else).

Once the shared folder is created you can grant access to users/groups, and if NFS is turned on you can also grant access to machines (since NFS clients are authenticated by IP) in the "NFS Permissions" tab. Obviously you then have all the usual unix UID/GID issues after that if you are using NFS v3 or NFSv4 without ID mapping, and do not have synchronised UID/GID values across your whole network (which I do not, not least because the Synology DS216+ makes up its own local uid values).

I had hoped to get NFS v4 ID mapping working, by setting the "NFSv4 domain" to the same string on the Synology DS216+ and the clients (on the Synology it appears to default to an empty string; on Linux clients it effectively defaults to the DNS domain name). But even setting both of those (in /etc/idmapd.conf on Linux) did not result in idmapping happening :-( As best I can tell this is because Linux defaults to sec=sys for NFSv4 mounts, the Synology DS216+ defaults to AUTH_SYS (which turns into sec=sys) for NFS shares, and UID mapping does not happen with sec=sys, because what is passed over the wire is still NFS v3 style UID/GID. (See confirmation from Linux NFS Maintainer that this is intended by modern NFS; the same confirmation can be found in RFC 7530.) Also of note, in sec=sys (AUTH_SYS) NFS UID/GID values are used for authentication, even if file system UID/GID mapping is happening for what is displayed, which causes confusion. (From my tests no keys appear in /proc/keys, indicating no ID mappings are being created.)
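For reference, the NFSv4 domain on a Linux client lives in the [General] section of /etc/idmapd.conf, which is close enough to INI format that Python's configparser can read it. A minimal sketch for checking what is explicitly configured (the function name and the example.org domain are placeholders, and I am assuming the usual "Domain = ..." layout):

```python
import configparser


def idmap_domain(conf_text):
    """Return the Domain set in [General] of idmapd.conf text, or None."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Option names are matched case-insensitively; section names are not.
    return parser.get("General", "Domain", fallback=None)


# Hypothetical file contents, for illustration only.
SAMPLE_CONF = """\
[General]
Verbosity = 0
Domain = example.org

[Mapping]
Nobody-User = nobody
Nobody-Group = nogroup
"""

print(idmap_domain(SAMPLE_CONF))
```

A result of None means the Domain line is missing or commented out, in which case the client falls back to the DNS domain name as described above.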

There is no UID/GID mapping because of /sys/module/nfs/parameters/nfs4_disable_idmapping being set to "Y" by default on the (Linux) client, and /sys/module/nfsd/parameters/nfs4_disable_idmapping being set to "Y" by default on the Synology DS216+. This is a change from 2012 to the client, and another change from 2012 for the server, apparently for backwards compatibility with NFS v3. These changes appear to have landed in Linux 3.4; and both my Linux client and the Synology have Linux kernels newer than 3.4.

The idea seems to be that if the unix UID/GID (ie, AUTH_SYS) are used for authentication then they should also be used in the file system, as happened in NFS v3 (to avoid files being owned by nobody:nogroup due to mapping failing). The default is thus to disable the id mapping at both ends in the sec=sys / AUTH_SYS case. It is possible to change the default on the Linux client (eg, echo "N" into /sys/module/nfs/parameters/nfs4_disable_idmapping), but I cannot find a way to persistently change it on the Synology DS216+. Which means that NFS v4 id mapping can really only be used with Kerberos-based authentication :-( (In sec=sys mode, you can see the UID/GID going over the wire, so idmap does not work. This is mostly a NFS, and NFS v4 in particular, issue rather than a Synology NAS issue as such.)
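
To check which way a given Linux client is set, something like the following sketch reads the module parameter mentioned above. The helper name is my own invention, and it assumes the parameter file holds a single "Y" or "N" as described; on a machine without the nfs module loaded it simply returns None.

```python
from pathlib import Path

# Client-side parameter named in the text; the server side uses the
# equivalent file under /sys/module/nfsd/ instead.
CLIENT_PARAM = "/sys/module/nfs/parameters/nfs4_disable_idmapping"


def idmapping_disabled(param_path=CLIENT_PARAM):
    """Return True/False from the Y/N flag, or None if the file is absent."""
    path = Path(param_path)
    if not path.exists():
        return None  # nfs module not loaded, or not a Linux system
    return path.read_text().strip().upper() == "Y"


print(idmapping_disabled())
```

A result of True matches the default described above, where the numeric UID/GID are used directly with sec=sys.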

Anyway effectively this means that in order to use the UID/GID mapping in NFS v4, you need to set up Kerberos authentication, and then presumably add those Kerberos keys into the Synology DS216+ in Control Panel -> File Services -> SMB/AFP/NFS -> NFS -> Advanced Settings -> Kerberos Settings, and set up the ID mapping. All of which feels like too much work for now. (It seems other Synology users wish UID/GID mapping worked without Kerberos too; it is unfortunate there is no UID:UID mapping option available as a NFS translation layer, but that is not the approach taken by NFS v4. The only references I find to a NFS server with UID:UID mapping was the old Linux user-mode NFS server with map_static, which is no longer used, and thus not available on a Synology NAS.)

It is possible to set NFS "Squash: Map all users to admin" to create effectively a single UID file share, which is sufficient for some of my simple shares (eg, music), so that is what I have done for now. (See a simple example with screenshots and another example with screenshots; see also Synology notes on NFS Security Flavours.)

Setting "Squash: Map all users to admin" in the UI turns into all_squash,anonuid=1024,anongid=100 in /etc/exports:

ewen@nas01:/$ sudo cat /etc/exports; echo


and results in files that are owned by uid 1024, and gid 100 no matter which user created them. Which I could then mount on my Linux client with:

ewen@client:~$ sudo mkdir /nas01
ewen@client:~$ sudo mkdir /nas01/music
ewen@client:~$ sudo mount -t nfs -o hard,bg,intr,rsize=65536,wsize=65536  nas01:/volume1/music /nas01/music/

and then look at with:

ewen@client:~$ ls -l /nas01/music/
total 0
drwxrwxrwx 1 1024 users 1142 Sep 10  2016 flac
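
The effect of those squash options can be sketched roughly as below. This is a simplified model, not how the NFS server is implemented: it ignores root_squash and every other export option, and the 65534 fallback for a missing anonuid/anongid is an assumption on my part.

```python
def parse_export_options(options):
    """Split an exports option string into a dict (bare flags map to True)."""
    parsed = {}
    for item in options.split(","):
        key, sep, value = item.partition("=")
        if not sep:
            parsed[key] = True
        elif value.isdigit():
            parsed[key] = int(value)
        else:
            parsed[key] = value
    return parsed


def effective_owner(options, client_uid, client_gid):
    """Return the (uid, gid) recorded for a request under these options."""
    opts = parse_export_options(options)
    if "all_squash" in opts:
        # Assumed fallback of 65534 when no anonuid/anongid is given.
        return opts.get("anonuid", 65534), opts.get("anongid", 65534)
    return client_uid, client_gid


print(effective_owner("all_squash,anonuid=1024,anongid=100", 1000, 1000))
# (1024, 100)
```

Whatever UID/GID the client presents, everything ends up as 1024:100, which matches the ls output above.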

For my network that is mostly acceptable for basic ("equal access for all") file shares, as gid 100 is "users" on my Linux machines, and thus most machines have my user in that group. (Unfortunately there is no way in the UI to specify that all access should be squashed to a specific user-specified uid, or I would squash them to my own user in these simple cases. There is also no apparent way to assign uids to the Synology DS216+ users when they are created, so presumably the only way to set the UIDs of users is by having them supplied by a directory server like LDAP.)

The main issue I notice (eg, with rsync) is that attempts to chown files as root or chgrp files as root fail with "Invalid argument" so this will not work for anything requiring "root" ownership. (I found this while rsyncing music onto the share, but all the instances of music files owned by root are mistakes, so I fixed them at the source and re-ran rsync.)

For more complicated shares I probably need either to use SMB mounts, with appropriate username/password authentication to get access to the share as that user (which also effectively results in single-user access to the share, but will properly map the user values for the user I am accessing as), or to dedicate the NFS share to a single machine, in which case it can function without ID mapping, as the file IDs will be used only by that machine.

Note that on OS X cifs:// forces SMBv1 over TCP/445, and we turned SMBv1 off above -- so use smb:// to connect to the NAS from OS X Finder (Go -> Connect to Server... (Apple-K)), which will use SMB 2.0 since OS X 10.9 (Mavericks). (CIFS is rarely used these days; instead SMB2 and SMB3 are used, which also work over TCP/445; TCP/445 was one of the original distinguishing things of the original Microsoft CIFS implementation. By contrast the Linux kernel "CIFS" client supports SMB 2.0 since Linux 3.7, so Linux has hung onto the CIFS name longer than other systems; it now supports CIFS, SMB2, SMB2.1 and SMB3, which was implemented by the Samba team.)

On a related note, while testing git-annex on a SMB mount I encountered a timeout, so I ended up installing a later version of git-annex. That allowed git annex init to complete, but transferring files around still failed with locking issues. (Possibly the ssh shell and git server application for the Synology NAS provide another path to getting git annex working? See example of using git server application. Or using that plus a stand alone build of git-annex on the Synology NAS. Another option is the git annex rsync special remote, but that is content only and I think might only have the internal (SHA hash) filenames.)

Posted Sun Apr 16 11:48:21 2017 Tags:


A couple of months ago I bought a Numato Mimas v2 with the intention of running MicroPython on it.

Today, with a bit of guess work, a lot of CPU time, and some assistance from the #upy-fpga channel on FreeNode I managed to get it going. Below are my notes on how to get MicroPython on FPGAs running on my Numato Mimas v2. This project is very much a work in progress (I am told multiple people were working on it this weekend), so if you are following this guide later I would definitely suggest seeking out updated instructions.


  • Ubuntu 16.04 LTS x86_64 system

  • Numato Mimas V2 Spartan6 FPGA board, with set up to be able to upload "gateware" to the FPGA board (there is also a copy of installed as part of this environment setup below which is used for flashing the MicroPython FPGA gateware).

  • USB A to USB Mini B cable, to connect Numato Mimas V2 to the Ubuntu 16.04 LTS system.

  • Xilinx ISE WebPACK installed, reachable from /opt/Xilinx (or optionally installed within the Xilinx directory inside your build directory).

Before you begin it would be a very good idea to check that the Numato Mimas v2 sample.bin example will run on your Mimas v2, and that you can successfully replace it with a program of your own (eg, the Numato tutorial synthesis example).

Building the gateware

The "gateware" consists of the compiled FPGA definitions of the soft CPU (lm32) and peripheral devices that you need. For MicroPython a relatively small "base" set is sufficient.


Clone the upy-fpga-litex-gateware repository, which originated with the HDMI2USB project (hence the dependencies listed):

git clone

Install the relevant bits of the environment in two parts, firstly as root:

cd upy-fpga-litex-gateware
sudo bash -x scripts/

which will install dozens of packages as direct or indirect dependencies, including some from Tim Ansell's Ubuntu PPA.

And then as a non-root user (eg, your own user) for the remaining parts:

cd upy-fpga-litex-gateware
bash -x scripts/

which will download a bunch more packages, and execute/install them. Among other things it installs lm32 cross build tools (binutils, gcc, etc), as pre-built binary tools. The install process is managed with conda, a Python environment management tool. (It currently actually installs a Python 3.6 environment, and then downgrades it to Python 3.5.1 for compatibility, as well as a lot of other old tools.)

It also automatically git clones the relevant Enjoy Digital litex git modules, as listed in the README.

The environment install process will take several minutes, mostly depending on the download speed.


From a terminal which has not entered the Xilinx ISE WebPACK environment, set the desired PLATFORM and TARGET to select what will be built, then enter the upy-fpga environment:

cd upy-fpga-litex-gateware
source scripts/

All going well, it should do some checking, report the directories being used, and then change the prompt to include the PLATFORM and TARGET values. Eg,

(H2U P=mimasv2 T=base R=nextgen)

make help will show you the valid PLATFORM and TARGET values, but cannot be run until after scripts/ has been done; in theory you can change PLATFORM and TARGET after entering the environment, but it might be safest to start with a fresh terminal. (README.targets has some information on the possible TARGET values.)

From there, you can build the "gateware" for your selected PLATFORM/TARGET combination with:

make gateware

which will result in a lot of output, most of it from the Xilinx ISE WebPACK tools. This step will also take a few minutes, and will keep your CPU pretty busy. All going well you should end up with a build/mimasv2_base_lm32/gateware/top.bin file which is the system on a chip to be loaded onto the Mimas V2.

Next you can build the "firmware" to run on the softcore CPU to provide MicroPython on the FPGA. You can build this for your selected PLATFORM/TARGET combination with:

make firmware

This step appears to use a pre-compiled firmware file, and builds quite quickly. It should result in a build/mimasv2_base_lm32/software/firmware/firmware.bin file.

Gateware and Firmware install

Ensure that the Numato Mimas v2 "operation mode" switch (SW7) is set to program mode -- the side nearest the USB connector is program mode (see the Numato Mimas V2 documentation).

Bundle up the gateware, BIOS, and firmware together with:

make image

(which runs ./, to create build/mimasv2_base_lm32/flash.bin).
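
The build output paths mentioned so far all follow the same pattern, which can be sketched as below. The PLATFORM_TARGET_CPU directory naming is inferred from the paths in this post rather than from the build scripts themselves, and the function name is made up for illustration.

```python
from pathlib import PurePosixPath


def build_artifacts(platform, target, cpu="lm32"):
    """Return the expected output paths for one PLATFORM/TARGET build.

    The directory pattern is an inference from the paths quoted in the
    text (eg, build/mimasv2_base_lm32/...), not read from the Makefile.
    """
    base = PurePosixPath("build") / f"{platform}_{target}_{cpu}"
    return {
        "gateware": base / "gateware" / "top.bin",
        "firmware": base / "software" / "firmware" / "firmware.bin",
        "image": base / "flash.bin",
    }


for name, path in build_artifacts("mimasv2", "base").items():
    print(f"{name:9s} {path}")
```

With PLATFORM=mimasv2 and TARGET=base this lists the top.bin, firmware.bin and flash.bin locations named above.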

Then install the gateware, BIOS and firmware bundle with:

make image-flash

(which effectively runs make image-flash-mimasv2 due to the PLATFORM setting).

Because the upload happens at 19200 bps, this will take a couple of minutes to complete -- it does an erase cycle, a write cycle, and a read-back verification cycle.

The upload process looks something like:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ make gateware-flash-mimasv2
python $(which /dev/ttyACM0 build/mimasv2_base_lm32//flash.bin
* Numato Lab Mimas V2 Configuration Tool *
Micron M25P16 SPI Flash detected
Loading file build/mimasv2_base_lm32//flash.bin...
Erasing flash sectors...
Writing to flash 100% complete...
Verifying flash contents...
Flash verification successful...
Booting FPGA...
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$

The Mimas v2 will reboot into the default firmware which is not MicroPython (see later in the document on getting MicroPython running).

Modern lm32 build environment for MicroPython

Building MicroPython needs a fairly recent crosscompiler build environment, newer than the one used by the MicroPython gateware.

This step is best done in a terminal which does not have the gateware configuration (above) in it, so start a fresh terminal.

To build this newer crosscompiler environment clone the lm32-build-scripts repository:

git clone

Then build the cross compilers on your system with:

cd lm32-build-scripts

It will download several key Linux build tools (gcc, gmp, mpfr, mpc, binutils, gdb, etc), and then use them to build a crosscompiler for lm32, designed to be installed in /opt/lm32.

This step will take several minutes, particularly to download the required source to build. Expect your CPU to be very busy for a while as it does a make -j32 when building everything; this also makes the console output fairly tricky to follow, and it is fairly difficult to tell how far through the build process it has reached. (Currently there does not seem to be any check that the downloads are complete, or as intended -- nor any stop and continue steps in the build process -- so it is a bit "hope this works". There is partial support for building a Docker environment with the cross-compilers, but it appears to do a one-shot build and then remove them; presumably it is there only for testing the build scripts work.)

Assuming that it finishes without obvious error, and the return code is 0:

echo $?

then we can assume that it worked. The build directory should include a lot of built code (approximately 2GB).

The built code can then be installed somewhere central with:

sudo mkdir /opt/lm32
sudo chown $USER:$USER /opt/lm32
(cd build && make install)

which will also generate a lot of output, but run much quicker.

After this /opt/lm32/bin should contain a bunch of useful cross-compile tools, eg:

ewen@parthenon:/opt/lm32/bin$ ls
lm32-elf-addr2line  lm32-elf-gcc-6.2.0   lm32-elf-gprof    lm32-elf-readelf
lm32-elf-ar         lm32-elf-gcc-ar      lm32-elf-ld       lm32-elf-run
lm32-elf-as         lm32-elf-gcc-nm      lm32-elf-ld.bfd   lm32-elf-size
lm32-elf-c++filt    lm32-elf-gcc-ranlib  lm32-elf-nm       lm32-elf-strings
lm32-elf-cpp        lm32-elf-gcov        lm32-elf-objcopy  lm32-elf-strip
lm32-elf-elfedit    lm32-elf-gcov-tool   lm32-elf-objdump
lm32-elf-gcc        lm32-elf-gdb         lm32-elf-ranlib
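
A quick sanity check that the key tools can be found once /opt/lm32/bin is appended to the PATH might look like this sketch. The helper name and the short list of tools checked are my own choices for illustration.

```python
import os
import shutil


def missing_tools(bin_dir, tools=("lm32-elf-gcc", "lm32-elf-ld", "lm32-elf-objcopy")):
    """Return the tools that cannot be found on PATH plus bin_dir."""
    # Mirror PATH="${PATH}:/opt/lm32/bin" by appending bin_dir at the end.
    search_path = os.environ.get("PATH", "") + os.pathsep + bin_dir
    return [tool for tool in tools if shutil.which(tool, path=search_path) is None]


print(missing_tools("/opt/lm32/bin"))
```

An empty list means all of the named cross-compile tools were found and are executable.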


MicroPython is also best built in a new terminal, without the gateware build environment variables. It needs to be built from an in-development repository with changes for MicroPython on FPGA and the Mimas v2.

In a fresh terminal, clone the forked MicroPython repository, with Mimas V2 support in it:

git clone

(There are other repositories too; I chose this one to try first as it had been reported as working on the Mimas v2. Apparently the lm32-v2 branch is the main one being worked on at present.)

Enter the repository, and checkout the lm32-mimas2 branch:

cd micropython
git checkout lm32-mimas2

ETA, 2017-03-16: The upy-fpga/micropython has been rebased onto the upstream micropython/micropython, with the lm32 patches merged onto the master branch; it is now best just to use the master branch (and there is no lm32-mimas2 branch any longer).

Change into the lm32 directory, and build with a cross compiler:

cd lm32
PATH="${PATH}:/opt/lm32/bin" make CROSS=1

That should build fairly quickly, and result in a build/firmware.elf file. Convert that into a firmware.bin file that can be uploaded to the Mimas v2 with:

PATH="${PATH}:/opt/lm32/bin" make build/firmware.bin CROSS=1

ETA, 2017-03-13: Apparently one should copy the contents of the build/mimasv2_base_lm32/software/include/generated from the gateware build environment (above) into the micropython/lm32/generated directory before building, to keep them in sync. I did not do this, and presumably it worked due to having an old "compatible enough" version checked in.

Installing MicroPython on the Numato Mimas v2

To actually install MicroPython, we have a few options. Firstly we can build a complete flash image including MicroPython instead of the default firmware. Secondly we can reset the soft CPU running in the default firmware and trigger an upload of MicroPython to run for that boot instead of the default firmware. Thirdly we can upload a flash image without any default firmware, and rely on always uploading the application we want to run.

Flash image including MicroPython

Return to the gateware build top directory, with the environment set up, ie as before (possibly you still have a suitable terminal open):

cd upy-fpga-litex-gateware
source scripts/

Make a directory to build up the MicroPython flash image:

mkdir micropython
cd micropython

And then copy over the MicroPython firmware.elf and firmware.bin file:

cp -p .../micropython/lm32/build/firmware.bin .


python -m -f firmware.bin -o firmware.fbi

to build a firmware.fbi file.

Change back up to the top directory, and then use mkimage to build a complete flash image including MicroPython:

cd ..
rm build/mimasv2_base_lm32/flash.bin
./ --override-firmware micropython/firmware.fbi

This should build a new flash image in build/mimasv2_base_lm32/flash.bin, with output something like:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ ./ --override-firmware micropython/firmware.fbi

Gateware @ 0x00000000 (    341436 bytes) build/mimasv2_base_lm32/gateware/top.bin                     - Xilinx FPGA Bitstream
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff aa 99 55 66 30 a1 00 07 20 00 31 a1 03 80 31 41 3d 00 31 61 09 ee 31 c2 04 00 10 93 30 e1 00 cf 30 c1 00 81 20 00 20 00 20 00 20 00 20 00 20 00
    BIOS @ 0x00080000 (     19356 bytes) build/mimasv2_base_lm32/software/bios/bios.bin               - LiteX BIOS with CRC
98 00 00 00 d0 00 00 00 78 01 00 08 38 21 00 00 d0 e1 00 00 e0 00 00 3b 34 00 00 00 34 00 00 00 e0 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00
Firmware @ 0x00088000 (    153736 bytes) micropython/firmware.fbi                                     - HDMI2USB Firmware in FBI format (loaded into DRAM)
00 02 58 80 36 67 08 1a 98 00 00 00 d0 00 00 00 78 01 40 00 38 21 00 00 d0 e1 00 00 e0 00 00 3b 34 00 00 00 34 00 00 00 e0 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00
       Remaining space    1386360 bytes (10 Megabits, 1.32 Megabytes)
           Total space    2097152 bytes (16 Megabits, 2.00 Megabytes)

Flash image: build/mimasv2_base_lm32/flash.bin
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff aa 99 55 66 30 a1 00 07 20 00 31 a1 03 80 31 41 3d 00 31 61 09 ee 31 c2 04 00 10 93 30 e1 00 cf 30 c1 00 81 20 00 20 00 20 00 20 00 20 00 20 00
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$
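
The "Remaining space" figure in that output can be reproduced from the offsets and sizes it lists. The sketch below is my guess at the arithmetic (free space counted from the end of the last region), which happens to match the numbers shown; it is not taken from the image building script.

```python
FLASH_SIZE = 2 * 1024 * 1024  # 16 megabit SPI flash = 2097152 bytes


def remaining_space(regions, flash_size=FLASH_SIZE):
    """Return free bytes after the last region; regions are (name, offset, size)."""
    end = 0
    for name, offset, size in sorted(regions, key=lambda r: r[1]):
        if offset < end:
            raise ValueError(f"{name} overlaps the previous region")
        end = offset + size
    if end > flash_size:
        raise ValueError("regions do not fit in the flash")
    return flash_size - end


# Offsets and sizes copied from the output above.
regions = [
    ("gateware", 0x00000000, 341436),
    ("bios", 0x00080000, 19356),
    ("firmware", 0x00088000, 153736),
]
free = remaining_space(regions)
print(free, round(free / 2**20, 2))  # 1386360 1.32
```

The same function also raises an error if a larger gateware or BIOS ever grew past the start of the next region.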

Then this custom flash image can be loaded onto the Numato Mimas v2, by ensuring that the "operation mode" switch (SW7) is in "program mode" (nearest to the USB connector), then running: /dev/ttyACM0 build/mimasv2_base_lm32/flash.bin

to program MicroPython onto the Mimas v2. This will take a few minutes to write, as it is uploading at 19200 bps.

The result should look something like:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ /dev/ttyACM0 build/mimasv2_base_lm32/flash.bin
* Numato Lab Mimas V2 Configuration Tool *
Micron M25P16 SPI Flash detected
Loading file build/mimasv2_base_lm32/flash.bin...
Erasing flash sectors...
Writing to flash 100% complete...
Verifying flash contents...
Flash verification successful...
Booting FPGA...
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$

Using the REPL of MicroPython on the FPGA

Unplug the Numato Mimas v2, to power it down.

Move the Numato Mimas V2 "operation mode" switch (SW7) to serial port mode (furthest away from the USB port), so that a terminal program on the computer can communicate with the softcore in the FPGA.

Then run:

screen /dev/ttyACM0 19200

to connect to the MicroPython REPL.

(For some reason MicroPython does not work with flterm as the serial console program, hence "make firmware-connect-mimasv2" which uses flterm does not work; thus the use of screen as a simple terminal emulator. ETA, 2017-05-15: Recent builds of flterm fix this issue so it is not necessary to start screen to interact with MicroPython.)

All going well, if you hit enter a couple of times, you should get a prompt, and then be at the Python REPL:

>>> print("Hello World!")
Hello World!

To get out of the screen session, use ctrl-a \ (backslash) to quit screen.

Uploading MicroPython at boot

The disadvantage of including MicroPython in the flash image is that the whole system needs to be reflashed for every change to MicroPython. As an alternative it is possible to program the Mimas v2 flash with the default application, and then upload the MicroPython firmware application over the serial link, through the BIOS boot loader.

To do this, build the default firmware image as above, and upload that:

make image
make image-flash

then change the operation mode (SW7) to "serial port" (away from the USB connector), and start flterm to upload the MicroPython firmware.bin into RAM on the Mimas v2:

flterm --port=/dev/ttyACM0 --kernel=micropython/firmware.bin --speed=19200

Once flterm is running, press SW6 (button 6, at top right), to send a reset to the soft CPU. (In theory one should be able to type reboot at the H2U> application prompt, but at present on the Mimas v2 that jumps to the wrong address and just hangs.)

You should see the BIOS/boot loader messages appear, and then it should prompt flterm to send the kernel image to run. The upload should start automatically and look something like:

LiteX SoC BIOS (lm32)
(c) Copyright 2012-2017 Enjoy-Digital
(c) Copyright 2007-2017 M-Labs Limited
Built Mar 12 2017 15:39:00

BIOS CRC passed (cdfe4dda)
Initializing SDRAM...
Memtest OK
Booting from serial...
Press Q or ESC to abort boot completely.
[FLTERM] Received firmware download request from the device.
[FLTERM] Uploading kernel (153728 bytes)...
[FLTERM] Upload complete (1.6KB/s).
[FLTERM] Booting the device.
[FLTERM] Done.
Executing booted program.
MicroPython v1.8.7-38-gafd8920 on 2017-03-12; litex with lm32
Type "help()" for more information.
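
For a rough idea of how long that upload takes, the arithmetic can be sketched as follows. It assumes 10 bits on the wire per byte (8 data bits plus start and stop bits), which is an assumption about the serial framing rather than something I measured, and it treats the reported 1.6KB/s as 1600 bytes per second.

```python
def upload_seconds(size_bytes, baud, bits_per_byte=10):
    """Best-case transfer time at a given baud rate (assumed 8N1 framing)."""
    return size_bytes * bits_per_byte / baud


# Size from the flterm output above, at the 19200 bps link speed.
print(round(upload_seconds(153728, 19200)))  # 80

# Time implied by the rate flterm itself reports.
print(round(153728 / 1600))  # 96
```

So the upload needs somewhere around a minute and a half, with the reported rate a little below the best case for the link speed.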

Unfortunately the problem of MicroPython and flterm disagreeing about something still exists, so once you reach this point you need to disconnect flterm (ctrl-c) and reconnect with screen to use the REPL (ETA, 2017-03-15: unless you have a recent build of flterm):

screen /dev/ttyACM0 19200

and then the Python REPL should work:

>>> print("hello world!")
hello world!

To get out of the screen session, use ctrl-a \ (backslash) to quit screen.

A third option: no default application

It is also possible to build the flash image without a default application, and then simply rely on resetting the Mimas v2 and flterm uploading the application to run.

To do this:

rm build/mimasv2_base_lm32/flash.bin
./ --override-firmware none

which should result in something like:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ ./ --override-firmware none

Gateware @ 0x00000000 (    341436 bytes) build/mimasv2_base_lm32/gateware/top.bin                     - Xilinx FPGA Bitstream
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff aa 99 55 66 30 a1 00 07 20 00 31 a1 03 80 31 41 3d 00 31 61 09 ee 31 c2 04 00 10 93 30 e1 00 cf 30 c1 00 81 20 00 20 00 20 00 20 00 20 00 20 00
    BIOS @ 0x00080000 (     19356 bytes) build/mimasv2_base_lm32/software/bios/bios.bin               - LiteX BIOS with CRC
98 00 00 00 d0 00 00 00 78 01 00 08 38 21 00 00 d0 e1 00 00 e0 00 00 3b 34 00 00 00 34 00 00 00 e0 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00 34 00 00 00
Firmware @ 0x00088000 (         0 bytes) Skipped                                                      - HDMI2USB Firmware in FBI format (loaded into DRAM)

       Remaining space    1540096 bytes (11 Megabits, 1.47 Megabytes)
           Total space    2097152 bytes (16 Megabits, 2.00 Megabytes)

Flash image: build/mimasv2_base_lm32/flash.bin
ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff aa 99 55 66 30 a1 00 07 20 00 31 a1 03 80 31 41 3d 00 31 61 09 ee 31 c2 04 00 10 93 30 e1 00 cf 30 c1 00 81 20 00 20 00 20 00 20 00 20 00 20 00
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$

That can be programmed onto the Mimas v2, by putting the Mimas v2 into programming mode (SW7 to the side nearest the USB connector), then running: /dev/ttyACM0 build/mimasv2_base_lm32/flash.bin

Once that completes put the "operation mode" switch (SW7) back to the "serial console" mode (furthest from the USB connector), then run flterm as above:

flterm --port=/dev/ttyACM0 --kernel=micropython/firmware.bin --speed=19200

and hit enter a couple of times. You should get a BIOS> prompt. At that prompt you can type serialboot to get it to kick off the application upload:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ flterm --port=/dev/ttyACM0 --kernel=micropython/firmware.bin --speed=19200
[FLTERM] Starting...

BIOS> serialboot
Booting from serial...
Press Q or ESC to abort boot completely.
[FLTERM] Received firmware download request from the device.
[FLTERM] Uploading kernel (153728 bytes)...
[FLTERM] Upload complete (1.6KB/s).
[FLTERM] Booting the device.
[FLTERM] Done.
Executing booted program.
MicroPython v1.8.7-38-gafd8920 on 2017-03-12; litex with lm32
Type "help()" for more information.
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$

(or just press SW6 to reset the soft CPU back into the start of the boot loader, once you have flterm running, as with the second option above).

As above, you will need to disconnect flterm (ctrl-c) once the MicroPython banner appears, and start screen to interact with the MicroPython REPL (ETA, 2017-03-15: unless you have a recent build of flterm):

screen /dev/ttyACM0 19200

To get out of the screen session, use ctrl-a \ (backslash) to quit screen.

Other references

ETA, 2017-03-13: Lots of proof reading edits, and tweaks based on advice from Tim Ansell.

ETA, 2017-03-15: A newer version of flterm is now available, which does work with MicroPython, so it is now possible to do both the MicroPython firmware upload and interact with MicroPython from one program (ie, no need to exit out to screen).

To update, after building everything, do conda install flterm, which should install flterm 2.4_15_gd17828f-0 timvideos:

(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$ conda install flterm
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /home/ewen/work/naos/src/upy-fpga/upy-fpga-litex-gateware/build/conda:

The following packages will be UPDATED:

    flterm: 0+git20160123_1-0 timvideos --> 2.4_15_gd17828f-0 timvideos

flterm-2.4_15_ 100% |################################| Time: 0:00:01   6.17 kB/s
(H2U P=mimasv2 T=base R=nextgen) ewen@parthenon:~/work/naos/src/upy-fpga/upy-fpga-litex-gateware$

Run flterm as usual (eg, as explained above). You can tell that it is working properly if hitting enter at the MicroPython REPL gives you another prompt, and you are able to enter Python programs.

ETA, 2017-03-15: Katie Bell pointed out that there is a not-merged testing branch which enables controlling the LEDs on the Numato Mimas v2 board. It is in the lm32-leds branch of shenki's MicroPython repository on GitHub, with an example given in the lm32: Add leds module commit comment.

I was able to build it:

git clone micropython-shenki
cd micropython-shenki
git checkout lm32-leds
cd lm32
PATH="${PATH}:/opt/lm32/bin" make CROSS=1
PATH="${PATH}:/opt/lm32/bin" make build/firmware.bin CROSS=1

and then get it working on my Numato Mimas v2 board with:

cd upy-fpga-litex-gateware
source scripts/
cd micropython
mkdir leds
cd leds
cp -p ..../micropython-shenki/lm32/build/firmware.bin .
python -m -f firmware.bin -o firmware.fbi
cd ../..
flterm --port=/dev/ttyACM0 --kernel=micropython/leds/firmware.bin --speed=19200

and then press SW6 (top right) to reset the soft CPU into the boot loader, and load that newer build of MicroPython.

LiteX SoC BIOS (lm32)
(c) Copyright 2012-2017 Enjoy-Digital
(c) Copyright 2007-2017 M-Labs Limited
Built Mar 12 2017 15:39:00

BIOS CRC passed (cdfe4dda)
Initializing SDRAM...
Memtest OK
Booting from serial...
Press Q or ESC to abort boot completely.
[FLTERM] Received firmware download request from the device.
[FLTERM] Uploading kernel (180724 bytes)...
[FLTERM] Upload complete (1.6KB/s).
[FLTERM] Booting the device.
[FLTERM] Done.
Executing booted program.
MicroPython v1.8.7-182-gd86a88c on 2017-03-15; litex with lm32
>>> import litex
>>> leds = [ litex.LED(n) for n in range(1,9) ]
>>> print(leds)
[LED(1), LED(2), LED(3), LED(4), LED(5), LED(6), LED(7), LED(8)]
>>> for led in leds:
...     led.on()
>>> for led in leds:
GC: total: 1984, used: 1408, free: 576
 No. of 1-blocks: 7, 2-blocks: 7, max blk sz: 32, max free sz: 8

During the led.on() loop all the LEDs should turn on; during the second loop, all the LEDs should turn off. But note the GC line indicating that roughly 70% of the memory resources are in use (1408 of 1984), just holding those 8 LED objects open -- so this may not be the most memory efficient approach.
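
The memory figure comes straight from the GC line, and can be worked out with a small sketch like this one (run on the desktop, not on the Mimas v2). The regular expression assumes the line has exactly the "total / used / free" layout shown above.

```python
import re

GC_LINE = "GC: total: 1984, used: 1408, free: 576"


def heap_usage(line):
    """Parse a 'GC: total: N, used: N, free: N' line into numbers."""
    match = re.search(r"total: (\d+), used: (\d+), free: (\d+)", line)
    if match is None:
        raise ValueError("not a GC summary line")
    total, used, free = (int(value) for value in match.groups())
    return {"total": total, "used": used, "free": free,
            "used_pct": 100.0 * used / total}


usage = heap_usage(GC_LINE)
print(f"{usage['used_pct']:.1f}% used")  # 71.0% used
```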

(Unfortunately this particular MicroPython build does not have as many features turned on as the MicroPython for the ESP8266, so things like time.sleep() do not seem to be available.)

ETA, 2017-03-16: The upy-fpga/micropython has been rebased onto the upstream micropython/micropython, with the lm32 patches merged onto the master branch; it is now best just to use the master branch (and there is no lm32-mimas2 branch any longer).

Posted Sun Mar 12 22:12:05 2017 Tags:

Imagine, not entirely hypothetically, that you have a client that needs you to work on multiple systems only accessible via https (where the domain name needs to match), all located behind the client's firewall. Further suppose that the only access they can provide to their network is ssh to a bastion host -- even while located in their physical office, only "guest" network access to the Internet is available. Assume, also not entirely hypothetically, that they have no VPN server. Finally assume, again not entirely hypothetically, that no software can be installed on the bastion host, and that it runs Ubuntu Linux 12.04 LTS (hey, there is at least a month of maintenance support left for that version...).

In this situation there are a few reasonable approaches that preserve the browser's view of the domain name:

  • an outgoing (forward) web proxy, supporting CONNECT

  • transparent redirection of outgoing TCP connections to a proxy

  • tricks with DNS resolution (eg /etc/hosts), possibly combined with one of the above.

I did briefly experiment with transparent redirection of the outgoing TCP connections (which works well on Linux: iptables -t nat -A OUTPUT ...), but since I was working from a Mac OS X desktop system it was more complicated (Mac OS X uses pf, and pf.conf can include rdr statements to redirect packets, but intercepting locally originated traffic involves multiple steps and seems somewhat fragile and was not working reliably for me).

Instead I went looking for a way to implement a web forward proxy, on the bastion host. Since I could not install software on the bastion host, I needed to find something already installed which could be repurposed to be a forward proxy. Fortunately it turned out that the bastion host had been installed with apache2 (2.2.2), to support another role of the host (remote access to monitoring output). I then needed a configuration that could use apache2 in a forward proxy mode.

Apache 2.2 provides a forward proxy feature through mod_proxy, but it is definitely something you want to secure carefully as the documentation repeatedly warns. In addition the bastion host naturally was firewalled from the Internet to allow only certain ports to be reached directly, including ssh, so simply running a web proxy on some port on an Internet reachable IP was never an option.

To solve both of these problems I created a configuration to run another instance of Apache 2.2, with mod_proxy enabled in forward proxy mode, listening on localhost, that could be reached only via ssh port forward (based on examples from the Internet).

This involved creating a custom configuration to run Apache 2.2 with:


# Access control functionality
LoadModule authz_host_module /usr/lib/apache2/modules/

# Proxy functionality
LoadModule proxy_module         /usr/lib/apache2/modules/
LoadModule proxy_http_module    /usr/lib/apache2/modules/
LoadModule proxy_connect_module /usr/lib/apache2/modules/

# Logging
LogFormat "%h %l %u %t \"%r\" %s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogLevel warn
ErrorLog  logs/error.log
CustomLog logs/access.log combined

# PID file
PidFile   logs/

# Forward Proxy
ProxyRequests On

# Allow CONNECT (and thus HTTPS) to additional ports
AllowCONNECT 443 563 8080

as well as a section (in angle brackets) to match "Proxy *", which limited access to the proxy to localhost:

Order Deny,Allow
Deny from all
Allow from

(Full Apache 2.2 example config.)

Note that the above configuration will work only with Apache 2.2 (or earlier); the configuration for the Apache 2.4 mod_proxy, and other Apache features, changed significantly, particularly around authorization. (ETA, 2017-04-03: See end of post for update with Apache 2.4 configuration.)

The key features of the above configuration are that it loads the modules needed for HTTP/HTTPS proxying and IP-based access control, and then listens on localhost for connections and treats those connections as forward proxy connections -- thanks to ProxyRequests On and the "Proxy *" section. The "Proxy *" section is intended to allow access only from localhost itself. It may be useful to also add user/password authentication to the proxy.
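To make the effect of that "Proxy *" section a bit more concrete, here is a small Python model of how I understand Apache 2.2 to evaluate Order Deny,Allow. It is only a sketch: the function name is_allowed is made up for this example, "Deny from all" is modelled as the whole IPv4 address space, and it assumes the Allow rule names the IPv4 loopback address 127.0.0.1 (my reading of "localhost" here).

```python
import ipaddress


def is_allowed(client_ip, deny_nets, allow_nets):
    """Model of Apache 2.2 'Order Deny,Allow'.

    Deny rules are evaluated first, then Allow rules; a matching Allow
    overrides a matching Deny, and a client matching neither is allowed.
    """
    addr = ipaddress.ip_address(client_ip)
    denied = any(addr in net for net in deny_nets)
    allowed = any(addr in net for net in allow_nets)
    if allowed:
        return True
    return not denied


# "Deny from all", modelled as every IPv4 address
DENY = [ipaddress.ip_network("0.0.0.0/0")]
# Assumption: the Allow rule names the IPv4 loopback address
ALLOW = [ipaddress.ip_network("127.0.0.1/32")]

for client in ("127.0.0.1", "192.0.2.10"):
    print(client, is_allowed(client, DENY, ALLOW))
```

With those two rules only the loopback client gets through, which is the behaviour wanted for a proxy that should be reachable only via the ssh port forward.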

Put the config in a directory, and then make a "logs" directory:

cd ....
mkdir logs

And create a simple wrapper script to start up the proxy on the bastion host (eg, called "go"):

#! /bin/sh
# Run Apache in forward proxy mode to be reached via ssh tunnel
exec apache2 -d "${PWD}" -f "${PWD}/apache-2.2-forward-proxy.conf" -E /dev/stderr

Then the proxy can be started up when needed with:

cd ....

and will run in the background. When run, you should see something listening on TCP/3128 on localhost on the bastion host, eg with netstat -na | grep 3128:

tcp      0       0*              LISTEN

From there, the next step is to ssh into the bastion host with a port forward that allows you to reach the proxy:

ssh -L 3128: -o ExitOnForwardFailure=yes -4 -N -f HOST

which should cause ssh to listen on our local (desktop) system, here also on port 3128, so netstat -na | grep 3128 on the local system should also show:

tcp      0       0*              LISTEN
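If you would rather script that check than read netstat output, a minimal Python sketch might look like the following. The helper name port_is_listening is my own invention, and it assumes the forward is listening on localhost port 3128 as described above.

```python
import socket


def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable
        return False


if __name__ == "__main__":
    # Assumption: the ssh port forward listens on localhost port 3128
    print(port_is_listening("localhost", 3128))
```

This only proves that something accepted the connection; it does not prove that the far end of the tunnel is actually the Apache proxy.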

The final step is to set the proxy configuration of your web browser to use the local end of the ssh port forward, on port 3128, as its web proxy, so that your web browser will use the proxy for HTTP/HTTPS connections. For Safari this can be done with Safari -> Preferences... -> Advanced -> Proxies: Change Settings..., which will open the system wide proxy settings. You need to change both "Web Proxy (HTTP)" and "Secure web proxy (HTTPS)" for this to work in most cases, ticking them and then setting the "Web Proxy Server" to the local address that ssh is listening on and the port to "3128".

After that your web browsing should automatically go through the connection on your desktop system, via the ssh port forward to Apache 2.2/mod_proxy on the bastion host, and then CONNECT out to the desired system, giving a transparent HTTPS connection so that TLS certificate validation will just work.

To access from a command line client it is typically useful to set:

export http_proxy https_proxy

because most client libraries will look for those (lowercase) environment variables when deciding how to make the connection. This enables using, eg, web REST/JSON APIs from Python, with the proxy.
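As a quick way to confirm what a Python client will pick up, the standard library can report the proxies it finds in the environment. This is a sketch only: use_tunnel_proxy is a name I made up, and the proxy URL assumes the ssh port forward is listening on localhost port 3128.

```python
import os
import urllib.request

# Assumption: the ssh port forward listens on localhost port 3128
PROXY_URL = "http://localhost:3128"


def use_tunnel_proxy(proxy_url=PROXY_URL):
    """Set the lowercase proxy variables and report what urllib sees."""
    os.environ["http_proxy"] = proxy_url
    os.environ["https_proxy"] = proxy_url
    # getproxies_environment() looks only at the *_proxy environment variables
    return urllib.request.getproxies_environment()


proxies = use_tunnel_proxy()
print(proxies.get("http"), proxies.get("https"))
```

Once those variables are set, urllib.request.urlopen() should send its requests via the tunnel without any further configuration.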

If the bastion host is restarted then the proxy will have to be manually restarted (as above); but if it is running on a legacy Linux install chances are that the client will be unwilling to reboot the host regularly due to being uncertain if it will boot up cleanly with everything needed running. I got through the entire project without having to restart the proxy.

The main catch with this configuration is that because the Safari proxy settings are system wide, they affect all traffic including things like iTunes. I worked around that by using another browser which had its own network connection settings (Firefox) for regular web browsing, and turning the proxy settings on and off as I needed them to work on that project. If this were to be a semi-permanent solution it might be better either to use a browser with its own proxy settings (eg, Firefox; or one in a VM) dedicated to the project, and leave the system wide settings alone, or perhaps to create a Proxy Auto-Config (.pac) file which redirects only certain URLs to the proxy -- it is possible to arrange to load those from a local file instead of a web URL.
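I did not go as far as building a .pac file, but as a rough sketch of the idea, the following Python generates one that sends only selected domains via the proxy and everything else direct. The function name make_pac and the ".client.example" domain are hypothetical, and the proxy address assumes the ssh port forward on localhost port 3128.

```python
def make_pac(domain_suffixes, proxy="localhost:3128"):
    """Build Proxy Auto-Config text that proxies only the listed domains."""
    if not domain_suffixes:
        raise ValueError("need at least one domain suffix")
    checks = " ||\n        ".join(
        'dnsDomainIs(host, "{}")'.format(suffix) for suffix in domain_suffixes
    )
    return (
        "function FindProxyForURL(url, host) {\n"
        "    if (" + checks + ") {\n"
        '        return "PROXY ' + proxy + '";\n'
        "    }\n"
        '    return "DIRECT";\n'
        "}\n"
    )


# Hypothetical client domain, purely for illustration
pac_text = make_pac([".client.example"])
print(pac_text)
```

The generated text can be saved to a file and named in the automatic proxy configuration setting; whether a given browser accepts a local file there is something to test rather than assume.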

ETA, 2017-04-03: Inevitably, given that the support for Ubuntu 12.04 LTS runs out this month, the client upgraded the bastion host to Ubuntu 14.04 LTS (but not Ubuntu 16.04 LTS; presumably due to being a smaller jump). This brings in Apache 2.4 instead of Apache 2.2, which brings in non-trivial changes in configuration syntax.

The incremental differences are relatively small though, for a basic config. You need to load a couple of additional modules:

# Apache 2.4 worker
LoadModule mpm_worker_module /usr/lib/apache2/modules/

LoadModule authz_core_module /usr/lib/apache2/modules/

and change the Proxy authorization section to be:

Require ip

rather than "Order Deny, Allow", "Deny from all", "Allow from".

It is now also possible to use additional proxy directives including:

ProxyAddHeaders On
ProxySourceAddress A.B.C.D

but these are optional. With those changes the same minimal configuration approach should work with Apache 2.4.

(Full Apache 2.4 example config.)

Posted Tue Mar 7 09:15:35 2017 Tags:

A month ago I ordered a pair of Numato Mimas v2 Spartan 6 FPGA Development boards, after hearing about them at Linux.Conf.Au 2017.

This past weekend, after finally getting free of the project that ate my life for the last six months, I got a chance to start installing the tools necessary to start exploring FPGA programming. I am doing this on the Dell XPS 13 running Ubuntu 16.04 LTS I bought late last year to help with such development. While it can dual boot Microsoft Windows 10, ideally I would like to do everything under Ubuntu Linux 16.04 LTS.

As there were quite a few setup steps, and some dead ends, I am mostly recording this process for my own reference.

Xilinx ISE WebPACK install on Linux

The first step was to get the Xilinx ISE WebPACK installed, which is an FPGA synthesis tool for the Spartan 6 available with a no-cost license; it runs under Linux and (older versions of) Windows. The Xilinx ISE WebPACK has been effectively end of lifed (the download page lists final update as October 2013), but still functions at present.

I basically followed the J-Core Open Processor instructions to install the Xilinx ISE WebPACK, but with fewer frustrating steps. There is now both a "Multi-File Download", and also a "Full Product Installation". I chose to download the "Full Installer for Linux" (6GB), which took two attempts, but worked successfully on the second attempt (the first failed quite early). As far as I could tell the download was just a regular browser download, not a "javascript downloader". (I also had to create a Xilinx account even to download the software, and it needed a physical address -- not a Post Box -- before it would allow the account to be created. Apparently this is due to ITAR restrictions...)

After downloading the Xilinx ISE WebPACK installer the next steps are:

cd /var/tmp
mkdir xilinx
cd xilinx
tar -xvf ~/Downloads/Xilinx_ISE_DS_Lin_14.7_1015_1.tar

The tarfile now extracts into a subdirectory, so to start the installer do:

cd Xilinx_ISE_DS_Lin_14.7_1015_1

At that point a GUI installer (eg, common OS X or Windows style) starts up. The first couple of steps have you accept the Xilinx ISE WebPACK license, and then a huge list of third party licenses (mostly for Open Source Software AFAICT). There are two checkboxes to tick on the first page (the second to acknowledge the "WebTalk" spyware), and one on the second page. (The third party licenses appear to be in idata/usenglish/idata/licenses/unified_3rd_party_eulas.txt, which is easier to review in a file viewer than within the GUI.)

When you reach the point to choose what software to install, select "ISE WebPACK", then disable installing the other features (ie, the ones covered by other licenses). The "WebTalk" spyware is mandatory to install (and can only be disabled with a paid license), but the J-Core WebPACK install instructions imply that it runs only in the ise GUI tool, not the command line tools. I left the "use multiple cores during install" option selected, and let it create the environment files, but did skip trying to get an ISE WebPACK license during the install.

On the destination screen I paused and did:

cd /usr/local/
sudo mkdir xilinx
sudo chown $USER:$USER xilinx

in a terminal, and then told the Xilinx installer to use /usr/local/xilinx as its install location. For completeness after the install I also did:

cd /opt
sudo ln -s /usr/local/xilinx Xilinx

just in case anything tried to use the default.

The install process itself ran relatively quickly (just a few minutes, at least on a modern laptop with an SSD), so that part seemed fine.

To actually be able to use the Xilinx ISE WebPACK I followed the J-Core ISE WebPACK install guide hints and created a script to "enter the ISE WebPACK environment". I created /usr/local/bin/xilinx with:

#!/bin/bash -i
# Wrapper script to enter Xilinx environment
# From

source /usr/local/xilinx/*/ISE_DS/
export PS1="[Xilinx] $PS1"
exec /bin/bash --noprofile --norc

and made that executable. Then I can enter a shell in the Xilinx ISE WebPACK runtime environment with:


I also followed the lead of the J-Core ISE WebPACK install guide and pre-emptively moved the files in the Xilinx ISE WebPACK install out of the way (apparently their build is incompatible with, eg, Firefox on modern Linux):

cd /usr/local/xilinx/14.7/ISE_DS/ISE/lib/lin64
cd /usr/local/xilinx/14.7/ISE_DS/common/lib/lin64

I use Firefox as my default browser so I did not need the step to make that my default browser.

The Xilinx ISE WebPACK GUI is started with:


run from within the Xilinx ISE WebPACK environment (ie, xilinx script above has been run, the prompt includes [Xilinx]).

On first launch there will be a license error -- click "OK" on the error, and the license manager will pop up. Within the license manager select "Get Free Vivado/ISE WebPack License", and then click "Next". The popup window will then show what will be in the license key. Within that window click on "connect now" to open a browser window and request the no-cost license. (In my case the NIC it was tied to seemed to be 000000000000; not sure if that is a more recent relaxation of the licensing terms for the end of life software, or due to only having Wifi and a VPN connected at the time. Hopefully it does not become an issue later.)

When the browser window pops up, log in with your Xilinx account (noting that the Username field wants the short username that you used when creating the account not the email address...), and then select "ISE WebPack License" as the one you want. Click on "Generate Node-Locked license" to get a popup confirmation window, and then "Next" on the next two popup windows to actually generate the license file.

An email should arrive, with a Xilinx.lic attachment. Save that attachment. Then go back to the "Xilinx License Configuration Manager" window, in the "Manage license" tab. Click the "Load License" button and browse to the license file you saved, and click Open. You should get a popup that the license installation was successful.

At this point the Xilinx ISE WebPACK software should be ready for use.

Numato Mimas v2 programming tools

At this point I started trying to follow the Numato Beginners Guide to learning FPGA, which uses the Numato Mimas v2 as an example board.

Pretty quickly I ran into the problem that the guide said I needed the "Mimas V2 Configuration downloader software". That software, available from the Downloads tab of the Numato Mimas v2 page, is only available for Microsoft Windows ("Configuration Tool (Windows)"). At first glance when looking at the Numato Mimas v2 Documentation, I wondered if I could work without the "configuration tool". But some more research clarified that this "configuration tool" was in fact the "software download tool", and without an equivalent tool the Field Programmable Gate Array was not going to be very Field Programmable...

After quite a bit of searching I turned up a discussion on the Numato Labs Community forum about programming the Mimas v2 from Linux, which showed people using a Python script. Originally it looked like a dead end, as the server given in the link to the software turned out to be dead -- but going to the third page of the discussion thread turned up the fact that Numato moved to GitHub in late 2015, and that the MimasV2Config tool could now be found within their Numato samplecode repository (specifically in FPGA/MimasV2/tools/configuration/python).

So as my next setup steps I did:

cd /usr/local
sudo mkdir numato
sudo chown $USER:$USER numato
cd numato
git clone

According to the discussion forum that script needs (a) Python 3, and (b) the Python3 serial library. So I did:

sudo apt-get install python3 python3-serial

which (on Ubuntu Linux 16.04 LTS) installed Python 3.5.2, and python3-serial version 3.0.1.

The discussion forum also identified that Ubuntu Linux installs modemmanager, which claims any serial device, including USB serial devices, that looks like it might be a modem -- and automatically sends AT commands to it on connect. This obviously confuses various devices which are not modems, apparently including the Numato Mimas v2.

To be safe, since I do not use modems on this laptop, I explicitly uninstalled modemmanager:

sudo apt-get purge modemmanager

(if you do use modems, including 3G/4G modems, on the system you will need to be quite a bit more subtle about keeping modemmanager away from the Numato Mimas v2 serial port).

Then to make the tool easier to run I created a symlink to it:

cd /usr/local/bin
sudo ln -s ../numato/samplecode/FPGA/MimasV2/tools/configuration/python/ MimasV2Config
chmod +x ../numato/samplecode/FPGA/MimasV2/tools/configuration/python/

and tested that it would run well enough to produce help text with:


which should report an invalid number of arguments, and the usage:


For ease of use, you probably also want to add your user to the dialout group, which udev auto-assigns to modem-like/serial-like devices, so that you can run the programming tool without running as root (eg, use vigr).

Putting it all together

At this point I went back to the Numato Beginners Guide to Learning FPGA, and started following the examples for the Mimas V2. If it is not already running, then do:


to start up the Xilinx ISE WebPACK environment.

I created a project following the example in [Mimas-V2-Debug-Probe], choosing on the first page:

  • Top-level source type: HDL

and then on the second page:

  • Family: Spartan 6

  • Device: XC6SLX9

  • Package: CSG324

  • Speed: -2

  • Preferred language: Verilog

(which are also shown in screenshots in part 3 of the Numato Beginners Guide to Learning FPGA).


Then right click on the project, and choose "New Source..." to add a Verilog Module. I then cut'n'paste in the example from part 3 of the Numato Beginners Guide to Learning FPGA:

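module myModule_tb();
wire out;
reg clock;

// Toggle the clock every time unit
always begin
 #1 clock =!clock;
end

initial begin
 //Initialize clock
 clock = 0;

 //End simulation
 #10 $finish;
end

// Device under test: the NOT gate, driven by the clock
myModule notGate(clock, out);
endmodule

module myModule(A, B);
input wire A;
output wire B;
assign B = !A;
endmodule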

replacing all of the auto-generated boilerplate, saved that file, and then went to "Simulation" mode (radio button at top left). From there, highlight the "myModule_sim" file in the project hierarchy, and then go down to "ISim Simulator" and expand that, then right click on "Simulate Behavioural Model" and choose "Run". ISim will launch, and then very rapidly finish (since it is configured to run for only #10).

To see the waveform, click on the Default.wcfg tab, and then zoom out a lot so that 10 ns fits on the screen (instead of ps), which should show the expected "not gate" behaviour. (If you are too far zoomed in, all you will see are very straight lines...)
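For reference, the waveform to expect can be worked out by hand. The following Python is only a model of the testbench, not anything ISim produces; the function names are mine, and it assumes the clock starts low at time 0, toggles every time unit, and the run stops at #10.

```python
def not_gate(a):
    """Python stand-in for 'assign B = !A'."""
    return 0 if a else 1


def simulate(duration=10):
    """Yield (time, clock, out) for each time unit before the run ends."""
    clock = 0  # the initial block sets the clock low at time 0
    for t in range(duration):
        yield t, clock, not_gate(clock)
        clock = 0 if clock else 1  # '#1 clock = !clock'


for t, clock, out in simulate():
    print(t, clock, out)
```

In every row the output is the inverse of the clock, which is what the two traces in the waveform viewer should show once zoomed out far enough.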

Implementation for actual hardware

To implement this simple example on actual hardware, we need the "User Constraints File" (ucf) which maps functions (eg, switches, LEDs, etc) to actual hardware. There is a reference User Constraints File for the Numato Mimas v2, which gives the mappings available -- with everything commented out, and given generic names. It is linked from the Numato Mimas v2 downloads page.

Part 4 of the tutorial also includes the relevant snippets for the test project. These seem to be:

# User Constraint File for NOT gate implementation on Mimas V2

# Onboard LEDs
NET "B" LOC = T18;

# Push Button Switches.
# Internal pull-ups need to be enabled since
# there is no pull-up resistor available on board

NET "A" LOC = M16;

which can be created in a new file by changing to "Implementation mode" (radio boxes at the top level), and then right click on the project and choosing "New Source" to add an "Implementation Constraints File" called notgate.ucf. Then cut'n'paste the text above into the file (if copying from the Numato tutorial, beware it has smart quotes, and you will need to allow those to be converted to regular quotes for it to work).

Then update the Verilog code to be just:

module myModule(A, B);
   input wire A;
   output wire B;
   assign B = !A;
endmodule

without the simulation framework code used in part 3.

Save the file, and right click on the module and select "Implement Top Module". That runs a whole series of steps to build the binary image to upload to the FPGA (you can watch the progress in the output section at the bottom). It will hopefully finish with "Process ... completed successfully", and no warnings or errors.

To get a file to download to the Numato Mimas v2, it is necessary to right click on "Generate Programming File" and choose "Process Properties...", then tick the box "Create Binary Configuration File" and "Apply" that change and click "OK" to close the dialog box. Then right click on "Generate Programming File" again and choose "Run", and it should successfully generate a bitstream file myModule.bin in the project directory.

Loading onto the Numato Mimas v2 board

Connect the computer to the Numato Mimas v2 board via a USB A to USB Mini B cable. To my surprise this is a different cable than the ESP8266 and IoTuz used (slightly taller connector), but fortunately I did have one sitting around, from an unpowered hub which is otherwise not very useful. (It is also useful to ensure that the supplied standoffs for the board have been installed so it is not just a bare board sitting on something which may result in shorts.)

The Numato Mimas v2 board should power up (it is USB powered by default), and if it is fresh from the factory will probably be running some sort of test routine -- mine had the 7-segment displays cycling from 111 through 999, and a LED chaser on D1 to D8.

Running "dmesg | tail" should show a newly discovered device "Numato Labs Mimas V2 Spartan6 FPGA Development Board", and /dev/ttyACM0 should now exist. Check the permissions on /dev/ttyACM0 and your groups:

ls -l /dev/ttyACM0

to make sure that /dev/ttyACM0 is writable by a group you are in (by default "dialout", which hopefully you are in as described above).
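The same check can be done from Python before trying to program the board. This is a minimal sketch: can_program is a hypothetical helper name, and the default device path assumes the board appeared as /dev/ttyACM0 as it did for me.

```python
import os


def can_program(device="/dev/ttyACM0"):
    """Return True if the device node exists and we may read and write it."""
    return os.path.exists(device) and os.access(device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    # Assumption: the Mimas v2 serial port showed up as /dev/ttyACM0
    print(can_program())
```

If this prints False while the device node exists, group membership is the first thing to check; note that a newly added group only takes effect after logging in again.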

Then change to the project directory that the Xilinx ISE WebPACK was using, and run:

MimasV2Config /dev/ttyACM0 myModule.bin

It should connect, detect the flash (Micron M25P16 SPI Flash in my case), erase the flash, write the flash, verify the flash, and then reboot the FPGA. (The steps should be similar to Configurating Mimas V2 Using Configuration Tool, but with the output to the console.)

The FPGA should start up without the factory test pattern (since it should be running new code). And following the Numato Learning FPGA Part 4 example instructions, pressing SW3 on the board should cause LED D8 to light up. Sadly in my case the factory test pattern was no longer running, but neither did pressing SW3 or any of the other buttons do anything :-( And power cycling the board (by unplugging the USB) did not make it work.

To test the programming alone, I downloaded the mimasv2_sample_bin_file.bin, and tried programming that instead:

MimasV2Config /dev/ttyACM0 mimasv2_sample_bin_file.bin

which also appeared to flash correctly, and the board restarted with the factory test routine running. From this I concluded that the programming interface works, and I have some other issue.

The next most obvious issue was that I set something up wrong in the Xilinx ISE WebPACK, so to eliminate that possibility I downloaded the example Xilinx ISE implementation project for Mimas v2, unpacked that and opened that, and then followed through the "Implement Top Module" and "Generate Programming File" steps again, giving me a new myModule.bin (the zip also had a mymodule.bin -- note the case difference -- which was identical to what I rebuilt; but both were different from the one I had built myself).
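Comparing bitstream files like this is easy to script. The sketch below uses only the Python standard library; the helper names are my own, and the file names in the comment are placeholders for wherever the two builds ended up.

```python
import filecmp
import hashlib


def sha256_of(path):
    """Hex SHA-256 digest of a file, handy for noting down which build is which."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def same_bitstream(path_a, path_b):
    """True only if the two files are byte-for-byte identical."""
    return filecmp.cmp(path_a, path_b, shallow=False)


# Example use (placeholder paths):
#   same_bitstream("myModule.bin", "downloaded/mymodule.bin")
```

Passing shallow=False matters here, because otherwise filecmp may decide based on file size and modification time alone.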

When I uploaded the one built from the downloaded project it worked:

MimasV2Config /dev/ttyACM0 myModule.bin

with pressing switch SW3 causing LED D8 to turn on while the switch is held down.

Doing a diff between the ucf file in the downloaded project, and the ucf in my test project turned up the difference -- when I cut'n'paste the User Constraints File snippet from the example 4 website, I accidentally copied the first one instead of the second one. The first one has the Mimas v2 default names for things, and the second one has "LED" changed to "B" and "SW" changed to "A" to match the names used in the code.

Fixing that difference, then doing "Implement Top Module" and "Generate Programming File" in the project I had created gave me an updated myModule.bin, which was bit-for-bit identical with the downloaded one. When I downloaded that one to the Mimas v2 it also worked. To be sure that it was not just running the downloaded example, I wrote the Mimas v2 factory sample code in, and then my own built example:

MimasV2Config /dev/ttyACM0 mimasv2_sample_bin_file.bin
MimasV2Config /dev/ttyACM0 myModule.bin

and both worked.

So now I have a Linux-only programming environment for the Numato Mimas v2 set up. And have learnt that at least the Xilinx ISE WebPACK GUI will happily continue on if it cannot match up the named inputs and outputs -- so that is definitely an issue to look out for in future use.
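Since the tool will not complain, a small pre-flight check can catch this class of mistake. The following Python is a rough sketch, not a real UCF or Verilog parser: it only understands simple NET "name" lines and a plain "module name(A, B);" port list, the function names are mine, and the "LED" and "SW" names in BAD_UCF are hypothetical stand-ins for the generic names in the reference file.

```python
import re


def ucf_nets(ucf_text):
    """Net names from uncommented 'NET "name" ...' lines in a UCF."""
    nets = set()
    for line in ucf_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments
        match = re.search(r'\bNET\s+"([^"]+)"', line)
        if match:
            nets.add(match.group(1))
    return nets


def module_ports(verilog_text, module_name):
    """Port names from a 'module name(A, B);' style header."""
    pattern = r"\bmodule\s+" + re.escape(module_name) + r"\s*\(([^)]*)\)"
    match = re.search(pattern, verilog_text)
    if not match:
        return set()
    return {port.strip() for port in match.group(1).split(",") if port.strip()}


def mismatches(ucf_text, verilog_text, module_name):
    """Return (ports with no constraint, constraints with no port)."""
    nets = ucf_nets(ucf_text)
    ports = module_ports(verilog_text, module_name)
    return sorted(ports - nets), sorted(nets - ports)


VERILOG = "module myModule(A, B);\n input wire A;\n output wire B;\n assign B = !A;\nendmodule\n"
GOOD_UCF = 'NET "B" LOC = T18;\nNET "A" LOC = M16;\n'
# Hypothetical stand-ins for the generic names in the first snippet
BAD_UCF = 'NET "LED" LOC = T18;\nNET "SW" LOC = M16;\n'

print(mismatches(GOOD_UCF, VERILOG, "myModule"))
print(mismatches(BAD_UCF, VERILOG, "myModule"))
```

An empty pair of lists means every port has a pin constraint and vice versa; anything else is worth fixing before generating the programming file.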

ETA, 2017-03-07: Useful Getting Starting with Numato Mimas v2 blog post, with lots of reference pictures and a slightly different small example (linked from the Knowledgebase tab of the Numato Mimas v2). Also of interest, FPGArduino (also on GitHub) which creates an Arduino-compatible microcontroller on the FPGA (via blog post describing how to install it on the Mimas v2 -- including an updated serial loader, etc; possibly some of those steps are no longer required; see also FPGArduino guide to JTAG programming on Linux, which points to a WiFi JTAG project based on an ESP8266!). Nearby is the f32c softcore, available under a BSD license. There is also a way to run PacMan on the Mimas v2!

ETA 2017-03-11: Corrected typo in programming device ID (should be XC6SLX9, for which there is a Xilinx Spartan 6 BSDL Model for use with a JTAG interface (Xilinx login required); BSDL (Boundary Scan Description Language) is a VHDL subset to describe pin maps; see also for a collection of BSDL files found from vendors).

Posted Mon Mar 6 15:00:29 2017 Tags:


I have been attending Linux.Conf.Au for over 10 years -- I have been to every one since LCA2004 in Adelaide. It is a very useful conference to stay in touch with developments in Linux and Free/Open Source Software. Simon Lyall and I have run the Linux.Conf.Au Sysadmin Miniconf for nearly as long -- Simon and I alternate years as "main organiser", and help out in the other years. (Simon has a good summary of the LCA2017 Sysadmin Miniconf: Session 1, Session 2, and Session 3; and all the slides and videos from this year are also linked from the LCA2017 Sysadmin Miniconf Programme.)

Over recent years there has also been a focus on Open Hardware as an extension of the Open Source Software movement. In particular there has been an Open Hardware Miniconf since 2010 -- originally called an "Arduino Miniconf", but getting a more general name as interest spread out beyond the Arduino Platform (see also the Wikipedia summary of Arduino history). Each of those Open Hardware Miniconfs has had an assembly project which a select few (usually about 30 -- limited by room size) got to build and take home.

Because I started out in 8-bit computers, including building a few expansion modules, I have always been a bit attracted to Open Hardware and kept an eye on the developments over the last decade -- including the Arduino, the Raspberry Pi, and other development boards. But Open Hardware often ended up being crowded out by other things. In particular attending the Open Hardware Miniconf assembly session usually got crowded out by conflicting events like the Sysadmin Miniconf, or by being booked out very quickly.

This year, at LCA2017, I finally got a chance to play with a few of the bits of Open Hardware. I happened to see the sign up announcements for two different Open Hardware sessions within a few hours of them opening -- and thus was able to quickly sign up before they filled up. It was also one of the years when the Open Hardware Miniconf did not conflict with the Sysadmin Miniconf, so I could do both. (Though the Sysadmin Miniconf consumed all of my Monday, and the Open Hardware Miniconf consumed all of my Tuesday, so I did not see any other LCA2017 Miniconfs!)

Open Hardware Miniconf

The LCA2017 Open Hardware Miniconf assembly project was called "IoTuz", and was essentially an ESP32 development board -- a recently released 32-bit Wifi/Bluetooth/CPU system on a chip from Espressif, who actually sponsored the Miniconf (by donating the ESP32 devices used).

The IoTuz hardware was supplied (for AUD$100) "partly assembled" (with the surface mount parts pre-assembled), and a collection of through-hole components to be added by attendees (parts list). That was fortunate as the last time I did any soldering -- over 20 years ago -- everything in the hobbyist market was through-hole.

My hardware assembly went pretty slowly, taking the entire assembly session and about another hour later in the afternoon. Partly it seems to have been due to the (temperature controlled) soldering iron I had chosen not heating properly -- I made much faster progress once one of the helpers pointed out that the iron did not seem to be melting the solder as fast as expected, and suggested I try another soldering iron. But also not having done any soldering in the last 20 years made for fairly slow progress (I also originally assumed the issues with melting the solder were all my unpracticed technique, and not the equipment...).

By the end of the Miniconf I did have a pretty much completely assembled IoTuz, in part thanks to one of the helpers who soldered the more closely pitched LEDs with much more skill than I had. Fortunately my board did have the CN3063 battery charge controller installed, along with the two small (red) wires reversing the pinouts due to a late-discovered layout issue -- so I could also test, briefly, that it appeared to charge the supplied battery. (At the entry to the Miniconf they were trying to ensure the boards without the battery charge controller/modification went to Melbourne based attendees, where the designers could easily get hold of the board again to fix the issue later.) The missing items were the speaker and microphone -- the speaker because it did not arrive in time, and the microphone due to a layout issue. I believe the speakers finally arrived towards the end of LCA2017, but I did not hear they had arrived in time to collect one. (AFAICT the intended speaker was the CUI Inc CVS-1508, DigiKey 102-2498-ND, which seems to be available in New Zealand for under NZ$5 quantity one. The planned microphone was apparently Knowles SPU0410HRH5RH-PB (DigiKey 423-1138-1-ND) which also seems to be available in New Zealand for under NZ$1.50, quantity one. However there is an issue with the circuit layout for the microphone (see end of page), where pins 2 and 4 need to be crossed over at MK91 (on the centre left of the board), which seems tricky for me to assemble by hand; my board may never have a microphone.)

On the Miniconf day my IoTuz "sort of" worked -- it seemed to power on safely and I was able to program it, but watching the console, it crashed very soon after boot, every time. Originally I assumed that was an assembly error on my part, but after rechecking the board (and adding the remaining components I had originally skipped, due to time constraints, in case their absence was causing stray signals), I found out that others were also experiencing similar crashes.

This past weekend -- my first one back at home since the conference! -- I had a chance to try the IoTuz board with the latest IoTuz firmware (following the IoTuz Software instructions), and found that it actually ran in a stable fashion. (I think the fix to the crashing was commit e9a65c5ebb by Angus, based on the timing of this Twitter reply the next day.)

Actually having the board running in a stable fashion did a lot to make me feel like I had successfully assembled the board, and gotten something useful out of the Miniconf. The Espressif development environment (ESP32 IDF) and MQTT (MQTT Wikipedia Page) look to be interesting to explore further.


Automate your home with MQTT

The second Open Hardware session at LCA2017, which I also managed to get onto the hardware list for, was a 1.5 hour tutorial -- "Automate your home with MQTT" -- which provided a very useful breadboard kit (AU$40) based around the ESP8266 (ESP8266 on Wikipedia) also from Espressif (the ESP8266 is an earlier chip, which became very popular with hobbyists and led to the design of the ESP32).

The supplied ESP8266/Breadboard kit seemed really good value, and included:

which gives a lot of scope for easily experimenting with simple circuits.

The tutorial was essentially self-guided, starting with a set of pre-supplied examples -- although part way through I realised they suggested following the even shorter 1 hour tutorial, which skipped through the examples even quicker. (Warning: raw HTML; I cannot find a rendered version online.) There were a bunch of helpers in the room to assist anyone who got stuck, but the instructions were fairly well written so pretty easy to follow for anyone with some development and wiring experience.

I got at least half way through the tutorial in the hour and a half available -- far enough to prove the programming tools were working, the hardware was basically working as expected, and to get the network access and MQTT reporting working. The main challenge was that the supplied multi-colour LED was a common anode rather than common cathode LED -- which meant the common line was Vcc rather than Gnd, and the programming settings for the individual colours all needed to be inverted. That difference required some thinking through the logic in the example programs, but was simple enough to accommodate once we realised the substitution.
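The inversion itself is simple arithmetic. The tutorial examples were in Lua, so the Python below is only an illustration of the logic rather than the code we ran: the function names are made up, and the 0 to 1023 duty range is an assumption about the PWM resolution in use.

```python
PWM_MAX = 1023  # Assumption: 10-bit PWM duty range


def common_anode_duty(brightness, pwm_max=PWM_MAX):
    """Duty value for a common anode LED.

    With the common pin tied to Vcc, a colour is fully on when its pin
    is held low, so the requested brightness has to be inverted.
    """
    if not 0 <= brightness <= pwm_max:
        raise ValueError("brightness out of range")
    return pwm_max - brightness


def rgb_to_duties(red, green, blue):
    """Convert requested brightness levels into per-pin duty values."""
    return tuple(common_anode_duty(level) for level in (red, green, blue))


print(rgb_to_duties(1023, 0, 512))
```

So asking for full red, no green and half blue ends up driving the red pin at duty 0, the green pin at 1023 and the blue pin at 511.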

Despite not "finishing" the tutorial I got a lot of useful experience out of the tutorial, including more insight into how MQTT works and some experience programming in Lua (Lua on Wikipedia). Lua was originally intended to be an embeddable scripting language (eg for ad-hoc automation of parts of a larger program), but also seems to work well for simple "hardware automation" scripting. I did not get as far as the NodeRed part of the tutorial, but it seems like an interesting "wiring up" environment to experiment with later.

Together with the supplied kit to take away, the whole tutorial was very good value, and I was very glad that I had been able to take part. The breadboarding kit should be useful for lots of other simple prototyping later on. (I also took the opportunity to purchase a second ESP8266 breadboard module -- for another AU$8 -- to have a second one for additional experiments.)



Tim Ansell created Tomu, a "tiny ARM microprocessor (that fits inside the USB port)", with the intention of being able to turn it into something that can be used for authentication like the YubiKey 4 Nano, but at a lower cost.

Since he was selling the bare (assembled) board at LCA2017 for AU$10 (a slight loss I think; he talked about them costing US$10 to manufacture...), and I happened to be nearby when someone else asked about buying one, I also bought one in case it turned out to be useful.

Tim warned everyone buying them that while the board was assembled it did not even have a bootloader installed. Fortunately some people have documented how to program the bootloader using a Raspberry Pi and OpenOCD.

Since I did not actually have a Raspberry Pi, I have ordered a Raspberry Pi 3, a Raspberry Pi Cobbler Plus breakout board with a breakout cable, and associated bits from NiceGear so that I would have the parts necessary to attempt to program in the bootloader. Hopefully those will arrive this week, and together with the breadboard from the MQTT tutorial kit I think it should be enough bits to program in the boot loader. (Plus of course the Raspberry Pi 3 should be useful for other things on its own.)


Numato Mimas v2

I learned about the last hardware item also via Tim Ansell -- one of his many projects is to improve the MicroPython on FPGA support. This is primarily to improve the programming environment for the HDMI2USB Video Capture System, particularly to benefit the Numato Opsis board (used for capturing the video the last few years) that Tim Ansell helped design.

Tim organised a BoF session on MicroPython on the FPGA at LCA2017. During that BoF he mentioned that their main development target is the Numato Mimas v2, which costs (just!) under US$50 in quantity one, for a self-contained development board. Once Tim highlighted that this was also the same board that the j-core Open Processor (which was also presented at LCA2017) was targeting for development, I decided to order one as soon as I was home -- I have always wanted to play with programmable hardware. (I actually ended up ordering two because US$50 is more than cheap enough to have a couple around to experiment with -- it works out a tiny bit cheaper than the Raspberry Pi 3. They turned up pretty much as soon as I got home -- having failed to be delivered while I was still travelling, due to being shipped much faster than I expected!)

So far I have not even had time to take the boards out of their static bags, let alone do anything with them. But the idea of simulating a processor in hardware seems exciting, particularly an Open Source processor. (Not all the tools are Open Source -- the minimal FPGA synthesis tool is a "free to use" limited license of a closed source tool from Xilinx.)



By the end of next week it seems likely I will have "one of everything" of what is currently popular in Open Hardware. Then all I have to do is actually find time to do some work with those items.

Posted Mon Feb 6 22:13:06 2017 Tags:

I recently bought a Dell XPS 13 (9360) to dual boot under Microsoft Windows 10 and Ubuntu Linux 16.04 LTS. One of the key requirements for its use as a "conference" laptop was being able to output over HDMI at 720p (ie, 1280x720, a 16:9 ratio), as well as ideally over VGA at, eg, 1024x768 (a 4:3 ratio) for legacy projectors.
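The aspect ratios quoted above fall out of dividing each resolution by its greatest common divisor. A minimal sketch (the function name is just for illustration):

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a resolution to its simplest width:height ratio."""
    divisor = gcd(width, height)
    return (width // divisor, height // divisor)


for width, height in [(1280, 720), (1024, 768)]:
    ratio_w, ratio_h = aspect_ratio(width, height)
    print(f"{width}x{height} is {ratio_w}:{ratio_h}")
```

This prints 16:9 for 1280x720 and 4:3 for 1024x768, matching the two projector formats I care about.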

During the purchase Dell's website offered the Dell DA200 USB-C to HDMI/VGA/Ethernet/USB-3 adapter at what seemed like a reasonable price, so I purchased that as my only non-base option so that I had it available to try out. However having seen it fail miserably for another Linux user at this year's LCA Sysadmin Miniconf, I did not have high hopes of it working properly in Linux.

Trying it out at home connected via HDMI to my Samsung 6-series TV confirmed what I expected -- the Dell DA200 is still not properly supported under Linux, but HDMI through the Dell DA200 did work under Microsoft Windows 10 (although it appeared all the screen sizes were received by the TV as 1080p, suggesting they were being scaled on the driver side instead).

Since there was a suggestion that fixes after the Linux 4.4.0 kernel included in Ubuntu 16.04 LTS might help, I tried installing the latest Ubuntu kernel (4.8.0; equivalent to what shipped with 16.10 -- Yakkety Yak). To do that I did:

sudo apt-get install linux-image-generic-hwe-16.04-edge

which dragged in the right kernel packages, and then rebooted. The system came up properly with the Linux 4.8.0 kernel (including unlocking the crypto disk partition; phew), but the Dell DA200 still did not work very well. I found that I could get 800x600 and 832x624 (two 4:3 ratios) to display via HDMI out the DA200, but no higher resolutions were properly recognised by my Samsung 6-series TV. FreeDesktop Bug #93578 shows that I am not alone in having this problem, and support is still "work in progress". (FreeDesktop Bug #94567 suggests some people got 1400x900 working, but nothing higher, however it is unclear if that was over the DA200/HDMI or not.)
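When checking whether a machine is really running the newer kernel, the version has to be compared numerically rather than as a string (otherwise "4.10" sorts before "4.8"). A rough sketch of that comparison, where the example release strings are made up for illustration and the helper names are my own:

```python
import re


def kernel_version(release):
    """Return (major, minor, patch) from a release string such as
    '4.8.0-32-generic'; missing parts are treated as 0."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_at_least(release, minimum=(4, 8, 0)):
    """True if the release is the given minimum kernel version or newer."""
    return kernel_version(release) >= minimum


# Hypothetical release strings, in the style Ubuntu kernels report
for example in ("4.4.0-53-generic", "4.8.0-32-generic", "4.10.0-1-generic"):
    print(example, is_at_least(example))
```

On a running Linux system the string to pass in is what `uname -r` prints, which Python also exposes as `platform.release()`.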

After some "Internet Research" it seemed like the best option was to avoid the HDMI on the Dell DA200 -- people seem to have fairly good success with the Ethernet port (which was another reason I got that adapter) and the USB-3 port, but not with the HDMI.

For HDMI output the most common suggestion was to get a USB-C to DisplayPort/HDMI adapter, which had nothing else in it. As best I can tell these single-use adapters are either the DisplayPort over USB-C Alt Mode, with a DisplayPort to HDMI adapter built in, or perhaps the HDMI over USB-C Alt Mode. These "Alternate Modes" are another way of using the pins of USB-C, without using the USB protocol -- ie, basically what the previous generation of Apple devices did with Mini-DisplayPort and Thunderbolt 1/2 out the same physical port, but with different signals/adapters. My understanding is that those USB-C Alt Modes allow basically passive adapters, but also require support in the laptop for that Alt Mode.

Based on the Dell XPS 13 (9360) specifications I think what the Dell supports is the Thunderbolt 3 USB-C Alt Mode, which includes DisplayPort 1.2. And thus a passive USB-C to DisplayPort 1.2 adapter should work. But it appears that neither DisplayPort 1.3 over USB-C (a different Alt Mode) nor HDMI over USB-C (another different Alt Mode) will actually work -- the former because there is no explicit support listed, and the latter because HDMI over USB-C was only announced very recently.

So it appears we are in for what one blogger calls a "total nightmare" of physically compatible adapters/cables which may or may not work depending on the "Alt Modes" implemented by a given device with USB-C ports. Joy.

Apparently there are also multiple different types of USB-C cable -- with different lanes connected, but the same connectors -- in addition to multiple different ways that something like an HDMI adapter can be implemented. All with the same physical connector. (Apparently the documentation should tell you if it is DisplayPort Alt Mode or HDMI Alt Mode or one of the others; which assumes there is any real documentation.) Some people report there are even incompatibilities with charging over USB-C.

It seems the latest generation of electrical engineers has forgotten all the lessons learned in the last 30-50 years about using different connectors for incompatible things. Seriously, if it fits, it should work (see also the USB-C HDMI adapters are flaky with the MacBook Pro blog post by the same author; they also suggest using a DisplayPort adapter).

In theory one can tell the implementation apart by the USB-C Alt Mode Logo on the device, but in practice there is a lot of hardware out there which does not get many useful markings at all. So I think this will be another USB Full Speed/USB High Speed mess, but turned up to 11.

To try to shortcut some of this mess (most of which I was only slightly aware of before I started writing this blog post), in the interests of expediency I went looking for a USB-C to HDMI adapter that (a) someone had reported as working under Linux and (b) I could purchase, off the shelf, today, a couple of days before Christmas.

The intersection proved to be the Moshi USB-C to HDMI Adapter which gdx reported working under Linux and was in stock at a "technology store" near me at a merely slightly ridiculous price. Technical information was fairly light -- "UHD 4K output at 60 fps" is about as close as it gets to a specification, along with "Plug-n-Play: no software drivers or power adapter required" being a strong hint it is a passive adapter. Comparing the "UHD 4K output at 60 fps" with the DisplayPort Resolution/Refresh Support, one can guess that the Moshi adapter supports DisplayPort 1.2 -- which tops out around UHD 4K @ 60 fps (max 75 fps); if it had supported DisplayPort 1.3 then we can guess it would have claimed either UHD 4K @ 120 fps, or UHD 5K @ 60 fps. And conveniently DisplayPort 1.2 is a part of the Thunderbolt 3 over USB-C Alt Mode which is the USB-C Alt Mode supported by the Dell XPS 13 (9360).
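The guess about which DisplayPort version is behind a "4K at 60 fps" claim can be sanity checked with some back-of-envelope arithmetic. This sketch assumes 24 bits per pixel and ignores blanking intervals, so it gives an optimistic upper bound (which is why it allows more than the 75 fps quoted above); the link rates are the four-lane figures for DisplayPort 1.2 (5.4 Gbit/s per lane) and 1.3 (8.1 Gbit/s per lane), both with 8b/10b encoding.

```python
# Raw four-lane link rates in Gbit/s
LINK_GBPS = {
    "DisplayPort 1.2": 4 * 5.4,
    "DisplayPort 1.3": 4 * 8.1,
}
ENCODING_EFFICIENCY = 8 / 10  # 8b/10b line coding: 8 data bits per 10 sent


def payload_gbps(width, height, fps, bits_per_pixel=24):
    """Pixel data rate in Gbit/s, ignoring blanking (a simplification)."""
    return width * height * fps * bits_per_pixel / 1e9


def fits(version, width, height, fps):
    """True if the pixel data alone fits in the usable link bandwidth."""
    usable = LINK_GBPS[version] * ENCODING_EFFICIENCY
    return payload_gbps(width, height, fps) <= usable


modes = [("UHD 4K @ 60", 3840, 2160, 60),
         ("UHD 4K @ 120", 3840, 2160, 120),
         ("5K @ 60", 5120, 2880, 60)]
for name, width, height, fps in modes:
    for version in LINK_GBPS:
        print(f"{name} over {version}: {fits(version, width, height, fps)}")
```

Under those assumptions only 4K at 60 fps fits within DisplayPort 1.2, while 4K at 120 fps and 5K at 60 fps need DisplayPort 1.3, which is consistent with reading the adapter's claim as "DisplayPort 1.2".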

The good news is that when I plugged it in, the Moshi USB-C to HDMI Adapter "just worked" in Ubuntu Linux 16.04 LTS, at least with the Linux 4.8.0 kernel (backported from Yakkety Yak, Ubuntu Linux 16.10). I appeared to be able to pick the screen resolution as expected, including picking 720p (1280x720 @ 60Hz), and my Samsung 6-series TV even detected the selected output rate (cf, the Dell DA200 where the TV was detecting 1080p in all selected resolutions). So somewhat by luck (and thanks to gdx's post on Reddit) I appear to have a working solution. (For the record it also seems to work in Microsoft Windows 10 -- consistent with the "no drivers" promise -- although it seemed to only initialise the external HDMI display at all after I had logged in on the primary display. By contrast in Ubuntu Linux 16.04 LTS, the HDMI display comes up at 1080p -- maximum resolution of the Samsung 6-series TV I was using -- before I log in, then switches to the resolution I picked in the settings after I log in.)

From the above research I expect there are other passive USB-C to DisplayPort 1.2 adapters, with or without a DisplayPort to HDMI adapter wedged into them, which should work too (eg, the Kanex KU31CHD4K USB-C to HDMI 4K Adapter which appears to be Thunderbolt 3 based might work; or the Kanex K181-1016-WT8I USB-C/DisplayPort adapter might work with a DisplayPort to HDMI adapter on the outside; the Dell USB-C to HDMI adapter may also be DisplayPort 1.2 Alt Mode, but merely claims HDMI 2.0 compatibility so it is hard to know how it is implemented; and the Google USB-C to DisplayPort adapter also might work, but the "requirements" are so vague -- "video-enabled USB Type-C port" -- as to be useless). In particular it seems that HDMI resolutions covered by the HDMI 1.4 standard have a good chance of "just working" with passive adapters, thanks to DisplayPort Dual Mode which is basically a passive repurposing of the DisplayPort cables for HDMI signals... 4K UHD is covered by the HDMI 2.0 standard, which is supported as a Dual Mode by DisplayPort 1.3. (Basically a "Standard HDMI cable" is equivalent to HDMI 1.4, and a "High Speed HDMI Cable" is equivalent to HDMI 2.0 -- see HDMI guide to finding the right cable. So a claim of "4K @ 60 Hz" also hints at HDMI 2.0 compatibility.)

But it is clearly non-trivial to identify (a) how a given USB-C HDMI adapter is implemented from the "specifications" provided, or (b) whether or not they will be compatible with a given piece of hardware, or software. Worse still, even the passive adapters seem to be fairly non-trivially priced (eg, 5% of the laptop cost), so trying them at random is probably not a cost/time efficient approach.

Posted Thu Dec 22 19:40:56 2016 Tags:


As a word of warning to other readers, this is basically a shaggy dog story, containing extremely extensive yak shaving to get to the point of installing a dual-boot system. Anyone hoping for a pithy guide to a dual-boot installation should look elsewhere. (But unfortunately those "elsewhere"s omit most of the hardships of the journey, hence this extensive yak shaving version.)

Almost every step in this process contained a risk of leaving the computer unbootable, or overwriting data. I do not recommend following anything in this blog post unless you are certain that you know exactly what it will do, and that you can get yourself out of any trouble you get yourself into. For basically all of it, my recovery strategy was basically "well it is a new machine, and I could always just do a factory install and start again". If you have data you care about on the system I would recommend experimenting somewhere else -- perhaps in a throw away VM? Proceed at your own risk. Even reading further may harm your sanity...


Since I have been using OS X laptops for a while, the only laptop I have which "runs Linux" (without an Apple logo on it) is now pretty ancient -- the HP NC6220 is around 11 years old. Using an Apple laptop at a Linux Conference is frowned upon, and the NC6220 fairly rapidly became unsuitable in the last couple of years due to the switch from 1024x768 VGA for projection to 720p HDMI for projection. (The HP NC6220 is also... rather heavy to lug around by modern standards. At 2.3kg it is roughly twice the weight of modern "travel" laptops.)

I had been considering getting a "conference" laptop to run Linux for a while, and Apple's recent hardware moves (fewer expansion ports, removing the Esc key (!!), etc) definitely encouraged having another attempt at "Linux on the Desktop". (I had previously run "Linux on the Desktop" for about 10 years, from around 2000-2009, through actual desktops and two separate laptops -- with the HP NC6220 being the last Linux laptop before 6+ years of OS X on the Desktop.)

Looking at my options, the Dell XPS 13 range was getting a fair number of recommendations as a reasonable choice, particularly since there was a "Dell XPS 13 Developer Edition" pre-installed with Ubuntu Linux. Sadly that "Developer Edition" is only available in North America and certain European countries -- and definitely not available in New Zealand. For recent models it appears the hardware is basically identical between the Windows edition and the "Developer" edition; older models seemed to have some hardware differences in things like the Wifi cards. (I guess "Developer" is a code for "wants a Linux OS" :-) And hence Linux compatibility was important. FWIW, I ruled out a Chromebook as an option pretty early on because I am not keen on being tethered to the Googleverse, and terminals are not always enough; besides, most of the Chromebooks are painfully low on resources. Also Dell's Chromebook was not available in New Zealand at the time I was buying.)

Late last month, Dell had a sale on selected Dell XPS 13 (9360) laptops, including the Z510891NZ model which was on sale for about NZ$2125 -- about 15% off -- with a roughly two week delivery time. It is a FHD (1920x1080p) model with a reasonably good CPU (i7-7500U, TurboBoost to 3.5GHz), 8GB of RAM and 256GB of SSD -- ie the bare minimum RAM/disk to be of any real use (same as the MacBook Air I bought earlier in the year to have a travel Mac for photography software, but obviously I was not going to dual boot that one).

After a bunch of investigation of other models, including the more expensive QHD (3k across) models (which had options for more RAM/SSD) and older "end of line" models (eg, 9350 - the late 2015 model), I eventually decided I could not justify the higher end models unless it was going to be a "permanent" desktop -- and if I was going to spend that much money it might as well be for the latest model, which would hopefully have the longest useful life as a "travel" laptop, as well as a better supported Wifi card. That may or may not have been the best choice, as it appears going with an Intel SkyLake processor means that USB-2 devices are not supported, whereas they were supported on all previous models :-( I think this is just removing the EHCI hub and leaving only the xHCI hub, in which case USB 2.0 devices should work in backwards compatibility mode -- and so far that seems to be the case. But operating systems without USB 3 (xHCI controller) support will not work. Fortunately Linux has supported USB 3 (xHCI controller) for years.

I ordered the Z510891NZ model, with all the base options -- including Windows 10 Home, rather than the "Pro" upgrade. It is a Dell XPS (9360 - Late 2016) model, with 1920x1080p, 8GB of RAM and 256 GB SSD. (If it were my main work laptop, 16GB of RAM and 512GB-1TB of SSD would have been the bare minimum; my existing work laptop has that, and the storage is still pretty tight. But with a travel laptop I can leave most of my work behind -- and running Linux natively means I can mostly avoid the need to run VMs on my laptop, which are usually there to run Linux...)

The only extra I bought at order time was the Dell DA200 USB-C to HDMI/VGA/Ethernet/USB 3 adapter, since it was relatively inexpensive and potentially useful. However it is unclear how well the Dell DA200 adapter is actually supported under Linux, so I may need some different adapters to use with Linux.

For reference:

Dual Boot

Unlike my previous "Linux on Laptop" installs, where I booted straight off the Linux install media the first time I powered the laptop on and overwrote everything on the disk, this time around I plan to retain Windows 10 Home as a dual boot option. The main reasons are:

  • Modern vendor laptops do not come with Windows reinstallation media in the box, so it is harder to change your mind later and reinstall (at present it seems Dell provide a means to download recovery media given a Dell Service Tag; but presumably that is valid only for the warranty period or similar; they also have an at-purchase option to get reinstall media, but it is a substantial fraction of the cost of the laptop so I did not do that).

  • The Microsoft of 2016 is less hostile than the Microsoft of 10 years ago -- they even joined the Linux Foundation and release software on GitHub; it appears Linux everywhere has the newer Microsoft leadership thinking about cooperation more than domination.

  • Occasionally I run into tasks for which having a Microsoft Windows system would be helpful -- and I have no other Microsoft Windows systems (and mostly never have, apart from one "for games" computer many many years ago).

  • Having been bought with a Windows license, the warranty covers only Windows and not Ubuntu Linux, so in the event of a hardware fault being able to demonstrate it on Windows may help.

So while I do not expect to use Microsoft Windows much, I am inclined to retain it on the smallest reasonable portion of the drive that I can, at least until I determine what I am doing with this laptop long term.

On the Linux side, installing Ubuntu 16.04 LTS is a matter of pragmatism -- I prefer Debian-like Linux operating systems, and Ubuntu Linux is the one that has been used by Dell for the "Developer Edition", which means in theory it should be easier to get the hardware working. I am not really a fan of the direction that Canonical has taken Ubuntu Linux Desktop, but for a first install shortly before travel, pragmatism is winning. Dell even have a guide to Ubuntu and Windows 10 dual boot on their hardware, so saying I am "running Ubuntu" at least stands a chance of being understood by the support people.

So my dual boot will be Windows 10 Home and Ubuntu 16.04 LTS.

Windows First

Microsoft Windows needs to be first on the hard drive; it also includes no ability to dual boot anything else. (I guess Microsoft cooperation with Linux, etc, has not got that far yet!) Since it came pre-installed on the hard drive that is fairly easily accomplished -- and just requires shrinking the Microsoft Windows disk partition to leave room at the end to install Linux, then have Linux (grub) take control of the boot menu.

Windows Setup

On first power on, Microsoft/Dell's Windows 10 pre-install guides you through finishing setting up the operating system, like most other modern OS installs. There is nothing especially surprising to watch out for other than:

  • Gigantic legal agreements, displayed in two half-width columns with Microsoft on the left and Dell/others on the right (it appears the Microsoft legal agreement is roughly half the length of the Dell/others agreements); these are naturally "all or nothing"...

  • The "Get going fast" page which tries to get you to skip past your ability to opt out of leaking information to Microsoft/Dell; at that point you really want the "Customise" button. There are three pages of customisations (despite the first one having a scroll bar they appear to all fit on the first screen), and I turned all but one of them off (I left "Use SmartScreen online services" on because it seems the most likely to be useful to protect Windows from the regular drive-by attacks).

  • Skip the Microsoft Account page (click on "Skip this step" link at the bottom)

  • If you enter a (full) name with a space in it at the "Who is going to use this PC?" prompt then you will get a home directory with a space in it. Joy. (Cf Apple and Linux which will usually try to give you, eg, a first name as your username/home directory and/or ask for a full name separately.)

  • Choose "Not Now" for Cortana to try to avoid an always-on microphone... (No idea if they leave the microphone on anyway :-( I too would prefer a computer/laptop without a microphone; or at least one that can be definitively disabled. If I need a microphone then I would prefer to plug in a headset, or a decent recording setup.)

  • You can leave the "Support and Protection" page completely empty and just click "Next", which seems to have the effect of skipping it (and/or turning that into "remind me later").

I also chose "remind me later" at the Dell XPS registration page (I bought it direct from Dell; they should know I have it, so I do not feel compelled to give Dell more information). If you get to the end of the Dell XPS Welcome App it tries to get you to activate McAfee/Dropbox; it appears the only way out is to close the app, so that is what I did. (They keep reappearing periodically, eg, after rebooting; and the only opt out option remains to be to close them. No wonder Windows users are annoyed by computers...)

Windows updates

It is probably worth installing all the Windows updates before proceeding.

A while after booting the notification window tells you "we are adding some new features to windows. This could take a few minutes", which I assumed indicates that auto-update defaults to turned on. And checking in Windows Settings -> Update & Security -> Update Settings that is true -- they are on by default. (It claims "Available updates will be downloaded and installed automatically, except over metered connections (where charges may apply)" -- but I doubt its ability to determine that, as many NZ connections are metered including the one I was installing on. People have had horror stories of thousand dollar bills from Windows update over mobile tethering, etc.) So far I have not gone digging to try to find a way to avoid automatic updates/downloads using up my bandwidth... but fortunately I do not plan to run Windows very often, so mostly I get to choose when it updates by when I run Windows :-)

That same "Update Settings" screen initially told me that no updates were available, but unsurprisingly for a "new out of the box" machine when I told it to actually check for updates now, it found several to download and install. Including an Adobe Flash update (surprise!) and a cumulative Windows security update (surprise!). So I let those install and rebooted before continuing.

Update Dell Firmware

Dell firmware updates are released fairly regularly and have helped with Linux support in the past. They are most easily installed from within Windows -- so it is worth checking the installed firmware and updating it if required before proceeding further.

The Dell Update Utility pre-installed in Windows prompted to install a BIOS Update, while I was preparing a USB drive for making the Windows Recovery Media (below), so I let it go ahead and (try to) install the BIOS Update. I believe it was trying to install the Dell XPS 9360 BIOS version 1.2.3, 1.2.3, which was released 2016-12-14 (in North America presumably, so literally today as I was setting the system up; yesterday when I checked the latest BIOS available was 99.3.24, 1.0.7 released 2016-09-26). But it did not say which version it was trying to install.

For some reason that automatic BIOS update did not seem to do anything (it just restarted without an error message): I keep getting repeatedly prompted to do the BIOS update by Dell Update Manager (eg, each boot), despite trying a few times and even shutting down; and when I go into setup (F2 when the Dell logo is displayed; there's no prompt to do so) I can see that BIOS version 1.0.7 is still installed.

Since I did not actually seem to need to install a new BIOS the day after it was released, I elected to leave actually installing this BIOS update for later, and carry on with creating Windows recovery media.

With the benefit of hindsight that was a mistake, as one of the fixes appears to have been relevant to much of my frustration with creating the Windows Recovery disk: the right USB port not being activated for safe removal -- ie devices plugged in there not showing up as "removable" -- and it took me a long time to accidentally try plugging it in the left hand USB port (which worked), and longer still to realise that was important.

Given that the automatic BIOS update never worked (apparently never actually triggered the firmware update process in the BIOS at all), after creating multiple Windows Recovery disks, I went back and explicitly upgraded the BIOS by manually downloading the 1.2.3 BIOS update, and manually running it. It prompted for all the various firmware it was going to update (clearly it is actually a "firmware bundle"), and then rebooted -- and on reboot updated the firmware one at a time, ending with a cold restart of the system (even power light went off briefly). After it was done the BIOS had been updated to 1.2.3, as visible in the F2 setup screen.

The BIOS update included:

  • System BIOS with BIOS Guard: 1.2.3

  • Embedded Controller: 1.0.3

  • Intel Management Engine (VPro) Update:

  • Intel Management Engine (Non-VPro) Update:

  • Main System TI Port Controller 0: 1.2.6 (unchanged from 1.0.7 BIOS)

  • Dino2MLK Board Map: 1.0.1 (unchanged from 1.0.7 BIOS)

  • PCR0 XML: (unchanged from 1.0.7 BIOS)

so there were four separate firmware update processes run, from within the UEFI environment.

(For reference supposedly one should disable BitLocker when updating the BIOS, presumably in case the TPM loses the encryption key for it... but in my case it seems to have worked without doing so. Windows Disk Management claims my drive is "Bitlocker Encrypted" -- out of the box, I assume -- but the encryption panel tells me that I "need a Microsoft Account" to complete the encryption -- presumably to escrow the decryption key. Personally I would prefer to just have to reinstall. So I chose not to "Turn Off" encryption, because it was unclear if I would be able to turn it on again. Also it is unclear whether or not BitLocker encryption is actually fully active, since the system apparently completely boots without prompting for a password -- unlike what happens on a Mac with an encrypted drive, where you get a pre-boot prompt for a password.)

Windows recovery media

Pre-installed Microsoft Windows systems generally do not ship with install media by default any longer -- if it is available, it is a "value add" extra cost (around 3-5% of the computer cost in my case).

Instead Microsoft/Dell provide a means to (attempt to) create your own recovery media from the running pre-installed system. The process seems to be fairly error prone (it took me at least a dozen attempts to get started creating a recovery drive with the system files included trying solutions at random; and that first attempt that I got started failed very quickly without an indication why).

The error messages are extremely opaque, leaving the user to try to guess what might have gone wrong, and find obscure commands (eg, "sfc /scannow" to check for corrupted system files; reagentc /info to check the Windows Recovery Environment is enabled) to try to see if they will fix the issue (neither found any problem in my case). The pathological desire to never show a user any error message other than "sorry it did not work" leaves users guessing what might be the problem, and causes much more irritation than simply displaying "incomprehensible" error messages -- at least users can search for the "incomprehensible" error message and find more specific information than searching for "it did not work" will return. I would have given up much earlier if it were not for the fact that I planned to do multiple things that risked breaking the Windows install: change the drive mode to AHCI; resize the main Windows partition to make room for Linux; install Linux onto partitions specified by number.

With the benefit of hindsight, it appears the main issue that I ran into was that in the Dell XPS 13 (9360) BIOS 1.0.7, the right USB port had the issue "no 'safe remove' on taskbar when plug-in USB key to right USB port" (supposedly fixed in the just released BIOS 1.2.3) which meant that devices plugged in there were not seen as removable, and thus not possible candidates for creating a recovery image. But it did not occur to me to try the left USB port until well after I had already concluded that 16GB was actually not large enough; I figured this out only after spending a lot of time (slow) formatting a larger drive so that it would work. Even a message saying "no removable drives found" would have helped, as would a message saying "the removable drive found is too small"; but the tool has neither message. (Ironically I did actually try to update the BIOS to a version with this fix before trying the tool; but the BIOS update failed -- see above.)
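Part of the confusion over whether a "16GB" drive is large enough comes from units: drives are sold in decimal gigabytes (10^9 bytes), while many tools report binary gibibytes (2^30 bytes). A quick sketch of the conversion, giving the upper bound for a nominal size (real drives usually expose a little less than this):

```python
def nominal_gb_to_gib(gigabytes):
    """Convert a marketing size in GB (10**9 bytes) to GiB (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


for size in (16, 32):
    print(f'"{size}GB" is at most {nominal_gb_to_gib(size):.1f} GiB')
```

So a nominal 16GB drive tops out a little under 15 GiB, and a nominal 32GB drive a little under 30 GiB.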

What worked for me:

  • Find a 16GB or larger USB drive (in practice about 10GB seems to be actually used on the Dell XPS 13 (9360) with Microsoft Windows 10 Home at present; so a "16GB" USB drive which is actually only 14.5GiB or 14.8GiB should actually work...)

  • Ensure nothing is needed from the USB drive. If it has previously been used elsewhere (eg on a Mac) you may have to explicitly ensure that it has (a) a single primary partition, and (b) that primary partition has the partition type "C" (W95 FAT32), otherwise it may not be recognised (I think this was the case for multiple frustrating attempts with a previously used "32GB" USB drive, previously formatted on OS X, which was not recognised by the recovery tool until I explicitly changed the partition type under Linux, and made a fresh FAT32 filesystem on it).

  • Connect the USB drive to the left USB 3 port on the Dell XPS 13, not the right USB 3 port, so that it is detected as a removable drive (perhaps unless you have successfully updated the BIOS to 1.2.3 -- see above -- but I have not tried again since; the hint that it is not detected as removable is the words "Local Disk" displaying in the File Manager...)

  • Use Disk Management (right-click on the Windows logo) to format the USB drive as a FAT32 file system, using a full format (ie not the quick format) unless it is brand new out of the box. That will take quite some time for a larger USB 2 drive (apparently multiple hours in my case). Among other things the full format forces all sectors to be written and reallocated, and ensures any old file system data does not remain on the drive to cause confusion. The full format should not be necessary for a drive that is brand new; just a quick format.

  • Leave the drive plugged in! There is no need to eject/unplug it, and it also seems to work better if it stays plugged in after the format. (Despite the instructions telling you to plug it in after starting the recovery tool.)

  • Open the "Create a recovery drive" control panel, eg, by searching Windows for "create recovery drive"; it appears this will eventually run the "Create USB Recovery" app, but apparently one should not run that directly.

  • Permit "Recovery Media Creator" (published by "Microsoft Windows") to make changes to your device (ie, "yes" to the User Account Control prompt).

  • Make sure "Back up system files to the recovery drive" stays checked, so that the recovery media can be used to do a full reset/reinstall from the drive (rather than just the recovery partition; if it is unchecked, you basically get a separate boot disk from which you can attempt to repair the existing install, or use the factory recovery partition).

  • Once the stars align so that your USB drive is recognised as an option you will see your USB drive as an "available drive", and "Next" will be selectable; select your USB drive and click "Next".

  • Confirm that the drive can be overwritten and you are okay with all content being lost (hopefully nothing left on it before this point), by clicking on "Create".

  • All going well, the recovery drive should be created, with a progress bar covering preparing the drive, formatting the drive, copying utilities, backing up system files, and copying system (most of the progress bar). It should end with "The recovery drive is ready" and a "Finish" button.

  • Click "Finish" to exit, and the app will close. You should end up with a "RECOVERY" drive. In my case the first RECOVERY drive I successfully made had 21GB free out of 29GB (ie, it took about 8-10GB), so I made a second one on a smaller "16GB" (14.5GiB) USB drive once I had figured out the USB drive had to be placed in the left USB 3 port on the Dell XPS 13 (9360) with BIOS 1.0.7.

Some versions of the tool apparently offer an option to delete the recovery partition from the drive, if it is copying from the Recovery Partition, but this did not appear in my case (good, as I had planned not to delete the recovery partition: on my system the various recovery partitions take about 10GB, which is space that would be useful, but not so useful as to be worth making it impossible to use that as a recovery mechanism later on).

Generally it seems creating the Recovery Drive without the system files (ie, bootable only; no reset/reinstall options) works for many people, and takes relatively little space (under 1GB it seems; it asked for a 512MB USB drive, and worked for me with a "1GB" one). That Recovery drive without system files is basically equivalent to a 1990s emergency boot floppy -- just scaled up to larger software sizes.

But the version with the system files seems to be very problematic for lots of people, with very little indication of the specific cause of the problem, due to the pathological desire never to display a specific error message. There is no good reason for a tool that almost every user is "recommended to run" to be this error prone and user hostile.

For reference, TenForums has a good illustrated guide to creating the recovery drive, which indicates that "Back up system files to the recovery drive" is actually creating a system backup. It also has another article on using the recovery drive, and an explanation of the Windows 10 advanced startup options. (Also of note, an article on how to find your Windows Product Code -- although it appears to also be displayed in the "System" control panel, which would seem easier to find...)

The other plausible "bare metal" recovery option is to download Dell's Windows Recovery Image for the Service Tag, which in my case is a 6.4GB file. Then follow the instructions to create Windows 10 install media -- which are Windows specific but appear to be basically making a FAT32 file system from the command line, and copying the contents of the downloaded ISO onto it. So I have downloaded that file too (about a 6 hour download...), for safe keeping (since my faith it will remain available for a long time is pretty low).

For later reference, using the recovery drive involves pressing F12 during the Dell logo display (for the "one time boot menu"), and choosing to boot off the USB drive (most likely the obscurely named UEFI Boot option that is not the "Windows Boot Manager"). Unless you have an activity indicator on the USB drive, the main indication is that the boot is much slower than normal... and you should arrive at a menu letting you choose your keyboard layout. From there the Troubleshoot menu will allow doing a Factory Image Restore (presumably from the restore partition), and offer other Advanced Options to repair the system, including a command prompt. (The recovery drive with the full system image adds "Recover from a drive" to the options at the top menu, in addition to "Factory Image Restore" which is from the recovery partition.)

In theory no product key is required to use the recovery media provided the recovery media is made on an activated system (check via Start -> Settings -> Update & Security -> Activation). I am not sure if that is true of doing the factory restore from the recovery partition; hopefully it is, as the days of a Microsoft Certificate of Authenticity being plastered over every laptop seem to be over. (The only sticker is an Intel i7 inside one.)

Changing the disk to AHCI mode

Linux will only install on the NVMe hard disk when it is in AHCI mode, not when it is in "RAID" mode. Dell defaults to setting it to "RAID" mode, and installing Microsoft Windows 10 Home in "RAID" mode. (This "RAID" mode uses Intel Rapid Storage Technology (IRST) which appears to use a RAID mode on the individual storage chips on the NVMe SSD; but a driver for this particular IRST seems to be only available for Microsoft Windows, so all the Linux install instructions for the Dell XPS describe changing to AHCI mode.)

Microsoft Windows is understandably surprised at having its disk device changed, and needs some reassurance before it will boot again -- but it can be persuaded to work in AHCI mode fairly easily. The most useful instructions on changing Microsoft Windows to AHCI mode list the steps:

  • Right click the Windows icon and select to run the Command Prompt (Admin) from among the various options, to start an Administrator Command Prompt.

  • Invoke a Safe Mode boot with the command:

    bcdedit /set {current} safeboot minimal

  • Restart the PC.

  • When the Dell logo appears, press F2 to enter the BIOS.

  • Navigate to System Configuration -> SATA Operation

  • Change from RAID On (Intel Rapid Storage Technology) to AHCI mode.

  • Read the warning about changing the SATA Operation mode (basically the OS may be confused -- hence asking for safe mode above)

  • Answer "Yes" to the "are you sure you want to continue?" question.

  • Hit "Apply" (at the bottom of the screen), ensure "Save as custom user settings" is ticked, then click "Ok" on the "Apply Settings Confirmation" window.

  • Exit the BIOS settings by hitting the "Exit" button at the bottom right.

  • The system will do a cold restart (the powered on indicator goes off briefly then comes back on), then boot up Windows.

  • Windows 10 will launch in Safe Mode, as requested above. (There will be various warnings about things that cannot run in Safe Mode, while an Administrator is logged in, which can be temporarily ignored.)

  • Right click the Windows icon and select to run the Command Prompt (Admin) from among the various options, to start an Administrator Command Prompt.

  • Cancel Safe Mode booting with the command:

    bcdedit /deletevalue {current} safeboot

  • Restart your PC once more and this time it will boot up normally but with AHCI mode activated.

That uses a minimal "safe mode" boot to allow Windows to redetect the hard drive device, after which it should then boot normally. Note that the command includes the word "safeboot" not "safemode"; if the /deletevalue is refused then check that you have entered the command correctly. (Some people noticed freezes on the SSD in Windows, for which there is apparently an update that helps -- a Samsung 950 Pro driver update I think; for now I am hoping that the update has made it into the base system in the last 6 months. So far I have not seen any issues, but the system has had only minimal usage to date.)

In theory the disk access will be somewhat slower in AHCI mode, but it does have the benefit of actually working in both Microsoft Windows 10 and Ubuntu Linux 16.04 LTS.

Shrink Windows disk to make room for Ubuntu Linux

It is possible to shrink the Windows drive from within Windows these days, at least with Windows 10. (Other people have used other freeware tools.)

To do it with the built in Windows tools it is supposed to be possible to:

  • Right click on the Start button, and run "Disk Management"

  • Find C:, the Active/Primary NTFS partition

  • Right click on that partition, and choose "Shrink Volume..."

  • Enter the amount to shrink by, ie make available for other use

  • Click on the "Shrink" button

But this did not work for me (see below).

In terms of choosing the amount to shrink, with only a "256 GB" (238 GiB) drive installed, and dual booting, space is very tight. Other than Microsoft Windows 10's partition (C:) there are several other pre-installed partitions:

  • 500MB UEFI boot partition at the start of the drive

  • 450MB "Recovery" partition (UEFI boot for recovery, I think -- the label is "WINRETOOLS"), after the Microsoft Windows 10 partition.

  • 9.85GB Recovery partition (I assume with the factory install image on it; the label is "Image"), after that.

  • 1.08GB Recovery partition (with the label "DELLSUPPORT", presumably some sort of diagnostics or similar) at the end of the drive.

So that is a total of 12GB used up by non-Windows/non-Linux things, leaving 226GB to be shared by Microsoft Windows 10 and Ubuntu Linux 16.04 LTS. That 226GB is the default size of the Microsoft Windows 10 partition. Of that 226GB, Microsoft Windows 10 is using 32GB basically out of the box, leaving 194GB free.

Given that I am mostly going to use Linux, I wanted most of the space for Linux, but I did want to leave some free space for Windows, in case I wanted to run third party applications there. So a reasonable option appeared to be about 33% to 40% of the drive for Microsoft Windows, leaving 60% to 66% for Linux. Fiddling with some numbers, the best round numbers seemed to be taking 140GB for Linux (about 62% of the drive) and leaving 86GB for Windows (about 38% of the drive), which leaves a bit over 50GB free for Windows. That is not dramatically smaller than the usable space for Microsoft Windows on a 128GB SSD (about 107GB for Windows) so it seems like it should be tight, but okay.

So I chose to shrink the Windows C: partition by 140GB, to leave 86GB (about 50GB free) for Windows.
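As a rough sanity check of those numbers, here is a minimal sketch using only shell arithmetic. The variable and function names are just for illustration, and the sizes are the rounded GB figures quoted above, so the percentages are approximate:

```shell
# Space budget sketch; sizes are the rounded GB values quoted in the text.
windows_partition_gb=226   # default size of C: before shrinking
windows_used_gb=32         # used by Windows 10 out of the box
linux_gb=140               # amount to shrink C: by, for Linux

# percent_of PART WHOLE: integer percentage, rounded to nearest
percent_of() {
    echo $(( ($1 * 100 + $2 / 2) / $2 ))
}

windows_gb=$(( windows_partition_gb - linux_gb ))
windows_free_gb=$(( windows_gb - windows_used_gb ))

echo "Windows: ${windows_gb}GB ($(percent_of "$windows_gb" "$windows_partition_gb")%), ${windows_free_gb}GB free"
echo "Linux:   ${linux_gb}GB ($(percent_of "$linux_gb" "$windows_partition_gb")%)"
```

Changing linux_gb is a quick way to see how much free space a different split would leave for Windows.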

Unfortunately when I tried to do so via the Disk Management GUI, all I got was an error from "Virtual Disk Manager" that "The parameter is incorrect" (without actually bothering to say which parameter is incorrect :-( ). The Windows Application Log suggests that the error is Event ID 257 (link is Windows 8.1, which got a hot fix, not Windows 10 though; there seems to have been an issue with SSD defrag on Windows 8). There was also a code 0x80070057 in the Event Log, which seems to appear in various other situations (mostly around upgrades to Windows 10 from earlier versions of Windows) without a clear resolution. For reference, this Event ID 257 seems to be different from the more common Event ID 259 error -- Event ID 259 seems to relate to unmovable files.

After a lot of stumbling around trying various other things, I eventually found that I could shrink the drive, but only if nothing ever tried to ask for the maximum amount the drive could shrink -- and the GUI version of Disk Management always asks for the maximum amount by which the drive can be shrunk before even asking how much you want to shrink it by.

The work around was to shrink the disk at the command line:

  • Right click on the Windows logo and run a "Command Prompt (Admin)"

  • Run "diskpart"

  • Do "list volume" to see the volumes

  • Do "select volume N" to select the volume to shrink; in my case "select volume 0"

  • Guess a safe amount to shrink and attempt to shrink that amount; I found that I could not shrink by the full amount that I wanted in one go (it was refused as above the minimum amount), but I could get there in three steps.

  • Shrink with: "shrink desired=NNNNN" where NNNNN is the amount in megabytes to shrink the existing volume by. The three steps I used were:

    shrink desired=51200
    shrink desired=51200
    shrink desired=40960

    to shrink by a total of 140GiB (those numbers being 50 * 1024 and 40 * 1024 respectively).

  • Use "list volume" again (potentially after each shrink command as you go) to verify that it happened as planned. In my case the result was an 86GB C: as intended.

  • Use "exit" to leave diskpart

In particular it is vital not to run "shrink querymax" as that is the command that fails with Event ID 257, 0x80070057, for reasons I still have not established. Once "shrink querymax" has been run it appears you have to quit diskpart and start it again before it will do any shrinking, even of a specified size.
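The step sizes used above can be worked out mechanically. This is only a sketch: split_shrink is a made-up helper name, and the 50GiB maximum step is simply what happened to be accepted on my system, not a documented diskpart limit. It prints the diskpart commands rather than running anything:

```shell
# split_shrink TOTAL_MIB MAX_STEP_MIB: print the per-step shrink sizes in MiB
# (hypothetical helper; diskpart's "desired=" value is in megabytes)
split_shrink() {
    local remaining=$1 max_step=$2 step steps=""
    while [ "$remaining" -gt 0 ]; do
        if [ "$remaining" -gt "$max_step" ]; then
            step=$max_step
        else
            step=$remaining
        fi
        steps="$steps $step"
        remaining=$(( remaining - step ))
    done
    echo $steps   # unquoted on purpose, to drop the leading space
}

total_mib=$(( 140 * 1024 ))      # 140GiB wanted for Linux
max_step_mib=$(( 50 * 1024 ))    # assumption: 50GiB per step was accepted

for size in $(split_shrink "$total_mib" "$max_step_mib"); do
    echo "shrink desired=$size"
done
```

If a step is refused, lowering max_step_mib and regenerating the list is probably the easiest thing to try next.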

When you are done in the command line, going back into the Disk Management GUI should allow you to verify that you have an 86GB C:, and 140GB of "Unallocated Space" immediately after it.

After shrinking the drive it is useful to reboot Windows a couple of times to make sure it is happy with the new, smaller, partition.

For the record, things that did not work to resolve the problem of using the Disk Management GUI to shrink the drive (or diskpart's shrink querymax):

  • "sfc /scannow" (no errors, no change in symptoms)

  • "DISM.exe /Online /Cleanup-Image /ScanHealth" (no errors, no change in symptoms)

  • "DISM.exe /Online /Cleanup-Image /RestoreHealth" (which failed early on with Error 1726 -- some sort of RPC failure -- the first time but worked the second time. However it made no difference to the ability to shrink the drive :-( See also more on running DISM checks.)

  • To rule it out, as it was a common issue shrinking volumes, I also tried the suggestion of sancho.s on, following the download3k instructions. These temporarily disable several key Windows features in the hope of getting them out of the way of shrinking the drive; make sure you have good backups of anything you want to keep first! Unfortunately they made no difference in my case, so I tried to turn everything back on again (although obviously the system restore points prior to turning system restore back on are now lost :-( ).

  • I also tried temporarily going back to "RAID On (IRST)" mode for the disk, but that also appeared to make no difference -- ie, same symptoms when attempting to shrink the drive. (But for what it is worth the command line shrinking was done in "RAID On (IRST)" mode, and I put it back to AHCI mode after the shrinking was complete. I do not think the disk mode made any difference.)

  • Temporarily turning off McAfee AntiVirus (which Dell had force-installed :-( ) also did not change the symptoms (and it seems to turn itself on at each boot too).

  • For completeness I also tried to do the Shrink in Safe Mode, but got told "This service cannot be started in Safe Mode" -- a different message, but no more helpful. (It is also unintuitive why shrinking the disk should be blocked in safe mode; shrinking the disk is the sort of thing you would expect to want to do with fewer processes running...)

  • The only commonly suggested thing that I did not try was deleting the NTFS Journal, but mostly because I figured out how to make it work with the command line diskpart first...

Depending on how I end up using the laptop, it is possible I may end up re-partitioning it again -- which will involve re-installing Ubuntu Linux, but hopefully should not require reinstalling Windows (as I think it can Expand/Shrink upwards). For the lack of other suggestions, I tried following the guide to shrink beyond where unmovable files are located -- basically by disabling the features that include those files.

(For reference, it is also possible to shrink the drive from within the Ubuntu Linux installer, using gparted, but it seemed safer to let Windows shrink itself, on a fairly fresh install, as Windows will understand NTFS best. In hindsight it seems like it would have been a lot faster to shrink the drive in the Ubuntu installer, or maybe to use the freeware Partition Manager suggested by this answer.)

ETA, 2017-03-30: Michael Wisniewski, who was following this guide, emailed to point out that it is worth jumping ahead to the "Adding Windows back into the grub boot menu" (at the end of this post) at this point, and explicitly disabling Fast Boot in Windows 10 before carrying on, as that will make grub automatically recognise the Windows partition during the install and save some of the extra mucking around with rebooting that I needed to do (as detailed below). If you do, be sure to shut down and power off Windows 10 completely before carrying on, to ensure the Fast Boot files are flushed from the disk.

Ubuntu Linux 16.04 LTS

Creating an install USB drive

For reasons which are not completely obvious, Ubuntu Linux (and most other Linux distros) still distribute their installation media as CD/DVD ISO images, even though most installs these days are done on machines without physical CD/DVD drives. So modern installs are typically either from virtualised CD/DVD drives (eg, via Dell iDRAC) or from USB drives -- and the same features providing virtualised CD/DVD drives usually also provide virtualised USB drives now.

Possibly it is time for Linux distros to start providing install media in a form which can just be directly copied onto a USB drive (eg, with dd). However since that has not happened there are literally dozens of tools to create a USB install disk, starting from an existing Ubuntu Linux system, a Windows system with third party software, or on a Mac (but it is not clear if the media created on a Mac will only work on a MacBook, as it is not clear what ends up in the .img file, as the UDRW image format seems to be Mac specific). What these tools actually do seems to be a pretty well guarded secret, for non-obvious reasons. Although some of them, like rufus, are actually open source, so presumably one could reverse engineer the process... Unfortunately many of the recommendations online seem not to work when the computer is in UEFI mode (which you would think was pretty common these days).

Fortunately for a UEFI install the steps actually required appear to be pretty simple:

  • Ensure your computer is booting in UEFI mode (very likely with a modern Windows 10 pre-install, especially if Secure Boot is enabled)

  • Find a 2GB+ USB drive that you can completely overwrite (even the partition table will be overwritten in the next step; so all contents will be lost). Plug the device into a Linux computer.

  • Create a removable USB drive with an msdos (MBR or Master Boot Record) partition table, with a single FAT32 partition. (Supposedly the drive should have a GPT (GUID Partition table) with a single FAT32 file system on it, eg from another Linux system, but when I tried with a gpt partition table, the USB drive was not recognised in the Dell XPS 13 UEFI boot menu. I do not know why.) These instructions are for an msdos partition table which seemed to work for me even in UEFI boot mode:

    sudo apt-get install parted
    sudo parted /dev/sdN print              # Check it is right device!
    sudo parted /dev/sdN mklabel msdos      # Overwrite partition table
    sudo parted /dev/sdN print | grep msdos # Check it is a msdos table now
    sudo parted /dev/sdN mkpart primary fat32 0% 100%
    sudo parted /dev/sdN print              # Verify partition created
    sudo mkfs.vfat -F 32 -n UBUNTU -v /dev/sdN1      # Make file system
    sudo parted /dev/sdN print              # Check fat32 detected
    sudo parted /dev/sdN align-check opt 1  # Verify alignment

    where /dev/sdN is the device of your removable drive (be extra careful that you have identified your removable drive, and not one of your internal drives!)

    At the end, for the msdos partition table you want something like:

    ewen@tv:~$ sudo parted /dev/sdc print
    Model: SanDisk Cruzer Blade (scsi)
    Disk /dev/sdc: 8003MB
    Sector size (logical/physical): 512B/512B
    Partition Table: msdos
    Number  Start   End     Size    Type     File system  Flags
     1      1049kB  8003MB  8002MB  primary  fat32        boot, lba

    Note that sector 34 is the first available sector when using a GPT, which is 17408 bytes into the disk; but it may not be well aligned. parted is very opinionated about partition alignment, but unwilling to share its recommendations beyond "the computer says no". Using the calculation tool posted here it seems like 2048s (ie, 1MiB, but not 1MB) is likely to result in acceptable alignment, and is the default when 0% is specified; 100% is a shortcut for "until end of the device". This 0% 100% shortcut seems poorly documented (see eg attempt 4 of the blog post with the calculation tool). It is really unfortunate that parted (a) requires you to enter the start/end values, and (b) does not bother to share its recommended alignment when telling you the provided start/end values are not to its taste. Using percentages seems to be the only way to trick parted into picking its favourite alignment rather than simply critiquing your own guesses with useless "no, guess again" messages.

  • Download the latest Ubuntu 16.04 LTS 64-bit (amd64) desktop install ISO, currently ubuntu-16.04.1-desktop-amd64.iso, and ideally at least verify the SHA1SUM (in the releases/16.04/SHA1SUMS file).

  • Copy the contents of the ISO, either by mounting it or by directly extracting it with something like p7zip. I chose to use p7zip because it was available, and there were already plenty of yaks shaved so far.

    sudo apt-get install p7zip-full
    # Mount USB drive, if not auto-mounted; uid/gid values set to my own
    sudo mount -o uid=$(id -u),gid=$(id -g) /dev/sdN1 /mnt
    cd /mnt
    7z x /var/tmp/ubuntu-16.04.1-desktop-amd64.iso
    # Verify that the EFI Boot files also got extracted
    ls EFI/BOOT/
    # Unmount the drive
    sudo umount /mnt

    Most of what is extracted is a squashfs file system container file; the rest is just ancillary files to get the system to boot in various environments.

  • For certainty, explicitly mark the partition as bootable:

    sudo parted /dev/sdN set 1 boot on
    sudo parted /dev/sdN print | grep boot    # Check for boot flag

Then remove the USB drive from the computer you are creating it on, ready to be used as an Ubuntu 16.04 LTS liveboot or install disk.
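For the checksum verification mentioned in the steps above, something like the following should work. It is a sketch only: verify_iso is a made-up helper name, and the /var/tmp/SHA1SUMS location is an assumption about where the checksum file was saved. It relies on the standard "sha1sum -c" behaviour of reading "hash filename" lines:

```shell
# verify_iso ISO_PATH SUMS_PATH: check one ISO against its entry in a
# SHA1SUMS-style file (hypothetical helper; returns sha1sum's exit status)
verify_iso() {
    local iso=$1 sums=$2 name
    name=$(basename "$iso")
    # Select the line for this ISO, then check it from the ISO's directory,
    # since the checksum file lists bare file names.
    grep -F "$name" "$sums" | ( cd "$(dirname "$iso")" && sha1sum -c - )
}

iso=/var/tmp/ubuntu-16.04.1-desktop-amd64.iso
sums=/var/tmp/SHA1SUMS    # assumption: downloaded alongside the ISO
if [ -f "$iso" ] && [ -f "$sums" ]; then
    verify_iso "$iso" "$sums"
fi
```

A line ending in "OK" means the download matches; anything else means it is worth downloading again before writing the USB drive.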

See also hints on dual booting with UEFI install, and working with Ubuntu secure boot; although I did not pursue these when trying to figure out why the GPT partition table version would not boot.
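On the alignment point in the parted notes above, the arithmetic is easy to check by hand. This sketch assumes 512 byte logical sectors (as in the parted output shown) and assumes that 1MiB is the boundary parted's "optimal" check wants on this drive; is_aligned is just an illustrative helper:

```shell
sector_bytes=512
alignment_bytes=$(( 1024 * 1024 ))   # 1MiB; assumed alignment target

# is_aligned START_SECTOR: succeed if the sector starts on a 1MiB boundary
is_aligned() {
    [ $(( $1 * sector_bytes % alignment_bytes )) -eq 0 ]
}

for start in 34 2048; do
    offset=$(( start * sector_bytes ))
    if is_aligned "$start"; then
        verdict="aligned"
    else
        verdict="not aligned"
    fi
    echo "sector $start = byte offset $offset: $verdict to 1MiB"
done
```

That also explains the "1049kB" start in the parted output: sector 2048 is at 1048576 bytes, which parted rounds to 1049kB in its decimal units.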

Booting the Ubuntu 16.04 LTS installer

Plug the USB drive created above into the Dell XPS 13 (9360). Because of superstition, I plugged it into the left USB 3 port (see above for why!). Power on the machine, or restart it.

When the Dell Logo appears press F12 to get the "one-time boot menu". All going well you should see:

  • "Boot mode is set to: UEFI; Secure Boot: ON"

  • "UEFI BOOT:" menu listing two items

  • The second item should be something like:

    UEFI: SanDisk, Partition 1

    where the vendor matches the vendor of your USB drive.

(The first item is labelled "Windows Boot Manager", and appears to actually boot the Dell SupportAssist partition at the end of the drive; it is unclear why it does not have a better name than "Windows Boot Manager".)

Highlight the "UEFI: SanDisk, Partition 1" (second) entry and press return to boot it. You should get a GRUB menu offering "Try Ubuntu without installing" and "Install Ubuntu" as the first two options (in a minuscule font, in the top left corner, even on only a 1080p display -- apparently whoever wrote this has amazing eyesight, or really likes negative space!).

The "Install Ubuntu" (second entry) starts a simplified installer for Ubuntu Linux 16.04 LTS, which it turns out (see below) is too simple for my requirements. It is also possible to use the "Try Ubuntu without installing" to kick off an install to the hard drive, from within the live environment -- and that is what I needed to do (see below).

Ubuntu 16.04 LTS disk layout

My aim with this installation is to have:

  • Microsoft Windows 10 / Ubuntu Linux 16.04 LTS dual boot

  • UEFI Secure Boot enabled (so Microsoft Windows 10 will run)

  • Ubuntu 16.04 LTS installed on an encrypted partition (because it is a laptop, and will be used for travel)

Unfortunately it turns out that this complexity is beyond the standard Ubuntu 16.04 LTS Installer -- ie the one you get with "Install Ubuntu" -- so various manual setup is required. (The default installer will let you have either (a) encrypted LVM by itself on the drive (wiping everything else) or (b) dual boot with Windows -- but not both. This may be partially a limitation of the Desktop installer... as I found hints that maybe the other installers were more cooperative; but I had already downloaded the Desktop installer and did not feel like, eg, downloading the Alternative installer in the hope of more options.) Sadly hunting for answers turns up a lot of older information, so I tried to stick to Ubuntu Linux 14.04 guides on the assumption they would be more relevant to Ubuntu Linux 16.04.

The recommended approach is to boot Ubuntu as a live environment -- "Try Ubuntu without installing" -- and then use that live environment to set up the disk before invoking the "Install Ubuntu 16.04 LTS" option. This requires a series of manual steps.

When taking this approach on the Dell XPS 13 (9360) the first thing to note carefully is:

  • /dev/sda will be the USB drive you booted from

  • There is no /dev/sdb

  • The drive you want to install onto is /dev/nvme0n1, because the modern Linux device naming convention aims for "stable device names" at the expense of ease of use.

  • There are partitions on that /dev/nvme0n1 with the suffix pN, for each partition, eg, /dev/nvme0n1p1, /dev/nvme0n1p2 and so on.

  • The free space created above (by shrinking the Windows partition) will be in the middle of the drive -- between /dev/nvme0n1p3 and /dev/nvme0n1p4.

  • The EFI boot partition is at the start, as /dev/nvme0n1p1.

  • The easiest way to check all of this is:

    sudo parted --list

    which will discover and list all storage devices and their partitions. Once you find the right one, it is more useful to do:

    sudo parted /dev/nvme0n1 print

    and avoid the confusion of the other devices.
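The naming rule described above (a "p" before the partition number when the disk name itself ends in a digit) can be captured in a few lines. partition_device is a hypothetical helper for illustration, not an existing tool:

```shell
# partition_device DISK NUMBER: print the device name of a partition
# (hypothetical helper following the kernel naming convention)
partition_device() {
    case $1 in
        *[0-9]) echo "${1}p${2}" ;;   # eg /dev/nvme0n1 -> /dev/nvme0n1p7
        *)      echo "${1}${2}" ;;    # eg /dev/sda     -> /dev/sda1
    esac
}

partition_device /dev/sda 1
partition_device /dev/nvme0n1 7
```

It is mostly useful as a reminder when adapting commands written for /dev/sdX style names to the NVMe drive.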

The closest guide to what I wanted recommended creating two partitions for Linux -- one for /boot, to be unencrypted, and one for an encrypted LVM volume that contains everything else. The /boot partition needs to be 512MB-1GB, so that there is room for multiple kernels; I chose 512MB because I was tight on space already (and know from past experience that with Ubuntu's rate of kernel version number churn, 256MB is too tight). The LVM partition can be the remainder of the free space.

To actually partition the drive, it is easier to use the gparted GUI rather than fight with the command line options. Run with:

sudo gparted /dev/nvme0n1

and verify that the disk partition map shown looks (essentially) the same as you saw in Windows 10.

Then create the partitions by:

  • Highlighting the "unallocated" 140GB

  • Choosing Partition -> New

  • For the first partition choose:

    • Free space preceding (MiB): 0

    • Free space following (MiB): 512 (leaving room for /boot after it)

    • Create as: primary partition (yay GPT, and many primary partitions)

    • Partition name: Ubuntu

    • File system: unformatted

    with the partition size auto-calculated to use all but 512 MiB of the unallocated space, and the Label: left empty since we are not making a file system. Once it looks okay, click "Add" and verify the "New Partition #1" appears where you expected.

  • For the second partition choose:

    • Free space preceding (MiB): 0

    • New size (MiB): 512

    • Free space following (MiB): 0

    • Create as: primary partition

    • Partition name: UbuntuBoot

    • File system: ext2

    • Label: UbuntuBoot

    with the hope that the Partition Name/Label will help us identify this boot partition later. Click "Add" and verify that "New Partition #2" appears where you expected it.

At this point you should have a "New Partition #1" of 139.50 GiB, and a "New Partition #2" of 512 MiB, assuming you shrank the Windows partition by the same amount described above.
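Those sizes follow directly from the 140GiB freed earlier. A small sketch of the arithmetic, with mib_to_gib being an illustrative helper that truncates (rather than rounds) to two decimal places:

```shell
unallocated_mib=$(( 140 * 1024 ))   # free space left by the Windows shrink
boot_mib=512                        # reserved at the end for /boot

luks_mib=$(( unallocated_mib - boot_mib ))

# mib_to_gib MIB: print GiB with two decimals, integer arithmetic only
mib_to_gib() {
    printf '%d.%02d\n' $(( $1 / 1024 )) $(( $1 % 1024 * 100 / 1024 ))
}

echo "New Partition #1: $(mib_to_gib "$luks_mib") GiB"
echo "New Partition #2: ${boot_mib} MiB"
```

If you shrank Windows by a different amount, changing unallocated_mib shows what gparted should report.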

Assuming everything looks okay, click on the green tick icon at the top to "Apply All Operations", and then after double checking what will be done confirm you are sure you want to Apply the changes. It should fairly quickly tell you that all operations completed successfully, and then you will have two more partitions: /dev/nvme0n1p7 (139.50GiB) and /dev/nvme0n1p8 (512MiB). Once more:

sudo parted /dev/nvme0n1 print

should confirm those partitions exist.

Now it is possible to set up the encrypted container:

sudo cryptsetup luksFormat --cipher aes-xts-plain --key-size 512 \
             --hash sha512 --iter-time 2000 /dev/nvme0n1p7

being extra careful to choose the larger newly created partition (since otherwise it will overwrite data you want to keep). When you are certain it is the newly created partition you have specified, type "YES" to overwrite it. (Here a 512 bit key means AES256, because XTS mode splits the key into two 256 bit halves.)

You need to enter a passphrase for the container (twice), without which the container will be inaccessible. Be sure it is a passphrase you will remember, but will not be easily guessed by someone else. Long but memorable is good :-) (With LUKS the passphrase only unlocks a key slot that holds the actual volume key, so it can be changed later, eg with cryptsetup luksChangeKey; but choose carefully anyway.)

The format phase completes pretty quickly, as it is effectively a "quick format". Before carrying on, we want to open the LUKS volume, and then overwrite the volume with zeros (forcing encrypted data to be written out to the underlying partition).

To open the volume:

sudo cryptsetup luksOpen /dev/nvme0n1p7 dell

and then enter your passphrase again (hopefully you still remember it!).

It is useful to use the pv (pipe viewer) tool to be able to monitor the progress of overwriting the disk. That is not available on the Live CD, but is available in the Ubuntu repository. Ideally one would just turn on the Wifi, and install it, but for some reason the live CD was unable to connect to my Wifi (yet the Install mode was able to connect fine). So I downloaded the package manually, picking the latest version which was before the 2016-04-xx release date of Ubuntu 16.04 LTS. Then copied that file over on another USB drive, and installed it with:

cd /media/ubuntu/...       # USB drive auto-mounted location
sudo dpkg --install pv_1.5.7-2_amd64.deb

Then zero the volume in a root shell

pv -tprebB 16m /dev/zero | sudo dd bs=16M of=/dev/mapper/dell

where the pv options mean:

  • -t: enable timer mode

  • -p: enable progress bar

  • -r: enable the rate counter

  • -e: enable guessing the estimated completion time

  • -b: turn on bytes copied counter

  • -B 16m: use a 16MiB transfer buffer (for efficiency)

and the target device is the open LUKS volume (not the raw underlying partition), so that all the writes go through LUKS, and thus get encrypted. Note that the 16m for pv is in lower case and the 16M for dd is in upper case, to make both tools happy.

On my system this seemed to proceed at about 250MiB/s, and thus would take several minutes but under an hour to complete; I went off to do something else at this point. It was done by the time I came back, reporting that it finished in under 9 minutes -- I guess there are some advantages to a fast processor, fast storage, and small disks. (In my day job I am more used to many terabyte RAID arrays, which take literally days to initialise.)
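
As a rough sanity check on that timing, the expected duration is just the volume size divided by the write rate. A minimal sketch of the arithmetic, where the 128GiB size is a hypothetical round figure rather than the exact size of the partition:

```shell
# Estimate how long zeroing a volume takes: size / rate.
# Usage: estimate_zero_time SIZE_GIB RATE_MIB_PER_SEC
estimate_zero_time() {
    size_mib=$(( $1 * 1024 ))      # GiB -> MiB
    secs=$(( size_mib / $2 ))      # whole seconds, rounded down
    printf '%dm%02ds\n' $(( secs / 60 )) $(( secs % 60 ))
}

# Hypothetical 128GiB volume at the ~250MiB/s observed above
estimate_zero_time 128 250
```

For those example numbers the estimate comes out a little under 9 minutes, which is in line with what pv reported.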

Once that is done it is possible to set up LVM within the encrypted volume, with two logical volumes -- one for root, and one for swap. I allowed 2GiB for swap, which should be plenty to allow RAM overcommit, but if the system ends up swapping beyond that there's probably something badly wrong. (If you want to hibernate the system you would need more swap space I think.) To set up LVM:

sudo pvcreate /dev/mapper/dell
sudo vgcreate vg /dev/mapper/dell
sudo lvcreate -n root -L 137.5G vg
sudo lvcreate -n swap -l 511 vg       # 2G - 1 extent: remaining extents
sudo vgdisplay -v vg

And then initialise those two new filesystems:

sudo mkfs.ext4 -b 4096 -j -L root -v /dev/vg/root
sudo mkswap -L swap /dev/vg/swap
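
The 511 in the swap lvcreate line above is a count of LVM extents rather than a size. The vgcreate command was not given an explicit extent size, so the LVM default of 4MiB should apply; assuming that, the arithmetic is roughly:

```shell
# Convert a number of LVM extents to MiB.
# Usage: extents_to_mib EXTENTS [EXTENT_SIZE_MIB]
extents_to_mib() {
    echo $(( $1 * ${2:-4} ))       # 4MiB is the LVM default extent size
}

extents_to_mib 512    # a full 2GiB
extents_to_mib 511    # 2GiB less one extent, as used for swap above
```

The "Free PE" figure in the vgdisplay output is the place to check how many extents are actually left over on a given system.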

Installing Ubuntu 16.04 LTS

Now it is possible to launch the Ubuntu Linux 16.04 LTS installer from within the Live CD, by double clicking on the "Install Ubuntu LTS" icon (top left).

The Ubuntu 16.04 LTS installer asks for your install language, and your Wifi network/credentials -- the joys of computers without Ethernet ports bundled! Fortunately the password entered there actually works; it is unclear why the same password given to Network Manager does not work.

The installer then offers to let you install third party modules, but only if you turn off Secure Boot -- ah the joys of kernel/module signing :-(. I declined the third party modules, since my Wifi clearly works without third party modules and generally the Dell XPS Ubuntu Linux install guides suggest that by Ubuntu 16.04 LTS pretty much everything works "out of the box"; I would rather have Secure Boot enabled, to make Windows/Linux dual booting easier.

At the Installation Type (install disk) prompt, you need to choose "Something else", so that you can tell the installer about the partitions that you have carefully prepared for it. You want to select:

  • /dev/mapper/vg-root as an ext4 file system, to use for /; which the installer is allowed to format (that seemed to be necessary before "Install Now" was activated).

  • /dev/mapper/vg-swap as a swap disk, which does not need to be initialised.

  • /dev/nvme0n1p8 (the unencrypted partition made earlier) as ext2, to be /boot, which the installer is allowed to format. (I had earlier tried to make that Fat32 on the assumption the UEFI booting will need a Fat32 partition to boot from, but the Ubuntu Linux installer will not accept /boot on a Fat32 partition, so I changed it to ext2 based on the advice that journalling cost 30MB. In hindsight there is already an EFI boot partition on the disk, which is FAT32, so /boot just needs to be readable by grub without unlocking the encryption.)

To set the use for each one highlight the partition then click "Change". Be very careful not to select any of the NTFS partitions, or the raw partition underneath the LUKS encryption.

I left the "Device for boot loader installation" set to /dev/dm-0 based on this post recommending not overwriting the MBR with grub when booting via UEFI. (It also contains the advice to disable Windows Fast Startup, and an explanation of how it works and why to disable it -- detail I wish I had realised before getting this far into the Linux install.)

/dev/dm-0 is /dev/mapper/dell, ie inside the LUKS encrypted container. That seems a safe enough place to install an MBR record, but unlikely to have any effect on booting. (If necessary to reinstall grub to the MBR, this post explains how to convert to UEFI booting with grub.)

After double checking the disk partitions, click on "Install Now". It will prompt that it is going to overwrite the partitions you selected to format (/dev/mapper/vg-root for /; /dev/mapper/vg-swap; and /dev/nvme0n1p8 for /boot). Check they are the partitions you intended and then click on "Continue". (In my case it also needed to update the partition table to convert partition #8, ie /dev/nvme0n1p8 from fat32 to ext2, as I had originally created it as fat32 earlier due to a misunderstanding. Actually that fat32 partition is the EFI partition which needs to end up mounted at /boot/efi, and the Ubuntu installer seems to do that automatically.)

There is a pause while the disk is updated, and then it asks for your location (eg, for the timezone), which to my surprise guessed correctly (my assumption is based on network-driven geolocation). It then asked for the keyboard layout, before asking for the user account to create. There is an option to "encrypt home folder" which I did not select, because the entire install is on an encrypted disk.

Once you click "Continue" after the user account prompt, Ubuntu Linux 16.04 LTS will actually install. Complete with the usual "captive audience" in-installation advertising. At least some of the install appears to be downloaded from the Internet (rather than coming off the install CD).

Once the installer finishes, click on "Continue Testing" to drop back to the live environment, so as to be able to configure it to actually boot.

Configuring Ubuntu Linux 16.04 LTS to boot

After the install finishes mount all the partitions used in the install under /mnt:

sudo mount /dev/vg/root /mnt
sudo mount /dev/nvme0n1p8 /mnt/boot
sudo mount --bind /dev /mnt/dev

and then transition into that environment, and mount some more useful virtual filesystems:

sudo chroot /mnt
mount -t proc proc /proc
mount -t sysfs sys /sys
mount -t devpts devpts /dev/pts
mount /boot/efi

Within the chroot environment, create /etc/crypttab:

LUKS=dell     # the name given to luksOpen earlier
VG=vg         # the volume group created earlier
UUID=$(sudo blkid /dev/nvme0n1p7 | cut -f 2 -d '"')

echo "# TARGET SOURCE KEYFILE OPTIONS" | tee /etc/crypttab
echo "${LUKS} UUID=${UUID} none luks,retry=1,lvm=${VG}" | tee -a /etc/crypttab

# Check your work
cat /etc/crypttab

And the file /etc/initramfs-tools/conf.d/cryptroot:

echo "CRYPTROOT=target=${LUKS},source=/dev/disk/by-uuid/${UUID}" | tee /etc/initramfs-tools/conf.d/cryptroot

# Check your work
cat /etc/initramfs-tools/conf.d/cryptroot

Then run:

update-initramfs -k all -c

to update the initial ramdisk to have those boot options.
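
The cut in the UUID= line above relies on the UUID being the first double-quoted value in the blkid output. A small sketch of that parsing, run against a made-up sample line (the UUID below is a placeholder, not a real one):

```shell
# Extract the first double-quoted value from a line of blkid output,
# in the same way as the cut command used above.
first_quoted() {
    cut -f 2 -d '"'
}

# Hypothetical blkid output; the UUID is a placeholder
sample='/dev/nvme0n1p7: UUID="01234567-89ab-cdef-0123-456789abcdef" TYPE="crypto_LUKS"'
printf '%s\n' "$sample" | first_quoted
```

If the UUID were ever not the first field, blkid -s UUID -o value /dev/nvme0n1p7 should print just the UUID without needing cut at all.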

Finally edit /etc/default/grub to update the kernel command line so that it reads:

GRUB_CMDLINE_LINUX="cryptopts=target=${LUKS},source=/dev/disk/by-uuid/${UUID},lvm=${VG}"

with those values expanded out, and the double quotes present. Eg:

echo "cryptopts=target=${LUKS},source=/dev/disk/by-uuid/${UUID},lvm=${VG}" | tee /tmp/cmdline

vi /etc/default/grub
# r /tmp/cmdline
# and adjust into place between the double quotes

Check the values make sense with:

grep GRUB_CMDLINE_LINUX /etc/default/grub

then run:

update-grub

to update the grub boot time configuration files. (There will be some errors about /run/lvm/lvmetad.socket connection failures, which I think can be ignored at this point.)
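
Because the values need to be expanded out by hand, it may be worth confirming (ideally before the grub configuration is regenerated) that no literal ${...} placeholders were left behind in /etc/default/grub. A minimal sketch of such a check; the function name is made up for illustration, and it takes the file as an argument so it can be tried on a scratch copy first:

```shell
# Succeed only if FILE has a GRUB_CMDLINE_LINUX= line that contains
# cryptopts= and has no unexpanded ${...} placeholders left in it.
check_grub_cmdline() {
    [ -r "$1" ] || { echo "cannot read $1"; return 1; }
    line=$(grep '^GRUB_CMDLINE_LINUX=' "$1") || { echo "no GRUB_CMDLINE_LINUX line"; return 1; }
    case "$line" in
        *'${'*)       echo "unexpanded variable: $line"; return 1 ;;
        *cryptopts=*) echo "ok: $line" ;;
        *)            echo "no cryptopts: $line"; return 1 ;;
    esac
}

check_grub_cmdline /etc/default/grub || echo "check failed"
```

This only looks at the shape of the line; it cannot tell whether the UUID itself is the right one, so comparing against the blkid output by eye is still worthwhile.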

For completeness, check that there is actually an "ubuntu" entry on the EFI boot partition:

ls /boot/efi/EFI/ubuntu

and that it has sensible EFI files on it, including shimx64.efi, which is the file needed to kick start UEFI Secure Boot into grub.

At this point, if you are very lucky, Ubuntu Linux will boot automatically; if it does not, you get to boot off the Live CD again, and try to fix up the boot environment.

Unmount the partitions you mounted, and exit out of the chroot:

umount /boot/efi
umount /boot
umount /dev/pts
umount /proc
umount /sys
exit                 # leave the chroot, back to the live environment
sudo umount /mnt/dev
sudo umount /mnt

Then use the cog in the upper right to choose Shutdown -> Restart.

All going well, the system should restart into grub, which should then have an option to boot Ubuntu. When Ubuntu starts booting it should prompt to unlock the LUKS device ("dell" in my case), and when that password is entered, the system should boot to a graphical login window. In my case, to my complete surprise, this just worked the first time. (At minimum I was expecting it to boot Windows instead, due to not having disabled Fast Startup.)

Assuming you reach the login window, log in and check that the system is running as expected:

mount | egrep "root|boot"
swapon -s

The "swapon" output probably says /dev/dm-2 if you followed the instructions above; you can check which logical volume that is by:

ls -l /dev/mapper | grep dm-2

and it should turn out to be vg-swap.
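
The entries in /dev/mapper are normally symlinks to the dm-N device nodes, so the same lookup can be done for every volume at once. A sketch, assuming readlink is available (as it normally is on Ubuntu); the directory argument is only there so that the function can be tried against a test directory:

```shell
# Print "NAME -> TARGET" for each symlink in a device-mapper directory.
# Usage: list_dm_links [DIR]    (DIR defaults to /dev/mapper)
list_dm_links() {
    for link in "${1:-/dev/mapper}"/*; do
        [ -L "$link" ] || continue          # skip non-symlinks (eg, control)
        printf '%s -> %s\n' "${link##*/}" "$(basename "$(readlink "$link")")"
    done
}

list_dm_links
```

With the layout described above, one of the output lines should pair vg-swap with the device name that swapon reported.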

Assuming it all looks okay, try rebooting again to make sure that works, and also try shutting down/powering off, and powering on again.

Thanks to the Ubuntu encrypted LUKS/LVM guide for all the directions in getting this going.

Adding Windows back into the grub boot menu

The main issue that I noticed with the way that I went through the install is that after installing Ubuntu Linux 16.04 LTS, there was no longer an option to boot Microsoft Windows 10 -- instead grub is started automatically, despite trying to avoid overwriting the boot record.

I was able to get back into Microsoft Windows 10 by going to the "one-time boot menu" (F12 when the Dell logo is displayed), and choosing the "ubuntu" option, which then booted Windows (surprise!). (If this did not happen then it probably would have been necessary to boot the Windows recovery USB drive made earlier.)

I believe both of the issues above happened due to having left Windows Fast Startup enabled when I installed Linux (because I did not realise how important it was to turn it off until half way through the Ubuntu install). Thus (a) Microsoft Windows 10 did not end up in the grub boot menu, and (b) the UEFI boot menu was set to boot up Microsoft Windows.

To disable Fast Startup in Windows 10:

  • Go to Control Panel -> Hardware and Sound -> Power Options

  • Click on "Choose what the power buttons do"

  • Click on "Change settings that are currently unavailable"

  • Untick "Turn on fast startup (recommended)", so it is disabled.

  • Click on "Save changes" at the bottom, then exit the Control Panel.

Then shut down the computer so that Windows does a full shutdown, and powers off.

Power the computer back on again. grub should launch automatically. Boot Ubuntu Linux and log in. Once logged in, open a terminal and run:

sudo update-grub

and grub should find the Linux boot volumes, and also find the Windows volume automatically this time (ie, now that Fast Startup is turned off).

Restart again. Let grub launch. There should be a "Windows Boot Manager" option in the grub menu. Select that and verify that it boots Windows. (Due to the Ubuntu/grub background being non-black, it does not look quite as elegant as when booted directly from the Dell prompt, but it does work.)

Restart again, and verify that Ubuntu Linux still boots. Power the system off, and on again, and verify that Microsoft Windows 10 will boot from the grub menu. Power the system off again and verify that Ubuntu Linux will boot up.

At this point, the "dual boot" installation is basically complete. What remains is getting the two systems to share the hardware nicely... starting with the hardware clock (Windows assumes the hardware clock holds local time; Linux assumes it holds UTC :-( ). And installing/configuring all the applications in each OS. All of which is beyond the scope of this (very long) post :-)

Posted Sun Dec 18 10:50:05 2016 Tags:

Last Updated: 2016-10-29


As I found last month while setting up an OS X Server, OS X 10.9 (Mavericks) is no longer supported by Apple -- since approximately the time that macOS Sierra (10.12) was released (it appears Apple only actively supports three releases). This also resulted in a lot of third party vendors rolling out new software releases without support for OS X 10.9 (Mavericks): Adobe, VMware, etc.

So it was clear that I needed to upgrade my last OS X 10.9 system fairly soon, so that it could have current security updates, and also so that current software could be installed (I already had 2-3 updates that I could not install due to an unsupported OS). Unfortunately my last OS X 10.9 system was also a system that I used all day, every day, so being without it for an extended period would be disruptive....

This weekend is the last "long weekend" in New Zealand before the end of the year, and originally seemed like the best opportunity to do the upgrade without interruptions. It did not quite turn out like that due to CVE-2016-5195 and being in the middle of dealing with multiple customer issues (including network latency). But knowing that I did not have another option in the near future, and following the success setting up OS X 10.11 El Capitan on a client and server it still seemed worth going ahead. Particularly since it happened to be a weekend just after my Internet bandwidth usage cap just reset, so I had less need to be concerned about the volume of downloads (around 40GB downloaded in two days, mostly for this upgrade -- the rest being a few hours of streaming TV).

This hardware is actually capable of running macOS Sierra, but I chose to go with OS X 10.11 (El Capitan) because I was familiar with setting it up, and because macOS Sierra is still quite new and applications software is still catching up.


I started preparing for the update a few weeks ago, by checking all the commercial software I had for what was supported on OS X 10.9 (Mavericks) through OS X 10.11 (El Capitan). Most of the software that I had was supported on both, but not always with the same application version. In some cases I either had already purchased an upgrade to a version including support for OS X 10.11 (El Capitan) or there was a free upgrade available; in other cases the only way to go from the version I was running to a version that would work on OS X 10.11 (El Capitan) was to purchase an upgrade.

Where possible I upgraded applications to versions that were supported on OS X 10.9 and OS X 10.11 in advance of upgrading to OS X 10.11 (El Capitan). This was particularly useful with key tools like Alfred, where I had a free upgrade to Alfred 3. But also tools like SuperDuper! have been very good about updating for each new OS version.

I also made a list of the applications for which paid upgrades would be required, and their cost. The main two affecting me are:

  • VMWare Fusion -- I needed to go from 6 Pro to 8.5 Pro (6 does not support 10.11; 8.5 Pro does not support 10.9)

  • OmniGraffle Pro -- I needed to go from 5 Pro to 6 (which supported both 10.9 and 10.11), but unfortunately did not purchase the upgrade in time so can only upgrade to 7 (which does not support OS X 10.9)

In addition I made lots of backups -- not just the two Time Machine backups that happen automatically, but also two clones (with SuperDuper!) of the whole hard drive (one before the SuperDuper! upgrade, and one after) and also additional backups of the key things on my hard drive. Just in case. Two of the backups were to external drives that I usually leave detached; and I also detached the external drive with my Time Machine backups during the upgrade.

I also identified about 30GB of files that could be "temporarily" removed (ie, relying only on the backup copies -- I made extra copies of those ones) to ensure that there was about 50GB of free space during the whole upgrade process.
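
Checking the free space from a terminal is straightforward with df. A sketch, where the 50 is the free space target mentioned above (treated here as GiB) and the function name is made up for illustration:

```shell
# Succeed if the filesystem holding PATH has at least MIN_GIB free.
# Usage: has_free_gib PATH MIN_GIB
has_free_gib() {
    # -P gives one POSIX-format line per filesystem; -k reports 1024-byte blocks
    avail_kib=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $(( $2 * 1024 * 1024 )) ]
}

if has_free_gib / 50; then
    echo "enough free space"
else
    echo "need to free up more space"
fi
```

The parsing assumes the filesystem name in the first column contains no spaces, which holds for a typical boot volume device name.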

OS X Upgrade

Before starting the upgrade, I installed the last few OS X 10.9 updates and then rebooted OS X 10.9 -- to ensure that the installer was running on a freshly booted system. Unfortunately that boot... failed to complete (grey screen, with a spinning cursor). I left it 15 minutes, before giving up -- powering it off (holding the power button down), then powering it on again. On the next boot (after entering the key to unlock the disk) I got told the machine had failed -- and chose not to restore the open applications. To be sure, I rebooted the machine again -- which worked fine. It seemed safe to proceed. I removed the 30GB of sacrificial files, and checked that there was 50GB free.

I chose to do this upgrade from within the OS (rather than booting off the USB key I made earlier), because the drive was encrypted and I wanted to retain the drive contents (thus a USB boot was not necessary). To start the upgrade:

  • Run /Applications/Install OS X El

  • Agree to the license terms

  • Agree to install to "Macintosh HD"

  • Click on the "Restart" button in the application

The tool reported that it would take about 26 minutes, and reboot multiple times -- which turned out to be basically true (it took around 30 minutes, and rebooted several times). The progress messages were updated somewhat sporadically, but were relatively useful in tracking how far the upgrade had got. There was one period where the fans spun up to full speed, but that settled down again after a couple of minutes (I assume either a lot of compression/crypto work happening, or perhaps temperature control firmware changes).

After the base OS upgrade completed, it booted into an abbreviated form of the OS X 10.11 (El Capitan) setup pages -- asking about my Apple ID (which I chose not to provide at that step), and whether I wanted to share diagnostic/usage data with Apple (no thanks).

Then a fairly normal OS X 10.11 (El Capitan) desktop came up, mostly with my preferences for that laptop. The main surprises were:

  • Calendar showed (6) "sync" conflicts, but it was unclear where it was syncing to, in order to find those conflicts; most of the records looked basically the same, so I chose a winner somewhat arbitrarily (favouring my phone calendar first, then the OS X Calendar, over the iPad calendars -- as that is my experience with what I use to enter events, and also what most robustly handles recurring/day spanning events; the iPad does poorly at both, particularly "all day" events which basically forever have gradually crept from one day to multiple days :-( ).

  • Apple's Mail also auto-launched, to my surprise -- but did show new mail, so clearly its preferences were okay. The main thing I noticed is that my Smart Folders were missing almost all the mail they should have had, but I assumed that was because the Spotlight database had been wiped and was being rebuilt (they did repopulate overnight, I assume once the Spotlight reindex completed).

At this point I forced a Time Machine backup (to my OS X Server, as I was not going to connect external drives yet), before carrying on. It had 33GB to back up. I let that finish before the next steps.

Installing OS X Updates

The next step was to install the Apple patches that had been released since the "Install OS X El" had last been released. I installed the OS X Security Update first (with a restart), and then Safari (another restart; unclear why they cannot be scheduled to be updated together).

Once those were done I installed the application updates that were present:

  • XCode (4GB)

  • iMovie (2GB)

  • iPhoto (2GB) -- to the last released version, AFAICT, from 2015 (supposedly makes migration to the Photos app easier)

  • Aperture 3.6 (0.6GB), from 2014 (not sure if it will run under OS X 10.11 El Capitan, as the last supported version was supposed to be 10.10; but I basically gave up trying to use Aperture when Apple discontinued it so it does not matter either way)

ETA, 2016-10-26: Then just a couple of days after I had the last set of security updates, another base OS X Security Update (10.11.6) came out -- in (partial?) mitigation of issues found with XNU task_t (it is not clear if they are sufficiently mitigated in 10.11.6; the blog post implies that only the macOS 10.12 Sierra refactoring significantly helped mitigate the root cause). So that was two more reboots -- for the 10.11.6 security update, and another Safari update.


My next step was to run sudo port selfupdate to see what the status of MacPorts was like. It turns out that MacPorts checks the OS version and refuses to run:

Error: Current platform "darwin 15" does not match expected platform "darwin 13"

which then directs you at the MacPorts Migration page.

The Migration process is:

  • Install the new version of XCode (done, above)

  • Ensure the XCode command line tools are installed, and recognised:

    xcode-select --install
    sudo xcodebuild -license

    by choosing "Install" (rather than all of XCode), and agreeing to the license (typing "agree" at the end).

  • Test that the compiler now works, eg with a simple "Hello World" application

  • Download and install the latest MacPorts base package for your OS version (which fortunately I already had handy from my previous OS X 10.11 El Capitan installs); after which "sudo port ..." commands at least ran without errors

  • Remove all ports, keeping a list of what they were:

    port -qv installed | tee myports.txt
    port echo requested | cut -d ' ' -f 1 | tee requested.txt
    sudo port -f uninstall installed
    sudo rm -rf /opt/local/var/macports/build/*

  • Reinstall all ports, using the restore_ports.tcl script:

    chmod +x restore_ports.tcl
    sudo ./restore_ports.tcl myports.txt
    sudo port unsetrequested installed
    cat requested.txt | xargs sudo port setrequested
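
For the compiler test step in the list above, something as small as the following should be enough. This is only a sketch, assuming the command line tools provide a cc command; the function name is made up for illustration:

```shell
# Compile and run a trivial C program to confirm the compiler works.
check_compiler() {
    dir=$(mktemp -d "${TMPDIR:-/tmp}/hello.XXXXXX") || return 1
    cat > "$dir/hello.c" <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello World\n");
    return 0;
}
EOF
    cc -o "$dir/hello" "$dir/hello.c" && "$dir/hello"
    status=$?
    rm -rf "$dir"
    return $status
}

check_compiler || echo "compiler check failed"
```

If the license has not been agreed to, or the command line tools are missing, the cc step is where it would be expected to fail.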

Because I had about 1700 ports installed (not all of them active; I usually keep the previous active version of each port around, just in case of problems, so that I can roll back)... that was always going to take quite a while.

It also failed in various ways, so I was glad that I had run it in a script session and could review the failures. The main ones seemed to be:

  • Failures due to packages that had been replaced (eg gnome-icon-theme with gnome-theme-standard) where the maintainer decided the correct solution was to throw an error, rather than install a transitional package which depended on the new package :-(

  • Failures due to packages that had been superseded, with the same "throw an error rather than install a transitional package" issue (Debian does this so much better). Perl 5.16 (p5.16) was an especially common case of this for me.

  • Perl 5.22 (p5.22) packages failing to build and/or download, claiming they were unable to download the dist files from anywhere (possibly a transient network error, or possibly related to the logged reports of too many open files).

  • There is also a Perl 5.24 (p5.24) which seems more relevant now.

At the (failed) end of the first attempt about 240 ports (of 1700 installed before) had been reinstalled.

I get the impression that restore_ports.tcl had not really been tested with, and fixed for, large numbers of ports. And/or tested with lists of ports that are several years old. In particular it almost certainly should not be trying to install inactive ports, as either (a) there is an active port with the same name that is needed instead, or (b) the port is not being used. All my Perl 5.16 (p5.16) ports were inactive.

Ignoring the inactive ports, there were around 1000 active ports to reinstall; and ignoring Perl 5.22 (p5.22), which I am happy to consider replaced by Perl 5.24 (p5.24), there are about 800 ports to install.

In the end I restarted the port install process with:

 sudo port install gnome-theme-standard

which pulled in quite a few dependencies. Then I built a variation of myports.txt without p5.16 or p5.22 ports in it, including only the active ports, and tried again with that:

 egrep -v 'p5.16|p5.22' myports.txt | grep active | tee myports-active.txt
 sudo ./restore_ports.tcl myports-active.txt

that did some cleanup for the already installed ports, and then started installing with the p5.24 ports, before carrying on to install other ports.
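
To illustrate what that filter keeps, here is the same pipeline applied to a few made-up lines in roughly the format that port -qv installed produces (the versions and details below are placeholders, not taken from the real list):

```shell
# Keep only active ports, dropping the p5.16 and p5.22 Perl modules
# (grep -E is equivalent to the egrep used above).
filter_ports() {
    grep -Ev 'p5.16|p5.22' "$1" | grep active
}

# Hypothetical sample of a saved port list
sample=$(mktemp "${TMPDIR:-/tmp}/myports.XXXXXX")
cat > "$sample" <<'EOF'
  jpilot @1.8.2_0 (active) platform='darwin 13' archs='x86_64'
  jpilot @1.8.1_0 platform='darwin 13' archs='x86_64'
  p5.16-xml-parser @2.410.0_0 platform='darwin 13' archs='x86_64'
  p5.22-xml-parser @2.440.0_0 (active) platform='darwin 13' archs='x86_64'
EOF

filter_ports "$sample"     # only the active jpilot line survives
rm -f "$sample"
```

One caveat is that a plain grep active would also keep an inactive port that merely has "active" in its name; matching on '(active)' instead would be stricter.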

That got most of the way through, before failing out again with "too many open files" -- so I had to run:

 sudo ./restore_ports.tcl myports-active.txt

a second time to finish the missed ports. The second run with myports-active.txt ran to completion with few errors. I ran:

 sudo ./restore_ports.tcl myports-active.txt

again to check, which showed two packages that failed to install: py-dateutil and jpilot. For both of those sudo port install PACKAGE failed.

For py-dateutil:

sudo port clean py-dateutil
sudo port install py-dateutil

was sufficient to get it going.

For jpilot, which is still important to me, the configure script was failing due to a missing dependency on Perl XML::Parser. That turned out to be because the Perl being used was MacPorts Perl 5.22, but I had skipped installing the p5.22 packages because they were all failing, intending to switch to Perl 5.24. However I had omitted to actually switch to Perl 5.24, and it is not the default (yet). Since now seemed like as good a time as any to switch (and the port select framework does not work for Perl), I followed the magic incantation and reinstalled the perl5 package:

sudo port install perl5 +perl5_24

which then changed the /opt/local/bin/perl link to point at Perl 5.24 (as verified with perl -v).

After that, I was able to install jpilot again with:

sudo port clean jpilot
sudo port install jpilot

I ran:

sudo ./restore_ports.tcl myports-active.txt

one last time to be sure, and this time it completed without errors, getting to around 800 ports installed.

Finally I reset the "requested" flags appropriately with:

(egrep -v '^gnome-icon-theme|^p5-|^p5.16|^p5.22|git-core' requested.txt |
     sort | uniq | tee requested-kept.txt)
sudo port unsetrequested installed
cat requested-kept.txt | xargs sudo port setrequested

where that list of things to exclude was found somewhat iteratively, as port setrequested will fail if it is given something that is no longer installed.
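
An alternative to finding the exclusions by trial and error would be to keep only the requested ports that are still installed. A sketch of that idea, assuming a file with one installed port name per line (how that file is produced is left open here, and the port names in the example are just illustrative):

```shell
# Print the lines of REQUESTED that also appear, exactly, in INSTALLED.
# Usage: still_installed REQUESTED INSTALLED
still_installed() {
    grep -Fx -f "$2" "$1" | sort -u
}

# Hypothetical example data
req=$(mktemp "${TMPDIR:-/tmp}/requested.XXXXXX")
inst=$(mktemp "${TMPDIR:-/tmp}/installed.XXXXXX")
printf '%s\n' jpilot gnome-icon-theme irssi > "$req"
printf '%s\n' irssi jpilot perl5 > "$inst"

still_installed "$req" "$inst"     # prints irssi, then jpilot
rm -f "$req" "$inst"
```

The -F and -x options make grep treat each name as a fixed string that must match the whole line, so a name like p5 would not accidentally match p5-xml-parser.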

Overall the process took several hours, a lot of downloads, and some finicky manual attention to get back to the point of a mostly working set of ports.

I suspect the restore_ports.tcl script might benefit from either (a) explicitly closing files it does not need, (b) ensuring that the open file handles do not leak to children, or (c) ensuring that it reaps child processes promptly; I did not try to investigate which handle limit was being hit -- ie a process-wide one or a system-wide one. It would also clearly benefit from not trying to install ports which were not active before (the information needed is in the output it reads).
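
For what it is worth, the per-process limit on open files can at least be inspected from the shell before starting a long run. A minimal sketch; it is untested whether raising the limit would actually avoid the failure described above:

```shell
# Show the current soft limit on open files for this shell and its children
show_nofile_limit() {
    echo "open files (soft limit): $(ulimit -n)"
}

show_nofile_limit
# To raise it for this shell session, something like: ulimit -n 4096
# (4096 is an arbitrary example value, and must not exceed the hard limit)
```

On OS X the system-wide limits should be visible with sysctl kern.maxfiles kern.maxfilesperproc, which would help tell a process-wide limit apart from a system-wide one.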

ETA, 2016-10-25: I found that the active perl version had reset itself back to 5.22, which seemed to be due to "perl5 @5.22.2_0+perl5_22" for some reason becoming the active port, possibly related to the "setrequested" above, or possibly related to updating something else. To fix this I did:

sudo port activate perl5 @5.22.2_0+perl5_24
sudo port uninstall perl5 @5.22.2_0+perl5_22

so that only the desired "flavour" of perl5 was installed.

I found that I had to forcibly rebuild irssi (which uses perl as an extension module) to get it to pick up this change, as it appeared to have been built elsewhere with perl5 set to perl5.22:

sudo port upgrade -s -n --force irssi

(the "-s" forces a source rebuild; the "-n" limits the rebuild to just the named port; the "--force" ensures it gets rebuilt even if it is just the same version). After that my irssi helper scripts seemed to work again.


Once MacPorts was sorted out (above), which provides git, rsync, etc, for my git-annex install, I then upgraded git-annex by:

  • Moving the old version aside:

    cd /Applications/OpenSource

  • Opening the git-annex.dmg that I had downloaded recently, and dragging that into /Applications/OpenSource

  • Verified that my git-annex links in /usr/local/bin still worked:


    (which they did as they pointed at the same path that I had just put the new version into)

I then did a bit of testing with git-annex to make sure that it was working as expected -- eg, git annex sync in some existing repositories, etc. It seemed okay -- and I did not expect any problems given I had the same version working on another system, albeit with a different setup.

Once that seemed to be working, I forced another Time Machine backup to my Time Machine server.

VMware Fusion 8.5 Pro

Even though VMware state that "A simple upgrade to support a new OS shouldn't cost you", that appears to only be true if you have already purchased all the other upgrades. So instead the best I get is VMWare Upgrade Pricing being allowed to jump from VMware Fusion Pro 6 to VMware Fusion Pro 8.5 -- and in theory VMware Fusion Pro 8.5 supporting both OS X 10.11 El Capitan and macOS Sierra to allow a bit more wiggle room for future upgrades. (It appears to be no cheaper to upgrade from an earlier Pro version to the latest Pro version than it is to upgrade from any supported earlier version or the current non-Pro version to the current Pro version... which is an odd penalty for repeatedly upgrading to the Pro version.)

I purchased the upgrade earlier this week in anticipation of upgrading; it cost around AUD$100 plus tax -- so a moderately priced upgrade, at least given jumping multiple versions in one go.

After purchasing and downloading VMWare Fusion 8.5 Pro (which appears to be one download file for both versions, without bundled drivers), the upgrade process seems to be:

  • Verify the checksum of VMware-Fusion-8.5.0-4352717.dmg:

    shasum -a 256 VMware-Fusion-8.5.0-4352717.dmg

    (which should be 2a19b1fd294e532b6781f1ebe88b173ec22f4b76d12a467b87648cc7ff8920f1 according to the information on the VMWare download page at present)

  • open VMware-Fusion-8.5.0-4352717.dmg

  • Double click on the icon in the dmg to run the installer

  • Agree to running the "application downloaded from the Internet" (so long as you are happy with how you downloaded it)

  • Enter your password to give the installer administrative privileges

  • Agree to the terms and conditions

  • Choose "I have a license key for VMWare Fusion 8", and cut'n'paste the license key you have into the license key box (you should be rewarded with a yellow tick). Choose Continue

  • Enter your password again to allow the installer to proceed

  • Verify the "thank you" screen mentions "VMware Fusion Pro" if you purchased the professional license

  • Untick "Yes, I would like to help improve VMware Fusion" to reduce the amount of information leaked back to VMware

  • Click "Done"
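
The checksum comparison in the first step can be done by the shell rather than by eye. A sketch, assuming shasum is available (as it is on OS X), with a fallback to sha256sum for systems that only have that; the function name is made up for illustration:

```shell
# Compare a file's SHA-256 against an expected value.
# Usage: verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
    if command -v shasum >/dev/null 2>&1; then
        actual=$(shasum -a 256 "$1" | cut -d ' ' -f 1)
    else
        actual=$(sha256sum "$1" | cut -d ' ' -f 1)   # eg, Linux without shasum
    fi
    if [ "$actual" = "$2" ]; then
        echo "OK: $1"
    else
        echo "MISMATCH: $1 (got $actual)"
        return 1
    fi
}

dmg=VMware-Fusion-8.5.0-4352717.dmg
if [ -f "$dmg" ]; then
    verify_sha256 "$dmg" 2a19b1fd294e532b6781f1ebe88b173ec22f4b76d12a467b87648cc7ff8920f1
fi
```

The expected value is the one quoted in the list above; it should be re-checked against the download page for any other build.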

VMware Fusion 8.5 should launch, and VMware Fusion->About should show that it is "Professional Version 8.5.0 (4352717)" -- and if you had it installed before, your existing VMs should be present in the Virtual Machine Library.

There are no further prompts for where to install, etc; as far as I can tell it automatically installs into /Applications/VMWare -- complete with embedded space -- and the previous version seems to be ignored if it was located somewhere else.

I prefer to have my purchased applications in /Applications/Purchased if possible, and also to remove the older version (which will not run on OS X 10.11 El Capitan anyway), so I:

  • Went to /Applications/Purchased and trashed the old VMware

  • Dragged the new VMware from /Applications/ into /Applications/Purchased

Launch some of the VMs to make sure that they run correctly. On upgrading packages in some of my Ubuntu Linux VMs, I noticed reports that:

/etc/vmware-tools/ /usr/lib/vmware-tools/moduleScripts/ not found

a few times; I'm not sure if they were a result of an automatic update happening, or an error with the older VMware Tools on the newer VMware Fusion.

So after upgrading the packages, and then rebooting, I also went to force the VMware Tools upgrade with Virtual Machine->Reinstall VMware Tools which told me to "Mount the virtual CD in the guest, uncompress the installer, and then execute to install VMware Tools". On Ubuntu Linux, assuming it is already set up for building the VMware Tools, that is something like:

sudo mount /dev/sr0 /media/cdrom
sudo mkdir -p /usr/local/src/vmware-tools-distrib
cd /usr/local/src/
sudo tar -xpf /media/cdrom/VMwareTools-10.0.10-4301679.tar.gz
cd vmware-tools-distrib
sudo ./

But it turns out that open-vm-tools is (a) available for (at least) Ubuntu 14.04 LTS, and (b) recommended by VMware, as advertised in the installer.

So instead I uninstalled the existing VMware Tools by:

sudo /usr/local/bin/
sudo shutdown -r now

(The exact location of the uninstaller depends on the version: /usr/bin/ is mentioned in the VMware documentation.)

Once the system came back up again, I installed open-vm-tools:

sudo apt-get install open-vm-tools
sudo apt-get install open-vm-tools-desktop    # system with X11

and rebooted again to make sure they were running correctly on boot.

I also tested my wrapper script to launch VMs from Alfred, and they did successfully start the VMs -- but unfortunately they left the Virtual Machine Library window open. Previously I had a closewindow script to automatically close the Virtual Machine Library window after a short delay (via AppleScript), but even after fixing a capitalisation issue, that fails with:

31:96: execution error: VMware Fusion got an error: every window whose
name = "Virtual Machine Library" doesn't understand the "close" message. (-1708)

which is not very helpful -- and unfortunately there still does not seem to be an option to either (a) not show the Virtual Machine Library window or (b) automatically close it when a Virtual Machine is launched :-(

My "close window" script, which uses AppleScript, seems to have been broken by VMware removing AppleScript support in VMware Fusion 7 :-( It appears that vmware.sdef was removed in the release (GA) version of VMware Fusion 7, and has never reappeared -- with VMware Fusion 7, some people were able to make it work by restoring vmware.sdef from their VMware Fusion 6 install... but it is unclear if that also works for VMware Fusion 8.

After some hunting around I found that I could use vmrun to start the VMware Virtual Machine without the annoying Virtual Machine Library window appearing at all. Usage, at least in my case with the "custom" location of VMware is:

"/Applications/Purchased/VMware" start /VMDIR/VMNAME.vmx

where VMDIR is the directory containing the VMX file, and the rest is the name of the vmx file itself. It appears to work whether or not VMware Fusion is already running, and whether or not the VM is already running (without error; if it is already running then it just silently exits).
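
For reference, a minimal sketch of the sort of wrapper this allows. The function name start_vm and the VMRUN variable are my own illustrative choices, not anything VMware provides; VMRUN is assumed to hold the full path to the vmrun binary inside wherever your copy of VMware Fusion lives:

```shell
# Hypothetical wrapper around "vmrun start".
# VMRUN must be set to the full path of the vmrun binary in your install.
start_vm() {
    vmx="$1"
    if [ -z "${VMRUN:-}" ]; then
        echo "VMRUN is not set" >&2
        return 2
    fi
    if [ ! -f "$vmx" ]; then
        echo "No such VMX file: $vmx" >&2
        return 1
    fi
    # vmrun's "start" subcommand takes the path to the .vmx file
    "$VMRUN" start "$vmx"
}
```

Checking for the VMX file first means a mistyped VM name fails with an obvious message, rather than whatever vmrun itself reports.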

So I have changed my Automator Wrappers for Alfred to use that approach instead of the open VMDIR and closewindow approach I was using previously.

Adobe updates

Use the Adobe launcher to (a) upgrade itself, and then (b) install the outstanding application upgrades (which relied on being on OS X 10.10 or higher). Other than a few downloads being required, it was pretty easy to get back up to date.

OmniGraffle Pro

I have OmniGraffle Pro primarily for interaction with people who use Visio -- which means that I need OmniGraffle Pro. Version 5, which I already own, does not work on OS X 10.11 El Capitan, which means that I need to upgrade -- and the only upgrades available are paid upgrades to the current version, OmniGraffle 7 released earlier this month (OmniGraffle 7 Release Notes). Since OmniGraffle 7 did not work on OS X 10.9 Mavericks, I put off upgrading until after I had OS X 10.11 El Capitan working.

Upgrading OmniGraffle to version 7 is a matter of:

  • Going to the OmniGroup Store

  • Clicking on OmniGraffle for Mac

  • Entering your existing license owner and key to "View Upgrades"

  • It should offer a US$99.99 upgrade to OmniGraffle 7 Pro; click "Buy"

  • Pay for the license

  • (Hopefully!) receive email with license key in it

  • Download OmniGraffle 7, noting that there are fairly recent updates at present (the OmniGraffle 7.0.2 I downloaded a week or so back in preparation is already outdated by an OmniGraffle 7.0.3...)

  • open OmniGraffle-7.0.3.dmg

  • Agree to the license agreement

  • Trash the old OmniGraffle

  • Drag the new version into a suitable location, eg, /Applications/Purchased

  • Run the new version from Finder so as to be able to agree to running an application downloaded from the Internet

  • Close the "Welcome to OmniGraffle 7" window

  • Go to OmniGraffle->Licenses, and then choose "Add License..."

  • Cut'n'paste the Owner and License Key values from the email sent by the OmniGroup Store, and choose "Save"

  • Check that the "Thank You" window shows "You have licensed OmniGraffle 7 Pro", and that the "trial version expires" text has vanished from the top right.

After which hopefully OmniGraffle 7 Pro will work in an equivalent manner to OmniGraffle 5 Pro. (I suspect that the vendor stencils that I downloaded may have vanished, but I will leave figuring out how to get those going again for later.)

Backups, backups, test reboot

At this point I forced another Time Machine backup to both the external Time Machine drive and my Time Machine Server. Then I did a test reboot.



My printer driver seemed to work automatically, so I left that one alone.

Wacom Intuos Pro tablet

The Wacom Intuos Pro tablet drivers had been updated to 6.3.18-4 (released 2016-10-14) in the time since I installed them on the last system. And the system I was upgrading had a much older version (6.3.8 I think; it did not seem easy to find out which version was installed).

So I:

  • Ensured the tablet was not connected to the computer.

  • Downloaded the latest driver version

  • Removed the previous driver, which I had to look up on, as Wacom had managed to break their website/FAQ section somewhere since August :-( For reference: run the Wacom Utility, and click on "Remove" in the "Tablet Software" section, then enter your password to allow the removal to happen. (It should say "The Tablet Software has been removed", and the Menu Bar icon should go away if you had that enabled.)

  • open WacomTablet_6.3.18-4.dmg

  • Run the Install Wacom Tablet.pkg installer

  • Agree to the license, confirm the installation (158MB, down from 250MB last time I installed it!) and note that it is necessary to restart the computer after the install finishes. Enter your password to allow the installer to make changes.

  • After the installer has copied the files, click "Restart" to restart the computer (and thus load the drivers).

  • Once the system comes back up, plug in the tablet and make sure it is recognised, and can be used to move the pointer around. There is a "Wacom Tablet" tab added to the System Preferences which can be used to configure the tablet buttons, etc.

"FTDI" USB Serial adapter

As I found when installing my previous OS X 10.11 El Capitan system, the situation for the "FTDI" USB Serial adapters (bought as a store branded item, about a decade ago -- the store is now out of business!) is... problematic. Due to having a different VID/PID (ProductId=0x0421 (1057), VendorId=0x0b39 (2873)) than the current modern devices, they do not work with Apple's FTDI driver -- and OS X 10.11 El Capitan system protections prevent modifying the driver Info.plist to match the VID/PID while still leaving the normal system protections enabled.
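
Since some tools report these IDs in hexadecimal and others in decimal, here is a quick way to convert between the two forms quoted above, using only the shell's printf (nothing here touches the device or the driver):

```shell
# Convert the USB IDs quoted above between hexadecimal and decimal.
printf '%d\n' 0x0421      # ProductId in decimal
printf '%d\n' 0x0b39      # VendorId in decimal
printf '0x%04x\n' 2873    # and from decimal back to hexadecimal
```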

FTDI do have a version 2.3.0 signed driver for OS X 10.9+, released 2015-04-15, which in theory will work with the ProductId/VendorId and OS X 10.11 El Capitan. But as I explained in my previous post, FTDI are very consumer hostile, so it is a lottery whether or not they will "brick" the devices -- permanently disable them.

I was expecting to have to undertake the "FTDI lottery" (install new driver, see if the device would ever work again) to find out whether or not my device would work with the later FTDI driver.

To my surprise I found that without doing anything, when the USB Serial device was plugged in to my upgraded system, it worked. It both appeared in /dev/ (eg, /dev/tty.usbserial-OCBAMDL5) as it did before the upgrade and seemed to pass serial data back and forth as before.


kextstat | grep -v

showed that "com.FTDI.driver.FTDIUSBSerialDriver (2.2.18)" (ie, the version I had installed before upgrading) was still being loaded, to my surprise.

The FTDI download page implies (by omission) that the 2.2.18 driver is not signed (ie, it is listed as only suitable through OS X 10.8; driver signing was introduced later), and digging around in:


seems to confirm that there is no code signature for it.

Some hunting on the Internet turned up online discussion suggesting there is an exception list of known older drivers that are allowed to load without their own signature (ie, Apple is vouching for them). This is in:


and contains a list of kernel extensions and hashes, as a whitelist, in OSKextSigExceptionHashList. That list seems to include several versions of the com.FTDI.driver.FTDIUSBSerialDriver, including 2.2.18 (presumably the hash for the version I already have installed on my upgraded system is one of them -- but the currently downloadable 2.2.18 may not be; some more information on kernel exceptions). Apparently the OSKextSigExceptionHashList contains most of the unsigned third party kexts known to Apple as of when the signing requirement was introduced with OS X 10.9 (Mavericks) -- which may be the reason why the FTDI 2.2.18 driver worked even in OS X 10.9 (Mavericks). The hashes are 160 bits long -- same length as a SHA1 sum, for instance, but not many other hash functions -- but it is not clear what they are a hash of, or exactly what algorithm is used (I cannot find a match for the hash of Info.plist or FTDIUsbSerialDriver with SHA1 or RIPEMD-160). Presumably it uses something similar to what codesign uses, but I have not managed to identify what algorithm that uses either, or precisely what the signature is calculated over (see also Apple TN2206 on code signing).

Since /System/Library/Extensions is not supposed to be written to by third party vendors (since OS X 10.10 Yosemite IIRC) I suspect it would not be possible to use the automatic installer of the FTDI 2.2.18 driver again -- but it may be possible to install it by hand onto another system in /Library/Extensions.

However this does mean that my upgraded system is working, without me having to risk installing a newer driver to brick the hardware. So I am declaring success for now.

ETA, 2016-10-29: Looks like I declared success a bit too soon. Today on trying to use the USB serial adapter that had been plugged in for days, I found that it seemed not to work although the driver was loaded and the /dev/tty.usb... device was present. Looking in dmesg I found:

211310.991669 USB to Serial Converter@41112000: AppleUSBDevice::ResetDevice: <software attempt to RESET>

repeated frequently. With some searching for that message I found a post about OS X El Capitan and its refusal to reset USB devices which suggests the problem is that OS X 10.11 El Capitan does not wake the USB device from sleep with a "Reset" any longer -- and thus presumably the USB device stayed asleep. (Other people have had similar issues with older versions, and with Arduino Development on OS X 10.11 El Capitan.)

Armed with that information I found that I was able to wake the USB Serial converter up by stopping the application using it and then doing the usual "have you tried unplugging it and plugging it back in again". That is a (barely) tolerable work around for my main USB Serial usage (occasional Pilot backups). The only other periodic use is serial console, where usually I'll plug in a different adapter and start the client application almost immediately, which should not run into this issue.

My best guess is that OS X 10.11 El Capitan is now putting that USB port (on my keyboard) to sleep and not waking it up again unless something tries to access it -- whereas OS X 10.9 Mavericks may not have been putting it to sleep. Possibly the later driver version might actually call USBDeviceReEnumerate, rather than just ResetDevice, which seems like it might avoid this problem, but possibly not. (Apparently "they rewrote the USB stack in 10.11", and the driver has not been updated since OS X 10.9ish anyway.) Alternatively it might be possible to write something, perhaps with libusb, that will wake the device up by forcing re-enumeration. There is a darwin_reset_device which will force re-enumeration in some situations, but it is unclear if it would be triggered in this situation. (If not it may be possible to follow the same logic and force re-enumeration, digging into the internal structures.)

(For amusement: found while debugging this, how to hook up a serial terminal to OS X.)

Subsequent issues found

ETA, 2016-10-28: On first trying to start a new backup with SuperDuper! by plugging in an external USB backup drive (which had a "scheduled" backup), I was greeted by the error:

sh: /usr/bin/lockfile: No such file or directory (127)

This turns out to be a known issue with SuperDuper! after upgrading to OS X 10.11 El Capitan, after beta 4 -- the lockfile command being used was part of procmail, which was removed in OS X 10.11 El Capitan. (And indeed "which lockfile" returns nothing now.)

Apparently the update of SuperDuper! that I had already installed (while still on OS X 10.9) includes a bundled lockfile command... but it appears that the auto-triggered backup scripts didn't get re-written automatically, even when running SuperDuper! after upgrading SuperDuper! and before upgrading to OS X 10.11 El Capitan.

The (only, AFAICT) fix is to open SuperDuper! manually, remove every scheduled job, and recreate each one from scratch, at which point the scheduled jobs get written with the new lockfile path. This is hinted at by the "And that means, unfortunately, that users have to delete and recreate their schedules." line in the Shirt Pocket announcement, but the reason for needing to do so is never really explained in any detail.

As far as I can tell SuperDuper! handles these "Backup On Mount" commands using the AppleScript found in "SuperDuper!.app/Contents/Resources/Backup on Mount.scpt", which runs at login and on drive connect, from launchd. This is done via ~/Library/LaunchAgents/com.shirtpocket.backuponmount-login.plist and ~/Library/LaunchAgents/com.shirtpocket.backuponmount.plist which run a copy of the script, in "~/Library/Application Support/SuperDuper!/Scheduled Copies/Backup on Mount.scpt" for reasons that are not clear. It uses the StartOnMount feature of launchd to trigger that. Then if it finds a newly appeared drive which matches a drive for which it has a scheduled job, it will start that scheduled job.
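
To illustrate the launchd side of this, here is a minimal sketch of generating a LaunchAgent that uses StartOnMount. This is not SuperDuper!'s actual plist: the function name make_onmount_plist, the com.example.onmount label and the script path in the usage note are all made up for the example:

```shell
# Emit a minimal LaunchAgent plist that runs program $2 whenever a volume
# is mounted. $1 is the job label. Both values are assumed to contain no
# XML special characters (no escaping is done here).
make_onmount_plist() {
    cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>$1</string>
    <key>ProgramArguments</key>
    <array>
        <string>$2</string>
    </array>
    <key>StartOnMount</key>
    <true/>
</dict>
</plist>
EOF
}
```

Something like make_onmount_plist com.example.onmount /usr/local/bin/on-mount.sh, redirected into a file in ~/Library/LaunchAgents and then loaded with launchctl load, should be enough to experiment with. Note that the job is started on every mount, so the program itself has to work out which volume appeared.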

After a lot of debugging I found that:

  • The copy of "Backup on Mount.scpt" appeared to have been updated to call the correct lockfile (although for some reason it was not byte for byte identical to the one in /Applications/Purchased/SuperDuper!.app; and I still do not understand why there is a copy made). At a guess this copy of "Backup on Mount.scpt" was updated when I ran SuperDuper! after upgrading it, on OS X 10.9 Mavericks, before the OS upgrade.

  • It appeared that the "Backup on Mount.scpt" ran correctly, but the /usr/bin/lockfile failure reports continued.

  • Digging deeper it turned out that every scheduled job had its own copy of "Copy Job.applescript", which itself had a hard coded call to /usr/bin/lockfile -- and presumably that call was the one that was failing :-( (These jobs are in "~/Library/Application Support/SuperDuper!/Scheduled Copies/Smart Update FOO from BAR.sdsp" for anyone looking; there appears to be a version compiled into an Application in "Copy`" too.)

  • It appears these scripts are built from a "template" with some sort of string substitution, and then saved into the Scheduled Job, and compiled into an application that can be run on drive attach. (AFAICT the template is /Applications/Purchased/SuperDuper!.app/Contents/Resources/Copy Job Script.template.)

  • Using directory names with spaces and especially with exclamation marks in them is SuperAnnoying! (at least when trying to look at anything in the command line); and using script files that are CR terminated only (Old School MacOS) is painful on modern macOS (which expects unix style LF termination).
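
In case it saves someone else some time, this is a sketch of the sort of search that copes with both the awkward directory names and the CR-only line endings. The function name find_stale_lockfile_refs is my own, and it assumes the scripts are stored as single-byte text, so anything saved in another encoding will not match:

```shell
# List every line mentioning /usr/bin/lockfile in files under directory $1,
# printed as FILE:LINENUMBER:TEXT. CR line endings are converted to LF
# first, so that each script line is reported separately.
find_stale_lockfile_refs() {
    find "$1" -type f | while IFS= read -r f; do
        tr '\r' '\n' < "$f" |
            grep -an '/usr/bin/lockfile' |
            while IFS= read -r line; do
                printf '%s:%s\n' "$f" "$line"
            done
    done
}
```

Quoting every use of the file name is what keeps the spaces and exclamation marks from causing trouble.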

The result of all of this is that there basically is no reasonable way to make SuperDuper! Scheduled Jobs work again on OS X 10.11 El Capitan other than deleting every Scheduled Job, and creating each one again from scratch. So I have deleted all the scheduled jobs. And now I have to remember as I get each external drive back from off site to re-add the scheduled job so that its backup runs automatically :-( As well as be SuperCareful! to recreate the jobs with the correct drives as source and destination, carefully navigating my way through the unfortunate SuperDuper! UI that states things backwards so that you have to read very slowly to ensure the backup source and destination are correct. ("Smart Update will copy and erase what's needed to make DESTINATION identical to your selections from SOURCE" is really not clear English, and that is what the final "are you sure" dialogue says :-( I have relied on having pre-triggered jobs to avoid having to navigate this confusion at the end of a long work week when I needed to make a backup and really did not want to overwrite the week's work with a several week old backup...)

The whole design feels... excessively fragile to problems like lockfile being removed between OS releases. It is rather unclear to me why the "Copy Job.applescript" could not just be a tiny wrapper that passed a few strategic parameters to a centrally maintained script that would be updated any time SuperDuper! was updated -- entirely avoiding the need to manually delete and recreate things.

For future reference, the cache of "discovered drives" is kept in "~/Library/Caches/TemporaryItems/com.shirtpocket.lastVolumeList.plist" should you need to debug it further. It is created by the "Backup on Mount.scpt" on first run, eg when run at login with -runAtLogin as an argument. An equivalent file is created on each run in the same directory, to use to compare and figure out "what is new", to determine if a job should be run.
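
The "what is new" comparison is easy to mimic when debugging. This sketch is not SuperDuper!'s own logic (the real files are plists); it assumes you have dumped the volume names into two plain text files, one name per line, and the function name new_volumes is hypothetical:

```shell
# Print entries present in list file $2 but absent from list file $1.
# Both files hold one volume name per line, in any order.
new_volumes() {
    old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # comm -13 hides lines unique to the first file and lines common to
    # both, leaving only the lines unique to the second file
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```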

Also for future reference, launchd is ignoring the "ThrottleInterval" of 0, as logged to the system log:

Oct 28 19:07:52 ashram[1] (com.shirtpocket.backuponmount): ThrottleInterval set to zero. You're not that important. Ignoring.

(which is in ~/Library/LaunchAgents/com.shirtpocket.backuponmount.plist for unknown reasons).

Finally for the record, Time Machine also uses StartOnMount jobs to trigger its automatic backup via /System/Library/LaunchAgents/ The other common way is apparently an AppleScript "Folder Action" on /Volumes (where external drives mount). (Also, more on launchd.)

Posted Mon Oct 24 18:01:47 2016 Tags:


Last month I set up a macOS Server on an old MacBook Pro. Since then I have added the second external ("data") drive, and set up some file shares -- using the macOS Server Application to set up the shares. Other than deciding what to name the directories and shares it was pretty simple to set up.

Since I have been using git-annex for managing media files for a couple of years, I also wanted to set up a git-annex server as a central point for storing one of the copies of those files, which hopefully would always be online (versus the many copies on offline external drives, which are more fiddly to access).

For this system I was aiming to avoid installing the (large) Apple Developer Tools, and MacPorts (which require installing a substantial fraction of the Apple Developer Tools, to be able to build programs), so I wanted to find a "stand alone" way to install git-annex and have it work as a remote target via ssh.

Since the pre-built git-annex bundle for OS X includes a bunch of git and related tools within the bundle, it seemed like it should be possible -- but actually doing so, for "server" usage (eg, git annex sync and git annex copy ... initiated from another system) proved more subtle than I expected. Since there is not really a git-annex installer to look after these details on OS X, and they do not seem to be documented anywhere easily found, I am recording the steps needed for future reference.

Installing git-annex on OS X

The git-annex install page links to pre-built binaries for Mac OS X. To start, I downloaded the most recent release for OS X 10.11/El Capitan (and its signature file to use with verifying git-annex downloads).

After verifying the download (the same way as last time), I then installed git-annex by:

  • mkdir /Applications/OpenSource (if it does not already exist)

  • open git-annex.dmg

  • Dragged the application bundle into /Applications/OpenSource/

  • mkdir /usr/local/bin (if it does not already exist)

  • Created symlinks to key git and git-annex related programs in /usr/local/bin so they can be (easily) on the $PATH:

    cd /usr/local/bin
    for FILE in git-annex git git-shell git-receive-pack git-upload-pack; do
      if [ -f "${FILE}" ]; then
        : # Already present, leave it alone
      else
        sudo ln -s /Applications/OpenSource/$FILE .
      fi
    done

That list of programs (git-annex, git, git-shell, git-receive-pack and git-upload-pack) was determined partially experimentally (without some of those, a remotely initiated git annex sync failed with weird errors), and partly by comparing the Apple bundled "proxy" programs (which just prompt to install the Apple Developer Tools):

ewen@bethel:~$ ls /usr/bin/git* | cat

with the git-annex bundled wrapper programs:

ewen@bethel:~$ ls /Applications/OpenSource/* | cat

The aim being to ensure that for each Apple-provided proxy program the git-annex bundled wrapper programs should be on the $PATH first (something we arrange in the next setup step). I assumed that because git-annex did not provide wrappers for git-cvsserver and git-upload-archive, those weren't necessary to the functionality of git-annex.

Setting your $PATH

To ensure that /usr/local/bin is on your $PATH either create ~/.bashrc with these contents, or add them at a suitable place in your ~/.bashrc:

# Ensure that /usr/local/bin is in the path
if echo "$PATH" | grep "/usr/local/bin" >/dev/null 2>&1; then
  : # Already there, great!
else
  # Not present, prepend to the start of the path
  PATH="/usr/local/bin:${PATH}"
fi

Note that it is important that /usr/local/bin ends up in your path before /usr/bin, in order that (the symlinks to) git-annex wrapper programs effectively hide the Apple-provided Developer Tools proxy programs; if you do not want /usr/local/bin at the very beginning of your $PATH then at least ensure it gets inserted before /usr/bin.
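
If you want to check the ordering rather than eyeball it, something like this sketch will do. The function name path_precedes is my own, and it compares whole path entries rather than substrings:

```shell
# Succeed if directory $1 appears earlier than directory $2 in the
# colon-separated list $3 (defaults to $PATH). Fails if $1 is absent.
path_precedes() {
    list=":${3:-$PATH}:"
    case "$list" in
        *":$1:"*) before=${list%%":$1:"*} ;;   # everything ahead of $1
        *) return 1 ;;
    esac
    case "$before:" in
        *":$2:"*) return 1 ;;                  # $2 was listed first
    esac
    return 0
}
```

For example, path_precedes /usr/local/bin /usr/bin && echo "order is fine" can be run on the server after sourcing ~/.bashrc.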

Before carrying on, ensure that the basic git and git-annex binaries do actually run now:

source ~/.bashrc
hash git
hash git-annex
git --version
git annex

You should get a git version report, and a git-annex help message, rather than Apple's prompt for you to install the Developer tools.

Now that git is working for basic things you might also want to do some common git setup:

git config --global ...    # Insert your email address
git config --global ...     # Insert your name

which should run without any complaints.

Creating the git-annex wrappers

The final setup step is to persuade git-annex to create the remaining wrappers it needs (which surprisingly are not files that can be copied or linked to, but actually created on "first run" if they are not present), and then ensure that these are also in /usr/local/bin so that they are now on your $PATH.

To do this:

  • Run /Applications/OpenSource/ to enter a shell with git-annex on the $PATH and create two wrapper programs:

  • Copy these wrapper programs into /usr/local/bin:

    cd /usr/local/bin
    sudo cp -p ~/.ssh/git-annex-shell ~/.ssh/git-annex-wrapper .

Testing, and git-annex server usage

To test that this is working, from a client machine do:

ssh SERVER 'echo $PATH'

and make sure that /usr/local/bin appears before /usr/bin in the result echo'd out.

Then to "clone" a git-annex onto the server, with all interaction going from client to server (ie, not relying on the server being able to ssh into the client), assuming NAME is the name of your git-annex:

On the client:

git bundle create /tmp/NAME.bundle --all
scp -p /tmp/NAME.bundle SERVER:/tmp

On the server:

git clone /tmp/NAME.bundle
git annex init '....'        # Enable git-annex, name it, eg SERVER
git remote remove origin     # Remove dependency on NAME.bundle
rm /tmp/NAME.bundle

Back to the client, test that it is working and copy content into it:

git annex sync
git annex copy --to=SERVER .

Hopefully the git annex sync runs cleanly as normal without any problems; if you get any strange errors be sure to check the install steps above and make sure that all the git-related and git-annex-related programs are in /usr/local/bin -- the errors when some tools that git-annex needs are its own bundled versions and some are the Apple-provided proxy programs are very weird.
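
One way to catch a mixed set of tools before it produces weird errors is to check where each program actually resolves. This is only a sketch, and the function name check_tools_in is hypothetical:

```shell
# Report any of the named tools that do not resolve to directory $1.
# Prints one line per problem and returns non-zero if there were any.
check_tools_in() {
    want_dir="$1"; shift
    rc=0
    for tool in "$@"; do
        found=$(command -v "$tool" 2>/dev/null) || found=""
        case "$found" in
            "$want_dir"/*) ;;   # resolves where expected
            "") echo "$tool: not found on PATH"; rc=1 ;;
            *)  echo "$tool: resolves to $found"; rc=1 ;;
        esac
    done
    return "$rc"
}
```

On the server that would be run as check_tools_in /usr/local/bin git git-annex git-shell git-receive-pack git-upload-pack git-annex-shell, and anything it prints is a program still being picked up from somewhere else.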

(Note that because git-annex does not use a bare repository, it is not possible to use the normal git trick of just doing a git push into a blank git init'd repository -- it is not possible to git push into a non-bare repository.)

In theory updating this setup to a later git-annex version should just be a matter of moving the old application bundle aside, and dragging in a new version -- the symlinks should keep pointing at the current (ie, new) version, and the two wrapper scripts that are created are so simple that they should not need changing.

Posted Sun Oct 9 08:10:56 2016