Fundamental Interconnectedness

This is the occasional blog of Ewen McNeill. It is also available on LiveJournal as ewen_mcneill, and Dreamwidth as ewen_mcneill_feed.

About 18 months ago I wondered why the XFS FAQ recommended a stripe width of half the number of disks for RAID-10, as the underlying rationale did not seem to be properly explained anywhere (see XFS FAQ on sunit, swidth values). The answer turned out to be that the RAID-0 portion of RAID-10 dominates the layout choices.

I suggested extending the FAQ to provide some rationale, but Dave Chinner (the main Linux XFS maintainer) said "The FAQ is not the place to explain how the filesystem optimises allocation for different types of storage", and pointed at a section of the XFS admin doc on alignment to storage geometry, which at the time -- and now, 18 months later -- reads:

==== Alignment to storage geometry

TODO: This is extremely complex and requires an entire chapter to itself.

which is... rather sparse. Because Dave had not had time to write that "chapter to itself".

At the time I offered to write a "sysadmin's view" of the relevant considerations -- which would apparently still be appreciated -- but writing it got delayed by actual work.

I eventually posted what I had written to the XFS mailing list in February 2018, where it seems to have been lost in the noise and ignored.

Since it is now nearly a year later, and nothing seems to have happened with the documentation I wrote -- and the mailing list location is not very searchable either -- I have decided to repost it here on my blog as a (slightly) more permanent home. It appears unlikely to be incorporated into the XFS documentation.

So below is that original, year-old, documentation draft. The advice below is unreviewed by the XFS maintainers (or anybody else, AFAICT), and is just converted from the Linux kernel documentation RST format to Markdown (for my blog). Conversion was done with pandoc and a bunch of manual editing, for all the things pandoc missed or was confused by (headings, lists, command line examples, etc).

I would suggest double checking anything below against other sources before relying on it. If there is no other documentation to check, perhaps ask on the XFS Mailing List instead.


Alignment to storage geometry

XFS can be used on a wide variety of storage technology (spinning magnetic disks, SSDs), on single disks or spanned across multiple disks (with software or hardware RAID). Potentially there are multiple layers of abstraction between the physical storage medium and the file system (XFS), including software layers like LVM, and potentially flash translation layers or hierarchical storage management.

Each of these technology choices has its own requirements for best alignment, and/or its own trade offs between latency and performance, and the combination of multiple layers may introduce additional alignment or layout constraints.

The goal of file system alignment to the storage geometry is to:

  • maximise throughput (eg, through locality or parallelism)

  • minimise latency (at least for common activities)

  • minimise storage overhead (such as write amplification due to read-modify-write -- RMW -- cycles).

Physical Storage Technology

Modern storage technology divides into two broad categories:

  • magnetic storage on spinning media (eg, HDD)

  • flash storage (eg, SSD or NVMe)

These two storage technology families have distinct features that influence the optimal file system layout.

Magnetic Storage: accessing magnetic storage requires moving a physical read/write head across the magnetic media, which takes a non-trivial amount of time (ms). The seek time required to move the head to the correct location is approximately linearly proportional to the distance the head needs to move, which means two locations near each other are faster to access than two locations far apart. Performance can be improved by locating data regularly accessed together "near" each other. (See also the Wikipedia overview of HDD performance characteristics.)

4KiB physical sectors HDD: Most larger modern magnetic HDDs (many 2TiB+, almost all 4TiB+) use 4KiB physical sectors to help minimise storage overhead (of sector headers/footers and inter-sector gaps), and thus maximise storage density. But for backwards compatibility they continue to present the illusion of 512 byte logical sectors. Alignment of file system data structures and user data blocks to the start of (4KiB) physical sectors avoids unnecessarily spanning a read or write across two physical sectors, and thus avoids write amplification.
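For example, the logical and physical sector sizes a disk reports can be checked from Linux (the device name here is just a placeholder):

lsblk -o NAME,PHY-SEC,LOG-SEC /dev/sda
cat /sys/block/sda/queue/physical_block_size
cat /sys/block/sda/queue/logical_block_size

A "512e" drive (4KiB physical sectors presented as 512 byte logical sectors) will report 4096 and 512 respectively.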

Flash Storage: Flash storage has both a page size (smallest unit that can be written at once), and an erase block size (smallest unit that can be erased) which is typically much larger (eg, 128KiB). A key limitation of flash storage is that, at the individual bit level, writes can only change bits in one direction. This means that updates to physical flash storage usually involve an erase cycle to "blank the slate" to a single common value, followed by writing the bits that should have the other value (and writing back the unmodified data -- a read-modify-write cycle). To further complicate matters, most flash storage physical media has a limit on how many times a given physical storage cell can be erased, depending on the technology used (typically on the order of 10000 times).

To compensate for these technological limitations, all flash storage suitable for use with XFS uses a Flash Translation Layer within the device, which provides both wear levelling and relocation of individual pages to different erase blocks as they are updated (to minimise the amount that needs to be updated with each write, and reduce the frequency blocks are erased). These are often implemented on-device as a type of log structured file system, hidden within the device.

For a file system like XFS, a key consideration is to avoid spanning data structures across erase block boundaries, as that would mean that multiple erase blocks would need updating for a single change. Write amplification within the SSD may still result in multiple updates to physical media for a single update, but this can be reduced by advising the flash storage of blocks that do not need to be preserved (eg, with the discard mount option, or by using fstrim) so it stops copying those blocks around.
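For example (the device and mount point are placeholders), unused blocks can be released back to the flash translation layer either periodically with fstrim, or on every delete with the discard mount option:

fstrim -v /data
mount -o discard /dev/sda1 /data

Many distributions ship a periodic fstrim systemd timer, which avoids the per-operation overhead of the discard mount option.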

RAID

RAID provides a way to combine multiple storage devices into one larger logical storage device, with better performance or more redundancy (and sometimes both, eg, RAID-10). There are multiple RAID array arrangements ("levels") with different performance considerations. RAID can be implemented either directly in the Linux kernel ("software RAID", eg the "MD" subsystem), or within a dedicated controller card ("hardware RAID"). The filesystem layout considerations are similar for both, but where the "MD" subsystem is used modern user space tools can often automatically determine key RAID parameters and use those to tune the layout of higher layers; for hardware RAID these key values typically need to be manually determined and provided to user space tools by hand.
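For example, with Linux software RAID the chunk size and number of devices that mkfs.xfs will pick up automatically can be inspected with (the md device name is a placeholder):

cat /proc/mdstat
mdadm --detail /dev/md0 | grep -Ei 'level|devices|chunk'

For hardware RAID the equivalent values have to be read from the controller's management tool or documentation.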

RAID 0 stripes data across two or more storage devices, with the aim of increasing performance, but provides no redundancy (in fact the data is more at risk as failure of any disk probably renders the data inaccessible). For XFS storage layout the key consideration is to maximise parallel access to all the underlying storage devices by avoiding "hot spots" that are reliant on a single underlying device.

RAID 1 duplicates data (identically) across two or more storage devices, with the aim of increasing redundancy. It may provide a small read performance boost if data can be read from multiple disks at once, but provides no write performance boost (data needs to be written to all disks). There are no special XFS storage layout considerations for RAID 1, as every disk has the same data.

RAID 5 organises data into stripes across three or more storage devices, where N-1 storage devices contain file system data, and the remaining storage device contains parity information which allows recalculation of the contents of any one other storage device (eg, in the event that a storage device fails). To avoid the "parity" block being a hot spot, its location is rotated amongst all the member storage devices (unlike RAID 4, which had a parity hot spot). Writes to RAID 5 require reading multiple elements of the RAID 5 parity block set (to be able to recalculate the parity values), and writing at least the modified data block and parity block. The performance of RAID 5 is improved by having a high hit rate on caching (thus avoiding the read part of the read-modify-write cycle), but there is still an inevitable write overhead.

For XFS storage layout on RAID 5 the key considerations are the read-modify-write cycle to update the parity blocks (and avoiding needing to unnecessarily modify multiple parity blocks), as well as increasing parallelism by avoiding hot spots on a single underlying storage device. For this XFS needs to know both the stripe size on an underlying disk, and how many of those stripes can be stored before it cycles back to the same underlying disk (N-1).

RAID 6 is an extension of the RAID 5 idea, which uses two parity blocks per set, so N-2 storage devices contain file system data and the remaining two storage devices contain parity information. This increases the overhead of writes, for the benefit of being able to recover information if more than one storage device fails at the same time (including, eg, during the recovery from the first storage device failing -- a not unknown event with larger storage devices and thus longer RAID parity rebuild recovery times).

For XFS storage layout on RAID 6, the considerations are the same as RAID 5, but only N-2 disks contain user data.

RAID 10 is a conceptual combination of RAID 1 and RAID 0, across at least four underlying storage devices. It provides both storage redundancy (like RAID 1) and interleaving for performance (like RAID 0). The write performance (particularly for smaller writes) is usually better than RAID 5/6, at the cost of less usable storage space. For XFS storage layout the RAID-0 performance considerations apply -- spread the work across the underlying storage devices to maximise parallelism.

A further layout consideration with RAID is that RAID arrays typically need to store some metadata on each array that helps locate the underlying storage devices. This metadata may be stored at the start or end of the RAID member devices. If it is stored at the start of the member devices, then this may introduce alignment considerations. For instance the Linux "MD" subsystem has multiple metadata formats: formats 0.9/1.0 store the metadata at the end of the RAID member devices, and formats 1.1/1.2 store the metadata at the beginning of the RAID member devices. Modern user space tools will typically try to ensure user data starts on a 1MiB boundary ("Data Offset").
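For example, the metadata version and data offset of a Linux MD member device can be checked with (the member device name is a placeholder):

mdadm --examine /dev/sdb1 | grep -Ei 'version|data offset'

With 1.2 metadata and modern tools this will typically report a data offset of 2048 sectors (1MiB).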

Hardware RAID controllers may use either of these techniques too, and may require manual determination of the relevant offsets from documentation or vendor tools.

Disk partitioning

Disk partitioning affects file system alignment to the underlying storage blocks in two ways:

  • the starting sectors of each partition need to be aligned to the underlying storage blocks for best performance. With modern Linux user space tools this will typically happen automatically, but older Linux and other tools often would attempt to align to historically relevant boundaries (eg, 63-sector tracks) that are not only irrelevant to modern storage technology but due to the odd number (63) result in misalignment to the underlying storage blocks (eg, 4KiB sector HDD, 128KiB erase block SSD, or RAID array stripes).

  • the partitioning system may require storing metadata about the partition locations between partitions (eg, MBR logical partitions), which may throw off the alignment of the start of the partition from the optimal location. Use of GPT partitioning is recommended for modern systems to avoid this, or if MBR partitioning is used either use only the 4 primary partitions or take extra care when adding logical partitions.

Modern Linux user space tools will typically attempt to align on 1MiB boundaries to maximise the chance of achieving a good alignment; beware if using older tools, or storage media partitioned with older tools.
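For example, partition alignment can be checked with parted (the device and partition number are placeholders):

parted /dev/sda align-check optimal 1
parted /dev/sda unit s print

The first command reports whether partition 1 is aligned to the device's reported optimal I/O boundary; the second shows the raw starting sectors, which can be checked by hand against the underlying physical sector, erase block, or stripe size.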

Storage Virtualisation and Encryption

Storage virtualisation such as the Linux kernel LVM (Logical Volume Manager) introduces another layer of abstraction between the storage device and the file system. These layers may also need to store their own metadata, which may affect alignment with the underlying storage sectors or erase blocks.

LVM needs to store metadata on the physical volumes (PV) -- typically 192KiB at the start of the physical volume (check the "1st PE" value with pvs -o name,pe_start). This holds physical volume information as well as volume group (VG) and logical volume (LV) information. The size of this metadata can be adjusted at pvcreate time to help improve alignment of the user data with the underlying storage.
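For example (the device name is a placeholder), the existing data offset can be checked, and a new physical volume created with its data area aligned to a hypothetical 1MiB RAID stripe, with:

pvs -o pv_name,pe_start
pvcreate --dataalignment 1m /dev/md0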

Encrypted volumes (such as LUKS) also need to store their own metadata at the start of the volume. The size of this metadata depends on the key size used for encryption. Typical sizes are 1MiB (256-bit key) or 2MiB (512-bit key), stored at the start of the underlying volume. These headers may also cause alignment issues with the underlying storage, although probably only in the case of wider RAID 5/6/10 sets. The --align-payload argument to cryptsetup may be used to influence the data alignment of the user data in the encrypted volume (it takes a value in 512 byte logical sectors), or a detached header (--header DEVICE) may be used to store the header somewhere other than the start of the underlying device.
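For example (the device name is a placeholder), the payload offset of an existing LUKS volume can be checked, or a new volume created with its data aligned to a hypothetical 1MiB stripe (2048 x 512-byte sectors), with:

cryptsetup luksDump /dev/md0 | grep -i offset
cryptsetup luksFormat --align-payload 2048 /dev/md0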

Determining su/sw values

Assuming every layer in your storage stack is properly aligned with the underlying layers, the remaining step is to give mkfs.xfs appropriate values to guide the XFS layout across the underlying storage to minimise latency and hot spots and maximise performance. In some simple cases (eg, modern Linux software RAID) mkfs.xfs can automatically determine these values; in other cases they may need to be manually calculated and supplied.

The key values to control layout are:

  • su: stripe unit size, in bytes (use m or g suffixes for MiB or GiB) that is updatable on a single underlying device (eg, RAID set member)

  • sw: stripe width, in member elements storing user data before you wrap around to the first storage device again (ie, excluding parity disks, spares, etc); this is used to distribute data/metadata (and thus work) between multiple members of the underlying storage to reduce hot spots and increase parallelism.

When multiple layers of storage technology are involved, you want to ensure that each higher layer has a block size that is the same as the underlying layer, or an even multiple of the underlying layer, and then give that largest multiple to mkfs.xfs.
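The Linux kernel propagates minimum and optimal I/O sizes up through the block device layers, so one quick cross-check of a whole stack (RAID, LVM, encryption, partitions) is:

lsblk -t

which shows the ALIGNMENT, MIN-IO and OPT-IO values for every device in the stack; the optimal I/O size at the top of the stack should normally be the same as, or a multiple of, the values reported below it.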

Formulas for calculating appropriate values for various storage technology:

  • HDD: alignment to physical sector size (512 bytes or 4KiB). This will happen automatically due to XFS defaulting to 4KiB block sizes.

  • Flash Storage: alignment to erase blocks (eg, 128 KiB). If you have a single flash storage device, specify su=ERASE_BLOCK_SIZE and sw=1.

  • RAID 0: Set su=RAID_CHUNK_SIZE and sw=NUMBER_OF_ACTIVE_DISKS, to spread the work as evenly as possible across all member disks.

  • RAID 1: No special values required; use the values required from the underlying storage.

  • RAID 5: Set su=RAID_CHUNK_SIZE and sw=(NUMBER_OF_ACTIVE_DISKS-1), as one disk is used for parity so the wrap around to the first disk happens one disk earlier than the full RAID set width.

  • RAID 6: Set su=RAID_CHUNK_SIZE and sw=(NUMBER_OF_ACTIVE_DISKS-2), as two disks are used for parity so the wrap around to the first disk happens two disks earlier than the full RAID set width.

  • RAID-10: The RAID 0 portion of RAID-10 dominates alignment considerations. The RAID 1 redundancy reduces the effective number of active disks, eg 2-way mirroring halves the effective number of active disks, and 3-way mirroring reduces it to one third. Calculate the number of effective active disks, and then use the RAID 0 values. Eg, for 2-way RAID 10 mirroring, use su=RAID_CHUNK_SIZE and sw=(NUMBER_OF_MEMBER_DISKS / 2).

  • RAID-50/RAID-60: These are logical combinations of RAID 5 and RAID 0, or RAID 6 and RAID 0 respectively. Both the RAID 5/6 and the RAID 0 performance characteristics matter. Calculate the number of disks holding parity (2+ for RAID 50; 4+ for RAID 60) and subtract that from the number of disks in the RAID set to get the number of data disks. Then use su=RAID_CHUNK_SIZE and sw=NUMBER_OF_DATA_DISKS.

For the purpose of calculating these values in a RAID set only the active storage devices in the RAID set should be included; spares, even dedicated spares, are outside the layout considerations.
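As a worked example (device names and sizes here are hypothetical), a 6-disk RAID 6 array with a 256KiB chunk size has 6 - 2 = 4 data disks per stripe, and an 8-disk 2-way mirrored RAID 10 array with a 256KiB chunk size has 8 / 2 = 4 effective data disks, so both would be formatted with the same values:

# 6-disk RAID 6, 256KiB chunk: 6 - 2 parity disks = 4 data disks
mkfs.xfs -d su=256k,sw=4 /dev/md0

# 8-disk RAID 10 (2-way mirrors), 256KiB chunk: 8 / 2 = 4 effective disks
mkfs.xfs -d su=256k,sw=4 /dev/md1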

A note on sunit/swidth versus su/sw

Alignment values were historically specified as sunit/swidth values, which provided numbers in 512-byte sectors, where swidth was some multiple of sunit. These units were historically useful when all storage technology used 512-byte logical and physical sectors, and sizes were often reported by underlying layers in physical sectors. However they are increasingly difficult to work with for modern storage technology with its variety of physical sector and block sizes.

The su/sw values, introduced later, provide a value in bytes (su) and a number of occurrences (sw), which are easier to work with when calculating values for a variety of physical sector and block sizes.

Logically:

  • sunit = su / 512
  • swidth = sunit * sw

With the result that swidth = (su / 512) * sw.

Use of sunit / swidth is discouraged, and use of su / sw is encouraged to avoid confusion.

WARNING: beware that while the sunit/swidth values are specified to mkfs.xfs in 512-byte sectors, they are reported by mkfs.xfs (and xfs_info) in file system blocks (typically 4KiB, shown in the bsize value). This can be very confusing, and is another reason to prefer to specify values with su / sw and ignore the sunit / swidth options to mkfs.xfs.
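As an illustration of the unit differences (the device name is hypothetical), these two invocations describe the same geometry for a 4-data-disk array with a 256KiB stripe unit:

mkfs.xfs -d su=256k,sw=4 /dev/md0
mkfs.xfs -d sunit=512,swidth=2048 /dev/md0   # 256KiB / 512 bytes = 512 sectors; 512 * 4 = 2048

but with the default 4KiB block size, mkfs.xfs and xfs_info will then report sunit=64 and swidth=256, in blocks (256KiB / 4KiB = 64; 64 * 4 = 256).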

Posted Tue Jan 8 14:11:07 2019 Tags:

Modern iPhones (6s or later) take "Live Photos" by default: these are video recorded 1.5 seconds either side (3 seconds total) of when you take a still photo (see also a guide to Live Photos).

I suppose if you were mainly taking photos of people acting silly for the camera then this might be a fun feature to try out a few times, before putting aside. But if you mainly photograph other things, or take photos as a record of slides, etc, it is an actively useless waste of storage space, that can make the intended photo harder to see (eg due to flickering lighting).

Imagine, not entirely hypothetically, that you finally got an iPhone new enough to have this feature, long after the feature was introduced, and had forgotten about the feature entirely. Then took a bunch of photos of very still things (slides, screens, paper, etc) to record them, ignoring the "bulls eye" icon and "live" -- perhaps assuming they were related to focus or "showing live view with effects" or something. Only to realise a couple of hundred photos later that all your recent photos were bloated with this additional baggage you did not want, and visual flicker.

Obviously the first thing you would do is turn the feature off when taking a photo (tap the bulls eye icon, so it is not yellow and "Live" does not appear in yellow). Then you realise that by default every time you go back into the Camera application it is turned back on again. So you accidentally end up taking more "Live Photos" when distracted and just picking up your phone to record something. Which leads to a mix of "Still" and "Live Photos" that are even harder to clean up.

Eventually, after some searching, you realise it is possible to change the default as well:

  1. First go to iPhone Settings -> Camera -> Preserve Settings, and enable "Preserve Settings";

  2. Then go to the Camera application and click the bulls eye icon to turn off Live Photo (making sure "Live" is no longer displayed in yellow).

After that, the iPhone Camera application will actually remember your last setting, rather than passive-aggressively turning the feature on every time. So the unwanted bloat, and difficulty viewing the still photo you were trying to take, is at least no longer getting worse.

Then, secondly, you would want to "clean up" these "Live Photos" by just making them still photos. After some hunting I found out that since iOS 9.3 it has been possible to strip the video out of Live Photos on the iPhone, and reclaim the storage, but it is a multi-step process:

  • In the iOS Photos application, select one or more Live Photos (and only live photos). Batches of 10 seem to work; large batches, or batches including non-Live photos appear not to give the right options.

  • Choose Share (bottom left) -> Duplicate -> Duplicate as Still Photos, to create a new file with just the still photo.

  • Check that the photos duplicated to still photos correctly.

  • When you are happy, go back and delete the original Live Photos to mark them for deletion in 30 days.

  • Then go to "Recently Deleted" and delete the original Live Photos again to delete them now.

You can check your progress on cleaning this up by going to the automatic "Live Photos" smart album in the Photos application and seeing what is left. If you remove all the "Live Photos" then the "Live Photos" smart album will vanish. When you are complete, copy the new still photos over to your computer (and force a backup of your iPhone) to discard the bloated "Live Photos" versions from your computer as well. It may be necessary to restart the iPhone and/or disconnect/connect from Image Capture.app to stop seeing the deleted "Live Photos" versions in the image list (which makes it confusing to know if you are copying the right version or not).

The biggest risk here is deleting a "Live Photo" without duplicating it, due to the need to select photos multiple times. I would recommend making a backup to your computer first, and cross checking against that before permanently deleting the "Live Photos" from "Recently Deleted", at least for any photos you would miss if they were gone.

Note that if you accidentally include a non-Live Photo in the selection to duplicate, you will not get the final prompt to "Duplicate as Still Photos", and it will just duplicate everything as is, making the problem worse. If that happens, delete the duplicates, and then try again being more careful in identifying the Live photos (which in the "Select" mode have no visual distinction so you just have to remember; thanks Apple).

For a couple of hundred accidental live photos, processed in batches of 10 or so, this is merely frustrating busy work, but actually possible to do. The main catch is the duplicated photos will appear at the end of the "Camera Roll" (as the files are created more recently). This makes it easier to tell which is the original "Live Photo" and which is the "Duplicate as Still Photo", but harder to move back and forth between them (or find photos by the order in which they were taken) if a lot of time has passed between when the "Live Photos" were accidentally taken and the clean up effort. Particularly if you have accidentally intermixed "Live Photos" and taken-as-intended Still Photos (as the taken-as-intended Still Photos will be left behind in the timeline, and the fixed "Live Photos" will be added to the end of the Camera Roll).

Fortunately this is a one-off cleanup issue once the iPhone camera default settings are fixed. But user hostile defaults, plus delaying finding out what magic new features mean, leads to wasting half a day cleaning up the resulting mess. Thanks Apple.

For the record, other ways that do not seem to work / work reliably for everyone:

  • Editing the photo on the iPhone, and unselecting the bulls eye ("live") will stop the photo displaying as "Live", but appears not to update the photo storage -- as you can "edit" again, and turn the "Live" flag back on again. Plus this creates a second file (IMG_Ennnn.JPG) for the edited version, adding to the storage problems. Several guides do still suggest this approach though, and I suspect it might hide them from the "Live Photos" folder in the iPhone photos album.

  • Deleting the IMG_nnnn.MOV companion file via Image Capture.app does not seem to work -- it vanishes from the list, but it is unclear if it is actually removed (eg, it does not appear in the "Recently Deleted" folder for further cleanup), unlike what some people report using the Photos application on MacOS. I did not pursue this further as it was not clear if it worked, and I do not use the MacOS Photos application.

  • The Lean iOS app supposedly allows bulk changes to photos to make them "not Live". But the reviews suggest while it claims to save storage space for them it did not (I am unclear exactly what it does; if it uses the "duplicate as still photo" approach, the storage may not be reclaimed until the original Live Photos are deleted...; if it just toggles the "Live Photo" tag, then no storage will be reclaimed as best I can tell). Since it was no longer a free App ($1.99) I did not try this, having found a manual solution which was merely fiddly and time consuming.

Constantly changing bytes in JPEGs

While investigating this I came across another somewhat related frustration: when copied with Image Capture.app, the JPEGs of photos captured by default with a modern iPhone Camera application will change every time they are copied, but only in a small range of bytes (about bytes 1663-1683 from memory). This means that the photos are no longer byte for byte identical, which breaks sha1sum / sha256sum style checking for identical copies (and hard linking those together to save space).
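For example, comparing two copies of the "same" photo (using the file names from the exiftool example below):

sha256sum Desktop/IMG_5260.JPG Pictures/IMG_5260.JPG

produces two different hashes even though the image data is unchanged, which defeats this kind of hash-based duplicate detection.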

After some digging it appears this is caused by the default settings in the modern iPhone Camera being to store images in HEIF format -- a High Efficiency Image Format, used with HEVC, High Efficiency Video Coding (H.265) (see also HEIF iPhone Photos in iOS 11). HEIF is a container format, capable of storing multiple image frames and H.264 video. I assume this video format is also being used as part of "Live Photos", at least by default.

The result of this is that when you copy "JPEG" images with Image Capture.app they appear to be synthesised on demand, with the result that the JPEG files vary slightly. In particular it appears everything is created identically except a UUID value, which appears to be recreated per copy (or at least each time you connect Image Capture.app to the phone):

ewen@ashram:~$ exiftool Desktop/IMG_5260.JPG  >/tmp/desktop
ewen@ashram:~$ exiftool Pictures/IMG_5260.JPG  >/tmp/pictures
ewen@ashram:~/Desktop/apse$ diff -u /tmp/desktop /tmp/pictures | grep "^[-+]"
--- /tmp/desktop    2018-11-24 20:50:52.000000000 +1300
+++ /tmp/pictures   2018-11-25 10:14:27.000000000 +1300
-Directory                       : Desktop
+Directory                       : Pictures
-File Access Date/Time           : 2018:11:24 20:50:33+13:00
-File Inode Change Date/Time     : 2018:11:24 20:50:28+13:00
+File Access Date/Time           : 2018:11:24 20:35:26+13:00
+File Inode Change Date/Time     : 2018:11:15 13:11:39+13:00
-Content Identifier              : 0F3C5F38-B578-4CDF-84F6-50C89A5B5C10
+Content Identifier              : 51A6B6AD-8484-465D-913C-112A4CC97931
ewen@ashram:~/Desktop/apse$ 

This does not happen when the photos are saved in JPEG directly, as was done with the iPhone Camera application prior to iOS 11. (It is an unfortunate oversight really, as it appears to me that the "Content Identifier" UUID for the JPEG could have been stashed in the HEIF file somewhere, or derived from stable values, which would have resulted in reproducible bit for bit identical exported files; I would consider that a bug, but Apple possibly do not.)

In iOS 11 you can change the format in which the iPhone Camera application stores photos via iPhone Settings -> Camera -> Formats where:

  • "High Efficiency" means store in HEIF/HEVC format

  • "Most Compatible" means store in JPEG foramt

For now, I have set mine back to "Most Compatible" as I value bit for bit identical files to reduce the storage requirements on my computer.

Eventually, when HEIF files can be copied and manipulated directly -- avoiding the constant data changes of synthesising JPEGs inexactly -- the HEIF format is probably a better choice (amongst other things it can store a greater bit depth than JPEG -- 10 bit currently, with up to 16 bit allowed by the format -- and will typically provide better compression on higher resolution photos). See also the Apple WWDC 2017 presentation on High Efficiency Image File Format, the 500px blog post on HEIF, and the Nokia Technology site on HEIF. It appears Apple are, as is often the case, at the forefront of deploying the HEIF format.

I think if you change this setting and then use "Duplicate as Still Image" (above, to resolve the "Live Photo" mess), the resulting files are being resaved as JPEG at that point. But I have not been able to completely verify that. (Certainly the "Duplicate as Still Photo" versions are not bit for bit identical with the JPEG from the "Live Photo" by any means, after having changed this setting, but it is unclear if that is due to the "Duplicate as Still Photo" feature or changing the default storage format.)

Posted Sun Nov 25 12:08:39 2018 Tags:

Following on from the Specialist Tracks Day at PyConAU 2018 and the First Main Conference Day at PyConAU 2018, was the Second Main Conference day, and then the Development Sprints.

The second day started with a keynote from Tracy Osborn, and then broke out into four rooms of 30-minute talk sessions for the day, and finished up with more lightning talks.

Keynote: Tracy Osborn

Tracy Osborn (@limedaring) is someone who grew up around tech, was persuaded "tech was not for her" in college, and then returned to tech through web development. Tracy is best known in the Python community for her "easy to get started in web development" book series, Hello Web.

Tracy grew up in the mountains of Northern California, in a family that was into tech -- her grandfather worked at IBM for much of his career, and an uncle worked in tech too -- in an area that was into technology. So she was around computers basically from birth, and around the Web as it was starting to become popular. She even made her own websites, instead of writing school reports, back when that involved writing HTML by hand -- Tracy observed this was a shortcut to a good grade as the teachers were very impressed. She even made her own fan sites, including Tiryn Forest (hosted on Angelfire, and last updated in 1999!).

So naturally when Tracy went to college at Cal Poly she chose to study Computer Science. She had been doing it all her life, so it would be easy, right? Within the first hour of the first class, an introductory Java class, she was suddenly out of her depth and thinking she had missed something (as much as Java is a common teaching language, because of late 1990s history, it is not an easy language to get started with :-( ). She struggled on with Computer Science for much of her first year in college, doing well in the courses that involved design and less well in the courses involving studying Algorithms, until eventually a professor suggested "maybe computers are not for you" -- and then she quit Computer Science, and got an art degree in Graphic Design instead.

After college she worked as a front end web designer, avoiding JavaScript due to the trauma of Java classes (JavaScript is very different from Java, but was deliberately named to seem related back in the early days of the web; how unfortunate that naming backfired).

The rest of the keynote was the story of how Tracy found her way back into technology -- and ended up writing books about programming and web development. The short version is that she moved to Silicon Valley, everyone had a startup, and so she wanted to have a startup too.

Her startup was Wedding Invite Love, which has branched out into a number of related websites. Because her attempt to find a "technical co-founder" was unsuccessful, Tracy was drawn back into web development, this time with Django. She wrote ugly Python code -- but it worked. And in the process of running a startup, and seeing other startups' code, she learnt that working code was more important than beautiful code for a startup -- "end users don't see what is inside", and you can refactor later once you have learnt more. "Learn by doing," which is pretty much the only successful way to run a startup -- and the motto of her college.

After getting burnt out on running a startup she took a break. And for her break, she wrote a book: a better tutorial for starting with Django, based around a simple customisable tutorial site, without assuming any programming background. Inspired by A Book Apart, she tried to get them to buy her book -- and then No Starch Press. But royalties are complicated and self-publishing was becoming more common, so she ran a Kickstarter campaign, publicised it at PyCon 2014 and published the book herself. The success of that book led to more Kickstarter projects for more books, all published herself. And now she has helped many users learn programming, and design, despite being told computers were not for her.

Tracy also gave some advice for others wanting to follow their own startup / book path, including:

  • "Keep the marketing in mind when you build a product", eg talking to everyone at PyCon 2014 about her upcoming book helped make the Kickstarter a success (and it seems many of her books have been a success due to word of mouth in Python community).

  • Projects always take longer than you think; build in a bigger buffer in your timeline than you think you'll need.

  • Writing her book in Google Docs, using Markdown, allowed people who were concerned with how long the book was taking to come out to review the content before publication -- and that feedback helped improve the book. (She laid the book out in InDesign, due to being familiar with it from her graphic design background.)

  • Mathematical perfection does not mean visual perfection, for instance the perceived colour can be affected by the amount of colour present, even with exactly the same colour used; and the bottom matte on art needs to be a little wider than the rest of the edges to avoid it appearing "too thin" (when the art was hung above eye level; but it is still done for effect now).

Earlier in her keynote Tracy also referenced her PyCon 2017 keynote: Anxiety, Self-Advocacy, and Promoting Yourself (video on YouTube), which seems worth going back and watching too.

Implementing a decorator for thread synchronisation

Graham Dumpleton, author of mod_wsgi (to link Apache to Python web applications), wanted to replicate the Java synchronized feature in Python, rather than needing to use lock objects directly.

His approach was to create a synchronized decorator, which can be applied to various Python features and automagically makes them synchronised, using a lock on an appropriate thing (eg, the object for a member function). The talk described how he evolved the design to make it more flexible, including how he used the wrapt module (which he wrote; documentation) to make the decorators more context-aware.

For anyone interested in the implementation details, the presentation slides contain lots of detail on the subtle edge cases the implementation had to handle. But for anyone who just wants to use it, the wrapt module includes his final synchronized decorator, usable with:

from wrapt import synchronized

@synchronized
def function(...):
    ...

whether the function is a top level function, an instance method or a class method.

Reflections on the Creative Process - Illustrated with Watercolour Painting

Grace Nolan, who works on IT Security at Google, and is helping organise PurpleCon, also does "wet on wet" watercolour painting as a creative outlet. She spoke, illustrated with painting intermissions, about what wet on wet watercolour painting had taught her about being creative in the context of software engineering. Grace started out with interactive art, and ended up in programming because of Kiwicon (an IT security related conference).

"Wet on wet" watercolour painting involves painting onto pre-wet paper, with the result that the colour tend to bloom a lot, and slowly blend together in watever way they want -- it is not entirely predictable.

Parallels with technology include:

  • The paper type matters -- textured paper tends to absorb the paint, and a more flat paper lets it sit on top ("know your hardware").

  • Laying the groundwork is important. Preparing the paper for painting is a lot like writing pseudo code before writing the real code.

  • Watercolour painting, and programming, can be stressful

  • You start with an optimistic belief that you can do it, but it does not quite work out how you hoped -- in painting and technology. (With programming there is more "reputation risk" -- an echo of Tom Eastman's keynote the day before.)

  • When you end up stressed by a task, especially a programming task, take time away. Eg, slowly sip a glass of water, and be present to the experience of the water.

  • Accept the situation: you cannot fight against what the water wants to do when painting.

  • The main reason she gets stressed is that she does not know what is happening. Giving up is often a short term solution. But the problem may still be there later. Getting more information (eg, looking at logs, or research) allows her to keep going, which is more in line with her values. Then she can commit to that decision.

  • Learning the techniques of others can help improve your craft.

  • Water colour "black" is usually made by blending different complementary colours together -- and you get a different "black" depending on which ones you choose.

  • Water colour gets its vibrancy from the white paper beneath; one of the worst things you can do is overpaint. Knowing when to stop is important in painting, and in programming.

Grace finished with some key takeaways:

  • Reflect on how you work

  • Self soothe

  • Talk to others about how you feel

  • Know that your community of people are willing to help and support you.

The approach she described was based on Acceptance and Commitment Therapy. Grace also credited Chantal Jodin, a French artist (Google translation), as a key inspiration for her painting, including the piece she painted during her talk.

The video of the presentation is well worth watching, for the inspiring painting while presenting.

FP Demystified

Eugene Van den Bulke was "FP Curious" -- curious about Functional Programming -- so he went to Lambda Jam and came away enthusiastic. He recommended Eugenia Cheng's keynote on Category Theory and Life, from Lambda Jam 2018; Eugenia is the author of "The Art of Logic".

Eugene's aim was to port Brian Lonsdorf's class on Functional Programming in JavaScript (featuring claymation hedgehogs teaching FP in JavaScript) to Python. I think he succeeded in porting the code, but it felt like the 25-30 minute presentation format... did not help with demystifying a large complex topic!

Because of the speed of the presentation I struggled to take notes on everything covered -- it was a whirlwind tour of category theory, with examples in Python, presented from a Jupyter notebook which he filled in as he went. I suspect even watching it again one would need to pause repeatedly to take notes!

Some (hopefully not too inaccurate!) highlights:

  • A Box wraps a value; and a fold can extract the value from the box and apply a function to it. For instance in Python map can apply a function over a (set of) values. A Box is a functor -- something that can be mapped over. (The Box here is a custom implementation, rather than a Python built in.)

  • Currying translates a function taking multiple arguments into a collection of more specialised functions each taking one argument. This allows partial specialisation or pre-binding. In Python partial specialisation can be done with:

    from functools import partial
    foo = partial(FUNCTION, ARG)
    

    or via returning a closure with the first argument bound.

  • An Applicative Functor has more structure than a plain Functor, but less than a Monad. They allow you to use apply() as well.

  • A lift makes a function usable with another type (eg, a wrapped type, like Box above).

  • Either is a type that allows storing two types of values, by convention a left value or a right value, such as the result of a function or an exception. An Option is a special case where the right type is None (so you can have a value or nothing, making the value optional, like NULLable columns in a database). A fold on an Either takes two functions (one for the left value type and one for the right value type).

  • A Monad is a design pattern that allows boxing and chaining function calls together, because each function call returns the same (boxed) type. (See also Crockford's Law that when you understand a monad you lose the ability to explain it to others :-) )

  • A SemiGroup adds a concat method that combines the object with another of the same type; a Monoid is a SemiGroup with an identity ("empty") element.

It was a valiant attempt, but as noted above felt fairly rushed, particularly in an area like Functional Programming which suffers a lot from "obscure" terminology (borrowed from Mathematical Category Theory -- the terms are precise, and accurate, but obscure/complex for what they represent).

Perhaps reviewing the video alongside the Jupyter notebook might help make it clearer. Or watching it alongside the original claymation hedgehogs :-)

Task Queues in Python: A Celery Story

Celery is the de facto default task queue used with Python. It provides a message queue of tasks, and a broker that distributes tasks to workers, both pieces of which have multiple alternative implementations (eg, Amazon SQS). It is commonly used in Python to get message passing concurrency (because Python has relatively poor support for in-process threading, due to its internal locking).

While Celery is simple to start with, there are lots of features and it can quickly get complex to configure if you have a more specialised use case. Out of the box Celery is tuned for short running tasks, and is not ideal for longer running tasks (minutes/hours/days). Unfortunately there are multiple places to configure Celery, and they interact in complex ways -- the presenter found they had to disable prefetching of tasks, in multiple places, to make their workload usable. And even then sometimes jobs got stuck in the queue while they still had capacity available.

There are other task queues available for Python including:

  • RQ -- Redis Queue -- which uses Redis as the queue storage. It is very simple and understandable.

  • Huey -- another little task queue, supporting Redis and SQLite. It has nice task chaining and distributed locking.

  • Dramatiq, a task queue supporting RabbitMQ and Redis. The presenter noted the documentation is not as good as they had hoped.

  • TaskTiger, another Redis based queue, with distributed locking support and more queueing primitives.

There is also Dask, which is not a task queue, but can be used for similar things; if you are already using Pandas it might be the best option.

The presenter switched to using RQ, which they are happy with. They chose it in part because they were already using Redis for caching in their application, so it did not introduce any more dependencies. (They are also using their own serialisation; the RQ default is JSON, which has some limitations on what can be serialised.)

How To Publish A Package On PyPI

The talk was renamed "Publishing (Perfect) Python Packages on PyPi" after submitting the abstract, as the presenter liked the alliteration :-)

There were two main approaches presented, the manual approach, and an almost entirely automated approach.

The manual approach involves using setuptools:

  • Create a new directory, with a src subdirectory

  • Write your module, put it in the src directory (putting the code in src avoids accidentally running uninstalled versions).

  • Write setup.py in the top directory, using setuptools:

    from setuptools import setup
    setup(name=...,
          version=...,
          description=...,
          py_modules=[...],
          package_dir={'': 'src'})
    
  • Then run:

    python setup.py bdist_wheel
    

    to create a build folder, and an egg-info folder (eggs are releases built for a specific Python version)

  • Test your installation locally, in a virtualenv:

    virtualenv venv
    . venv/bin/activate
    pip install -e .
    

    (-e so that it imports the thing you are editing, with links, which allows you to edit and retest that it works)

  • Add a .gitignore file; see gitignore.io to help create a useful .gitignore template for your language.

  • Add trove classifiers to help make your project findable by common search terms; these go in the classifiers=[...] argument to your setup() call.

  • Add a license, in LICENSE.

  • Create documentation in RST or Markdown (RST is common in Python). Write at least a README.md. Use Sphinx or ReadTheDocs to publish the documentation.

  • Consider using the README.md as your long description in the documentation, getting your setup.py to read in README.md from a file; PyPI now supports Markdown syntax.

  • Use pipenv for testing:

    pipenv install -e .
    pipenv install --dev pytest ...
    pipenv shell
    

    This creates a Pipfile file, and a Pipfile.lock which records the exact versions/hashes used. (The "lock" here is locking in the versions, not a traditional unix lock flag file.)

  • Recommendation is to use setup.py for production dependencies, and Pipfile for development dependencies. Keep the versions in the Pipfile as relaxed as possible; the lock file will record the known good/last used ones.

  • Build a source distribution:

    pip install check-manifest
    check-manifest
    python setup.py sdist
    

    This builds a source tarball. It wants a URL and author details in the metadata, and a manifest of files to include; use MANIFEST.in to add extra files. (check-manifest will create a manifest from the files checked into git.)

  • Upload your package with twine:

    pipenv install --dev twine
    twine upload dist/*
    

    You need to create a PyPI account first. (Do not use setup.py upload; it uses an insecure upload method.)

  • Other things to do: test against other Python versions, eg, with tox, which creates a virtualenv for each Python version you want to test and runs your tests in that environment. Use Travis to do automated testing if your code is on GitHub, eg to automatically check pull requests.

The automated approach is to use cookiecutter, which will ask you a few questions and then give you a "best practice" template directory for your project with everything ready to go:

pip install cookiecutter
cookiecutter gh:audreyr/cookiecutter-pylibrary  # or
cookiecutter gh:audreyr/cookiecutter-pypackage

It directly grabs the template out of GitHub.

The examples used in the talk are a useful reference, in addition to the Python Packaging Authority guides.

Watch out for the Safety Bandits!

Tennessee Leeuwenburg wanted to highlight two security related tools to help make your code more secure:

  • safety checks your installed dependencies for packages with known issues (eg, CVEs):

    pip install safety
    safety check
    

    There is an insecure-package which you can install to make sure safety is finding issues.

  • pyup.io can automatically alert you to issues with dependencies, for free if your package is maintained in public.

  • bandit looks for common programming patterns that are known to be weak. It analyses the internal structure of your code, as a tree. (Suggestion: group code with the same trust level in the same directory hierarchy, so you can focus your most paranoid scans on the code most at risk, and reduce the noise on code that only works with "trusted" input.)

Lightning Talks

chunksof: a generator

Tim Heap wrote a generator which will break an iterable up into chunks, because there were many attempts on StackOverflow of varying correctness and lots of other approaches were not perfect:

  • slice() is good for lists, but breaks on generators.

  • itertools.islice() is good but in some situations may not terminate (eg, if the iterator is empty; need lookahead using itertools.tee() and contextlib.suppress()). Unfortunately tee means that the iterated elements are not garbage collected until the end, so you can run out of memory.

  • Another alternative using itertools.chain() which works, and allows garbage collection of everything but the first example.

  • More elaborate alternative using a yieldone() inner generator, which allows everything to be garbage collected, but does not allow consuming the items out of order.

The "perfect" version is apparently only available as a Tim Heap gist of chunksof.py. To see the examples it appears you have to watch the video, and freeze frame, as they do not appear to be anywhere else.

PyO3

Nicholle James is making an anthology of Python 2.7 fan fiction, to be made available online for free under a CC-BY-SA license, with a print edition at PyCon 2019. Pitches were due by 1 September, with work due by around the first quarter of 2019; look out for an online release by about mid 2019.

Tracking trucks in East Africa (Nairobi, Kenya)

Tisham Dhar of Lori Systems helps implement a system to track trucks in Africa. "Uko Wapi?!": Where is the truck?! They have a smartphone application, but there is only about 40% penetration of smartphones amongst truckers. So they are using Traccar, a Java app, to track vehicles using vehicle GPS tracking hardware that they reverse engineered (which sends data back via an M2M SIM).

Where the tracker does not send data back, they call the driver, ask "uko wapi?", get a place name from the driver, and then use Google Maps API to locate the truck, sanity checking that the location makes sense on the route that the truck is on.

Python Sphinx and Jira

Brett Swanson, of Cochlear Ltd, works in a heavily regulated industry (hearing implants), which means they need detailed software release reports. They used to build these by hand, which was slow and error prone; now they generate tables in Sphinx (using list tables, which were easy to generate) with a Python script that can pull in the Jira issues.

String encodings and how we got in this mess

Amber Brown (hawkie) covered a whirlwind history of text encoding including:

  • Baudot, a 5 bit encoding used by telegraphs, with multiple variations (ITA1, Murray, Western Union, ITA2), since they mostly did not need to interoperate.

  • BCD, a 6-bit punch card encoding, from the 1920s

  • EBCDIC, an 8-bit encoding from IBM in the 1960s (part of the System/360 project).

  • ASCII, a 7-bit American standardised encoding, also from the 1960s

Eventually it was realised that none of these were enough to hold all the world's characters, which led to:

  • Shift JIS, an 8-bit/16-bit variable encoding for Japanese which includes ASCII as a subset

  • ISO 8859-x, a series of 8-bit encodings compatible with ASCII (but incompatible with each other)

  • UCS-2 and UTF-16, two 16-bit encodings to handle more characters

  • Unicode and UTF-8; Unicode is regularly extended to add more characters

Detection of which character set is in use, without metadata saying so, is difficult, which can lead to Mojibake -- strange results from decoding in the wrong character set encoding (originally named in Japan).

Building 3D physics simulations

Cormac Kikkert is a student who built a physics simulation in Python, using PyGame (rather than trying to use a 3D library directly). His approach is to divide the object into lines, then move those lines in the physics simulation, and redisplay; movement is with points and a velocity vector.

Python Bugs: Pentastomida

Libby Berrie gave her first talk, at her first PyCon, about Pentastomida, an actual bug which affects actual snakes. She is a front end developer who has a degree in bioinformatics.

Pentastomida is a very old parasite (dated to around 450 million years old), which primarily infects snakes. Eventually the snakes cough up the parasites, which last for ages and are eventually eaten by the prey of snakes... and get back into snakes.

Apparently Pentastomida can also affect humans, generating flu-like symptoms and cysts in some (but most human infections are asymptomatic).

Don't do this

Lewis Bobbermen, a student who works at Polymathian, wrote a decorator that allows using square brackets instead of parentheses -- by abusing __getitem__. He also implemented currying by overriding __or__ and __call__, and f-strings (new in Python 3.6) in Python 2 by walking back up the stack frame.

This eventually led to his flip-flop operator, the first one submitted.

Confessions of a goto user

Alen Pulford, a student who was part of the student showcase, likes the GOTO statement. He gave examples in bash, Arduino (C), and a Python example... which reopens the source code. He refined that to a version that figures out where it is called from, and a recursive definition.

Flip Flop Face Offerator

Merrin MacLeod returned to the stage with a followup to her Saturday lightning talk on the Flip Flop operator. In the intervening 24 hours things had gotten a little out of hand, and in addition to Lewis's implementation above (the first submitted) there were several more implementations. So they had a Face Off, with two judges and voting.

You need to watch the lightning talk or look at the collected implementations, as I cannot do justice to the judges reactions!

MicroPython and Jupyter

Andrew Leech described jupyter-micropython-remote (source) which allows connecting a Jupyter notebook to MicroPython, which is very useful for debugging. It currently requires a daily build of MicroPython (or 1.9.5 when that is released), as the communication interface is via mprepl.

Controversial PEPs (of the past)

Nick Coghlan, a Python core developer, summarised some controversial Python PEPs of the past:

  • PEP 227 -- statically nested scopes (eg, functions in functions), available since Python 2.1

  • PEP 318 -- decorators, available since Python 2.4

  • PEP 308 -- conditional expressions, available since Python 2.4

  • PEP 340 -- anonymous block statements, rejected. But he noted PEP 342 (coroutines via generators), PEP 343 (the with statement), PEP 380 (delegating to a sub-generator) and PEP 492 (async/await) all came from ideas in PEP 340.

Many of which have become accepted core parts of Python.

Development Sprints

I was back on Monday and Tuesday for the Development Sprints, mostly because I was staying in Sydney for another conference at the end of the week -- so it was easy to go to two days of sprints, have one day off, and then go to the second conference.

For my sprint days I mostly worked on FuPy, MicroPython on FPGA, including updating to the current upstream MicroPython version; I posted a summary of work at the sprint to the FuPy Mailing List.

Posted Sun Sep 16 21:56:20 2018 Tags: