Linux.Conf.Au 2016 happened last week, in Geelong, Vic; I wrote about the first day of LCA2016 during the conference, and the second and third days yesterday, so this is the third part of the trilogy, with notes from the last two days.

Day Four -- Thursday

Jono Bacon's keynote

Thursday started with a keynote by Jono Bacon about "Community 3.0" -- ie, how to build better Open Source communities based on what we have learnt in the last 15-20 years. ("Community 1.0" was in the late 1990s with people trying to figure out how to run an effective community; "Community 2.0" has been the last 10 years or so, with some effective models to follow, but not perfect.)

Jono referenced David Rock's SCARF model for collaborating and influencing others:

  • Status: relative importance

  • Certainty: security/predictability

  • Autonomy: creating intentional choice (cf choice architecture)

  • Relatedness: clear social groupings

  • Fairness: reduce unfair opportunities/rewards

He suggested the golden rules of community are to accomplish goals indirectly, and influence behaviour with small actions (see, eg, Richard Thaler's book "Nudge").

Jono's "Community 3.0" embraces:

and realises that the most important thing is that participants have a "sense of belonging", and that it is important they have an opportunity to do things.

Hardware and Software of the Machine

"The Machine" is a Hewlett Packard Enterprise (HPE) Research project, with numerous nodes (each with a mere 256GB of RAM), attached to a large fast store -- Fabric Attached Memory. The idea is to build a new kind of "super computer", with many petabytes of RAM; as a design it is sufficiently large to exceed the address bus of the CPU nodes (53 bits needed to address it; but the CPU can address only 48 bits). A small installation will be a rack; a large installation will be an entire data centre, all connected via high speed transport to the same memory.

There is an emulator of the Fabric Attached Memory available now, and in theory there will be ("small") hardware (a mere rack) available later this calendar year.

Machine Ethics and Emerging Technologies

Paul Fenwick gave another entertaining, thought-provoking talk -- this time about what machine ethics should be in the "far, far, future" (10,000 years -- Paul's "one weird trick" to get people to think about a future reality without protesting that it is impossible).

In particular it seems likely that autonomous vehicles will be common, if not ubiquitous. Which raises a new set of human/machine ethical questions -- eg, at what point should the "robot vehicle" sacrifice itself and/or its occupants to save others? Paul's thesis is that, in everyone's view, there is always some point at which it should -- but where that line lies is an open question. More relevant to an Open Source conference: if the machine is programmed to sacrifice its occupants in some situations, should it be impossible to modify the machine (eg, via extensive DRM)?

He also pointed out that machines have always replaced certain jobs (a "water bearer" was a real job before plumbing, for instance); but historically there have been "new jobs" to replace them. This seems to be happening less -- mechanisation is reaching a point where there just is not as much need for humans to do everything. (For instance, a 70-hour work week was common in the 1830s, whereas a 40-hour work week has been the norm for decades.)

One consequence of shorter working hours, and more mechanisation, is that people will have more leisure time (1880: approx 44k lifetime leisure hours; 1995: approx 112k lifetime leisure hours), and there will be more unemployed, or under-employed, people. How does society cope when many people are not needed in the workforce? "Having a job" is currently a major form of status.

We've been here before -- the Destruction of Stocking Frames Act 1812 was a response to previous Luddite behaviour.

There was also a discussion of the potential good (and bad) of "drones" -- both as a means of delivery, and as a weapon. (As one person pointed out during question time, arguably drones as a weapon are less bad than laying landmines -- at least when the war is over, the drones can be trivially removed.) This future is basically here -- a solar powered drone has flown for over two weeks and another for multiple days.

(Paul's talk slides.)

Open Source tools for Distributed Sysadmin

Elizabeth K Joseph spoke about OpenStack's system administration process -- where they treat system administration "as code". They use the same OpenStack CI (Continuous Integration) system as the rest of OpenStack for their Puppet code, and review proposed changes with the same Gerrit.

They use IRC for coordination (eg, #openstack-infra, #openstack-infra-incident), rather than voice/video calls because there is a record and none of the sysadmins like voice/video calls. And they have written tools like PuppetBoard to allow everyone to monitor the status of applying system configuration changes.

(The source to all the OpenStack infrastructure code is Open Source -- aside from a few private items like keys/passwords kept in a private repository; the system configuration documentation is also open.)

Record and Replay Debugging with rr

rr is a Mozilla Research project which allows "record and replay" debugging, of unmodified programs, with only a 1.5x overhead (cf 2x-3x for compiling with debug symbols, and much more for other instrumentation).

It looks ideal for use if your program fits within its constraints:

  • runs on modern x86/x86-64 CPU

  • on modern Linux kernel

  • on a single CPU core (ie, no true parallelism)

  • does not interact with other (untraced) programs (eg, no shared memory)

It uses ptrace, hardware performance counters, and seccomp-bpf to track only the subset of system calls it cares about. At each such boundary it records the inputs to the program, including signals, etc, so that it can reconstruct them on replay.
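
To make that a bit more concrete, here is a minimal sketch of the ptrace syscall-stop loop that rr builds on -- this is not rr's code, just my own illustration for x86-64 Linux:

    // Minimal syscall tracer (illustration only, not rr): run a program under
    // ptrace and log each system call number at its entry and exit stops.
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <unistd.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        if (argc < 2) {
            fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
            return 1;
        }
        pid_t child = fork();
        if (child == 0) {
            ptrace(PTRACE_TRACEME, 0, nullptr, nullptr);   // let the parent trace us
            execvp(argv[1], &argv[1]);                     // run the target program
            return 127;                                    // only reached if exec fails
        }
        int status;
        waitpid(child, &status, 0);                        // child stops at exec
        while (!WIFEXITED(status)) {
            ptrace(PTRACE_SYSCALL, child, nullptr, nullptr);  // resume until the next syscall stop
            waitpid(child, &status, 0);
            if (WIFSTOPPED(status)) {
                struct user_regs_struct regs;
                ptrace(PTRACE_GETREGS, child, nullptr, &regs);
                // stops happen at both entry and exit, so each call logs twice;
                // a real recorder would snapshot the syscall's results here
                printf("syscall %lld\n", (long long)regs.orig_rax);
            }
        }
        return 0;
    }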

On replay, the usual gdb interaction is possible -- including running queries that involve gdb injecting code into the process (which is done in a fork()ed copy to avoid mutating the replay process).

Speaking their language

My final talk for the day was by Rikki Endsley, of opensource.com, about writing for non-technical audiences. She pointed at lots of good resources along the way.

Much of the rest of the talk was inspired by Stephen King's "On Writing" memoir, including relevant quotes -- which led to her article Stephen King's Practical Advice for Tech Writers.

In brief summary, the tips:

  1. Good writing requires reading

  2. Invite the reader in (a good introduction mentions the audience and the topic)

  3. Tell a story (rewrite to take things which are "not the story" out)

  4. Leave out the boring parts (eg, for pacing; but note on the web that you can always hyperlink to other documents!)

  5. To edit is divine (revisit after a break; ideally get input from others)

  6. Start writing ("the scariest moment is always just before you start" -- Stephen King (emphasis added))

Day Five -- Friday

Genevieve Bell's keynote

The last day started with a keynote by Genevieve Bell, the "redheaded Australian anthropologist" who has worked for Intel for the last 15+ years. She gave a very entertaining keynote about how she'd been recruited by Intel (over a period of 6+ months) and about her job role. Her answer to the pat interview question "is there anything else we should know" -- "I'm an unreconstructed Marxist and a radical feminist" -- apparently received the response "will we like that?"; Genevieve's answer was "well, the first six months might be a bit rough". Her original job assignments were "women -- all" and "rest of world"; Intel wanted everything other than "North American men" explained! As she said, that is quite a bit of job security...

The video is well worth watching. After her "origin story" the talk focused on the challenges of explaining technology use in other parts of the world to Engineers -- and then later in the talk, on explaining Engineers to themselves.

Helicopters and Rocket Planes

As usual, Andrew Tridgell gave an excellent talk -- this time about the new autonomous navigation challenge, which includes vertical take-off and landing (VTOL). So they're investigating both helicopters and fixed-wing planes fitted with extra "quad copter" blades for the vertical take-off/landing part. They appear to have both solutions working to the point where they can fly test routes reliably -- although obviously the helicopter solution is the more finicky to get right (there's a good example in the talk).

The rocket part comes from the Lohan rocket plane, of the "Special Projects Division" of The Register. It is a "Low Orbit Helium Assisted Navigator" -- a 3D-printed plane that will be carried to high altitude by a helium balloon, and will then autonomously navigate its way back to earth once the balloon pops. Andrew Tridgell has been assisting with the flight computer design, and was able to describe/show some test flights of the flight computer; sadly the Lohan itself is still awaiting permission for a test flight.

Again the video is well worth watching.

The World of 100G Networking

Now that 10Gbps networking is becoming common on servers, and 40Gbps aggregation is common (via QSFP+ -- 4 * 10Gbps), the new frontier is 100Gbps networking. 100Gbps is being achieved as 4 * 28Gbps, less encoding overheads, over QSFP28 -- which replaces the older CFP connectors carrying a less useful 10 * 10Gbps.
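
If I have understood the encoding correctly, the arithmetic is four lanes of 25.78125 Gbaud signalling with 64b/66b line coding -- my reading of "less encoding overheads", not figures from the talk:

    // 100G Ethernet lane arithmetic, assuming 4 lanes at 25.78125 Gbaud
    // with 64b/66b encoding (an assumption, not the talk's own numbers).
    #include <cstdio>

    int main() {
        double gbaud_per_lane = 25.78125;                  // raw signalling rate per lane
        double payload = gbaud_per_lane * 64.0 / 66.0;     // after 64b/66b line coding
        printf("1 lane:  %.0f Gbps (the pre-standard 25G)\n", payload);
        printf("2 lanes: %.0f Gbps (the pre-standard 50G)\n", 2 * payload);
        printf("4 lanes: %.0f Gbps\n", 4 * payload);       // 100 Gbps
        return 0;
    }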

There are currently three competing 100Gbps standards:

  • Infiniband (available now)

  • Ethernet (NIC designs ready)

  • Omnipath (from Intel; coming soon)

On the server side, all of these are aimed at high performance compute clusters at present. However Ethernet has pre-standard versions of 25Gbps (1 * 28Gbps, less encoding) and 50Gbps (2 * 28Gbps, less encoding), which seem to be the likely fractions provided to other servers (based on what happened with 40Gbps); 25Gbps standardisation is due later this year, and 50Gbps standardisation is expected by 2018.

The rest of the talk was about software interfaces to using these faster network interfaces efficiently. As the talk points out, at 100Gbps there is much less time to process each packet/bit, and unlike a hardware assisted switch/router, it is not possible to just punt everything to custom hardware...

There are five candidates:

  • Socket API (ie, as now, but with additional tuning; only really practical for Ethernet; lots of tuning and many cores are required to hit 100Gbps -- see the sketch after this list)

  • Block File IO

  • RDMA -- Remote Direct Memory Access (common with Infiniband)

  • OFI -- Open Fabric Interfaces -- via libfabric, an RDMA-like interface abstracted by Intel to work on other technologies.

  • DPDK, which is a user-space API for efficiently processing packets.
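
As a rough illustration of the first option, here is a hedged sketch of what "Socket API with additional tuning" can look like -- the buffer sizes are purely illustrative, not recommendations from the talk:

    // A plain TCP socket with bigger kernel buffers and Nagle disabled: the
    // sort of knobs you start turning well before 100Gbps (sketch only).
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0) { perror("socket"); return 1; }

        int bufsize = 16 * 1024 * 1024;   // 16MB socket buffers (illustrative only)
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bufsize, sizeof(bufsize));
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize));

        int nodelay = 1;                  // disable Nagle for latency-sensitive traffic
        setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &nodelay, sizeof(nodelay));

        // connect()/send()/recv() as usual; at these speeds you would also spread
        // connections across cores and tune interrupt/CPU affinity.
        close(fd);
        return 0;
    }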

Linux support for some of these technologies is available from approximately the Linux 4.3/4.5 kernel (and RHEL 7.2/7.3), depending on the technology.

The talk also pointed out that 100Gbps (12.5GB/s) can more than saturate the memory bandwidth of slower DDR3 (6.5-17 GB/s), and is definitely much faster than even a very fast single local disk (150MB/s). So as these technologies become more ubiquitous, expect local storage to be "just for booting".
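
Restating those figures with the units lined up (my conversion of the talk's numbers):

    // Bandwidth comparison from the talk, converted to GB/s for contrast.
    #include <cstdio>

    int main() {
        double link_GBps = 100.0 / 8.0;          // 100Gbps line rate = 12.5 GB/s
        double ddr3_lo = 6.5, ddr3_hi = 17.0;    // DDR3 range quoted, in GB/s
        double disk_GBps = 0.150;                // a fast single local disk, ~150MB/s
        printf("100G link:  %5.1f GB/s\n", link_GBps);
        printf("DDR3:        %.1f - %.1f GB/s\n", ddr3_lo, ddr3_hi);
        printf("local disk: %6.3f GB/s\n", disk_GBps);
        return 0;
    }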

Free as in cheap gadgets: the ESP8266

The ESP8266 is a Wifi radio with a small microcontroller (80MHz 32-bit CPU, with 96KB data RAM, 64KB code RAM, and 64KB code ROM), available for about $2 (quantity one). Hobby development boards like the Sparkfun Wifi Module - ESP8266 are under $10 (quantity one). So quite a few hardware projects are being built around them, even where Wifi is not a major feature. (There's a cool die shot available online.)

Development was originally only via a proprietary SDK with limited access; but it leaked in 2014, and that led to some more openness in the development environment, including an official (binary) public SDK released in October 2014 and an open source RTOS released in December 2014. Support for the ESP8266 in the Arduino environment was added in March 2015, and proved to be very popular. For instance, it is used as part of several simpleiothings.com projects.
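
To give a feel for why the Arduino support made the chip so popular, here is a minimal Arduino-core style sketch that joins a Wifi network -- the SSID and password are placeholders, and I am assuming the standard ESP8266WiFi library API:

    // Minimal ESP8266 Arduino sketch: join an access point and print the IP.
    // (SSID/password are placeholders; serial speed varies between boards.)
    #include <ESP8266WiFi.h>

    const char* ssid     = "my-network";    // placeholder
    const char* password = "my-password";   // placeholder

    void setup() {
      Serial.begin(115200);
      WiFi.begin(ssid, password);              // join the access point
      while (WiFi.status() != WL_CONNECTED) {  // wait until we have an IP address
        delay(500);
        Serial.print(".");
      }
      Serial.println();
      Serial.print("Connected, IP: ");
      Serial.println(WiFi.localIP());
    }

    void loop() {
      // application code goes here
    }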

The documentation is still fairly limited, but work is going on to reverse engineer more functionality. And esp-open-rtos is attempting to put together a full open RTOS environment, from bits released and reverse engineered.

(For more details see the talk slides.)

Life is better with community automation

Emily Dunham described how the Rust community works to avoid conflict, after a very brief introduction to Rust (including pointing out Multirust and the Rust Playground as places to start with the language).

The basic idea is that they have an extensive test suite, aim to maintain an "always working" source code repository, and have a merge robot ("bors"; current code) that will merge anything that has been reviewed and passes all the tests. That removes the conflict between people over merges, turning it into an "us vs Robots" struggle to get good code. By merging ASAP it also reduces the "out of sync with trunk" development issues.

They also have an extensive code of conduct, posted everywhere, which has been in place from the beginning -- and tends to discourage people who might run up against the code of conduct from even becoming involved. So they have a community mostly populated by nice people. (It was noted that such codes of conduct are much more difficult to add later; the question period has some discussion of how it might be done.) They also do other things to promote good community, including awarding "friends of the tree" (who did good work on the code base) and recognising new contributions in their newsletter, which links back to Jono Bacon's "Community 3.0" keynote.

Of possible note, they do language regression testing by running all known Rust code on both the old and new language versions and looking for differences.

Lightning Talks

ETA, 2016-02-16: to add notes on lightning talks.