Following on from the Specialist Tracks Day at PyConAU 2018 were two full days of "main conference" programme. The first day had two invited speakers (morning and afternoon), as well as four rooms of 30-minute talk sessions and lightning talks.
Keynote: Annie Parker
Annie Parker is the founder of Techfugees Australia, a tech community response to the refugee crisis in various countries. They have run several meetup and Hackathon events around Australia, bringing together enthusiastic tech people, refugees and organisations working with recent arrivals to the country.
From these Hackathons have come several refugee focused startups, and other tech projects, including a job matchup site (many refugees are highly skilled, but their qualifications are not always recognised), and a website to help refugees find out about resources available to help them.
Annie brought along Shaqeaq, a refugee from Afghanistan/Iran, who came along to one of the Sydney Hackathons to talk about the issues refugees face, got enthusiastic about solving the problem and stayed to hack, and then ended up founding her own refugee-focused startup -- while she is still in high school.
I would highly recommend the video of this keynote if you want to be inspired and have not seen the talk already.
Describing Descriptors
The Python Descriptor Protocol is a way of binding behaviour to (class) attributes, so that you can override getters and setters. This facility exists in a lot of programming languages. Python has a few facilities for implementing overrides, including implementing the __setattr__(self, attribute, value) method on a class, and the @property decorator, but the Descriptor Protocol is the most flexible approach.
The core of the Descriptor Protocol is four methods:
__get__(self, instance, owner)
__set__(self, instance, value)
__delete__(self, instance)
__set_name__(self, owner, name)
-- since 3.6, called when the property is created on the class (can be used, eg, to capture the name of the attribute to use later in error messages)
These methods override the behaviour of retrieval, storage, and deletion of the attribute.
There are two major categories of Descriptors:
Data Descriptors: implement __get__() and __set__()
Non-Data Descriptors: implement __get__() but not __set__()
Typically non-Data Descriptors would allow data retrieval, as if it was a property, with the results being calculated or retrieved from somewhere else. They can be used for @staticmethod type overrides, etc.
Data Descriptors take precedence over the instance's own dictionary lookups; but non-Data Descriptors will only be used if there is no entry with the attribute's name in the instance dictionary.
The Python WeakKeyDictionary can be useful as a storage location for the per-object data behind the attribute, keyed by the object that owns it. The WeakKeyDictionary itself will not keep the object alive, so when the object with the attribute goes away, its entry in the WeakKeyDictionary can also go away.
The talk gives some examples for how these can be used together.
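As a rough illustration of how these pieces fit together (my own minimal sketch, not an example from the talk), here is a Data Descriptor that validates values, uses __set_name__ to report the attribute name in errors, and stores its data in a WeakKeyDictionary. The Positive and Reading names are made up for the example.

```python
from weakref import WeakKeyDictionary


class Positive:
    """Data descriptor (hypothetical example): only accepts values above zero."""

    def __init__(self):
        # Per-instance values, keyed weakly by the owning object so that
        # entries disappear when that object is garbage collected.
        self._values = WeakKeyDictionary()

    def __set_name__(self, owner, name):
        # Called once when the attribute is created on the class (3.6+).
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        try:
            return self._values[instance]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive, got {value!r}")
        self._values[instance] = value

    def __delete__(self, instance):
        del self._values[instance]


class Reading:
    volts = Positive()
    amps = Positive()

    def __init__(self, volts, amps):
        self.volts = volts  # goes through Positive.__set__
        self.amps = amps


reading = Reading(230, 2)
# A data descriptor wins even if the instance dictionary has the same key.
reading.__dict__["volts"] = 5
print(reading.volts)
```

Because Positive defines __set__, it is a Data Descriptor, so the lookup of reading.volts goes to the descriptor rather than to the value planted in the instance dictionary.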
End to End Energy Monitoring in Python
Tisham Dhar (@whatnick) described an energy metering platform using ESP8266/ESP32 (16MB of flash or more) and MicroPython to capture energy usage values and stream them out to a data collection platform for graphing (with Graphite). It is based around the Microchip ATM90E26 measurement chip, with the CrowdSupply campaign for his monitoring kit launched during PyConAU 2017 and discussed in the talk by Joel Stanley presented at last year's PyCon AU (talk video). (Tisham originally started on Arduino, but outgrew the platform fairly quickly, which motivated the move to MicroPython -- helped by work by Joel Stanley getting MicroPython going on the board.)
To make the results available, the PicoWeb web server is run on the microcontroller (ESP32 best, due to RAM requirements), and vue.js is used for the display. The data is also streamed out over raw TCP/IP sockets for management.
This solution can be used both for measuring consumption (eg, per device or per outlet), and for measuring energy generation (eg, solar power). It can monitor voltage, current, power usage, and power factor (basically the phase shift between current and voltage sine waves).
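To make the power factor idea concrete, here is a small worked sketch of my own (not from the talk, and not how the metering chip computes it). It assumes clean sinusoidal waveforms, where the power factor is the cosine of the phase angle between voltage and current; the input values are made-up illustrations.

```python
import math


def power_summary(v_rms, i_rms, phase_deg):
    """Return (real power in W, apparent power in VA, power factor).

    Assumes purely sinusoidal voltage and current, phase_deg apart.
    """
    apparent = v_rms * i_rms                         # volt-amps
    power_factor = math.cos(math.radians(phase_deg))
    real = apparent * power_factor                   # watts actually consumed
    return real, apparent, power_factor


# Illustrative values only: 230 V, 2 A, current lagging by 60 degrees.
real, apparent, pf = power_summary(230, 2, 60)
print(f"real={real:.1f} W apparent={apparent:.1f} VA pf={pf:.2f}")
```

With no phase shift the power factor is 1 and real power equals apparent power; as the phase shift grows, less of the apparent power does useful work.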
The slides have lots of interesting detail, as well as some discussion of NILM -- Non Invasive Load Monitoring and some other power usage meters.
The Case of the Mysteriously High System Load
"Debugging is like being the detective in a crime movie where you are also the murderer." -- Filipe Fortes
Our case is set amongst the Geri embryo development machine, and its companion Geri Connect, which allows remote monitoring of the embryos. The machine takes time lapse photos, every 30 minutes, at 11 focus depths, of 6 chambers with up to 16 embryos from the same patient in each. These time lapse photos are then turned into short "development movies" for trained staff to review to check on the development progress, and identify the most viable embryo.
They started to run into performance load issues, and started experiencing load averages of 9 to 30, on a 6 core system -- resulting in it getting further behind. Initially they had no metrics being recorded, and resorted to "hand parsing" the logs to get some metrics on how long steps were taking. Then they started adding observability by using a Python statsd library, from Etsy, to send out statistics from their application at key points, which gave them more visibility. They also used pg-statsd to get statistics out of PostgreSQL.
Sending metrics from Python can be as simple as:
from statsd import StatsClient
...
sclient = StatsClient(HOST, PORT) # optionally prefix=...
sclient.timing('tag', TIME) # submit timing result, eg ms
sclient.incr('tag') # increment counter
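For a feel of what those calls put on the wire, here is a standard-library-only sketch of my own (not from the talk). It assumes the usual statsd plain-text format of name:value|type sent as a UDP datagram, with 8125 as the conventional default port; the function names are hypothetical.

```python
import socket


def timing_line(name, milliseconds):
    """Format a statsd timing metric, eg 'video.render:320|ms'."""
    return f"{name}:{milliseconds}|ms"


def counter_line(name, count=1):
    """Format a statsd counter increment, eg 'video.done:1|c'."""
    return f"{name}:{count}|c"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget: send one metric line over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


print(timing_line("video.render", 320))
print(counter_line("video.done"))
```

Because it is UDP, sending a metric is cheap and does not block the application if the collector is down, which is a large part of why it is easy to sprinkle through existing code.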
As they investigated, their metrics showed two main issues:
database updates were the slowest part -- because they were making a lot of attempts at updating something which actually could only be updated 1/176 times (once all videos of all focus stacks of all slots in all chambers had been processed). So they moved that to a background thread to happen periodically, rather than every single time a video was made.
Celery did not respond well to oversubscribing the CPU cores -- they got much better results by slightly undersubscribing the CPU cores (n-1), leading to a huge decrease in system load.
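The "n-1" undersubscription could be sketched like this (my own illustration, not their actual configuration). It builds a Celery worker command line using the --concurrency option; the application name "proj" is a placeholder.

```python
import os


def worker_count(cores=None):
    """Leave one core free for everything else, but never go below 1."""
    if cores is None:
        cores = os.cpu_count() or 1
    return max(1, cores - 1)


def celery_worker_command(app="proj", cores=None):
    """Build the argument list for starting a Celery worker."""
    return ["celery", "-A", app, "worker",
            f"--concurrency={worker_count(cores)}"]


# On the 6 core system from the talk this would give 5 worker processes.
print(" ".join(celery_worker_command(cores=6)))
```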
The presentation was very entertaining, and viewing the video is highly recommended.
Creepy, frivolous and beautiful art made with machines
J Rosenbaum presented a fabulous talk on the fantastically weird generated art coming out of machine learning and other algorithms, particularly when someone feeds them a very selective world view. You need to see their slides or view their presentation to really appreciate it -- it was an awesome presentation. (A content warning note -- there is some artistic nudity in the slides so you may not want to view this at work; and some of the machine learning generated art is very weird, and sometimes rather creepy.)
J's website has more information on their process, and more examples (although note that too contains artistic nudity including on the front page). Their work at the intersection of art, machine learning, and gender seems well worth following.
Context Managers: You can write your own
Daniel Porteous described how you can write your own Python Context Manager, and how to save yourself some effort by using the tools in contextlib.
Context managers are invoked by the Python "with" statement, and can be implemented by hand with a class that has the methods:
__enter__(self)
__exit__(self, type, value, traceback)
-- the extra arguments are for exception handling, to give the context manager a chance to alter what happens when an exception is thrown within the context managed block.
The class can also optionally have an __init__ method, but Daniel cautioned against doing anything "expensive" (time/space intensive) in the __init__ method, as it may be done and never used.
contextlib makes this process simpler, with various decorators, including @contextmanager, which allows implementing your context manager as a simple function which has a yield at the point that caller code should run. It also has other helper methods that make implementation simpler.
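To show the two approaches side by side, here is a minimal sketch of my own (not taken from Daniel's slides): a timer written first as a class with __enter__ and __exit__, and then as a @contextmanager function. The Timer and timer names are just for illustration.

```python
import time
from contextlib import contextmanager


class Timer:
    """Hand-written context manager: records elapsed seconds."""

    def __enter__(self):
        self.start = time.perf_counter()
        return self  # becomes the target of "with ... as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False  # falsy: any exception keeps propagating


@contextmanager
def timer(results, label):
    """The same idea as a generator: code before yield is the 'enter' step."""
    start = time.perf_counter()
    try:
        yield  # the body of the with block runs here
    finally:
        # Runs on normal exit and when the block raises.
        results[label] = time.perf_counter() - start


with Timer() as t:
    sum(range(1000))
print(f"class version took {t.elapsed:.6f}s")

results = {}
with timer(results, "generator version"):
    sum(range(1000))
print(results)
```

The try/finally around the yield matters: without it, an exception inside the with block would skip the clean-up code after the yield.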
Refactoring Code with the Standard Library
John Reese described how Facebook have managed to repurpose lib2to3, originally intended to help port Python 2 code to Python 3, to do other automatic source refactoring (eg, adding or removing function arguments, inserting calls to UI translation routines, etc) in a safe manner.
Because this refactoring happens in a parsed source tree, rather than just wildly matching strings with a regex, it has more context to work with and is less at risk of changing the wrong thing or making changes in unexpected ways. And because lib2to3 has been maintained with all the language updates in Python 2 and Python 3, it does a much better job of parsing Python than trying to write a separate parser yourself.
Overall it looks like a good solution if you need to make changes to a large Python code base in a safe manner.
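To illustrate why working on a parsed tree is safer than a regex, here is a small sketch of my own. Note that it uses the standard library ast module rather than lib2to3 (so it is not the approach from the talk, and unlike lib2to3 it discards comments and original formatting), and it needs Python 3.9+ for ast.unparse. The function names being renamed are made up.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rename calls to one plain function name, leaving everything else alone."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        self.generic_visit(node)  # handle nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            node.func.id = self.new
        return node


def rename_calls(source, old, new):
    tree = RenameCall(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


SOURCE = 'print(fetch(1), "fetch(2) stays a string")\n'
print(rename_calls(SOURCE, "fetch", "fetch_v2"))
```

A naive search-and-replace on "fetch(" would also have rewritten the text inside the string literal; the tree-based version only touches real call nodes.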
Keynote: Tom Eastman
Tom Eastman, a PyCon AU (and PyCon NZ) regular, was invited to give the Saturday afternoon Keynote -- and took the role very seriously, including wearing a suit jacket to present, and delivering a well rehearsed presentation.
Tom's theme was learning -- really learning, rather than just "learning light" where you put yourself in an environment where maybe learning could happen but do not actually do the work.
Like most of us who were gifted as children, Tom found himself assuming that things were either easy, or you could not do them. That "effort is what you need when you're not talented". It was only much later that he came across Carol Dweck and her descriptions of the "fixed mindset", and recognised those beliefs in himself.
The contrast to a "fixed mindset" is a "growth mindset" -- the belief that you can change with effort. The growth mindset allows for the opportunity to improve -- and expects there to be effort involved so does not give up when it is "hard", has a thirst for the challenge, and sees failures as minor setbacks. The right mindset can make a big difference in what you do.
Tom illustrated the difference by talking about the difference between his mindset in areas where he perceived himself to be experienced (eg, Python programming), and those where he perceived himself to be a beginner (eg, Information Security). In the areas he believed himself to be experienced, and he perceived others as thinking he was experienced, he felt his reputation was at stake -- and he always wanted to be his best. So it was either easy or "he did not have talent"; a "fixed mindset". In areas where he believed himself to be a beginner he was quite happy "not knowing" and making mistakes, messing up, because he was "just a beginner" (and if you listen to any of Tom's many talks about information security over the last 5 years or so you'll hear him start by saying he is not an expert... giving himself permission not to be perfect).
Real learning happens best with "effortful retrieval", particularly using your own words (eg, teaching others, or writing your own flash cards). Otherwise the learning is "like a backup system you've never run a test restore on" (ie, no idea if there's anything "stored"). He recommended a Pomodoro App for focused time management (25 minute chunks, with 5 minutes break). And giving yourself permission to be a beginner -- to make mistakes, to not know it all immediately. Find a place to be "strategically dumb". Accept that it might be difficult at first. "If you're feeling comfortable maybe you're not actually learning."
In the course of his keynote, Tom also happened to mention cram learning Ruby, for a project, and coming across a weird operator... which he pointed out was very weird. That small side note, a lightning talk by someone Tom asked about the operator, and a subtle challenge on Twitter ended up creating a meme for PyConAU 2018 -- a Flip Floperator, about which there is more on day three of the conference.
Lightning Talks
Bex Dunn, Landscape Scientist at Digital Earth Australia
They have open satellite data, going back several years, of Australia, which gives them a compilation in space over time. By rendering that into an animated image, they can have a satellite image that changes over time as water flows into, eg, the Murray Darling basin and back out again.
The code is apparently somewhere on the GeoScience Australia GitHub page.
Tim Ansell, FuPy
Tim is not actually a robot -- he's actually all human. A constant bug generator. So he likes software. You can fix bugs in software after you write them. Whereas hardware is hard to patch. FPGAs make hardware into software.
This was all a pitch to get people to come hack on FuPy at the Sprints, and get hardware.
Merrin MacLeod, The Flip Floperator
Apparently Tom Eastman challenged her to talk about the Ruby "Flip Flop" operator because she is a Ruby programmer, so this followed on neatly from Tom's mention in passing in his keynote (immediately before the lightning talks). The ".." operator in Ruby (and Perl) flips on when the first expression is true, then stays on until the second expression is true, acting like a generalised range expression.
It turns out to be rarely used in Ruby, and is probably being removed (or at least deprecated) around Ruby 2.6.
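For Python people wondering what that behaviour looks like, here is my own rough emulation as a generator (not from the lightning talk). It follows my understanding of the two-dot form, where the end condition is also checked on the same item that switches it on; the flip_flop name is made up.

```python
def flip_flop(items, start, stop):
    """Yield items from the first one matching start() through to the
    next one matching stop(), inclusive, then wait for start() again."""
    active = False
    for item in items:
        if not active:
            if start(item):
                active = True
                # Two-dot style: the stop test also applies to this item.
                if stop(item):
                    active = False
                yield item
        else:
            if stop(item):
                active = False
            yield item


selected = list(flip_flop(range(1, 11),
                          start=lambda n: n == 3,
                          stop=lambda n: n == 6))
print(selected)  # [3, 4, 5, 6]
```

The interesting (and weird) part is the hidden state: whether an item is selected depends on what was seen earlier, not just on the item itself.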
Adam Jacquier-Parr, DRF Model Pusher
DRF Model Pusher is a REST framework to send real time updates out using Pusher. It is implemented for Django (as a Model), and allows sending up to 10kB messages through public and private channels (optionally with a background worker).
The package is on GitHub.
Claire Krause, GeoScience Australia
She works with satellite imagery, studying agriculture from space. One of the things they can do is monitor when storage dams are being filled (with water) and emptied. This allows them to detect whether the water use rules are being obeyed. Previously this was hand digitised information, but now they have an algorithm which can detect water and identify dams. (It creates a raster data set, which they then automatically trace to build a vector dataset of geospatial coordinates.)
They are in the process of moving to an every-5-days dataset, which will let them better watch what happens in near real time.
This appears to be part of Digital Earth Australia, who have an Open Data Cube, including an example of doing water detection.
Felicity Robson, student, Captcha Cracker
Felicity talked about attempting to crack CAPTCHAs, starting from the example of recognising Phoenix, her cat. In particular she was targeting the Google CAPTCHA system -- which currently asks about vehicle, storefront and street sign recognition. The aim was to use SIFT to recognise the images -- particularly street signs, which are relatively standardised. However reCAPTCHA (Google's CAPTCHA system) kept locking them out... for acting too robotically. (In later discussion with a Google engineer they pointed out that image recognition is one of the things that Google uses to identify a robot, others being patterns of behaviour, like trying to do things over and over again quickly... so there'll always be an arms race between CAPTCHAs and those who try to automatically break them.)
Nick Moore, Rocket Surgery
As a project for BuzzConf (held in Ballan, Vic), Nick and some others created a small MicroPython based telemetry system, based on ESP32 chips, that could be used with water powered rockets to track their launch system. They used MQTT to stream the data back for real time display.
Benno Rice, Blockchain
"Waste electricity faster than anyone else". A repeat of a lightning talk at another conference, I believe.
Peter Lovett, PEP505
Peter Lovett is not a fan of PEP505, a proposal to add None-aware operators to Python in the form of "question assignments". While the PEP is from 2015, it appears to have gotten more discussion recently -- but still relatively few fans.
Philip James, Gratitude
Philip offered thanks to Guido van Rossum, the recently "retired" creator and BDFL of Python. And all the other Python core team and package maintainers.
As part of this he built the thanks package (on GitHub), which uses PyPI data to help find ways to fund Python development and make those projects sustainable.
ETA, 2018-08-29: Added Tom Eastman Keynote and lightning talks.