Bash: Shell functions as arbitrary environment variables

bash, the Bourne Again SHell, is an extended version of the POSIX standard (Unix) shell, from the GNU Project. (Both were based on the Bourne Shell originally written by Stephen Bourne at Bell Labs, hence the name.) Because bash is Free Software, has been around for over 20 years, and is the de facto standard shell on Linux systems (at least outside embedded Linux systems, which often use busybox), it is widely used. Quite a few shell scripts even assume that the extended features of bash are available (called "bashisms"), making it harder to just use a replacement shell (eg, dash, Debian's version of the Almquist Shell).

One of the features of the POSIX shell, and bash, is shell functions, which allow giving a name to a block of shell script that can live inside an existing shell script (or shell session), rather than being stored in a separate file that is read in each time it is invoked. Particularly in the early days of Unix, executing something already defined in memory was much more efficient.

bash allows these shell functions to be copied from a parent shell session to a child shell session, so that they are available to any commands executing in a sub-shell (including sub-shells that are invoked to handle some common shell features, like grouping commands together). Which seems like a convenient feature.

How bash handles propagating these shell functions from one bash process to another is both clever and terrifying: if an environment variable is found with the magic characters "() {" at the start of the value, then it will be treated as the definition of a shell function with the name of that environment variable. That is clever because it reuses something that Unix already had (propagation of the environment variables), and terrifying because it means that these function definitions could now come from anywhere that resulted in invoking bash.
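The mechanism can be observed directly. What follows is a minimal sketch, assuming a reasonably current bash is available as `bash` on the PATH; the function name `greet` is purely illustrative. On a bash patched after these bugs the environment entry is typically named something like `BASH_FUNC_greet%%` rather than plain `greet`, but the value still begins with the magic `() {`.

```shell
# Export a function from one bash process and show the environment
# entry that carries it to child processes.
exported_env=$(bash -c '
  greet() { echo "hello from the parent shell"; }
  export -f greet        # mark the function for export to children
  env | grep "greet"     # print the environment line that holds it
')

# A child bash started from that process can call the function by name.
child_output=$(bash -c '
  greet() { echo "hello from the parent shell"; }
  export -f greet
  bash -c greet
')

printf '%s\n' "$exported_env" "$child_output"
```

The first line printed is the environment entry, the second is the output of the function running in the child shell.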

Add to this picture:

In another "clever and terrifying" move, bash reuses its standard command processor to parse the shell function out of the environment variable
bash is often linked to /bin/sh, the standard Unix shell path
The system() C standard library call runs commands by starting /bin/sh to parse the command, and many other languages have one or more library functions with this behaviour (system, popen, ...)
Lots of software explicitly or implicitly uses functions that invoke the shell (/bin/sh) to parse a command line it has assembled, and either explicitly or implicitly passes along environment variables.
In a modern networked system there are many ways to pass values to a remote system that end up in environment variables on that remote system (eg, HTTP's Common Gateway Interface (CGI), ssh, DHCP scripts, VPN connections, etc).

and the result is that the standard command processor in bash ends up parsing environment variables containing untrusted, and potentially malicious, input.

This does not end well. (Risky Business described it as Bashpocalypse 2014, in reflection of the amount of hype generated; pretty impressive hype given that I had assumed most of the IT security hype quota for the year got used up earlier this year over Heartbleed.)

The first bug (CVE-2014-6271) resulted from the bash standard command processor being willing to process multiple commands on the same "line" if they are separated by semicolons (;) -- a built in Unix shell feature for decades -- and being quite happy to do this when parsing environment variables containing shell functions too. Anything that wasn't part of the shell function was executed immediately. Oops.
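The widely circulated one-line probe for this first bug illustrates the problem. It is shown here as a sketch, assuming `bash` is on the PATH; the variable name `x` is arbitrary and the `status` wording is my own. The value of `x` is a function definition followed by an extra command after the closing brace.

```shell
# The value of x is a function definition with a trailing command.
# A vulnerable bash runs the trailing "echo vulnerable" while importing x,
# before it ever reaches the command given with -c.
probe_output=$(env x='() { :;}; echo vulnerable' bash -c 'echo this is a test' 2>/dev/null)

case "$probe_output" in
  *vulnerable*) status="vulnerable to CVE-2014-6271" ;;
  *)            status="not vulnerable to this probe" ;;
esac
echo "$status"
```

A bash with the fix applied should only run the `-c` command, so `probe_output` contains just `this is a test`. Passing this probe says nothing about the later bugs.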

The second bug (CVE-2014-7169) resulted from incomplete handling of escape characters while parsing the shell function definition out of the environment variable, resulting in at least the creation of arbitrary files.
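The commonly quoted probe for this second bug can be sketched as follows, assuming `bash` and `mktemp` are available; the `scratch` and `status` names are illustrative. It runs in a throwaway directory because a vulnerable bash creates a file named `echo` in the current directory.

```shell
# Run in a throwaway directory: a vulnerable bash creates a file here.
scratch=$(mktemp -d)
(
  cd "$scratch" || exit 1
  # The unterminated definition ends in a redirection and a backslash. As
  # commonly explained, a vulnerable parser carries these over into the -c
  # command, so "echo date" is run as "date" with output sent to a file "echo".
  env X='() { (a)=>\' bash -c "echo date" >/dev/null 2>&1 || true
)

if [ -e "$scratch/echo" ]; then
  status="vulnerable to CVE-2014-7169"
else
  status="not vulnerable to this probe"
fi
echo "$status"
rm -rf "$scratch"
```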

The immediate result was six patches to versions of bash spanning at least 10 years. Followed by more patches the next day to work around the second bug found (CVE-2014-7169). And a botnet.

But as far as I can tell the frankly terrifying combination of passing shell functions around via arbitrary environment variables and then parsing them with the standard bash command processor still exists. Just with a bit of a safety fence around it to try to avoid accidental drownings.

A couple of decades ago when I started using Unix it was commonly received wisdom that the shell should not be used in any security critical situations (eg, setuid). It appears, largely due to how ubiquitous the use of /bin/sh as a command processing helper actually is, and the increasingly "network accessible everything", that the lesson needs repeating.

At this point I think everything needs to assume that it is processing untrusted input, even if in the normal case the same software was what generated that input. Perhaps especially if the normal case is that the same software generated that input as is reading it. (At least if different software generates and parses the input, there will be some extra testing and robustness to handle normal misunderstandings; if the same software generates and then parses the input there is a high risk that the implementation will "just know what to generate" and the parser will only expect that -- and thus be much more fragile.)

It also seems like a really good idea to avoid anything that invokes a shell when running with untrusted input. Just saying.

Debian and Ubuntu patches

For reference this is what I found for Debian and Ubuntu patch versions required to pick up both CVE-2014-6271 and CVE-2014-7169 fixes:

Debian 7 (Wheezy): 4.2+dfsg-0.1+deb7u3
Debian 6 LTS (Squeeze): 4.1-3+deb6u2
Ubuntu 10.04 LTS (Lucid): 4.1-2ubuntu3.2
Ubuntu 12.04 LTS (Precise): 4.2-2ubuntu2.3
Ubuntu 14.04 LTS (Trusty): 4.3-7ubuntu1.2

All but the Debian 6 LTS packages were installable pretty much immediately; the Debian LTS ones took a while to propagate around the mirrors. (Probably time to upgrade those last few systems off the old extended-support version of Debian Linux...)

They should probably be considered minimum versions, and I fully expect there will be at least one disruptive gotcha found in this same area.
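Checking an installed package against one of these minimum versions can be sketched like this, assuming a Debian or Ubuntu system with `dpkg` and `dpkg-query` available. The `required` value is only an example (the Debian 7 entry from the list above) and needs to be set to match the release in question.

```shell
# Example only: set this to the minimum version for your release
# (the value below is the Debian 7 entry from the list above).
required="4.2+dfsg-0.1+deb7u3"

if command -v dpkg-query >/dev/null 2>&1 &&
   installed=$(dpkg-query -W -f='${Version}' bash 2>/dev/null) &&
   [ -n "$installed" ]; then
  # dpkg applies Debian version ordering rules for the comparison.
  if dpkg --compare-versions "$installed" ge "$required"; then
    result="bash $installed is at or above $required"
  else
    result="bash $installed is older than $required"
  fi
else
  result="dpkg not available or bash not installed as a package"
fi
echo "$result"
```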

`dash` as `/bin/sh`

It also seems like an extremely good idea to consider using something other than bash for /bin/sh at this point. Debian and Ubuntu adopted dash as a default /bin/sh a few years ago, mostly because bash was sufficiently big that loading it for each shell script during booting was taking too long.

The way to tell which one is in use is:

ls -l /bin/sh

or:

dpkg -S /bin/sh

Interestingly this seems to be maintained outside of Debian's update-alternatives system, presumably due to how integral it is to the system. Instead it is maintained as a dpkg configuration option on the dash package (the equivalent Ubuntu page has a bunch of detail on dealing with scripts broken by "bashisms").
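For checking several machines, the same test can be scripted. This is a small sketch, assuming a `readlink` that supports `-f` (as GNU coreutils does); the variable names are illustrative.

```shell
# Resolve /bin/sh through any symlinks and report which shell provides it.
sh_target=$(readlink -f /bin/sh 2>/dev/null || echo /bin/sh)

case "$(basename "$sh_target")" in
  dash) sh_report="/bin/sh is dash" ;;
  bash) sh_report="/bin/sh is bash" ;;
  *)    sh_report="/bin/sh resolves to $sh_target" ;;
esac
echo "$sh_report"
```

On Debian and Ubuntu, running `dpkg-reconfigure dash` as root is the usual way to change that configuration option.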

(Based on a quick check, all but one of my Debian and Ubuntu systems have dash as the /bin/sh provider, greatly mitigating the risk of this particular bug. The one that does not was originally installed a long time ago, before Debian made this change, and apparently I never reconfigured it to use dash. Fortunately that one is also hidden away on an internal network, and has already had bash patched.)

OS X

OS X 10.9 (Mavericks) uses bash 3.2.51 as its /bin/sh, so is vulnerable to this problem; OS X 10.6 (Snow Leopard) uses bash 3.2.48 as its /bin/sh which is also vulnerable. bash 3.2 was originally released in 2006; there have been three bash 4.x feature releases since then, starting with bash 4.0 released in early 2009 (ie pre OS X 10.6). Internet rumours are that Apple chose not to upgrade due to the license change from GPLv2 to GPLv3 (ie, effectively maintaining their own fork).

Presumably eventually Apple will patch this bug (Apple security updates do not yet seem to address the problem; although the 10.9.5 update from 2014-09-17 does address lots of other CVEs). With OS X pretty much only used as a client these days, the risk is somewhat lower: DHCP off untrusted Wifi seems to be the major risk vector for non-servers. Those sufficiently paranoid and tech savvy can recompile bash on OS X with the patches or build a recent bash and use it in place of /bin/sh.

Updates

ETA, 2014-09-28: a good writeup on the bash code quality from Errata Security (found via cks's blog post about issues just bumping version numbers, who found the post via twitter). By Robert Graham who also posted the first remote scan for triggering this bug via HTTP; he also has other posts about shellshock.

ETA, 2014-09-29: There are at least CVE-2014-7186 and CVE-2014-7187 as additional bugs/fixes for bash (patches released by Ubuntu Linux 2014-09-26), as well as CVE-2014-6277 for which more technical details are due out later this week... but distros are encouraged to patch. It appears Ubuntu has included something like that unofficial patch (from their changelog):

SECURITY IMPROVEMENT: use prefixes and suffixes for function exports

but without a CVE ID it is hard to be sure; and Debian seems to have also included similar patches, in a release late last week (4.2+dfsg-0.1+deb7u3) but their changelog does not mention any CVE IDs. (Also appears RHEL/CentOS include similar fixes, and indicate the change is intended to prevent environment variables being confused with function exports.)

It looks like bash will be a rich source of bugs for some time to come.

ETA, 2014-10-02: As promised details on the additional bash bug have been released (CVE-2014-6277 and CVE-2014-6278; see also full disclosure mailing list post, which is cached at LWN). The "security improvement" that Debian and Ubuntu included last week apparently mitigates these problems (by, AFAICT, not allowing them to be attacked from arbitrary points -- as a result of parsing shell functions only out of variables in a distinguished namespace, something that I think should have been done from the start; it appears to have been accepted upstream too). Both bugs seem to be the result of automated fuzzing (with american fuzzy lop and tmin).
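The effect of the distinguished namespace can be seen with a short sketch. It assumes an upstream-style patched bash on the PATH, where exported functions travel in variables named `BASH_FUNC_<name>%%`; other vendors' patches may use a different naming scheme, and the name `x` is arbitrary.

```shell
# A plain variable whose value looks like a function definition is no longer
# imported as a function by a bash that uses the distinguished namespace.
plain=$(env x='() { echo imported; }' bash -c 'type -t x' 2>/dev/null || true)

# Upstream bash looks for names of the form BASH_FUNC_<name>%% instead.
# (Assumption: other vendors' patches may use a different naming scheme.)
namespaced=$(env 'BASH_FUNC_x%%=() { echo imported; }' bash -c 'x' 2>/dev/null || true)

echo "plain variable: ${plain:-not a function}"
echo "namespaced variable: ${namespaced:-not imported}"
```

The point of the change is that values arriving in attacker-influenced variables (for example CGI's `HTTP_*` variables) never reach the function parser, because the attacker does not normally control the variable name.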

I am pleased to see others agree it is best not to expose the main bash command parser to untrusted remote input and pushed strongly for that approach -- it seems to have been accepted by many Linux distros and upstream now. (NetBSD went further and disabled importing shell functions by default, rather than "expose bash's parser to the internet and then debug it live while being attacked."; Ian Jackson also experimented with doing this for his Debian systems.)

Michal Zalewski explains why the 'many eyes' approach left this undiscovered for 20 years: security people did not expect bash to parse its environment variables, so did not go looking for it. As Hanno Böck points out this is a "class of problem" in itself: tools maintained by volunteers that get little attention from anyone else, but are "running the Internet" (bash, OpenSSL, etc). The Linux Foundation's Core Infrastructure Initiative was started -- after Heartbleed -- to provide a process to start addressing some of these.

See also LWN writeup "Bash gets Shellshocked".

Finally, there is an OS X Bash Update addressing the first two issues, and adding exported function namespacing (apparently slightly different from others' choices), which will hopefully address most of the practical issues on OS X. It is not obviously available through the standard OS X Updates, at least not yet, and seems to be just a manual download.

ETA, 2014-10-09: David A. Wheeler wrote a great paper on the shellshock vulnerability, including a useful timeline and survey of responses. (Thanks to LWN for linking to it.)

ETA, 2014-10-19: It appears Apple shipped the bash fix as part of Apple Security Update 2014-005, on 2014-10-16. That update also includes disabling CBC suites in SSLv3 as a security fix for POODLE (apparently leaving RC4, which itself has security issues).
