At SDNCon this past week, I recreated my Kiwi Pycon Ryu Example by combining Docker and Open vSwitch. The process was fairly simple, just requiring some custom Dockerfiles and a bit of network scripting, but one element struck me as a kludge: the need to delete and recreate the Open vSwitch bridge before starting the example containers, in order to predict which switch port numbers they were on. So I wanted to find a way to determine which switch port a new Docker container instance had been connected to.

A Docker container that is given a network interface will, by default, end up with an interface with a randomly allocated MAC address, and a randomly allocated IP address. The scripts used by my Docker and Open vSwitch example also used a randomly allocated MAC address (assigned by the veth network interface), but statically assigned IP addresses to match the ones in the original example. As a result there may or may not be any network information that is predictable in advance and that, eg, an SDN controller like Ryu can look for to identify a specific container connection.

By inspection it appears that a new connection to an Open vSwitch bridge will currently be allocated the next sequentially available OpenFlow port number for that bridge, starting with the first OpenFlow port number, 1. This is the kludge that my previous post relied on to have predictable port numbers. But firstly that is almost certainly an internal implementation detail that could change at any time. And secondly, if a container is stopped and started again then new network interfaces will be allocated for the new container instance, which will get new port numbers. So guessing the OpenFlow port number for a given container depends not only on the order in which the containers are started, and the assumption of sequential port numbers, but also on knowing how often the containers have been restarted since the Open vSwitch bridge was created. That is hardly a recipe for reliable prediction (my previous example worked around this by deleting the Open vSwitch bridge and recreating it immediately before starting the example containers -- to reset the port counter).

It is also worth noting that when the container is started then stopped, the container end of the veth link disappears, which takes the host end of the link down, but it appears the Open vSwitch database retains the other end of the veth link, even though Linux has removed the interface. It is possible that if that stale interface were manually removed from the Open vSwitch database, Open vSwitch would reuse that old port number at some later point -- which would make the OpenFlow port numbers even less predictable than the apparent current implementation of "sequentially incrementing".

My aim was to find a way to start with a Docker container identifier and end up with the OpenFlow port number in use by the connection of that Docker container to a specific Open vSwitch bridge, so that the port number could then be used by the OpenFlow controller managing that Open vSwitch to address the container's connection. After some investigation, and asking an Open vSwitch developer, I found there is a way to map from a running Docker container's network interface through to an OpenFlow port number on the Open vSwitch bridge -- but it also relies on Open vSwitch internal implementation detail. I wanted to document this path as, even with the reliance on internal implementation detail, it seems more reliable than "try to start things in a predictable order and guess the OpenFlow port numbers."

Given that we use veth links to connect the Docker container to Open vSwitch bridge, the overall approach is:

  1. Translate the Docker container identifier to a Linux network namespace

  2. Use that network namespace to get the ethernet interfaces in that Linux network namespace (and hence in that Docker container)

  3. From the ethernet interface in the network namespace (container) get the "peer" veth SNMP ifIndex of the network interface that is outside the container (ie, the external end of the veth link that actually got connected to the Open vSwitch bridge).

  4. Scan the network interfaces in the host machine to find the name that matches that SNMP ifIndex

  5. Ask Open vSwitch's internal tool (ovs-appctl) to tell us which OpenFlow port that host interface is connected to.

I have written this up into a script which automates these steps, but I wanted to detail them below for ease of reference. (See the end of this post for script usage information.)

Because of the use of an internal tool, this approach could break at any time -- but AFAICT at present Open vSwitch provides no non-internal means to perform this useful mapping from a container interface to an OpenFlow port. It seems to be assumed that the OpenFlow controller will identify connected devices via something other than the port number, but if the MAC address and IP address are also randomly allocated then there is no obvious network-visible attribute to use to locate the container connection.

Docker container to Linux network namespace

Suppose we have a container running with the name firewall_ext; the process then looks like:

GUEST_NAME="firewall_ext"

# Find the Linux container device mountpoint 
CGROUP_MOUNT=$(grep -w devices /proc/mounts | awk '{ print $2; }')

# Translate the Docker container name into a Docker Container ID
CONTAINER_SHORTID=$(docker ps -a | 
                    awk 'substr($0,139) ~ '"/${GUEST_NAME}"'/ { print $1;}')
# eg, ab4e7d1d591a

# Find that container ID in the devices mountpoint
CONTAINER=$(find "${CGROUP_MOUNT}" -name "${CONTAINER_SHORTID}*")
#
# eg, /sys/fs/cgroup/devices/docker/ab4e7d1d591a16fe5f87702a61a15555accb02fb624f2eb84ff027741529454d

# Turn that container ID mount location into a Network namespace ID
NETNS=$(head -n 1 "${CONTAINER}/tasks")
#
# eg, 11082

(this process based on the process used in, eg, pipework and ovswork.sh.)
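
The name lookup above matches on a fixed column offset in the docker ps table, so it depends on the exact width of that table. As a rough sketch only, assuming a Docker release whose docker ps understands --format (an assumption; that option is newer than the Docker used for this post), the lookup could work on two explicit fields instead. The helper name container_id_for_name is hypothetical and not part of pipework or ovswork.sh:

```shell
# Hypothetical helper: read "ID NAMES" lines on stdin and print the ID of
# the first container whose name list contains exactly the requested name.
container_id_for_name() {
  awk -v want="$1" '{
    n = split($2, names, ",")        # a container may list several names
    for (i = 1; i <= n; i++)
      if (names[i] == want) { print $1; exit }
  }'
}

# Assumed usage (requires a docker that supports --format):
#   docker ps -a --format '{{.ID}} {{.Names}}' |
#     container_id_for_name firewall_ext
```

Because the comparison is an exact string match, a container called firewall would not be confused with firewall_ext.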

Getting ethernet interfaces in Linux network namespace

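The ip netns exec command allows running commands, such as ip, as if they were inside the network namespace. (See, eg, Scott Lowe's post about this feature.) Note that ip netns exec looks the namespace up by name under /var/run/netns, so this relies on an entry named after ${NETNS} existing there, as set up by tools like pipework. Given this extremely helpful command, it is pretty simple to make a list of all the ethernet network interfaces in the container: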

CONTAINER_ETH=$(sudo ip netns exec "${NETNS}" ip link show | 
                grep -B 1 link/ether | grep '^[0-9]' | 
                cut -f 2 -d : | sed 's/ //g;')
# eg, eth0

taken from full output like:

ewen@docker:~$ sudo ip netns exec 11082 ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
27: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 9a:69:12:bc:3d:1d brd ff:ff:ff:ff:ff:ff
ewen@docker:~$ 

Find the SNMP ifIndex of the Host end of the veth link

For each of those ethernet interfaces we found (taken one at a time as GUEST_IF below), we can use ethtool -S to find the other end of the veth link:

HOST_IF_ID=$(sudo ip netns exec "${NETNS}" ethtool -S "${GUEST_IF}" |
             awk '/peer_ifindex:/ { print $2; }')
# eg, 28

taken from full output like:

ewen@docker:~$ sudo ip netns exec 11082 ethtool -S eth0
NIC statistics:
     peer_ifindex: 28
ewen@docker:~$ 

Find the interface names from the host SNMP ifIndex values

Conveniently ip link show shows the SNMP ifIndex values, so we can simply scan that output for the ifIndex value we want:

HOST_IF=$(ip link show | awk "/^${HOST_IF_ID}:/"' { print $2; }' | 
          cut -f 1 -d :)
# eg, vethp11082eth0

taken from output like (trimmed for width and length):

ewen@docker:~$ ip link show | tail -8 | cut -c 1-62
24: vethp10565eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether c6:03:3b:74:20:e5 brd ff:ff:ff:ff:ff:ff
26: vethp10565eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether 16:a7:2e:27:67:e5 brd ff:ff:ff:ff:ff:ff
28: vethp11082eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether ca:f8:49:a3:a5:f9 brd ff:ff:ff:ff:ff:ff
30: vethp11240eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    link/ether 56:fe:1b:09:d4:80 brd ff:ff:ff:ff:ff:ff
ewen@docker:~$ 
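
As an alternative to scanning the ip link show text, Linux also exposes each interface's index in /sys/class/net/<name>/ifindex. The following is a minimal sketch of the same lookup using those files; the helper name ifname_for_ifindex and its optional second argument (a directory to search instead of /sys/class/net, mainly useful for trying it out) are hypothetical:

```shell
# Hypothetical helper: print the name of the interface whose ifindex file
# contains the wanted index; returns non-zero if nothing matches.
#   $1 = ifindex to look for (eg, 28)
#   $2 = directory to search (defaults to /sys/class/net)
ifname_for_ifindex() {
  want="$1"
  sysfs_net="${2:-/sys/class/net}"
  for dir in "${sysfs_net}"/*; do
    [ -r "${dir}/ifindex" ] || continue
    if [ "$(cat "${dir}/ifindex")" = "${want}" ]; then
      basename "${dir}"
      return 0
    fi
  done
  return 1
}

# Assumed usage on the host:
#   HOST_IF=$(ifname_for_ifindex "${HOST_IF_ID}")
```

This must be run on the host (outside the container's network namespace), because ifIndex values are only meaningful within the namespace that owns the interface.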

Finding where the veth host end connects to Open vSwitch

(ETA, 2014-10-15: Also see update at end of this post for other, possibly more maintainable, ways to fetch this information.)

Open vSwitch maintains two port numbers for a given interface connected to a given Open vSwitch:

  1. A Linux port number, which is global to all Open vSwitch managed interfaces on the host

  2. An OpenFlow port number, which is local to a specific Open vSwitch bridge

Multiple Open vSwitch bridges can have OpenFlow port number 1, which relate to different Linux interfaces -- but those will have different Linux port numbers (for the OpenFlow/Linux interface).

The Open vSwitch ovs-dpctl tool is able to show the global (Linux) port numbers that Open vSwitch is tracking for given interfaces:

ewen@docker:~$ ovs-dpctl show
system@ovs-system:
        lookups: hit:61 missed:67 lost:0
        flows: 0
        port 0: ovs-system (internal)
        port 1: kiwipycon (internal)
        port 2: vandervecken (internal)
        port 3: vlanswitch (internal)
        port 4: vethp10565eth0
        port 5: vethp10565eth1
        port 6: vethp11082eth0
        port 7: vethp11240eth0
ewen@docker:~$ 

(Note that most of the ovs-dpctl show output is inexplicably indented by one tab -- and it really is a tab, not spaces -- for no particularly obvious reason; I've translated the tabs to spaces in this blog post to ensure the visual alignment remains. But if parsing this output, beware that it is literally a tab character. Unfortunately many of the Open vSwitch tools have this "non-trivial to parse" output format.)
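
One way to sidestep the tab-versus-spaces issue is to let awk split each line on any run of whitespace, rather than matching the indent literally. A minimal sketch, assuming the "port N: name" layout shown above; the helper name dpport_for_ifname is hypothetical:

```shell
# Hypothetical helper: read "ovs-dpctl show" output on stdin and print the
# Linux (datapath) port number of the named interface.
dpport_for_ifname() {
  awk -v ifname="$1" '
    # Lines look like "<indent>port 6: vethp11082eth0"; awk field splitting
    # ignores whether the indent is a tab or spaces.
    $1 == "port" && $3 == ifname {
      sub(/:$/, "", $2)              # strip the trailing colon from "6:"
      print $2
      exit
    }'
}

# Assumed usage:
#   ovs-dpctl show | dpport_for_ifname vethp11082eth0
```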

According to an Open vSwitch developer, the only way to get the OpenFlow port number of a specific Linux (veth host) interface on a specific Open vSwitch bridge is to use an Open vSwitch internal tool, ovs-appctl, to run dpif/show. That command is not documented in the ovs-appctl manpage, but is listed in the ovs-vswitchd manpage; unfortunately the ovs-appctl version does not take arguments, at least in the version in Ubuntu Linux 14.04 LTS (ie, Open vSwitch 2.0.2).

This command outputs the internal Open vSwitch information which maps each interface on an Open vSwitch bridge through to both the OpenFlow port number (which we want) and the Linux port number (shown above):

ewen@docker:~$ sudo ovs-appctl dpif/show
system@ovs-system: hit:0 missed:54
        flows: cur: 0, avg: 0, max: 14, life span: 0ms
        hourly avg: add rate: 0.381/min, del rate: 0.381/min
        overall avg: add rate: 1.000/min, del rate: 1.000/min
        kiwipycon: hit:0 missed:7
                kiwipycon 65534/1: (internal)
        vandervecken: hit:0 missed:7
                vandervecken 65534/2: (internal)
        vlanswitch: hit:0 missed:40
                vethp10565eth0 1/4: (system)
                vethp10565eth1 2/5: (system)
                vethp11082eth0 3/6: (system)
                vethp11240eth0 4/7: (system)
                vlanswitch 65534/3: (internal)
ewen@docker:~$ 

(as above, note the indenting by the tool is a tab character, not spaces but I have converted it to spaces to preserve the visual indent in this blog post).

In this output, for each Open vSwitch bridge, there is the Linux interface name (the host end of the veth link in our case), then a pair of "OpenFlowPort/LinuxPort" -- ie, the first number maps to the OpenFlow port number that we want, and the second number maps to the Linux port number (eg, as returned by ovs-dpctl show), which we do not currently care about.

To parse the ovs-appctl dpif/show information (which does require root privileges to retrieve) we need to extract out the section starting with the bridge name that we want, and finishing with the next bridge name (or end of input):

get_ovs_portmap() {
  BRIDGE="${1}"
  sudo ovs-appctl dpif/show |
       MATCH="${BRIDGE}" \
       perl -ne 'BEGIN                 { $in_switch=0;       } 
                 if (/$ENV{MATCH}/)    { $in_switch=1; next; }
                 if ($in_switch) {
                   if (/^(\t| {8})\S/) { $in_switch=0        }
                   else { print; }
                }'
}

(later versions of Open vSwitch may take an argument to dpif/show to limit output to the section for a specific bridge, as we do with the shell function above).

Given that output we can scan for the host veth interface name that we care about, and get the OpenFlow port number of the container interface we started with as it connects to that specific Open vSwitch bridge:

OF_PORT_ID=$(get_ovs_portmap "${OVS_BRIDGE}" | 
             awk "/^\s*${HOST_IF}/ "'{ print $2; }' | cut -f 1 -d /)
# eg, 3
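
The section extraction and the port lookup can also be done in a single awk pass. The following is only a sketch, assuming the dpif/show layout shown above (a "name: hit:N missed:N" header per bridge, followed by "name OpenFlowPort/LinuxPort: (type)" lines); the helper name ofport_from_dpif_show is hypothetical, and since dpif/show is an internal interface the layout may differ in other Open vSwitch versions:

```shell
# Hypothetical helper: read "ovs-appctl dpif/show" output on stdin and print
# the OpenFlow port number of interface $2 within the section for bridge $1.
ofport_from_dpif_show() {
  awk -v bridge="$1" -v ifname="$2" '
    # Bridge (and datapath) headers look like "<name>: hit:N missed:N"
    $1 ~ /:$/ && $2 ~ /^hit:/ { in_bridge = ($1 == (bridge ":")); next }
    # Port lines look like "<name> <ofport>/<linuxport>: (<type>)"
    in_bridge && $1 == ifname {
      split($2, ports, "/")
      print ports[1]                 # OpenFlow port is before the slash
      exit
    }'
}

# Assumed usage:
#   sudo ovs-appctl dpif/show |
#     ofport_from_dpif_show "${OVS_BRIDGE}" "${HOST_IF}"
```

Comparing whole fields, rather than matching a regular expression, avoids false matches when one interface or bridge name is a prefix of another.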

docker-ovs-port script

The docker-ovs-port script automates all the above steps, given a container name or identifier and an Open vSwitch bridge name:

ewen@docker:~$ ./docker-ovs-port firewall_ext
Usage: ./docker-ovs-port CONTAINER OVS_BRIDGE
ewen@docker:~$ 

It outputs three fields (in CSV format) for each ethernet interface of the Docker container:

  1. interface name inside the Docker container (eg, eth0)

  2. OpenFlow port number of that container on the named Open vSwitch bridge (if this field is empty, it is not connected to that OpenFlow bridge)

  3. (for convenience) the MAC address of the ethernet interface in that Docker container (also useful to the OpenFlow controller, and easily obtained with `ip netns exec ${NETNS} ip link show`)

For instance:

ewen@docker:~$ ./docker-ovs-port firewall_ext vlanswitch
eth0,3,9a:69:12:bc:3d:1d
ewen@docker:~$ 

or for a container with multiple connections (in this case a "firewall" container, which routes/firewalls traffic between multiple VLANs on the Open vSwitch):

ewen@docker:~$ docker ps | grep trivial_firewall | cut -c 1-65
e7fd7cf5c4cc        trivial_firewall:latest   "/bin/sh /usr/local
ewen@docker:~$ ./docker-ovs-port e7fd7cf5c4cc vlanswitch
eth0,1,52:09:90:0b:db:f3
eth1,2,e6:0c:ad:58:93:1f
ewen@docker:~$ 

(here the container does not have a manually assigned name, so we find it by the image that it is running instead, and use that to get the Docker container ID).
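
The CSV output is intended to be easy to consume from another script. As an illustration only, a caller might split the fields with read and skip any interface whose OpenFlow port field is empty (ie, not connected to the named bridge); the helper name attached_ports is hypothetical:

```shell
# Hypothetical helper: read docker-ovs-port CSV lines on stdin and print
# "ifname ofport mac" for interfaces that are attached to the bridge.
attached_ports() {
  while IFS=, read -r ifname ofport mac; do
    [ -n "${ofport}" ] || continue   # empty field: not on this bridge
    printf '%s %s %s\n' "${ifname}" "${ofport}" "${mac}"
  done
}

# Assumed usage:
#   ./docker-ovs-port e7fd7cf5c4cc vlanswitch | attached_ports
```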

ETA, 2014-10-14: An Open vSwitch developer pointed out that since Open vSwitch 2.1 (around 6 months old; newer than what is in Ubuntu 14.04 LTS -- also around 6 months old), there is a way to request that a particular interface be assigned a particular OpenFlow port on the Open vSwitch bridge, and this will be stored in the Open vSwitch database and reapplied on bridge restart if possible (eg, it does not conflict). The ovs-vswitchd.conf.db(5) man page (PDF) (apparently only available as a PDF) has more detail. It appears the syntax is something like:

ovs-vsctl add-port ${OVS_BRIDGE} ${ETH_PORT} -- set interface ${ETH_PORT} ofport_request=10

I cannot easily test this as I do not have Open vSwitch 2.1 running anywhere, but that appears consistent with this example, and examples in the FAQ for "How do I configure Quality of Service (QoS)?".

However if the request cannot be satisfied then some other port will be allocated, and the above process will be needed to find it again.

ETA, 2014-10-15: Further discussion turns up that this is actually a FAQ in the Open vSwitch FAQ (sadly one cannot link directly to a specific question, because Open vSwitch's online FAQ is just a GitHub view of a text file):

Q: How can I figure out the OpenFlow port number for a given port?

which offers several options:

  1. An OpenFlow OFPT_FEATURES_REQUEST (returning an OFPT_FEATURES_REPLY) includes the OpenFlow Port to Name mapping, which could then be parsed by an OpenFlow Controller looking for a specific interface name (saving a step if you are already passing the information to an OpenFlow Controller).

  2. ovs-ofctl show ${OVS_BRIDGE} prints the output of OFPT_FEATURES_REPLY in a format that could be parsed, eg:

    [....]
    1(vethp10840eth0): addr:4a:87:60:b0:be:cc
    [....]
    2(vethp10840eth1): addr:aa:14:8e:d3:3d:d8
    

    and being a public interface hopefully this would be a little more stable (AFAICT the OpenFlow port number is first, then the Linux interface name inside parentheses, then the MAC address of that host-end interface, which isn't as useful to us).

  3. ovs-vsctl get Interface ${INTERFACE_NAME} ofport (which would need to be combined with sudo ovs-vsctl iface-to-br ${INTERFACE_NAME} to check that the interface is actually on the bridge you want -- otherwise the port number refers to some other OVS Bridge...).

  4. ovs-vsctl -- --columns=name,ofport list Interface to get the whole table (two lines per interface), where -1 indicates "no OpenFlow port". (Trickier to parse, and lists all bridges, so probably not ideal.)
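
Of these options, the ovs-ofctl show output (option 2) is probably the simplest to parse from a shell script. A minimal sketch, assuming port lines of the form "N(name): addr:..." as in the excerpt above; the helper name ofport_from_ofctl_show is hypothetical:

```shell
# Hypothetical helper: read "ovs-ofctl show BRIDGE" output on stdin and
# print the OpenFlow port number of the named interface.
ofport_from_ofctl_show() {
  awk -v ifname="$1" '
    {
      # Port lines start with "<ofport>(<name>):", eg "1(vethp10840eth0):"
      lp = index($1, "(")
      if (lp > 1 && substr($1, lp) == "(" ifname "):") {
        port = substr($1, 1, lp - 1)
        # Only numeric ports are reported; named ports are skipped
        if (port ~ /^[0-9]+$/) { print port; exit }
      }
    }'
}

# Assumed usage:
#   sudo ovs-ofctl show "${OVS_BRIDGE}" | ofport_from_ofctl_show "${HOST_IF}"
```

Since ovs-ofctl show is run against a single bridge, no separate check of which bridge the interface belongs to is needed with this approach.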

Finally, it looks like combining Docker and Open vSwitch is topical, with SocketPlane founded to explore that (started by some OpenDaylight developers).

Also in the same sort of space is Zettio's Weave for linking containers across multiple hosts. (It appears not to be using Open vSwitch, but some sort of home-grown UDP encapsulation.)