Quagga: RIP offset-list in 1 is a No-Op

One of my clients has a small network using RIPv2 as an IGP -- because the vendor of their older layer 3 switches wanted nearly twice as much money if a license for OSPF, etc, was included.

The network has a ring of four layer 3 switches, and two firewalls; two layer 3 switches connected to each firewall, and the two layer 3 switches on each side of the firewalls connected to each other. There are also a couple of firewall-specific segments for each firewall (DMZs).

Firewalls being firewalls, they are quite keen to see symmetric routing of traffic -- ie, return traffic comes in on the same interface that the outgoing traffic went out. The ring of layer 3 switches and firewalls obviously presents several "equal cost" paths by default, so a priority of planned (symmetric!) path selection was implemented on the firewalls (as the devices that care) using RIP offset lists. There are both "out" (ie, announce) offset lists to ensure symmetric traffic flow between combinations of layer 3 switches, and also "in" (ie, receive) offset lists to attempt to ensure symmetric traffic flow between the "firewall local" network segments (DMZs).

During a recent partial outage (due to sudden hardware failure of another device) it was discovered that the remote firewall to local firewall's DMZ traffic flows were not working properly, which turned out to be due to one of the firewalls having picked the wrong interface to send that DMZ traffic. At the time a quick workaround was implemented using static routes to override the RIP path selection -- but once the dust settled I went looking for a reason why the configuration planned for that scenario did not just work.

The firewalls are using the ripd from Quagga -- a relatively old version, as they have been in production for some years, but the current Quagga source (as of 2016-03-23) appears to behave in effectively the same way.

The offset lists for the DMZ-to-DMZ traffic flows were implemented with something like:

access-list dmz-traffic permit A.B.C.D/N
[....]
access-list dmz-traffic deny any

router rip
    offset-list dmz-traffic in 1 bnx1

where bnx1 was the interface that was undesired -- ie with the hope that the other interface (bnx0 in this case) would be used instead (due to its metric being "1 less" than the bnx1 metric).

However, it turns out that in Quagga "offset-list NAME in 1 [INTERFACE]" does not actually have any effect, because:

by default RIP adds the interface metric to incoming prefixes
the default interface metric is 1 (in the version my client is running; see below)
in Quagga's ripd, the offset-list value is applied instead of adding the interface metric.

In particular, ripd/ripd.c, around line 556, contains the comment:

/* If offset-list does not modify the metric use interface's
   metric. */

and code that checks the return value of rip_offset_list_apply_in() and only adds the interface metric if rip_offset_list_apply_in() returned 0, indicating that the offset list did not change the metric value.

So offset-list NAME in 1 [INTERFACE] adds 1 if the offset list matches; but if the offset list did not match, then 1 (the default interface metric) would have been added anyway. So offset-list NAME in 1 [INTERFACE] makes no difference to the calculated metric -- 1 is added whether or not the offset-list matched. It is a "No-Op".
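As a rough illustration, the logic described above can be modelled in a few lines of Python. This is only a sketch of the behaviour, not ripd's actual code: the function name, parameter names, and the use of None for "no offset-list matched" are all made up for the example.

```python
RIP_INFINITY = 16  # RIP treats a metric of 16 as unreachable


def quagga_inbound_metric(received, interface_metric=1, offset=None):
    """Toy model of the inbound metric handling described above.

    `offset` is the offset-list value when the prefix matched an "in"
    offset-list on the receiving interface, or None when nothing matched.
    (Hypothetical helper for illustration; not a ripd function.)
    """
    if offset is not None:
        # The offset-list matched: its value is used INSTEAD of the
        # interface metric.
        metric = received + offset
    else:
        # No match: the interface metric is added, as by default.
        metric = received + interface_metric
    return min(metric, RIP_INFINITY)


# A route received with metric 1 on an interface with the default metric 1:
print(quagga_inbound_metric(1))            # no offset-list match -> 2
print(quagga_inbound_metric(1, offset=1))  # "offset-list NAME in 1" -> 2
print(quagga_inbound_metric(1, offset=2))  # "offset-list NAME in 2" -> 3
```

The first two calls give the same result, which is the "No-Op" in question.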

To actually affect the incoming metric by more than what would happen by default it is necessary to use:

offset-list NAME in 2 [INTERFACE]

in Quagga.
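To see why this matters for path selection, here is a small Python sketch comparing the two interfaces from the example (bnx0 and bnx1). It assumes the same prefix is received with the same metric on both interfaces, that both have the default interface metric of 1, and that the offset-list only applies to bnx1; the helper names are hypothetical.

```python
RIP_INFINITY = 16  # RIP treats a metric of 16 as unreachable


def inbound_metric(received, interface_metric=1, offset=None):
    # Toy model: a matching "in" offset replaces the interface metric.
    added = interface_metric if offset is None else offset
    return min(received + added, RIP_INFINITY)


def preferred_interfaces(received, bnx1_offset):
    """Return the interface(s) with the lowest resulting metric."""
    metrics = {
        "bnx0": inbound_metric(received),                      # no offset-list
        "bnx1": inbound_metric(received, offset=bnx1_offset),  # offset-list applies
    }
    best = min(metrics.values())
    return sorted(name for name, metric in metrics.items() if metric == best)


print(preferred_interfaces(1, bnx1_offset=1))  # ['bnx0', 'bnx1'] -- still a tie
print(preferred_interfaces(1, bnx1_offset=2))  # ['bnx0']
```

With an offset of 1 the two interfaces still tie, so the offset-list does nothing to steer the choice; with an offset of 2, bnx0 wins as intended.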

Outbound offset lists (ie, offset-list NAME out 1 [INTERFACE]) behave differently: by default nothing is added to the outgoing announcements, so adding 1 to them increases the metric beyond the default value, as intended. Fortunately the various "layer 3 switch to layer 3 switch" paths on my client network were all implemented using out offset lists, and worked fine. (The DMZ traffic paths were implemented with an incoming offset list simply because it seemed an easy way to keep the two prefix lists separate, and thus more consistent on the two firewalls.)

While I have not verified it on actual hardware (and it appears completely undocumented for both Quagga and Cisco AFAICT) Quagga's handling of incoming offset list values appears to be different behaviour from what happens on Cisco routers.

For instance routerlabs.de's RIPv4 Manipulation of the Metric with Offset Lists example, apparently using Cisco 7200s, shows the commands:

access-list 10 permit 172.17.0.10 0.0.0.0
router rip
    offset-list 10 in 5 Serial 1/0

changing the metric of 172.17.0.10 from:

R       172.17.0.10 [120/1] via 192.168.100.2, 00:00:05, Serial1/0

to:

R       172.17.0.10 [120/6] via 192.168.100.2, 00:00:00, Serial1/0

Ie, from 1 to 6: 1 (for the interface) plus 5 (for the offset-list). (There's also similar behaviour in this Journey of a Network Engineer post, which also seems to be about Cisco -- eg, it talks about CCIE, a Cisco certification -- and in this Cisco forum thread on RIP, which shows what appears to be a screenshot of actual Cisco routers, with "offset-list 1 in 10" taking the metric of 2.2.2.0/24 from 1 to 11.)

Which implies that on Cisco the offset-list NAME in N value is in addition to the interface value. That is intuitively what I would expect from an offset list (ie, "adjust by N more", rather than "replace interface metric adjustment").
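The difference between the two interpretations can be put side by side in a short Python sketch. Both functions are hypothetical models written for this comparison: "additive" is the behaviour inferred from the Cisco examples above (not verified on hardware), and "replacing" is the Quagga behaviour described earlier. Here `base_metric` means the metric the route would have with no offset-list configured.

```python
RIP_INFINITY = 16  # RIP treats a metric of 16 as unreachable


def additive_model(base_metric, offset):
    """Inferred Cisco-style model: the offset is added on top of the base metric."""
    return min(base_metric + offset, RIP_INFINITY)


def replacing_model(base_metric, offset, interface_metric=1):
    """Quagga-style model: the offset takes the place of the interface metric."""
    return min(base_metric - interface_metric + offset, RIP_INFINITY)


# "offset-list 10 in 5" on a route that shows [120/1] without the offset-list:
print(additive_model(1, 5))    # 6, matching the [120/6] shown above
print(replacing_model(1, 5))   # 5

# "offset-list 1 in 10" on a route whose metric is 1 without the offset-list:
print(additive_model(1, 10))   # 11, matching the forum thread
print(replacing_model(1, 10))  # 10
```

Only the additive model reproduces the metrics shown in the Cisco examples.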

Possibly this is a (very longstanding?) bug in Quagga. AFAICT it has not been reported; I have not reported it either, since it is possibly too late to change without breaking people's configurations, and it would not make much difference to my client, who will simply change their offset-list to say "2" instead of "1" rather than trying to install a more modern Quagga. But it did seem worth recording somewhere it could be found by searching the Internet, because if someone else had recorded it, it would have saved me a bunch of time and confusion!

Sidebar: Quagga interface metrics and RIP

A few years ago, the default interface metric was set to 0 (apparently to match Linux's defaults). This caused problems for RIP convergence. So a special case was added to the ripd/ripd.c code to treat an interface metric of 0 as if it were 1 (as suggested here, and agreed here; patch as applied).

But the special behaviour of offset-list NAME in seems to have been there since the first revision checked into git through until now, with only some minor formatting changes.