DMVPN DualHub EIGRP Traffic Engineering

With the rise of VDSL, fiber and cable Internet, and the appropriate SLAs, business Internet connections have become increasingly reliable. By choosing the local ISPs carefully, it becomes much more attractive for a company to replace its MPLS connections with an overlay network based on redundant Internet connections. As a result, businesses quite often obtain a higher-speed connection at much lower rates. One of the business cases I made in 2006/2007 showed a 70% decrease in annual costs compared to the company's European WAN based on an MPLS service provider, including high availability.
These kinds of overlay networks are quite often based on a hub-and-spoke technology: each spoke registers itself with the hub router, so the spoke configuration is relatively easy and adaptable per site, while the hub configuration doesn’t change. This concept scales very well. With certain technologies spoke-to-spoke traffic is handled automatically. The Cisco technology for this is called DMVPN (Dynamic Multipoint Virtual Private Network) and is used intensively in their IWAN solution. The most common routing protocol for these topologies is EIGRP, because of its dynamic routing and fast convergence.

Things get interesting when you start to use a dual-hub topology, so that a branch office can use two separate Internet connections. In these situations it can happen that WAN traffic flows over a flapping or unreliable Internet connection, or that traffic goes over the backup line, which is much slower than the primary connection. In this post I will explain why EIGRP does this and how you can change that behaviour within EIGRP.

Network Topology

The diagram below shows a very common topology for a DMVPN based on a dual-hub / dual-ISP solution. Each hub is connected to a separate ISP (for redundancy, so that an ISP failure will not result in extra connectivity problems) and a branch office also has two ISP connections for the same redundancy.
The WAN connections are terminated on a DMZ interface of a firewall, so that traffic can be inspected before it enters the datacenter.
With this setup, if one ISP fails, the WAN traffic will flow via the other ISP connection.

Problem

If the two Internet connections of the spoke are identical (with regard to bandwidth and speed), then there is not really a problem; both DMVPN tunnels act in an active/active topology where traffic (ingress and egress) is load balanced. The output of the command “show ip route eigrp” on the spoke router makes this clear: the main office (10.0.1.0/24) is reachable via both hub routers.

On the ASA, it is also visible that the spoke is reachable via two next hops. Bear in mind that the ASA doesn’t do per-packet load balancing but flow-based load balancing (i.e. per connection, and not per packet as a router can be configured to do).

However, a problem will occur when the second Internet connection on the spoke is the backup connection with a lower speed (for example a 4G connection). Let’s change the bandwidth of tunnel2 (backup) to a lower value, so that traffic from the branch office will go via the primary connection (tunnel1).
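
As a minimal sketch, the change on the spoke could look like the fragment below. The value of 4000 kbit/s is an assumed figure for a 4G backup line and is not taken from the original setup. Note that the bandwidth command only influences metric (and QoS) calculations; it does not limit the actual throughput of the tunnel.

```
! Spoke router: mark the backup tunnel as the slower path
! (4000 kbit/s is a hypothetical value)
interface Tunnel2
 bandwidth 4000
```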

And now traffic from the spoke only flows via Tunnel1, as the output demonstrates.

However, the problem is not at the branch, it is at the hub. On the ASA-HQ, traffic for the branch (10.10.1.0/24) is still being directed to both hubs!

So traffic from HQ to the branch is still sent over the slower link or, in case of a problem on the Internet, via an unreliable path. As a result the WAN connection has become unreliable, even though we have redundancy.

The cause of this problem lies within EIGRP itself. EIGRP is a very nice (and now open) protocol: an advanced distance-vector protocol that borrows some link-state characteristics. The bandwidth of an interface is one of the components of the metric that is used to determine the best path to a destination. But the bandwidth change was made on the spoke's tunnel interface, which only influences the routes the spoke itself learns through that tunnel. From the perspective of the ASA-HQ the bandwidth towards the branch office is the same via both paths, because hub1 and hub2 are connected with the same bandwidth, and that is, in fact, quite logical. As a result, the cost for 10.10.1.0/24 is the same via both paths. In summary, the ASA doesn’t “know” that tunnel2 for that specific branch is slower. The following output shows that the EIGRP metrics are the same for both hubs.
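
Why the metrics come out equal can be seen from the classic EIGRP composite metric. Assuming default K-values and the classic (non-wide) metric, it reduces to the formula below, where BW_min is the lowest configured bandwidth (in kbit/s) of the outgoing interfaces on the path from the calculating router towards the destination, and the delay is summed (in microseconds) over those same interfaces:

```latex
\text{metric} = 256 \times \left( \frac{10^{7}}{BW_{\min}} + \frac{\sum \text{delay}}{10} \right)
```

For the route from HQ to 10.10.1.0/24, the interfaces that count are those on the ASA, on the hub (its tunnel interface) and on the spoke's LAN side. The spoke's Tunnel2 interface is not an outgoing interface in that direction, so lowering its bandwidth does not change what the hubs and the ASA calculate.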

It is of course possible to lower the bandwidth on hub2, so that traffic is preferred via hub1. But that change affects every spoke behind hub2: it breaks down as soon as a spoke has problems on its connection to hub1 and needs to be directed via hub2, and it is not practical when hundreds of spokes are connected to hub2.

Solution

It is still possible to inform the ASA that the traffic for a specific branch office should be directed via a specific hub router. As the bandwidth of the hub router cannot be changed, another value needs to be changed for specific destinations to enforce a different preferred path, while keeping the flexibility and redundancy within EIGRP.

For this we will use the administrative distance in combination with an access-list. As you may know, Cisco uses an administrative distance per routing source to decide which route to a given network is installed in the RIB; the lowest distance wins. For EIGRP the administrative distance is 90 for internal routes and 170 for external routes (routes that are injected into EIGRP).

In this topology all routes to the spokes are internal and have a distance of 90.

In Cisco IOS you can use the distance command to change the administrative distance of networks that are learned from neighbors in a specific IP subnet (here, the subnet of the tunnel interface on which the spokes connect). By attaching a standard access-list to this command, only the networks that are matched by the access-list and learned from that subnet will have their distance changed.

The configuration on IOS would then be:
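
A minimal sketch of that configuration, reconstructed from the description in this post. The EIGRP autonomous system number 100 is an assumption; substitute the one used in your own DMVPN.

```
! Standard ACL that selects the networks to de-prefer.
! The 127.0.0.1 entry only keeps the list present in the config.
access-list 99 permit 127.0.0.1
!
! AS number 100 is hypothetical
router eigrp 100
 ! Routes learned from neighbors in 10.255.1.0/24 that match ACL 99
 ! get administrative distance 95 instead of 90
 distance 95 10.255.1.0 0.0.0.255 99
```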

The standard access-list 99 is used to match the networks (route prefixes) whose distance should change. Adding the address 127.0.0.1 ensures that access-list 99 remains in the configuration, even when we don’t want to do any traffic engineering.
Within the EIGRP configuration, the command distance 95 10.255.1.0 0.0.0.255 99 is added. This basically tells the router that for all networks that are matched by access-list 99 and learned from neighbors in IP network 10.255.1.0/24, the distance needs to be set to 95.

When you now dynamically add the network of a spoke to access-list 99 on the hub that needs to become the backup router, and clear that specific neighbor, the distance of that route on the hub will be changed to 95. That hub then no longer prefers its own tunnel for this network, and as a result the routing on ASA-HQ changes as well. (The administrative distance itself is local to the hub; it is not advertised to the ASA.)
In this example I want to route traffic over hub1, so I need to change hub2 with the following config:
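
A minimal sketch of that change on hub2, assuming the access-list and distance command described earlier are already in place. The neighbor address 10.255.1.10 is a hypothetical tunnel address for the spoke; use the address shown by “show ip eigrp neighbors” on your own hub.

```
! Hub2, configuration mode: de-prefer the branch network 10.10.1.0/24
access-list 99 permit 10.10.1.0 0.0.0.255
!
! Hub2, exec mode: reset the adjacency with the spoke so the route is
! re-learned with the new distance (neighbor address is hypothetical)
clear ip eigrp neighbors 10.255.1.10
```

Clearing the neighbor briefly drops the EIGRP adjacency with that one spoke; the other spokes on the hub are not touched.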

When checking the EIGRP routes, traffic is now preferred via hub1, even from hub2 itself!

The route via the DMVPN tunnel is of course still valid, but it now has a higher administrative distance and is therefore not installed in the RIB.

And the ASA will also route traffic for the branch office (10.10.1.0/24) via hub1:

With this, a potentially complex situation is easily fixed. If you have the feeling that a specific WAN connection for a specific branch office is acting up, just adding the network to (or removing it from) the access-list lets you force the traffic onto a specific path.
I’ve used this method quite regularly to determine whether a provider is having packet loss, latency, out-of-order packets, or other possible network-related problems.
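
Removing a network again deserves a little care. On many IOS releases, entering “no access-list 99 permit ...” in global configuration deletes the whole numbered access-list rather than a single entry. A safer sketch is to remove the entry from within the access-list configuration mode; the neighbor address is again a hypothetical value.

```
! Hub2, configuration mode: remove only the branch entry from ACL 99
ip access-list standard 99
 no permit 10.10.1.0 0.0.0.255
!
! Hub2, exec mode: re-learn the route with the default distance of 90
! (neighbor address is hypothetical)
clear ip eigrp neighbors 10.255.1.10
```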
