Andras Dosztal
Network architect
Apr 25, 2016

Replication Over The Backup WAN, Part 1

“It’s fine to have a backup link, but normally it’s not used and we’re paying a lot for it; could you send our file storage’s replication traffic through it?” is a recurring customer question, immediately followed by: “Just make sure business traffic is preferred in case of a primary link failure.” I’ll examine three scenarios in this post and the next two.

Scenario 1: Small remote branch

Topology
Here we have a single edge router (IOU1) with two WAN links. This topology might be used on a small site with unreliable WAN connections. Notes:

  • Ostinato’s eth0 is only used to transmit the stream file to my PC.
  • IP addresses were assigned using these formulas (H = hostname sequence number, i.e. 4 for IOU4):
    • Loopbacks: H.H.H.H/32
    • Interfaces: 192.168.{H1}{H2}.{H1|H2} (e.g. 192.168.24.4 for s2/0 of IOU2, 192.168.13.1 for e0/1 of IOU1)
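
As a worked instance of the scheme, IOU1’s addressing might look like the sketch below. The loopback name, the /24 masks and the 192.168.14.1 address on s3/0 are assumptions derived from the formulas above (and from the 192.168.14.4 next hop used later), not copied from the lab configs.

```
! IOU1 (H = 1), addresses derived from the formulas
interface Loopback0
  ip address 1.1.1.1 255.255.255.255
! Primary WAN link towards IOU3: 192.168.13.x, own number last
interface Ethernet0/1
  ip address 192.168.13.1 255.255.255.0
! Backup WAN link towards IOU4: 192.168.14.x, own number last
interface Serial3/0
  ip address 192.168.14.1 255.255.255.0
```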

Policy-based routing (PBR) is the key here. In this example, traffic from source 192.168.1.99 is policy-routed to s3/0 on IOU1, while the rest of the traffic uses the primary link (e0/1). The first step in creating PBR is defining an ACL that matches the source:

access-list 100 remark Replication interface of the storage
access-list 100 permit ip host 192.168.1.99 any

Step 2 is creating a route-map:

route-map Backup_traffic permit 10
  match ip address 100
  set ip next-hop 192.168.14.4
route-map Backup_traffic permit 20

Sequence 10 matches the replication traffic from 192.168.1.99 and sets IOU4 as the next hop. Sequence 20 matches everything else and, as it has no set statements, uses the routing table to look up the next hop. All we have to do now is enable the policy on the LAN interface of IOU1 (e0/0):

interface Ethernet0/0
  ip policy route-map Backup_traffic
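
To confirm that the policy is attached and that sequence 10 is actually matching packets, the standard show commands can be run on IOU1. This is only a sketch: the output is omitted because its exact layout depends on the IOS release.

```
! Lists the interfaces that have a policy route-map applied
IOU1# show ip policy
! Shows each sequence of the route-map with its match counters
IOU1# show route-map Backup_traffic
```

If the counters for sequence 10 stay at zero while the storage is replicating, the ACL or the interface the policy is applied to is the first thing to check.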

Preferring everything over replication traffic

A simple QoS ruleset is required here. First we identify the replication traffic in a class-map using the same access list that we used for the PBR:

class-map match-all Backup
  match access-group 100

Then a policy-map has to be created:

policy-map Backup
  class Backup
    set dscp af13
    bandwidth percent 5
  class class-default
    fair-queue

This policy-map marks the replication traffic with DSCP AF13 (this is an example; you should agree on the DSCP value with your WAN provider) and guarantees it 5% of the available bandwidth when the link is congested; the rest is left for everything else. Note that when there’s no other traffic on the backup line (i.e. both links are up), replication traffic can use all of the available bandwidth. Finally, the policy has to be applied on both WAN interfaces (we need QoS on the primary link too, in case the backup link goes down):

interface s3/0
  service-policy output Backup
interface e0/1
  service-policy output Backup
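
To check that the service policy is attached and that packets land in the expected class, the per-interface counters can be inspected. A minimal sketch, with the output left out because it varies by release:

```
! Per-class packet counters and queueing statistics on the backup link
IOU1# show policy-map interface Serial3/0
! Same check on the primary link
IOU1# show policy-map interface Ethernet0/1
```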

Tests

Let’s see how the solution works in daily operation as well as during failovers. I created two streams in Ostinato:

Simulating normal traffic
    Source: 192.168.1.10
    Destination: 192.168.99.10
    Protocol: UDP
Simulating replication traffic
    Source: 192.168.1.99
    Destination: 192.168.99.10
    Protocol: ICMP

The different protocols are marked with different colors in Wireshark, so it’s easy to tell them apart.

Capture of IOU1 e0/1, only UDP traffic can be seen (along with some OSPF hellos and other packets):

Primary link - normal operations

IOU1 s3/0, all ICMP without a single UDP:

Backup link - normal operations

When the primary link goes down, all traffic shifts to the backup link. The expected behavior is lots of UDP packets with only a few ICMP, as the latter is held to its 5% share of the bandwidth once the link is congested.

Primary link down

Works like a charm. 😄

When the backup link goes down, the next hop defined in the policy is no longer reachable (the interface is down), so PBR is not in effect and mixed traffic appears on the primary link.

Backup link down

Note: if the backup link were connected to a switched network, IP SLA and object tracking would be needed. Reason: if the link between the remote router and the switch went down, IOU1’s own interface would stay up, so IOU1 wouldn’t notice the failure and PBR would keep sending replication traffic to the unreachable next hop.
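
A minimal sketch of how that could look, assuming the same next hop (192.168.14.4) and route-map as above. The SLA and track object numbers are arbitrary, and the exact keywords vary between IOS releases (older ones use “track 1 rtr 1 reachability”, for example), so treat this as an illustration rather than a tested config:

```
! Probe the PBR next hop with ICMP echo every 5 seconds
ip sla 1
  icmp-echo 192.168.14.4
  frequency 5
ip sla schedule 1 life forever start-time now
!
! Track object 1 follows the result of the probe
track 1 ip sla 1 reachability
!
! Use the next hop only while the track object is up;
! otherwise fall back to the routing table
route-map Backup_traffic permit 10
  match ip address 100
  no set ip next-hop 192.168.14.4
  set ip next-hop verify-availability 192.168.14.4 1 track 1
```

With this in place, a failure beyond the switch takes the track object down, and sequence 10 stops overriding the routing table even though the local interface is still up.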

Downloadable files

Note about R2’s config: the destination IP used in the tests is not reachable, so a static route pointing to null0 was used as a black hole. To avoid flooding ICMP unreachables, the “no ip unreachables” command was used on this router’s interfaces.
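
For readers who don’t want to open the files, a black hole of this kind could be sketched as follows. The /24 prefix length is an assumption, and the sketch disables unreachables on the Null0 interface itself, which is a common variation and not necessarily the exact placement used in the lab:

```
! Black-hole the test destination network (prefix length assumed)
ip route 192.168.99.0 255.255.255.0 Null0
!
! Do not answer the dropped packets with ICMP unreachables
interface Null0
  no ip unreachables
```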