While testing for my previous post, I ran into an issue where I had configured a bond-type 1 (active-backup) interface, but the active slave never failed over when I disconnected the interface. For the life of me, I didn’t have a clue why! It turned out that the configuration on the ESXi host’s vSwitch was wrong, which is why the failover never happened.

Before I worked out the ESXi vSwitch problem, I was looking at a number of different ways to fix the issue. In my searching I found a great article by Ivan Erben on how to manually fail over the active slave in a bond-type 1 configuration.

It was quite straightforward, as I like it :p

Firstly, check to see what the active slave is by using the command cat /proc/net/bonding/bond0

marquk01@km-vm1:~$ cat /proc/net/bonding/bond0 
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth1
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:c5
Slave queue ID: 0

Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:cf
Slave queue ID: 0
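As a side note, the bonding driver also exposes the current active slave as a single sysfs attribute, which is handy if you only want the interface name (in a script, for example) rather than the full report. A quick one-liner, assuming the bond is named bond0 as above:

cat /sys/class/net/bond0/bonding/active_slave

On this host that would simply print eth1.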

Having seen that eth1 is the active slave, we can remove the interface from the bond by running echo -eth1 > /sys/class/net/bond0/bonding/slaves

You will need to sudo to root to make this change.

marquk01@km-vm1:~$ sudo -s
[sudo] password for marquk01: 
root@km-vm1:~# echo -eth1 > /sys/class/net/bond0/bonding/slaves
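As an aside, if you would rather not drop into a root shell, the same write can be done with sudo and tee. A plain sudo echo -eth1 > /sys/class/net/bond0/bonding/slaves would fail, because the redirection is performed by your unprivileged shell rather than by sudo. A small alternative, using the same bond0 path:

echo -eth1 | sudo tee /sys/class/net/bond0/bonding/slaves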

We can see that eth1 has been removed from the bond configuration and eth2 has taken over as the active slave.

root@km-vm1:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth2
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:cf
Slave queue ID: 0

The bond will still pass traffic and work as expected. To add the interface back into the bond, we need to run: echo +eth1 > /sys/class/net/bond0/bonding/slaves

As we can see below, eth1 has been added back into the bond as a backup slave, while eth2 remains the active slave.

root@km-vm1:~# echo +eth1 > /sys/class/net/bond0/bonding/slaves
root@km-vm1:~# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: eth2
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: eth2
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:cf
Slave queue ID: 0

Slave Interface: eth1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:0c:29:4f:26:c5
Slave queue ID: 0
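One thing to note: because eth1 rejoins as a backup, eth2 stays active. If you want eth1 to take over again, the bonding driver’s sysfs interface also lets you pick the active slave directly in active-backup mode by writing the interface name to the active_slave attribute. Something along these lines (as root, again assuming bond0):

echo eth1 > /sys/class/net/bond0/bonding/active_slave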

This is very useful if you have planned maintenance or need a quick failover of interfaces and you don’t have link detection enabled (as here, where the MII Polling Interval is 0, so the bond never notices a failed link by itself). Definitely a great find and post by Ivan! You can check out his blog here
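To wrap the whole procedure up for a planned maintenance window, here is a minimal sketch that strings the above commands together. It assumes the bond is named bond0, the script is run as root, and the slave to take out of service (an eth-style name like eth1) is passed as the first argument:

#!/bin/bash
# Manual failover for an active-backup bond, following the steps above.
# Assumptions: bond is bond0, run as root, slave interface passed as $1.
set -euo pipefail

BOND=bond0
IFACE=${1:?usage: $0 <slave interface, e.g. eth1>}

echo "Active slave before: $(cat /sys/class/net/$BOND/bonding/active_slave)"

# Remove the slave from the bond; the remaining slave takes over.
echo "-$IFACE" > /sys/class/net/$BOND/bonding/slaves
echo "Active slave after removal: $(cat /sys/class/net/$BOND/bonding/active_slave)"

read -rp "Press Enter when maintenance on $IFACE is done... "

# Re-add the interface; it rejoins the bond as a backup slave.
echo "+$IFACE" > /sys/class/net/$BOND/bonding/slaves
echo "Active slave now: $(cat /sys/class/net/$BOND/bonding/active_slave)"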
