Legacy LNet Active-Passive InfiniBand Bonding with ib-bond HCA Driver
Note: This article is superseded in later versions of Lustre that include Multi-Rail LNet support. The Multi-Rail LNet feature was introduced in Lustre 2.10.0.
For versions of Lustre that do not include Multi-Rail LNet support, LNet can still be configured to take advantage of bonded network interfaces when they are presented as a single device by the underlying transport. Be aware that devices using OFED or the in-kernel InfiniBand bonding drivers only support active-passive (failover) network bonding, which means that only one physical interface is active at any time. Thus, configuring multiple InfiniBand connections on a single fabric using the ib-bond kernel driver improves fault tolerance, but does not increase throughput.
Refer to the Multi-Rail LNet article for information on the far more powerful and versatile networking functionality implemented in LNet since Lustre 2.10.0. Multi-Rail LNet offers multiple active network paths and is implemented in a manner that is fabric-agnostic. Lustre's native Multi-Rail LNet functionality allows data to be aggregated across multiple network transports simultaneously, and also provides fault tolerance features.
Note: A host can have multiple independent LNet interfaces configured and connected to separate networks (multi-homing), without requiring either bonding or Multi-Rail functionality. This allows servers to be directly connected to multiple fabrics simultaneously, and allows a Lustre client to mount file systems that are presented over different fabrics.
The ko2iblnd LND provides support for InfiniBand network device bonding in an active-passive configuration, for the purposes of high availability (HA). Because the bonded interface is active-passive, there is no improvement in throughput performance, so the feature is only suitable for use in situations where service availability is a mandated requirement (mission-critical platforms).
With this form of bonding, the server actively uses one interface in the bonded group at a time. If the active interface fails, traffic fails over to the remaining interface in the bond group.
This form of InfiniBand bonding support is distinct from the use of bonded network interfaces with ksocklnd, which runs over TCP/IP sockets. The ksocklnd TCP/IP LNet driver does not distinguish between bonded and single interfaces, and no specific LNet configuration is required.
Enabling Active-Passive InfiniBand (o2ib) Bonding
To enable failover support in LNet for bonded InfiniBand (or other network interfaces supported by OFED), add the following option into the kernel modules configuration:
options ko2iblnd dev_failover=1
The common convention is to create files in the directory /etc/modprobe.d containing options for loadable kernel modules.
With this option enabled, one can refer to the bonded network interface in the LNet configuration. For example:
options lnet networks=o2ib0(bond0)
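As a quick consistency check, the two options above can be read back from the text of a modprobe configuration file. The following is a minimal Python sketch, not part of Lustre: the function names parse_modprobe_options and failover_enabled are hypothetical, and the parser assumes simple "options <module> key=value" lines only.

```python
import shlex


def parse_modprobe_options(text):
    """Collect 'options <module> key=value ...' lines into a nested dict."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words = shlex.split(line)  # also strips quotes around values
        if len(words) < 2 or words[0] != "options":
            continue  # skip alias, install, blacklist and other directives
        module_opts = options.setdefault(words[1], {})
        for word in words[2:]:
            key, _, value = word.partition("=")
            module_opts[key] = value
    return options


def failover_enabled(options):
    """True when ko2iblnd has dev_failover=1 set (hypothetical helper)."""
    return options.get("ko2iblnd", {}).get("dev_failover") == "1"


# Sample text mirroring the two option lines shown above.
SAMPLE = """\
options ko2iblnd dev_failover=1
options lnet networks=o2ib0(bond0)
"""

opts = parse_modprobe_options(SAMPLE)
print("lnet networks:", opts["lnet"]["networks"])
print("dev_failover enabled:", failover_enabled(opts))
```

On a real host, the text would typically be read from a file under /etc/modprobe.d rather than from an inline sample.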
The following example, based on a RHEL / CentOS operating platform, illustrates a bonded network configuration for a Lustre system with two InfiniBand interfaces.
/etc/modprobe.d/lustre.conf:
alias ibbond bonding
options lnet networks=o2ib0(ibbond)
options ko2iblnd dev_failover=1

/etc/sysconfig/network-scripts/ifcfg-ibbond:
DEVICE=ibbond
BOOTPROTO=none
IPADDR=10.0.0.11
NETMASK=255.255.0.0
ONBOOT=yes
TYPE=Bonding
USERCTL=no
MTU=2044
BONDING_OPTS="mode=1 miimon=100 primary=ib0"

/etc/sysconfig/network-scripts/ifcfg-ib0:
DEVICE=ib0
USERCTL=no
ONBOOT=yes
MASTER=ibbond
SLAVE=yes
BOOTPROTO=none
TYPE=InfiniBand

/etc/sysconfig/network-scripts/ifcfg-ib1:
DEVICE=ib1
USERCTL=no
ONBOOT=yes
MASTER=ibbond
SLAVE=yes
BOOTPROTO=none
TYPE=InfiniBand
The ibbond alias name in the sample /etc/modprobe.d/lustre.conf configuration file is arbitrary, but is more descriptive than, for example, bond0. It is common to encounter installations where there are both bonded Ethernet and bonded IB interfaces on the same host, and choosing a descriptive naming convention simplifies administration of the machines.
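Once a bond such as ibbond is up, the Linux bonding driver reports its state in /proc/net/bonding/<bond name>, which is a convenient way to confirm that the bond is in active-backup mode and to see which interface is currently carrying traffic. The following Python sketch parses that status text. The helper names are hypothetical, and the sample text is an abbreviated illustration of the format, not output captured from a real system.

```python
def parse_bond_status(text):
    """Extract mode, active slave and per-slave link state from bonding status text."""
    status = {"mode": None, "active": None, "slaves": {}}
    current = None  # name of the slave section being read, if any
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status["mode"] = value
        elif key == "Currently Active Slave":
            status["active"] = value
        elif key == "Slave Interface":
            current = value
            status["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only per-slave link state is recorded; the bond-level
            # "MII Status" line appears before any slave section.
            status["slaves"][current] = value
    return status


def standby_slaves(status):
    """Slaves with link up that are not the active one (hypothetical helper)."""
    return [name for name, link in status["slaves"].items()
            if link == "up" and name != status["active"]]


def read_bond_status(bond_name):
    """Read and parse the status of a bond on a live host."""
    with open(f"/proc/net/bonding/{bond_name}") as handle:
        return parse_bond_status(handle.read())


# Abbreviated, illustrative sample (assumption: not real captured output).
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ib0 (primary_reselect always)
Currently Active Slave: ib0
MII Status: up

Slave Interface: ib0
MII Status: up

Slave Interface: ib1
MII Status: up
"""

status = parse_bond_status(SAMPLE)
print("mode:", status["mode"])
print("active:", status["active"])
print("standby:", standby_slaves(status))
```

On a configured host, read_bond_status("ibbond") would parse the live state instead of the sample. An empty standby list means that a failure of the active interface would leave the bond with no path to fail over to.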
Restrictions for Legacy ib-bond LNet Topologies
If the version of Lustre does not natively support multi-rail topologies, i.e., multiple network interfaces connected to the same subnet, attempts to assign two interfaces to the same LNet will fail.
For example:
options lnet networks="tcp0(eth0),tcp0(eth1)"
The above configuration is rejected with an "Invalid argument" error: the kernel module loads, but the attempt to start the network fails. The following transcript shows the behavior when this unsupported configuration is attempted:
[root@rh7z-pe ~]# modprobe -v lnet
insmod /lib/modules/3.10.0-327.13.1.el7_lustre.x86_64/extra/kernel/net/lustre/libcfs.ko
insmod /lib/modules/3.10.0-327.13.1.el7_lustre.x86_64/extra/kernel/net/lustre/lnet.ko networks="tcp0(eth0),tcp0(eth1)"
[root@rh7z-pe ~]# lctl network up
LNET configure error 22: Invalid argument
The kernel ring buffer will have a record of the error reported by the LNet driver, for example:
[root@rh7z-pe ~]# dmesg | tail -1
[ 6620.324053] LNetError: 111-1: Duplicate network specified: tcp
The kernel will also log the error in the syslog:
[root@rh7z-pe ~]# tail -1 /var/log/messages
Feb 21 21:11:28 rh7z-pe kernel: LNetError: 111-1: Duplicate network specified: tcp
Similarly, one cannot specify multiple interfaces within the parentheses associated with an LNet LND. In the following example, only the first interface, eth0, will be used to create an NID for the host; the second parameter, eth1, will be ignored:
# eth0 inet 192.168.207.2/24
# eth1 inet 192.168.207.111/24
# options lnet networks="tcp0(eth0,eth1)"
[root@rh7z-pe ~]# modprobe -v lnet
insmod /lib/modules/3.10.0-327.13.1.el7_lustre.x86_64/extra/kernel/net/lustre/libcfs.ko
insmod /lib/modules/3.10.0-327.13.1.el7_lustre.x86_64/extra/kernel/net/lustre/lnet.ko networks="tcp0(eth0,eth1)"
[root@rh7z-pe ~]# lctl network up
LNET configured
[root@rh7z-pe ~]# lctl list_nids
192.168.207.2@tcp
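Both restrictions described in this section can be checked before a configuration is deployed. The following Python sketch flags a networks string that repeats an LNet network or lists more than one interface for a network. It is a simplified reading of the syntax used in the examples above, not LNet's own parser: the function name check_legacy_networks and the message wording are hypothetical, and any other syntax that the networks option accepts is not covered.

```python
import re

# One entry such as "tcp0(eth0)", "o2ib(ib0)" or a bare "tcp0".
ENTRY = re.compile(r"([a-z][a-z0-9]*?)(\d*)(?:\(([^)]*)\))?")


def check_legacy_networks(spec):
    """Return a list of problems a pre-Multi-Rail LNet would have with spec."""
    problems = []
    seen = set()
    # Split on commas that separate entries, not on commas inside parentheses.
    for entry in re.split(r",(?![^(]*\))", spec):
        match = ENTRY.fullmatch(entry.strip())
        if match is None:
            problems.append(f"cannot parse entry: {entry!r}")
            continue
        net_type, number, ifaces = match.groups()
        # "tcp" and "tcp0" name the same network, so normalise the number.
        net = f"{net_type}{int(number or 0)}"
        if net in seen:
            problems.append(f"duplicate network: {net}")
        seen.add(net)
        names = [n.strip() for n in (ifaces or "").split(",") if n.strip()]
        if len(names) > 1:
            ignored = ", ".join(names[1:])
            problems.append(f"{net}: only {names[0]} is used, ignoring {ignored}")
    return problems


for spec in ("tcp0(eth0),tcp0(eth1)", "tcp0(eth0,eth1)", "o2ib0(ibbond)"):
    print(spec, "->", check_legacy_networks(spec) or "ok")
```

A check of this kind reports the second case, which LNet itself accepts silently while ignoring eth1.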
To take advantage of complex topologies and to aggregate performance across multiple network interfaces, use a version of Lustre that includes the Multi-Rail LNet feature (Lustre 2.10.0 or later).