LNet Configuration Edge Case Behaviors and Side-Effects
If no explicit configuration is supplied to LNet, either through a modprobe options file or a YAML description for lnetctl, it will attempt to create a valid TCP/IP (socklnd) NID for tcp0 using the first network interface that is detected by the operating system (e.g. eth0) when the module is loaded and the LNet service started.
The order of interface detection is entirely at the discretion of the operating system, so there is no guarantee that the ordering of interfaces will be preserved between reboots or after a new hardware device is inserted. It also means that the default behavior for a host will differ depending on its hardware configuration. Most operating systems do try to ensure that a device, after it is detected, maintains the same device name (eth1, etc.) between reboots. Nevertheless, it is strongly recommended that all configuration be stated explicitly: defining the configuration also defines the expected behavior of the system, making it easier to audit.
The ip2nets option for the LNet kernel module is a list of network definition and IP-match pairs. These pairs are processed in sequence. If a pair's pattern matches a local IP address, that network definition is used for the node, and further pairs for the same network are ignored. Multiple networks can be matched. For example:
ip2nets="tcp(eth2) 134.32.1.[4-10/2]; tcp(eth1) *.*.*.*"
This set of rules is used to create network tcp0 (the 0 is implied, because the LNet network number is omitted). If a local IP address matches 134.32.1.[4-10/2], meaning it is one of 134.32.1.4, 134.32.1.6, 134.32.1.8 or 134.32.1.10, then tcp0 is created using interface eth2. Otherwise the second pair is used, and because "*.*.*.*" matches every address, it always creates tcp0 using interface eth1.
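The range syntax in the first pattern can be checked with a short Python sketch. This is only an illustration of how a field such as [4-10/2] expands (start 4, end 10, step 2); it is not LNet's parser, and the helper names octet_matches and ip_matches are hypothetical. It handles only wildcards, literal numbers, and single bracketed ranges of the form [lo-hi] or [lo-hi/step].

```python
import re


def octet_matches(pattern: str, value: int) -> bool:
    """Return True if one address field matches one pattern field."""
    if pattern == "*":
        return True
    if pattern.startswith("[") and pattern.endswith("]"):
        # Bracketed field: "n", "lo-hi" or "lo-hi/step" (assumed subset).
        m = re.fullmatch(r"(\d+)(?:-(\d+)(?:/(\d+))?)?", pattern[1:-1])
        if m is None:
            raise ValueError(f"unsupported range: {pattern!r}")
        lo = int(m.group(1))
        hi = int(m.group(2)) if m.group(2) else lo
        step = int(m.group(3)) if m.group(3) else 1
        return value in range(lo, hi + 1, step)
    return int(pattern) == value


def ip_matches(pattern: str, address: str) -> bool:
    """Compare a dotted-quad address with a dotted-quad pattern, field by field."""
    fields = pattern.split(".")
    octets = address.split(".")
    if len(fields) != 4 or len(octets) != 4:
        raise ValueError("expected a dotted-quad pattern and address")
    return all(octet_matches(f, int(o)) for f, o in zip(fields, octets))


# Which last octets in 1..12 does the first pattern accept?
matched = [n for n in range(1, 13)
           if ip_matches("134.32.1.[4-10/2]", f"134.32.1.{n}")]
print(matched)  # [4, 6, 8, 10]
```

Any address that is not in this list falls through to the second pair, whose "*.*.*.*" pattern accepts everything.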
ip2nets will use the IP address definition to match the host, not the interface. The ip2nets definition will not verify or otherwise qualify that the IP address matched is associated with the physical network interface in the specification. This means that a pattern can match the IP address of an interface that will not actually be used for LNet communications. From the above example, if a host has an interface eth3 with IP address 220.127.116.11, then that would be considered a match good enough to trigger the creation of the NID on the interface named in the matching rule.
Also, if the device is not specified in an ip2nets definition, LNet will pick up the first available device rather than the device that matches the IP address pattern. For example, if the IP address pattern matches the IP address on eth1, but no device is mentioned in the ip2nets definition, then eth0 will get an LNet configuration. As an illustration, consider a host with a 10.70/16 IP address on eth0, and a 192.168/24 address on eth1. The following ip2nets definition will create a NID on eth0, even though the pattern matches the address on eth1:
options lnet ip2nets="tcp0 192.168.*.*"
If, instead, the device is included in the specification, then the configuration will be applied to eth1:
options lnet ip2nets="tcp0(eth1) 192.168.*.*"
The definition without a device is interpreted as follows: configure the first socklnd NID that is found on any host that has an IP address matching 192.168.*.*. In this respect, it is consistent with the behavior of the much simpler networks syntax in the following example:
options lnet networks="tcp0"
This example creates a NID on the first network device detected by the operating system, because no device was specified. As with the ip2nets parameter, when no specific network interface is defined, LNet configures the first interface detected by the host operating system.
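The device-selection behavior described above can be modeled in a few lines of Python. This is a toy model of the documented behavior, not LNet code: the function select_device and the concrete host addresses are hypothetical, chosen only to fit the 10.70 and 192.168 networks in the example. The point it illustrates is that the pattern decides whether a rule applies to the host, while the device comes either from the rule or from detection order.

```python
from typing import List, Optional, Tuple

# (device name, IPv4 address), listed in the order the OS detected them.
Interface = Tuple[str, str]


def ip_matches(pattern: str, address: str) -> bool:
    """Field-by-field match; each pattern field is '*' or a literal number."""
    p, a = pattern.split("."), address.split(".")
    return len(p) == len(a) == 4 and all(x in ("*", y) for x, y in zip(p, a))


def select_device(interfaces: List[Interface], pattern: str,
                  device: Optional[str] = None) -> Optional[str]:
    """Return the device that would receive the NID, or None if no match."""
    # The pattern only decides whether the rule applies to this host.
    if not any(ip_matches(pattern, ip) for _, ip in interfaces):
        return None
    # An explicitly named device is used as given.
    if device is not None:
        return device
    # Otherwise the first detected interface is used, whatever its address.
    return interfaces[0][0]


# Hypothetical host: addresses are placeholders within the example networks.
host = [("eth0", "10.70.0.5"), ("eth1", "192.168.1.5")]

print(select_device(host, "192.168.*.*"))          # eth0
print(select_device(host, "192.168.*.*", "eth1"))  # eth1
```

The first call corresponds to ip2nets="tcp0 192.168.*.*" and the second to ip2nets="tcp0(eth1) 192.168.*.*".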
If an interface is explicitly specified as well as a pattern, the interface matched using the IP pattern will be compared against the explicitly defined interface. For example, if the ip2nets definition is "tcp(eth0) 192.168.*.3" and there exists in the system a device eth0 with IP address 18.104.22.168 and a device eth1 with IP address 192.168.3.3, then configuration will fail, because the pattern contradicts the interface specified. A clear warning will be displayed if inconsistent configuration is encountered.
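A configuration can be checked for this kind of contradiction before it is deployed. The sketch below is one possible pre-flight check written in Python; it models the comparison described above and is not how LNet implements it. The function check_rule is hypothetical, and the eth0 address used here is a made-up placeholder that simply does not match the pattern.

```python
from typing import Dict, Optional


def ip_matches(pattern: str, address: str) -> bool:
    """Field-by-field match; each pattern field is '*' or a literal number."""
    p, a = pattern.split("."), address.split(".")
    return len(p) == len(a) == 4 and all(x in ("*", y) for x, y in zip(p, a))


def check_rule(interfaces: Dict[str, str], device: str,
               pattern: str) -> Optional[str]:
    """Return the device if the rule is consistent, None if it does not apply.

    Raises ValueError when the pattern matches some interface on the host
    but not the interface named in the rule.
    """
    matched = [name for name, ip in interfaces.items()
               if ip_matches(pattern, ip)]
    if not matched:
        return None  # the rule does not apply to this host
    if device not in matched:
        raise ValueError(
            f"pattern {pattern} matches {', '.join(matched)}, not {device}")
    return device


# Hypothetical host: the eth0 address is a placeholder outside the pattern.
host = {"eth0": "10.0.0.3", "eth1": "192.168.3.3"}

try:
    check_rule(host, "eth0", "192.168.*.3")
    result = "accepted"
except ValueError as err:
    result = f"rejected: {err}"
print(result)  # rejected: pattern 192.168.*.3 matches eth1, not eth0
```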
If the LNet number for a NID is 0 (zero), for example, o2ib0, the number will sometimes be omitted from command output, and can usually be omitted from configuration files as well. Omitting it is not recommended, however: for clarity, supply as much information as is reasonable when creating configuration information.
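Scripts that compare network names taken from command output and from configuration files may need to treat "o2ib" and "o2ib0" as the same network. A minimal Python sketch of such a normalization follows; the function normalize_net is hypothetical, and it assumes a network name is a lowercase type ending in a letter, optionally followed by a decimal network number.

```python
import re


def normalize_net(name: str) -> str:
    """Return the network name with an explicit number, e.g. 'tcp' -> 'tcp0'."""
    # Group 1: network type ending in a letter (e.g. 'tcp', 'o2ib').
    # Group 2: optional trailing network number.
    m = re.fullmatch(r"([a-z0-9]*[a-z])(\d*)", name)
    if m is None:
        raise ValueError(f"unrecognized network name: {name!r}")
    net_type, number = m.groups()
    return f"{net_type}{number or '0'}"


for raw in ("tcp", "tcp0", "o2ib", "o2ib1"):
    print(raw, "->", normalize_net(raw))
```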