Lustre Networking (LNET) Overview

From Lustre Wiki
Revision as of 01:00, 21 August 2021 by Adilger (talk | contribs) (clarify usage of TCP port 988 is only for socklnd)

Lustre’s network communication protocol, LNet, was originally derived from Portals, a project developed at Sandia National Laboratories. LNet is designed to be lightweight and efficient: it supports message passing for RPC request processing and RDMA for bulk data movement. All metadata and file data I/O that crosses the network is managed through the LNet protocol and API. LNet can operate over different network fabrics, including Ethernet, InfiniBand and Intel® Omni-Path Architecture (OPA). In common with the other major components of Lustre, LNet is implemented as a Linux kernel module.

All nodes in a Lustre file system, including servers and clients, must have a valid LNet configuration and be connected either directly on a common network fabric, or via a router between networks.

To support the various types of networks, LNet has a low-level device layer called a Lustre Network Driver (LND), implemented as a pluggable driver module. The LND provides an interface abstraction between the upper-level LNet protocol and the kernel device driver for the network interface. Each low-level network protocol requires a separate LND, and multiple LNDs can be active on a host simultaneously if that host requires access to more than one type of network.

The most commonly used LNDs are the ksocklnd.ko module for TCP/IP networks, and the ko2iblnd.ko module for RDMA networks that make use of the OpenFabrics Enterprise Distribution (OFED) network driver. The ksocklnd.ko module is usually abbreviated to socklnd. The ko2iblnd.ko module (usually referred to as the o2ib LND or just o2ib) supports fabrics running InfiniBand, Omni-Path, and RDMA over Converged Ethernet (RoCE). The “k” prefix in the LND names emphasizes that these are kernel modules, which is true for the majority of the Lustre software stack. The prefix is often omitted from documentation to improve readability.
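The naming conventions described above can be summarised in a few lines of Python. This is only an illustrative sketch: the table contains just the two LNDs named in the text, and the names `LND_KERNEL_MODULES` and `documentation_name` are hypothetical helpers, not part of any Lustre tool.

```python
# Network type (as it appears in a NID) -> kernel module file.
# Only the two LNDs named in the text are listed; the table is illustrative.
LND_KERNEL_MODULES = {
    "tcp": "ksocklnd.ko",    # socket LND, for TCP/IP networks
    "o2ib": "ko2iblnd.ko",   # o2ib LND, for InfiniBand, Omni-Path and RoCE
}


def documentation_name(module_file: str) -> str:
    """Return the module name as it is often written in documentation:
    without the ".ko" suffix and without the leading "k" prefix."""
    name = module_file
    if name.endswith(".ko"):
        name = name[: -len(".ko")]
    if name.startswith("k"):
        name = name[1:]
    return name


if __name__ == "__main__":
    for net_type, module in LND_KERNEL_MODULES.items():
        print(f"{net_type}: {module} (often written as {documentation_name(module)})")
```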

LNet also supports the ability to route Lustre communications between different networks. Dedicated computers called LNet routers can direct traffic between multiple LNets.

Network interfaces on computer systems running LNet are addressed with a node identifier combined with a protocol identifier and a network number for that protocol. The format is:

<address>@<LND protocol><lnd#>

For the tcp and o2ib LNDs, the address is the IPv4 address of a network device on the host. Other LNDs may use other address formats, such as a simple integer representing the node number. The complete string is called an LNet Network Identifier (NID), and it uniquely identifies an interface for a host on an LNet communications fabric.
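As a rough illustration of the NID format, the following Python sketch splits a NID string into its address, LND protocol, and LND instance number. The names `Nid` and `parse_nid` are hypothetical, and the addresses in the usage example are made up for illustration. The pattern assumes that LND protocol names are lowercase and end in a letter (as tcp and o2ib do); it does not validate the address itself.

```python
import re
from typing import NamedTuple


class Nid(NamedTuple):
    """The three parts of <address>@<LND protocol><lnd#>."""
    address: str   # an IPv4 address for tcp and o2ib; other LNDs may differ
    lnd: str       # LND protocol name, e.g. "tcp" or "o2ib"
    number: int    # LND instance number (non-negative integer)

    @property
    def network(self) -> str:
        """The LNet network name, e.g. "tcp0"."""
        return f"{self.lnd}{self.number}"


# Assumption: the protocol name ends in a letter, so the trailing digits
# are the instance number ("o2ib0" -> "o2ib" and 0).
_NID_RE = re.compile(
    r"(?P<address>[^@\s]+)@(?P<lnd>[a-z0-9]*[a-z])(?P<number>\d+)"
)


def parse_nid(text: str) -> Nid:
    """Parse a NID string, raising ValueError if it does not match the format."""
    match = _NID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a NID of the form <address>@<LND protocol><lnd#>: {text!r}")
    return Nid(match["address"], match["lnd"], int(match["number"]))


if __name__ == "__main__":
    nid = parse_nid("192.0.2.10@o2ib0")   # illustrative address only
    print(nid.address, nid.lnd, nid.number, nid.network)
```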

The following example is a NID for an Ethernet interface:

<IPv4 address>@tcp0

This is a unique ID for a host with an Ethernet NIC on LNet tcp0 using the socket LND (as indicated by the tcp network type). A second configuration could also be added for a different interface:

<IPv4 address of the second interface>@<LND protocol><lnd#>

The number appended to the LND protocol type must be a non-negative integer and must be the same for all Lustre hosts (client and server) that participate on the same network. For example, the following two NIDs are not on the same LNet:

<IPv4 address>@tcp0
<IPv4 address>@tcp1

Even though each NID has the same IPv4 address, they belong to two different LNets, because the LND instance numbers are different (tcp0 and tcp1, respectively).
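The rule can be expressed as a small check: two NIDs are on the same LNet only if the part after the `@` (protocol plus instance number) is identical. The sketch below is a minimal illustration; `lnet_of` and `same_lnet` are hypothetical helper names, the addresses are made up, and the comparison is a plain string match with no further normalisation.

```python
def lnet_of(nid: str) -> str:
    """Return the network part of a NID, i.e. everything after the last '@'."""
    address, separator, network = nid.rpartition("@")
    if not separator or not address or not network:
        raise ValueError(f"not a NID of the form <address>@<network>: {nid!r}")
    return network


def same_lnet(nid_a: str, nid_b: str) -> bool:
    """True if both NIDs name the same LNet (same protocol and instance number)."""
    return lnet_of(nid_a) == lnet_of(nid_b)


if __name__ == "__main__":
    # Same address, different LND instance numbers: different LNets.
    print(same_lnet("192.0.2.10@tcp0", "192.0.2.10@tcp1"))
    # Different addresses, same network: same LNet.
    print(same_lnet("192.0.2.10@tcp0", "192.0.2.11@tcp0"))
```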

RDMA networks have a similar NID format. For example:

<IPv4 address>@o2ib<lnd#>

Note: Although the o2ib LND uses OFED Verbs for communication, the IP address of the IB interface is used to identify that interface for the initial connection with a peer. After the initial connection, the o2ib LND uses only RDMA for all further communication.

Separately, socklnd uses TCP port 988 by default to create connections, so this port must not be blocked by any firewalls.
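One quick way to check for a firewall problem on a socklnd network is to attempt a plain TCP connection to port 988 on a peer. The Python sketch below does this with the standard library only; the function name `tcp_port_reachable` and the address in the usage example are hypothetical. A successful connection suggests only that the port is not filtered along the path. It does not show that LNet is configured correctly on either host.

```python
import socket

# Default TCP port used by socklnd to create connections (from the text above).
SOCKLND_DEFAULT_PORT = 988


def tcp_port_reachable(host: str, port: int = SOCKLND_DEFAULT_PORT,
                       timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Any socket-level failure (connection refused, timeout, unreachable
    host, name resolution error) is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    peer = "192.0.2.10"   # illustrative address; replace with a real peer
    state = "reachable" if tcp_port_reachable(peer) else "not reachable"
    print(f"TCP port {SOCKLND_DEFAULT_PORT} on {peer} is {state}")
```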