Lustre Networking (LNET) Overview

From Lustre Wiki

Lustre’s network communication protocol, LNet, was originally derived from Portals, a project developed at Sandia National Laboratories. LNet is designed to be lightweight and efficient: it supports message passing for RPC request processing and RDMA for bulk data movement. All metadata and file data I/O that crosses the network is managed through the LNet protocol and API. LNet is also versatile and can operate over different network fabrics, including Ethernet, InfiniBand, and Intel® Omni-Path Architecture (OPA). In common with the other major components of Lustre, LNet is implemented as a Linux kernel module.

All participants in a Lustre file system, including servers and clients, must have a valid LNet configuration and be connected either directly on a common network fabric, or via a router between networks.

To support the various types of networks, LNet has a low-level device layer called a Lustre Network Driver (LND), implemented as a pluggable driver module. The LND provides an interface abstraction between the upper-level LNet protocol and the kernel device driver for the network interface. Each low-level network protocol requires a separate LND, and multiple LNDs can be active on a host simultaneously if the host requires access to more than one type of network. The most commonly used LNDs are the ksocklnd.ko module for TCP/IP networks and the ko2iblnd.ko module for RDMA networks that use the OpenFabrics Enterprise Distribution (OFED) network driver. The ksocklnd.ko module is usually abbreviated to socklnd or referred to as the socket LND. The ko2iblnd.ko module (usually referred to as the o2ib LND, or just o2ib) supports fabrics running InfiniBand, Omni-Path, and RDMA over Converged Ethernet (RoCE). The “k” prefix in the LND names emphasizes that these are kernel modules, as is the majority of the Lustre software stack. The prefix is often omitted from documentation to improve readability.

LNet also supports the ability to route Lustre communications between different networks. Dedicated computers called LNet routers can direct traffic between multiple LNets.

Each network interface on a host running LNet is addressed by a node identifier (the IPv4 address of a network device on the host) combined with a protocol identifier and a network number for that protocol. The format is:

<IPv4 address>@<LND protocol><lnd#>

The complete string is called an LNet Network Identifier (NID) and it uniquely defines an interface for a host on an LNet communications fabric.
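
As an illustration of the NID format, the following Python sketch splits a NID string into its three parts. It is not part of Lustre: the names Nid and parse_nid are hypothetical, the addresses come from the 192.0.2.0/24 documentation range, and the parser is deliberately strict about the format as written above (it requires an explicit network number).

```python
import ipaddress
import re
from typing import NamedTuple


class Nid(NamedTuple):
    """Hypothetical container for the parts of an LNet NID."""
    address: ipaddress.IPv4Address  # node identifier
    lnd: str                        # protocol identifier, e.g. "tcp" or "o2ib"
    net_num: int                    # network number for that protocol


# <IPv4 address>@<LND protocol><lnd#>
# The protocol name must end in a letter so that the trailing digits of
# "o2ib0" are read as the network number rather than as part of the name.
_NID_RE = re.compile(r"(?P<addr>[^@]+)@(?P<lnd>[a-z0-9]*[a-z])(?P<num>\d+)")


def parse_nid(nid: str) -> Nid:
    """Split a NID string into address, protocol, and network number."""
    match = _NID_RE.fullmatch(nid)
    if match is None:
        raise ValueError(f"not in <IPv4 address>@<LND protocol><lnd#> form: {nid!r}")
    # IPv4Address raises a ValueError subclass if the address is malformed.
    address = ipaddress.IPv4Address(match.group("addr"))
    return Nid(address, match.group("lnd"), int(match.group("num")))


if __name__ == "__main__":
    print(parse_nid("192.0.2.10@tcp0"))
    print(parse_nid("192.0.2.20@o2ib1"))
```

Keeping the protocol name and the network number as separate fields makes it straightforward to check later whether two NIDs belong to the same LNet.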

The following example is a NID for an Ethernet interface:

This is a unique ID for a host with an Ethernet NIC on LNet tcp0 using the socket LND (as indicated by the tcp network type). A second configuration could also be added for a different interface:

The number appended to the LND protocol type must be a non-negative integer and must be the same for all Lustre hosts (clients and servers) that participate in the same network. For example, the following two NIDs are not on the same LNet:

Even though each NID has the same IPv4 address, they belong to two different LNets, because the LND instance numbers are different (tcp0 and tcp1, respectively).
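
The rule can be sketched in a few lines of Python: two NIDs are on the same LNet only when everything after the "@" matches, regardless of the address. The function names and the addresses (from the 192.0.2.0/24 documentation range) are hypothetical and used only for illustration.

```python
def lnet_name(nid: str) -> str:
    """Return the LNet part of a NID, e.g. 'tcp0' from '192.0.2.10@tcp0'."""
    address, sep, network = nid.partition("@")
    if not sep or not address or not network:
        raise ValueError(f"not a NID: {nid!r}")
    return network


def same_lnet(nid_a: str, nid_b: str) -> bool:
    """True when both NIDs name the same LND protocol and network number."""
    return lnet_name(nid_a) == lnet_name(nid_b)


if __name__ == "__main__":
    # Same address, different network numbers: two different LNets.
    print(same_lnet("192.0.2.10@tcp0", "192.0.2.10@tcp1"))
    # Different addresses, same network: both hosts are on LNet tcp0.
    print(same_lnet("192.0.2.10@tcp0", "192.0.2.11@tcp0"))
```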

RDMA networks have a similar NID format. For example:

Note: Although the o2ib LND uses RDMA for communications, it uses TCP to establish the initial connection with a peer, by way of the fabric’s IP upper-level protocol. After the initial connection, the o2ib LND uses RDMA for all further communications. By default, LNet uses TCP port 988 to create connections, and this port must not be blocked by any firewall. The NID of an o2ib device also uses an IPv4 address to identify the interface, just as for the socket LND.
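
One way to check that a firewall is not blocking the port is to attempt a plain TCP connection to it. The Python sketch below does this with the standard socket module; the function name tcp_port_reachable and the example address are hypothetical. A successful connection shows only that the port is reachable at the TCP level, not that LNet on the peer is configured correctly.

```python
import socket

LNET_DEFAULT_PORT = 988  # default LNet connection port, as stated above


def tcp_port_reachable(host: str, port: int = LNET_DEFAULT_PORT,
                       timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; substitute a real peer.
    print(tcp_port_reachable("192.0.2.10"))
```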