LNet Router Config Guide2

From Lustre Wiki
Revision as of 14:01, 12 October 2017 by Sharmaso (talk | contribs)

This document provides procedures to configure and tune an LNet router. It also gives detailed instructions for connecting an InfiniBand network to Intel OPA nodes through an LNet router.

LNet

LNet supports different network types such as Ethernet, InfiniBand, Intel Omni-Path and proprietary network technologies such as Cray’s Gemini. LNet routing carries LNet messages between different LNet networks, and provides an efficient protocol for bridging different types of networks. LNet runs in Linux kernel space and allows full RDMA throughput and zero-copy communication when the underlying network supports them. Lustre can initiate a multi-OST read or write using a single Remote Procedure Call (RPC), which allows the client to access data using RDMA at near peak bandwidth rates.

The Multi-Rail (MR) feature, implemented in Lustre 2.10.x, allows multiple interfaces of the same type on a node to be grouped together under the same LNet network (e.g. tcp0, o2ib0). These interfaces can then be used simultaneously to carry LNet traffic. MR can also use multiple interfaces configured on different networks. For example, OPA and MLX interfaces can each be grouped under their respective LNet network and then used together by MR to carry LNet traffic simultaneously.


LNet Configuration Example

An LNet router is a specialized Lustre client on which no Lustre file system is mounted and only LNet is running. A single LNet router can serve several file systems.



In the example topology (Figure 1):

  • Servers are on LAN1, a Mellanox based InfiniBand network – 10.10.0.0/24
  • Clients are on LAN2, an Intel OPA network – 10.20.0.0/24
  • Routers are on LAN1 at 10.10.0.20 and 10.10.0.21, and on LAN2 at 10.20.0.29 and 10.20.0.30

The network configuration on the nodes can be done either by adding module parameters to /etc/modprobe.d/lustre.conf or dynamically with the lnetctl command-line utility. The current configuration can also be exported to a YAML file, and later restored at any time by importing that file.


Network Configuration by adding module parameters in lustre.conf

Servers:
options lnet networks="o2ib1(ib0)" routes="o2ib2 10.10.0.20@o2ib1"
 
Routers:
options lnet networks="o2ib1(ib0),o2ib2(ib1)" forwarding="enabled"
 
Clients:
options lnet networks="o2ib2(ib0)" routes="o2ib1 10.20.0.29@o2ib2" 

NOTE: LNet must be restarted to apply the new configuration. To do this, unconfigure the LNet network and configure it again. Make sure that the Lustre network and the Lustre file system are stopped before unloading the modules.

# To unload and load the LNet module
modprobe -r lnet
modprobe lnet

# To unconfigure and reconfigure LNet
lnetctl lnet unconfigure
lnetctl lnet configure
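
The precaution in the note above can be sketched as a small guard that refuses to restart LNet while a Lustre file system is still mounted. This is only a sketch: the function names lustre_is_mounted and restart_lnet are hypothetical, and the check assumes that mounted Lustre targets and clients appear with file system type "lustre" in /proc/mounts.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given mounts table lists
# at least one file system of type "lustre". Defaults to /proc/mounts.
lustre_is_mounted() {
    mounts_file="${1:-/proc/mounts}"
    # Field 3 of each mounts entry is the file system type.
    awk '$3 == "lustre" { found = 1 } END { exit !found }' "$mounts_file"
}

# Hypothetical wrapper around the unconfigure/configure pair shown above.
# It is defined here but not called; invoke it as root when needed.
restart_lnet() {
    if lustre_is_mounted; then
        echo "Lustre is still mounted; stop it before restarting LNet" >&2
        return 1
    fi
    lnetctl lnet unconfigure && lnetctl lnet configure
}
```

Calling restart_lnet on a node with a mounted file system prints the warning and returns a non-zero status without touching LNet.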


Dynamic Network Configuration using lnetctl command

Servers:
lnetctl net add --net o2ib1 --if ib0
lnetctl route add --net o2ib2 --gateway 10.10.0.20@o2ib1
lnetctl peer add --nid 10.10.0.20@o2ib1
 
Routers:
lnetctl net add --net o2ib1 --if ib0
lnetctl net add --net o2ib2 --if ib1
lnetctl peer add --nid 10.10.0.1@o2ib1
lnetctl peer add --nid 10.20.0.1@o2ib2
lnetctl set routing 1
   
Clients:
lnetctl net add --net o2ib2 --if ib0
lnetctl route add --net o2ib1 --gateway 10.20.0.29@o2ib2
lnetctl peer add --nid 10.20.0.29@o2ib2 


Importing/Exporting configuration using a YAML file format

# To export the current configuration to a YAML file (either form)
lnetctl export FILE.yaml
lnetctl export > FILE.yaml

# To import the configuration from a YAML file (either form)
lnetctl import FILE.yaml
lnetctl import < FILE.yaml

There is a default lnet.conf file installed at /etc/lnet.conf which has an example configuration in YAML format. Another example of a configuration in a YAML file is:

net:
    - net type: o2ib1
      local NI(s):
        - nid: 10.10.0.1@o2ib1
          status: up
          interfaces:
              0: ib0
          tunables:
              peer_timeout: 180
              peer_credits: 8
              peer_buffer_credits: 0
              credits: 256
          lnd tunables:
              peercredits_hiw: 64
              map_on_demand: 32
              concurrent_sends: 256
              fmr_pool_size: 2048
              fmr_flush_trigger: 512
              fmr_cache: 1
          tcp bonding: 0
          dev cpt: -1
          CPT: "[0]"
route:
    - net: o2ib2
      gateway: 10.10.0.20@o2ib1
      hop: 1
      priority: 0
      state: up
peer:
    - primary nid: 10.10.0.20@o2ib1
      Multi-Rail: False
      peer ni:
        - nid: 10.10.0.20@o2ib1
          state: up
          max_ni_tx_credits: 8
          available_tx_credits: 8
          min_tx_credits: 7
          tx_q_num_of_buf: 0
          available_rtr_credits: 8
          min_rtr_credits: 8
          refcount: 4
global:
    numa_range: 0
    max_intf: 200
    discovery: 1 

LNet provides a mechanism to monitor each route entry. It pings the gateway identified in each route entry at a regular, configurable interval (live_router_check_interval) to ensure that it is alive. If sending over a specific route fails, or if the router pinger determines that the gateway is down, the route is marked as down and is not used. The gateway is then pinged at a separate configurable interval (dead_router_check_interval) to detect when it becomes alive again.
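
To see which gateways are currently marked down, the route section of the exported YAML can be filtered. The following is a minimal sketch, assuming the YAML layout shown in the example above (a top-level route: key whose entries carry gateway: and state: lines). The function name down_gateways is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read LNet YAML on stdin and print the gateway NID
# of every route entry whose state is not "up".
down_gateways() {
    awk '
        # A line starting in column one with "key:" opens a new section.
        /^[a-z]+:/ { in_route = ($1 == "route:") }
        # Remember the gateway of the route entry being read.
        in_route && $1 == "gateway:" { gw = $2 }
        # Report the remembered gateway when its state is anything but up.
        in_route && $1 == "state:" && $2 != "up" { print gw }
    '
}
```

A typical use would be `lnetctl export | down_gateways`, which prints nothing when every route is up.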


Multi-Rail LNet Configuration Example

If the routers are MR-enabled, they can be added to the clients and the servers as peers with multiple interfaces. The MR algorithm then ensures that both interfaces of a router are used when sending traffic to it. However, the failure of a single interface still causes the entire router to go down. With the network topology example in Figure 1 above, LNet MR can be configured as follows:

Servers:
lnetctl net add --net o2ib1 --if ib0,ib1
lnetctl route add --net o2ib2 --gateway 10.10.0.20@o2ib1
lnetctl peer add --nid 10.10.0.20@o2ib1,10.10.0.21@o2ib1
 
Routers:
lnetctl net add --net o2ib1 --if ib0,ib1
lnetctl net add --net o2ib2 --if ib2,ib3
lnetctl peer add --nid 10.10.0.1@o2ib1,10.10.0.2@o2ib1
lnetctl peer add --nid 10.20.0.1@o2ib2,10.20.0.2@o2ib2
lnetctl set routing 1
   
Clients:
lnetctl net add --net o2ib2 --if ib0,ib1
lnetctl route add --net o2ib1 --gateway 10.20.0.29@o2ib2
lnetctl peer add --nid 10.20.0.29@o2ib2,10.20.0.30@o2ib2 
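
After adding the peers, it can be useful to confirm that a router is actually treated as a Multi-Rail peer. The sketch below assumes that the peer YAML has the layout shown in the earlier export example (a "primary nid:" line followed by a "Multi-Rail:" line) and that an MR peer is reported as "Multi-Rail: True". The function name peer_is_multirail is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: read LNet peer YAML on stdin and succeed (exit 0)
# only if the peer whose primary NID is $1 is reported as Multi-Rail.
peer_is_multirail() {
    awk -v nid="$1" '
        # "- primary nid: <NID>" starts a new peer entry.
        $1 == "-" && $2 == "primary" && $3 == "nid:" { current = $4 }
        # The Multi-Rail flag belongs to the most recent peer entry.
        $1 == "Multi-Rail:" && current == nid && $2 == "True" { found = 1 }
        END { exit !found }
    '
}
```

On a server in the example above, `lnetctl peer show | peer_is_multirail 10.10.0.20@o2ib1` would be expected to succeed once the router has been added with both of its NIDs.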


Fine-Grained Routing

The routes parameter identifies the LNet routers in a Lustre configuration and tells a node which route to use when forwarding traffic. It takes a semicolon-separated list of router definitions.

routes=dest_lnet [hop] [priority] router_NID@src_lnet; \
       dest_lnet [hop] [priority] router_NID@src_lnet

An alternative syntax consists of a colon-separated list of router definitions:

routes=dest_lnet: [hop] [priority] router_NID@src_lnet \
                  [hop] [priority] router_NID@src_lnet
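
As an illustration of the semicolon-separated form, the sketch below joins individual router definitions into a single routes value and prints the resulting module option line for a server. It assumes that 10.10.0.20 and 10.10.0.21 are two separate, non-Multi-Rail routers and leaves out the optional hop and priority fields. The function name join_routes is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: join router definitions ("dest_lnet router_NID@src_lnet")
# into one semicolon-separated routes value.
join_routes() {
    out=""
    for entry in "$@"; do
        if [ -z "$out" ]; then
            out="$entry"
        else
            out="$out; $entry"
        fi
    done
    printf '%s\n' "$out"
}

# Server side of the example network: reach o2ib2 through either router.
routes=$(join_routes "o2ib2 10.10.0.20@o2ib1" "o2ib2 10.10.0.21@o2ib1")
printf 'options lnet networks="o2ib1(ib0)" routes="%s"\n' "$routes"
# prints: options lnet networks="o2ib1(ib0)" routes="o2ib2 10.10.0.20@o2ib1; o2ib2 10.10.0.21@o2ib1"
```

The printed line is meant to be reviewed and then placed in /etc/modprobe.d/lustre.conf by hand.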