File:LUG2019-Long Distance Lustre-Gautam.pdf

From Lustre Wiki

LUG2019-Long_Distance_Lustre-Gautam.pdf (file size: 5.76 MB, MIME type: application/pdf)


   Lustre is widely used in HPC datacenters with InfiniBand, Omni-Path, Aries, and Ethernet fabrics. Lustre networking (LNet) governs how Lustre devices communicate with each other. An LNet router bridges different network fabrics so that clients and servers on different fabrics can communicate. Deploying multiple LNet routers on the route to a file system also adds resiliency.
   We installed a brand-new HPC datacenter about 30 miles (48 km) from an existing datacenter. Both datacenters use Lustre file systems on a flat InfiniBand network. This presentation explains how we connected the two datacenters so that Lustre clients in one datacenter can access Lustre file systems in the other, across town. Because long-distance InfiniBand is expensive and complex, we chose a high-speed Ethernet network for the long-distance link and placed IB-Ethernet LNet routers at both ends to bridge the two fabrics. We show the OS and Lustre tunings that need to be applied on the LNet routers, Lustre clients, and servers to maximize throughput, and present some test results. We also describe the challenges we faced along the way and how we resolved or mitigated them. The system is now in production and is exceeding our expectations.
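One reason tuning matters over a link of this length is the bandwidth-delay product: the amount of data that must be in flight to keep the link busy. The following is a rough sketch of that arithmetic, not a figure from the presentation. It assumes the fiber path is exactly the quoted 48 km, that light travels at roughly 200,000 km/s in fiber, and that the Ethernet link runs at a hypothetical 100 Gb/s; the function names are illustrative only.

```python
# Rough bandwidth-delay product (BDP) estimate for a long-distance link.
# Assumptions (not from the presentation): the fiber path length equals the
# quoted 48 km, light travels at ~200,000 km/s in fiber, and the link speed
# is a hypothetical 100 Gb/s. Switch and router latency is ignored.

FIBER_KM_PER_S = 200_000.0  # approximate speed of light in optical fiber


def round_trip_time_s(distance_km: float) -> float:
    """Propagation-only round-trip time in seconds."""
    return 2.0 * distance_km / FIBER_KM_PER_S


def bdp_bytes(distance_km: float, link_gbps: float) -> float:
    """Bytes that must be in flight to keep the link full."""
    bits_per_s = link_gbps * 1e9
    return bits_per_s * round_trip_time_s(distance_km) / 8.0


if __name__ == "__main__":
    rtt = round_trip_time_s(48.0)
    print(f"RTT ~ {rtt * 1e3:.2f} ms")
    print(f"BDP ~ {bdp_bytes(48.0, 100.0) / 1e6:.1f} MB")
```

Under these assumptions the propagation round trip is about half a millisecond and several megabytes must be outstanding at once, which is why buffer and concurrency settings on routers, clients, and servers are plausible tuning targets. The real path length and link speed may differ.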

File history


Date/Time | Dimensions | User | Comment
current: 15:37, 14 June 2019 | (5.76 MB) | Adilger | Lustre is widely used in HPC datacenters with Infiniband, Omnipath, Aries and Ethernet fabrics. Lustre networking (LNET) plays a big role in how lustre devices communicate with each other. LNET router is a great way to bridge different network fabrics...

The following page uses this file: