File:LUG2019-Performance Lustre All Flash OIST-Tanaka.pdf

The Okinawa Institute of Science and Technology (OIST) is an interdisciplinary graduate school offering a 5-year PhD program in Science. Over half of the faculty and students are recruited from outside Japan, and all education and research are conducted entirely in English. The mission of OIST is to promote and sustain the advancement of science and technology in Japan and throughout the world.

At OIST, the Scientific Computing and Data Analysis Section (SCDA) promotes the effective use of High Performance Computing and high-performance centralized data storage at all stages of research. OIST has been running Lustre for almost a decade and uses it as the primary filesystem for the HPC cluster. For most of this time the storage has been traditional RAID-based hard disk drive storage. However, the increasing demands of researchers' workloads led OIST to explore the feasibility of using higher-performance storage.

OIST recently expanded its Lustre storage by adding an All-Flash based Lustre filesystem for AI (artificial intelligence) and machine learning workloads. All-Flash storage systems offer low latency and massive IOPS, and can accelerate small random I/O as well as deliver high I/O bandwidth. This combination of IOPS and I/O bandwidth performance is required for machine learning, in particular to speed up training workloads.

This presentation provides a fundamental performance evaluation of Lustre on an All-Flash storage system, showing the current performance capability of Lustre on low-latency NVMe/SSD devices. The presentation includes experimental test results for Lustre 2.12 in comparison with Lustre 2.10, and also proposes optimizations and tuning of the Lustre filesystem for All-Flash storage systems.

As the use of machine learning has grown across many research disciplines and the relative cost of flash is coming down, it is expected that these findings will be of wide interest within the Lustre community.