File:LUG2019-Layering ZFS Pools on Lustre-Mohr.pdf

For most HPC systems, Lustre is a good solution for providing high-bandwidth I/O to shared storage that many clients can access simultaneously for parallel computations. Lustre performs best for large sequential read/write operations, but performance can degrade for workloads that generate many small I/O requests or random file accesses. For this reason, many sites deploy additional storage resources (such as NFS) for user home directories, where tasks like code compilation or interactive file editing may perform better. However, deploying these secondary storage resources places an extra burden on system administrators and fails to leverage the advantages of an existing Lustre investment (such as increased storage capacity).

In this presentation, we share our experiences with layering a ZFS file system on top of a Lustre file system. We outline potential use cases and discuss benefits from a system administration standpoint, such as:

– Conserving Lustre inodes by using ZFS to consolidate large numbers of small files into a single Lustre file
– Using ZFS snapshots to back up home directories
– Using ZFS quotas in conjunction with Lustre quotas to provide more control of storage usage
– Using ZFS encryption for protected project spaces
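
As a rough illustration of the use cases above, the following sketch builds the commands one might run to place a ZFS pool in a single file on Lustre and then apply a quota and a snapshot. It is not the configuration used in the presentation, which the abstract does not describe. The backing-file path, pool name, dataset name, sizes, and the helper function itself are all hypothetical. The script only prints the commands; running them requires root and an installed ZFS.

```python
import shlex


def zfs_on_lustre_commands(backing_file, size, pool, dataset, quota, snapshot):
    """Return argument lists for a file-backed ZFS pool (illustrative only).

    All arguments are hypothetical placeholders chosen by the caller.
    """
    return [
        # Create one sparse file on Lustre; it consumes a single Lustre inode.
        ["truncate", "-s", size, backing_file],
        # Build a pool on that file (file vdevs need an absolute path).
        ["zpool", "create", pool, backing_file],
        # Create a per-user dataset with a ZFS quota inside the pool.
        ["zfs", "create", "-o", f"quota={quota}", f"{pool}/{dataset}"],
        # Take a snapshot that could serve as a backup point.
        ["zfs", "snapshot", f"{pool}/{dataset}@{snapshot}"],
    ]


if __name__ == "__main__":
    # Hypothetical values; adjust to the local Lustre mount and naming scheme.
    commands = zfs_on_lustre_commands(
        backing_file="/lustre/scratch/home_pool.img",
        size="100G",
        pool="homepool",
        dataset="alice",
        quota="10G",
        snapshot="nightly",
    )
    for cmd in commands:
        print(shlex.join(cmd))
```

Because every file created under the dataset lives inside the one backing file, Lustre sees a single large object regardless of how many small files ZFS stores in it.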

We investigate the performance of ZFS-on-Lustre and present the results of several benchmark tests. Based on these benchmark results, we discuss the possibility of using ZFS to speed up code compilation and look at ZFS’ ability to shape I/O traffic to the backend Lustre file system. We also look at using NFS to export a ZFS-on-Lustre configuration and benchmark performance from an NFS client system.