File:LUG2019-HSM Data Movement Tiering Layouts-Evans.pdf

From Lustre Wiki

LUG2019-HSM_Data_Movement_Tiering_Layouts-Evans.pdf (file size: 6.55 MB, MIME type: application/pdf)

External HSM coordinators enable the flexible scheduling of tunable parameters, the addition of prioritized queues for different jobs, and options for improved scalability and data storage routines. In this presentation, I’ll discuss the current state of the HSM coordinator within Lustre, and a proposal to migrate it to userspace, enabling a large chunk of Lustre code to be bypassed (including lfs hsm commands and the registration of copytools). Next, I’ll talk about adding mirror sync and migration to the HSM infrastructure. This is relatively straightforward work, but unlike current implementations of migrate, it allows multiple clients to work on one file at the same time, resulting in improved performance. My next topic is a proposal to improve the staging of data in and out by changing the Lustre striping system from a RAID-like implementation to one that handles tiering through mirroring and releasing stripes. A stripe on an SSD would be dormant and empty, but its layout information would still exist. During stage-in, the stripe is activated, a mirror sync is called, and the stripe fills up. At the end of a job, the reverse happens. Finally, I’d like to recommend several coding and quality-of-life improvements in the HSM subsystem, including the removal of data structures and enumerated types that are virtually identical.
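The stage-in and stage-out cycle described in the abstract can be pictured with a small toy model. This is only an illustrative sketch, not Lustre code: the names `TieredFile`, `StripeState`, `stage_in`, `write`, and `stage_out` are hypothetical, the "mirror sync" is reduced to a byte copy, and the idea that job I/O lands on the SSD stripe while it is active is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class StripeState(Enum):
    DORMANT = "dormant"  # layout information exists, but the stripe holds no data
    ACTIVE = "active"    # stripe has been activated and filled by a sync


@dataclass
class TieredFile:
    """Toy model (hypothetical): one capacity-tier copy plus one SSD stripe."""
    capacity_data: bytes
    ssd_state: StripeState = StripeState.DORMANT
    ssd_data: bytes = b""

    def stage_in(self) -> None:
        # Activate the dormant stripe; the stand-in "mirror sync" then fills it.
        self.ssd_state = StripeState.ACTIVE
        self.ssd_data = self.capacity_data

    def write(self, data: bytes) -> None:
        # Assumption for this sketch: job I/O goes to the active SSD stripe.
        if self.ssd_state is not StripeState.ACTIVE:
            raise RuntimeError("stage in before writing to the SSD stripe")
        self.ssd_data = data

    def stage_out(self) -> None:
        # The reverse path: sync back to the capacity tier, then release
        # the SSD stripe so it is dormant and empty again.
        if self.ssd_state is StripeState.ACTIVE:
            self.capacity_data = self.ssd_data
            self.ssd_data = b""
            self.ssd_state = StripeState.DORMANT
```

In this model a job would call `stage_in()` before it starts, use `write()` while running, and call `stage_out()` when it finishes, leaving the SSD stripe dormant with only its layout remaining.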

File history

Date/Time: 15:47, 14 June 2019 (current)
Size: 6.55 MB
User: Adilger
Comment: External HSM coordinators enable the flexible scheduling of tunable parameters, the addition of prioritized queues for different jobs, and options for improved scalability and data storage routines. In this presentation, I’ll discuss the current stat...
