PFL2 Scope Statement
Today, files in a Lustre filesystem have a static layout that is determined when each file is first opened. A file's layout may have one or many stripes (OST objects) on which it stores data, but the layout cannot be modified after the file is created, which forces the user to choose (or live with) the layout used at that time. If files are small, and/or many clients/threads are each writing to their own file (file-per-process, 1:1, N:N), then a single stripe per file is optimal because it reduces contention and metadata overhead. If files are very large and/or accessed by many clients concurrently (shared-single-file, N:1, N:M), and/or have bandwidth requirements exceeding what a single OST can provide, it is desirable to use a large number of stripes to maximize aggregate bandwidth and balance space usage.
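The trade-off between one stripe and many stripes can be illustrated with a minimal sketch of how a file offset maps onto stripes. It assumes round-robin (RAID-0 style) striping with a fixed stripe size, which is not stated above; the function name and the 1 MiB stripe size are hypothetical and chosen only for illustration.

```python
MIB = 1 << 20  # illustrative unit, not taken from the scope statement


def stripe_for_offset(offset, stripe_size, stripe_count):
    """Map a file byte offset to (stripe_index, offset_within_object).

    Assumes round-robin striping: consecutive stripe_size chunks of the
    file are placed on consecutive stripes (OST objects), wrapping around
    after stripe_count chunks.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    chunk = offset // stripe_size            # index of the chunk holding this byte
    stripe_index = chunk % stripe_count      # which stripe (OST object) stores the chunk
    # Each full pass over all stripes adds one chunk to every object.
    object_offset = (chunk // stripe_count) * stripe_size + offset % stripe_size
    return stripe_index, object_offset


# With one stripe, every byte lands on the same object.
single = stripe_for_offset(5 * MIB + 10, MIB, 1)
# With four stripes, the same byte lands on the second object (index 1).
wide = stripe_for_offset(5 * MIB + 10, MIB, 4)
```

With a single stripe all I/O is served by one OST object, which keeps metadata small but caps bandwidth at that OST; with more stripes, neighbouring chunks are spread over several objects and can be read or written in parallel.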
The Progressive File Layout (PFL) Prototype validated the approach of increasing the stripe count to get the benefits of both low metadata overhead for small files and high aggregate throughput for large files. To produce a production-quality PFL implementation, a number of issues must be addressed in the code. The addition of composite layouts increases the complexity of layout handling on both the client and the server, and must be implemented in a manner that isolates the complexity of creating and handling composite layouts. Clients, and applications running on them, must be able to add, modify, and remove composite layouts and individual components of a layout on the MDS while the file is in active use. The PFL Prototype implementation needs to be reworked to address a number of defects and implementation shortcuts.
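As a rough illustration of the composite layout idea, the sketch below models a layout as an ordered list of components, each covering a byte extent of the file and carrying its own stripe count. The class name, field names, extent boundaries, and stripe counts are all hypothetical; this is not the prototype's data structure or on-disk format, only one way to picture how a small file stays on one stripe while a large file gains stripes.

```python
from dataclasses import dataclass
from typing import List, Optional

MIB = 1 << 20  # illustrative unit


@dataclass(frozen=True)
class Component:
    """One component of a hypothetical composite layout."""
    start: int            # first byte covered (inclusive)
    end: Optional[int]    # first byte not covered; None means "to end of file"
    stripe_count: int     # number of stripes (OST objects) for this extent


def component_for_offset(layout: List[Component], offset: int) -> Component:
    """Return the component whose extent contains the given file offset."""
    for comp in layout:
        if offset >= comp.start and (comp.end is None or offset < comp.end):
            return comp
    raise LookupError(f"no component covers offset {offset}")


# Hypothetical progressive layout: stripe count grows with file offset.
layout = [
    Component(start=0, end=1 * MIB, stripe_count=1),
    Component(start=1 * MIB, end=64 * MIB, stripe_count=4),
    Component(start=64 * MIB, end=None, stripe_count=16),
]

small_io = component_for_offset(layout, 4096)       # falls in the 1-stripe extent
large_io = component_for_offset(layout, 100 * MIB)  # falls in the 16-stripe extent
```

In this sketch the extents are contiguous and non-overlapping, so each offset resolves to exactly one component.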
In-Scope for PFL Phase 2 Implementation
The Phase 2 implementation will produce High Level Design documents for the full PFL implementation. This includes designing the userspace interface and client-side implementation for creating and modifying composite layouts, as well as designing the network protocol and server-side implementation needed to modify composite layouts and keep them consistent while they are in active use. The userspace interface and the client-side and server-side layout handling will be implemented. The implementation will be based as much as is practical on the PFL Prototype implementation, which will be improved or reworked as needed.
In-Scope for PFL Phase 3 Implementation
The planned Phase 2 implementation is not a full PFL production implementation. In particular, it does not include the following components, which will be part of the PFL Phase 3 implementation:
- Client IO stack remapping for changing layouts during file IO
- Layout inheritance from parent directory or global default layout
- LFSCK support for the composite layout type
- PFL-optimized OST selection for multi-component composite layouts
Out of Scope for PFL Phase 2 Design
Although the PFL Phase 2 high-level design prepares for the implementation of dynamically managed PFL file layouts, the following items, including several potential applications of composite layouts, are beyond the scope of the PFL Phase 2/3 high-level design:
- Compatibility with older clients or servers
- Handling of composite layouts with overlapping/replicated extents
- Handling of nested composite layouts
- Components which contain Data-on-MDT or HSM archive layouts
- Increasing the maximum layout size beyond existing LOV xattr/RPC limits
- Handling OST out-of-space conditions by adding new components