PFL2 Scope Statement: Difference between revisions

From Lustre Wiki
m (reword to include Phase 3 implementation)

Revision as of 15:36, 19 January 2017

Problem Statement

Today, files in a Lustre filesystem have a static layout that is determined at the time each file is first opened. Each file's layout may have one or many stripes (OST objects) on which it stores data, but the layout cannot be modified after it is created, which forces the user to choose (or live with) the layout used at that time. If files are small, and/or many clients/threads are each writing to their own file (file-per-process, 1:1, N:N), then a single stripe per file is optimal because it reduces contention and metadata overhead. If files are very large and/or accessed by many clients concurrently (shared-single-file, N:1, N:M), and/or have bandwidth requirements exceeding what a single OST can provide, it is desirable to use a large number of stripes to maximize aggregate bandwidth and balance space usage.
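To make the trade-off above concrete, the following is a minimal sketch of how a byte offset maps onto stripes, assuming plain round-robin (RAID-0 style) striping with a fixed stripe size. The function name and parameters are hypothetical and chosen for illustration; this is not Lustre code.

```python
from typing import Tuple


def stripe_for_offset(offset: int, stripe_size: int, stripe_count: int) -> Tuple[int, int]:
    """Map a file offset to (stripe index, offset inside that stripe's object).

    Assumes round-robin striping: consecutive stripe_size chunks of the file
    are placed on stripes 0, 1, ..., stripe_count - 1, then wrap around.
    """
    if offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; stripe_size and stripe_count must be > 0")
    chunk = offset // stripe_size            # which stripe-sized chunk of the file
    stripe_index = chunk % stripe_count      # which stripe (OST object) holds it
    # Chunks already stored on this object, plus the position inside the chunk.
    object_offset = (chunk // stripe_count) * stripe_size + offset % stripe_size
    return stripe_index, object_offset
```

With a single stripe every offset lands in the same object, which keeps metadata simple for small files. With several stripes, neighbouring chunks land on different objects, which is what allows many clients to reach different OSTs in parallel.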

The Progressive File Layout (PFL) Prototype validated the approach of increasing the stripe count as a file grows, which combines low metadata overhead for small files with high aggregate throughput for large files. To reach a production-quality PFL implementation, a number of issues must be addressed in the code. The addition of composite layouts increases the complexity of layout handling on both the client and the server, so it must be implemented in a manner that isolates the complexity of creating and handling composite layouts. Clients, and applications running on them, must be able to add, modify, and remove composite layouts and individual components of a layout on the MDS while the file is in active use. The PFL Prototype implementation needs to be reworked to address a number of defects and implementation shortcuts.
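As an illustration of the composite layout idea described above, the sketch below models a layout as an ordered list of non-overlapping extent components whose stripe count grows with file offset. The class, field names, and the example extents are assumptions made for this sketch and do not reflect the actual PFL data structures or on-disk format.

```python
from dataclasses import dataclass
from typing import List, Optional

MiB = 1 << 20


@dataclass(frozen=True)
class Component:
    """One component of a hypothetical composite layout."""
    start: int            # first byte covered (inclusive)
    end: Optional[int]    # first byte not covered; None means "to end of file"
    stripe_count: int     # number of stripes used within this extent


def find_component(layout: List[Component], offset: int) -> Component:
    """Return the component whose extent contains the given file offset."""
    for comp in layout:
        if comp.start <= offset and (comp.end is None or offset < comp.end):
            return comp
    raise ValueError(f"no component covers offset {offset}")


# Illustrative values only: small files stay on one stripe, larger files
# progressively spread over more stripes.
EXAMPLE_LAYOUT = [
    Component(start=0, end=4 * MiB, stripe_count=1),
    Component(start=4 * MiB, end=256 * MiB, stripe_count=4),
    Component(start=256 * MiB, end=None, stripe_count=32),
]
```

A file that never grows past the first extent only ever touches the single-stripe component, while a write at a large offset resolves to the widely striped last component.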

In-Scope for PFL Phase 2 Implementation

The Phase 2 implementation will produce High Level Design documents for a full PFL implementation. This includes designing the userspace interface and client-side implementation for creating and modifying composite layouts, as well as designing the network protocol and server-side implementation needed to modify composite layouts and keep them consistent while they are in active use. The userspace interface, client-side layout handling, and server-side layout handling will be implemented. The implementation will be based as far as practical on the PFL Prototype implementation, which will be improved or reworked as needed.

In-Scope for PFL Phase 3 Implementation

The planned Phase 2 implementation is not a full PFL production implementation. In particular, it does not include the following components, which will be part of the PFL Phase 3 implementation:

  • Client IO stack remapping for changing layouts during file IO
  • Layout inheritance from parent directory or global default layout
  • LFSCK support for the composite layout type
  • PFL-optimized OST selection for multi-component composite layouts

Out of Scope for PFL Phase 2 Design

Although the PFL Phase 2 high-level design is preparing for implementing dynamically managed PFL file layouts, composite layouts have several potential applications that are beyond the scope of the PFL Phase 2/3 high-level design:

  • No compatibility will be possible for older clients or servers
  • Handling of composite layouts with overlapping/replicated extents
  • Handling of nested composite layouts
  • Components which contain Data-on-MDT or HSM archive layouts
  • Increasing the maximum layout size beyond existing LOV xattr/RPC limits
  • Handling OST out-of-space conditions by adding new components