Lustre Internals Documentation Update: Difference between revisions
KenRawlings (general project updates & wiki page refactoring)
Revision as of 08:52, 6 April 2017
Organization
Goals
- Incremental updating of Lustre internals documentation.
- Start with bringing the Understanding Lustre Filesystem Internals document up to date
- The time available from those with Lustre internals knowledge is limited, so keep the mailing list low-volume and the iterations targeted
Online Document
Community web-editable version of Understanding Lustre Filesystem Internals:
https://docs.google.com/document/d/1sbtonyl66h7g5AficO6BLRMeXwAeNp-pwswirWAkgIE
Mailing List
Google Groups Mailing List:
https://groups.google.com/forum/#!forum/lustre-internals-update/
Todo
- Online Document
  - Overall review, remove out-of-date information
  - Add new sections for subsystems that did not exist in 1.6
  - Section-by-section review
  - Add section with timeline for architectural changes
  - Add HSM section
  - Add DNE coverage
  - Add OSD section (research materials from 2.3-2.4 timeframe)
  - Add ZFS coverage
  - Expand protocol coverage and reference RPC documentation
  - The TOC was lost in the conversion, but it should be possible to autogenerate it from the sections by using styles
- Wiki Page
  - Review Doxygen comments in codebase and reference them here
  - Continue adding references to pertinent LUG/LAD/Lustre-Ecosystem/etc. presentations
- Organizational
  - Find a dynamic for collaboration between those with internals knowledge and those without it who want to help
  - Gather existing materials (presentations, etc.)
  - Investigate IO simplification project materials for updates and integration
  - Generate Doxygen comments and put on web (automate?)
  - Find a balance for how to coordinate (in document/wiki/group)
Community Feedback
Feedback from discussions with community members on updating Lustre internals documentation in general:
- Top-down versus bottom-up approach
  - Differing viewpoints on this. Possible to find a way to address both?
- Separate documents (e.g. Understanding Lustre Filesystem Internals document) versus documentation in the code (e.g. Doxygen)
  - Possible to incrementally expand and update both, balance what's in them, and have them reference each other?
Additional Resources
Doxygen Code Documentation
- Old: http://wiki.old.lustre.org/lid/doxygen.api/modules.html
- TODO: Generate new version from codebase and make available on web.
Lustre Protocol Documentation
- http://wiki.opensfs.org/Contract_SFS-DEV-005
- https://jira.hpdd.intel.com/browse/LUDOC-280
- https://build.hpdd.intel.com/job/lustre-protocol-reviews/lastSuccessfulBuild/artifact/protocol.html
Presentations
Sequoia and the ZFS OSD (LUG2013)
Intel® Lustre* File Level Replication (LUG2014)
Distributed Name Space Phase I (LUG2013)
Lustre & Kerberos: in theory and in practice (LUG2015)
Documentation in Lustre Tree
Overview of the Lustre Client I/O (CLIO) subsystem
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/clio.txt
LFSCK
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lfsck.txt
- LFSCK master slave design
- Object traversal design reference
Lock Ordering
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lock-ordering
/* dot input file for lock-ordering diagram */
Overview of the Lustre Object Storage Device API
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/osd-api.txt
Overview of Dynamic LNet Configuration
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/dlc.txt
Lustre versioning
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/versioning.txt
Old Wiki Architectural Documents
Old Site Lustre Internals Documentation Area
http://wiki.old.lustre.org/lid/
Glossary
http://wiki.old.lustre.org/lid/glossary/glossary.html
Brief descriptions of Lustre concepts, objects and major components indexed in various ways.
Lustre Internals: A Gentle Introduction
http://wiki.old.lustre.org/lid/agi/agi.html
Subsystem Map
TODO: Generate new version of http://wiki.old.lustre.org/lid/subsystem-map-old/subsystem-map.html
Current ULFI Table Of Contents
- Component View on Architecture
- Lustre Client
- OSS
- MDS
- Lustre Lite
- Connection
- Dentry Object
- Lustre Superblock
- Lustre inode
- Path Lookup
- Path
- Asynchronous I/O
- Group I/O (or Synchronous I/O)
- Direct I/O
- Interface with VFS
- Read-Ahead
- LOV and OSC
- Device Operations
- Page Management
- From OSC Client To OST
- Grant
- LDLM: Lock Manager
- Namespace
- Resource
- Lock Type and Mode
- Callbacks
- Intent
- Lock Manager
- Requesting a Lock
- Canceling a Lock
- Policy Function
- Cases
- MDS: One Client Read
- MDS: Two Clients
- OST: Two Clients Read and Write
- OST and Obdfilter
- as OST
- Initial Setup
- Dispatching
- Directory Layout
- Group Number
- Object Id
- obdfilter
- File Deletion
- File Creation
- MDC and Lustre Metadata
- Overview
- Striping EA
- Striping API
- Infrastructure Support
- Lustre Client Registration
- Superblock and Inode Registration
- Device
- Import and Export
- Portal RPC
- Client Side Interface
- Server Side Interface
- Bulk Transfer
- NRS Optimization
- Error Recovery: A Client Perspective
- LNET: Lustre Networking
- Core Concepts
- LNET Process Id
- ME: Matching Entry
- MD: Memory Descriptor
- Example Use of Offset
- MD Options
- Event Queue
- Portal RPC: A Client of LNET
- Get and Put Confusion
- Router In the Middle
- Round 1: Client Server Interactions
- Round 2: More details
- LNET API
- Naming Conventions
- Initialization and Teardown
- Memory-Oriented Communication Semantics
- Match Entry Management
- LNET/LND Semantics and API
- API Summary
- LNET Startup and Data Transfer
- Startup
- LNET Send
- LNET Receive
- The Case for RDMA
- LNET Routing
- General Concepts
- Asymmetric Routing Failure
- Routing Buffer Management and Flow Control
- Fine Grain Routing
- Core Concepts
- Lustre Generic Filesystem Wrapper Layer: fsfilt
- Overview
- fsfilt for ext3
- fsfilt Use Case Examples
- DIRECT_IO in Lustre
- Replaying Last Transactions After a Server Crash
- Client Connect/Disconnect
- Why ls Is Expensive on Lustre
- Lustre Disk Filesystem: ldiskfs
- Kernel Patches
- Patches: ext3 to ldiskfs
- Future Work