Lustre Internals Documentation Update
Author: KenRawlings (minor edit: refactoring)
Revision as of 13:21, 7 March 2017
Organization
Goals
Incrementally update the Lustre internals documentation, using the existing Understanding Lustre Filesystem Internals (ULFI) document as a base to build upon.
A group-editable web version of Understanding Lustre Filesystem Internals is available at:
https://docs.google.com/document/d/1sbtonyl66h7g5AficO6BLRMeXwAeNp-pwswirWAkgIE
- Lustre developer time is limited, so use a low-volume mailing list and targeted iterations
Feedback
- Top down versus bottom up
- Separate documents versus documentation in the code
- ULFI and Doxygen comments already exist. How can both be expanded and updated incrementally, their content balanced, and each made to reference the other?
Todo
- Gather existing materials (what isn't here? anything more to contribute?)
- LUG/LAD/Lustre Ecosystem internals-related presentation references
- Generate documentation from the Doxygen comments and publish it on the web (automate?)
- Find balance for how to coordinate (in document/wiki/group)
Old Site Lustre Internals Documentation Area
http://wiki.old.lustre.org/lid/
Glossary
http://wiki.old.lustre.org/lid/glossary/glossary.html
Brief descriptions of Lustre concepts, objects and major components indexed in various ways.
Lustre Internals: A Gentle Introduction
http://wiki.old.lustre.org/lid/agi/agi.html
Subsystem Map
TODO: Generate new version of http://wiki.old.lustre.org/lid/subsystem-map-old/subsystem-map.html
Understanding Lustre Filesystem Internals
Doxygen Code Documentation
http://wiki.old.lustre.org/lid/doxygen.api/modules.html
Additional Resources
TODO: How to integrate these into the above
Lustre Protocol Documentation
http://wiki.opensfs.org/Contract_SFS-DEV-005
https://jira.hpdd.intel.com/browse/LUDOC-280
https://build.hpdd.intel.com/job/lustre-protocol-reviews/lastSuccessfulBuild/artifact/protocol.html
Documentation in Lustre Tree
Overview of the Lustre Client I/O (CLIO) subsystem
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/clio.txt
LFSCK
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lfsck.txt
- LFSCK master slave design
- Object traversal design reference
Lock Ordering
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lock-ordering
/* dot input file for lock-ordering diagram */
Overview of the Lustre Object Storage Device API
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/osd-api.txt
Overview of Dynamic LNet Configuration
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/dlc.txt
Lustre versioning
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/versioning.txt
Old Wiki Architectural Documents
Current Table Of Contents
- Component View on Architecture
  - Lustre Client
  - OSS
  - MDS
- Lustre Lite
    - Connection
    - Dentry Object
    - Lustre Superblock
    - Lustre inode
  - Path Lookup
    - Path
    - Asynchronous I/O
    - Group I/O (or Synchronous I/O)
    - Direct I/O
    - Interface with VFS
  - Read-Ahead
- LOV and OSC
    - Device Operations
  - Page Management
  - From OSC Client To OST
  - Grant
- LDLM: Lock Manager
  - Namespace
  - Resource
  - Lock Type and Mode
  - Callbacks
  - Intent
  - Lock Manager
    - Requesting a Lock
    - Canceling a Lock
    - Policy Function
    - Cases
    - MDS: One Client Read
    - MDS: Two Clients
    - OST: Two Clients Read and Write
- OST and Obdfilter
    - as OST
    - Initial Setup
    - Dispatching
    - Directory Layout
    - Group Number
    - Object Id
  - obdfilter
    - File Deletion
    - File Creation
- MDC and Lustre Metadata
    - Overview
  - Striping EA
  - Striping API
- Infrastructure Support
  - Lustre Client Registration
  - Superblock and Inode Registration
    - Device
  - Import and Export
- Portal RPC
  - Client Side Interface
  - Server Side Interface
  - Bulk Transfer
    - NRS Optimization
  - Error Recovery: A Client Perspective
- LNET: Lustre Networking
  - Core Concepts
    - LNET Process Id
    - ME: Matching Entry
    - MD: Memory Descriptor
    - Example Use of Offset
    - MD Options
    - Event Queue
  - Portal RPC: A Client of LNET
    - Get and Put Confusion
    - Router In the Middle
    - Round 1: Client Server Interactions
    - Round 2: More details
  - LNET API
    - Naming Conventions
    - Initialization and Teardown
    - Memory-Oriented Communication Semantics
    - Match Entry Management
  - LNET/LND Semantics and API
    - API Summary
  - LNET Startup and Data Transfer
    - Startup
    - LNET Send
    - LNET Receive
    - The Case for RDMA
  - LNET Routing
    - General Concepts
    - Asymmetric Routing Failure
    - Routing Buffer Management and Flow Control
    - Fine Grain Routing
- Lustre Generic Filesystem Wrapper Layer: fsfilt
  - Overview
  - fsfilt for ext3
  - fsfilt Use Case Examples
    - DIRECT_IO in Lustre
    - Replaying Last Transactions After a Server Crash
    - Client Connect/Disconnect
    - Why ls Is Expensive on Lustre
- Lustre Disk Filesystem: ldiskfs
  - Kernel Patches
  - Patches: ext3 to ldiskfs
- Future Work