Lustre Internals Documentation Update
Organization
Goals
- Incremental updating of Lustre internals documentation.
- Start with bringing the Understanding Lustre Filesystem Internals document up to date
- Because those with Lustre internals knowledge have limited time, keep the mailing list low-volume and the iterations targeted
Online Document
Community web-editable version of Understanding Lustre Filesystem Internals:
https://docs.google.com/document/d/1sbtonyl66h7g5AficO6BLRMeXwAeNp-pwswirWAkgIE
Mailing List
Google Groups Mailing List:
https://groups.google.com/forum/#!forum/lustre-internals-update/
Todo
- Online Document
- Overall review, remove out-of-date information
- Add new sections for subsystems that did not exist in 1.6
- Section-by-section review
- Add section with timeline for architectural changes
- Add HSM section
- Add DNE coverage
- Add OSD section (research materials from 2.3-2.4 time-frame)
- Add ZFS coverage
- Expand protocol coverage and reference RPC documentation
- The TOC was lost in the conversion but can likely be auto-generated from the section headings by using styles
- Wiki Page
- Review Doxygen comments in code base and reference them here
- Continue adding references to pertinent LUG/LAD/Lustre-Ecosystem/etc. presentations
- Organizational
- Find a dynamic for collaboration between those with internals knowledge and those without it who want to help
- Gather existing materials (presentations, etc.)
- Investigate IO simplification project materials for updates and integration
- Generate Doxygen comments and put on web (automate?)
- Find a balance for where to coordinate (in the document, on the wiki, or in the group)
Community Feedback
Feedback from discussions with community members on updating Lustre internals documentation in general:
- Top-down versus bottom-up approach
- Viewpoints differ on this. Is it possible to find a way to address both?
- Separate documents (e.g. Understanding Lustre Filesystem Internals document) versus documentation in the code (e.g. Doxygen)
- Is it possible to incrementally expand and update both, balance what's in them, and have them reference each other?
Additional Resources
Doxygen Code Documentation
- Old: http://wiki.old.lustre.org/lid/doxygen.api/modules.html
- TODO: Generate new version from codebase and make available on web.
Lustre Protocol Documentation
- http://wiki.opensfs.org/Contract_SFS-DEV-005
- https://jira.hpdd.intel.com/browse/LUDOC-280
- https://build.hpdd.intel.com/job/lustre-protocol-reviews/lastSuccessfulBuild/artifact/protocol.html
Presentations
Sequoia and the ZFS OSD (LUG2013)
Intel® Lustre* File Level Replication (LUG2014)
Distributed Name Space Phase I (LUG2013)
Lustre & Kerberos: in theory and in practice (LUG2015)
Documentation in Lustre Tree
Overview of the Lustre Client I/O (CLIO) subsystem
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/clio.txt
LFSCK
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lfsck.txt
- LFSCK master slave design
- Object traversal design reference
Lock Ordering
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/lock-ordering
/* dot input file for lock-ordering diagram */
Overview of the Lustre Object Storage Device API
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/osd-api.txt
Overview of Dynamic LNet Configuration
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/dlc.txt
Lustre versioning
http://git.hpdd.intel.com/fs/lustre-release.git/blob_plain/HEAD:/Documentation/versioning.txt
Old Wiki Architectural Documents
Old Site Lustre Internals Documentation Area
http://wiki.old.lustre.org/lid/
Glossary
http://wiki.old.lustre.org/lid/glossary/glossary.html
Brief descriptions of Lustre concepts, objects and major components indexed in various ways.
Lustre Internals: A Gentle Introduction
http://wiki.old.lustre.org/lid/agi/agi.html
Subsystem Map
TODO: Generate new version of
Original ULFI Table Of Contents
- Component View on Architecture
- Lustre Client
- OSS
- MDS
- Lustre Lite
- Connection
- Dentry Object
- Lustre Superblock
- Lustre inode
- Path Lookup
- Path
- Asynchronous I/O
- Group I/O (or Synchronous I/O)
- Direct I/O
- Interface with VFS
- Read-Ahead
- LOV and OSC
- Device Operations
- Page Management
- From OSC Client To OST
- Grant
- LDLM: Lock Manager
- Namespace
- Resource
- Lock Type and Mode
- Callbacks
- Intent
- Lock Manager
- Requesting a Lock
- Canceling a Lock
- Policy Function
- Cases
- MDS: One Client Read
- MDS: Two Clients
- OST: Two Clients Read and Write
- OST and Obdfilter
- as OST
- Initial Setup
- Dispatching
- Directory Layout
- Group Number
- Object Id
- obdfilter
- File Deletion
- File Creation
- MDC and Lustre Metadata
- Overview
- Striping EA
- Striping API
- Infrastructure Support
- Lustre Client Registration
- Superblock and Inode Registration
- Device
- Import and Export
- Portal RPC
- Client Side Interface
- Server Side Interface
- Bulk Transfer
- NRS Optimization
- Error Recovery: A Client Perspective
- LNET: Lustre Networking
- Core Concepts
- LNET Process Id
- ME: Matching Entry
- MD: Memory Descriptor
- Example Use of Offset
- MD Options
- Event Queue
- Portal RPC: A Client of LNET
- Get and Put Confusion
- Router In the Middle
- Round 1: Client Server Interactions
- Round 2: More details
- LNET API
- Naming Conventions
- Initialization and Teardown
- Memory-Oriented Communication Semantics
- Match Entry Management
- LNET/LND Semantics and API
- API Summary
- LNET Startup and Data Transfer
- Startup
- LNET Send
- LNET Receive
- The Case for RDMA
- LNET Routing
- General Concepts
- Asymmetric Routing Failure
- Routing Buffer Management and Flow Control
- Fine Grain Routing
- Core Concepts
- Lustre Generic Filesystem Wrapper Layer: fsfilt
- Overview
- fsfilt for ext3
- fsfilt Use Case Examples
- DIRECT_IO in Lustre
- Replaying Last Transactions After a Server Crash
- Client Connect/Disconnect
- Why ls Is Expensive on Lustre
- Lustre Disk Filesystem: ldiskfs
- Kernel Patches
- Patches: ext3 to ldiskfs
- Future Work