Lustre Internals Documentation Update

From Lustre Wiki



  • Incremental updating of Lustre internals documentation.
  • The time available from those with Lustre internals knowledge is limited, so maximizing the efficiency of their efforts is a priority.


  • Updates are moving to Category:Internals and to pages in the Internals category, which makes linking and incremental updating easier.
  • Updating Lustre internals documentation is discussed regularly on the Lustre Working Group calls. Please join us there.



  • Migrate ULFI LNET chapter from shared document to wiki
  • Migrate ULFI PtlRPC chapter from shared document to wiki
  • Migrate ULFI LDLM chapter from shared document to wiki
  • Update Subsystem wiki page for current subsystems
  • Add Layered Object wiki page, reference clio.txt and LID/AGI resources
  • Update wiki PtlRPC page to reference PtlRPC documentation work


  • Generate Doxygen comments from current source tree and put on web
  • Review Doxygen comments in code base and find methods to reference them from wiki
  • Keep Lustre_Internals wiki page?



Feedback from discussions with community members on updating Lustre internals documentation in general:

  • Top-down versus bottom-up approach
    • Viewpoints differ on this. Is it possible to find a way to address both?
  • Separate documents (e.g. Understanding Lustre Filesystem Internals document) versus documentation in the code (e.g. Doxygen)
    • Is it possible to incrementally expand and update both, balance what is in each, and have them reference each other?


Lustre Protocol Documentation


Sequoia and the ZFS OSD (LUG2013)

Intel® Lustre* File Level Replication (LUG2014)

Distributed Name Space Phase I (LUG2013)

Lustre & Kerberos: in theory and in practice (LUG2015)

Documentation in Lustre Tree

Doxygen Code Documentation

Overview of the Lustre Client I/O (CLIO) subsystem


  • LFSCK master slave design
  • Object traversal design reference

Lock Ordering

/* dot input file for lock-ordering diagram */

Overview of the Lustre Object Storage Device API

Overview of Dynamic LNet Configuration

Lustre versioning

Old Wiki Architectural Documents

Page Comments
Architecture - Backup
Architecture - CROW
Architecture - CTDB with Lustre
Architecture - Caching OSS
Architecture - Changelogs
Architecture - Changelogs 1.6
Architecture - Client Cleanup
Architecture - Clustered Metadata
Architecture - Commit on Share
Architecture - Cuts
Architecture - DMU OSD
Architecture - DMU Zerocopy
Architecture - End-to-end Checksumming
Architecture - Epochs
Architecture - External File Locking
Architecture - FIDs on OST
Architecture - Feature FS Replication
Architecture - Fileset
Architecture - Flash Cache
Architecture - Free Space Management
Architecture - GNS
Architecture - HSM (No useful information)
Architecture - HSM Migration
Architecture - HSM and Cache (No useful information)
Architecture - IO system
Architecture - Interoperability 1.6 1.8 2.0
Architecture - Interoperability fids zfs
Architecture - LRE Images
Architecture - Libcfs
Architecture - Llog over OSD
Architecture - Lustre DLDs
Architecture - Lustre HLDs
Architecture - Lustre Logging API
Architecture - MDS-on-DMU
Architecture - MDS striping format
Architecture - MPI IO and NetCDF
Architecture - MPI LND
Architecture - Metadata API
Architecture - Migration (1)
Architecture - Migration (2)
Architecture - Multiple Interfaces For LNET
Architecture - Network Request Scheduler
Architecture - New Metadata API
Architecture - OSS-on-DMU
Architecture - Open by fid
Architecture - PAG
Architecture - Pools of targets
Architecture - Profiling Tools for IO
Architecture - Proxy Cache
Architecture - Punch and Extent Migration
Architecture - Punch and Extent Migration Requirements
Architecture - Recovery Failures
Architecture - Request Redirection
Architecture - Scalable Pinger
Architecture - Security
Architecture - Server Network Striping
Architecture - Simple Space Balance Migration
Architecture - Simplified Interoperation
Architecture - Space Manager
Architecture - Sub Tree Locks
Architecture - User Level Access
Architecture - User Level OSS
Architecture - Userspace Servers
Architecture - Version Based Recovery
Architecture - Wide Striping
Architecture - Wire Level Protocol
Architecture - Write Back Cache
Architecture - Writing Architecture Documents
Architecture - ZFS TinyZAP
Architecture - ZFS for Lustre
Architecture - ZFS large dnodes
Architecture Descriptions

Old Site Lustre Internals Documentation Area


Brief descriptions of Lustre concepts, objects and major components indexed in various ways.

Lustre Internals: A Gentle Introduction

Subsystem Map

TODO: Generate new version of

Original ULFI Table Of Contents

  • Component View on Architecture
    • Lustre Client
    • OSS
    • MDS
  • Lustre Lite
      • Connection
      • Dentry Object
      • Lustre Superblock
      • Lustre inode
    • Path Lookup
      • Path
      • Asynchronous I/O
      • Group I/O (or Synchronous I/O)
      • Direct I/O
      • Interface with VFS
    • Read-Ahead
  • LOV and OSC
      • Device Operations
    • Page Management
    • From OSC Client To OST
    • Grant
  • LDLM: Lock Manager
    • Namespace
    • Resource
    • Lock Type and Mode
    • Callbacks
    • Intent
    • Lock Manager
      • Requesting a Lock
      • Canceling a Lock
      • Policy Function
      • Cases
      • MDS: One Client Read
      • MDS: Two Clients
      • OST: Two Clients Read and Write
  • OST and Obdfilter
      • as OST
      • Initial Setup
      • Dispatching
      • Directory Layout
      • Group Number
      • Object Id
    • obdfilter
      • File Deletion
      • File Creation
  • MDC and Lustre Metadata
      • Overview
    • Striping EA
    • Striping API
  • Infrastructure Support
    • Lustre Client Registration
    • Superblock and Inode Registration
      • Device
    • Import and Export
  • Portal RPC
    • Client Side Interface
    • Server Side Interface
    • Bulk Transfer
      • NRS Optimization
    • Error Recovery: A Client Perspective
  • LNET: Lustre Networking
    • Core Concepts
      • LNET Process Id
      • ME: Matching Entry
      • MD: Memory Descriptor
      • Example Use of Offset
      • MD Options
      • Event Queue
    • Portal RPC: A Client of LNET
      • Get and Put Confusion
      • Router In the Middle
      • Round 1: Client Server Interactions
      • Round 2: More details
    • LNET API
      • Naming Conventions
      • Initialization and Teardown
      • Memory-Oriented Communication Semantics
      • Match Entry Management
    • LNET/LND Semantics and API
      • API Summary
    • LNET Startup and Data Transfer
      • Startup
      • LNET Send
      • LNET Receive
      • The Case for RDMA
    • LNET Routing
      • General Concepts
      • Asymmetric Routing Failure
      • Routing Buffer Management and Flow Control
      • Fine Grain Routing
  • Lustre Generic Filesystem Wrapper Layer: fsfilt
    • Overview
    • fsfilt for ext3
    • fsfilt Use Case Examples
      • DIRECT_IO in Lustre
      • Replaying Last Transactions After a Server Crash
      • Client Connect/Disconnect
      • Why ls Is Expensive on Lustre
  • Lustre Disk Filesystem: ldiskfs
    • Kernel Patches
    • Patches: ext3 to ldiskfs
  • Future Work