Lustre IO Monitoring

Overview

Lustre exposes detailed I/O statistics at multiple layers — client, RPC, and server. Understanding these statistics is essential for diagnosing performance problems, identifying bottlenecks, and validating tuning changes.

This page focuses on interpreting the stats. For a broader monitoring overview, see the Lustre Monitoring and Statistics Guide. For tuning guidance, see Lustre Tuning.

Client-Side Statistics (llite)

Client stats show how applications interact with the Lustre filesystem:

lctl get_param llite.*.stats

Key counters:

Counter | Description
read_bytes | Total bytes read by applications
write_bytes | Total bytes written by applications
open | Number of file open operations
close | Number of file close operations
mmap | Number of memory-mapped I/O operations
seek | Number of lseek calls
fsync | Number of fsync calls (can indicate write-barrier-heavy workloads)
truncate | Number of truncate operations
setattr | Number of attribute changes (chmod, chown, etc.)
getattr | Number of attribute reads (stat calls)

To reset client stats:

lctl set_param llite.*.stats=clear

Reading the Output Format

Each stat line shows:

{name} {count} samples [{unit}] {min} {max} {sum} {sumsq}

For example:

read_bytes    500 samples [bytes] 4096 1048576 209715200 ...

This means 500 read operations occurred, with the smallest read being 4 KB, the largest 1 MB, and a total of 200 MB read.
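Dividing the sum by the sample count gives the mean request size, which is often the quickest way to tell whether an application issues small or large I/O. The following is a minimal sketch, assuming the field order shown above; mean_io_size is a hypothetical helper name, not part of Lustre.

```shell
# mean_io_size NAME: read llite stats text on stdin and print the mean
# size (sum / count) of the named byte counter, truncated to whole bytes.
# Hypothetical helper; assumes fields: name count "samples" [unit] min max sum
mean_io_size() {
    awk -v name="$1" '
        $1 == name && $3 == "samples" && $2 > 0 {
            printf "%d\n", $7 / $2
        }'
}

# Sample line copied from the example above.
printf 'read_bytes    500 samples [bytes] 4096 1048576 209715200\n' |
    mean_io_size read_bytes
# prints 419430 (about 410 KB per read)
```

On a live client the input would come from `lctl get_param -n llite.*.stats`; with several mounted filesystems the function prints one line per mount.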

RPC Statistics (osc)

RPC stats reveal how the client packages I/O into RPCs sent to OSTs:

lctl get_param osc.*.rpc_stats

This produces histogram output with several sections:

Pages per RPC

Shows the distribution of RPC sizes in pages (typically 4 KB each):

pages per rpc         rpcs   % cumulative %
1:                     150  15  15
2:                      50   5  20
...
256:                   500  50 100

What to look for:

  • A healthy large-file workload should show most RPCs at the maximum size (256 pages = 1 MB by default).
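  • A high percentage of small RPCs (1-4 pages) usually indicates the application is doing small, non-sequential I/O. Raising max_pages_per_rpc does not help in that case, because the RPCs are not reaching the current limit; adjust the application's I/O pattern instead (larger, sequential, aligned requests).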
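The share of small RPCs can be computed directly from the histogram rather than read by eye. The sketch below assumes the simplified single-column layout shown on this page; in live output column 2 is assumed to be the read-side count. small_rpc_pct is a hypothetical helper name and the sample counts are made up for illustration.

```shell
# small_rpc_pct LIMIT: read rpc_stats text on stdin and print the percentage
# of RPCs in the "pages per rpc" section whose bucket is <= LIMIT pages.
# Hypothetical helper; uses column 2 as the RPC count.
small_rpc_pct() {
    awk -v limit="$1" '
        /^pages per rpc/ { in_section = 1; next }           # section header
        in_section && $1 !~ /^[0-9]+:$/ { in_section = 0 }  # non-bucket line ends it
        in_section {
            pages = $1; sub(/:$/, "", pages)
            total += $2
            if (pages + 0 <= limit + 0) small += $2
        }
        END { if (total > 0) printf "%.1f\n", 100 * small / total }'
}

# Made-up sample histogram in the format shown above.
small_rpc_pct 4 <<'EOF'
pages per rpc         rpcs   % cumulative %
1:                     150  15  15
2:                      50   5  20
4:                     100  10  30
256:                   700  70 100
EOF
# prints 30.0
```

Fed from `lctl get_param -n osc.*.rpc_stats`, the function aggregates the pages-per-RPC sections of all OSCs into one figure.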

RPCs in Flight

Shows how many RPCs are concurrently in flight:

rpcs in flight        rpcs   % cumulative %
1:                     200  20  20
2:                     300  30  50
...
8:                     100  10 100

What to look for:

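  • If most RPCs are at depth 1, the client is rarely keeping more than one RPC outstanding and may be latency-bound. Check that max_rpcs_in_flight (default 8) has not been set too low; if it has not, the limit is not the constraint, and the application is probably issuing I/O serially.
  • If the histogram is concentrated at the maximum value, the client is fully utilizing its RPC pipeline. Raising max_rpcs_in_flight may help; if it does not, the bottleneck is likely on the server side or the network.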
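To judge whether the pipeline is saturated, it helps to know the deepest bucket that was actually used and how many RPCs landed there. A rough sketch under the same single-column assumption as the sample above; inflight_top_share is a hypothetical helper name, the sample counts are invented, and the function is meant for one OSC's output at a time.

```shell
# inflight_top_share: read one OSC's rpc_stats text on stdin and print the
# deepest non-empty bucket of the "rpcs in flight" section together with
# the percentage of RPCs recorded at that depth. Hypothetical helper.
inflight_top_share() {
    awk '
        /^rpcs in flight/ { in_section = 1; next }
        in_section && $1 !~ /^[0-9]+:$/ { in_section = 0 }
        in_section {
            depth = $1; sub(/:$/, "", depth)
            total += $2
            # Buckets are listed in ascending order, so the last non-empty
            # row seen is the deepest one.
            if ($2 > 0) { top = depth; top_count = $2 }
        }
        END {
            if (total > 0)
                printf "depth=%d share=%.1f%%\n", top, 100 * top_count / total
        }'
}

# Made-up sample histogram.
inflight_top_share <<'EOF'
rpcs in flight        rpcs   % cumulative %
1:                     200  20  20
2:                     300  30  50
3:                     100  10  60
4:                     100  10  70
5:                      50   5  75
6:                      50   5  80
7:                     100  10  90
8:                     100  10 100
EOF
# prints depth=8 share=10.0%
```

Comparing the reported depth with the value of `lctl get_param osc.*.max_rpcs_in_flight` shows whether the client ever reaches its configured limit.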

Offset Distribution

Shows whether I/O is sequential or random.

What to look for:

  • Sequential workloads show offsets that increase monotonically.
  • Random workloads show a flat offset distribution across the file.

Server-Side BRW Statistics (obdfilter)

BRW (Bulk Read/Write) stats show the actual disk I/O patterns on the OSS:

lctl get_param obdfilter.*.brw_stats

Sections include:

Disk I/O Size

disk I/O size          ios   % cumulative %
4K:                    100  10  10
8K:                     50   5  15
...
1M:                    500  50 100

What to look for:

  • Large I/O sizes (512 KB–1 MB) indicate efficient bulk transfer.
  • Many small I/O sizes suggest fragmentation, small-file workloads, or clients sending suboptimal RPCs.
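The proportion of large disk I/Os can be extracted from the histogram with a short filter. This is a sketch only: it assumes bucket labels of the form 4K or 1M as in the sample above, treats column 2 as the I/O count, and uses the hypothetical helper name large_io_pct with invented sample counts.

```shell
# large_io_pct MIN_BYTES: read brw_stats text on stdin and print the
# percentage of I/Os in the "disk I/O size" section whose bucket is at
# least MIN_BYTES. Hypothetical helper.
large_io_pct() {
    awk -v min="$1" '
        # Convert a label such as 4K or 1M to bytes.
        function to_bytes(label,    n, unit) {
            n = label + 0                        # leading digits
            unit = substr(label, length(label))  # last character
            if (unit == "K") return n * 1024
            if (unit == "M") return n * 1024 * 1024
            return n
        }
        /^disk I\/O size/ { in_section = 1; next }
        in_section && $1 !~ /^[0-9]+[KM]?:$/ { in_section = 0 }
        in_section {
            label = $1; sub(/:$/, "", label)
            total += $2
            if (to_bytes(label) >= min + 0) large += $2
        }
        END { if (total > 0) printf "%.1f\n", 100 * large / total }'
}

# Made-up sample histogram; 524288 bytes = 512 KB.
large_io_pct 524288 <<'EOF'
disk I/O size          ios   % cumulative %
4K:                    100  10  10
8K:                     50   5  15
512K:                  350  35  50
1M:                    500  50 100
EOF
# prints 85.0
```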

Contiguous vs. Non-Contiguous Access

Shows whether I/O requests from clients target contiguous disk regions:

  • High contiguous access percentages indicate sequential I/O — good for throughput.
  • High non-contiguous (discontinuous) access means the server is seeking — a potential performance bottleneck on HDDs (less impactful on SSDs).

I/O Time

Shows the time distribution for completing I/O operations:

  • Look for long-tail latencies that could indicate slow disks, RAID rebuilds, or resource contention.
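One way to quantify a long tail is to find the bucket at which the running share of operations first reaches a chosen percentile. The sketch below works on histogram rows of a single section; hist_percentile is a hypothetical helper name, and the sample rows are invented because this page does not show an I/O time sample.

```shell
# hist_percentile P: read histogram rows ("label: count ...") of ONE section
# on stdin and print the first bucket label at which the running share of
# operations reaches P percent. Lines that are not bucket rows are ignored.
# Hypothetical helper.
hist_percentile() {
    awk -v p="$1" '
        $1 ~ /:$/ && $2 ~ /^[0-9]+$/ {
            n++
            label[n] = $1; sub(/:$/, "", label[n])
            count[n] = $2
            total += $2
        }
        END {
            if (total == 0) exit
            for (i = 1; i <= n; i++) {
                running += count[i]
                if (100 * running >= p * total) { print label[i]; exit }
            }
        }'
}

# Made-up I/O time rows (bucket label, count).
hist_percentile 99 <<'EOF'
1:     600
2:     250
4:     100
8:      30
16:     15
512:     5
EOF
# prints 16
```

In this invented sample, 99% of operations complete within the 16 bucket while a few land in the 512 bucket, which is the kind of gap that points to an occasional slow device or contention.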

Interpreting the Histogram Format

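Lustre histograms share a common row format:

{bucket_label}:  {count}  {percentage}  {cumulative_percentage}

  • count: number of operations in this bucket.
  • percentage: share of all operations that fall in this bucket.
  • cumulative percentage: running total; 100% at the last bucket.

In live rpc_stats and brw_stats output, each row carries these three columns twice, once for reads and once for writes; the examples on this page show a single set for brevity.

Most bucket labels are powers of 2 (1, 2, 4, 8, 16, ...) representing pages, bytes, or time units, depending on the section; the section header states the unit. The rpcs in flight histogram is the exception: its buckets are consecutive integers.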
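The percentage and cumulative percentage columns are derived entirely from the counts, so they can be recomputed when only the counts are at hand (for example after summing histograms from several OSTs). A small sketch; hist_recompute is a hypothetical helper name and the input counts are invented.

```shell
# hist_recompute: read rows of "label: count" for ONE histogram section on
# stdin and print each row with its percentage and cumulative percentage,
# both rounded to whole numbers. Hypothetical helper.
hist_recompute() {
    awk '
        $1 ~ /:$/ && $2 ~ /^[0-9]+$/ {
            n++; label[n] = $1; count[n] = $2; total += $2
        }
        END {
            if (total == 0) exit
            for (i = 1; i <= n; i++) {
                running += count[i]
                printf "%s %d %.0f %.0f\n", label[i], count[i],
                       100 * count[i] / total, 100 * running / total
            }
        }'
}

# Made-up counts.
hist_recompute <<'EOF'
1:   150
2:    50
4:   300
256: 500
EOF
# prints:
# 1: 150 15 15
# 2: 50 5 20
# 4: 300 30 50
# 256: 500 50 100
```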

Common Performance Anti-Patterns

Symptom | Where to Look | Likely Cause
Many small RPCs | osc.*.rpc_stats (pages per rpc) | Application doing small random I/O; misaligned I/O; stripe size too small
Low RPCs in flight | osc.*.rpc_stats (rpcs in flight) | max_rpcs_in_flight too low; single-threaded application
High queue depth on server | obdfilter.*.brw_stats | Server overloaded; too many clients; slow storage backend
Imbalanced I/O across OSTs | lfs df or per-OST stats | Uneven striping; hot files on specific OSTs; OST QoS weights
High getattr/setattr rate | llite.*.stats | Metadata-heavy workload (e.g., ls -l on large directories); consider MDT tuning
Many fsync calls | llite.*.stats | Application forcing write barriers; affects throughput significantly

Collecting Stats Over Time

To capture a baseline and then measure a workload:

# Clear stats
lctl set_param llite.*.stats=clear
lctl set_param osc.*.rpc_stats=clear
# Run workload
# ...
# Collect stats
lctl get_param llite.*.stats > /tmp/llite_stats.txt
lctl get_param osc.*.rpc_stats > /tmp/rpc_stats.txt

On servers:

lctl set_param obdfilter.*.brw_stats=clear
# ... (after workload)
lctl get_param obdfilter.*.brw_stats > /tmp/brw_stats.txt
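Where clearing the counters is undesirable (for instance on a node whose stats are also read by other tools), an alternative is to save a snapshot before and after the workload and subtract. The sketch below is that variation, not the clear-then-collect procedure above; stat_delta is a hypothetical helper name and the snapshot contents are invented.

```shell
# stat_delta NAME BEFORE AFTER: print how much the sum column of counter
# NAME grew between two saved llite stats snapshots (files BEFORE and AFTER).
# If the counter appears for several mounts, the sums are added together.
# Hypothetical helper.
stat_delta() {
    before=$(awk -v name="$1" '$1 == name { s += $7 } END { printf "%.0f", s }' "$2")
    after=$(awk -v name="$1" '$1 == name { s += $7 } END { printf "%.0f", s }' "$3")
    echo $((after - before))
}

# Made-up snapshots standing in for two saved "lctl get_param" outputs.
tmp_before=$(mktemp)
tmp_after=$(mktemp)
printf 'write_bytes 100 samples [bytes] 4096 1048576 104857600\n' > "$tmp_before"
printf 'write_bytes 300 samples [bytes] 4096 1048576 314572800\n' > "$tmp_after"
stat_delta write_bytes "$tmp_before" "$tmp_after"
# prints 209715200 (200 MB written between the snapshots)
rm -f "$tmp_before" "$tmp_after"
```

Dividing the result by the elapsed time between the two snapshots gives an average throughput for the interval.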

See Also