OBDFilter Survey

From Lustre Wiki

Description

OBDFilter-Survey tests the performance of one or more OSTs by simulating Lustre client I/O. Each OSS in an installation is tested individually. The obdfilter-survey script is a wrapper around the lctl sub-command test_brw. It requires a functional Lustre file system: the MGS and MDT must be running, as must the target OSTs. Lustre clients are not required for the disk-only test but are needed for the network and remote file system (netdisk) modes, although in practice the latter two modes are not used.

There are 3 test cases covered by the obdfilter-survey benchmark, referred to as:

  • disk
  • network
  • netdisk

The network and netdisk modes are not normally used for benchmarking as they may produce unreliable results. The network test has been effectively superseded by the LNET_Selftest benchmark. Also note that the obdfilter-survey benchmark itself does not scale well beyond a small number of OSTs. From the Lustre discussion mailing list:

The obdfilter_survey script is NOT scalable beyond tens of OSTs since it is only intended to measure the I/O performance of individual storage subsystems, not the scalability of the entire system.

Therefore, only run obdfilter-survey on individual OSTs, using the disk test case.

Note: obdfilter-survey is a potentially destructive test and there is a small risk that pre-existing data will be lost during execution. The test will consume capacity on the storage targets during execution and, in the manner of all benchmarks, will compete with other processes for resources. Do not run this benchmark on a system containing production data.

Purpose

OBDFilter-Survey provides feedback on the potential performance of OSTs attached to an OSS. The obdfilter-survey script generates sequential I/O from varying numbers of threads and objects (files) to simulate the I/O patterns of a Lustre client. It can be run directly on an OSS node to measure the OST storage performance without any intervening network, or it can be run remotely on a Lustre client to measure the OST performance including network overhead.

The approach and methodology for obdfilter-survey is very similar to that for sgpdd-survey.

Preparation

  1. Install the host operating system for each of the Lustre servers (MGS, MDS and OSS). On some sites, the MGS will be co-located with an MDS.
  2. Install the Lustre server software distribution on each system.
  3. Configure the Lustre Network (LNet) module and verify that it is operating correctly.
  4. Create the MGS, MDT and OST file system targets according to the system design.
  5. Start the Lustre services (mount MGS, OSTs and MDTs).

Benchmark Execution

The obdfilter-survey script takes its parameters from environment variables established at run time. The parameters that are of most interest are as follows (refer to the section OBDFilter-Survey Input Parameters for a detailed breakdown of the parameters and how to calculate suitable values):

Name | Description | Typical Values
nobjlo | Initial object count | 1
nobjhi | Maximum object count | 512
thrlo | Initial number of threads | 1
thrhi | Maximum number of threads | 1024
size | Data set size in MiB per OST. Cannot exceed the capacity of an individual OST | 2.5 * size(RAM) / count(LUNs)
rslt_loc | Directory to contain results. Must exist before the script is run | /var/tmp/obdfilter-survey_out
targets | Space-separated list of OSTs to test, enclosed in quotes. Each target is referenced by its object label, in the format <fsname>-OST<index> | "lustre-OST0006 lustre-OST0007"

The following is an example command line for executing obdfilter-survey:

mkdir -p /var/tmp/obdfilter-survey_out

nobjlo=1 nobjhi=512 \
thrlo=1 thrhi=1024 \
size=51200 \
rslt_loc=/var/tmp/obdfilter-survey_out \
targets="lustre-OST0006 lustre-OST0007 lustre-OST0008" \
case=disk \
obdfilter-survey
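
The targets list can also be assembled from the device list on the OSS rather than typed by hand. The following is a minimal sketch, assuming that lctl dl prints one device per line with the device type in the third column and the device name in the fourth, as in the sample output in the Issues section. The function name list_obdfilter_targets is hypothetical and is not part of the obdfilter-survey tooling.

```shell
# Hypothetical helper: read `lctl dl` output on stdin and print the names of
# all obdfilter devices as one space-separated line, suitable for targets=.
# Assumes column 3 is the device type and column 4 is the device name.
list_obdfilter_targets() {
    awk '$3 == "obdfilter" { printf "%s%s", sep, $4; sep = " " }
         END { if (sep) print "" }'
}
```

On an OSS this might be used as targets="$(lctl dl | list_obdfilter_targets)". Check the resulting list before starting a run, since it includes every OST on the node.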

OBDFilter-Survey Input Parameters

Note: The parameters to obdfilter-survey are largely undocumented. This section attempts to accurately reflect the intended meaning and usage of each parameter.

Note: Due to the similarity between sgpdd-survey and obdfilter-survey command options, some material is duplicated between the two pages.

Name Description Typical Values
nobjlo, nobjhi

Description: The number of concurrent objects to create on each iteration, per OST. Starting at nobjlo, the number of objects is doubled on each iteration until nobjhi is reached. The nobjlo and nobjhi parameters control how many independent objects on the OST are read or written simultaneously. This is intended to simulate multiple Lustre clients accessing each OST.

Typical values: nobjlo=1, nobjhi=512

thrlo, thrhi

Description: The range of threads to iterate over. The thread count is doubled on every iteration of the thread count loop. The thrlo and thrhi parameters set the number of worker threads running in parallel. This is intended to simulate the Lustre OSS threads.

Typical values: thrlo=1, thrhi=1024

size

Description: The data set size in MiB per storage device (LUN). The total data set size for the entire benchmark is the per-LUN data set size multiplied by the number of LUNs in the benchmark:

ds_total = size * count(LUNs)

Set size to a small value (e.g. size=100, i.e. 100 MiB) to quickly test the configuration for correctness.

For a full benchmarking run, set ds_total to at least twice the target system's RAM, so that caching within the target system does not distort the results. This page uses a factor of 2.5. From this, calculate the value of the size input parameter for obdfilter-survey as follows:

size = ds_total / count(LUNs)

where ds_total = 2.5 * RAM

As an example, for a server with 32 GiB RAM and 4 LUNs, the input size is calculated as follows:

size = (32 GiB * 2.5) / 4 LUNs
     = (32768 MiB * 2.5) / 4
     = 81920 / 4
     = 20480

Typical values: 2.5 * sizeof(RAM) / count(LUNs), e.g. size=20480

rslt_loc

Description: Directory to contain results. Must exist before the benchmark is run.

Typical values: rslt_loc=/var/tmp/obdfilter-survey_out

rszlo, rszhi

Description: The record size in KiB. The default is 1024 (1 MiB I/O), which mirrors Lustre's write size, so there is no need to set these parameters explicitly when 1 MiB records are wanted. This maps to the bpt (blocks per transfer) parameter in sgp_dd, which is the number of blocks in each I/O transaction.
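
The iteration ranges and the size calculation described above can be checked with a few lines of shell before a run. The following is a minimal sketch, not code taken from obdfilter-survey. The function names doubling_sequence and survey_size_mib are hypothetical. The doubling loop assumes a low value of at least 1 and stops at the last value that does not exceed the high value; how the script itself treats a high value that is not reached exactly by doubling is not covered here. The size helper uses the 2.5 * RAM rule from the size entry with integer arithmetic that rounds down.

```shell
# Illustration only; these helpers are hypothetical and are not part of
# the obdfilter-survey script.

# Print the values visited when starting at lo and doubling while the
# value does not exceed hi (for nobjlo/nobjhi or thrlo/thrhi).
doubling_sequence() (
    lo=$1
    hi=$2
    [ "$lo" -ge 1 ] || exit 1   # doubling from 0 would never terminate
    n=$lo
    out=""
    while [ "$n" -le "$hi" ]; do
        out="${out:+$out }$n"
        n=$((n * 2))
    done
    echo "$out"
)

# Compute size (MiB per LUN) as 2.5 * RAM / count(LUNs).
# Arguments: RAM in MiB, number of LUNs.
survey_size_mib() (
    ram_mib=$1
    luns=$2
    [ "$luns" -ge 1 ] || exit 1
    echo $((ram_mib * 25 / 10 / luns))
)

doubling_sequence 1 512     # object counts for nobjlo=1 nobjhi=512
doubling_sequence 1 1024    # thread counts for thrlo=1 thrhi=1024
survey_size_mib 32768 4     # 32 GiB RAM and 4 LUNs, prints 20480
```

On Linux, the RAM figure in MiB can be taken from the MemTotal line of /proc/meminfo, which is reported in kB, for example with awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo.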

Issues

  • If the test is aborted before completion, the target OST is not cleaned up. If runs are aborted repeatedly, the OST gradually fills with objects that are difficult to remove. If this happens, unmount the OST and re-mount it as type ldiskfs. Go to the O/2 directory (the capital letter 'O', not the numeral zero) and remove the objects found there. Do not remove the LAST_ID file. Alternatively, read the object IDs from the detail log of the test run and use the following command while the file system is mounted as type lustre:
    lctl --device <ostref> destroy <1st object> <count>
    
  • ERROR entries in the benchmark output usually indicate an out-of-space condition; check the detail log.
  • SHORT entries in the benchmark output usually mean that the test completed too quickly, e.g. because the data size is too small; check the detail log.
  • After a catastrophic failure, echo clients might not be cleaned up. Use lctl to remove them, e.g.:
    [root@oss1 ~]# lctl dl
    0 UP mgc MGC10.73.0.11@tcp 82e16bb1-9e83-1236-4435-c0dcf29a04da 5
    1 UP ost OSS OSS_uuid 3
    2 UP obdfilter lustre-OST0000 lustre-OST0000_UUID 7
    3 AT echo_client lustre-OST0000_ecc lustre-OST0000_ecc_UUID 1
    
    [root@oss1 ~]# lctl
    lctl > cfg lustre-OST0000_ecc
    lctl > cleanup
    lctl > detach
    lctl > exit
    
    [root@oss1 ~]# lctl dl
    0 UP mgc MGC10.73.0.11@tcp 82e16bb1-9e83-1236-4435-c0dcf29a04da 5
    1 UP ost OSS OSS_uuid 3
    2 UP obdfilter lustre-OST0000 lustre-OST0000_UUID 7
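
When several echo clients are left behind, the commands for the interactive session above can be generated rather than typed. The following is a minimal sketch, assuming the lctl dl column layout shown above (device type in the third column, device name in the fourth). The function name print_echo_client_cleanup is hypothetical. It only prints the commands and executes nothing, so the output can be reviewed first.

```shell
# Hypothetical helper: read `lctl dl` output on stdin and print, for each
# echo_client device, the lctl commands used in the interactive session
# above (cfg <name>, cleanup, detach). Nothing is executed.
print_echo_client_cleanup() {
    awk '$3 == "echo_client" { print "cfg " $4; print "cleanup"; print "detach" }'
}
```

Run as lctl dl | print_echo_client_cleanup, review the output, and enter the printed lines at the lctl prompt. Afterwards, lctl dl should no longer list the echo_client devices.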
    

Appendix: Sample obdfilter-survey script

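#!/bin/bash
# Minimal obdfilter-survey script
# Targets to test, given as <host>:<target>.
TARGETS="oss01:lustre oss02:lustre1 oss02:lustre2"
# One object and a fixed 16 threads, so a single iteration is run.
NOBJLO=1
NOBJHI=1
THRLO=16
THRHI=16
# Results directory (rslt_loc); it must exist before the run.
OUTPUT="/tmp/obdfilter-survey-1j-16t-100s.out"
# Data set size in MiB: 100 for a quick check, a larger value for a full run.
# SIZE="46000"
SIZE="100"
# Create the results directory on each OSS.
ssh oss01 mkdir -p "$OUTPUT"
ssh oss02 mkdir -p "$OUTPUT"
# Parameters are passed to obdfilter-survey as environment variables.
thrhi="$THRHI" thrlo="$THRLO" \
nobjhi="$NOBJHI" nobjlo="$NOBJLO" \
size="$SIZE" case="disk" \
targets="$TARGETS" rslt_loc="$OUTPUT" \
/usr/bin/obdfilter-survey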