VDBench
Description
VDBench is an I/O workload generator for measuring storage performance and verifying the data integrity of direct-attached and network-connected storage. The software runs on a wide range of operating systems.
Purpose
VDBench is typically used to establish baseline performance characteristics of block storage, both for individual disk drives and for RAID LUNs. It can also be used at the file system level to simulate application I/O. The procedures in this document use the benchmark to verify the performance of individual drives and LUNs. VDBench will destroy content when running write workloads against raw devices; do not use it on raw devices containing production data.
Preparation
- Install the Java Runtime Environment (JRE), if not already present on the target machine. Use the official JRE from http://java.com. The complete list of supported runtimes is available at: https://www.java.com/en/download/manual.jsp.
- Unless Java is a permanent fixture of the platform run-time, download the 64-bit tarball rather than the RPM package. This will allow the Java software to be installed in an arbitrary, isolated directory structure and easily deleted when the benchmark is concluded. For example:
cd $HOME
tar zxf $HOME/jre-8u131-linux-x64.tar.gz
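To confirm that the unpacked JRE runs, check its version (the path assumes the example release above):
$HOME/jre1.8.0_131/bin/java -version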
- Download VDBench. The current official version is available from the Oracle Technology Network (OTN):
http://www.oracle.com/technetwork/server-storage/vdbench-downloads-1901681.html
The download is free, but Oracle requires that users register an account. An older version also remains available on SourceForge.
- Unzip the VDBench archive:
mkdir $HOME/vdbench
cd $HOME/vdbench && unzip $HOME/vdbench50406.zip
- Update the vdbench wrapper script contained within the VDBench distribution to point to the JRE location, e.g.:
cd $HOME/vdbench
sed -i.inst 's/^\(java\)=.*$/\1=$HOME\/jre1.8.0_131\/bin\/java/' vdbench
- Run a quick test to ensure that vdbench can run on the target system:
./vdbench -t
Note that older versions of VDBench required CSH or TCSH; current releases have no such dependency.
Benchmark Execution
Establish Baseline Performance of Individual Drives
- Ensure that all individual disks are presented to the operating platform for testing. For some storage arrays, a separate LUN must be created for each target device.
- Create a test profile for each target, designed to run read-only and write-only tests. Use the following to create a template:
for i in b c d e f; do
  sed 's/sdX/sd'${i}'/g' > input_sd${i}_rw_test <<__EOF
# SD -- Storage Definition
sd=sdX,lun=/dev/sdX,openflags=o_direct
# WD -- Workload Definition
wd=sdX_wd_r_seq,sd=sd*,xfersize=1024k,rdpct=100,seekpct=sequential
wd=sdX_wd_w_seq,sd=sd*,xfersize=1024k,rdpct=0,seekpct=sequential
# RD -- Run Definition
rd=sdX_run_r_seq_iomax,wd=sdX_wd_r_seq,iorate=max,elapsed=100,interval=10,forthreads=(1-1024,d),warmup=20
rd=sdX_run_w_seq_iomax,wd=sdX_wd_w_seq,iorate=max,elapsed=100,interval=10,forthreads=(1-1024,d),warmup=20
__EOF
done
Initially, structure the read tests to run first. This gives the best opportunity to discover and fix any errors in the benchmark configuration before a write test destroys the data on the device. Adjust the input files according to the results obtained during optimisation; for example, if write performance is poor, one may wish to disable the read tests altogether while adjusting parameters that affect write performance, as in the sketch below.
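For example, a minimal input file for a single hypothetical device (sdb), with the read run definition removed while write parameters are tuned, might look like this (a sketch only; adapt the labels and parameters from the generated templates):
# SD -- Storage Definition
sd=sdb,lun=/dev/sdb,openflags=o_direct
# WD -- Workload Definition (write only while tuning)
wd=sdb_wd_w_seq,sd=sd*,xfersize=1024k,rdpct=0,seekpct=sequential
# RD -- Run Definition (read run disabled)
rd=sdb_run_w_seq_iomax,wd=sdb_wd_w_seq,iorate=max,elapsed=100,interval=10,forthreads=(1-1024,d),warmup=20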
Create one test profile for each device under test. Templates containing multiple storage definitions will be evaluated for future use once issues relating to CPU utilisation have been resolved (see notes).
- Run each vdbench test case in sequence:
for i in b c d e f; do
  ./vdbench -f input_sd${i}_rw_test -o o_sd${i}_rw_test.tod
done
- Tabulate the results in a spreadsheet such as Excel and generate graphs to visualise the data. Establish the performance trend and look for any exceptions. One should normally expect to see healthy drives performing within +/-5% of one another. Faulty drives normally stand out quite clearly.
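As an aid to tabulation, the per-run averages can be pulled from each output directory before importing into the spreadsheet. The following is only a rough sketch: it assumes the default totals.html report and the output directory naming used above, and the exact report layout varies between VDBench versions:
for i in b c d e f; do
  echo "== sd${i} =="
  grep 'avg_' o_sd${i}_rw_test.*/totals.html
done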
- Replace any bad drives and re-run VDBench against those targets.
Establish Baseline Performance of RAID LUNs
- Create the RAID LUNs that will be used to establish the file system storage volumes for Lustre.
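As an illustration only, a RAID-6 LUN could be assembled with Linux MD RAID as follows (hypothetical device names and geometry; in practice, LUNs are often created with the storage array's own management tools):
mdadm --create /dev/md0 --level=6 --raid-devices=10 /dev/sd[b-k]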
- Repeat the VDBench benchmark using the same test profile as for individual storage devices:
for i in b c d e f; do
  sed 's/sdX/sd'${i}'/g' > input_sd${i}_raid_rw_test <<__EOF
# VDBench baseline performance test for RAID Volumes
# SD -- Storage Definition
sd=sdX,lun=/dev/sdX,openflags=o_direct
# WD -- Workload Definition
wd=sdX_raid_wd_r_seq,sd=sd*,xfersize=1024k,rdpct=100,seekpct=sequential
wd=sdX_raid_wd_w_seq,sd=sd*,xfersize=1024k,rdpct=0,seekpct=sequential
# RD -- Run Definition
rd=sdX_raid_run_r_seq_iomax,wd=sdX_raid_wd_r_seq,iorate=max,elapsed=100,interval=10,forthreads=(1-1024,d),warmup=20
rd=sdX_raid_run_w_seq_iomax,wd=sdX_raid_wd_w_seq,iorate=max,elapsed=100,interval=10,forthreads=(1-1024,d),warmup=20
__EOF
done
Note that the device names may be different for assembled LUNs, depending on the driver used and/or vendor-supplied software. e.g. MD RAID devices are typically /dev/mdX and kernel multipath devices are typically /dev/dm-XX. This can vary by Linux distribution as well as by storage vendor.
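When a LUN is not presented as /dev/sdX, adjust both the loop variable and the storage definition accordingly. For example, a storage definition for a hypothetical MD RAID device would read:
sd=md0,lun=/dev/md0,openflags=o_direct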
- Run each vdbench test case in sequence:
for i in b c d e f; do
  ./vdbench -f input_sd${i}_raid_rw_test -o o_sd${i}_raid_rw_test.tod
done
- Tabulate the results in a spreadsheet such as Excel and generate graphs to visualise the data. Establish the performance trend and look for any exceptions. One should normally expect to see healthy volumes performing within +/-5% of one another.
- If any exceptions are discovered, examine the affected LUN to identify the root cause. If a hardware fault has been identified, replace the affected component.
- If one or more disk drives have been replaced, re-run vdbench against the replacement device(s). Note that it may be necessary to destroy the RAID volume in order to re-run the vdbench test case for individual drives. When individual testing is complete, re-assemble the RAID volume and re-run the benchmark for RAID LUNs.
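Where Linux MD RAID is in use, tearing down the volume for individual-drive testing might look like the following sketch (hypothetical device names; array-based LUNs are instead destroyed and re-created through the vendor's management tools):
mdadm --stop /dev/md0
mdadm --zero-superblock /dev/sd[b-k]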
- Finalise the results and record in the spreadsheet.
Notes
VDBench Test Definition Files
In a vdbench input template, there are three main sections of importance:
- SD: storage definition
- WD: workload definition
- RD: run definition
Definitions must be recorded in the template in the specific order listed above, i.e. SD, then WD and finally RD.
Each definition is contained on a single line. Continuation over multiple lines can be managed by using the standard shell continuation character '\' (backslash) at the end of the line. The continuation character must be immediately preceded by whitespace and must be the last character on the line.
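For example, following the continuation rule described above, a storage definition could be split across two lines (note the whitespace before the backslash):
sd=sdb,lun=/dev/sdb, \
    openflags=o_direct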
SD: Storage Definition
SD, the storage definition, is used to define the characteristics of the disk or LUN to be tested, e.g.:
sd=sdb,lun=/dev/sdb,openflags=o_direct
| Parameter | Description |
|---|---|
| sd=sdb | Marks the start of a storage definition. sdb is an arbitrary label and must be unique within the definition file. Using the device name is recommended; the WWID or another identifier that can be uniquely associated with the device under test is also suitable. Multiple storage definitions can be listed, one per line. |
| lun=/dev/sdb | The path to the storage target. |
| openflags=o_direct | Additional controls or options for opening or closing LUNs or files. Note: on Linux, o_direct must be specified when the target is a raw block device. |
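For example, a template covering several devices would list one storage definition per line (shown here only to illustrate the syntax; the procedures above deliberately use one template per device):
sd=sdb,lun=/dev/sdb,openflags=o_direct
sd=sdc,lun=/dev/sdc,openflags=o_direct
sd=sdd,lun=/dev/sdd,openflags=o_direct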
WD: Workload Definition
WD, the workload definition, describes the test characteristics for a given storage definition:
wd=wd1,sd=sd*,xfersize=1024k,rdpct=100,seekpct=sequential
wd=wd2,sd=sd*,xfersize=1024k,rdpct=0,seekpct=sequential
In the above examples, wd1 represents a 100% sequential read workload and wd2 represents a 100% sequential write workload.
| Parameter | Description |
|---|---|
| wd=wd1 | Marks the start of a workload definition. Must appear after the storage definitions and before any run definitions. wd1 is an arbitrary label and must be unique within the definition file. It is recommended that the workload definition name reflect the type of test, e.g. sdX_wd_r_seq (where sdX is the device label, wd stands for workload definition, r is a read workload, and seq means a sequential workload). |
| sd=sd* | The name of the storage definitions to use. There can be more than one, e.g. sd=(sd1,sd2). An asterisk (*) acts as a wildcard; in this example, sd* refers to all storage definitions whose label begins with sd. |
| xfersize=1024k | The data transfer size distribution. Normally use 1024k for Lustre workloads. |
| rdpct=100 | The percentage of read transactions in the workload. 0 indicates no reads (in other words, a 100% write workload); 100 indicates a 100% read workload. |
| seekpct=sequential | The percentage of random seeks in the workload. 0 or sequential indicates no random seeks; 100 or random means every I/O goes to a random seek address. |
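As a further illustration of these parameters (not part of the baseline procedure), a hypothetical mixed workload of 70% reads and 30% writes at random addresses across two devices could be defined as:
wd=wd_mixed_rand,sd=(sdb,sdc),xfersize=1024k,rdpct=70,seekpct=100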
RD: Run Definition
RD, the run definition, specifies the workload definition to run, the IO rates to generate and how long to run for.
rd=run1,wd=wd1,iorate=max,elapsed=100,interval=10,warmup=20,forthreads=(1-1024,d)
rd=run2,wd=wd2,iorate=max,elapsed=100,interval=10,warmup=20,forthreads=(1-1024,d)
| Parameter | Description |
|---|---|
| rd=run1 | Marks the start of a run definition. Must appear after the storage definitions and workload definitions. run1 is an arbitrary label, unique within the definition file. It is recommended that the run definition name reflect the test characteristics, e.g. sdX_run_r_seq_iomax (sdX is the device label, run means run definition, r is a read workload, seq means sequential, and iomax means maximum I/O rate). |
| wd=wd1 | The workload definition(s) to use. Normally select just one. |
| elapsed=100 | The time, in seconds, for each run. Must be at least 2x the reporting interval. Does not include any warm-up time, if specified (total run time will be elapsed time plus warm-up). |
| interval=10 | The reporting interval, i.e. the number of seconds between each report update. |
| warmup=20 | The time to wait before recording results in the run total. Must be a multiple of the reporting interval. In the above example, the first two reports will not be recorded in the overall results. The warm-up intervals are still reported in the output but do not form part of the overall result. |
| forthreads=(1-1024,d) | Creates a loop that generates results for different thread counts. (1-1024,d) represents a range from 1 to 1024 threads, doubling the thread count on each iteration (i.e. 1, 2, 4, 8, 16, ... 1024). |
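In the examples above, iorate=max runs the workload at the maximum rate the storage can sustain; a fixed rate such as iorate=1000 may be used instead to test at a controlled number of I/Os per second. Note also that forthreads=(1-1024,d) repeats each run for 11 thread counts (1, 2, 4, ..., 1024), so with elapsed=100 and warmup=20 a single run definition takes roughly 11 x 120 seconds, or about 22 minutes, excluding start-up overhead between runs.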
Running VDBench
When executing the vdbench command, use the -f flag to refer to the input test definition and -o to refer to the output directory that will contain the results, e.g.:
./vdbench -f input_sdb_rw_test -o o_sdb_rw_test.tod
The input definition file name should conform to the following format:
input_<device>_<test type>_test
e.g.:
input_sdb_rw_test
The output directory name should conform to the following format:
o_<device>_<test type>_test.tod
e.g.:
o_sdb_rw_test.tod
The suffix, .tod, instructs VDBench to add the date and time of day as a suffix to the output directory name. This helps to prevent test results being overwritten on repeat test runs.