SGPDD Survey
Latest revision as of 08:56, 2 December 2024
Description
sgpdd-survey is an IO workload generator for benchmarking the performance of disk storage. It generates a large sequential IO workload on the target storage devices. It is a wrapper for the sgp_dd (SCSI Generic Parallel DD) command found in the SCSI device utilities (sg3_utils) package.
sgp_dd is a scalable version of the "dd" command with options for multi-threaded IO with different block sizes and region sizes. From the sgp_dd(8) man page:
"[sgp_dd is] specialised for "files" that are Linux SCSI generic (sg) and raw devices. Similar syntax and semantics to dd(1) but does not perform any conversions. Uses POSIX threads to increase the amount of parallelism. This improves speed in some cases."
Purpose
SGPDD-survey is intended to evaluate the raw performance of all LUNs in the storage arrays attached to a server. The benchmark is useful in testing the overall throughput of a storage controller and is typically run after establishing the baseline performance of the individual drives and/or LUNs attached through the controller.
The logic behind this process flow is to determine whether or not the storage controller or HBA or some other higher-level component within the server platform introduces a performance bottleneck, at least for bandwidth. One can normally derive the theoretical bandwidth of the data path from the hardware specifications. Low-level specifications for the disk drives and the I/O paths will determine the theoretical maximum bandwidth of the storage subsystem. One must consider the bandwidth of the PCIe bus, HBA, storage controller and the individual storage devices (disk drives and SSDs). The slowest component in the IO path will determine the maximum throughput that can be obtained by the system overall. The sgpdd-survey benchmark can help to determine how close to the theoretical bandwidth the system is operating and highlight any significant deviation from the ideal.
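The bottleneck reasoning above can be sketched as a small calculation: the theoretical ceiling is the minimum of the component bandwidths, and a measured result is then expressed as a percentage of that ceiling. This is only an illustration. The function names and every bandwidth figure below are hypothetical and are not taken from any particular hardware or from sgpdd-survey itself.

```shell
#!/bin/sh
# Sketch: the theoretical ceiling of a data path is the bandwidth of its
# slowest component. All figures below are hypothetical, in MB/s.

# Print the smallest of the integer arguments.
min_bandwidth() {
    min=$1
    shift
    for bw in "$@"; do
        if [ "$bw" -lt "$min" ]; then
            min=$bw
        fi
    done
    echo "$min"
}

# Print measured throughput as a whole percentage of the ceiling.
percent_of_ceiling() {
    measured=$1
    ceiling=$2
    echo $(( measured * 100 / ceiling ))
}

# Hypothetical limits: PCIe slot, HBA, storage controller, aggregate drives.
ceiling=$(min_bandwidth 7880 4800 6000 5200)
echo "theoretical ceiling: ${ceiling} MB/s"
echo "measured 4320 MB/s is $(percent_of_ceiling 4320 "$ceiling")% of the ceiling"
```

With these made-up figures the HBA (4800 MB/s) is the limiting component, so a result far below that value would point to a problem elsewhere in the path.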
sgpdd-survey is typically run within an individual host across all LUNs presented to that host. In high availability (HA) server configurations, where the storage will be presented in an active-passive failover configuration, the benchmark is normally run against the "primary" storage targets for each server only. That is, the LUNs are assumed to be mapped to their preferred servers.
Preparation
The sgpdd-survey script is distributed in the lustre-iokit package, and requires the sgp_dd program from the sg3_utils package. Install the lustre-iokit package on each of the machines that will be evaluated by the benchmark. On RHEL / CentOS systems, yum will automatically resolve additional dependencies.
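Before running the benchmark it can be useful to confirm that both programs are on the PATH. The following is a minimal sketch that assumes the packages install the commands under the names sgpdd-survey and sgp_dd. The check_prereqs helper is hypothetical and is not part of lustre-iokit.

```shell
#!/bin/sh
# Report any of the named commands that are missing from PATH.
# Returns 0 when all are present, 1 otherwise.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

if check_prereqs sgpdd-survey sgp_dd; then
    echo "prerequisites found"
else
    echo "install lustre-iokit and sg3_utils before running the benchmark"
fi
```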
Configure all storage volumes into their production configuration and present the LUNs for use on the target host.
Benchmark Execution
The sgpdd-survey script takes its parameters from environment variables established at run time. The parameters that are of most interest are as follows (refer to the section SGPDD-Survey Input Parameters for a detailed breakdown of the parameters and how to calculate suitable values):
Name | Description | Typical Values |
---|---|---|
crglo | Initial number of concurrent regions | 1 |
crghi | Maximum number of concurrent regions | 256 |
thrlo | Initial number of threads | 1 |
thrhi | Maximum number of threads | 4096 |
size | Data set size in MiB per storage device (LUN) | 2.5 * size(RAM) / count(LUNs) |
rslt_loc | Directory to contain results. Must exist before benchmark is run | /var/tmp/sgpdd-survey_out |
scsidevs | List of storage targets to test. Space separated list enclosed in quotes; each target has the format <hostname>:<device path>. Must choose one of scsidevs or rawdevs but not both | Example: |
rawdevs | List of raw devices to test. Space separated list enclosed in quotes; each target has the format <hostname>:<device path>. Must choose one of scsidevs or rawdevs but not both | Example: |
The scsidevs option is typically chosen for hardware RAID storage systems, whereas the rawdevs option is typically required for software RAID devices.
The following is an example command line for executing sgpdd-survey:
mkdir -p /var/tmp/sgpdd-survey_out
crglo=1 crghi=256 \
thrlo=1 thrhi=4096 \
size=51200 \
rslt_loc=/var/tmp/sgpdd-survey_out \
scsidevs="ct7-oss1:/dev/sdb ct7-oss1:/dev/sdc ct7-oss1:/dev/sdd" \
sgpdd-survey
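Two of the rules stated for these parameters are easy to get wrong: the results directory must already exist, and exactly one of scsidevs or rawdevs may be set. A pre-flight check along the following lines could catch both before any IO is issued. This is a sketch only. validate_params is a hypothetical helper, not something sgpdd-survey provides, and the temporary directory in the demonstration stands in for a real results directory.

```shell
#!/bin/sh
# Validate the benchmark environment before launching sgpdd-survey.
# Prints a reason to stderr and returns 1 on the first problem found.
validate_params() {
    if [ -z "${rslt_loc:-}" ] || [ ! -d "$rslt_loc" ]; then
        echo "rslt_loc must name an existing directory" >&2
        return 1
    fi
    if [ -n "${scsidevs:-}" ] && [ -n "${rawdevs:-}" ]; then
        echo "set scsidevs or rawdevs, not both" >&2
        return 1
    fi
    if [ -z "${scsidevs:-}" ] && [ -z "${rawdevs:-}" ]; then
        echo "set one of scsidevs or rawdevs" >&2
        return 1
    fi
    return 0
}

# Demonstration in a subshell so the variables do not leak into the caller.
demo_dir=$(mktemp -d)
if ( rslt_loc=$demo_dir; scsidevs="ct7-oss1:/dev/sdb"; rawdevs=""; validate_params ); then
    echo "parameters look consistent"
fi
rmdir "$demo_dir"
```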
SGPDD-Survey Input Parameters
Name | Description | Typical Values |
---|---|---|
crglo, crghi | The range of concurrent regions to exercise on each iteration, per device. Starting at crglo, the number of concurrent regions is doubled on each iteration until crghi is reached. More regions mean less performance, as this will increase the amount of seeking the storage devices need to do. | crglo=1 |
thrlo, thrhi | The range of threads to iterate over. For each value of the concurrent region range, run sgp_dd with the number of threads starting at thrlo until thrhi is reached. The thread count is doubled on every iteration of the thread count loop. The starting thread count is either thrlo or crglo, whichever is greater. | thrlo=1 |
size | The data set size in MiB per storage device (LUN). The total dataset size for the entire benchmark is calculated as the LUN dataset size multiplied by the number of LUNs in the benchmark: ds_total = size * count(LUNs). For a full benchmarking run, set size = ds_total / count(LUNs), where ds_total is 2.5 times the RAM of the server. As an example, for a server with 32GB RAM and 4 LUNs, the input size is calculated as follows: size = (32GB * 2.5) / 4 LUNs = (32768MiB * 2.5) / 4 = 81920 / 4 = 20480 | 2.5 * sizeof(RAM) / count(LUNs), in MiB e.g.: |
rslt_loc | Directory to contain results. Must exist before benchmark is run. | rslt_loc=/var/tmp/sgpdd-survey_out |
scsidevs | List of storage targets to test. Space separated list enclosed in quotes; each target has the format <hostname>:<device path>. The host component is optional and can be used when sgpdd-survey is scaled across multiple servers (ref: LU-2043). Normally, sgpdd-survey benchmarking tasks are contained within individual hosts. Must choose one of scsidevs or rawdevs but not both. | Example: |
rawdevs | List of raw devices to test. Space separated list enclosed in quotes; each target has the format <hostname>:<device path>. When benchmarking software RAID, use the raw command to map the MD RAID device path to a raw device. See the MD RAID and raw devices section for more detailed information. Must choose one of scsidevs or rawdevs but not both. | Example: |
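The size formula and the doubling behaviour of the region and thread ranges described in the table can be sketched with two small helpers. Both function names are hypothetical, and the code only illustrates the arithmetic described in the table; it is not taken from the sgpdd-survey script. Integer arithmetic is used, so the factor 2.5 is applied as "multiply by 5, divide by 2".

```shell
#!/bin/sh
# Per-LUN data set size in MiB: 2.5 * RAM / number of LUNs.
# Arguments: RAM in MiB, number of LUNs.
calc_size() {
    ram_mib=$1
    lun_count=$2
    echo $(( ram_mib * 5 / 2 / lun_count ))
}

# Print lo, 2*lo, 4*lo, ... for as long as the value does not exceed hi.
# Refuses a starting value below 1, which would never terminate.
doubling_range() {
    value=$1
    hi=$2
    if [ "$value" -lt 1 ]; then
        echo "starting value must be at least 1" >&2
        return 1
    fi
    while [ "$value" -le "$hi" ]; do
        echo "$value"
        value=$(( value * 2 ))
    done
}

# Worked example from the table: 32768 MiB of RAM and 4 LUNs.
echo "size=$(calc_size 32768 4)"

# Values visited for crglo=1 crghi=256, printed on one line.
doubling_range 1 256 | tr '\n' ' '
echo
```

The first line of output reproduces the worked example (20480 MiB per LUN); the second lists the nine concurrent-region counts visited between 1 and 256.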
MD RAID and Raw Devices
Software RAID volumes created with MD RAID should be submitted to the benchmark using the rawdevs input parameter, not scsidevs. For this to work, bind the MD RAID block device to a raw device using the raw command, for example:
raw /dev/raw/raw1 /dev/md0
This will create a raw binding for the MD RAID device /dev/md0 to /dev/raw/raw1. The raw device name is arbitrary, but take care not to overwrite an existing binding, as the binding will be replaced without issuing a warning. One can also specify the major and minor numbers of the block device to be mapped. To list the existing raw bindings:
raw -qa
To remove a binding, map the raw device to major and minor device numbers 0 0, e.g.:
raw /dev/raw/raw1 0 0
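When several MD RAID devices are benchmarked together, the bindings and the matching rawdevs value can be generated rather than typed by hand. The sketch below only prints the raw commands so that they can be reviewed against the output of raw -qa before being run, since an existing binding is replaced silently. The helper names, the second device /dev/md1 and the sequential raw1, raw2 numbering are assumptions made for illustration.

```shell
#!/bin/sh
# Print (do not run) the raw commands that would bind each MD device to
# /dev/raw/raw1, /dev/raw/raw2, ... in argument order.
print_raw_bindings() {
    index=1
    for md_dev in "$@"; do
        echo "raw /dev/raw/raw${index} ${md_dev}"
        index=$(( index + 1 ))
    done
}

# Build a rawdevs value in <hostname>:<device path> format.
# Arguments: host name, number of raw devices.
build_rawdevs() {
    host=$1
    count=$2
    index=1
    list=""
    while [ "$index" -le "$count" ]; do
        list="${list:+$list }${host}:/dev/raw/raw${index}"
        index=$(( index + 1 ))
    done
    echo "$list"
}

print_raw_bindings /dev/md0 /dev/md1
echo "rawdevs=\"$(build_rawdevs ct7-oss1 2)\""
```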