Testing HOWTO
Revision by Justinmiller (added additional application section)
Revision as of 08:18, 2 September 2015
This HOWTO is intended to demonstrate the basics of configuring and running a small subset of tests on a multi-node configuration. There are many tests available in the suite, and the principles demonstrated here will apply to them all, but this HOWTO will focus on the sanity test.
Note that the Lustre test suite ships with a configuration, local.sh, that lets you test Lustre on a single node using loopback devices with no additional configuration. This works well and tests the Lustre software, but the purpose of this HOWTO is to demonstrate using multiple servers and clients to test more Lustre features in an environment representative of a real installation.
Although these examples use virtual machines, the specifics should be easy to apply to real hardware once the prerequisites are set up.
This HOWTO uses CentOS 7.1 with Lustre 2.8-RC.
System Configuration
This HOWTO uses a cluster of six virtual machines to run the Lustre tests: two clients, two MDS nodes, and two OSS nodes. This enables testing of a wide variety of Lustre features.
- node01 - MGS and MDS - 192.168.1.101
  - 512MB MGT - /dev/sdb
  - 1GB MDT - /dev/sdc
- node02 - MDS - 192.168.1.102
  - 1GB MDT - /dev/sdb
- node03 - OSS - 192.168.1.103
  - Four 16GB OST - /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde
- node04 - OSS - 192.168.1.104
  - Four 16GB OST - /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde
- node05 - client - 192.168.1.105
  - 16GB shared directory for tests and results
- node06 - client - 192.168.1.106
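The test configuration refers to these nodes by hostname, so every node has to be able to resolve node01 through node06. One way to arrange that, assuming the test network has no DNS, is to add entries to /etc/hosts on each node. The sketch below only prints the entries; the function name hosts_entries is illustrative and not part of Lustre.

```shell
#!/bin/bash
# Print /etc/hosts entries for the six nodes listed above
# (node01..node06 on 192.168.1.101..192.168.1.106).
hosts_entries() {
    local i
    for i in 1 2 3 4 5 6; do
        printf '192.168.1.10%d node0%d\n' "$i" "$i"
    done
}

hosts_entries
```

To apply it, the output could be appended on each node as root, for example with `hosts_entries >> /etc/hosts`.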
System Setup
- Install the same version of Lustre on all nodes
  - Follow this guide to install the Lustre RPMs: Walk-thru - Deploying Lustre pre-built RPMs (https://wiki.hpdd.intel.com/display/PUB/Walk-thru-+Deploying+Lustre+pre-built+RPMs)
- Disable SELinux
  - Set SELINUX=disabled in /etc/sysconfig/selinux
- Disable the firewall
  - systemctl stop firewalld.service && systemctl disable firewalld.service
- Generate passwordless ssh keys and exchange identities across all nodes, accepting the host keys as you go
- Install the epel-release package to enable the EPEL repo
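The setup steps above can be sketched as a small script. This is a sketch under assumptions, not a tested procedure: it assumes the tests run as root, that the six hostnames resolve, and that the test network is trusted enough to accept host keys on first contact. The helper names run, setup_node and exchange_keys are hypothetical. It is a dry run by default, so it only prints the commands it would execute.

```shell
#!/bin/bash
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 and run as root to apply the changes.
DRY_RUN=${DRY_RUN:-1}
NODES="node01 node02 node03 node04 node05 node06"

run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Per-node preparation; run this on every node.
setup_node() {
    # /etc/sysconfig/selinux is a symlink on CentOS 7, so ask GNU sed
    # to edit the link target instead of replacing the link.
    run sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux
    run systemctl stop firewalld.service
    run systemctl disable firewalld.service
    run yum install -y epel-release
}

# Key exchange; run this on each node that needs to reach the others.
exchange_keys() {
    local node
    # Create a key with an empty passphrase unless one already exists.
    [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    for node in $NODES; do
        # Assumption: accepting the host key automatically is acceptable here.
        run ssh-copy-id -o StrictHostKeyChecking=no "root@$node"
    done
}

setup_node
exchange_keys
```

The SELinux change takes effect after a reboot.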
Installing Additional Applications for Testing
- Install PDSH - mandatory
  - Available from the Intel HPDD EL7 builds of the toolkit - https://build.hpdd.intel.com/job/toolkit/arch=x86_64,distro=el7/lastSuccessfulBuild/artifact/_topdir/RPMS/x86_64/
- Install dbench (https://dbench.samba.org/) - optional, used to gather baseline metrics of the disks used in testing
  - Available from the EPEL repo, or from the Intel HPDD EL7 builds of the toolkit - https://build.hpdd.intel.com/job/toolkit/arch=x86_64,distro=el7/lastSuccessfulBuild/artifact/_topdir/RPMS/x86_64/
- Install iozone (http://www.iozone.org/) - optional, used to gather baseline metrics of filesystem performance
  - Available from the Intel HPDD EL7 builds of the toolkit - https://build.hpdd.intel.com/job/toolkit/arch=x86_64,distro=el7/lastSuccessfulBuild/artifact/_topdir/RPMS/x86_64/
- Install Parallel IO Simulator (PIOS) - optional, used to simulate shared-file and file-per-process IO
  - Available from the Intel HPDD EL6 builds of the toolkit - https://build.hpdd.intel.com/job/toolkit/arch=x86_64,distro=el6/lastSuccessfulBuild/artifact/_topdir/RPMS/x86_64/
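After installing the packages, it can help to confirm on each node which tools are actually on the PATH. The following is a minimal sketch; the binary names (pdsh, dbench, iozone, pios) are assumptions based on the package names, and check_tool and check_tools are hypothetical helpers.

```shell
#!/bin/bash
# Report whether a tool is on PATH, tagged with how important it is.
check_tool() {
    local name=$1 need=$2
    if command -v "$name" >/dev/null 2>&1; then
        echo "$name: found ($need)"
    else
        echo "$name: missing ($need)"
    fi
}

# The four tools from the list above; binary names are assumed.
check_tools() {
    check_tool pdsh mandatory
    check_tool dbench optional
    check_tool iozone optional
    check_tool pios optional
}

check_tools
```

Once pdsh and the ssh keys are in place, a command such as `pdsh -S -Rssh -w node[01-06] uname -r` is one way to confirm that every node is reachable.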
Test Configuration
Install this configuration file as /usr/lib64/lustre/tests/cfg/multinode.sh:
# Enables verbose acc-sm output.
VERBOSE=${VERBOSE:-"false"}

# File system configuration
FSNAME="testfs"
FSTYPE=ldiskfs

# Network configuration
NETTYPE="tcp"

# facet hosts
mds_HOST=${mds_HOST:-node01}
mgs_HOST=${mgs_HOST:-$mds_HOST}
ost_HOST=${ost_HOST:-node03}

# MDS and MDT configuration
# MDSDEV1 and MDSDEV2 must name the MDT block device on each MDS node
# (the node list above shows /dev/sdc on node01 and /dev/sdb on node02).
MDSCOUNT=2
SINGLEMDS=${SINGLEMDS:-mds1}

MDSSIZE=8589000
MDS_FS_MKFS_OPTS=${MDS_FS_MKFS_OPTS:-}
MDS_MOUNT_OPTS=${MDS_MOUNT_OPTS:-}
MDSFSTYPE=ldiskfs

mds1_HOST="node01"
MDSDEV1="/dev/sdd"
mds1_MOUNT="/mnt/testfs/mdt1"
mds1_FSTYPE=ldiskfs

mds2_HOST="node02"
MDSDEV2="/dev/sdd"
mds2_MOUNT="/mnt/testfs/mdt2"
mds2_FSTYPE=ldiskfs

# MGS and MGT configuration
mgs_HOST=${mgs_HOST:-"$mds_HOST"} # combination mgs/mds
MGSOPT=${MGSOPT:-}
MGS_FS_MKFS_OPTS=${MGS_FS_MKFS_OPTS:-}
MGS_MOUNT_OPTS=${MGS_MOUNT_OPTS:-}
MGSFSTYPE=ldiskfs

MGSDEV="/dev/sdb"
MGSSIZE=536000
mgs_MOUNT="/mnt/testfs/mgt"
MGSNID="192.168.1.101@tcp"
mgs_FSTYPE=ldiskfs

# OSS and OST configuration
OSTCOUNT=${OSTCOUNT:-8}
OSTSIZE=${OSTSIZE:-16777216}
OSTFSTYPE=ldiskfs

ost1_HOST="node03"
OSTDEV1="/dev/sdb"
ost1_MOUNT="/mnt/testfs/ost1"
ost1_FSTYPE=ldiskfs

ost2_HOST="node03"
OSTDEV2="/dev/sdc"
ost2_MOUNT="/mnt/testfs/ost2"
ost2_FSTYPE=ldiskfs

ost3_HOST="node03"
OSTDEV3="/dev/sdd"
ost3_MOUNT="/mnt/testfs/ost3"
ost3_FSTYPE=ldiskfs

ost4_HOST="node03"
OSTDEV4="/dev/sde"
ost4_MOUNT="/mnt/testfs/ost4"
ost4_FSTYPE=ldiskfs

ost5_HOST="node04"
OSTDEV5="/dev/sdb"
ost5_MOUNT="/mnt/testfs/ost5"
ost5_FSTYPE=ldiskfs

ost6_HOST="node04"
OSTDEV6="/dev/sdc"
ost6_MOUNT="/mnt/testfs/ost6"
ost6_FSTYPE=ldiskfs

ost7_HOST="node04"
OSTDEV7="/dev/sdd"
ost7_MOUNT="/mnt/testfs/ost7"
ost7_FSTYPE=ldiskfs

ost8_HOST="node04"
OSTDEV8="/dev/sde"
ost8_MOUNT="/mnt/testfs/ost8"
ost8_FSTYPE=ldiskfs

# OST striping configuration
STRIPE_BYTES=${STRIPE_BYTES:-1048576}
STRIPES_PER_OBJ=${STRIPES_PER_OBJ:-0}

# Client configuration
CLIENTCOUNT=2
CLIENTS="node05,node06"
CLIENT1="node05"
CLIENT2="node06"
RCLIENTS="node06"

MOUNT="/mnt/testfs"
MOUNT1="/mnt/testfs"
MOUNT2="/mnt/testfs2"
DIR=${DIR:-$MOUNT}
DIR1=${DIR:-$MOUNT1}
DIR2=${DIR2:-$MOUNT2}

# UID and GID configuration
# Used by several tests to set the UID and GID
if [ $UID -ne 0 ]; then
    log "running as non-root uid $UID"
    RUNAS_ID="$UID"
    RUNAS_GID=$(id -g $USER)
    RUNAS=""
else
    RUNAS_ID=${RUNAS_ID:-500}
    RUNAS_GID=${RUNAS_GID:-$RUNAS_ID}
    RUNAS=${RUNAS:-"runas -u $RUNAS_ID -g $RUNAS_GID"}
fi

# Software configuration
PDSH="/usr/bin/pdsh -S -Rssh -w"
FAILURE_MODE=${FAILURE_MODE:-SOFT} # or HARD
POWER_DOWN=${POWER_DOWN:-"powerman --off"}
POWER_UP=${POWER_UP:-"powerman --on"}
SLOW=${SLOW:-no}
FAIL_ON_ERROR=${FAIL_ON_ERROR:-true}

# Debug configuration
#PTLDEBUG=${PTLDEBUG:-"vfstrace rpctrace dlmtrace neterror ha config \
PTLDEBUG=${PTLDEBUG:-"vfstrace dlmtrace neterror ha config \
                      ioctl super lfsck"}
SUBSYSTEM=${SUBSYSTEM:-"all -lnet -lnd -pinger"}

# Lustre timeout
TIMEOUT=${TIMEOUT:-"30"}

# promise 2MB for every cpu
if [ -f /sys/devices/system/cpu/possible ]; then
    _debug_mb=$((($(cut -d "-" -f 2 /sys/devices/system/cpu/possible)+1)*2))
else
    _debug_mb=$(($(getconf _NPROCESSORS_CONF)*2))
fi

DEBUG_SIZE=${DEBUG_SIZE:-$_debug_mb}
DEBUGFS=${DEBUGFS:-"/sbin/debugfs"}
TMP=${TMP:-/tmp}
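A quick consistency check of the installed file can catch a mismatch between MDSCOUNT or OSTCOUNT and the per-target entries before any test is run. The sketch below is illustrative only: count_facets, check_cfg and gib_to_kb are hypothetical helpers, the file is scanned with grep rather than sourced, and the size conversion assumes the *SIZE variables are in KB, as the value 16777216 for a 16GB OST suggests.

```shell
#!/bin/bash
# Sanity-check a test configuration without sourcing it.
CFG=${CFG:-/usr/lib64/lustre/tests/cfg/multinode.sh}

# gib_to_kb N: size of N GiB expressed in KB (assumed unit of OSTSIZE etc.).
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# count_facets PREFIX FILE: count per-target lines such as ost3_HOST=...
# Lines without a number (for example ost_HOST=) are not counted.
count_facets() {
    grep -c -E "^$1[0-9]+_HOST=" "$2" || true
}

# check_cfg FILE: print how many MDS and OST targets the file defines.
check_cfg() {
    local cfg=$1
    echo "mds facets: $(count_facets mds "$cfg"), ost facets: $(count_facets ost "$cfg")"
}

if [ -r "$CFG" ]; then
    check_cfg "$CFG"
fi
echo "16 GiB OST = $(gib_to_kb 16) KB"
```

For the configuration above, the reported counts should agree with MDSCOUNT=2 and OSTCOUNT=8.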