Lustre with ZFS Install

Introduction
This page describes one way to install Lustre with a ZFS backend. You are encouraged to add your own version, either as a separate section or by editing this page into a general guide.

Helpful links

 * http://zfsonlinux.org/lustre-configure-single.html
 * https://github.com/chaos/lustre/commit/04a38ba7 - ZFS and HA

SSEC Example
This version applies to systems with JBODs, where ZFS manages the disks directly without a Dell RAID controller in between. The guide is specific to a single installation at UW SSEC: versions have changed, and we use puppet to provide various software packages and configurations. However, it is included here because some of the information may be useful to others.


Lustre Server Prep Work

 * 1) OS installation (RHEL 6)
    * You must use the RHEL/CentOS 6.4 kernel, 2.6.32-358.
    * Use the "lustre" kickstart option, which installs a 6.4 kernel.
    * Define the host in puppet so that it is not a default host. NOTE: we use Puppet at SSEC to distribute various required packages; other environments will vary!
 * 2) Lustre 2.4 installation
    * Puppet modules needed: zfs-repo, lustre-healthcheck, ib-mellanox, check_mk_agent-ssec, puppetConfigFile, lustre-shutdown, nagios_plugins, lustre24-server-zfs, selinux-disable

Configure Metadata Controller

 * 1) Map the metadata drives to enclosures (with scripts to help). For our example MDS system we made aliases for ssd0, ssd1, ssd2, and ssd3. Put these in /etc/zfs/vdev_id.conf, for example:
    * alias arch03e07s6 /dev/disk/by-path/pci-0000:04:00.0-sas-0x5000c50056b69199-lun-0
 * 2) Run udevadm trigger to load the drive aliases.
 * 3) On the metadata controller, run mkfs.lustre to create the metadata partition. Use a separate MGS if the metadata server will host multiple filesystems. On our example system:
    * Separate MGS: mkfs.lustre --mgs --backfstype=zfs lustre-meta/mgs mirror d2 d3 mirror d4 d5
    * Separate MDT: mkfs.lustre --fsname=arcdata1 --mdt --mgsnode=172.16.23.14@o2ib --backfstype=zfs lustre-meta/arcdata1-meta
 * 4) Create /etc/ldev.conf and add the metadata partition. On the example system, we added:
    * geoarc-2-15 - MGS zfs:lustre-meta/mgs
    * geoarc-2-15 - arcdata-MDT0000 zfs:lustre-meta/arcdata-meta
 * 5) Create /etc/modprobe.d/lustre.conf containing:
    * options lnet networks="o2ib" routes="tcp metadataip@o2ib0 172.16.24.[220-229]@o2ib0"
    * NOTE: if you do not want routing, or if you are having trouble with setup, the simpler options lnet networks="o2ib" is fine.
 * 6) Start Lustre. If you have multiple metadata mounts, you can just run service lustre start.
 * 7) Add the lnet service to chkconfig so it starts on boot. We may want to leave lustre off on startup for metadata controllers.
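The drive-alias step above can be sketched as a small POSIX shell helper that emits /etc/zfs/vdev_id.conf lines. The alias and by-path names are the example values from this page; on a real system you would add one line per drive using the WWN-based paths found under /dev/disk/by-path:

```shell
#!/bin/sh
# Sketch: emit /etc/zfs/vdev_id.conf alias lines of the form
#   alias <name> /dev/disk/by-path/<path>
# so ZFS can refer to drives by enclosure/slot instead of raw device names.

emit_alias() {
    # $1 = alias name, $2 = by-path identifier
    printf 'alias %s /dev/disk/by-path/%s\n' "$1" "$2"
}

# Example entry from this page; add one call per drive on your system:
emit_alias arch03e07s6 pci-0000:04:00.0-sas-0x5000c50056b69199-lun-0
```

After writing the generated lines to /etc/zfs/vdev_id.conf, run udevadm trigger; the aliases then appear under /dev/disk/by-vdev/.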


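The /etc/ldev.conf entries used on both the metadata controller and the OSS nodes share one format, which can be sketched as a small helper. The hostnames and datasets below are this page's examples; "-" in the second field means no failover partner is configured:

```shell
#!/bin/sh
# Sketch: build /etc/ldev.conf lines in the format the lustre init script reads:
#   <local hostname> <failover hostname or "-"> <label> zfs:<pool>/<dataset>

ldev_line() {
    # $1 = local host, $2 = service label, $3 = zfs pool/dataset
    printf '%s - %s zfs:%s\n' "$1" "$2" "$3"
}

# Example entries from this page's metadata controller:
ldev_line geoarc-2-15 MGS lustre-meta/mgs
ldev_line geoarc-2-15 arcdata-MDT0000 lustre-meta/arcdata-meta
```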
Configure OSTs

 * 1) Map the drives to enclosures (with scripts to help!).
 * 2) Run udevadm trigger to load the drive aliases.
 * 3) Run mkfs.lustre on the MD1200s. Repeat as necessary for additional enclosures.
    * Example RAIDZ2 on one MD1200: mkfs.lustre --fsname=cove --ost --backfstype=zfs --index=0 --mgsnode=172.16.24.12@o2ib lustre-ost0/ost0 raidz2 e17s0 e17s1 e17s2 e17s3 e17s4 e17s5 e17s6 e17s7 e17s8 e17s9 e17s10 e17s11
    * Example RAIDZ2 with 2 disks from each of 5 enclosures (our cove test example): mkfs.lustre --fsname=cove --ost --backfstype=zfs --index=0 --mgsnode=172.16.24.12@o2ib lustre-ost0/ost0 raidz2 e13s0 e13s1 e15s0 e15s1 e17s0 e17s1 e19s0 e19s1 e21s0 e21s1
 * 4) Create /etc/ldev.conf. Example on lustre2-8-11:
    * lustre2-8-11 - cove-OST0000 zfs:lustre-ost0/ost0
    * lustre2-8-11 - cove-OST0001 zfs:lustre-ost1/ost1
    * lustre2-8-11 - cove-OST0002 zfs:lustre-ost2/ost2
 * 5) Start the OSTs, for example with service lustre start. Repeat as necessary for additional enclosures.
 * 6) Add the services to chkconfig so they start on boot.
 * 7) Configure a backup metadata controller (future).

Mount the Lustre File System on Clients

 * 1) Add an entry to /etc/fstab. On our example system, the entry is:
    * 172.16.24.12@o2ib:/cove /cove lustre defaults,_netdev,user_xattr 0 0
 * 2) Create an empty directory for the mountpoint, then mount the file system (e.g., mkdir /cove; mount /cove).
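The client-mount step above can be sketched as a helper that composes the fstab line from its three variable parts. The MGS NID, fsname, and mountpoint are the example values from this page; substitute your own:

```shell
#!/bin/sh
# Sketch: compose an /etc/fstab line for a Lustre client mount:
#   <MGS NID>:/<fsname>  <mountpoint>  lustre  defaults,_netdev,user_xattr  0 0

fstab_entry() {
    # $1 = MGS NID, $2 = fsname, $3 = mountpoint
    printf '%s:/%s %s lustre defaults,_netdev,user_xattr 0 0\n' "$1" "$2" "$3"
}

fstab_entry 172.16.24.12@o2ib cove /cove
# As root, you would append this line to /etc/fstab and then run:
#   mkdir /cove && mount /cove
```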