Compiling Lustre

Introduction

Lustre is an open source software project, developed by a community of software engineers across the world. The project is maintained by its developers and is supported by infrastructure that provides continuous integration, build and test of the patches that add new features, update existing functionality or fix bugs.

The Lustre project's maintainers issue periodic releases of the software, which are extensively tested and qualified for general use. The releases include pre-built binary software packages for supported Linux-based operating system distributions. Many users of Lustre are content to rely upon the binary builds, and in general, it is recommended that the binary distributions from the Lustre project are used. Pre-built binary packages are available for download here:

https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases

There are, of course, times when it is advantageous to be able to compile the Lustre software directly from source code, e.g. to apply a hot-fix patch for a recently uncovered issue, to test a new feature in development, or to allow Lustre to take advantage of a 3rd party device driver (vendor-supplied network drivers, for example).

Limitations

This documentation was originally developed to provide instructions for creating Lustre packages for use with Red Hat Enterprise Linux (RHEL) or CentOS.

Preliminary information on how to compile Lustre for SUSE Linux Enterprise Server (SLES) version 12 service pack 2 (SLES 12 SP2) has also been added. The documentation demonstrates the process for ZFS-based SLES servers, as well as for clients. The processes for compiling Lustre on SLES with OFED or LDISKFS support have not been reviewed.

Other operating system distributions will be added over time.

Note: SUSE Linux will mark self-compiled kernel modules as unsupported by the operating system. By default, SLES will refuse to load kernel modules that do not have the supported flag set. The following is an example of the error that is returned when attempting to load an unsupported kernel module:

sl12sp2-b:~ # modprobe zfs
modprobe: ERROR: module 'zavl' is unsupported
modprobe: ERROR: Use --allow-unsupported or set allow_unsupported_modules 1 in
modprobe: ERROR: /etc/modprobe.d/10-unsupported-modules.conf
modprobe: ERROR: could not insert 'zfs': Operation not permitted
sl12sp2-b:~ # vi /etc/modprobe.d/10-unsupported-modules.conf 

To allow self-compiled kernel modules to be loaded in a SLES OS, add the following entry into /etc/modprobe.d/10-unsupported-modules.conf:

allow_unsupported_modules 1
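
If preferred, the change can be made non-interactively. The following is a minimal sketch, assuming the default layout of 10-unsupported-modules.conf; adjust to suit local policy:

# Switch an existing allow_unsupported_modules line to 1, if present
sudo sed -i 's/^allow_unsupported_modules.*/allow_unsupported_modules 1/' \
/etc/modprobe.d/10-unsupported-modules.conf
# Otherwise, append the setting
grep -q '^allow_unsupported_modules' /etc/modprobe.d/10-unsupported-modules.conf || \
echo 'allow_unsupported_modules 1' | sudo tee -a /etc/modprobe.d/10-unsupported-modules.conf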

More information is available from the SUSE documentation.

Planning

Since Lustre is a network-oriented file system that runs as modules in the Linux kernel, it has dependencies on other kernel modules, including device drivers. One of the most common tasks requiring a new build from source is to allow Lustre kernel modules to work with 3rd party device drivers not distributed by the operating system. For example, the Open Fabrics Enterprise Distribution (OFED) from the Open Fabrics Alliance (OFA) and OFA's partners provides drivers for InfiniBand and RoCE network fabrics, and is probably the single most common reason for recompiling Lustre.

There are several options available to users when creating Lustre packages from source, each of which has an effect on the build process.

On the Lustre servers, one must choose the block storage file system used to store data. Lustre file system data is contained on block storage file systems distributed across a set of storage servers. The back-end block storage is abstracted by an API called the Object Storage Device, or OSD. The OSD enables Lustre to use different back-end file systems for persistence.

There is a choice between LDISKFS (based on EXT4) and ZFS OSDs, and Lustre must be compiled with support for at least one of these OSDs.

In addition, users must decide which drivers will be used by Lustre for networking. The Linux kernel has built-in support for Ethernet and InfiniBand, but systems vendors often supply their own device drivers and tools. Lustre's networking stack, LNet, needs to be able to link to these drivers, which requires re-compiling Lustre.

Before commencing, read through this document and decide the options that will be required for the Lustre build. The documentation will cover the following processes:

  1. Lustre with LDISKFS
  2. Lustre with ZFS
  3. Lustre with the following networking drivers:
    1. In-kernel drivers
    2. OpenFabrics Alliance (OFA) OFED
    3. Mellanox OFED
    4. Intel Fabrics

Establish a Build Environment

Compiling the Lustre software requires a computer with a comprehensive installation of software development tools. It is recommended that a dedicated build machine, separate from the intended installation targets, be established to manage the build process. The build machine can be a dedicated server or virtual machine and should be installed with a version of the operating system that closely matches the target installation.

The build machine should conform to the following minimum specification:

  • Minimum 32GB storage to accommodate all source code and software development tools
  • Minimum 2GB RAM (for VMs -- more is better, of course)
  • Network interface with access to externally hosted software repositories
  • Supported Linux operating system distribution. Refer to the Lustre source code ChangeLog for specific information on OS distributions known to work with Lustre
  • Access to the relevant OS packages needed to compile the software described in this document. Typically, packages are made available via online repositories or mirrors.

This documentation was developed on a system with an Intel-compatible 64-bit (x86_64) processor architecture, since this represents the vast majority of deployed processor architectures running Lustre file systems today.

In addition to the normal requirements common to open source projects, namely compiler and development library dependencies, Lustre has dependencies on other projects that may also need to be created from source. This means that at some stages in the process of creating Lustre packages, other packages will be compiled and then also installed on the build server. Lustre itself can normally be created entirely without superuser privileges, once the build server is set up with standard software development packages, but projects such as ZFS do not support this method of working.

Nevertheless, every effort has been made to reduce the requirement for super-user privileges during the build process. For RHEL and CentOS users, the process in this document also includes a description of how to make use of a project called Mock, which creates a chroot jail within which to create packages.

Details of the Mock project can be found on GitHub:

https://github.com/rpm-software-management/mock

Use of Mock is optional. It brings its own compromises into the build process, and is used in a somewhat unorthodox way, compared to its traditional usage.

Create a user for managing the Builds

For the most part, super-user privileges are not required to create packages, although the user will be required to install software development tools, and some of the 3rd party software distributions expect their packages to be installed on the build host as well. We recommend using a regular account with some additional privileges (e.g. granted via sudo) to allow installation of packages created during intermediate steps in the process.
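
As an illustration only, a sudoers drop-in similar to the following could be used to give the build account permission to install packages without granting full root access. The file name and command list are hypothetical; adapt them to site policy and edit with visudo:

# /etc/sudoers.d/build -- hypothetical example
build ALL=(root) NOPASSWD: /usr/bin/yum, /usr/bin/rpm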

RHEL and CentOS 7: Install the Software Development Tools

There are two options for managing the build environment for creating Lustre packages: use Mock to create an isolated chroot environment, or build directly within the build server's OS. Choose one or the other, based on your requirements. Each is described in the sections that follow.

Create a Mock Configuration for Lustre

Mock provides a simple way to isolate complex package builds without compromising the configuration of the host machine's operating platform. It is optional, but very useful, especially when experimenting with builds or working with multiple projects. The software is distributed with RHEL, CentOS and Fedora. Mock is normally used by developers to test RPM builds, starting from an SRPM package, but the environment can be used more generally as a development area.

To install Mock:

sudo yum -y install mock

Optionally, install Git, so that repositories can be cloned outside of the Mock chroot (this will simplify maintenance of the chroot environment):

sudo yum -y install git

Add any users that will be running Mock environments to the mock group:

sudo useradd -m <username>
sudo usermod -a -G mock[,wheel] <username>

The wheel group is optional and will allow the user to run commands with elevated privileges via sudo. Apply with caution, as this can potentially weaken the security of the build host.

The following example creates the user build:

sudo useradd -m build
sudo usermod -a -G mock build

When the software has been installed, create a configuration appropriate to the build target. Mock configurations are recorded in files in /etc/mock. The default configuration is called default.cfg, and is normally a soft link to one of the files in this directory. To use the system default configuration for RHEL or CentOS, run the following command:

ln -snf /etc/mock/centos-7-x86_64.cfg /etc/mock/default.cfg

These configuration files describe the set of packages and repos the chroot environment will have available when it is instantiated, and will automatically populate the chroot by downloading and installing packages from YUM repositories. The configuration can be customised so that it is tailored to the requirements of the user. Refer to the mock(1) manual page for more information.

To create a new configuration specific to the requirements for compiling Lustre and also incorporating requirements for compiling ZFS and OFED, run the following commands:

# Create a copy of the default CentOS 7 x86_64 Mock template and add the source repos
sr=`cat /etc/yum.repos.d/CentOS-Sources.repo` \
awk '/^"""$/{print ENVIRON["sr"]; printf "\n%s\n",$0;i=1}i==0{print}i==1{i=0}' \
/etc/mock/centos-7-x86_64.cfg > /etc/mock/lustre-c7-x86_64.cfg
 
# Change the config name. Populate the Mock chroot with prerequisite packages.
sed -i -e 's/\(config_opts\['\''root'\''\]\).*/\1 = '\''lustre-c7-x86_64'\''/' \
-e 's/\(config_opts\['\''chroot_setup_cmd'\''\]\).*/\1 = '\''install bash bc openssl gettext net-tools hostname bzip2 coreutils cpio diffutils system-release findutils gawk gcc gcc-c++ grep gzip info make patch redhat-rpm-config rpm-build yum-utils sed shadow-utils tar unzip util-linux wget which xz automake git xmlto asciidoc elfutils-libelf-devel zlib-devel binutils-devel newt-devel python-devel hmaccalc perl-ExtUtils-Embed patchutils pesign elfutils-devel bison audit-libs-devel numactl-devel pciutils-devel ncurses-devel libtool libselinux-devel flex tcl tcl-devel tk tk-devel glib2 glib2-devel libuuid-devel libattr-devel libblkid-devel systemd-devel device-mapper-devel parted lsscsi ksh'\''/' \
/etc/mock/lustre-c7-x86_64.cfg
 
# Modify the %_rpmdir RPM macro to prevent build failures.
echo "config_opts['macros']['%_rpmdir'] = \"%{_topdir}/RPMS/%{_arch}\"" >> /etc/mock/lustre-c7-x86_64.cfg
 
# Make the new configuration the default
ln -snf /etc/mock/lustre-c7-x86_64.cfg /etc/mock/default.cfg

This configuration ensures that each time a new Mock environment is created, all of the Lustre build dependencies are automatically downloaded and installed.

Note: Some of the build scripts and Makefiles used by Lustre and other projects assume that there will always be an architecture sub-directory (e.g. x86_64) in the RPM build directories. This is not always the case. In particular, Mock does not create sub-directories based on target architecture. To work around this problem, a custom RPM macro was added into the mock configuration above. If this does not work, then the same macro can be added by hand by running the following command after creating a Mock chroot environment:

mock --shell "echo '%_rpmdir %{_topdir}/RPMS/%{_arch}' >>\$HOME/.rpmmacros"

For reference, an example of the SPL build error manifests as follows:

cp: cannot stat ‘/tmp/spl-build-root-uDSQ5Bay/RPMS/*/*’: No such file or directory
make[1]: *** [rpm-common] Error 1
make[1]: Leaving directory `/builddir/spl'
make: *** [rpm-utils] Error 2

Once the Mock configuration has been created, login as the user that will be managing the builds and then run the following command to prepare the chroot environment:

mock [-r <config>] --init

The -r flag specifies the configuration to use, if the default configuration is unsuitable. This is either the name of one of the files in the /etc/mock directory, minus the .cfg suffix, or the name of a file.
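
For example, to initialise a chroot using the Lustre configuration created above:

mock -r lustre-c7-x86_64 --init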

To work interactively within the Mock environment, launch a shell:

mock [-r <config>] --shell

Note: some mock commands will attempt to clean the chroot directory before executing. This will remove any files considered temporary by Mock, which means anything that Mock itself has not provisioned. To avoid this situation, use the -n flag. The --shell command does not run a clean operation, so the -n flag is not required.
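
For example, to copy a file into an existing chroot without cleaning it first (the file name is a placeholder):

mock -n --copyin $HOME/example.src.rpm /builddir/.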

Development Software Installation for Normal Build Process

Skip this step if Mock is being used to create the Lustre packages.

Use the following command to install the prerequisite software tools on the build server:

sudo yum install gcc automake make git bc net-tools xmlto asciidoc \
elfutils-libelf-devel zlib-devel binutils-devel newt-devel python-devel \
hmaccalc perl-ExtUtils-Embed rpm-build yum-utils redhat-rpm-config \
patchutils pesign elfutils-devel bison audit-libs-devel numactl-devel \
pciutils-devel ncurses-devel libtool libselinux-devel gcc-c++ bison flex \
tcl tcl-devel tk tk-devel glib2 glib2-devel wget libuuid-devel \
libattr-devel libblkid-devel systemd-devel device-mapper-devel \
parted lsscsi ksh

The packages in the above list are sufficient to build Lustre, ZFS and 3rd party drivers derived from OFED.

CentOS: Constraining YUM to Older OS Releases Using the Vault Repositories

It is the convention with RHEL and CentOS to always retrieve the latest updates for a given major OS release when installing or upgrading software. YUM's repository definitions purposely refer to the latest upstream repositories in order to minimise the risk of users downloading obsolete packages. However, this behaviour is not always desirable. Installation and upgrade policies for a given organisation may impose restrictions on the operating platform, which may extend to mandating specific package versions for applications, including the kernel. A site may have frozen the operating system version to a specific revision or range of updates, or may have restrictions imposed upon them by the application software running on their infrastructure.

This, in turn, affects the environment for building packages, including Lustre. If the run-time environment is bound to a specific OS release, the build environment must be similarly restricted.

To facilitate this restriction in CentOS, one can leverage the CentOS Vault repository (http://vault.centos.org), which maintains an online archive of every package and update released for every version of CentOS. Every CentOS installation includes a package called centos-release used to track the OS version and provide the YUM repository definitions. The package includes a definition for the Vault repositories available for versions of CentOS prior to the version currently installed. For example, the centos-release package for CentOS 6.9 will include Vault repository definitions for CentOS 6.0 - 6.8.

This can be exploited to help constrain the build server environment such that it matches the intended target environment. The simplest way to do this is to download the latest centos-release rpm, extract the CentOS Vault repository definition and overwrite the original Vault definition on the platform. Once in place, disable the default repositories in YUM, and enable only the Vault repositories for the target OS version. For example:

# Download and install an updated Vault definition:
mkdir $HOME/tmp
cd $HOME/tmp
yumdownloader centos-release
rpm2cpio centos-release*.rpm | cpio -idm
cp etc/yum.repos.d/CentOS-Vault.repo /etc/yum.repos.d/.

# Configure YUM to use only the repositories for the current OS:
yum-config-manager --disable \*

# Get the current OS major and minor version
ver=`sed 's/[^0-9.]*//g' /etc/centos-release`
# Enable the Vault repos that match the OS version
yum-config-manager --enable C$ver-base,C$ver-extras,C$ver-updates
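
To verify that only the intended Vault repositories are active, list the enabled repositories:

yum repolist enabled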

Note: The centos-release package is not itself updated, as this can cause applications and software build processes that depend on correctly identifying the OS version to fail. The purpose of the above approach is to update YUM only, but otherwise maintain the OS version and release of the build environment.

SLES 12: Install the Software Development Tools

SUSE Linux Enterprise Server (SLES), like Red Hat Enterprise Linux, uses an RPM-based package management system, although there are some significant differences between the two platforms.

Use the following command to install the prerequisite software on an SLES 12sp2 build server:

sudo zypper install asciidoc automake bc binutils-devel bison bison \
device-mapper-devel elfutils libelf-devel flex gcc gcc-c++ git \
glib2-tools glib2-devel hmaccalc  libattr-devel libblkid-devel \
libselinux-devel libtool libuuid-devel lsscsi make mksh ncurses-devel \
net-tools numactl parted patchutils pciutils-devel perl pesign \
python-devel rpm-build sysstat systemd-devel tcl tcl-devel tk tk-devel \
wget xmlto zlib-devel

Obtain the Lustre Source Code

The following information applies to the Lustre community releases. To acquire the source code for other distributions of Lustre, such as Intel Enterprise Edition for Lustre, please refer to the vendor's documentation.

The Lustre source code is maintained in a Git repository. To obtain a clone, run the following command:

# Mock users: run "mock --shell" first
cd $HOME
git clone git://git.hpdd.intel.com/fs/lustre-release.git

When the repository has been cloned, change into the clone directory and review the branches:

cd $HOME/lustre-release
git branch -av

For example:

[build@ctb-el73 lustre-release]$ git branch -av
* master                    fc7c513 LU-9306 tests: more debug info for hsm test_24d
  remotes/origin/HEAD       -> origin/master
  remotes/origin/b1_8       cfcb628 LU-4090 kernel: a kernel patch for jbd2 hung
  remotes/origin/b2_1       e91f649 LU-3546 mdt: define mdt_obd_name()
  remotes/origin/b2_2       25a1427 LU-549 llite: Improve statfs performance if selinux is disabled
  remotes/origin/b2_3       b491780 2.3.0-RC6
  remotes/origin/b2_4       d00e4d6 LU-4222 mdt: extra checking for getattr RPC.
  remotes/origin/b2_5       35bb857 LU-0000 build: update build version to 2.5.3.90
  remotes/origin/b2_6       73ea776 New tag 2.6.0-RC2
  remotes/origin/b2_7       8eb2659 New tag 2.7.0-RC4
  remotes/origin/b2_8       ea79df5 New tag 2.8.0-RC5
  remotes/origin/b2_9       e050996 New Lustre release 2.9.0
  remotes/origin/dne2       6354216 LU-5349 include: use __vmalloc() to avoid __GFP_FS default
  remotes/origin/master     fc7c513 LU-9306 tests: more debug info for hsm test_24d
  remotes/origin/multi-rail aa32cc5 LU-8998 pfl: PFL feature implementation
  remotes/origin/pfl        3e66aeb LU-9335 pfl: calculate PFL file LOVEA correctly

The master branch is the main development branch and will form the basis of the next feature release of Lustre. Branches that begin with the lower case letter "b" represent the current and previous Lustre releases, denoted by version number. Thus, b2_9 is the Lustre 2.9.0 branch. Other branches are used to encapsulate long-running development projects such as Progressive File Layouts (PFL) and LNet Multi-rail.

Review the tags as follows:

git tag

There are many more tags than there are branches. Each tag represents an inflection point in development. Lustre version numbers have four fields read from left to right to indicate major, minor, maintenance and hot fix version numbers respectively. For example, version 2.9.0.0 is interpreted as follows:

  • major feature release number
  • minor feature release number
  • maintenance release number
  • hot fix release number

A maintenance release version number of 0 (zero) indicates that the version is complete and is ready for general use (also referred to as generally available, or GA), and maintenance versions <=10 represent maintenance releases (bug fixes or minor operating system support updates). Tags with a maintenance version greater than 50 are development tags not to be considered for general use.

The tag labels in the lustre-release repository have two different formats:

  • A dot-separated numerical version number (e.g. 2.9.0)
  • A label beginning with lower-case "v" followed by the version number, separated by underscores (e.g. v2_9_0_0)

The different tag formats for a given version number are equivalent and refer to the same point in the git repository history. That is, tags v2_9_0 and 2.9.0 refer to the same commit.

For example, the following tags represent the generally available release of Lustre version 2.9.0:

2.9.0
v2_9_0
v2_9_0_0

The next list of tags all point to the same development build:

2.9.56
v2_9_56
v2_9_56_0

Tags ending with the letters "RC" are release candidates: these are pre-production builds made for testing in anticipation of a final generally available (GA) release. If a release candidate is considered to be stable enough for general use, it is promoted to GA. There may be one or several RC builds before GA is declared for a given version of Lustre.
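
To confirm that the different tag formats for a release resolve to the same commit, the tags can be compared with git rev-parse. A minimal sketch:

cd $HOME/lustre-release
git rev-parse "2.9.0^{commit}" "v2_9_0^{commit}" "v2_9_0_0^{commit}"
# All three lines of output should show the same commit hash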

Use Git to checkout the Lustre release version that will be built. For example, to checkout Lustre version 2.9.0:

git checkout 2.9.0

or

git checkout v2_9_0_0

Prepare the build:

sh autogen.sh

Lustre source code is also available in package format, distributed alongside the binaries for a release. The latest software releases are available from the following URL:

https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases

This page has links to all of the releases. For example, the source RPM for the latest Lustre release on RHEL or CentOS 7 can be downloaded here:

https://downloads.hpdd.intel.com/public/lustre/latest-release/el7/server/SRPMS/

Note: the examples used in the remainder of the documentation are based on a preview version of Lustre version 2.10.0 but the process applies equally to all recent Lustre releases.

LDISKFS Patched Linux Kernel

If the Lustre servers will be using the LDISKFS object storage device (OSD) target, which is itself derived from EXT4, then it is necessary to apply a set of patches to the operating system kernel. These patches support extensions to the EXT4 device driver that create the LDISKFS OSD and, for the most part, provide performance enhancements or additional hooks useful for testing. In addition, project quota support requires its own set of kernel patches; if project quotas are required, those patches are essential.

The Lustre community continues to work to reduce the dependency on maintaining LDISKFS patches and it is hoped that at some point in the future, they will be entirely unnecessary. For the time being, however, Lustre's LDISKFS OSD requires modifications to the kernel source code.

Note: Lustre does not require a patched kernel if the ZFS OSD is used. Lustre installations that use ZFS exclusively do not require a customised kernel.

Note: Lustre clients do not require a patched kernel.

Obtain the Kernel Source Code

If the target build will be based on LDISKFS storage targets, download the kernel sources appropriate to the OS distribution. Refer to the changelog in the Lustre source code for the list of kernels for each OS distribution that are known to work with Lustre. The changelog maintains a historical record for all Lustre releases.

The following excerpt shows the kernel support for Lustre version 2.10.0:

TBD Intel Corporation
       * version 2.10.0
       * See https://wiki.hpdd.intel.com/display/PUB/Lustre+Support+Matrix
         for currently supported client and server kernel versions.
       * Server known to build on patched kernels:
         2.6.32-431.29.2.el6 (RHEL6.5)
         2.6.32-504.30.3.el6 (RHEL6.6)
         2.6.32-573.26.1.el6 (RHEL6.7)
         2.6.32-642.15.1.el6 (RHEL6.8)
         2.6.32-696.el6      (RHEL6.9)
         3.10.0-514.16.1.el7 (RHEL7.3)
         3.0.101-0.47.71     (SLES11 SP3)
         3.0.101-97          (SLES11 SP4)
         3.12.69-60.64.35    (SLES12 SP1)
         4.4.49-92.14        (SLES12 SP2)
         vanilla linux 4.6.7 (ZFS only)
       * Client known to build on unpatched kernels:
         2.6.32-431.29.2.el6 (RHEL6.5)
         2.6.32-504.30.3.el6 (RHEL6.6)
         2.6.32-573.26.1.el6 (RHEL6.7)
         2.6.32-642.15.1.el6 (RHEL6.8)
         2.6.32-696.el6      (RHEL6.9)
         3.10.0-514.16.1.el7 (RHEL7.3)
         3.0.101-0.47.71     (SLES11 SP3)
         3.0.101-97          (SLES11 SP4)
         3.12.69-60.64.35    (SLES12 SP1)
         4.4.49-92.14        (SLES12 SP2)
         vanilla linux 4.6.7

In the above list, Lustre version 2.10.0 supports version 3.10.0-514.16.1.el7 of the RHEL / CentOS 7.3 kernel. Use YUM to download a copy of the source RPM. For example:

cd $HOME
yumdownloader --source  kernel-3.10.0-514.16.1.el7

The following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then use that to download the kernel source:

cd $HOME
kernelversion=`os=RHEL7.3 lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`
[ -n "$kernelversion" ] && yumdownloader --source  kernel-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

If Mock is being used to build Lustre, you can download the source RPM from outside the mock shell and then copy it in as follows:

mock --copyin <package> /builddir/.

For example:

mock --copyin kernel-3.10.0-514.16.1.el7.src.rpm /builddir/.

An alternative solution for Mock is to enable the CentOS-Source repository configuration, then run the yumdownloader command directly from the Mock shell. A simple, but crude way to add the source repositories into Mock's YUM configuration is to run the following from the Mock shell:

cat /etc/yum.repos.d/CentOS-Sources.repo >> /etc/yum/yum.conf

However, this will be overwritten on the next invocation of the mock shell. One can permanently update the configuration by appending the CentOS source repositories to the appropriate configuration file in the /etc/mock directory on the build host, and this is what was done when preparing the Mock configuration earlier.

If it is necessary to create a build for an older kernel version, it might not be available in the active YUM repository for the distribution. CentOS maintains an archive of all previous releases in a set of YUM repositories called Vault. The CentOS Vault is located at:

http://vault.centos.org

The Vault includes source RPMs as well as binaries. Unfortunately, CentOS does not include YUM configuration descriptions for the archived source repositories. Instead of using YUM, go to the Vault site directly and navigate through the directory structure to get the required files. For example, the source RPMs for the CentOS 7.2 package updates can be found here:

http://vault.centos.org/7.2.1511/updates/Source/

Prepare the Kernel Source

Install the kernel source RPM that was downloaded in the previous step. This will create a standard RPM build directory structure and extract the contents of the source RPM:

cd $HOME
rpm -ivh kernel-[0-9].*.src.rpm

Determine the set of patches that need to be applied to the kernel, based on the operating system distribution. The file lustre-release/lustre/kernel_patches/which_patch maps the kernel version to the appropriate patch series. For example, for RHEL / CentOS 7.3 on Lustre 2.10.0, the file contains:

3.10-rhel7.series       3.10.0-514.16.1.el7 (RHEL 7.3)

Review the list of patches in the series, e.g.:

[build@ctb-el7 ~]$ cat $HOME/lustre-release/lustre/kernel_patches/series/3.10-rhel7.series
raid5-mmp-unplug-dev-3.7.patch
dev_read_only-3.7.patch
blkdev_tunables-3.8.patch
jbd2-fix-j_list_lock-unlock-3.10-rhel7.patch
vfs-project-quotas-rhel7.patch

Note: one of the new features introduced with Lustre 2.10 is support for project quotas. This is a powerful administration feature that allows for additional quotas to be applied to the file system based on a new identifier called a project ID. To implement project quotas for LDISKFS means making a change to EXT4 code in the kernel. Unfortunately, this particular change breaks the kernel ABI (KABI) compatibility guarantee that is a feature of RHEL kernels. If this is a concern, then remove the patch called vfs-project-quotas-rhel7.patch from the patch series file. This action will effectively disable project quota support from Lustre LDISKFS builds.

When the correct patch series has been identified, create a patch file containing all of the kernel patches required by Lustre's LDISKFS OSD:

_TOPDIR=`rpm --eval %{_topdir}`
for i in `cat $HOME/lustre-release/lustre/kernel_patches/series/3.10-rhel7.series`; do
cat $HOME/lustre-release/lustre/kernel_patches/patches/$i
done > $_TOPDIR/SOURCES/patch-lustre.patch

Apply the following changes to the Kernel RPM spec file:

_TOPDIR=`rpm --eval %{_topdir}`
sed -i.inst -e '/find $RPM_BUILD_ROOT\/lib\/modules\/$KernelVer/a\
    cp -a fs/ext3/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext3 \
    cp -a fs/ext4/* $RPM_BUILD_ROOT/lib/modules/$KernelVer/build/fs/ext4' \
-e '/^# empty final patch to facilitate testing of kernel patches/i\
Patch99995: patch-lustre.patch' \
-e '/^ApplyOptionalPatch linux-kernel-test.patch/i\
ApplyOptionalPatch patch-lustre.patch' \
-e '/^%define listnewconfig_fail 1/s/1/0/' \
$_TOPDIR/SPECS/kernel.spec

These modifications ensure that the patches that Lustre requires for the LDISKFS OSD are applied to the kernel during compilation.

The following changes to the kernel configuration specification are also strongly recommended:

CONFIG_FUSION_MAX_SGE=256
CONFIG_SCSI_MAX_SG_SEGMENTS=128

To apply these changes, run the following commands from the command shell:

_TOPDIR=`rpm --eval %{_topdir}`
sed -i.inst -e 's/\(CONFIG_FUSION_MAX_SGE=\).*/\1256/' \
-e 's/\(CONFIG_SCSI_MAX_SG_SEGMENTS=\).*/\1128/' \
$_TOPDIR/SOURCES/kernel-3.10.0-x86_64.config
grep -q CONFIG_SCSI_MAX_SG_SEGMENTS $_TOPDIR/SOURCES/kernel-3.10.0-x86_64.config.inst || \
echo "CONFIG_SCSI_MAX_SG_SEGMENTS=128" >> $_TOPDIR/SOURCES/kernel-3.10.0-x86_64.config

Alternatively, there is a kernel.config file distributed with the Lustre source code that can be used in place of the standard file distributed with the kernel. If using a file from the Lustre source, make sure that the first line of the file is as follows:

# x86_64

The following script demonstrates the method for a RHEL / CentOS 7.3 kernel configuration:

_TOPDIR=`rpm --eval %{_topdir}`
echo '# x86_64' > $_TOPDIR/SOURCES/kernel-3.10.0-x86_64.config
cat $HOME/lustre-release/lustre/kernel_patches/kernel_configs/kernel-3.10.0-3.10-rhel7-x86_64.config >> $_TOPDIR/SOURCES/kernel-3.10.0-x86_64.config

Create the kernel RPM packages

Use the following command to build the patched Linux kernel:

_TOPDIR=`rpm --eval %{_topdir}`
rpmbuild -ba --with firmware --with baseonly \
[--without debuginfo] \
[--without kabichk] \
--define "buildid _lustre" \
--target x86_64 \
$_TOPDIR/SPECS/kernel.spec

Note: the "--with baseonly" flag means that only the essential kernel packages will be created and the "debug" and "kdump" options will be excluded from the build. If the project quotas patch is used, the KABI verification must also be disabled using the "--without kabichk" flag.

Save the Kernel RPMs

Copy the resulting kernel RPM packages into a directory tree for later distribution:

_TOPDIR=`rpm --eval %{_topdir}`
mkdir -p $HOME/releases/lustre-kernel
mv $_TOPDIR/RPMS/*/{kernel-*,python-perf-*,perf-*} $HOME/releases/lustre-kernel

ZFS

Lustre servers that will be using ZFS-based storage targets require packages from the ZFS on Linux project (http://zfsonlinux.org). The Linux port of ZFS is developed in cooperation with the OpenZFS project and is a versatile and powerful alternative to EXT4 as a file system target for Lustre OSDs. The source code is hosted on GitHub:

https://github.com/zfsonlinux

There are three options for creating ZFS on Linux packages:

  1. DKMS: packages are distributed as source code and compiled on the target against the installed kernel[s]. When an updated kernel is installed, DKMS-compatible modules will be recompiled to work with the new kernel. The module rebuild is usually triggered automatically on system reboot, but can also be invoked directly from the command line.
  2. KMOD: kernel modules built for a specific kernel version and bundled into a binary package. These modules are not portable between kernel versions, so a change in kernel version requires that the kernel modules are recompiled and re-installed.
  3. KMOD with kernel application binary interface (KABI) compatibility, sometimes referred to as "weak-updates" support. KABI-compliant kernel modules exploit a feature available in certain operating system distributions, such as RHEL, that ensure ABI compatibility across kernel updates in the same family of releases. If a minor kernel update is installed, the KABI guarantee means that modules that were compiled against the older variant can be loaded unmodified by the new kernel without requiring re-compilation from source.

The process for compiling ZFS and SPL is thoroughly documented on the ZFS on Linux GitHub site, but is summarised here because compiling ZFS has implications for the Lustre build process. Each approach has its benefits and drawbacks.

DKMS provides a straightforward packaging system and attempts to accommodate changes in the operating system by automatically rebuilding kernel modules, reducing manual overhead when updating OS kernels. DKMS packages are also generally easy to create and distribute.

The KMOD packages take more work to create, but are easier to install. However, when the kernel is updated, the modules may need to be recompiled. KABI-compliant kernel modules reduce this risk by providing ABI compatibility across minor updates, but only work for some distributions (currently RHEL and CentOS).

The premise of DKMS is simple: each time the OS kernel of a host is updated, DKMS will rebuild any out of tree kernel modules so that they can be loaded by the new kernel. This can be managed automatically on the next system boot, or can be triggered on demand. This does mean that the run-time environment of Lustre servers running ZFS DKMS modules is quite large, as it needs to include a compiler and other development libraries, but it also means that creating the packages for distribution is quick and simple.
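
As a sketch of the on-demand rebuild, the dkms command can be run directly on a server that has the DKMS packages installed (module names and versions will vary):

# List the DKMS modules registered on the system and their build state
dkms status
# Rebuild and install all registered modules for the currently running kernel
sudo dkms autoinstall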

Unfortunately, even the simple approach has its idiosyncrasies. You cannot build the DKMS packages for distribution without also building at least the SPL development packages, since the ZFS build depends on SPL, and the source code is simply not sufficient by itself.

There is also a cost associated with recompiling kernel modules from source that needs to be planned for. In order to be able to recompile the modules, DKMS packages require a full software development toolkit and dependencies to be installed on all servers. This does represent a significant overhead for servers, and is usually seen as undesirable for production environments, where there is often an emphasis placed on minimising the software footprint in order to streamline deployment and maintenance, and reduce the security attack surface.

Rebuilding packages also takes time, which will lengthen maintenance windows. And there is always some risk that rebuilding the modules will fail for a given kernel release, although this is rare. DKMS lowers the up-front distribution overhead, but moves some of the cost of maintenance directly onto the servers and the support organisations maintaining the data centre infrastructure.

When choosing DKMS, it is not only the ZFS and SPL modules that need to be recompiled, but also the Lustre modules. To support this, Lustre can also be distributed as a DKMS package.

Note: The DKMS method was in part adopted in order to work-around licensing compatibility issues between the Linux Kernel project, licensed under GPL, and ZFS which is licensed under CDDL, with respect to the distribution of binaries. While both licenses are free open source licenses, there are restrictions on distribution of binaries created using a combination of software source code from projects with these different licenses. There is no restriction on the separate distribution of source code, however. The DKMS modules provide a convenient workaround that simplifies packaging and distribution of the ZFS source with Lustre and Linux kernels. There are differences of opinion in the open source community regarding packaging and distribution, and currently no consensus has been reached.

The vanilla KMOD build process is straightforward to execute and will generally work for any supported Linux distribution. The KABI variant of the KMOD build is very similar, with the restriction that it is only useful for distributions that support KABI compatibility. The KABI build also has some hard-coded directory paths in the supplied RPM spec files, which effectively mandates a dedicated build environment for creating packages.

Obtain the ZFS Source Code

If the target build will be based on ZFS, then acquire the ZFS software sources from the ZFS on Linux project. ZFS is comprised of two projects:

  • SPL: Solaris portability layer. This is a shim that presents ZFS with a consistent interface and allows OpenZFS to be ported to multiple operating systems.
  • ZFS: The OpenZFS file system implementation for Linux.

Clone the SPL and ZFS repositories as follows:

# Mock users run "mock --shell" first
cd $HOME
git clone https://github.com/zfsonlinux/spl.git
git clone https://github.com/zfsonlinux/zfs.git

When the repositories have been cloned, change into the clone directory of each project and review the branches:

cd $HOME/spl
git branch -av
 
cd $HOME/zfs
git branch -av

For example:

[build@ctb-el73 spl]$ cd $HOME/spl
[build@ctb-el73 spl]$ git branch -av
* master                           8f87971 Linux 4.12 compat: PF_FSTRANS was removed
  remotes/origin/HEAD              -> origin/master
  remotes/origin/master            8f87971 Linux 4.12 compat: PF_FSTRANS was removed
  remotes/origin/spl-0.6.3-stable  ce4c463 Tag spl-0.6.3-1.3
  remotes/origin/spl-0.6.4-release c8acde0 Tag spl-0.6.4.1
  remotes/origin/spl-0.6.5-release b5bed49 Prepare to release 0.6.5.9

The master branch in each project is the main development branch and will form the basis of the next release of SPL and ZFS, respectively.

Review the tags as follows:

git tag

Just like the Lustre project, there are many more tags than there are branches, although the naming convention is simpler. Tags have the format <name>-<version>. The following output lists some of the tags in the spl repository:

[build@ctb-el73 spl]$ git tag | tail -8
spl-0.6.5.6
spl-0.6.5.7
spl-0.6.5.8
spl-0.6.5.9
spl-0.7.0-rc1
spl-0.7.0-rc2
spl-0.7.0-rc3
spl-0.7.0-rc4

Tags with an rc# suffix are release candidates.

Use Git to checkout the release version of SPL and ZFS that will be built and then run the autogen.sh script to prepare the build environment. For example, to checkout SPL version 0.6.5.9:

cd $HOME/spl
git checkout spl-0.6.5.9
sh autogen.sh

To check out SPL version 0.7.0-rc4:

cd $HOME/spl
git checkout spl-0.7.0-rc4
sh autogen.sh

Do the same for ZFS. For example:

cd $HOME/zfs
git checkout zfs-0.6.5.9
sh autogen.sh

For ZFS 0.7.0-rc4:

cd $HOME/zfs
git checkout zfs-0.7.0-rc4
sh autogen.sh

Make sure that the SPL and ZFS versions match for each respective checkout.
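
A quick way to confirm that the two checkouts are at matching versions is to compare the tags reported by each working tree. For example:

(cd $HOME/spl && git describe --tags)
(cd $HOME/zfs && git describe --tags)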

The ZFS on Linux source code is also available in the package format distributed alongside the binaries for a release. The latest software releases are available from the following URL:

https://github.com/zfsonlinux/

Links are also available on the main ZFS on Linux site:

http://zfsonlinux.org/

Note: the examples used in the remainder of the documentation are based on a release candidate version of ZFS version 0.7.0, but the process applies equally to all recent releases.

Install the Kernel Development Package

The SPL and ZFS projects comprise kernel modules as well as user-space applications. To compile the kernel modules, install the kernel development packages relevant to the target OS distribution. This must match the kernel version being used to create the Lustre packages. Review the ChangeLog file in the Lustre source code to identify the appropriate kernel version.

The following excerpt shows that Lustre version 2.10.0 supports version 3.10.0-514.16.1.el7 of the RHEL / CentOS 7.3 kernel, and version 4.4.49-92.14 of the SLES 12 SP2 kernel (output has been truncated):

TBD Intel Corporation
       * version 2.10.0
       * See https://wiki.hpdd.intel.com/display/PUB/Lustre+Support+Matrix
         for currently supported client and server kernel versions.
       * Server known to build on patched kernels:
...
         3.10.0-514.16.1.el7 (RHEL7.3)
...
         4.4.49-92.14        (SLES12 SP2)
...

Note: it is also possible to compile the SPL and ZFS packages against the LDISKFS patched kernel development tree, in which case, substitute the kernel development packages from the OS distribution with those created with the LDISKFS patches.

RHEL and CentOS

For RHEL / CentOS systems, use YUM to install the kernel-devel RPM. For example:

sudo yum install kernel-devel-3.10.0-514.16.1.el7

If Mock is being used to create packages, install the kernel-devel RPM using the mock --install command:

mock --install kernel-devel-3.10.0-514.16.1.el7

Note: you can, in fact, run YUM commands within the mock shell, as well.

Note: similar to the way in which the kernel source can be automatically identified and installed for the LDISKFS patched kernel, the following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then use that to install the kernel-devel package:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os=RHEL7.3 lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`
[ -n "$kernelversion" ] && $SUDOCMD yum -y install kernel-devel-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

SLES 12 SP2

For SLES12 SP2 systems, use zypper to install the kernel development packages. For example:

sudo zypper install kernel-default-devel-4.4.21-69.1 \
kernel-devel-4.4.21-69.1 \
kernel-source-4.4.21-69.1 

Note: the following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then use that to install the packages:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os="SLES12 SP2" lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`

[ -n "$kernelversion" ] && $SUDOCMD zypper install \
kernel-default-devel-$kernelversion \
kernel-devel-$kernelversion \
kernel-source-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

Create the SPL Packages

Run the configure script:

cd $HOME/spl
# For RHEL and CentOS, set the --with-spec=redhat flag. Otherwise omit.
./configure [--with-spec=redhat] \
[--with-linux=<path to kernel-devel>] \
[--with-linux-obj=<path to kernel-devel>]

To compile KABI-compliant kernel module packages for RHEL and CentOS distributions, use the --with-spec=redhat option. This option is not usable for other OS distributions.

If there is only one set of kernel development packages installed, the configure script should automatically detect the location of the relevant directory tree. However, if there are multiple kernel development packages installed for different kernel versions and revisions, then use the --with-linux and optionally --with-linux-obj flags to identify the correct directory for the target kernel.

For example:

cd $HOME/spl
./configure --with-spec=redhat \
--with-linux=/usr/src/kernels/3.10.0-514.16.1.el7.x86_64

Packages are created using the make command. There are three types of package that can be created from the SPL project. These are selected by providing parameters to the make command. One must create, at a minimum, the user-space packages plus at least one other set of packages: the KMOD and/or DKMS packages.

To compile the user-space tools, run this command:

make pkg-utils

To create the kernel modules packages:

make pkg-kmod

To create the DKMS package:

make rpm-dkms

Since later process steps require that dependent packages be installed on the build server, always compile the user-space and KMOD packages even when the intended distribution will be DKMS. To compile all required sets of packages from a single command line invocation:

make pkg-utils pkg-kmod [rpm-dkms]

Note: DKMS packaging has not been evaluated for SLES.

Save the SPL RPMs

Copy the resulting RPM packages into a directory tree for later distribution:

mkdir -p $HOME/releases/zfs-spl
mv $HOME/spl/*.rpm $HOME/releases/zfs-spl

Create the ZFS Packages

The build process for ZFS is very similar to that for SPL. The ZFS package build process has a dependency on SPL, so make sure that the SPL packages created in the previous step have been installed on the build host.

RHEL / CentOS

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/zfs-spl/{spl-[0-9].*,kmod-spl-[0-9].*,kmod-spl-devel-[0-9].*}.x86_64.rpm

Note: it is not unusual for the installation to resolve additional dependencies, including the full kernel package for the version of the kernel that SPL was compiled for.

SLES 12 SP2

sudo rpm -ivh kmod-spl-* spl-0.7.0-rc4.x86_64.rpm

Note: The rpm command is used in the above example due to a peculiarity of the SLES packages for SPL (and which also affects ZFS). In the set of RPMs that are created, two of the packages have very similar names (kmod-spl-devel-*), differing only by the version numbering, as can be seen in the following example:

kmod-spl-devel-0.7.0-rc4.x86_64.rpm
kmod-spl-devel-4.4.21-69-default-0.7.0-rc4.x86_64.rpm

It is essential to install both packages, but if both are specified in a single command line invocation, the zypper command will only install one of them. The rpm command is not affected. To use zypper instead, so that dependencies are automatically resolved, run the command twice, with the second command containing just the "conflicting" RPM. For example:

sudo zypper install kmod-spl-4.4.21-69-default-0.7.0-rc4.x86_64.rpm \
kmod-spl-devel-0.7.0-rc4.x86_64.rpm \
spl-0.7.0-rc4.x86_64.rpm
sudo zypper install kmod-spl-devel-4.4.21-69-default-0.7.0-rc4.x86_64.rpm

Prepare the build

Run the configure script:

cd $HOME/zfs
# For RHEL and CentOS only, set the --with-spec=redhat flag.
./configure [--with-spec=redhat] \
[--with-spl=<path to spl-devel> \
[--with-linux=<path to kernel-devel>] \
[--with-linux-obj=<path to kernel obj>]

To compile KABI-compliant kernel module packages for RHEL and CentOS distributions, use the --with-spec=redhat option. For SLES 12 SP2 or other distributions, omit the --with-spec=redhat flag.

If there is only one set of kernel development packages installed, the configure script should automatically detect the location of the relevant directory tree. However, if there are multiple kernel development packages installed for different kernel versions and revisions, then use the --with-linux and optionally --with-linux-obj flags to identify the correct directory for the target kernel.

For example:

cd $HOME/zfs
./configure --with-spec=redhat \
--with-linux=/usr/src/kernels/3.10.0-514.16.1.el7.x86_64

In addition to the location of the kernel-devel RPM, the configure script may also need to be informed of the location of the SPL development installation (i.e. the location of the files installed from the spl-devel package, not the Git source code repository). For example:

cd $HOME/zfs
./configure --with-spec=redhat \
--with-spl=/usr/src/spl-0.7.0 \
--with-linux=/usr/src/kernels/3.10.0-514.16.1.el7.x86_64

Packages are created using the make command. Just like SPL, there are three types of package that can be created from the ZFS project. These are selected by providing parameters to the make command. One must create, at a minimum, the user-space packages plus at least one other set of packages: the KMOD and/or DKMS packages.

Compile the Packages

To compile the user-space tools, run this command:

make pkg-utils

To create the kernel modules packages:

make pkg-kmod

To create the DKMS package:

make rpm-dkms

It is recommended that the user-space and KMOD packages are always compiled even when the intended distribution will be DKMS. To compile all sets of packages from a single command line invocation:

make pkg-utils pkg-kmod [rpm-dkms]

Save the ZFS RPMs

Copy the resulting RPM packages into a directory tree for later distribution:

mkdir -p $HOME/releases/zfs-spl
mv $HOME/zfs/*.rpm $HOME/releases/zfs-spl

3rd Party Network Fabric Support

This section is optional since, by default, Lustre will use the device drivers supplied by the Linux kernel. Complete this section if 3rd party InfiniBand drivers are required for the target environment. The procedure for creating InfiniBand drivers from external sources varies slightly depending upon the version of the InfiniBand software being used.

Instructions are provided for each of the following driver distributions:

  • OpenFabrics Alliance (OFA) OFED*
  • Mellanox OFED
  • True Scale OFED
  • Intel OmniPath Architecture (OPA)

*OFED: Open Fabrics Enterprise Distribution

Note: whichever distribution of OFED is selected, the resulting RPMs created during the build process for Lustre must be saved for distribution with the Lustre server packages.

Preparation

Any 3rd party drivers must be compiled against the target kernel that will be used by Lustre. This is true for each of the InfiniBand driver distributions, regardless of vendor. If the target systems will be using LDISKFS for the storage, then use kernel packages that have been created with the Lustre LDISKFS patches applied. If the kernel for the target servers has not been patched for LDISKFS, then use the binary kernel packages supplied by the operating system.

Note: Only the kernel-devel package is needed for this part of the build process.

Lustre-patched kernel-devel Package (for LDISKFS Server Builds)

For Lustre LDISKFS patched kernels, where the patched kernel has been recompiled from source, install the kernel-devel package as follows:

SUDOCMD=`which sudo 2>/dev/null`
find `rpm --eval %{_rpmdir}` -type f -name kernel-devel-\*.rpm -exec $SUDOCMD yum localinstall {} \;

Unpatched kernel-devel Package (for ZFS-only Server and Lustre Client Builds)

For "patchless" kernels, install the kernel-devel RPM that matches the supported kernel for the version of Lustre being compiled. Refer to the Lustre changelog in the source code distribution (lustre-release/lustre/ChangeLog) for the list of kernels for each OS distribution that are known to work with Lustre. The ChangeLog file contains a historical record of all Lustre releases.

For example, Lustre version 2.10.0 supports version 3.10.0-514.16.1.el7 of the RHEL / CentOS 7.3 kernel. Use YUM to install the kernel-devel RPM:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum install kernel-devel-3.10.0-514.16.1.el7

If Mock is being used to create packages, exit the Mock shell and install the kernel-devel RPM using the mock --install command:

mock --install kernel-devel-3.10.0-514.16.1.el7

Note: similar to the way in which the kernel source can be automatically identified and installed for the LDISKFS patched kernel, the following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then use that to install the kernel-devel package:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os=RHEL7.3 lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`
[ -n "$kernelversion" ] && $SUDOCMD yum -y install kernel-devel-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

For older RHEL / CentOS distributions, the required kernel might not be available in the active YUM repository for the distribution. CentOS maintains an archive of all previous releases in a set of YUM repositories called Vault, located at:

http://vault.centos.org

For example, the source RPMS for the CentOS 7.2 package updates can be found here:

http://vault.centos.org/7.2.1511/updates/x86_64/Packages

When the kernel-devel package has been downloaded, install it:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum -y install kernel-devel-<version>*.rpm

OpenFabrics Alliance (OFA) Open Fabrics Enterprise Distribution (OFED)

OFED is maintained by the OpenFabrics Alliance: http://openfabrics.org.

Note: At the time of writing, OFED 4.8-rc2 does not work with the latest Lustre release (Lustre 2.10.0), and OFED 3.18-3 does not compile on RHEL / CentOS 7.3. It is therefore recommended that integrators and systems administrators use the in-kernel InfiniBand drivers, or the drivers supplied by the HCA vendor (Mellanox or Intel True Scale). Since it is rare for systems to make direct use of OFA OFED for production installations, using an alternative driver distribution is preferred in any case.

There are several releases of the OFED distribution, distinguished by version number, and the build process for each is different. OFED version 4 is the latest stable release at the time of writing (May 2017). There is also a version 3.18-3 stable release that is currently more mature but does not compile cleanly on RHEL / CentOS 7.3 or newer. Check the OFA web site for updates and to verify the releases that are compatible with the target operating system distribution.

Instructions are provided for OFED-4.8-rc2 but the method is equivalent for all 4.x and 3.x releases.

Note: in OFED version 3 and 4, the kernel drivers are contained in the compat_rdma RPM. In versions of OFED prior to release 3, the IB kernel drivers were contained in a source RPM called ofa_kernel, which in turn built kernel-ib and related binary packages.

Download the OpenFabrics (OFA) OFED software distribution from http://downloads.openfabrics.org/OFED, then extract the tarball bundle. For example, to download OFED-4.8-rc2:

cd $HOME
wget http://downloads.openfabrics.org/OFED/ofed-4.8/OFED-4.8-rc2.tgz
tar zxf $HOME/OFED-4.8-rc2.tgz

Intel True Scale InfiniBand

Intel provides a software distribution, derived from OFED, to support its True Scale InfiniBand host channel adapters (HCAs). The distribution can be downloaded from Intel's download centre:

https://downloadcenter.intel.com

Once downloaded, extract the Intel-IB bundle. For example:

cd $HOME
tar zxf $HOME/IntelIB-Basic.RHEL7-x86_64.7.4.2.0.6.tgz

Mellanox InfiniBand

Mellanox provides its own distribution of OFED, optimised for the Mellanox chipsets and commonly referred to as MOFED. The software can be downloaded from the Mellanox web site:

http://www.mellanox.com/page/software_overview_ib

Once downloaded, extract the Mellanox OFED bundle. For example:

cd $HOME
tar zxf $HOME/MLNX_OFED_SRC-4.0-2.0.0.1.tgz

While the overall process for compiling the Mellanox kernel driver is similar to that for the OFA and Intel OFED distributions, Mellanox packages the kernel drivers into a source RPM called mlnx-ofa_kernel, rather than compat-rdma.

Intel Omni-Path Architecture

Recent releases of the Intel Omni-Path host fabric interface (HFI) adapters use the drivers supplied by the distribution kernel and do not normally require a customised driver build. However, there are occasionally driver updates included in the IFS distribution from Intel, which may need to be recompiled for LDISKFS kernels. The same is true for older releases of the Intel Omni-Path software. Kernel driver updates are distributed in a compat-rdma kernel driver package which can be treated in the same way as for Intel True Scale OFED distributions.

Compiling the Network Fabric Kernel Drivers

There are many options available for the IB kernel driver builds and it is important to review the documentation supplied with the individual driver distributions to ensure that appropriate options required by the target environment are selected.

The options used in the following example are based on the default selections made by the distributions' install software. These should meet most requirements for x86_64-based systems and are suitable for each of the different vendors. The same command line can be used to build the compat-rdma packages for OFA OFED and Intel True Scale, as well as the mlnx-ofa_kernel package for Mellanox OFED. Some OFED options are only available on specific kernels or processor architectures and have been omitted from the example:

rpmbuild --rebuild --nodeps --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' \
--define 'configure_options --with-addr_trans-mod --with-core-mod --with-cxgb3-mod --with-cxgb4-mod --with-ipoib-mod --with-iser-mod --with-mlx4-mod --with-mlx4_en-mod --with-mlx5-mod --with-nes-mod --with-srp-mod --with-user_access-mod --with-user_mad-mod --with-ibscif-mod --with-ipath_inf-mod --with-iscsi-mod --with-qib-mod --with-qlgc_vnic-mod' \
--define 'KVERSION <version>-<release>.<os-dist>.x86_64' \
--define 'K_SRC /usr/src/kernels/<version>-<release>.<os-dist>.x86_64' \
--define 'K_SRC_OBJ /usr/src/kernels/<version>-<release>.<os-dist>.x86_64' \
--define '_release <version>_<release>.<os-dist>' \
<distribution directory>/SRPMS/<package-name>-<version>-<release>.src.rpm


Note: In the command line arguments, the definition of the variable configure_options must appear on a single line.


Pay special attention to the KVERSION, K_SRC, K_SRC_OBJ and _release variables: these must match the target kernel version. In addition, the _release variable must not contain any hyphen (-) characters; replace hyphens with underscores (_). The _release variable is optional, but recommended, as it helps to associate the package build with the kernel version.

The following is a complete example, using kernel version 3.10.0-514.16.1.el7_lustre.x86_64 (a Lustre-patched kernel for RHEL / CentOS 7.3 built using the process described earlier in this document). At the beginning of the example are variables pointing to the kernel driver packages for each of the major distributions:

# OFA OFED 3.x
# ofed_driver_srpm=$HOME/OFED-3.*/SRPMS/compat-rdma-3.*.rpm
# OFA OFED 4.x
# ofed_driver_srpm=$HOME/OFED-4.*/SRPMS/compat-rdma-4.*.src.rpm
# Intel True Scale
# ofed_driver_srpm=$HOME/IntelIB-Basic.RHEL7-x86_64.7.*/IntelIB-OFED.RHEL7-x86_64.3.*/SRPMS/compat-rdma-3.*.src.rpm
# Mellanox OFED 3.x
# ofed_driver_srpm=$HOME/MLNX_OFED_SRC-3.*/SRPMS/mlnx-ofa_kernel-3.*-OFED.3.*.src.rpm
 
ofed_driver_srpm=$HOME/IntelIB-Basic.RHEL7-x86_64.7.*/IntelIB-OFED.RHEL7-x86_64.3.*/SRPMS/compat-rdma-3.*.src.rpm
kernel_dev=3.10.0-514.16.1.el7_lustre.x86_64
kernel_release=`echo $kernel_dev|sed s'/-/_/g'`
 
rpmbuild --rebuild --nodeps --define 'build_kernel_ib 1' --define 'build_kernel_ib_devel 1' \
--define 'configure_options --with-addr_trans-mod --with-core-mod --with-cxgb3-mod --with-cxgb4-mod --with-ipoib-mod --with-iser-mod --with-mlx4-mod --with-mlx4_en-mod --with-mlx5-mod --with-nes-mod --with-srp-mod --with-user_access-mod --with-user_mad-mod --with-ibscif-mod --with-ipath_inf-mod --with-iscsi-mod --with-qib-mod --with-qlgc_vnic-mod' \
--define "KVERSION $kernel_dev" \
--define "K_SRC /usr/src/kernels/$kernel_dev" \
--define "K_SRC_OBJ /usr/src/kernels/$kernel_dev" \
--define "_release $kernel_release" \
$ofed_driver_srpm

The result is a set of kernel drivers for InfiniBand devices that are compatible with the kernel that will be used by Lustre.
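
To confirm that the driver packages were produced, list the output directory of the rpmbuild tree, for example:

_TOPDIR=`rpm --eval %{_topdir}`
ls -1 $_TOPDIR/RPMS/*/*.rpm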

An alternative method is to use the standard OFED install script. The following example shows how to supply additional options to the standard OFED installer:

cd $HOME/*OFED*/
./install.pl \
--kernel <kernel version> \
--linux /usr/src/kernels/<kernel-devel version> \
--linux-obj /usr/src/kernels/<kernel-devel version>

This will run through the interactive build and install process, with options to select the various packages. Since the Lustre build process requires only the kernel drivers, this documentation uses the direct rpmbuild command, which is also easier to automate.

The Mellanox OFED install.pl script is similar, but has more options to control how the build is performed. For example:

cd $HOME/*OFED*/
./install.pl --build-only --kernel-only \
--kernel <kernel version> \
--kernel-sources /usr/src/kernels/<kernel-devel version>

Intel's IFS IB install script is quite different from the OFA and Mellanox OFED scripts, and does not provide an obvious means to specify the kernel version. Nevertheless, using the more direct rpmbuild command above should result in suitable kernel drivers being created for whichever driver distribution is required.

Save the Driver RPMs

Copy the resulting RPM packages into a directory tree for later distribution:

_TOPDIR=`rpm --eval %{_topdir}`
mkdir -p $HOME/releases/ofed
mv $_TOPDIR/RPMS/*/*.rpm $HOME/releases/ofed

Create the Lustre Packages

Preparation

When compiling the Lustre packages from source, the build environment requires access to the kernel development package for the target Linux kernel. If the target server systems will be using LDISKFS for the storage, then use kernel packages that have been created with the Lustre LDISKFS patches applied. If the kernel for the target systems has not been patched for LDISKFS, then use the binary kernel packages supplied by the operating system. Also required are any 3rd party network device drivers not distributed with the kernel itself; typically this means InfiniBand drivers from one of the OFED distributions (either compat-rdma-devel or mlnx-ofa_kernel-devel).

It is possible to install these RPMs onto the build server but this is cumbersome and might require that the build server be reinstalled when creating packages for different releases of Lustre or even when creating packages for both servers and clients. As a simpler alternative, one can sometimes stage the installation of the RPMs into a temporary directory, although ZFS builds may not allow this.
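
As a sketch of the staging approach (an optional workflow, not a required step), a development RPM can be unpacked into a scratch directory with rpm2cpio instead of being installed on the build host, and the extracted tree supplied to the configure script later on:

# Sketch only: unpack a kernel-devel RPM into a staging area instead of installing it
mkdir -p $HOME/staging
cd $HOME/staging
rpm2cpio $HOME/releases/lustre-kernel/kernel-devel-*.rpm | cpio -idm
# The extracted usr/src/kernels/<version> tree can then be passed to configure
# using the --with-linux option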

Lustre Server (DKMS Packages only)

The process for creating a Lustre server DKMS package is straightforward:

_TOPDIR=`rpm --eval %{_topdir}`
cd $HOME/lustre-release
./configure --enable-dist
make dist
cp lustre-*.tar.gz $_TOPDIR/SOURCES/
rpmbuild -bs lustre-dkms.spec
rpmbuild --rebuild $_TOPDIR/SRPMS/lustre-dkms-*.src.rpm
mkdir -p $HOME/releases/lustre-server-dkms
mv $_TOPDIR/RPMS/*/*.rpm $HOME/releases/lustre-server-dkms

If the objective is to create a set of DKMS server packages for use with ZFS, then there is no further work required. See also the section on creating DKMS packages for Lustre clients, if required.

Lustre Server (Compiled Packages)

To compile the Lustre server packages requires the development packages for the Linux kernel, and optionally, SPL, ZFS and OFED. The packages used in the following examples have been taken from the builds created in the earlier stages of this process.

Install Lustre-patched kernel-devel Package (for LDISKFS Server Builds)

For Lustre LDISKFS patched kernels, where the patched kernel has been recompiled from source, install the kernel development package or packages. For example:

SUDOCMD=`which sudo 2>/dev/null`
INSTCMD=`which yum 2>/dev/null || which zypper 2>/dev/null`
$SUDOCMD $INSTCMD install $HOME/releases/lustre-kernel/kernel-devel-*.rpm

Install Unpatched kernel-devel Package (for ZFS-only Server Builds)

For "patchless" kernels, install the kernel-devel packages that match the supported kernel for the version of Lustre being compiled. Refer to the Lustre ChangeLog in the source code distribution (lustre-release/lustre/ChangeLog) for the list of kernels that are known to work with Lustre. The ChangeLog file contains a historical record of all Lustre releases.

RHEL / CentOS 7

For RHEL / CentOS 7, use yum to install the set of kernel development packages required by Lustre. For example, Lustre version 2.10.0 supports version 3.10.0-514.16.1.el7 of the RHEL / CentOS 7.3 kernel and YUM can be used to install the kernel-devel RPM:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum install kernel-devel-3.10.0-514.16.1.el7

If Mock is being used to create packages, exit the mock shell and install the kernel-devel RPM using the mock --install command:

mock --install kernel-devel-3.10.0-514.16.1.el7

The following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then use that to install the kernel-devel RPM:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os=RHEL7.3 lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`
[ -n "$kernelversion" ] && $SUDOCMD yum -y install kernel-devel-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

SLES 12 SP2

For SLES 12 SP2, use zypper to install the set of kernel development packages required by Lustre. For example:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD zypper install \
kernel-default-devel-4.4.21-69.1 \
kernel-devel-4.4.21-69.1 \
kernel-syms-4.4.21-69.1 \
kernel-source-4.4.21-69.1 

Similarly, the following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then install the kernel development packages for SLES:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os="SLES12 SP2" lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`

[ -n "$kernelversion" ] && $SUDOCMD zypper install \
kernel-default-devel-$kernelversion \
kernel-devel-$kernelversion \
kernel-syms-$kernelversion \
kernel-source-$kernelversion || echo "ERROR: kernel version not found."

Note: To compile Lustre, SLES 12 SP2 development environments require the kernel-syms package as well as kernel-default-devel, kernel-devel, and kernel-source.

Install SPL Development Packages (for ZFS server builds)

Install the SPL development packages, if they are not already present on the build host.

RHEL / CentOS 7

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/zfs-spl/{spl-[0-9].*,kmod-spl-[0-9].*,kmod-spl-devel-[0-9].*}.x86_64.rpm

SLES 12 SP2

cd $HOME/releases/zfs-spl
SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD rpm -ivh kmod-spl-* spl-*.x86_64.rpm

Install ZFS Development Packages (for ZFS server builds)

Next, install the ZFS packages.

RHEL / CentOS 7

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/zfs-spl/{zfs-[0-9].*,zfs-dracut-[0-9].*,kmod-zfs-[0-9].*,kmod-zfs-devel-[0-9].*,lib*}.x86_64.rpm

SLES 12 SP2

cd $HOME/releases/zfs-zpl
SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD rpm -ivh kmod-zfs-[0-9].*-default-*.x86_64.rpm \
kmod-zfs-devel-[0-9].*.x86_64.rpm \
lib*.x86_64.rpm \
zfs-[0-9].*.x86_64.rpm \
zfs-dracut-[0-9].*.x86_64.rpm

Optional: Additional Drivers

If there are 3rd party InfiniBand drivers, they must also be installed.

For OFA OFED and Intel True Scale drivers:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/ofed/{compat-rdma-devel-[0-9].*,compat-rdma-[0-9].*}.x86_64.rpm

For Mellanox OFED drivers:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/ofed/{mlnx-ofa_kernel-[0-9].*,mlnx-ofa_kernel-devel-[0-9].*,mlnx-ofa_kernel-modules-[0-9].*}.x86_64.rpm

Create the Server RPMS

From a command line shell on the build server, go to the directory containing the cloned Lustre Git repository:

cd $HOME/lustre-release

Make sure that artefacts from any previous builds have been cleaned up, leaving a pristine build environment:

make distclean

Run the configure script:

./configure --enable-server \
[ --disable-ldiskfs ] \
[ --with-linux=<kernel devel path> ] \
[ --with-o2ib=<IB driver path> ] \
[ --with-zfs=<ZFS devel path> | no ] \
[ --with-spl=<SPL devel path> ]

To create server packages, incorporate the --enable-server flag. The --with-linux and --with-o2ib flags should refer to the locations of the extracted kernel-devel and InfiniBand kernel drivers, respectively. The configure script will normally detect the ZFS development tree automatically if it is installed in the default location, but if not, use the --with-zfs and --with-spl options to identify the respective development package directories. Lustre will automatically determine whether it is compiling server packages with LDISKFS and/or ZFS support. To force Lustre to disable ZFS support, set --with-zfs=no.

RHEL / CentOS 7 Examples

To create Lustre server packages for OFA OFED or Intel True Scale with the LDISKFS-patched kernel:

./configure --enable-server \
--with-linux=/usr/src/kernels/*_lustre.x86_64 \
--with-o2ib=/usr/src/compat-rdma

To create packages for Mellanox OFED with a patched kernel:

./configure --enable-server \
--with-linux=/usr/src/kernels/*_lustre.x86_64 \
--with-o2ib=/usr/src/ofa_kernel/default

To create Lustre packages using the standard, unpatched OS kernel version 3.10.0-514.16.1.el7.x86_64, implying ZFS support:

./configure --enable-server \
--with-linux=/usr/src/kernels/3.10.0-514.16.1.el7.x86_64

SLES 12 SP2 Examples

To create Lustre server packages for an unpatched kernel (i.e. for ZFS-based servers), where the target kernel is the default kernel for the build server:

./configure --enable-server --disable-ldiskfs

A more complex example, referencing a different target kernel:

./configure --enable-server \
--with-linux=/usr/src/linux \
--with-linux-obj=/usr/src/linux-obj/x86_64/default \
--disable-ldiskfs

Note: When using an unpatched kernel as the target for a SLES 12 server, the LDISKFS component must be explicitly disabled. This is done by passing the --disable-ldiskfs flag as input to the configure script.

Compile the Server Packages

To build the Lustre server packages:

make rpms

On successful completion of the build, packages will be created in the current working directory.
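
The generated packages can be reviewed before they are copied into the release area, for example:

ls -1 $HOME/lustre-release/*.rpm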

Save the Lustre Server RPMs

Copy the Lustre server RPM packages into a directory tree for later distribution:

mkdir -p $HOME/releases/lustre-server
mv $HOME/lustre-release/*.rpm $HOME/releases/lustre-server

Lustre Client (DKMS Packages only)

The process for creating a Lustre client DKMS package is straightforward:

_TOPDIR=`rpm --eval %{_topdir}`
cd $HOME/lustre-release
make distclean
./configure --enable-dist --disable-server --enable-client
make dist
cp lustre-*.tar.gz $_TOPDIR/SOURCES/
rpmbuild -bs --without servers lustre-dkms.spec
rpmbuild --rebuild --without servers $_TOPDIR/SRPMS/lustre-client-dkms-*.src.rpm
mkdir -p $HOME/releases/lustre-client-dkms
mv $_TOPDIR/RPMS/*/*.rpm $HOME/releases/lustre-client-dkms

If the objective is to create a set of DKMS client packages, then there is no further work required.

Lustre Client (All other Builds)

Install the kernel-devel Package

The Lustre client build requires an unpatched version of the kernel-devel package.

RHEL / CentOS 7

Use the following shell script fragment to identify and download the appropriate kernel version for a given operating system and Lustre version:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os=RHEL7.3 lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`
[ -n "$kernelversion" ] && $SUDOCMD yum -y install kernel-devel-$kernelversion || echo "ERROR: kernel version not found."

Set the os and lu variables at the beginning of the script to the required operating system release and Lustre version respectively.

SLES 12 SP2

For SLES 12 SP2, use zypper to install the set of kernel development packages required by Lustre. The following shell script fragment can be used to identify the kernel version for a given operating system and Lustre version, and then install the kernel development packages for SLES:

SUDOCMD=`which sudo 2>/dev/null`
kernelversion=`os="SLES12 SP2" lu=2.10.0 \
awk '$0 ~ "* version "ENVIRON["lu"]{i=1; next} \
$0 ~ "* Server known" && i {j=1; next} \
(/\*/ && j) || (/\* version/ && i) {exit} \
i && j && $0 ~ ENVIRON["os"]{print $1}' $HOME/lustre-release/lustre/ChangeLog`

[ -n "$kernelversion" ] && $SUDOCMD zypper install \
kernel-default-devel-$kernelversion \
kernel-devel-$kernelversion \
kernel-syms-$kernelversion \
kernel-source-$kernelversion || echo "ERROR: kernel version not found."

Note: To compile Lustre, SLES 12 SP2 development environments require the kernel-syms package as well as kernel-default-devel, kernel-devel, and kernel-source.

Optional: Additional Drivers

If there are 3rd party InfiniBand drivers, they must also be installed. The following examples assume that the drivers have been compiled from source code against an unpatched kernel-devel RPM, using the process described earlier. Be careful to distinguish between driver packages created for LDISKFS patched kernels, and drivers compiled against the standard unpatched kernel.

For OFA OFED and Intel True Scale drivers:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/ofed/{compat-rdma-devel-[0-9].*,compat-rdma-[0-9].*}.x86_64.rpm

For Mellanox OFED drivers:

SUDOCMD=`which sudo 2>/dev/null`
$SUDOCMD yum localinstall \
$HOME/releases/ofed/{mlnx-ofa_kernel-[0-9].*,mlnx-ofa_kernel-devel-[0-9].*,mlnx-ofa_kernel-modules-[0-9].*}.x86_64.rpm

Create the Client RPMs

From a command line shell on the build host, go to the directory containing the cloned Lustre Git repository:

cd $HOME/lustre-release

Make sure that artefacts from any previous builds have been cleaned up, leaving a pristine build environment:

make distclean

Run the configure script. To create client packages, incorporate the --disable-server and --enable-client flags:

./configure --disable-server --enable-client \
[ --with-linux=<kernel devel path> ] \
[ --with-linux-obj=<kernel obj path> ] \
[ --with-o2ib=<IB driver path> ]

The --with-linux and --with-o2ib flags should refer to the locations of the extracted kernel-devel and InfiniBand kernel drivers, respectively.

For example, to create Lustre client packages for OFA OFED or Intel True Scale:

./configure --disable-server --enable-client \
--with-linux=/usr/src/kernels/*.x86_64 \
--with-o2ib=/usr/src/compat-rdma

To create Lustre client packages for Mellanox OFED:

./configure --disable-server --enable-client \
--with-linux=/usr/src/kernels/*.x86_64 \
--with-o2ib=/usr/src/ofa_kernel/default

To create Lustre client packages using the standard, unpatched OS kernel version 3.10.0-514.16.1.el7.x86_64:

./configure --disable-server --enable-client \
--with-linux=/usr/src/kernels/3.10.0-514.16.1.el7.x86_64

Build the Lustre client packages:

make rpms

On successful completion of the build, packages will be created in the current working directory.

Save the Lustre Client RPMs

Copy the Lustre client RPM packages into a directory tree for later distribution:

mkdir -p $HOME/releases/lustre-client
mv $HOME/lustre-release/*.rpm $HOME/releases/lustre-client