Mounting a Lustre File System on Client Nodes

All end-user application I/O happens via a service called the Lustre client. The client is responsible for providing a POSIX interface, presenting a coherent view of the metadata (the file system name space) and object data (file content) to applications running on the client operating system. All Lustre file system I/O is transacted over a network protocol.

Client specifications are entirely application-driven and vary widely across applications, organisations and industries. Lustre clients must run a Linux operating system; the client software comprises kernel modules, plus some user-space tools to assist with configuration and management.

Starting and stopping the Lustre Client

Pre-requisites:

  1. Lustre software has been installed on the machine
  2. LNet has been configured (a quick check is shown below)
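
A quick way to confirm that LNet is up and that the client has at least one NID configured is to list the local NIDs with lctl list_nids (the NID in the output below is illustrative):

[root@rh7z-c3 ~]# lctl list_nids
192.168.227.3@tcp1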

Start a Lustre client using the mount command, the basic syntax of which is:

mount -t lustre \
  [-o <options> ] \
  <MGS NID>[:<MGS NID>]:/<fsname> \
  /lustre/<fsname>

To stop the Lustre client, unmount the file system:

umount <path>
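
For example, to stop a client that has mounted the demo file system used in the examples below:

umount /lustre/demo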

The mount and umount commands require super-user privileges to run.

The mount point directory must exist before the mount command is executed. The recommended convention for the mount point of the client is /lustre/<fsname>/, where <fsname> is the name of the file system.

When the mount command is invoked, the client first registers with the MGS to retrieve the configuration information, also referred to as the log, for the file system that it wants to mount. A single MGS can store the configuration information for more than one file system.

The following example shows the command line used to mount a file system named demo:

mkdir -p /lustre/demo
mount -t lustre \
  192.168.227.11@tcp1:192.168.227.12@tcp1:/demo \
  /lustre/demo

The client will try to connect to the MGS in the order of the NID addresses supplied on the command line. If connection to the first NID fails, the client will attempt a connection using the next NID.

To verify that the file system is mounted on the client, use the df command:

[root@rh7z-c3 ~]# df -ht lustre
Filesystem                                      Size  Used Avail Use% Mounted on
192.168.227.11@tcp1:192.168.227.12@tcp1:/demo   49G  2.9M   49G   1% /lustre/demo
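
In addition to df, the lfs df command reports space usage broken down by individual MDT and OST, which is useful for confirming that all of the file system's targets are visible to the client (output omitted here):

lfs df -h /lustre/demo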

The lctl dl command provides detail on the connections to the Lustre services:

[root@rh7z-c3 ~]# lctl dl
  0 UP mgc MGC192.168.227.11@tcp1 7f07b5f9-27e3-0b09-7456-d83ae184d204 5
  1 UP lov demo-clilov-ffff8800bab6a000 c04fa65d-3f0b-9cbf-b373-6a894da8e0be 4
  2 UP lmv demo-clilmv-ffff8800bab6a000 c04fa65d-3f0b-9cbf-b373-6a894da8e0be 4
  3 UP mdc demo-MDT0000-mdc-ffff8800bab6a000 c04fa65d-3f0b-9cbf-b373-6a894da8e0be 5
  4 UP osc demo-OST0000-osc-ffff8800bab6a000 c04fa65d-3f0b-9cbf-b373-6a894da8e0be 5

If the MGS is unavailable, the mount command will return an error, similar to the following example:

[root@rh7z-c3 ~]# mount -t lustre \
>   192.168.227.11@tcp1:192.168.227.12@tcp1:/demo \
>   /lustre/demo
mount.lustre: mount 192.168.227.11@tcp1:192.168.227.12@tcp1:/demo at /lustre/demo failed: Input/output error
Is the MGS running?

More detailed information on the failure will be in the syslog and kernel ring buffer:

[ 9996.909126] Lustre: 10199:0:(client.c:1967:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1459822631/real 1459822631]  req@ffff8800ad298000 x1530734379532292/t0(0) o250->MGC192.168.227.11@[email protected]@tcp1:26/25 lens 400/544 e 0 to 1 dl 1459822636 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[10021.923403] Lustre: 10199:0:(client.c:1967:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1459822656/real 1459822656]  req@ffff880138238000 x1530734379532308/t0(0) o250->MGC192.168.227.11@[email protected]@tcp1:26/25 lens 400/544 e 0 to 1 dl 1459822661 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
[10027.155495] LustreError: 15c-8: MGC192.168.227.11@tcp1: The configuration from log 'demo-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
[10027.207044] Lustre: Unmounted demo-client
[10027.214618] LustreError: 10212:0:(obd_mount.c:1342:lustre_fill_super()) Unable to mount  (-5)
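
The messages above come from the kernel ring buffer; they can be retrieved with dmesg, or with journalctl on systemd-based distributions. For example:

dmesg | tail -n 20
journalctl -k --since "10 minutes ago"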

The most common cause of failure is an improperly configured network interface or LNet NID. Verify that LNet can communicate with the MGS using lctl ping:

lctl ping <MGS NID>

If the ping fails, the command will return an I/O error:

[root@rh7z-c3 ~]# lctl ping 192.168.227.11@tcp1
failed to ping 192.168.227.11@tcp1: Input/output error

Check the LNet settings before continuing.
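
On systems where LNet is configured with lnetctl, the local configuration can be reviewed as follows (compare the networks and NIDs shown against the expected values):

lnetctl net show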

If the ping succeeds but the mount still fails, verify that the Lustre services are running on the target host. Also check whether any services running on the client, such as a firewall or SELinux, might be interfering with communication. SELinux is supported from Lustre 2.8 onwards; older releases of Lustre are not compatible with it. Temporarily disabling the firewall and SELinux can help narrow down the root cause of issues with Lustre communications.
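
As a sketch of this diagnostic step on a Red Hat-style client running firewalld and SELinux in enforcing mode, the following temporarily disables both until the next reboot (re-enable them once the problem is isolated):

systemctl stop firewalld
setenforce 0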

If no OSS services are online, but the MGS and the MDS for MDT0 are running, then the client mount command will hang until an OSS service starts up.

There are options specific to Lustre that can be applied to the Lustre client mount command. The most common of these are flock, localflock and user_xattr (an example mount command follows the list):

  • flock: enable support for cluster-wide, coherent file locks. This option must be applied to the mount commands of all clients that will access common data requiring lock functionality. Cluster-wide locking has a detrimental impact on file system performance and should only be enabled when absolutely required. For some applications, locking is only necessary on a subset of nodes. For example, the CTDB cluster framework, used by Samba to provide a parallel, high-availability SMB gateway, relies on locking a shared file to coordinate cluster start-up and recovery, but only the CTDB nodes need to mount the Lustre file system with the flock option. This is an example of an application-specific or domain-specific lock requirement.
  • localflock: enable client-local flock support. This is much faster than cluster-wide flock support, but is only suitable for applications that require locks, but don’t run on multiple hosts (or where the data will not be accessed in a manner that would require locking across multiple hosts).
  • user_xattr: Enable support for user extended attributes.
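
For example, the following sketch mounts the demo file system from the earlier examples with cluster-wide locks and user extended attributes enabled; include only the options that the workload actually requires:

mount -t lustre -o flock,user_xattr \
  192.168.227.11@tcp1:192.168.227.12@tcp1:/demo \
  /lustre/demo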

Additionally, consider using the _netdev mount option when mounting the Lustre client, especially when adding an entry to /etc/fstab. This option tells the operating system that the file system depends on the network: it must not be mounted before the network is online, and it must be unmounted at shutdown before the network stack is stopped. An example entry for /etc/fstab:

192.168.227.11@tcp1:192.168.227.12@tcp1:/demo /lustre/demo lustre defaults,_netdev 0 0
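
With this entry in place, the file system can be mounted by referring to the mount point alone, or automatically at boot:

mount /lustre/demo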

For systemd-based systems, the following entry is recommended to ensure correct startup and shutdown ordering:

192.168.227.11@tcp1:192.168.227.12@tcp1:/demo /lustre/demo lustre defaults,_netdev,noauto,x-systemd.automount,x-systemd.requires=lnet.service 0 0
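
After editing /etc/fstab on a systemd-based system, it is usually worth running systemctl daemon-reload so that systemd regenerates its mount and automount units from the new entry:

systemctl daemon-reload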

Refer to the mount.lustre(8) man page for more information on the available options.