Starting and Stopping Lustre Services

From Lustre Wiki

Lustre Start-up Sequence

The normal start-up sequence for a Lustre file system follows:

  1. Start the Management Service (MGS):
    <log into MGS host>
    # For ZFS OSDs only: the zpool must be imported first
    zpool import <zpool name>
    mkdir -p /lustre/mgt
    mount -t lustre <path to MGT> /lustre/mgt
    
  2. Start the Metadata Service[s] (MDS):
    <log into MDS host>
    # For ZFS OSDs only: the zpool must be imported first
    zpool import <zpool name>
    mkdir -p /lustre/<fsname>/mdt<n>
    mount -t lustre <path to MDTn> /lustre/<fsname>/mdt<n>
    

    where <n> represents the MDT index number. There must be, at a minimum, an mdt0 storage target.

  3. Start the Object Storage Services (OSS)
    <log into OSS host>
    # For ZFS OSDs only: the zpool must be imported first
    zpool import <zpool name>
    mkdir -p /lustre/<fsname>/ost<n>
    mount -t lustre <path to OSTn> /lustre/<fsname>/ost<n>
    

    where <n> represents the OST index number, starting from 0 (zero).

  4. Mount the Lustre clients:
    <log into client host>
    mkdir -p /lustre/<fsname>
    mount -t lustre <MGS NID>[:<MGS NID>]:/<fsname> /lustre/<fsname>
    

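As a concrete illustration, the four start-up steps above might look like the following for a hypothetical file system named demo with ldiskfs-backed targets. All hostnames, device paths, and the MGS NID here are examples only, not values from a real installation:

```shell
# 1. On the MGS host: mount the management target (MGT)
mkdir -p /lustre/mgt
mount -t lustre /dev/sdb /lustre/mgt

# 2. On the MDS host: mount mdt0
mkdir -p /lustre/demo/mdt0
mount -t lustre /dev/sdc /lustre/demo/mdt0

# 3. On each OSS host: mount its OSTs (ost0 shown here)
mkdir -p /lustre/demo/ost0
mount -t lustre /dev/sdd /lustre/demo/ost0

# 4. On each client: mount the file system via the MGS NID
mkdir -p /lustre/demo
mount -t lustre 192.168.1.10@tcp:/demo /lustre/demo
```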
The MGS NIDs are the LNet network identifiers for the MGS servers and should be listed in preferred search order. The Lustre client software will attempt to connect to each NID in the order specified on the command line (or in the /etc/fstab file). The client will stop searching when it has successfully established a connection to the MGS.

Note: The clients connect to the Management Service (MGS) first, not the Metadata Service (MDS). This is in fact true of all Lustre services except the MGS itself, but it is most obvious with the clients when mounting a file system.

If the MGS is installed on an HA (failover) cluster, make sure to list the NIDs such that the expected preferred primary node is listed first.

If no OSS services are online, but the MGS and the MDS for MDT0 are running, the client mount will block until an OSS service starts up.
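For example, a client mount against an HA pair of MGS nodes might specify both NIDs with the preferred primary first (the addresses and file system name are hypothetical):

```shell
# Try the MGS at 10.0.0.1 first, then fall back to its HA partner at 10.0.0.2
mount -t lustre 10.0.0.1@tcp:10.0.0.2@tcp:/demo /lustre/demo

# The equivalent /etc/fstab entry; _netdev defers the mount until the
# network is up:
# 10.0.0.1@tcp:10.0.0.2@tcp:/demo  /lustre/demo  lustre  defaults,_netdev  0  0
```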

Lustre Shutdown Sequence

The shutdown sequence for a Lustre file system environment is:

  1. Stop all of the Lustre clients
    <Log into client host>
    umount /lustre/<fsname>
    
  2. Stop the Metadata Service[s] (MDS)
    <Log into MDS host>
    umount /lustre/<fsname>/mdt<n>
    
  3. Stop the Object Storage Services (OSS)
    <Log into OSS host>
    umount /lustre/<fsname>/ost<n>
    
  4. [Optional] Stop the Management Service (MGS)
    <Log into MGS host>
    umount /lustre/mgt
    

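As a sanity check during shutdown, the Lustre targets still active on a server can be listed before and after each umount. Both commands below are standard, though the exact `lctl dl` output depends on which devices are configured on the node:

```shell
# Show any Lustre targets and clients still mounted on this node
mount -t lustre

# List the Lustre devices still configured locally; the list shrinks as
# targets are unmounted
lctl dl
```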
Note: For ZFS OSDs, it is not necessary to export the zpool after stopping Lustre services. The services are stopped by the umount command. The zpools need only be exported when migrating the ZFS pool for import on a different host.
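When a pool does need to move, for example to a failover partner or a replacement server, the migration is a sketch like the following (the pool name, dataset name, and mount point are hypothetical):

```shell
# On the old host, after the Lustre target has been unmounted:
zpool export mdt0pool

# On the new host: import the pool, then start the target
zpool import mdt0pool
mount -t lustre mdt0pool/mdt0 /lustre/demo/mdt0
```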

Why not start the MDS after the OSSs?

The metadata server is the gateway for all I/O in a Lustre file system because it controls all of the namespace operations and is responsible for providing the layout of files across the object storage. Stopping the MDS before the OSTs prevents any new I/O during shutdown, and starting the MDS after the OSTs prevents any new I/O until all of the services are online. In effect, files cannot be created or destroyed while the metadata service is offline.

As such, it is reasonable to expect that the MDS is the last service component started in a Lustre file system and the first service that is stopped on file system shutdown. However, when considering expansion of a Lustre file system by adding more OSTs, the MDT services must be online before the new storage targets are added.

If the file system is new, then the startup sequence must be MGS → MDS → OSS → Client, and if a new OST is to be added to the file system, the MDS must be online before the new OST is started.

A startup sequence of MGS → OSS → MDS → Client is acceptable only if the file system is established and the server configuration is static. New storage targets can only be added to Lustre when the MGS and MDT0 are online.
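As an illustration, adding a new OST to an established file system might look like the following (the device, index, fsname, and MGS NID are hypothetical). The final mount registers the target with the MGS and will only succeed if the MGS and MDT0 are online:

```shell
# Format the new device as OST index 4 of file system "demo"
mkfs.lustre --ost --fsname=demo --index=4 --mgsnode=10.0.0.1@tcp /dev/sde

# Mount the new target to register it and bring it online
mkdir -p /lustre/demo/ost4
mount -t lustre /dev/sde /lustre/demo/ost4
```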