CHAPTER 28
User Utilities (man1)
This chapter describes user utilities and includes the following sections:
The lfs utility can be used for user configuration routines and monitoring. With lfs you can create a new file with a specific striping pattern, determine the striping pattern of existing files, and gather the extended attributes (object numbers and location) of a specific file.
lfs
lfs check <mds|osts|servers>
lfs df [-i] [-h] [path]
lfs find [[!] --atime|-A [-+]N] [[!] --mtime|-M [-+]N]
[[!] --ctime|-C [-+]N] [--maxdepth|-D N] [--name|-n <pattern>]
[--print|-p] [--print0|-P] [[!] --obd|-O <uuid[s]>]
[[!] --size|-S [+-]N[kMGTPE]] [--type|-t {bcdflpsD}]
[[!] --gid|-g|--group|-G <gname>|<gid>]
[[!] --uid|-u|--user|-U <uname>|<uid>]
<dirname|filename>
lfs osts [path]
lfs getstripe [--obd|-O <uuid>] [--quiet|-q] [--verbose|-v]
[--count|-c] [--index|-i | --offset|-o]
[--size|-s] [--pool|-p] [--directory|-d]
[--recursive|-r] <dirname|filename>
lfs setstripe [--size|-s stripe_size] [--count|-c stripe_cnt]
[--index|-i|--offset|-o start_ost_index] [--pool|-p <pool>]
<dirname|filename>
lfs setstripe -d <dirname>
lfs poollist <filesystem>[.<pool>] | <pathname>
lfs quota [-q] [-v] [-o obd_uuid|-I ost_idx|-i mdt_idx] [-u|-g <uname|uid|gname|gid>] <filesystem>
lfs quota -t <-u|-g> <filesystem>
lfs quotacheck [-ugf] <filesystem>
lfs quotachown [-i] <filesystem>
lfs quotaon [-ugf] <filesystem>
lfs quotaoff [-ug] <filesystem>
lfs quotainv [-ug] [-f] <filesystem>
lfs setquota <-u|--user|-g|--group> <uname|uid|gname|gid>
[--block-softlimit <block-softlimit>]
[--block-hardlimit <block-hardlimit>]
[--inode-softlimit <inode-softlimit>]
[--inode-hardlimit <inode-hardlimit>]
<filesystem>
lfs setquota <-u|--user|-g|--group> <uname|uid|gname|gid>
[-b <block-softlimit>] [-B <block-hardlimit>]
[-i <inode-softlimit>] [-I <inode-hardlimit>]
<filesystem>
lfs setquota -t <-u|-g>
[--block-grace <block-grace>]
[--inode-grace <inode-grace>]
<filesystem>
lfs setquota -t <-u|-g>
[-b <block-grace>] [-i <inode-grace>]
<filesystem>
lfs help
| Note - In the above synopsis, the <filesystem> parameter refers to the mount point of the Lustre file system. The default mount point is /mnt/lustre. |
The lfs utility is used to create a new file with a specific striping pattern, determine the default striping pattern, gather the extended attributes (object numbers and location) for a specific file, find files with specific attributes, list OST information or set quota limits. It can be invoked interactively without any arguments or in a non-interactive mode with one of the supported arguments.
The various lfs options are listed and described below. For a complete list of available options, type help at the lfs prompt.
| check | Displays the status of MDS or OSTs (as specified in the command) or all servers (MDS and OSTs). |
| df | Reports file system disk space usage or inode usage of each MDT/OST. Can limit the scope to a specific OST pool. |
| find | Searches the directory tree rooted at the given directory/filename for files that match the given parameters. The --maxdepth option limits find to descend at most N levels of the directory tree. The --print and --print0 options print the full filename, followed by a new line or NUL character, respectively. Using ! before an option negates its meaning (files NOT matching the parameter). Using + before a numeric value means files with the parameter OR MORE. Using - before a numeric value means files with the parameter OR LESS. |
| --atime | File was last accessed N*24 hours ago. (There is no guarantee that atime is kept coherent across the cluster.) OSTs store a transient atime that is updated when clients do read requests. Permanent atime is written to the MDS when the file is closed. However, on-disk atime is only updated if it is more than 60 seconds old (/proc/fs/lustre/mds/*/max_atime_diff). Lustre considers the latest atime from all OSTs. If atime is set explicitly by a user (setattr), it is updated on both the MDS and OST, allowing the atime to go backward. |
| --size | File has a size in bytes or kilo-, Mega-, Giga-, Tera-, Peta- or Exabytes if a suffix is given. |
| --type | File has a type (block, character, directory, pipe, file, symlink, socket or Door [for Solaris]). |
| --gid | File belongs to a specific group (numeric group ID allowed). |
| osts | Lists all OSTs for all mounted file systems. If the [path] provided is located on a Lustre-mounted file system, then only OSTs belonging to that file system are displayed. |
| getstripe | Lists the striping information for a given filename or directory. By default, the stripe count, stripe size and offset are returned. |
| --size | Lists the stripe size (how much data to write to one OST before moving to the next OST). |
| --directory | Lists entries about a specified directory instead of its contents (in the same manner as ls -d). |
| setstripe | Create new files with a specific file layout (stripe pattern) configuration.[1] |
| --count stripe_cnt | Number of OSTs over which to stripe a file. A stripe count of 0 uses the file system-wide default stripe count (1). A stripe count of -1 stripes over all available OSTs, and normally results in a file with 80 stripes. |
| --size stripe_size[2] | Number of bytes to store on an OST before moving to the next OST. A stripe size of 0 uses the file system’s default stripe size, 1MB. This can be specified with k (KB), m (MB), or g (GB), respectively. |
| --index start_ost_index | The OST index (base 10, starting at 0) on which to start striping for this file. A start-ost value of -1 allows the MDS to choose the starting index. This is the default, and it means that the MDS selects the starting OST as it wants. It has no relevance on whether the MDS will use round-robin or QoS weighted allocation for the remaining stripes in the file. We strongly recommend selecting this default value, as it allows space and load balancing to be done by the MDS as needed. |
| --pool <pool> | Name of the pre-defined pool of OSTs (see lctl) that will be used for striping. The stripe_cnt, stripe_size and start_ost_index values are used as well. The start_ost_index value must be part of the pool or an error is returned. |
| poollist | Lists pools in the file system or pathname, or OSTs in the file system’s pool. |
| quota [-q] [-v] [-o obd_uuid|-i mdt_idx|-I ost_idx] [-u|-g <uname|uid|gname|gid>] <filesystem> | Displays disk usage and limits, either for the full file system or for objects on a specific OBD. A user or group name or an ID can be specified. If both user and group are omitted, quotas for the current UID/GID are shown. The -q option provides more quiet output by suppressing the printing of the header. It also fills in blank spaces in the "grace" column with zeros (when there is no grace period set), to ensure that the number of columns is consistent. The -v option provides more verbose (with per-OBD statistics) output. |
| quota -t <-u|-g> <filesystem> | Displays block and inode grace times for user (-u) or group (-g) quotas. |
| quotacheck [-ugf] <filesystem> | Scans the specified file system for disk usage, and creates or updates quota files. Options specify quota for users (-u), groups (-g), and force (-f). |
| quotachown [-i] <filesystem> | Changes the file’s owner and group on OSTs of the specified file system. |
| quotaon [-ugf] <filesystem> | Turns on file system quotas. Options specify quota for users (-u), groups (-g), and force (-f). |
| quotaoff [-ug] <filesystem> | Turns off file system quotas. Options specify quota for users (-u), groups (-g), and force (-f). |
| quotainv [-ug] [-f] <filesystem> | Clears quota files (administrative quota files if used without -f, operational quota files otherwise), all of their quota entries for users (-u) or groups (-g). After running quotainv, you must run quotacheck before using quotas. CAUTION: Use extreme caution when using this command; its results cannot be undone. |
| setquota <-u|-g> <uname>|<uid>|<gname>|<gid> [--block-softlimit <block-softlimit>] [--block-hardlimit <block-hardlimit>] [--inode-softlimit <inode-softlimit>] [--inode-hardlimit <inode-hardlimit>] <filesystem> | Sets file system quotas for users or groups. Limits can be specified with --{block|inode}-{hardlimit|softlimit} or their short equivalents -b, -B, -i, -I. Users can set 1, 2, 3 or 4 limits.[3] Also, limits can be specified with special suffixes, -b, -k, -m, -g, -t, and -p to indicate units of 1, 2^10, 2^20, 2^30, 2^40 and 2^50, respectively. By default, the block limits unit is 1 kilobyte (1,024), and block limits are always kilobyte-grained (even if specified in bytes). See Examples. |
| setquota -t <-u|-g> [--block-grace <block-grace>] [--inode-grace <inode-grace>] <filesystem> | Sets the file system quota grace times for users or groups. Grace time is specified in “XXwXXdXXhXXmXXs” format or as an integer seconds value. See Examples. |
$ lfs setstripe -s 128k -c 2 /mnt/lustre/file1
Creates a file striped on two OSTs with 128 KB on each stripe.
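To make the stripe size and stripe count concrete, the following sketch computes which stripe of a file a given byte offset falls in. It assumes plain round-robin striping; the function name stripe_for_offset is hypothetical, and the result is a stripe number within the file, not an OST index, since the MDS chooses the actual OSTs.

```shell
# Hypothetical helper: which stripe (0-based) holds a given byte offset,
# assuming plain round-robin striping across the file's stripes.
# usage: stripe_for_offset <offset_bytes> <stripe_size_bytes> <stripe_count>
stripe_for_offset() {
    offset=$1
    stripe_size=$2
    stripe_count=$3
    # Chunk number within the file, wrapped around the number of stripes.
    echo $(( (offset / stripe_size) % stripe_count ))
}

# 128 KB stripes over two OSTs, as in the setstripe example above.
stripe_for_offset 0 131072 2        # first 128 KB chunk
stripe_for_offset 131072 131072 2   # second 128 KB chunk
stripe_for_offset 262144 131072 2   # third chunk wraps to the first stripe
```

With these arguments the three calls print 0, 1 and 0.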
$ lfs setstripe -d /mnt/lustre/dir
Deletes a default stripe pattern on a given directory. New files use the default striping pattern.
$ lfs getstripe -v /mnt/lustre/file1
Lists the detailed object allocation of a given file.
$ lfs setstripe --pool my_pool -c 2 /mnt/lustre/file
Creates a file striped on two OSTs from the pool my_pool
$ lfs poollist /mnt/lustre/
Lists the pools defined for the mounted Lustre file system /mnt/lustre
$ lfs poollist my_fs.my_pool
Lists the OSTs which are members of the pool my_pool in file system my_fs
$ lfs find /mnt/lustre
Efficiently lists all files in a given directory and its subdirectories.
$ lfs find /mnt/lustre -mtime +30 -type f -print
Recursively lists all regular files in a given directory more than 30 days old.
$ lfs find --obd OST2-UUID /mnt/lustre/
Recursively lists all files in a given directory that have objects on OST2-UUID.
$ lfs find /mnt/lustre --pool poolA
Finds all directories/files associated with poolA.
$ lfs find /mnt/lustre --pool ""
Finds all directories/files not associated with a pool.
$ lfs find /mnt/lustre ! --pool ""
Finds all directories/files associated with a pool.
$ lfs check servers
Checks the status of all servers (MDT, OST)
$ lfs osts
Lists all OSTs in the file system.
$ lfs df -h
Lists space usage per OST and MDT in human-readable format.
$ lfs df -i
Lists inode usage per OST and MDT.
$ lfs df --pool <filesystem>[.<pool>] | <pathname>
List space or inode usage for a specific OST pool.
$ lfs quotachown -i /mnt/lustre
$ lfs quotacheck -ug /mnt/lustre
Checks quotas for user and group. Turns on quotas after making the check.
$ lfs quotaon -ug /mnt/lustre
Turns on quotas of user and group.
$ lfs quotaoff -ug /mnt/lustre
Turns off quotas of user and group.
$ lfs setquota -u bob --block-softlimit 1000000 --block-hardlimit 2000000 /mnt/lustre
Sets quotas of user ‘bob’, with a 1 GB block quota softlimit and a 2 GB block quota hardlimit.
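Because block limits are expressed in kilobytes by default, it can help to check what a suffixed value amounts to. The sketch below converts a limit with one of the documented suffixes (b, k, m, g, t, p) to kilobytes. The function name limit_to_kb is hypothetical, and rounding byte values up to the next kilobyte is an assumption; the text above only states that limits are kilobyte-grained.

```shell
# Hypothetical helper: convert a block limit such as 2g or 500m to kilobytes.
# A bare number is taken as kilobytes, the documented default unit.
# Assumption: byte values are rounded up to the next whole kilobyte.
limit_to_kb() {
    value=${1%[bkmgtp]}      # numeric part
    suffix=${1#"$value"}     # unit suffix, empty if none was given
    case $suffix in
        b)    echo $(( (value + 1023) / 1024 )) ;;
        ""|k) echo "$value" ;;
        m)    echo $(( value * 1024 )) ;;
        g)    echo $(( value * 1024 * 1024 )) ;;
        t)    echo $(( value * 1024 * 1024 * 1024 )) ;;
        p)    echo $(( value * 1024 * 1024 * 1024 * 1024 )) ;;
    esac
}

limit_to_kb 2g        # 2 * 2^20 kilobytes
limit_to_kb 1000000   # no suffix: already in kilobytes
```

Here 2g comes to 2097152 KB, so a round figure such as 2000000 KB is slightly less than 2 GB in binary units.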
$ lfs setquota -t -u --block-grace 1000 --inode-grace 1w4d /mnt/lustre
Sets grace times for user quotas: 1000 seconds for block quotas, 1 week and 4 days for inode quotas.
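The grace time format can be checked by hand with a small conversion. This sketch turns a value in the XXwXXdXXhXXmXXs format, or a plain integer, into seconds. The function name grace_to_seconds is hypothetical and the parsing is only an illustration of the format, not the code lfs itself uses.

```shell
# Hypothetical helper: convert a grace time such as 1w4d or 1000 to seconds.
grace_to_seconds() {
    rest=$1
    total=0
    while [ -n "$rest" ]; do
        num=${rest%%[!0-9]*}          # leading digits
        if [ -z "$num" ]; then
            echo "bad grace time: $1" >&2
            return 1
        fi
        rest=${rest#"$num"}
        unit=${rest%"${rest#?}"}      # next character, empty at end of string
        rest=${rest#?}
        case $unit in
            w)    total=$(( total + num * 604800 )) ;;
            d)    total=$(( total + num * 86400 )) ;;
            h)    total=$(( total + num * 3600 )) ;;
            m)    total=$(( total + num * 60 )) ;;
            s|"") total=$(( total + num )) ;;
            *)    echo "bad grace time: $1" >&2; return 1 ;;
        esac
    done
    echo "$total"
}

grace_to_seconds 1w4d   # 1 week and 4 days
grace_to_seconds 1000   # plain seconds
```

For the values used above, 1w4d is 11 days, which is 950400 seconds, and 1000 stays 1000.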
$ lfs quota -u bob /mnt/lustre
Lists quotas of user ‘bob’.
$ lfs quota -t -u /mnt/lustre
Show grace times for user quotas on /mnt/lustre.
$ lfs setstripe --pool my_pool /mnt/lustre/dir
Associates a directory with the pool my_pool, so all new files and directories are created in the pool.
The lfs_migrate utility is a simple tool to migrate files between Lustre OSTs.
lfs_migrate [-c|-s] [-h] [-l] [-n] [-y] [file|directory ...]
The lfs_migrate utility is a simple tool to assist migration of files between Lustre OSTs. It copies each specified file to a new file, verifies that the file contents have not changed, and then renames the new file back to the original filename. This allows balancing space usage between OSTs, and moving files off OSTs that are starting to show hardware problems (though are still functional) or that will be discontinued.
Because lfs_migrate is not closely integrated with the MDS, it cannot determine whether a file is currently open and/or in-use by other applications or nodes. That makes it UNSAFE for use on files that might be modified by other applications, since the migrated file is only a copy of the current file. This will result in the old file becoming an open-unlinked file and any modifications to that file will be lost.
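The copy, verify and rename sequence described above can be sketched with ordinary tools. This is not the lfs_migrate source; the function name migrate_by_copy and the temporary file suffix are hypothetical, and only cp, cmp and mv are used. It also shows why open files are a problem: after the rename, a process that still holds the old file open keeps writing to the old, now unlinked, copy.

```shell
# Hypothetical helper mirroring the copy, verify, rename sequence.
# Not the lfs_migrate implementation; it uses only cp, cmp and mv.
migrate_by_copy() {
    src=$1
    tmp="$src.migrate_tmp"     # hypothetical temporary name
    cp -p "$src" "$tmp" || return 1
    # Verify that the source did not change while it was being copied.
    if cmp -s "$src" "$tmp"; then
        mv "$tmp" "$src"       # the new copy takes over the original name
    else
        rm -f "$tmp"
        echo "$src changed during copy, skipped" >&2
        return 1
    fi
}
```

On a Lustre file system the new copy would be allocated according to the current MDS policies, which is what moves the data to different OSTs.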
Files to be migrated can be specified as command-line arguments. If a directory is specified on the command-line then all files within that directory are migrated. If no files are specified on the command-line, then a list of files is read from the standard input, making lfs_migrate suitable for use with lfs(1) find to locate files on specific OSTs and/or matching other file attributes.
The current file allocation policies on the MDS dictate where the new files are placed, taking into account whether specific OSTs have been disabled on the MDS via lctl (8) (preventing new files from being allocated there), whether some OSTs are overly full (reducing the number of files placed on those OSTs), or if there is a specific default file striping for the target directory (potentially changing the stripe count, stripe size, OST pool, or OST index of a new file).
Options supporting lfs_migrate are described below.
$ lfs_migrate /mnt/lustre/dir
Rebalances all files within /mnt/lustre/dir.
$ lfs find /test -obd test-OST004 -size +4G | lfs_migrate -y
Migrates files in the /test file system that are on OST004 and larger than 4 GB in size.
Hard links could be handled correctly in Lustre 2.0 by using lfs(1) fid2path.
Eventually, this functionality will be integrated into lfs(1) itself and will integrate with the MDS layout locking to make it safe in the presence of opened files and ongoing file I/O.
lfs_migrate is part of the Lustre(7) file system package, and was added in the 1.8.4 release.
Lfsck ensures that objects are not referenced by multiple MDS files, that there are no orphan objects on the OSTs (objects that do not have any file on the MDS which references them), and that all of the objects referenced by the MDS exist. Under normal circumstances, Lustre maintains such coherency by distributed logging mechanisms, but under exceptional circumstances that may fail (e.g. disk failure, file system corruption leading to e2fsck repair). To avoid lengthy downtime, you can also run lfsck once Lustre is already started.
The e2fsck utility is run on each of the local MDS and OST device file systems and verifies that the underlying ldiskfs is consistent. After e2fsck is run, lfsck does distributed coherency checking for the Lustre file system. In most cases, e2fsck is sufficient to repair any file system issues and lfsck is not required.
lfsck [-c|--create] [-d|--delete] [-f|--force] [-h|--help] [-l|--lostfound] [-n|--nofix] [-v|--verbose] --mdsdb mds_database_file --ostdb ost1_database_file [ost2_database_file...] <filesystem>
| Note - As shown, the <filesystem> parameter refers to the Lustre file system mount point. The default mount point is /mnt/lustre. |
| Note - For lfsck, database filenames must be provided as absolute pathnames. Relative paths do not work; the databases cannot be opened properly. |
Options supporting lfsck are described below.
The lfsck utility is used to check and repair the distributed coherency of a Lustre file system. If an MDS or an OST becomes corrupt, run a distributed check on the file system to determine what sort of problems exist. Use lfsck to correct any defects found.
For more information on using e2fsck and lfsck, including examples, see Recovering from Errors or Corruption on a Backing File System. For information on resolving orphaned objects, see Working with Orphaned Objects.
The e2fsprogs package contains the filefrag tool which reports the extent of file fragmentation.
filefrag [ -belsv ] [ files... ]
The filefrag utility reports the extent of fragmentation in a given file. Initially, filefrag attempts to obtain extent information using FIEMAP ioctl, which is efficient and fast. If FIEMAP is not supported, then filefrag uses FIBMAP.
| Note - Lustre only supports FIEMAP ioctl. FIBMAP ioctl is not supported. |
In default mode[4], filefrag returns the number of physically discontiguous extents in the file. In extent or verbose mode, each extent is printed with details. For Lustre, the extents are printed in device offset order, not logical offset order.
The options and descriptions for the filefrag utility are listed below.
|
Uses the 1024-byte blocksize for the output. By default, this blocksize is used by Lustre, since OSTs may use different block sizes. |
||
$ filefrag /mnt/lustre/foo
/mnt/lustre/foo: 6 extents found
Lists verbose output in extent format.
$ filefrag -ve /mnt/lustre/foo
Checking /mnt/lustre/foo
Filesystem type is: bd00bd0
Filesystem cylinder groups is approximately 5
File size of /mnt/lustre/foo is 157286400 (153600 blocks)
ext: device_logical: start..end physical: start..end: length: device: flags:
0: 0.. 49151: 212992.. 262144: 49152: 0: remote
1: 49152.. 73727: 270336.. 294912: 24576: 0: remote
2: 73728.. 76799: 24576.. 27648: 3072: 0: remote
3: 0.. 57343: 196608.. 253952: 57344: 1: remote
4: 57344.. 65535: 139264.. 147456: 8192: 1: remote
5: 65536.. 76799: 163840.. 175104: 11264: 1: remote
/mnt/lustre/foo: 6 extents found
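When checking many files, only the summary line of each report is usually of interest. The sketch below extracts the file name and extent count from filefrag output read on standard input. The function name extent_counts is hypothetical, and the pattern is based only on the summary line shown in the examples above.

```shell
# Hypothetical helper: print "<file> <extent count>" for each summary line
# of filefrag output read from standard input. All other lines are ignored.
extent_counts() {
    sed -n 's/^\(.*\): \([0-9][0-9]*\) extents\{0,1\} found$/\1 \2/p'
}

# Example use on a Lustre client:
# filefrag /mnt/lustre/foo | extent_counts
```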
Lustre uses the standard mount(8) Linux command. When mounting a Lustre file system, mount(8) executes the /sbin/mount.lustre command to complete the mount. The mount command supports these Lustre-specific options:
|
Number of times a client will retry to mount the file system |
Timeouts are the most common cause of hung applications. After a timeout involving an MDS or failover OST, applications attempting to access the disconnected resource wait until the connection gets established.
When a client performs any remote operation, it gives the server a reasonable amount of time to respond. If a server does not reply either due to a down network, hung server, or any other reason, a timeout occurs which requires a recovery.
If a timeout occurs, a message similar to this one appears on the console of the client and in /var/log/messages:
LustreError: 26597:(client.c:810:ptlrpc_expire_one_request()) @@@ timeout req@a2d45200 x5886/t0 o38->mds_svc_UUID@NID_mds_UUID:12 lens 168/64 ref 1 fl RPC:/0/0 rc 0
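To see how often such timeouts occur, the log can be searched for this kind of message. The sketch below counts matching lines in log text read from standard input. The function name count_rpc_timeouts is hypothetical, and the pattern is taken only from the sample message above, so other Lustre versions may word the message differently.

```shell
# Hypothetical helper: count request-timeout messages in log text on stdin.
# The pattern comes from the sample message above and may not match the
# wording used by other Lustre versions.
count_rpc_timeouts() {
    grep -c 'ptlrpc_expire_one_request.*timeout'
}

# Example use (reading the system log usually requires root):
# count_rpc_timeouts < /var/log/messages
```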
Copyright © 2010, Oracle and/or its affiliates. All rights reserved.