LFSCK Phase 1 - OI Scrub and Inode Iterator UI Design


Introduction

The OI Scrub is an administrative tool that is controlled through user interfaces. These interfaces will be implemented as lctl commands. This document provides an initial design for those commands.

There are three types of user interfaces for the OI Scrub: switch interfaces, status interfaces and adjustment interfaces.

Switch interfaces

Start OI Scrub by ‘lctl start_scrub’

Used to start the OI Scrub after the MDT is mounted.

SYNOPSIS:

   lctl start_scrub <-M MDT_device> [-m | --method <iteration_method>] [-e exit_when_fail]
       [-i | --interval <sync_interval>] [-S <max_speed>] [-r | --reset] [-n | --dryrun] [-h | --help]

OPTIONS:

   -M  Specify the MDT device on which to start the OI Scrub. This will be required when Distributed Namespace (DNE) is introduced.
   
   -m  Specify the object iteration method. For LFSCK phase one, only inode table based iteration is available; namespace based scanning may be implemented in the future. If the option is not specified, the saved value (when resuming from a break point) or the default value (inode table based iteration) is used.
   
   -e  Specify whether the OI Scrub stops when it fails to process an object. If the option is not specified, the saved value (when resuming from a break point) or the default value (continue past failed items) is used. This setting can be changed while the OI Scrub is running through an adjustment interface.
   
   -i  Define the OI Scrub status sync interval as a number of iterations. A checkpoint is written each time this number of iterations completes. If the option is not specified, the saved value (when resuming from a break point) or the default value (10000 iterations) is used. This setting can be changed while the OI Scrub is running through an adjustment interface.
   
   -S  Set the delay between consecutive iterations. If the option is not specified, the saved value (when resuming from a break point) or the default value 0 (no delay between processing inodes) is used. This setting can be changed while the OI Scrub is running through an adjustment interface.
   
   -r  Reset the start position for the object iteration to the beginning of the specified MDT device. By default, the iterator resumes scanning from the last break point saved by the OI Scrub. On the first OI Scrub run, when there is no saved break point, it starts from the beginning.
   
   -n  Dry run: perform a trial run without making any changes.
   
   -h  Show help information.
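
The fallback order described for -m, -e, -i and -S (explicit option, then the value saved at the last break point, then the built-in default) and the effect of -r on the start position can be sketched as follows. This is an illustration only: the function names, the dictionary keys, the "inode_table" label and the use of an integer for the break point are assumptions made for the example, not part of the design.

```python
# Illustrative sketch; all names and the data layout are hypothetical.

# Built-in defaults as stated in the option descriptions.
DEFAULTS = {
    "method": "inode_table",   # inode table based iteration (label is assumed)
    "exit_when_fail": False,   # continue past failed items
    "sync_interval": 10000,    # iterations between checkpoints
    "delay": 0,                # no delay between iterations
}


def resolve_params(explicit, saved):
    """Pick each parameter: explicit option, else saved value, else default."""
    resolved = {}
    for name, default in DEFAULTS.items():
        if explicit.get(name) is not None:
            resolved[name] = explicit[name]
        elif saved.get(name) is not None:
            resolved[name] = saved[name]
        else:
            resolved[name] = default
    return resolved


def start_position(saved_break_point, reset):
    """Return where iteration starts; 0 stands for the beginning of the device."""
    if reset or saved_break_point is None:
        return 0
    return saved_break_point
```

For example, calling resolve_params({"sync_interval": 500}, {"delay": 2, "sync_interval": 2000}) keeps the explicit interval of 500, takes the saved delay of 2, and falls back to the defaults for the remaining parameters.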

Stop OI Scrub by ‘lctl stop_scrub’

Used to stop the OI Scrub after the MDT is mounted.

SYNOPSIS:

   lctl stop_scrub <-M MDT_device> [-h | --help]
   

OPTIONS:

   -M  The same as ‘lctl start_scrub’.
   
   -h  Show help information.

Status interfaces

Show OI Scrub status by ‘lctl show_scrub’

Used to show the OI Scrub status after the MDT is mounted.

SYNOPSIS:

   lctl show_scrub <-M MDT_device> [-P | --parameters] [-I] [-i] [-S] [-s] [-h | --help]

OPTIONS:

   -M  The same as ‘lctl start_scrub’.
   
   -P  Show the parameters, including:
   
     * The OI Scrub iteration method.
     * The behavior on failure.
     * The interval for OI Scrub status sync.
     * The speed limitation.
     * Whether dry-run mode is in effect.
   
   -I  Show historical information, including:
   
     * How many OI Scrub runs have completed successfully on the specified MDT device.
     * When the last OI Scrub ran and whether it succeeded. If it failed, where the break point is, and so on.
   
   -i  Show current information, including:
   
     * The OI Scrub running status. If it is running, the latest checkpoint, the current position, and so on.
   
   -S  Show historical statistics, including:
   
     * How many objects have been processed/updated (or failed to process/update) from the beginning position for the specified MDT device. This counts the objects processed/updated (or failed to process/update) before each break point.
     * The total time used by the OI Scrub since the first unfinished or failed OI Scrub run after the last successful run. Only the running time between break points is added up, and the current OI Scrub run is also counted.
     * The historical average OI Scrub processing speed according to the above statistics.
     * Possibly additional metrics.
   
   -s  Show current statistics, including:
   
     * How many objects have been processed/updated (or failed to process/update) in the current OI Scrub run. This is zero if the OI Scrub is not running.
     * The time used by the current OI Scrub run. This is zero if the OI Scrub is not running.
     * The average OI Scrub processing speed according to the above statistics. This is zero if the OI Scrub is not running.
     * Possibly additional metrics.
   
   -h  Show help information.
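
The average speed figures listed under -S and -s follow from the object counts and the running time. A minimal sketch of that arithmetic is given below. It assumes speed is expressed as objects per second and that time is tracked in seconds; the design states neither unit, and the function names are hypothetical.

```python
# Illustrative sketch; function names and units are assumptions.

def average_speed(objects_processed, running_seconds):
    """Objects per second; zero when no running time has been accumulated."""
    if running_seconds <= 0:
        return 0.0
    return objects_processed / running_seconds


def historical_time(segment_seconds, current_seconds=0.0):
    """Add up the running time between break points, plus the current run."""
    return sum(segment_seconds) + current_seconds
```

With two earlier segments of 10 and 20 seconds and a current run of 5 seconds, historical_time gives 35 seconds; 7000 objects over that period averages 200 objects per second.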

Adjustment interfaces

Adjust exit_when_fail behavior

Change how the OI Scrub behaves when it fails to process an object while it is running.

SYNOPSIS:

   lctl set_param scrub_ewf=xxx

VALUES:

   0  continue after a failure (default)
   !0 exit on failure

Adjust checkpoint count

Change the count of OI Scrub iterations between writing checkpoints.

SYNOPSIS:

   lctl set_param checkpoint_count=xxx

VALUES:

   10000 write checkpoint every 10000 iterations (default)
   >0    valid
   <0    invalid

Rate control

Change the OI Scrub iterator delay between iterations.

SYNOPSIS:

   lctl set_param iterator_delay=xxx

VALUES:

   0    reset to default (run without any delay between iterations)
   >0   length of the delay between iterations
   <0   invalid
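
How the three adjustable parameters interact during a run can be pictured with a small loop. This is a sketch under stated assumptions, not the OI Scrub implementation: run_scrub, process and write_checkpoint are hypothetical names, the delay is treated as seconds because the design does not state a unit, a checkpoint count of 0 is rejected because the design only classifies positive and negative values, and recording the position when exiting on failure is likewise an assumption.

```python
# Illustrative sketch; names, units and edge-case handling are assumptions.
import time


def run_scrub(objects, process, write_checkpoint,
              exit_when_fail=0, checkpoint_count=10000, iterator_delay=0):
    """Iterate over objects, honouring the three adjustable parameters.

    Returns (processed, failed, completed).
    """
    if checkpoint_count <= 0:
        # <0 is invalid per the design; rejecting 0 as well is an assumption.
        raise ValueError("checkpoint_count must be positive")
    if iterator_delay < 0:
        raise ValueError("iterator_delay must not be negative")

    processed = failed = 0
    for position, obj in enumerate(objects, start=1):
        if process(obj):
            processed += 1
        else:
            failed += 1
            if exit_when_fail != 0:
                # Non-zero: stop at the first failed object.
                # Assumption: the break point is recorded before exiting.
                write_checkpoint(position)
                return processed, failed, False
        if position % checkpoint_count == 0:
            write_checkpoint(position)
        if iterator_delay > 0:
            time.sleep(iterator_delay)  # unit assumed to be seconds
    return processed, failed, True
```

With ten objects, one failure at the fourth object and checkpoint_count=4, the default exit_when_fail=0 lets the run finish with checkpoints at positions 4 and 8, while a non-zero exit_when_fail stops the run at position 4.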

Mount options for OI Scrub

Typically, the MDT detects a file-level backup/restore by itself when it starts up. For convenience, two new MDT mount options, *scrub* and *noscrub*, are provided so that the administrator can explicitly start (or not start) the OI Scrub when the MDT is mounted.
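
The interplay between automatic detection and the two mount options can be summarised in a few lines. This sketch assumes that an explicit option overrides the automatic detection and that giving both options together is an error; neither point is stated in the design, and the function name is hypothetical.

```python
# Illustrative sketch; precedence and error handling are assumptions.

def should_start_scrub(mount_options, restore_detected):
    """Decide at mount time whether the OI Scrub should start."""
    options = set(mount_options)
    if "scrub" in options and "noscrub" in options:
        raise ValueError("scrub and noscrub are mutually exclusive")
    if "scrub" in options:
        return True
    if "noscrub" in options:
        return False
    # No explicit option: fall back to the MDT's own detection.
    return restore_detected
```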