Lustre Script Coding Style: Difference between revisions

Latest revision as of 00:08, 23 April 2025

Bash Style

Bash is a programming language. It includes functions. Shell code outside of functions is effectively code in an implicit main() function. An entire function should be fully seen on one page (~70-90 lines) and be readily comprehensible. If you have any doubts, then it is too complicated. Make it easier to understand by separating it into subroutines.
The total length of a line (including comment) must not exceed 80 characters. Take advantage of bash's += operator for constants or linefeed escapes \.
- Lines can be split without the need for a linefeed escape after |, ||, & and && operators.
The indentation must use 8-column tabs and not spaces. For line continuation, an additional tab should be used to indent the continued line, or align after [ or ( for continued logic operations.
Comments are just as important in a shell script as in C code.
Use $(...) instead of `...` for subshell commands:
- $(...) is easier to see the start and end of the subshell command
- $(...) avoids confusion between '...' and `...` with a small font
- $(...) can be nested
Use the subshell syntax only when you have to:
- When you need to capture the output of a separate program
- Using the construct with functions leads to stray output and/or convoluted code struggling to avoid output pollution
- It is more computationally efficient to not fork() the Bash process. Bash is slow enough already.
Use "here string" like function <<<$var instead of echo $var | function to avoid forking a subshell and pipe
Use file arguments like awk '...' $file or input redirection like function << $file instead of a useless use of cat
Use built-in Bash Parameter Expansion for variable/string manipulation rather than forking sed/tr:
- Use ${VAR#prefix} or ${VAR%suffix} to remove prefix or suffix respectively
- Use ${VAR/pattern/string} to replace pattern with string
Avoid use of "grep foo | awk '{ print $2 }'" since "awk '/foo/ { print $2 }' works just as well and avoids a separate fork + pipe
If a variable is intended to be used as a boolean, then it must be assigned as follows:

        local mybool=false         # or true
        
        if $mybool; then
                do_stuff
        fi

for loops it is possible to avoid a subshell for $(seq 10) using the built-in iterator for fixed-length loops:
- Unfortunately, {1..$var} does not work, so use (( ... )) arithmetic operator

        for ((i=0; i < $var; i++)); do
                something_with $i
        done

Use export FOOBAR=val instead of FOOBAR=val; export FOOBAR for clarity and simplicity
Use [[ expr ]] instead of [ expr ]
- The [[ test understands regular expression matching with the =~ operator
- The easiest way to use it is by putting the expression in a variable and expanding it after the operator without quotes.

Use (( expr )) instead of [ expr ] or let expr when evaluating numerical expressions
- This can include mathematical operators like $((...))
- Will properly compare numeric values, unlike ... that is comparing strings

        # wrong: this is surprisingly "true" because '5' > '3' and the rest of the string is ignored
        [[ 5 > 33 ]] && echo "y" || echo "n"
        y
        # right: this is what you expect for the '>' operator
        (( 5 > 33 )) && echo "y" || echo "n"
        n

- This uses normal <=, >=, == comparisons instead of -lt, -eq, -gt
- Can use for (( i=0; i <= END; i++ )); do and other numerical expressions instead of an external subshell for seq

Use $((...)) for arithmetic expressions instead of expr ...
- No need for $ when referencing variable names inside $((...))
- $((...)) can handle hex values and common math operators
Error checks should prefer the form [[ check ]] || action to avoid leaving a dangling "false" on the return stack
- Otherwise, [[ check ]] && action will leave a dangling "false" on the stack if check fails and an immediately following return/end of function will return an error

Test Framework

Variables

Names of variables local to current test function which are not exported to the environment should be declared with "local" and use lowercase letters
Names of global variables or variables that exported to the environment should be UPPERCASE letters
Use $TMP/ for temporary non-Lustre files instead of /tmp/
Wse $SECONDS to get the current time when measuring test durations instead of $(date +%s) fork+exec:

    local start=$SECONDS
    
    do something
    local elapsed=$((SECONDS - start))

Functions

Each function must have a section describing what it does and explain the list of parameters

# One line description of this function's purpose
#
# More detailed description of what the function is doing if necessary
#
# usage: function_name [--option argument] {required_argument} ...
# option: meaning of "option" and its argument
# required_argument: meaning of "required_argument"
# 
# expected output and/or return value(s)

Function arguments should be given local variable names for clarity, rather than being used as $1 $2 $3 in the function

    local facet=$1
    local file="$2"
    local size=$3

Use sleep 0.1 instead of usleep 100000, since usleep is RHEL-specific

Tests and Libraries

To avoid clustering a single test-framework.sh file, there should be a <test-lib>.sh file for each test that contains specific functions and variables for that test.
Any functions, variables that global to all tests should be put in test-framework.sh
A test file only need to source test-framework.sh and necessary <test-lib>.sh file

Subtests

test files should be named $tfile for the filename, or base name like $tfile.1 or $tfile.source to simplify debugging
test directories should be named $tdir, and should be used if a large number of files are created for the subtest
small/few test files/dirs do not need to be explicitly deleted at the end of the test, that is done by test-framework.sh at the start/end of each test script
large (over 1MB)/many (over 50) test files/dirs in a subtest should be cleaned up explicitly with a stack_trap so that they are always cleaned up even if the test exits with an error, and do not fill the test filesystem

 
    stack_trap "rm -f $DIR/$tfile.big"
    fallocate -l 100M $DIR/$tfile.big || error "$tfile.big create failed"
    
    stack_trap "unlinkmany $DIR/$tdir/$tfile- 1000"
    createmany -o $DIR/$tdif/$tfile- 1000 || error "$tfile creation failed"

creating large test files is by far the fastest with "fallocate" *when supported* (ldiskfs only), as determined by check_set_fallocate
use test_mkdir to add some variety to directory creation (random local, striped, remote) if directory location is not critical to the test
ensure that directory location and MDS facet are aligned. Since 2.14.54 directories may be created on any MDT, so "do_facet mds1 ..." may be on the wrong MDS.
Use "mkdir_on_mdt0 $DIR/$tdir" to create directories on MDT0000 for use with mds1, or "$LFS getdirstripe -m $DIR/$tdir" to determine MDT index, and "mds$((idx+1))" for facet name.
the error messages in a subtest should be unique so that it is easy to determine which check failed

 
    lfs migrate -c3 $tfile || error "'lfs migrate -c3' failed"
    lfs migrate -c1 $tfile || error "second 'lfs migrate -c1' failed"

use skip to skip subtests that should not run because of permanent functional deficiency (e.g. non-existent functionality in backing filesystem, older version of client/server, wrong configuration)

 
    (( MDS1_VERSION_CODE >= $(version_code 2.15.53) )) ||
        skip "need MDS >= 2.15.53 to check foobar works"
    [[ $mds1_FSTYPE == "ldiskfs" ]] || skip "MDS is not ldiskfs"

use skip_env for minor environmental deficiency in developer test environment (e.g. missing binary) that _should_ exist in autotest:

 
    kinit || skip_env "Kerberos not installed"

@@ Line 22: / Line 22: @@
   <nowiki>
          local mybool=false         # or true
          if $mybool; then
                  do_stuff
@@ Line 28: / Line 28: @@
 </nowiki>
 * for loops it is possible to avoid a subshell for <code>$(seq 10)</code> using the built-in iterator for fixed-length loops:
-** Unfortunately, <code>{1..$var}</code> does not work, but <code>eval {1..$var}</code> does
+** Unfortunately, <code>{1..$var}</code> does not work, so use <code>(( ... ))</code> arithmetic operator
   <nowiki>
-         for i in {1..10}; do
+         for ((i=0; i < $var; i++)); do
                  something_with $i
          done
@@ Line 38: / Line 38: @@
 ** The <code>[[</code> test understands regular expression matching with the <code>=~</code> operator
 ** The easiest way to use it is by putting the expression in a variable and expanding it after the operator without quotes.
 * Use <code><nowiki>(( expr ))</nowiki></code> instead of <code><nowiki>[ expr ]</nowiki></code> or <code>let expr</code> when evaluating numerical expressions
 ** This can include mathematical operators like <code>$((...))</code>
-** This uses normal <code><=</code>, <code>>=</code>, <code>==</code> comparisons
+** Will properly compare numeric values, unlike <code>[[ ... ]]</code> that is comparing strings
+ <nowiki>
+        # wrong: this is surprisingly "true" because '5' > '3' and the rest of the string is ignored
+        [[ 5 > 33 ]] && echo "y" || echo "n"
+        y
+        # right: this is what you expect for the '>' operator
+        (( 5 > 33 )) && echo "y" || echo "n"
+        n
+</nowiki>
+** This uses normal <code><=</code>, <code>>=</code>, <code>==</code> comparisons instead of <code>-lt</code>, <code>-eq</code>, <code>-gt</code>
+** Can use <code><nowiki>for (( i=0; i <= END; i++ )); do</nowiki></code> and other numerical expressions instead of an external subshell for <code>seq</code>
 * Use <code>$((...))</code> for arithmetic expressions instead of <code>expr ...</code>
 ** No need for <code>$</code> when referencing variable names inside <code>$((...))</code>
-** <code>$((...))</code> can handle hex values and math operators
+** <code>$((...))</code> can handle hex values and common math operators
 * Error checks should prefer the form <code><nowiki>[[ check ]] || action</nowiki></code> to avoid leaving a dangling "false" on the return stack
 ** Otherwise, <code><nowiki>[[ check ]] && action</nowiki></code> will leave a dangling "false" on the stack if <code>check</code> fails and an immediately following return/end of function will return an error
@@ Line 49: / Line 61: @@
 == Test Framework ==
 === Variables ===
-* Names of variables local to current script which are not exported to the environment should be declared with "<code>local</code>" and use lowercase letters
+* Names of variables local to current test function which are not exported to the environment should be declared with "<code>local</code>" and use lowercase letters
 * Names of global variables or variables that exported to the environment should be UPPERCASE letters
+* Use <code>$TMP/</code> for temporary non-Lustre files instead of <code>/tmp/</code>
+* Wse <code>$SECONDS</code> to get the current time when measuring test ''durations'' instead of <code>$(date +%s)</code> fork+exec:
+ <nowiki>
+    local start=$SECONDS
+    do something
+    local elapsed=$((SECONDS - start))
+</nowiki>
 === Functions ===
@@ Line 65: / Line 85: @@
 # expected output and/or return value(s)
 </nowiki>
-* Function arguments should be given local variable names for clarity like "<code>local facet=$1</code>", rather than being used as <code>$1</code> in the function
+* Function arguments should be given local variable names for clarity, rather than being used as <code>$1 $2 $3</code> in the function
+ <nowiki>
+    local facet=$1
+    local file="$2"
+    local size=$3
+</nowiki>
 * Use <code>sleep 0.1</code> instead of <code>usleep 100000</code>, since <code>usleep</code> is RHEL-specific
@@ Line 72: / Line 97: @@
 * Any functions, variables that global to all tests should be put in <code>test-framework.sh</code>
 * A test file only need to source <code>test-framework.sh</code> and necessary <code><nowiki><test-lib>.sh</nowiki></code> file
+=== Subtests ===
+* test files should be named <code>$tfile</code> for the filename, or base name like <code>$tfile.1</code> or <code>$tfile.source</code> to simplify debugging
+* test directories should be named <code>$tdir</code>, and should be used if a large number of files are created for the subtest
+* small/few test files/dirs do not need to be explicitly deleted at the end of the test, that is done by test-framework.sh at the start/end of each test script
+* large (over 1MB)/many (over 50) test files/dirs in a subtest should be cleaned up explicitly with a <code>stack_trap</code> so that they are always cleaned up even if the test exits with an error, and do not fill the test filesystem
+  <nowiki>
+    stack_trap "rm -f $DIR/$tfile.big"
+    fallocate -l 100M $DIR/$tfile.big || error "$tfile.big create failed"
+    stack_trap "unlinkmany $DIR/$tdir/$tfile- 1000"
+    createmany -o $DIR/$tdif/$tfile- 1000 || error "$tfile creation failed"
+</nowiki>
+* creating large test files is by far the fastest with "fallocate" *when supported* (ldiskfs only), as determined by <code>check_set_fallocate</code>
+* use <code>test_mkdir</code> to add some variety to directory creation (random local, striped, remote) if directory location is not critical to the test
+* ensure that directory location and MDS facet are aligned.  Since 2.14.54 directories may be created on any MDT, so "<code>do_facet mds1 ...</code>" may be on the wrong MDS.
+* Use "<code>mkdir_on_mdt0 $DIR/$tdir</code>" to create directories on MDT0000 for use with <code>mds1</code>, or "<code>$LFS getdirstripe -m $DIR/$tdir</code>" to determine MDT index, and "<code>mds$((idx+1))</code>" for facet name.
+* the <code>error</code> messages in a subtest should be unique so that it is easy to determine which check failed
+  <nowiki>
+    lfs migrate -c3 $tfile || error "'lfs migrate -c3' failed"
+    lfs migrate -c1 $tfile || error "second 'lfs migrate -c1' failed"
+</nowiki>
+* use <code>skip</code> to skip subtests that should not run because of permanent functional deficiency (e.g. non-existent functionality in backing filesystem, older version of client/server, wrong configuration)
+  <nowiki>
+    (( MDS1_VERSION_CODE >= $(version_code 2.15.53) )) ||
+        skip "need MDS >= 2.15.53 to check foobar works"
+    [[ $mds1_FSTYPE == "ldiskfs" ]] || skip "MDS is not ldiskfs"
+</nowiki>
+* use <code>skip_env</code> for minor environmental deficiency in developer test environment (e.g. missing binary) that _should_ exist in autotest:
+  <nowiki>
+    kinit || skip_env "Kerberos not installed"
+</nowiki>
 [[Category: Development]]

Lustre Script Coding Style: Difference between revisions

Latest revision as of 00:08, 23 April 2025

Contents

Bash Style

Test Framework

Variables

Functions

Tests and Libraries

Subtests

Navigation menu

Lustre Script Coding Style: Difference between revisions

Latest revision as of 00:08, 23 April 2025

Bash Style

Test Framework

Variables

Functions

Tests and Libraries

Subtests

Navigation menu

Search