Lustre Script Coding Style: Difference between revisions

From Lustre Wiki
Jump to navigation Jump to search
(assigning category)
(→‎Bash Style: add "for (( ... )); do")
 
(15 intermediate revisions by the same user not shown)
Line 2: Line 2:
* Bash is a programming language.  It includes functions.  Shell code outside of functions is effectively code in an implicit main() function.  An entire function should be fully seen on one page (~70-90 lines) and be readily comprehensible.  If you have any doubts, then it is too complicated.  Make it easier to understand by separating it into subroutines.
* Bash is a programming language.  It includes functions.  Shell code outside of functions is effectively code in an implicit main() function.  An entire function should be fully seen on one page (~70-90 lines) and be readily comprehensible.  If you have any doubts, then it is too complicated.  Make it easier to understand by separating it into subroutines.
* The total length of a line (including comment) must not exceed 80 characters.  Take advantage of bash's <code>+=</code> operator for constants or linefeed escapes <code>\</code>.
* The total length of a line (including comment) must not exceed 80 characters.  Take advantage of bash's <code>+=</code> operator for constants or linefeed escapes <code>\</code>.
* Lines can be split without the need for a linefeed escape after <code>|</code>, <code>||</code>, <code>&</code> and <code>&&</code> operators.
** Lines can be split without the need for a linefeed escape after <code>|</code>, <code>||</code>, <code>&</code> and <code>&&</code> operators.
* The indentation must 8-column tabs and not spaces. For line continuation, an additional tab should be used to indent the continued line.
* The indentation must use 8-column tabs and not spaces. For line continuation, an additional tab should be used to indent the continued line, or align after <code>[</code> or <code>(</code> for continued logic operations.
* Comments are just as important in a shell script as in C code.
* Comments are just as important in a shell script as in C code.
* Use <code>$(...)</code> instead of <code>`...`</code> for subshell, but avoid them if you can. Text results from functions should be put into a well-named variable. Use the subshell syntax only when you have to (e.g. when you need to capture the output of a separate program).  Using the construct with functions leads to stray output and/or convoluted code struggling to avoid output pollutionIt is also more computationally efficient to not fork() the BASH process. BASH is slow enough already. <code>`...`</code> is obsolete and <code>$(...)</code> is easier to see the start and end of the subshell command, avoids confusion with <code>'...'</code> and a small font, and <code>$(...)</code> can be nested.
* Use <code>$(...)</code> instead of <code>`...`</code> for subshell commands:
* If a variable is intended to be used as a boolean, then it must be assigned as followed:
** <code>$(...)</code> is easier to see the start and end of the subshell command
** <code>$(...)</code> avoids confusion between <code>'...'</code> and <code>`...`</code> with a small font
** <code>$(...)</code> can be nested
* Use the subshell syntax only when you have to:
** When you need to capture the output of a separate program
** Using the construct with functions leads to stray output and/or convoluted code struggling to avoid output pollution
** It is more computationally efficient to not fork() the Bash process. Bash is slow enough already.
* Use "here string" like <code>function <<<$var</code> instead of <code>echo $var | function</code> to avoid forking a subshell and pipe
* Use file arguments like <code>awk '...' $file</code> or input redirection like <code>function << $file</code> instead of a [http://porkmail.org/era/unix/award.html useless use of <code>cat</code>]
* Use built-in Bash [https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html Parameter Expansion] for variable/string manipulation rather than forking <code>sed/tr</code>:
** Use <code>${VAR#prefix}</code> or <code>${VAR%suffix}</code> to remove <code>prefix</code> or <code>suffix</code> respectively
** Use <code>${VAR/pattern/string}</code> to replace <code>pattern</code> with </code>string</code>
* Avoid use of "<code>grep foo | awk '{ print $2 }'</code>" since "<code>awk '/foo/ { print $2 }'</code> works just as well and avoids a separate fork + pipe
* If a variable is intended to be used as a boolean, then it must be assigned as follows:
  <nowiki>
  <nowiki>
local mybool=false        # or true
        local mybool=false        # or true
if ${mybool}; then
       
        do_stuff
        if $mybool; then
fi
                do_stuff
        fi
</nowiki>  
</nowiki>  
* for loops it is possible to avoid a subshell for <code>$(seq 10)</code> using the built-in iterator for fixed-length loops:
** Unfortunately, <code>{1..$var}</code> does not work, so use <code>(( ... ))</code> arithmetic operator
<nowiki>
        for ((i=0; i < $var; i++)); do
                something_with $i
        done
</nowiki>
* Use <code>export FOOBAR=val</code> instead of <code>FOOBAR=val; export FOOBAR</code> for clarity and simplicity
* Use <code>export FOOBAR=val</code> instead of <code>FOOBAR=val; export FOOBAR</code> for clarity and simplicity
* Use <code><nowiki>[[ expr ]]</nowiki></code> instead of <code><nowiki>[ expr ]</nowiki></code>, especially since the <code>[[</code> test understands regular expression matching with the <code>=~</code> operatorThe easiest way to use it is by putting the RE in a variable and expanding the RE after the operator without quotes.
* Use <code><nowiki>[[ expr ]]</nowiki></code> instead of <code><nowiki>[ expr ]</nowiki></code>
* Use <code>$((...))</code> for arithmetic expressions instead of <code>expr</code>
** The <code>[[</code> test understands regular expression matching with the <code>=~</code> operator
** The easiest way to use it is by putting the expression in a variable and expanding it after the operator without quotes.
 
* Use <code><nowiki>(( expr ))</nowiki></code> instead of <code><nowiki>[ expr ]</nowiki></code> or <code>let expr</code> when evaluating numerical expressions
** This can include mathematical operators like <code>$((...))</code>
** This uses normal <code><=</code>, <code>>=</code>, <code>==</code> comparisons instead of <code>-lt</code>, <code>-eq</code>, <code>-gt</code>
** Can use <code><nowiki>for (( i=0; i <= END; i++ )); do</nowiki></code> and other numerical expressions instead of an external subshell for <code>seq</seq>
 
* Use <code>$((...))</code> for arithmetic expressions instead of <code>expr ...</code>
** No need for <code>$</code> when referencing variable names inside <code>$((...))</code>
** <code>$((...))</code> can handle hex values and common math operators
* Error checks should prefer the form <code><nowiki>[[ check ]] || action</nowiki></code> to avoid leaving a dangling "false" on the return stack
** Otherwise, <code><nowiki>[[ check ]] && action</nowiki></code> will leave a dangling "false" on the stack if <code>check</code> fails and an immediately following return/end of function will return an error


== Test Framework ==
== Test Framework ==
=== Variables ===
=== Variables ===
* Names of variables local to current script which are not exported to the environment should be declared with "local" and use lowercase letters
* Names of variables local to current test function which are not exported to the environment should be declared with "<code>local</code>" and use lowercase letters
* Names of global variables or variables that exported to the environment should be uppercase letters
* Names of global variables or variables that exported to the environment should be UPPERCASE letters
* Use <code>$TMP/</code> for temporary non-Lustre files instead of <code>/tmp/</code>
* Wse <code>$SECONDS</code> to get the current time when measuring test ''durations'' instead of <code>$(date +%s)</code> fork+exec:
<nowiki>
    local start=$SECONDS
   
    do something
    local elapsed=$((SECONDS - start))
</nowiki>
 
=== Functions ===
=== Functions ===
* Each function must have a section describing what it does and explain the list of parameters
* Each function must have a section describing what it does and explain the list of parameters
Line 34: Line 76:
# expected output and/or return value(s)
# expected output and/or return value(s)
</nowiki>
</nowiki>
* Function arguments should be given local variable names for clarity, rather than being used as <code>$1 $2 $3</code> in the function
<nowiki>
    local facet=$1
    local file="$2"
    local size=$3
</nowiki>
* Use <code>sleep 0.1</code> instead of <code>usleep 100000</code>, since <code>usleep</code> is RHEL-specific
=== Tests and Libraries ===
=== Tests and Libraries ===
* To avoid clustering a single <code>test-framework.sh</code> file, there should be a <code><nowiki><test-lib>.sh</nowiki></code> file for each test that contains specific functions and variables for that test.
* To avoid clustering a single <code>test-framework.sh</code> file, there should be a <code><nowiki><test-lib>.sh</nowiki></code> file for each test that contains specific functions and variables for that test.
* Any functions, variables that global to all tests should be put in <code>test-framework.sh</code>
* Any functions, variables that global to all tests should be put in <code>test-framework.sh</code>
* A test file only need to source <code>test-framework.sh</code> and necessary <code><nowiki><test-lib>.sh</nowiki></code> file
* A test file only need to source <code>test-framework.sh</code> and necessary <code><nowiki><test-lib>.sh</nowiki></code> file
=== Subtests ===
* test files should be named <code>$tfile</code> for the filename, or base name like <code>$tfile.1</code> or <code>$tfile.source</code> to simplify debugging
* test directories should be named <code>$tdir</code>, and should be used if a large number of files are created for the subtest
* small/few test files/dirs do not need to be explicitly deleted at the end of the test, that is done by test-framework.sh at the start/end of each test script
* large (over 1MB)/many (over 50) test files/dirs in a subtest should be cleaned up explicitly with a <code>stack_trap</code> so that they are always cleaned up even if the test exits with an error, and do not fill the test filesystem
  <nowiki>
    stack_trap "rm -f $DIR/$tfile.big"
    fallocate -l 100M $DIR/$tfile.big || error "$tfile.big create failed"
   
    stack_trap "unlinkmany $DIR/$tdir/$tfile- 1000"
    createmany -o $DIR/$tdif/$tfile- 1000 || error "$tfile creation failed"
</nowiki>
* creating large test files is by far the fastest with "fallocate" *when supported* (ldiskfs only), as determined by <code>check_set_fallocate</code>
* use <code>test_mkdir</code> to add some variety to directory creation (random local, striped, remote) if directory location is not critical to the test
* ensure that directory location and MDS facet are aligned.  Since 2.14.54 directories may be created on any MDT, so "<code>do_facet mds1 ...</code>" may be on the wrong MDS.
* Use "<code>mkdir_on_mdt0 $DIR/$tdir</code>" to create directories on MDT0000 for use with <code>mds1</code>, or "<code>$LFS getdirstripe -m $DIR/$tdir</code>" to determine MDT index, and "<code>mds$((idx+1))</code>" for facet name.
* the <code>error</code> messages in a subtest should be unique so that it is easy to determine which check failed
  <nowiki>
    lfs migrate -c3 $tfile || error "'lfs migrate -c3' failed"
    lfs migrate -c1 $tfile || error "second 'lfs migrate -c1' failed"
</nowiki>
* use <code>skip</code> to skip subtests that should not run because of permanent functional deficiency (e.g. non-existent functionality in backing filesystem, older version of client/server, wrong configuration)
  <nowiki>
    (( MDS1_VERSION_CODE >= $(version_code 2.15.53) )) ||
        skip "need MDS >= 2.15.53 to check foobar works"
    [[ $mds1_FSTYPE == "ldiskfs" ]] || skip "MDS is not ldiskfs"
</nowiki>
* use <code>skip_env</code> for minor environmental deficiency in developer test environment (e.g. missing binary) that _should_ exist in autotest:
  <nowiki>
    kinit || skip_env "Kerberos not installed"
</nowiki>
   


[[Category: Development]]
[[Category: Development]]

Latest revision as of 00:10, 12 April 2024

Bash Style

  • Bash is a programming language. It includes functions. Shell code outside of functions is effectively code in an implicit main() function. An entire function should be fully seen on one page (~70-90 lines) and be readily comprehensible. If you have any doubts, then it is too complicated. Make it easier to understand by separating it into subroutines.
  • The total length of a line (including comment) must not exceed 80 characters. Take advantage of bash's += operator for constants or linefeed escapes \.
    • Lines can be split without the need for a linefeed escape after |, ||, & and && operators.
  • The indentation must use 8-column tabs and not spaces. For line continuation, an additional tab should be used to indent the continued line, or align after [ or ( for continued logic operations.
  • Comments are just as important in a shell script as in C code.
  • Use $(...) instead of `...` for subshell commands:
    • $(...) is easier to see the start and end of the subshell command
    • $(...) avoids confusion between '...' and `...` with a small font
    • $(...) can be nested
  • Use the subshell syntax only when you have to:
    • When you need to capture the output of a separate program
    • Using the construct with functions leads to stray output and/or convoluted code struggling to avoid output pollution
    • It is more computationally efficient to not fork() the Bash process. Bash is slow enough already.
  • Use "here string" like function <<<$var instead of echo $var | function to avoid forking a subshell and pipe
  • Use file arguments like awk '...' $file or input redirection like function << $file instead of a useless use of cat
  • Use built-in Bash Parameter Expansion for variable/string manipulation rather than forking sed/tr:
    • Use ${VAR#prefix} or ${VAR%suffix} to remove prefix or suffix respectively
    • Use ${VAR/pattern/string} to replace pattern with string
  • Avoid use of "grep foo | awk '{ print $2 }'" since "awk '/foo/ { print $2 }' works just as well and avoids a separate fork + pipe
  • If a variable is intended to be used as a boolean, then it must be assigned as follows:
        local mybool=false         # or true
        
        if $mybool; then
                do_stuff
        fi
 
  • for loops it is possible to avoid a subshell for $(seq 10) using the built-in iterator for fixed-length loops:
    • Unfortunately, {1..$var} does not work, so use (( ... )) arithmetic operator
        for ((i=0; i < $var; i++)); do
                something_with $i
        done

  • Use export FOOBAR=val instead of FOOBAR=val; export FOOBAR for clarity and simplicity
  • Use [[ expr ]] instead of [ expr ]
    • The [[ test understands regular expression matching with the =~ operator
    • The easiest way to use it is by putting the expression in a variable and expanding it after the operator without quotes.
  • Use (( expr )) instead of [ expr ] or let expr when evaluating numerical expressions
    • This can include mathematical operators like $((...))
    • This uses normal <=, >=, == comparisons instead of -lt, -eq, -gt
    • Can use for (( i=0; i <= END; i++ )); do and other numerical expressions instead of an external subshell for seq</seq>
  • Use $((...)) for arithmetic expressions instead of expr ...
    • No need for $ when referencing variable names inside $((...))
    • $((...)) can handle hex values and common math operators
  • Error checks should prefer the form [[ check ]] || action to avoid leaving a dangling "false" on the return stack
    • Otherwise, [[ check ]] && action will leave a dangling "false" on the stack if check fails and an immediately following return/end of function will return an error

Test Framework

Variables

  • Names of variables local to current test function which are not exported to the environment should be declared with "local" and use lowercase letters
  • Names of global variables or variables that exported to the environment should be UPPERCASE letters
  • Use $TMP/ for temporary non-Lustre files instead of /tmp/
  • Wse $SECONDS to get the current time when measuring test durations instead of $(date +%s) fork+exec:
    local start=$SECONDS
    
    do something
    local elapsed=$((SECONDS - start))

Functions

  • Each function must have a section describing what it does and explain the list of parameters
# One line description of this function's purpose
#
# More detailed description of what the function is doing if necessary
#
# usage: function_name [--option argument] {required_argument} ...
# option: meaning of "option" and its argument
# required_argument: meaning of "required_argument"
# 
# expected output and/or return value(s)

  • Function arguments should be given local variable names for clarity, rather than being used as $1 $2 $3 in the function
    local facet=$1
    local file="$2"
    local size=$3

  • Use sleep 0.1 instead of usleep 100000, since usleep is RHEL-specific

Tests and Libraries

  • To avoid clustering a single test-framework.sh file, there should be a <test-lib>.sh file for each test that contains specific functions and variables for that test.
  • Any functions, variables that global to all tests should be put in test-framework.sh
  • A test file only need to source test-framework.sh and necessary <test-lib>.sh file

Subtests

  • test files should be named $tfile for the filename, or base name like $tfile.1 or $tfile.source to simplify debugging
  • test directories should be named $tdir, and should be used if a large number of files are created for the subtest
  • small/few test files/dirs do not need to be explicitly deleted at the end of the test, that is done by test-framework.sh at the start/end of each test script
  • large (over 1MB)/many (over 50) test files/dirs in a subtest should be cleaned up explicitly with a stack_trap so that they are always cleaned up even if the test exits with an error, and do not fill the test filesystem
 
    stack_trap "rm -f $DIR/$tfile.big"
    fallocate -l 100M $DIR/$tfile.big || error "$tfile.big create failed"
    
    stack_trap "unlinkmany $DIR/$tdir/$tfile- 1000"
    createmany -o $DIR/$tdif/$tfile- 1000 || error "$tfile creation failed"

  • creating large test files is by far the fastest with "fallocate" *when supported* (ldiskfs only), as determined by check_set_fallocate
  • use test_mkdir to add some variety to directory creation (random local, striped, remote) if directory location is not critical to the test
  • ensure that directory location and MDS facet are aligned. Since 2.14.54 directories may be created on any MDT, so "do_facet mds1 ..." may be on the wrong MDS.
  • Use "mkdir_on_mdt0 $DIR/$tdir" to create directories on MDT0000 for use with mds1, or "$LFS getdirstripe -m $DIR/$tdir" to determine MDT index, and "mds$((idx+1))" for facet name.
  • the error messages in a subtest should be unique so that it is easy to determine which check failed
 
    lfs migrate -c3 $tfile || error "'lfs migrate -c3' failed"
    lfs migrate -c1 $tfile || error "second 'lfs migrate -c1' failed"

  • use skip to skip subtests that should not run because of permanent functional deficiency (e.g. non-existent functionality in backing filesystem, older version of client/server, wrong configuration)
 
    (( MDS1_VERSION_CODE >= $(version_code 2.15.53) )) ||
        skip "need MDS >= 2.15.53 to check foobar works"
    [[ $mds1_FSTYPE == "ldiskfs" ]] || skip "MDS is not ldiskfs"

  • use skip_env for minor environmental deficiency in developer test environment (e.g. missing binary) that _should_ exist in autotest:
 
    kinit || skip_env "Kerberos not installed"