Lustre Script Coding Style: Difference between revisions

From Lustre Wiki

Revision as of 13:51, 18 June 2020

Bash Style

  • Bash is a programming language. It includes functions. Shell code outside of functions is effectively code in an implicit main() function. An entire function should fit on one page (~70-90 lines) and be readily comprehensible. If you have any doubts, then it is too complicated. Make it easier to understand by separating it into subroutines.
  • The total length of a line (including comments) must not exceed 80 characters. Take advantage of Bash's += operator for constants, or of linefeed escapes (\), to split long lines.
  • Lines can be split without the need for a linefeed escape after |, ||, & and && operators.
  • The indentation must use 8-column tabs and not spaces. For line continuation, an additional tab should be used to indent the continued line, or align after [ or ( for continued logic operations.
  • Comments are just as important in a shell script as in C code.
  • Use well-named lowercase local variable names, and UPPERCASE global variable names for clarity.
  • Use $(...) instead of `...` for subshell commands: the start and end of the command are easier to see, it cannot be confused with '...' in a small font, and it can be nested. Use the subshell syntax only when you have to (e.g. when you need to capture the output of a separate program). Using the construct with functions leads to stray output and/or convoluted code struggling to avoid output pollution. It is also more computationally efficient not to fork() the Bash process. Bash is slow enough already.
  • Use a "here string" like function <<<$var instead of echo $var | function to avoid forking a subshell and creating a pipe.
  • Use built-in Bash Parameter Expansion for variable/string manipulation rather than forking sed/tr if possible.
  • Avoid "grep foo | awk '{ print $2 }'", since "awk '/foo/ { print $2 }'" works just as well and avoids a separate fork and pipe.
  • If a variable is intended to be used as a boolean, then it must be assigned as follows:
local mybool=false         # or true
if $mybool; then
        do_stuff
fi
 
  • In for loops, the subshell for $(seq 10) can be avoided by using the built-in iterator for fixed-length loops (unfortunately, {1..$var} does not work):
for i in {1..10}; do
        something_with $i
done
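
When the upper bound is held in a variable, brace expansion does not help, because Bash performs it before parameter expansion. One possible alternative, offered here as a sketch rather than as a rule from this guide, is Bash's arithmetic for loop, which also avoids forking seq. The names num and total are illustrative only.

```bash
#!/bin/bash
# Sketch: iterate over 1..num without $(seq $num) or {1..$num}.
# "num" and "total" are hypothetical names used for illustration.
num=5
total=0

for ((i = 1; i <= num; i++)); do
	total=$((total + i))
done

echo "sum of 1..$num is $total"
```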

  • Use export FOOBAR=val instead of FOOBAR=val; export FOOBAR for clarity and simplicity
  • Use [[ expr ]] instead of [ expr ], especially since the [[ test understands regular expression matching with the =~ operator. The easiest way to use it is by putting the RE in a variable and expanding the RE after the operator without quotes.
  • Use $((...)) for arithmetic expressions instead of expr.
  • Use {1..10} to generate a constant list of integers instead of $(seq 10). Unfortunately, {1..$num} does not work for variable-length lists.
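
Several of the rules above (parameter expansion, here strings, [[ ]] with =~, and $((...))) can be sketched together as follows. This is a minimal illustration, not code taken from the Lustre scripts; the path, the version string, and all variable names are assumptions made up for the example.

```bash
#!/bin/bash
# Sketch of several rules above; all names and values are hypothetical.
path="/mnt/lustre/dir/file.txt"

# Parameter expansion instead of forking basename, sed or tr
# Remove the longest prefix ending in "/"
file=${path##*/}
# Remove the shortest suffix starting with "."
base=${file%.*}

# Here string instead of piping echo into the command
line="alpha beta gamma"
read -r first second rest <<<"$line"

# Regular expression kept in a variable, expanded unquoted after =~
version="2.12.5"
re='^([0-9]+)\.([0-9]+)\.([0-9]+)$'
if [[ $version =~ $re ]]; then
	major=${BASH_REMATCH[1]}
	minor=${BASH_REMATCH[2]}
fi

# $((...)) instead of expr
code=$((major * 100 + minor))

echo "$file $base $second $code"
```

None of these steps forks a process, which is the main point of the rules they illustrate.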

Test Framework

Variables

  • Names of variables that are local to the current script and not exported to the environment should be declared with "local" and use lowercase letters.
  • Names of global variables, or of variables that are exported to the environment, should use uppercase letters.
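
The naming convention above might look like this in practice. This is only a sketch: MOUNT_POINT, show_target, fsname and the path are hypothetical names chosen for the example.

```bash
#!/bin/bash
# Sketch only: every name and path here is hypothetical.

# Global and exported to the environment: uppercase
export MOUNT_POINT=/mnt/lustre

show_target() {
	# Local to this function: lowercase, declared with "local"
	local fsname=$1
	local target="$MOUNT_POINT/$fsname"

	echo "target: $target"
}

show_target testfs
```

Because target is declared with local, it is not visible once show_target returns, so it cannot collide with a variable of the same name elsewhere in the script.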

Functions

  • Each function must have a comment section that describes what it does and explains its parameters:
# One line description of this function's purpose
#
# More detailed description of what the function is doing if necessary
#
# usage: function_name [--option argument] {required_argument} ...
# option: meaning of "option" and its argument
# required_argument: meaning of "required_argument"
# 
# expected output and/or return value(s)
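
As an illustration, the template could be filled in for a small helper like the one below. The function join_args, its --sep option, and the JOINED variable are hypothetical and do not come from the Lustre scripts. Storing the result in a global variable is one way, assumed here, to follow the Bash Style advice against wrapping functions in $(...).

```bash
#!/bin/bash
# Join words with a separator
#
# Builds the joined string without forking and stores it in the global
# variable JOINED, so that callers do not need $(...) to capture it.
#
# usage: join_args [--sep separator] {word} ...
# --sep: string placed between words (default: ",")
# word: one or more strings to join
#
# returns 0 and sets JOINED; returns 1 if no words are given
join_args() {
	local sep=","
	local word

	if [[ $1 == --sep ]]; then
		sep=$2
		shift 2
	fi

	(($# > 0)) || return 1

	JOINED=$1
	shift
	for word in "$@"; do
		JOINED+="$sep$word"
	done
}

join_args --sep : ost0 ost1 ost2 && echo "$JOINED"
```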

Tests and Libraries

  • To avoid cluttering a single test-framework.sh file, there should be a <test-lib>.sh file for each test that contains the functions and variables specific to that test.
  • Any functions or variables that are global to all tests should be put in test-framework.sh.
  • A test file only needs to source test-framework.sh and the necessary <test-lib>.sh file.
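
The layout described above can be sketched with stub files in a temporary directory. Everything in this example is an assumption made for illustration: the library name sanity-lib.sh, the helper fw_run, and the variables are hypothetical, and the stub does not reproduce the contents of the real test-framework.sh.

```bash
#!/bin/bash
# Sketch of the layout above using stub files in a temporary directory.
# sanity-lib.sh, fw_run, sanity_check and the variables are hypothetical.
tmpdir=$(mktemp -d)

# Stub for the file shared by all tests
cat > "$tmpdir/test-framework.sh" <<'EOF'
TESTS_RUN=0
fw_run() {
	TESTS_RUN=$((TESTS_RUN + 1))
	"$@"
}
EOF

# Stub for the library specific to one test
cat > "$tmpdir/sanity-lib.sh" <<'EOF'
SANITY_MSG="sanity ok"
sanity_check() {
	echo "$SANITY_MSG"
}
EOF

# The test file itself only sources the two files it needs
. "$tmpdir/test-framework.sh"
. "$tmpdir/sanity-lib.sh"

fw_run sanity_check
rm -rf "$tmpdir"
```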