CHAPTER 16

POSIX

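This chapter describes how to install and run the POSIX compliance suite of file system tests. It includes the following sections: Introduction to POSIX, Installing POSIX, Building and Running a POSIX Compliance Test Suite on Lustre, and Isolating and Debugging Failures.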


16.1 Introduction to POSIX

Portable Operating System Interface (POSIX) is a set of standard operating system interfaces based on the UNIX OS. POSIX defines file system behavior on a single UNIX node. Although used mainly with UNIX systems, the POSIX standard can apply to any operating system.

POSIX specifies the user and software interfaces to the OS. Required program-level services include basic I/O (file, terminal, and network) services. POSIX also defines a standard threading library API which is supported by most modern operating systems.

POSIX in a cluster means that most operations are atomic. Clients cannot see the metadata. POSIX offers strict mandatory locking, which guarantees these semantics. Users have no control over these locks.



Note - Lustre is not completely POSIX-compliant, so test results may show some errors. If you have questions about test results, contact our QE and Test Team (lustre-koala-team@sun.com).



16.2 Installing POSIX

Several quick start versions of the POSIX compliance suite are available to download. Each version is gcc- and architecture-specific. You need to determine which version of gcc you are running locally ({{{gcc -v}}}) and then download the appropriate tarball.

If a package is not available for your particular combination of gcc+architecture, see Building and Running a POSIX Compliance Test Suite on Lustre.
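
Because each quick start package is specific to a gcc version and an architecture, it can help to print both values together before choosing a tarball. The following is a minimal sketch and not part of the POSIX suite: the toolchain_label function and its output format are hypothetical and do not correspond to any actual tarball name. It relies only on gcc -dumpversion and uname -m.

```shell
#!/bin/sh
# Combine a gcc version and a machine architecture into one label.
# The "gcc<version>-<arch>" format is hypothetical, chosen for readability only.
toolchain_label() {
    printf 'gcc%s-%s\n' "$1" "$2"
}

# Print the label for the local toolchain, if gcc is installed.
if command -v gcc >/dev/null 2>&1; then
    toolchain_label "$(gcc -dumpversion)" "$(uname -m)"
else
    echo "gcc not found in PATH" >&2
fi
```

Compare the printed values by eye against the list of available packages. The label is only a convenience for reading both values at once.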

The following quick start versions are provided:

16.2.1 POSIX Installation Using a Quick Start Version

Use this procedure to install POSIX using a quick start version.

1. Download the POSIX scripts into /usr/src/posix.

Both scripts are available at:

http://downloads.lustre.org/public/tools/benchmarks/posix/

2. Launch the setup script. Run:

cd /usr/src/posix 
sh one-step-setup.sh

3. Edit the configuration file /mnt/lustre/TESTROOT/tetexec.cfg with appropriate values for your system.

4. Save the TESTROOT for running Lustre tests. Run:

cd /mnt/lustre 
tar zcvf /usr/src/posix/TESTROOT.tgz TESTROOT


Note - The quick start installation procedure only works with the paths /home/tet and /mnt/lustre. If you want to change the paths, follow the steps in Building and Running a POSIX Compliance Test Suite on Lustre and create a new tarball.


5. Launch the test suite. Run:

su - vsx0
. ../profile 
tcc -e -a /mnt/lustre/TESTROOT -s scen.exec -p


16.3 Building and Running a POSIX Compliance Test Suite on Lustre

This section describes how to build and run a POSIX compliance test suite for a compiler and architecture for which we do not provide a quick start package.

16.3.1 Building the Test Suite from Scratch

This section describes building a POSIX compliance suite to test a Lustre file system.

1. Download all POSIX files from http://downloads.lustre.org/public/tools/benchmarks/posix



Note - We now use the latest release of the LSB-VSX POSIX test suite (lts_vsx-pcts2.0beta2.tgz) and the generic TET/VSXgen framework (tet_vsxgen_3.02.tgz). In this release, the issue of "getgroups() did not return NGROUPS_MAX" has been fixed.


2. DO NOT configure or mount a Lustre file system yet.

3. Run the {{{install.sh}}} script and select /home/tet for the root directory for the test suite installation. Say 'y' to install the users and groups. Accept the defaults to install the packages.

4. Create a temporary directory to hold the POSIX tests while they are being built. Run:

mkdir -p /mnt/lustre/TESTROOT; chown vsx0:vsxg0 /mnt/lustre/TESTROOT

5. Log in as the test user. Run:

su - vsx0

6. Build the test suite. Run:

../setup.sh

Most of the default answers are correct, except the root directory from which to run the testsets. For this you should specify /mnt/lustre/TESTROOT. For "Install pseudolanguages?", answer 'n'.

7. When the script prompts "Install scripts into TESTROOT/BIN..?", do not stop the script from running (this does not work). Instead, use another terminal to replace the existing files with the downloaded files. Enter:

cp .../myscen.bld  /home/tet/test_sets/scen.bld 
cp .../myscen.exec /home/tet/test_sets/scen.exec

This confines the tests that are run to those relevant for file systems, avoiding hours of running other tests on sockets, math, stdio, libc, shell, etc.

8. Continue with the installation at this point. Answer 'y' to the "Build testsets" question.

The script builds and installs all file system tests and then runs them all. Although the script runs the tests on a local file system at this point, the results are a valuable baseline for comparison with the behavior of Lustre.

The results are put into /home/tet/test_sets/results/0002e/journal. It is suggested that you rename or symlink the 0002e directory to /home/tet/test_sets/results/ext3 (or the name of the local file system on which the tests were run), so that the journal is available as /home/tet/test_sets/results/ext3/journal.

Running the full test should only take about 5 minutes.

9. Answer 'n' to re-running just the failed tests.

The results (in a table) are in /home/tet/test_sets/results/report.

10. Save the test suite for later use, to run additional tests on a Lustre file system. Tar up the tests to avoid rebuilding them each time. Enter:

tar cvzf TESTROOT.tgz -C /mnt/lustre TESTROOT


Tip - At this time, you probably want to remove the installed tests, to save a bit of space and, more importantly, to avoid confusion if you forget to mount your Lustre file system before running the tests.
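
One way to guard against running the tests on an unmounted /mnt/lustre is to check the mount table first. The sketch below assumes a Linux system where /proc/mounts lists the mount point in the second field and the file system type in the third, and that Lustre mounts appear with the type lustre. The function name is_lustre_mounted is illustrative and is not part of Lustre or the test suite.

```shell
#!/bin/sh
# Return 0 if mount_point is listed with file system type "lustre".
# The mounts file defaults to /proc/mounts; it is a parameter so that
# the check can be exercised against a sample file.
is_lustre_mounted() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' \
        "$mounts_file"
}

if is_lustre_mounted /mnt/lustre; then
    echo "/mnt/lustre is a Lustre mount"
else
    echo "/mnt/lustre is not a Lustre mount; do not run the tests yet" >&2
fi
```

Running a check like this before tcc makes it harder to test the local file system by mistake.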


16.3.2 Running the Test Suite Against Lustre

1. As root, set up your Lustre file system, mounted on /mnt/lustre (e.g., sh llmount.sh) and untar the POSIX tests back to their home. Enter:

tar --same-owner -xzpvf /path/to/tarball/TESTROOT.tgz -C /mnt/lustre

2. As the vsx0 user, you can re-run the tests as many times as necessary. If you have just used su or logged in as the vsx0 user, source the profile first (the first command below) so that your path and other environment variables are set up correctly. To run the tests, enter:

. /home/tet/profile 
tcc -e -s scen.exec -a /mnt/lustre/TESTROOT -p

Each new result is put in a new directory under /home/tet/test_sets/results and given a name similar to 0004e (an increasing number that ends with e for test execution or b for building the tests).

3. To look at a formatted report, enter:

vrpt results/0004e/journal | less

Some tests are "Unsupported", "Untested", or "Not In Use", which does not necessarily indicate a problem.

4. To compare two test results, run:

vrptm results/ext3/journal results/0004e/journal | less

This is more interesting than looking at the result of a single test, since it helps find test failures that are specific to the file system instead of the Linux VFS or kernel. Up to 6 test results can be compared at one time.

It is often useful to rename the results directories to more meaningful names (such as before_unlink_fix).
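
Because result directories are numbered, a small helper can locate the most recent execution run before you pass it to vrpt or rename it. This is a sketch under assumptions: it expects the four-digit, zero-padded names seen above (such as 0002e and 0004e), and latest_exec_result is a hypothetical helper name, not a command provided by the test suite.

```shell
#!/bin/sh
# Print the name of the highest-numbered execution result directory
# (NNNNe) under the given results directory; return 1 if none exists.
# Relies on the shell expanding globs in sorted order.
latest_exec_result() {
    results_dir=$1
    latest=""
    for dir in "$results_dir"/[0-9][0-9][0-9][0-9]e; do
        [ -d "$dir" ] || continue
        latest=${dir##*/}
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    else
        return 1
    fi
}

results=/home/tet/test_sets/results
if latest=$(latest_exec_result "$results"); then
    echo "Latest execution run: $results/$latest/journal"
else
    echo "No execution results found under $results" >&2
fi
```

Build directories (ending in b) and renamed directories such as ext3 are ignored by the pattern, so a renamed baseline is never reported as the latest run.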


16.4 Isolating and Debugging Failures

When failures occur, you need to gather information about what is happening at runtime. For example, some tests may cause kernel panics depending on your configuration.



Note - The default value for this option puts the debug log under your test directory in /mnt/lustre/TESTROOT, which may not be useful if you experience a kernel panic and Lustre (or your machine) crashes.


VSX_DBUG_FLAGS=t:d:n:f:F:L:l,2:p:P
tcc -Tall5 -e -s scen.exec -a /mnt/lustre/TESTROOT -p 2>&1 | tee /tmp/POSIX-command-line-output.log
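
Since a log stored on the file system under test may be lost in a crash, it can be worth checking that a chosen log path lies outside /mnt/lustre before starting a run. The sketch below is a plain string-prefix check. The helper path_is_under is hypothetical, and it does not resolve symbolic links or relative paths.

```shell
#!/bin/sh
# Return 0 if path $1 is equal to, or located below, directory $2.
# This is a textual prefix comparison only.
path_is_under() {
    case "$1/" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

log=/tmp/POSIX-command-line-output.log
if path_is_under "$log" /mnt/lustre; then
    echo "warning: $log is on the file system under test" >&2
else
    echo "log file $log is outside /mnt/lustre"
fi
```

The trailing slash added to the first argument keeps a sibling directory such as /mnt/lustre2 from being treated as part of /mnt/lustre.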

Each subtest (e.g., 'access', 'create') usually contains a number of single tests. The report shows exactly which single test fails. In this case, you can find more information directly from the VSX source code. For example, if the fifth single test of subtest chmod failed, you could look at the source:

/home/tet/test_sets/tset/POSIX.os/files/chmod/chmod.c

...which contains a single test array:

public  struct tet_testlist tet_testlist[] = {
test1, 1,
test2, 2,
test3, 3,
test4, 4,
test5, 5,
test6, 6,
test7, 7,
test8, 8,
test9, 9,
test10, 10,
test11, 11,
test12, 12,
test13, 13,
test14, 14,
test15, 15,
test16, 16,
test17, 17,
test18, 18,
test19, 19,
test20, 20,
test21, 21,
test22, 22,
test23, 23,
NULL, 0
};

If this single test is causing problems, as in the case of a kernel panic, or if you are trying to isolate a single failure, it may be useful to edit the tet_testlist array down to the single test in question and then recompile the test suite. You can then create a new tarball of the resulting TESTROOT directory, named appropriately (e.g., TESTROOT-chmod-5-only.tgz), and re-run the POSIX suite using the steps above.
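
Trimming the array by hand is error-prone when it has many entries. The following sketch filters a source file so that only one testN entry remains. It assumes the layout shown above, with one "testN, N," entry per line, and the helper name keep_single_test is hypothetical. The result goes to standard output so that you can review it before replacing any file in the source tree.

```shell
#!/bin/sh
# Print source file $1 with every "testN, N," array entry removed
# except the one whose number is $2. All other lines, including the
# "NULL, 0" terminator, pass through unchanged.
keep_single_test() {
    src=$1
    keep=$2
    case "$keep" in
        ''|*[!0-9]*) echo "test number must be numeric" >&2; return 2 ;;
    esac
    sed -e '/^[[:space:]]*test[0-9][0-9]*,/{' \
        -e "/^[[:space:]]*test${keep},/"'!d' \
        -e '}' "$src"
}

# Example (paths as used in this chapter; output file name is arbitrary):
# keep_single_test /home/tet/test_sets/tset/POSIX.os/files/chmod/chmod.c 5 > /tmp/chmod-5-only.c
```

The comma in the pattern keeps test5 from also matching test15. Any test functions left unreferenced in the file may still produce compiler warnings.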

It may also be helpful to edit the scen.exec file to run only the test set in question:

all 
"total tests in POSIX.os 1"
/tset/POSIX.os/files/chmod/T.chmod


Note - Rebuilding individual POSIX tests is not straightforward due to the reliance on tcc. One option is to substitute edited source files into the source tree while following the manual installation procedure described above and let the existing POSIX install scripts do the work. The installation scripts (specifically /home/tet/test_sets/run_testsets.sh) contain relevant commands to build the test suite -- something akin to tcc -p -b -s $HOME/scen.bld $* -- but these commands may not work outside the scripts. Let us know if you get better mileage rebuilding these tests.