Using Maloo

Introduction

Maloo is Whamcloud's shared database of Lustre testing results, allowing those testing Lustre to collaborate and track progress. Maloo is available for interested parties to upload and view test data in order to review results for automated patch regression testing, help isolate intermittent test failures, and review past test results. The Maloo database contains several years of test results, from testing every version of every patch submitted to the Lustre tree.

Getting started

Maloo is located at https://testing.whamcloud.com. At present, all access to view and upload results requires a user account in order to avoid abuse of a publicly-available system resource. You can register for an account using the New User? link on the login page. A newly created account is not usable until an administrator validates it to prevent system abuse – we aim to validate accounts within 24 hours (usually much sooner).

Once your account has been validated, when you log in you will by default see a list of the latest test results.

See the Lustre Autotest Maloo Gerrit presentation for tips on how to use these tools effectively.

Viewing Patch Test Results

Most Lustre developers will interact with Maloo through the automated test session results linked to patches in Gerrit. Each patch runs several different test sessions for different system configurations (e.g. review-dne-zfs-part-4, review-dne-selinux-ssk-part-1, review-ldiskfs-ubuntu, etc.), and each test session typically contains a number of test scripts (e.g. sanity, conf-sanity, replay-single, racer, etc.). Each test script in a session may have a number of subtests (e.g. test_1b), which are the individual test items marked PASS/FAIL, and their results are aggregated up to the session level. The testing for a single patch is split into multiple test sessions so that the tests can run in parallel and return PASS/FAIL results for all of them sooner.

Each session will report its status into Gerrit on the patch that triggered the testing in a format similar to that shown below:

   Maloo                                                                                                       10-16 07:56
   
   Patch Set 4:
   
   Failed enforced test review-dne-zfs-part-1 on CentOS 8.3/x86_64 uploaded by Trevis Autotest 2 from trevis-206vm5:
   https://testing.whamcloud.com/test_sessions/74f65aac-d2c0-47ed-8b3a-85d526cc4f31 ran 4 tests. 1 tests failed: sanity.

The unique link allows viewing the aggregate results of all test scripts run for that session (PASS results in green, FAIL results in red, SKIP results in orange), and further links allow drilling down into the results of the failing script(s), and the failing subtest(s) to see the test output, client and server console logs, as well as kernel debug logs.

Searching Maloo

At the top of each Maloo page, the Results->Search menu item allows searching the test results by a variety of parameters. Commonly, one wants to search for specific Sub test results (e.g. sanity subtest test_418 failures in the past week) to see whether a test failure is unique to the patch being tested or is being hit intermittently by other patches as well, and whether an existing JIRA ticket has already been associated with the failure. Available search parameters include the test group, date range, Git development branch, server filesystem type (ldiskfs or zfs), client or server distro, architecture, and others.

Because Maloo stores the results for millions of subtests, searches should typically be constrained to the smallest region of interest possible, to reduce the volume of returned results and to minimize the search time. Test results are usually most relevant within the past week, and searches should be limited to a specific test script and subtest, as well as a failure type (FAIL or TIMEOUT). When trying to isolate the introduction of an intermittent subtest failure, selecting specific date ranges in 4- or 8-week intervals returns results in a reasonable timeframe.

Annotating Test Failures

When reviewing test results for a patch, any test failures for that patch should be annotated with a specific LU ticket number in Jira to help track the source of failures, and to allow intermittent test failures to be resubmitted for testing. The Associate bug... link at the bottom of the specific subtest failure should be used (e.g. test_418 failed with LU-13997), rather than annotating the whole test script, unless the script reported a failure with no associated subtest failure. If a test failure is not associated with an existing LU ticket and has been seen on multiple different (unrelated) patches, then the Raise bug... button for the subtest can be used to file a new Jira ticket to track that failure.

When searching Maloo for specific Sub test failures, it is possible to mass-Associate the failed subtests to a specific Jira LU ticket number, if you are confident that all of the failures have the same cause. One or more subtests can be selected in the leftmost column, and then Associate bugs... selected to specify the LU ticket number.

Viewing All Sessions

The main page view (Sessions in the menu at the top of the page) shows test sessions that have completed recently, in reverse chronological order. The list of uploaded test sessions can be filtered using the criteria available at the top of the page, such as the build number, client or server distro, CPU architecture, etc.

Uploading test results

Autotest

The vast majority of test uploads are initiated by the Whamcloud Autotest application, which automatically runs test sessions as patches are pushed to Gerrit. Autotest then monitors each test session for completion and uploads the results to Maloo automatically.

Manual upload

Please note that Maloo only accepts properly formatted result uploads.

The results are in the YAML format generated by the Lustre test framework. The output of a test run consists of a number of YAML files and log files. Maloo receives these results as a gzipped tar file of the test results for each test session; the files must be in the root of the tar file.
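For example, if the YAML and log files for a test session were written to a single log directory, they could be packaged like this (a minimal sketch; the paths are only illustrative, and the key point is that the files sit at the root of the archive rather than inside a subdirectory):

   # Create the archive from inside the log directory (path is illustrative)
   # so the YAML and log files end up in the root of the tar file.
   cd /tmp/test_logs/2023-01-01/123456
   tar czf /tmp/results.tar.gz *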

Select the Results->Upload link from the menu at the top of the page to access the manual upload page. There are three ways to manually upload results:

Web Upload

If you have a set of results that need to be uploaded, simply choose the file and select upload from the page; the results will be available a few seconds after the upload completes.

Command-line

For command-line uploads you will need to download the .maloorc configuration file, which is available from the Results->Upload page.

Using auster

The lustre/tests/auster script in the Lustre source tree can automatically upload results when a test run completes. In order to do this, the .maloorc file is required. This file provides the address of the server and a unique token that associates uploaded results with your user account.
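As a rough sketch, a test run that uploads its results on completion might look like the following; the option letters and config name are assumptions based on recent Lustre trees, so check auster -h for the exact options in your version:

   # Run the sanity suite with the cfg/local.sh configuration, reformatting
   # the test filesystem first (-r) and running verbosely (-v); -l asks
   # auster to send the logs to Maloo when the run completes, using the
   # server address and token from the .maloorc file.
   cd lustre/tests
   ./auster -f local -rv -l sanity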

maloo_upload.sh

Alternatively, the lustre/tests/maloo_upload.sh shell script in the Lustre source tree, together with the configuration file from the Results->Upload page, can be used to upload results from the command line.
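As a minimal sketch (assuming the script reads its configuration from ~/.maloorc and takes the results tarball as its argument; the file names below are only illustrative):

   # Install the configuration file downloaded from the Results->Upload page,
   # then upload a gzipped tarball of one test session's results.
   cp ~/Downloads/maloorc ~/.maloorc
   cd lustre/tests
   ./maloo_upload.sh /tmp/results.tar.gz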