ZFS Compression

Notes on enabling and working with ZFS compression. This has been shown at LLNL to reduce their HPC dataset size by about 50% and also to improve read performance for large files, while not hurting write performance (as long as the OSS nodes have sufficiently fast CPUs).

We assume this is for ZFS underneath Lustre, but it works the same way for standalone ZFS. On a Lustre/ZFS system, you have to run these commands on every OST in the system. I would not bother doing this for metadata, since most metadata is already compressed by default in ZFS.

NOTE - here I'll assume you always work on a whole pool or filesystem, but these commands should work on any arbitrary directory too.

In the commands below, append the pool, filesystem, or directory you choose as the final argument.

Check ZFS compression
zfs get compression 
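When there are many datasets, it can help to list only those that still have compression off. The following is a minimal sketch, not a tested site procedure: it assumes the tab-separated, header-free output produced by the -H and -o options of "zfs get", and the helper name "uncompressed" and the pool name "tank" are placeholders for illustration.

```shell
# Read "name<TAB>value" lines on stdin and print the names of
# datasets whose compression property is "off".
# "uncompressed" is a hypothetical helper name, not a zfs subcommand.
uncompressed() {
    awk -F '\t' '$2 == "off" { print $1 }'
}

# Intended use on an OSS ("tank" is a placeholder pool name):
#   zfs get -H -r -o name,value compression tank | uncompressed
```

The -r option makes "zfs get" recurse into child datasets, so a single call covers a whole pool.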

Set ZFS compression
zfs set compression=on  turns on compression with the default algorithm (lzjb on older ZFS releases; current OpenZFS uses lz4 when the lz4_compress pool feature is enabled). You can choose values other than "on" to select a specific compression algorithm; see 'man zfs' for details. For example, use "compression=lz4" instead of "on".

lz4 is likely the best choice now. It has been tested extensively, and provides very good compression balanced with performance. Basically, it stops trying to compress after compressing some initial part of the data and getting poor results. Details from experts on this topic are needed.

I think you might only need to set this on pools, since filesystems should inherit the setting, but to be safe, you can apply it to everything.
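Since the setting has to be repeated for every OST, a small wrapper can reduce typing mistakes. This is only a sketch under assumptions: the function name "set_lz4", the DRY_RUN variable, and the dataset names in the usage line are all made up for illustration. With DRY_RUN=1 it prints the commands instead of running them, which is a reasonable first step on a production system.

```shell
# Apply compression=lz4 to every dataset named on the command line.
# "set_lz4" and DRY_RUN are hypothetical names for this sketch.
# With DRY_RUN=1 the commands are printed rather than executed.
set_lz4() {
    for ds in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "zfs set compression=lz4 $ds"
        else
            zfs set compression=lz4 "$ds"
        fi
    done
}

# Example (placeholder dataset names):
#   DRY_RUN=1 set_lz4 ost0pool ost1pool
```

Afterwards, "zfs get -r -o name,value,source compression" on the pool shows in the source column whether each child dataset inherited the value or has it set locally.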

Check compression ratio
On OSS: zfs get compressratio 
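The compressratio value is reported as a multiplier such as "2.00x". To relate it to a space-saving percentage like the roughly 50% figure mentioned at the top, the saving is 1 - 1/ratio. A small sketch of that arithmetic follows; the helper name "savings_pct" is made up, and it assumes the value has the usual trailing "x".

```shell
# Convert a compressratio value such as "2.00x" into the percentage
# of space saved: (1 - 1/ratio) * 100.
# "savings_pct" is a hypothetical helper name.
savings_pct() {
    ratio="${1%x}"    # strip the trailing "x"
    awk -v r="$ratio" 'BEGIN { printf "%.1f\n", (1 - 1 / r) * 100 }'
}

# Intended use on an OSS ("tank" is a placeholder pool name):
#   savings_pct "$(zfs get -H -o value compressratio tank)"
```

For example, a ratio of 2.00x corresponds to 50.0% saved, and 1.00x corresponds to 0.0%.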

On client machines: Keep in mind that fastidious users will likely notice that their data appears to have shrunk if they move things to a compressed filesystem. Running "du --apparent-size" on some files or directories can help show what is happening.
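To show a user both numbers at once, the two du modes can be wrapped in small helpers. This is a sketch assuming GNU du (the --apparent-size and --block-size options); the function names are made up for illustration.

```shell
# Logical size of a path in bytes, as applications see it (GNU du assumed).
apparent_bytes() {
    du -s --apparent-size --block-size=1 "$1" | cut -f1
}

# Space actually allocated for the same path, in bytes.
disk_bytes() {
    du -s --block-size=1 "$1" | cut -f1
}

# Print both values for one path on a single line.
size_report() {
    printf '%s\tapparent=%s\ton-disk=%s\n' \
        "$1" "$(apparent_bytes "$1")" "$(disk_bytes "$1")"
}
```

On a compressed filesystem the on-disk value is typically smaller than the apparent one for compressible data, which is the "shrinkage" users notice.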