ZFS Compression

ZFS supports transparent compression of data at the block level for datasets. Enabling compression can improve storage efficiency by decreasing the amount of data written to disk and improve read performance for large files, as less data needs to be read from disk.


If desired, compression will need to be enabled individually for all relevant OSTs.
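Since each OST is backed by its own ZFS dataset, the per-dataset commands shown in the sections below can be wrapped in a loop on each OSS. This is only a sketch: the assumption that OST dataset names end in "ostN" is hypothetical, so adjust the pattern to match the names that 'zfs list' actually reports on your servers.

<pre># Enable compression on every OST dataset on this OSS (run on each OSS).
# The 'ost[0-9]+$' naming pattern is an assumption -- adjust to your layout.
for ds in $(zfs list -H -o name | grep -E 'ost[0-9]+$'); do
    zfs set compression=on "$ds"
done</pre>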
 


== Check ZFS compression ==

<pre>zfs get compression <OST/dataset/here></pre>
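The output shows the current value and where it was set (local, inherited, or default). The dataset name and values below are illustrative only:

<pre># zfs get compression ostpool/ost0
NAME          PROPERTY     VALUE     SOURCE
ostpool/ost0  compression  off       default</pre>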
   
   
== Set ZFS compression ==

<pre>zfs set compression=on <OST/dataset/here></pre>
   
   
This turns on compression with the default algorithm (lz4 since ZFS version 0.6.5). You can choose values other than "on" for the compression algorithm; see 'man zfsprops' for details.
   
   
To use a different algorithm, use for example "compression=zstd" instead of "on".
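For instance, zstd (available in OpenZFS 2.0 and later) could be enabled as below; an explicit level such as zstd-3 trades more CPU for a better ratio. The dataset name is a placeholder as elsewhere on this page:

<pre>zfs set compression=zstd <OST/dataset/here>
# or pick an explicit zstd level:
zfs set compression=zstd-3 <OST/dataset/here></pre>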


'''lz4''' is likely the best choice now. It has been tested extensively, and provides very good compression balanced with performance. It implements an early abort feature<ref>https://github.com/openzfs/zfs/blob/e0bf43d64ed01285321bf6c3a308f699c5483efc/module/zfs/lz4_zfs.c#L520</ref>, meaning it stops trying to compress if initial compression results do not meet a threshold, so performance with incompressible data isn't degraded.
   
   
== Check compression ratio ==

On OSS:
<pre>zfs get compressratio <OST/dataset/here></pre>
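compressratio is a read-only property, so its SOURCE column is always "-". The numbers below are purely illustrative:

<pre># zfs get compressratio ostpool/ost0
NAME          PROPERTY       VALUE  SOURCE
ostpool/ost0  compressratio  1.62x  -</pre>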


On client machines:
<pre>du --apparent-size -h <file/dir></pre> will show logical usage of a file (in human readable units), whereas <pre>du -h <file></pre> will show actual usage on disk (in human readable units).
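For example, for a single file on a compressed OST-backed filesystem the two commands might report something like the following (the file name and sizes are made up for illustration):

<pre>$ du --apparent-size -h results.dat
10G     results.dat
$ du -h results.dat
6.2G    results.dat</pre>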
 
[[Category:ZFS]][[Category:Howto]]
