ZFS Tunables for Lustre Object Storage Servers (OSS)
Latest revision as of 13:34, 25 May 2018
| Parameter | Notes | Default | Suggested |
|---|---|---|---|
| metaslab_debug_unload | Prevent ZFS from unloading the space maps from a metaslab once they are read in | 0 | 1 |
| zfetch_max_distance | Maximum bytes to readahead per stream | 8MB | 64MB |
| zfs_vdev_scheduler | I/O scheduler used for the VDEV block devices | noop | deadline |
| zfs_arc_max | Maximum size of the ARC (RAM cache) | 50% RAM | 75% RAM |
| zfs_dirty_data_max | Maximum amount of dirty data on the system; a larger value absorbs more workload variation before throttling | 10% RAM (max 4GB) | 10% RAM or 2-3s of full-bandwidth writes |
| zfs_vdev_async_read_max_active | Maximum number of asynchronous read I/Os active to each device | 3 | 16 |
| zfs_vdev_aggregation_limit | Maximum amount of data to aggregate into a single write | 128KB | recordsize (1-4MB) |
| zfs_vdev_async_write_active_min_dirty_percent | Dirty-data threshold (as a percentage of zfs_dirty_data_max) below which the I/O scheduler limits concurrent async write operations to the minimum; above this value, concurrency increases linearly up to the maximum at zfs_vdev_async_write_active_max_dirty_percent | 30 | 20 |
| zfs_vdev_async_write_min_active | Minimum number of asynchronous write I/Os active to each device | 2 | 5 |
| zfs_vdev_async_write_max_active | Maximum number of asynchronous write I/Os active to each device | 10 | 10 |
| zfs_vdev_sync_read_min_active | Minimum number of synchronous read I/Os active to each device | 10 | 16 |
| zfs_vdev_sync_read_max_active | Maximum number of synchronous read I/Os active to each device | 10 | 16 |
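Module parameters like these are normally made persistent via a modprobe configuration file. The following is a sketch of a possible /etc/modprobe.d/zfs.conf using the suggested values from the table; zfs_arc_max and zfs_dirty_data_max are machine-dependent, and the figures shown assume a hypothetical OSS with 128GB of RAM and are for illustration only.

```
# Sketch of /etc/modprobe.d/zfs.conf using the suggested values above.
# zfs_arc_max (75% RAM) and zfs_dirty_data_max (2-3s of full-bandwidth
# writes) depend on the machine; figures below assume a 128GB OSS.
options zfs metaslab_debug_unload=1
options zfs zfetch_max_distance=67108864          # 64MB per stream
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_arc_max=103079215104              # 96GB = 75% of 128GB
options zfs zfs_dirty_data_max=12884901888        # 12GB, example value
options zfs zfs_vdev_async_read_max_active=16
options zfs zfs_vdev_aggregation_limit=1048576    # match recordsize=1M
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
options zfs zfs_vdev_async_write_min_active=5
options zfs zfs_vdev_sync_read_min_active=16
options zfs zfs_vdev_sync_read_max_active=16
```

The same values can be applied at runtime by writing to the files under /sys/module/zfs/parameters/, although some parameters only take effect at pool import.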
When dirty data is less than zfs_vdev_async_write_active_min_dirty_percent of zfs_dirty_data_max, ZFS keeps only zfs_vdev_async_write_min_active outstanding writes per VDEV. Below this threshold dirty data builds up more quickly, and with so few outstanding writes per disk at the default setting, ZFS would start to delay or even halt writes.
Note that the zfs_dirty_data_max parameter should ideally match the backend storage capability, allowing 2-3s of dirty data to be aggregated on the server for write merging and more efficient I/O ordering. The code simply uses 10% of system memory as the default, capped at zfs_dirty_data_max_max (default 25% of RAM or 4GB, whichever is less). Setting zfs_dirty_data_max explicitly bypasses the default zfs_dirty_data_max_max limit of 4GB.
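As a concrete sketch of the sizing rule above, the arithmetic behind "2-3s of full-bandwidth writes" is straightforward; the backend bandwidth figure below (2 GB/s) is purely illustrative and should be replaced with the measured write bandwidth of the OST:

```shell
# Sketch: size zfs_dirty_data_max as ~3 seconds of full-bandwidth writes.
# BACKEND_BW_MB is an assumed figure for illustration (a 2 GB/s OST).
BACKEND_BW_MB=2048
DIRTY_SECONDS=3
ZFS_DIRTY_DATA_MAX=$((BACKEND_BW_MB * 1024 * 1024 * DIRTY_SECONDS))
echo "$ZFS_DIRTY_DATA_MAX"   # bytes, the value to set via the module parameter
```

For this example the result is 6442450944 bytes (6GB), which would be passed as zfs_dirty_data_max in /etc/modprobe.d/zfs.conf.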
For a comprehensive description of all available ZFS and SPL module parameters, refer to the zfs-module-parameters(5) and spl-module-parameters(5) man pages.
In addition to the kernel module parameters, it is recommended that ZFS compression also be enabled when creating ZFS datasets for OSTs. Creating Lustre Object Storage Services (OSS) provides examples of the commands to create OSTs with compression enabled.
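As a minimal sketch of the idea, compression can be enabled on the pool so that OST datasets inherit it. The pool name (ostpool), filesystem name, index, and MGS NID below are placeholders; refer to the page linked above for the authoritative commands.

```shell
# Hypothetical example: pool/dataset names, fsname, index, and MGS NID
# are placeholders. Enable lz4 compression on the pool so that OST
# datasets inherit it, then format the OST with mkfs.lustre.
zfs set compression=lz4 ostpool
mkfs.lustre --ost --backfstype=zfs \
    --fsname=lustre --index=0 \
    --mgsnode=mgs@tcp ostpool/ost0
```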