Architecture - ZFS large dnodes
Note: The content on this page reflects the state of design of a Lustre feature at a particular point in time and may contain outdated information.
- dnode: DMU node, 512 bytes in the original ZFS implementation; it includes a "bonus buffer" in which users can store extra data
- EA: extended attribute, generally of limited size (4kB or less); not to be confused with the ZFS 'xattr', which is more like a named byte stream of arbitrary size
- znode: ZPL-format dnode, which uses the bonus buffer to store POSIX data for ZFS
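To make the size constraint concrete, the sketch below estimates how much bonus buffer space a dnode of a given size leaves. It assumes the classic on-disk layout (a 64-byte fixed header followed by at least one 128-byte block pointer), which this page does not state; the constant and function names are illustrative only.

```python
# Assumed constants from the classic ZFS on-disk layout; not stated on this page.
DNODE_CORE_SIZE = 64      # fixed dnode header, in bytes
BLKPTR_SIZE = 128         # one block pointer, in bytes
LEGACY_DNODE_SIZE = 512   # original dnode size, in bytes


def max_bonus_len(dnode_size: int, nblkptr: int = 1) -> int:
    """Return the bytes left for the bonus buffer in a dnode of dnode_size."""
    if dnode_size < LEGACY_DNODE_SIZE or dnode_size % LEGACY_DNODE_SIZE:
        raise ValueError("dnode size must be a positive multiple of 512")
    if nblkptr < 1:
        raise ValueError("a dnode holds at least one block pointer")
    space = dnode_size - DNODE_CORE_SIZE - nblkptr * BLKPTR_SIZE
    if space < 0:
        raise ValueError("block pointers do not fit in the dnode")
    return space


if __name__ == "__main__":
    for size in (512, 1024, 2048):
        print(size, max_bonus_len(size))
```

Under these assumptions an original 512-byte dnode leaves 320 bytes of bonus space, well short of a 4kB EA, while a 1024-byte dnode leaves 832 bytes.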
|large_dnode||performance||increased dnode size to allow more data in the dnode|
EA in large dnode
|Scenario:||Storage of EA in dnode|
|Business Goals:||Fast access to Lustre EA values|
|details||Stimulus:||EA needs to be stored in ZFS-format dnode|
|Stimulus source:||Lustre OSD (MDT/OST) storing EA data to object|
|Environment:||EA being stored on a specific znode within a transaction|
|Artifact:||EA is stored in the znode|
|Response:||time to store EA data|
|Response measure:||time is not significantly more than znode update without EA, much less than storing data in ZFS xattr|
The DMU must work with both large and original-size dnodes in the same pool. There is currently no requirement for dynamic dnode size within a single filesystem.
ZFS must be able to work on a filesystem with large dnodes, even if it cannot initially access the extra dnode space.
The 'zfs' tool must be able to specify the dnode size for a filesystem at creation time.
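The requirements above can be modelled with a small sketch: each filesystem fixes its dnode size at creation, filesystems with different sizes coexist in one pool, and a reader without large dnode support still sees the original 512 bytes. All class and function names are hypothetical, and the rule that valid sizes are power-of-two multiples of 512 bytes is an assumption, not something this page specifies.

```python
from dataclasses import dataclass, field

LEGACY_DNODE_SIZE = 512  # original dnode size, in bytes


@dataclass(frozen=True)
class Filesystem:
    name: str
    dnode_size: int  # fixed at creation; dynamic resizing is not modelled


@dataclass
class Pool:
    filesystems: dict = field(default_factory=dict)

    def create_filesystem(self, name: str,
                          dnode_size: int = LEGACY_DNODE_SIZE) -> Filesystem:
        # Assumption: valid sizes are powers of two no smaller than 512 bytes.
        if dnode_size < LEGACY_DNODE_SIZE or dnode_size & (dnode_size - 1):
            raise ValueError(f"unsupported dnode size: {dnode_size}")
        if name in self.filesystems:
            raise ValueError(f"filesystem exists: {name}")
        fs = Filesystem(name, dnode_size)
        self.filesystems[name] = fs
        return fs


def usable_dnode_bytes(fs: Filesystem, reader_supports_large: bool) -> int:
    """Bytes of each dnode a reader can use; older readers see only 512."""
    return fs.dnode_size if reader_supports_large else LEGACY_DNODE_SIZE


if __name__ == "__main__":
    pool = Pool()
    mdt = pool.create_filesystem("mdt0", dnode_size=1024)
    ost = pool.create_filesystem("ost0")  # original size in the same pool
    print(mdt.dnode_size, ost.dnode_size, usable_dnode_bytes(mdt, False))
```

Making the dataclass frozen mirrors the stated lack of a requirement for changing the dnode size after creation.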
EAs are stored in a microzap (as described in ZFS_Microzap) to allow efficient access and retrieval, and to keep the format flexible.
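As a rough illustration of why a microzap-style layout is cheap to search, the sketch below packs fixed-size name/value entries. The 64-byte entry layout is an assumption taken from the stock ZFS microzap format, in which each entry carries a single 64-bit value; the format described in ZFS_Microzap may differ, and the helper names are hypothetical.

```python
import struct

# Assumed layout of one microzap entry (64 bytes, little-endian): 8-byte value,
# 4-byte collision differentiator, 2 bytes of padding, 50-byte name field.
MZAP_ENT = struct.Struct("<QIH50s")
MZAP_NAME_LEN = 50


def pack_entry(name: str, value: int, cd: int = 0) -> bytes:
    """Pack one entry; the name must leave room for a terminating NUL."""
    raw = name.encode("utf-8")
    if len(raw) >= MZAP_NAME_LEN:
        raise ValueError("name too long for a microzap entry")
    return MZAP_ENT.pack(value, cd, 0, raw)  # struct pads the name with NULs


def unpack_entry(buf: bytes):
    """Return (name, value, cd) from one packed entry."""
    value, cd, _pad, raw = MZAP_ENT.unpack(buf)
    return raw.split(b"\0", 1)[0].decode("utf-8"), value, cd


def entries_that_fit(space: int) -> int:
    """Number of whole entries that fit in `space` bytes."""
    return space // MZAP_ENT.size


if __name__ == "__main__":
    entry = pack_entry("trusted.lov", 42)
    print(len(entry), unpack_entry(entry), entries_that_fit(320))
```

Because every entry has the same size, a lookup is a linear scan over a small array with no pointer chasing. Under this assumed layout, `entries_that_fit(320)` returns 5.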
Questions and Issues
How do we handle the case where EAs overflow the available space in the dnode? Strong consideration should be given to allowing EAs to (also?) be stored in an external block or in a protected EA xattr file.
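One possible overflow policy, offered only as a sketch of the options named in the question and not as a decided design: keep the EA in the dnode when it fits, fall back to an external block when it fits there, and otherwise use the xattr file. The names, parameters, and the ordering of the fallbacks are all assumptions.

```python
from enum import Enum


class EAPlacement(Enum):
    DNODE = "dnode"
    EXTERNAL_BLOCK = "external block"
    XATTR_FILE = "protected EA xattr file"


def place_ea(ea_len: int, dnode_free: int,
             external_block_size: int) -> EAPlacement:
    """Choose where an EA of ea_len bytes would go under this sketch policy."""
    if ea_len < 0 or dnode_free < 0 or external_block_size < 0:
        raise ValueError("sizes must be non-negative")
    if ea_len <= dnode_free:
        return EAPlacement.DNODE           # fast path: no extra I/O
    if ea_len <= external_block_size:
        return EAPlacement.EXTERNAL_BLOCK  # one additional block
    return EAPlacement.XATTR_FILE          # arbitrary size, slowest path


if __name__ == "__main__":
    for length in (100, 1000, 8000):
        print(length, place_ea(length, dnode_free=320,
                               external_block_size=4096).value)
```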