Difference between revisions of "Projects"

From Lustre Wiki
(Future Projects)
m (Potential Projects: fix formatting)
Revision as of 16:30, 24 January 2017

Current Projects

{|
! Feature !! Feature Summary !! Point of Contact !! Tracker !! Target Date (YYYY-MM)
|-
| NRS Delay policy || Use NRS for fault injection. Intentionally delay request processing to simulate server load. || Chris Horn (Cray) || LU-6283 || 2015-05
|-
| Lock ahead || Allow user space to request LDLM extent locks in advance of need. Intended to optimize shared file IO. || Patrick Farrell (Cray) || LU-6179 || 2015-07
|-
| ZFS ZIL Support || Add support for the ZFS Intent Log (ZIL) to the Lustre osd-zfs. || Alex Zhuravlev (Intel) || LU-4009 || 2016-06
|-
| Layout Enhancement || Add support for multiple layouts on a single file, for File Level Replication, Data on MDT, PFL, etc. || Jinshan Xiong, Niu Yawei (Intel) || LU-3480 || 2016-09
|-
| Progressive File Layouts || Allow composite file layouts to be instantiated incrementally during file writes. || Jinshan Xiong, Niu Yawei (Intel) || LU-8998 || 2017-06
|-
| Multi-Rail LNet || Use multiple LNet network interfaces concurrently to improve reliability and performance. || Amir Shehata (Intel), Olaf Weber (SGI) || LU-7734 || 2017-03
|-
| TBF policy enhancement || An enhancement of NRS/TBF policy to support complex TBF policy with NID/JOBID expressions. || Li Xi (DDN) || LU-7470 || 2016-06
|-
| Simplified Userspace Snapshots || Allow snapshots of ZFS targets to be mounted as a coherent filesystem. || Fan Yong (Intel) || LU-8900 || 2017-01
|-
| Enhanced Adaptive Compression in Lustre || Introduce compression for the Lustre client and server. || Michael Kuhn (Universität Hamburg) || || 2019-02
|-
| Data on MDT || Allow small files to be stored directly on the MDT for reduced RPC count and improved performance. || Mikhail Pershin (Intel) || LU-3285 || 2017-06
|-
| Quota for Projects || Allow specifying a "project" or "subtree" identifier for files for accounting to a project, separate from UID/GID. || Shuichi Ihara (DDN) || LU-4017 || 2017-06
|-
| Patchless Server || Remove Lustre kernel patches to allow Lustre servers to be more easily ported to new kernels, and to be built against vendor kernels without changing the vendor kernel RPMs. || Andreas Dilger (Intel) || LU-20 || 2017-03
|-
| File Level Redundancy Phase 1 || Solution Architecture and HDL, Exclusive Open, RAID 1 Layout, Layout Modification Method, Read Only RAID 1 || Jinshan Xiong (Intel) || LU-3254 || 2017-09
|}

Future Projects

{|
! Feature !! Feature Summary !! Point of Contact !! Tracker
|-
| File Level Redundancy Phase 3 || Data redundancy with immediate asynchronous write from the client. || Jinshan Xiong (Intel) || LU-3254
|-
| File Level Redundancy Phase 4 || Data redundancy with erasure coding of files. || Jinshan Xiong (Intel), Andreas Dilger (Intel) || LU-3254
|-
| OSP multiple modify requests to MDT || Improve performance of cross-MDT modify operations. || Grégoire Pichon (Atos) || LU-6864
|-
| OFD Read Cache || Server-side persistent read cache on SSD or NVMe devices. It forms a storage cache tier between the OSS read cache in server memory and the backing storage. || Li Xi (DDN) ||
|-
| Fileheat based Policy engine || Add support for a new "file heat" object attribute that tracks "hot" files, for use as a policy engine for the OFD read cache. || Li Xi (DDN) ||
|-
| IB Multi-rail || Use multiple IB interfaces as a single Lustre NID to improve data transfer bandwidth and provide redundancy against IB failures. || Kenichiro Sakai (Fujitsu) || LU-6495
|-
| Directory Level Snapshot || Directory level snapshot (DL-SNAP) is designed for user-level file backups. DL-SNAP will be implemented using a copy-on-write mechanism on top of ldiskfs, without modifying the disk format. || Kenichiro Sakai (Fujitsu) ||
|}

Potential Projects

These projects are potential areas of development that need an interested party either to work on them or to sponsor another developer to do so. Many of them have more detailed descriptions, but it is worthwhile to discuss a project on the lustre-devel mailing list before starting implementation.

{|
! Feature !! Feature Summary !! Complexity !! Tracker
|-
| ioctl() number cleanups || Clean up Linux IOC numbering to properly use the "size" field so that mixed 32- and 64-bit kernel/userspace ioctls work correctly. Attention needs to be paid to maintaining userspace compatibility for a number of releases, so the old ioctl() numbers cannot simply be removed. || 1 || b=20731
|-
| Updated man pages || Update the online manual pages for Lustre user tools and the lustreapi library. Split the existing lfs.1 and lctl.8 man pages into separate pages for each sub-command, describing options and providing usage examples. || 2 || LU-4315
|-
| Improve testing efficiency || Improve the performance, efficiency, and coverage of the acceptance-small.sh test scripts. As a basic step, printing the duration of each test script in the acceptance-small.sh test summary would show where the testing time is being spent. || 3 || b=23051
|-
| Config save/edit/restore || Need to be able to back up, edit, and restore the client/MDS/OSS config llog files after a writeconf. One reason is config recovery if the config llog becomes corrupted. Another reason is that all of the filesystem tunable parameters (including all of the OST pool definitions) are stored in the config llog and are lost if a writeconf is done. Being able to dump the config log to a plain text file, edit it, and then restore it would make administration considerably easier. || 3 || b=17094
|-
| Error message improvements || Review and improve the Lustre error messages to be more useful. A larger project is to change the core Lustre error message handling to generate better structured error messages so that they can be parsed and managed more easily. || 4 ||
|-
| Improve QOS Round-Robin object allocator || Improve the LOV QOS allocator to always do weighted round-robin allocation, instead of degrading into weighted random allocation once the OST free space becomes imbalanced. This evens out allocations continuously, avoids severe OST allocation imbalances when QOS becomes active, and allows adding weighting for things like current load, OST RAID rebuild, etc. || 5 || LU-9
|-
| RPC Replay Signatures || Allow the MDS/OSS to determine if a client can legitimately replay an RPC, by digitally signing it at processing time and verifying the signature at replay time. || 6 || b=18547
|-
| Virtual Lustre Block Device || The Lustre object lloop driver exports a block device to userspace, bypassing the filesystem. The code partly works and is currently part of Lustre, but has correctness issues and potential performance problems. It needs to be ported to newer kernels. || 6 || LU-6585
|-
| Swap on Lustre || Depends on the Lustre block device. Has problems when working under memory pressure, which makes it mostly useless until those problems are fixed. || 7 || b=5498
|-
| Directory readdir+ || Bulk metadata readdir/stat interface to speed up "ls -l" operations. Send back requested inode attributes for all directory entries as part of the extended dirent data. Integrate with any proposed API for this on the client. Needs Large Readdir RPCs to be efficient over the wire, since more data will be returned for every entry. || 7 || b=17845
|-
| Small file IO aggregation || Small file IO aggregation (multi-object RPCs), most likely for writes first, and possibly later for reads in conjunction with statahead. || 7 || b=944
|-
| Integrated Lustre Snapshots || Allow Lustre to internally mount and manage ZFS snapshots and other datasets within a single namespace. || 7 ||
|-
| Client-side data encryption || Encrypt files and directories (or possibly just filenames in directories) on the client before sending to the server. This avoids sending unencrypted data over the network, or ever having the data in plaintext on the server (as would happen with separate decryption from the network plus encryption on disk). It seems possible to leverage the existing GSSAPI sptlrpc_cli_wrap_bulk() and friends to do bulk data encryption/decryption on the client using a per-file key (itself stored encrypted by the users' key(s) with the file on the MDS), salted with the OST object ID or stripe index and object offset, rather than using a per-session key, and then not decrypting the data at the server before writing it to disk. || 7 ||
|-
| Local object zero-copy IO || Efficient data IO between a client and a local OST object; an optimization to support local clients. Likely implemented as a fast-path connection between the OSC and the local OFD/OSD. Read cache should be kept on the OSD instead of at the client VFS level, so that the cache can be shared among all users of this OST. || 9 ||
|}
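To illustrate why the "size" field matters for the ioctl() number cleanup project, the following is a minimal sketch of how Linux packs an ioctl command number, assuming the generic layout used on x86 and most other architectures (8 bits of number, 8 bits of type, 14 bits of size, 2 bits of direction). The command type 'f' and number 42 are hypothetical and are not taken from the Lustre headers. If the argument is declared as a C "long", a 32-bit and a 64-bit userspace encode different sizes and so produce different command numbers for the same operation.

```python
# Sketch of the generic Linux ioctl number layout (asm-generic/ioctl.h).
# Some architectures use a different split of the size/direction bits;
# this sketch assumes the common one.
IOC_NRBITS, IOC_TYPEBITS, IOC_SIZEBITS = 8, 8, 14
IOC_NRSHIFT = 0
IOC_TYPESHIFT = IOC_NRSHIFT + IOC_NRBITS      # 8
IOC_SIZESHIFT = IOC_TYPESHIFT + IOC_TYPEBITS  # 16
IOC_DIRSHIFT = IOC_SIZESHIFT + IOC_SIZEBITS   # 30
IOC_WRITE, IOC_READ = 1, 2


def ioc(direction, type_char, nr, size):
    """Pack direction, type, number and argument size into one command."""
    return ((direction << IOC_DIRSHIFT)
            | (ord(type_char) << IOC_TYPESHIFT)
            | (nr << IOC_NRSHIFT)
            | (size << IOC_SIZESHIFT))


def iowr(type_char, nr, size):
    """Equivalent of the C _IOWR() macro, with the size given explicitly."""
    return ioc(IOC_READ | IOC_WRITE, type_char, nr, size)


def ioc_size(cmd):
    """Extract the argument size encoded in a command number."""
    return (cmd >> IOC_SIZESHIFT) & ((1 << IOC_SIZEBITS) - 1)


# Hypothetical command whose argument is declared as a C "long":
# sizeof(long) is 4 in a 32-bit userspace and 8 in a 64-bit one.
cmd_long_32 = iowr('f', 42, 4)
cmd_long_64 = iowr('f', 42, 8)

print(hex(cmd_long_32), hex(cmd_long_64))      # the two numbers differ
print(ioc_size(cmd_long_32), ioc_size(cmd_long_64))
```

Declaring the argument with a fixed-width type gives the same encoded size, and therefore the same command number, for both 32-bit and 64-bit callers. A kernel that must stay compatible with existing binaries would still need to accept the old numbers for some time, as the project description notes.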

Past Projects

{|
! Feature !! Feature Summary !! Point of Contact !! Tracker !! Target Date (YYYY-MM) !! Merge Date (YYYY-MM)
|-
| Dynamic LNET Configuration || Introduces a user space script which reads routes from a config file and adds those routes to LNET dynamically with the lctl utility. This allows very large routing tables to be supported. || Amir Shehata (Intel) || LU-2456 || 2014-11 || 2014-11
|-
| LFSCK Phase 3 - DNE consistency || Enhance LFSCK to work with DNE filesystems, including remote directory entries, and OST orphan handling for multiple MDTs. || Fan Yong (Intel) || LU-2307 || 2014-11 || 2014-11
|-
| LFSCK Phase 4 - performance tuning || Enhance LFSCK performance and efficiency. || Fan Yong (Intel) || LU-6361 || 2015-05 || 2015-05
|-
| Multiple metadata RPCs || Support multiple metadata modifications per client (in the last_rcvd file) to improve the multi-threaded metadata performance of a single client. || Grégoire Pichon (Bull/Atos) || LU-5319 || 2015-05 || 2015-05
|-
| DNE Phase IIb || Asynchronous commit of cross-MDT updates for improved performance. Remote rename and remote hard link functionality. || Wang Di (Intel) || LU-3534 || 2015-05 || 2015-05
|-
| Kerberos Revival || Fix up the existing Kerberos code so that it is tested and working again. || Sébastien Buisson (Bull/Atos) || LU-6356 || 2015-06 || 2015-06
|-
| Filesystem default OST pool || Allow a filesystem-wide default OST pool to be specified. || Lai Siyao (Intel) || LU-7660, LU-7335 || 2016-07 ||
|-
| UID/GID mapping || Map UID/GID for remote client nodes to local UID/GID on the MDS and OSS. Allows a single Lustre filesystem to be shared across clients in different administrative domains. || Stephen Simms (Indiana University) || LU-3291 || 2015-11 || 2016-06
|-
| Subdirectory Mounts || Ability for a client to mount subdirectories of a Lustre filesystem. || Wang Shilong (DDN) || LU-28 || 2016-01 || 2016-06
|-
| Server side advise and hinting || Add new APIs and utilities that pass file access advice and hints to the server/storage side for use by the server cache. || Li Xi (DDN) || LU-4931 || 2016-01 || 2016-09
|-
| Large Bulk IO || Increase the OST bulk IO maximum size to 16MB or larger for more efficient disk IO submission. || Shuichi Ihara (DDN) || LU-7990 || 2016-04 || 2016-09
|-
| Shared Key Crypto || Allow node authentication and/or RPC encryption using symmetric shared key crypto with GSSAPI. Avoids complexity in configuring Kerberos across multiple domains. || Stephen Simms (Indiana University) || LU-3289 || 2015-12 || 2016-12
|}