
The lu_object class and its corresponding device class, lu_device, are the roots of the Lustre object hierarchies. These classes are abstract, so they must be sub-classed further to obtain usable objects. The sub-classes provide concrete implementations of the methods declared in these abstract classes.

1. lu_object

All Lustre objects are based on this class. It provides the infrastructure for creating, manipulating and destroying multi-layer objects.

struct lu_object {
        /**
         * Header for this object.
         */
        struct lu_object_header           *lo_header;
        /**
         * Device for this layer.
         */
        struct lu_device                  *lo_dev;
        /**
         * Operations for this object.
         */
        const struct lu_object_operations *lo_ops;
        /**
         * Linkage into list of all layers.
         */
        struct list_head                   lo_linkage;
        /**
         * Depth. Top level layer depth is 0.
         */
        int                                lo_depth;
        /**
         * Flags from enum lu_object_flags.
         */
        unsigned long                      lo_flags;
        /**
         * Link to the device, for debugging.
         */
        struct lu_ref_link                *lo_dev_ref;
};

The object’s fields are:

lo_header
points at the object’s lu_object_header

lo_dev
points at the lu_device associated with this slice

lo_ops
the lu_object_operations defined for this object

lo_linkage
points to the next slice in a multi-layered object

lo_depth
layer depth (0 = top level, 1 = second level, etc.)

lo_flags
object’s flags - as of August 2009, the only flag defined is LU_OBJECT_ALLOCATED, which is set if loo_object_init() has been called for this slice

lo_dev_ref
a debugging aid

Related functions are:

1.1. lu_object_init()

int lu_object_init (struct lu_object *o,
                    struct lu_object_header *h, struct lu_device *d);

Initialise the object (layer) o that was created by device d and is part of the compound object with header h. This stores h and d into o, obtains a reference to d and initialises the linkage to the next layer.

1.2. lu_object_fini()

void lu_object_fini (struct lu_object *o);

Finalise object (layer) o. This removes the reference to the object’s device.

1.3. lu_object_add_top()

void lu_object_add_top (struct lu_object_header *h, struct lu_object *o);

Adds object (layer) o as the top-level object in the layered object with header h.

1.4. lu_object_add()

void lu_object_add (struct lu_object *before, struct lu_object *o);

Adds object (layer) o into a layered object after object before.
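The linking performed by lu_object_add_top() and lu_object_add() can be pictured with a small user-space sketch. It assumes a simplified circular list in place of the kernel's list_head, and every sk_ name is hypothetical (none of them is Lustre API). It only illustrates the ordering of layers, not the real implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for list_head and the Lustre types. */
struct sk_list {
        struct sk_list *next, *prev;
};

struct sk_header {
        struct sk_list layers;          /* cf. loh_layers */
};

struct sk_object {
        struct sk_header *header;       /* cf. lo_header */
        struct sk_list    linkage;      /* cf. lo_linkage */
};

void sk_list_init(struct sk_list *head)
{
        head->next = head->prev = head;
}

/* Insert entry n immediately after position p. */
void sk_list_add_after(struct sk_list *p, struct sk_list *n)
{
        n->next = p->next;
        n->prev = p;
        p->next->prev = n;
        p->next = n;
}

/* Make o the first layer of the compound object with header h. */
void sk_object_add_top(struct sk_header *h, struct sk_object *o)
{
        sk_list_add_after(&h->layers, &o->linkage);
}

/* Link layer o directly after (below) layer before. */
void sk_object_add(struct sk_object *before, struct sk_object *o)
{
        sk_list_add_after(&before->linkage, &o->linkage);
}

/* Build a three-layer object and return the number of linked layers. */
int sk_demo_layers(void)
{
        struct sk_header h;
        struct sk_object top = { &h, { NULL, NULL } };
        struct sk_object mid = { &h, { NULL, NULL } };
        struct sk_object bot = { &h, { NULL, NULL } };
        const struct sk_list *p;
        int n = 0;

        sk_list_init(&h.layers);
        sk_object_add_top(&h, &top);
        sk_object_add(&top, &mid);
        sk_object_add(&mid, &bot);

        /* Walking from the header visits top, mid, bot in that order. */
        assert(h.layers.next == &top.linkage);
        assert(top.linkage.next == &mid.linkage);
        assert(mid.linkage.next == &bot.linkage);
        for (p = h.layers.next; p != &h.layers; p = p->next)
                n++;
        return n;
}
```

Calling sk_demo_layers() builds the chain header, top, middle, bottom and counts the layers reachable from the header.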

2. lu_object_header

All Lustre compound (multi-layered) objects contain a single instance of lu_object_header within the top-level layer. It contains various data items (such as the object’s FID that identifies the object within the lu_site in which it is located) that are required to be able to find and use the object.

struct lu_object_header {
        /**
         * Object flags from enum lu_object_header_flags. Set and checked
         * atomically.
         */
        unsigned long       loh_flags;
        /**
         * Object reference count. Protected by lu_site::ls_guard.
         */
        atomic_t            loh_ref;
        /**
         * Fid, uniquely identifying this object.
         */
        struct lu_fid       loh_fid;
        /**
         * Common object attributes, cached for efficiency. From enum
         * lu_object_header_attr.
         */
        __u32               loh_attr;
        /**
         * Linkage into per-site hash table. Protected by lu_site::ls_guard.
         */
        struct hlist_node   loh_hash;
        /**
         * Linkage into per-site LRU list. Protected by lu_site::ls_guard.
         */
        struct list_head    loh_lru;
        /**
         * Linkage into list of layers. Never modified once set (except lately
         * during object destruction). No locking is necessary.
         */
        struct list_head    loh_layers;
        /**
         * A list of references to this object, for debugging.
         */
        struct lu_ref       loh_reference;
};

The object’s fields are:

loh_flags
flags from:

enum lu_object_header_flags {
        /**
         * Don't keep this object in cache. Object will be destroyed as soon
         * as last reference to it is released. This flag cannot be cleared
         * once set.
         */

loh_ref
the object’s reference count

loh_fid
the object’s FID that uniquely identifies it

loh_attr
the object’s common attributes - a combination of:

enum lu_object_header_attr {
        LOHA_EXISTS   = 1 << 0,
        LOHA_REMOTE   = 1 << 1,
        /**
         * UNIX file type is stored in S_IFMT bits.
         */
        LOHA_FT_START = 1 << 12, /**< S_IFIFO */
        LOHA_FT_END   = 1 << 15, /**< S_IFREG */
};

loh_hash
the linkage through which the lu_site hash table can find this object

loh_lru
the linkage into the lu_site LRU list

loh_layers
the link to the first layer in the object

loh_reference
list of references to this object, for debugging

Related functions are:

2.1. lu_object_header_init()

int lu_object_header_init(struct lu_object_header *h);

Initialise object header h. This zeroes the structure, sets the refcount to 1 and initialises the list heads.

2.2. lu_object_header_fini()

void lu_object_header_fini(struct lu_object_header *h);

Finalise object header h.

3. lu_object_operations

This is a small set of fundamental operations that must be defined for all object types. All of these methods take a pointer to the object, and all (except loo_object_invariant) also take a pointer to a lu_env that provides the execution (thread specific) context. Some methods require further arguments.

3.1. loo_object_init()

        /**
         * Allocate lower-layer parts of the object by calling
         * lu_device_operations::ldo_object_alloc() of the corresponding
         * underlying device.
         * This method is called once for each object inserted into object
         * stack. It's responsibility of this method to insert lower-layer
         * object(s) it create into appropriate places of object stack.
         */
        int (*loo_object_init)(const struct lu_env *env,
                               struct lu_object *o,
                               const struct lu_object_conf *conf);

Responsible for allocating the next lower object slice in a multi-layered object and linking that slice into the stack. This is not a recursive operation. Allocation of a multi-layer object is done iteratively within lu_object_alloc() which is the only place that this method is invoked. Returns 0 on success.

The conf argument may provide extra info about the object being created. It is always NULL for server objects but can be non-NULL for client side objects.

3.2. loo_object_start()

        /**
         * Called (in bottom-to-top order) during object allocation after all
         * layers were allocated and initialized. Can be used to perform
         * initialization depending on lower layers.
         */
        int (*loo_object_start)(const struct lu_env *env,
                                struct lu_object *o);

Optional method that performs one-time initialisation of a layer after all the layers have been allocated. Returns 0 on success.

NOTE - as of August 2009, the comment in the header file states that this method is called in top-to-bottom order; this conflicts with the code, which shows the ordering to be bottom to top.
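The two phases described for loo_object_init() and loo_object_start() can be sketched in user space as follows. This is an illustration under stated assumptions, not the code of lu_object_alloc(): the sk_ types, the fixed-size layers array and the start_seq field are all hypothetical. It shows an iterative descent in which each init call adds only the next lower slice, followed by a start pass that runs bottom to top.

```c
#include <assert.h>
#include <stdlib.h>

#define SK_MAX_LAYERS 8

/* Hypothetical device: 'next' is the device one level down the stack. */
struct sk_dev {
        struct sk_dev *next;
};

/* Hypothetical object slice belonging to one device. */
struct sk_slice {
        struct sk_dev   *dev;
        struct sk_slice *below;         /* next lower slice, if any */
        int              start_seq;     /* order in which start ran */
};

/* Analogue of ldo_object_alloc(): allocate one slice, no lower layers. */
struct sk_slice *sk_slice_alloc(struct sk_dev *d)
{
        struct sk_slice *s = calloc(1, sizeof(*s));

        if (s != NULL)
                s->dev = d;
        return s;
}

/* Analogue of loo_object_init(): allocate and link only the next slice. */
int sk_slice_init(struct sk_slice *s)
{
        if (s->dev->next == NULL)
                return 0;               /* bottom of the stack */
        s->below = sk_slice_alloc(s->dev->next);
        return s->below != NULL ? 0 : -1;
}

/* Returns the number of layers built, or -1 on failure. */
int sk_demo_alloc(void)
{
        struct sk_dev bot = { NULL };
        struct sk_dev mid = { &bot };
        struct sk_dev top = { &mid };
        struct sk_slice *layers[SK_MAX_LAYERS];
        struct sk_slice *s = sk_slice_alloc(&top);
        int n = 0, seq = 0, rc = 0, i;

        /* Iterative descent: each init call adds at most one new slice. */
        while (s != NULL && n < SK_MAX_LAYERS) {
                layers[n++] = s;
                rc = sk_slice_init(s);
                if (rc != 0)
                        break;
                s = s->below;
        }
        /* Start phase runs from the bottom slice up to the top one. */
        for (i = n - 1; rc == 0 && i >= 0; i--)
                layers[i]->start_seq = seq++;

        if (rc == 0 && n == 3)
                assert(layers[2]->start_seq == 0 &&
                       layers[0]->start_seq == 2);
        for (i = 0; i < n; i++)
                free(layers[i]);
        return rc == 0 && n > 0 ? n : -1;
}
```

With a three-device stack, the bottom slice is started first and the top slice last, which matches the bottom-to-top ordering observed in the code.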

3.3. loo_object_delete()

        /**
         * Called before lu_object_operations::loo_object_free() to signal
         * that object is being destroyed. Dual to
         * lu_object_operations::loo_object_init().
         */
        void (*loo_object_delete)(const struct lu_env *env,
                                  struct lu_object *o);

Optional method that is invoked once to release an object’s resources (but not the object itself).

3.4. loo_object_free()

        /**
         * Dual to lu_device_operations::ldo_object_alloc(). Called when
         * object is removed from memory.
         */
        void (*loo_object_free)(const struct lu_env *env,
                                struct lu_object *o);

Required method that should free the memory used by the object.

3.5. loo_object_release()

        /**
         * Called when last active reference to the object is released (and
         * object returns to the cache). This method is optional.
         */
        void (*loo_object_release)(const struct lu_env *env,
                                   struct lu_object *o);

Optional method that notifies the object that it is no longer being used and is about to be cached.

3.6. loo_object_print()

        /**
         * Optional debugging helper. Print given object.
         */
        int (*loo_object_print)(const struct lu_env *env, void *cookie,
                                lu_printer_t p, const struct lu_object *o);

3.7. loo_object_invariant()

        /**
         * Optional debugging method. Returns true iff method is internally
         * consistent.
         */
        int (*loo_object_invariant)(const struct lu_object *o);

4. lu_device

Instances of lu_device represent a layer in the server-side device stack. Each object in a multi-layer object is associated with a corresponding lu_device that provides device operations and linkage to the lu_site that contains the object.

struct lu_device {
        /**
         * reference count. This is incremented, in particular, on each object
         * created at this layer.
         * \todo XXX which means that atomic_t is probably too small.
         */
        atomic_t                           ld_ref;
        /**
         * Pointer to device type. Never modified once set.
         */
        struct lu_device_type             *ld_type;
        /**
         * Operation vector for this device.
         */
        const struct lu_device_operations *ld_ops;
        /**
         * Stack this device belongs to.
         */
        struct lu_site                    *ld_site;
        struct proc_dir_entry             *ld_proc_entry;

        /** \todo XXX: temporary back pointer into obd. */
        struct obd_device                 *ld_obd;
        /**
         * A list of references to this object, for debugging.
         */
        struct lu_ref                      ld_reference;
};

The device’s fields are:

ld_ref
the number of objects associated with this device

ld_type
points to a lu_device_type that defines a class of devices

ld_ops
points to a lu_device_operations that defines the operations for this device

ld_site
the lu_site that the device’s stack is associated with

ld_proc_entry
related to the /proc filesystem, but appears to be unused

ld_obd
the obd_device this device is associated with

ld_reference
all the references to this device (used for debugging)

Related functions are:

4.1. lu_stack_fini()

void lu_stack_fini (const struct lu_env *env, struct lu_device *top);

Finalises the device stack whose top-level device is top. This purges all the objects referenced by the site, finalises the devices in the device stack by calling their ldto_device_fini() methods and then destroys those devices by calling their ldto_device_free() methods.

4.2. lu_device_get()

void lu_device_get (struct lu_device *d);

Acquire a reference to device d.

4.3. lu_device_put()

void lu_device_put (struct lu_device *d);

Release reference to device d.

4.4. lu_device_init()

int  lu_device_init (struct lu_device *d, struct lu_device_type *t);

Initialise device d of type t. If this is the first device of this type to be initialised, the type’s ldto_start() method is invoked (if defined).

4.5. lu_device_fini()

void lu_device_fini (struct lu_device *d);

Finalises device d. If this is the last device of its type to be finalised, the type’s ldto_stop() method is invoked (if defined).
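The first-device and last-device behaviour of lu_device_init() and lu_device_fini(), together with the get/put reference pair, can be sketched as below. This is a simplified model with hypothetical sk_ names: the counters stand in for the ldto_start() and ldto_stop() hooks, and the reference count is a plain int rather than an atomic_t.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device type that counts its live devices. */
struct sk_type {
        unsigned device_nr;     /* cf. ldt_device_nr */
        int      starts;        /* times the start hook ran */
        int      stops;         /* times the stop hook ran */
};

struct sk_device {
        struct sk_type *type;   /* cf. ld_type */
        int             ref;    /* cf. ld_ref (plain int, not atomic) */
};

/* The first device of a type triggers the type's start hook. */
void sk_device_init(struct sk_device *d, struct sk_type *t)
{
        if (t->device_nr++ == 0)
                t->starts++;
        d->type = t;
        d->ref = 0;
}

/* The last device of a type triggers the type's stop hook. */
void sk_device_fini(struct sk_device *d)
{
        struct sk_type *t = d->type;

        assert(d->ref == 0);
        assert(t->device_nr > 0);
        if (--t->device_nr == 0)
                t->stops++;
        d->type = NULL;
}

void sk_device_get(struct sk_device *d)
{
        d->ref++;
}

void sk_device_put(struct sk_device *d)
{
        assert(d->ref > 0);
        d->ref--;
}

/* Two devices of one type: expect exactly one start and one stop. */
int sk_demo_type_lifecycle(void)
{
        struct sk_type t = { 0, 0, 0 };
        struct sk_device a, b;

        sk_device_init(&a, &t);
        sk_device_init(&b, &t);
        sk_device_get(&a);
        sk_device_put(&a);
        sk_device_fini(&a);     /* one device left: no stop yet */
        sk_device_fini(&b);     /* last device: stop hook runs */
        return t.starts * 10 + t.stops;
}
```

The demo returns the start count in the tens digit and the stop count in the units digit, so a value of 11 means each hook ran once.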

5. lu_device_operations

These operations are common for all device classes. They all take a pointer to a lu_env that provides the execution (thread specific) context and a pointer to the device being operated on.

5.1. ldo_object_alloc()

        struct lu_object *(*ldo_object_alloc)(const struct lu_env *env,
                                              const struct lu_object_header *h,
                                              struct lu_device *d);

This function allocates a single-layer object for the device d. It is not responsible for allocating any lower layers. At a minimum, it should initialise the lo_dev and lo_ops fields in the allocated object. The lu_object_header, h, is passed because it contains additional information, such as the object’s FID. On success, returns a pointer to the allocated object.

5.2. ldo_process_config()

        int (*ldo_process_config)(const struct lu_env *env,
                                  struct lu_device *d, struct lustre_cfg *cfg);

Apply the configuration command cfg to device d. The command is described in a lustre_cfg. Unusually, this function may use recursion to pass the command down to lower level devices when either the command is not handled at this level or it should be handled at more than one level in the device stack. Returns 0 on success.

5.3. ldo_recovery_complete()

        int (*ldo_recovery_complete)(const struct lu_env *env,
                                     struct lu_device *d);

Notifies the device that recovery has completed. The function should recursively call ldo_recovery_complete of the next layer down so that all the devices in the stack are notified. Returns 0 on success.

5.4. ldo_prepare()

        int (*ldo_prepare)(const struct lu_env *env,
                           struct lu_device *parent,
                           struct lu_device *dev);

This function is called after the device layers have been initialised (ldo_process_config has been called) but before the device stack starts serving requests. It should initialise any local objects (state) that are required at this level and then recursively call the same method for the device below. Returns 0 on success.
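The recursive hand-off used by methods such as ldo_recovery_complete() and ldo_prepare() might look like the following sketch. The sk_ structures are hypothetical and carry an explicit next pointer, which is an assumption made for illustration; the text does not say how a real device locates the layer below it.

```c
#include <assert.h>
#include <stddef.h>

struct sk_dev;

/* Hypothetical operation vector with a single method. */
struct sk_dev_ops {
        int (*recovery_complete)(struct sk_dev *d);
};

struct sk_dev {
        const struct sk_dev_ops *ops;
        struct sk_dev           *next;      /* device one level down */
        int                      notified;
};

/* Handle the notification locally, then pass it to the layer below. */
int sk_recovery_complete(struct sk_dev *d)
{
        d->notified = 1;
        if (d->next == NULL || d->next->ops->recovery_complete == NULL)
                return 0;
        return d->next->ops->recovery_complete(d->next);
}

const struct sk_dev_ops sk_ops = { sk_recovery_complete };

/* Notify a three-device stack from the top; count notified devices. */
int sk_demo_recovery(void)
{
        struct sk_dev bot = { &sk_ops, NULL, 0 };
        struct sk_dev mid = { &sk_ops, &bot, 0 };
        struct sk_dev top = { &sk_ops, &mid, 0 };

        if (top.ops->recovery_complete(&top) != 0)
                return -1;
        return top.notified + mid.notified + bot.notified;
}
```

Only the top device is called directly; the other two are reached through the chain of calls.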

6. lu_device_type

This defines a class of devices and provides functions to create, destroy, initialise and finalise those devices.

struct lu_device_type {
        /**
         * Tag bits. Taken from enum lu_device_tag. Never modified once set.
         */
        __u32                                   ldt_tags;
        /**
         * Name of this class. Unique system-wide. Never modified once set.
         */
        char                                   *ldt_name;
        /**
         * Operations for this type.
         */
        const struct lu_device_type_operations *ldt_ops;
        /**
         * \todo XXX: temporary pointer to associated obd_type.
         */
        struct obd_type                        *ldt_obd_type;
        /**
         * \todo XXX: temporary: context tags used by obd_*() calls.
         */
        __u32                                   ldt_ctx_tags;
        /**
         * Number of existing device type instances.
         */
        unsigned                                ldt_device_nr;
        /**
         * Linkage into a global list of all device types.
         * \see lu_device_types.
         */
        struct list_head                        ldt_linkage;
};

The fields are:

ldt_tags
tag bits that classify the device type - one of:

enum lu_device_tag {
        /** this is meta-data device */
        LU_DEVICE_MD = (1 << 0),
        /** this is data device */
        LU_DEVICE_DT = (1 << 1),
        /** data device in the client stack */
        LU_DEVICE_CL = (1 << 2)
};

ldt_name
name of the device type - e.g. "cmm" or "mdd"

ldt_ops
the operations defined for this device type

ldt_obd_type
a pointer to the obd_type for the OBD device class that devices of this type will use

ldt_ctx_tags
these are the context tags that will be used for operations by the devices of this type (see lu_context_tag for values)

ldt_device_nr
the number of devices of this type in use

ldt_linkage
linkage into a global list of device types

Related functions are:

6.1. lu_device_type_init()

int lu_device_type_init(struct lu_device_type *ldt);

Initialises device type ldt by invoking its ldto_init() method. Adds device type to global list of device types. Returns 0 on success.

6.2. lu_device_type_fini()

void lu_device_type_fini(struct lu_device_type *ldt);

Finalises device type ldt. Removes the device type from the global list of device types and invokes its ldto_fini() method.

6.3. lu_types_stop()

void lu_types_stop(void);

This invokes the ldto_stop() method for all device types in the global list.

7. lu_device_type_operations

Half of these operations are concerned with the lifecycle of devices of this type and the other operations are related to the device type itself.

7.1. ldto_device_alloc()

        struct lu_device *(*ldto_device_alloc)(const struct lu_env *env,
                                               struct lu_device_type *t,
                                               struct lustre_cfg *lcfg);

This function allocates the memory for the device and carries out basic initialisation by calling lu_device_init() and setting up the required pointers to the type specific function vectors. A pointer to the new device is returned.

7.2. ldto_device_free()

        struct lu_device *(*ldto_device_free)(const struct lu_env *env,
                                              struct lu_device *d);

Frees the memory used by device d and returns a pointer to the next (lower) device in the stack. Before the device is freed, lu_device_fini() is called.

7.3. ldto_device_init()

        int  (*ldto_device_init)(const struct lu_env *env,
                                 struct lu_device *d,
                                 const char *name,
                                 struct lu_device *next);

Carries out post-allocation initialisation. name is the class name of the device being initialised and next points at the next (lower) device in the stack. Returns 0 on success.

7.4. ldto_device_fini()

        struct lu_device *(*ldto_device_fini)(const struct lu_env *env,
                                              struct lu_device *d);

Finalises the device before its memory is freed. Returns a pointer to the next (lower) device in the stack.
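Because both ldto_device_fini() and ldto_device_free() return the next (lower) device, a stack can be torn down with two simple loops, in the manner described for lu_stack_fini(). The sketch below is a user-space illustration with hypothetical sk_ names; it omits the purging of site objects and is not the actual lu_stack_fini() code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with a pointer to the device below it. */
struct sk_dev {
        struct sk_dev *next;
        int            finalised;
};

/* Analogue of ldto_device_fini(): finalise, return the next device. */
struct sk_dev *sk_dev_fini(struct sk_dev *d)
{
        d->finalised = 1;
        return d->next;
}

/* Analogue of ldto_device_free(): free, return the next device. */
struct sk_dev *sk_dev_free(struct sk_dev *d)
{
        struct sk_dev *next = d->next;

        assert(d->finalised);
        free(d);
        return next;
}

/* Finalise every device top-down, then free every device top-down. */
int sk_stack_fini(struct sk_dev *top)
{
        struct sk_dev *d;
        int freed = 0;

        for (d = top; d != NULL; )
                d = sk_dev_fini(d);
        for (d = top; d != NULL; freed++)
                d = sk_dev_free(d);
        return freed;
}

/* Push a new device on top of 'below'; NULL on allocation failure. */
struct sk_dev *sk_dev_push(struct sk_dev *below)
{
        struct sk_dev *d = calloc(1, sizeof(*d));

        if (d != NULL)
                d->next = below;
        return d;
}

/* Build a three-device stack and return how many devices were freed. */
int sk_demo_stack_fini(void)
{
        struct sk_dev *top = NULL, *d;
        int i;

        for (i = 0; i < 3; i++) {
                d = sk_dev_push(top);
                if (d == NULL)
                        break;
                top = d;
        }
        return sk_stack_fini(top);
}
```

Running the two passes separately ensures that every device has been finalised before any memory in the stack is released.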

7.5. ldto_init()

        int  (*ldto_init)(struct lu_device_type *t);

Called on module load to initialise the device type and is primarily concerned with registering the type’s context keys. Returns 0 on success.

7.6. ldto_fini()

        void (*ldto_fini)(struct lu_device_type *t);

Called on module unload to finalise device type and is primarily concerned with deregistering the type’s context keys.

7.7. ldto_start()

        void (*ldto_start)(struct lu_device_type *t);

Called (from lu_device_init()) when the first device of this type is being initialised. Primarily concerned with "reviving" the type’s context keys.

7.8. ldto_stop()

        void (*ldto_stop)(struct lu_device_type *t);

Called (from lu_device_fini()) when the last device of this type is being finalised. Primarily concerned with "quiescing" the type’s context keys.

8. lu_attr

lu_attr contains common object attributes. These attributes correspond to the fields stored in a filesystem inode.

/**
 * Common object attributes.
 */
struct lu_attr {
        /** size in bytes */
        __u64          la_size;
        /** modification time in seconds since Epoch */
        __u64          la_mtime;
        /** access time in seconds since Epoch */
        __u64          la_atime;
        /** change time in seconds since Epoch */
        __u64          la_ctime;
        /** 512-byte blocks allocated to object */
        __u64          la_blocks;
        /** permission bits and file type */
        __u32          la_mode;
        /** owner id */
        __u32          la_uid;
        /** group id */
        __u32          la_gid;
        /** object flags */
        __u32          la_flags;
        /** number of persistent references to this object */
        __u32          la_nlink;
        /** blk bits of the object*/
        __u32          la_blkbits;
        /** blk size of the object*/
        __u32          la_blksize;
        /** real device */
        __u32          la_rdev;
        /**
         * valid bits
         * \see enum la_valid
         */
        __u64          la_valid;
};

la_valid is a bitmask that specifies which of the members contain valid values:

/** Bit-mask of valid attributes */
enum la_valid {
        LA_ATIME   = 1 << 0,
        LA_MTIME   = 1 << 1,
        LA_CTIME   = 1 << 2,
        LA_SIZE    = 1 << 3,
        LA_MODE    = 1 << 4,
        LA_UID     = 1 << 5,
        LA_GID     = 1 << 6,
        LA_BLOCKS  = 1 << 7,
        LA_TYPE    = 1 << 8,
        LA_FLAGS   = 1 << 9,
        LA_NLINK   = 1 << 10,
        LA_RDEV    = 1 << 11,
        LA_BLKSIZE = 1 << 12,
};
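A consumer of such a structure is expected to consult the valid mask before trusting a field. The following sketch shows one way this could be done, using a cut-down attribute structure and hypothetical SK_ constants that reuse three of the bit positions listed above; the merge helper is an illustration, not a Lustre function.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of the bit values listed above, under hypothetical SK_ names. */
enum sk_valid {
        SK_SIZE = 1 << 3,
        SK_MODE = 1 << 4,
        SK_UID  = 1 << 5
};

/* Cut-down stand-in for lu_attr. */
struct sk_attr {
        uint64_t size;
        uint32_t mode;
        uint32_t uid;
        uint64_t valid;
};

/* Copy from src only those fields whose valid bit is set. */
void sk_attr_merge(struct sk_attr *dst, const struct sk_attr *src)
{
        if (src->valid & SK_SIZE)
                dst->size = src->size;
        if (src->valid & SK_MODE)
                dst->mode = src->mode;
        if (src->valid & SK_UID)
                dst->uid = src->uid;
        dst->valid |= src->valid & (SK_SIZE | SK_MODE | SK_UID);
}

/* Merge a size-only update into an attribute set that has a mode. */
int sk_demo_attr(void)
{
        struct sk_attr dst = { 0, 0644, 0, SK_MODE };
        /* src.uid is deliberately junk: its valid bit is clear. */
        struct sk_attr src = { 4096, 0, 12345, SK_SIZE };

        sk_attr_merge(&dst, &src);
        assert(dst.size == 4096);
        assert(dst.mode == 0644);
        assert(dst.uid == 0);
        return (int)dst.valid;
}
```

After the merge the destination holds a valid size and mode, so the returned mask is SK_SIZE | SK_MODE, and the junk uid from the source has been ignored.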

9. lu_env

Many Lustre functions are passed a lu_env structure that defines the function’s environment (execution context) in terms of one or more lu_context structures.

struct lu_env {
        /**
         * "Local" context, used to store data instead of stack.
         */
        struct lu_context  le_ctx;
        /**
         * "Session" context for per-request data.
         */
        struct lu_context *le_ses;
};

You can see that the environment contains a lu_context (called le_ctx) that holds state that is local to the thread that is executing the object’s methods. Optionally, there can be another lu_context (pointed to by le_ses) that holds data that is specific to an individual request.

The following operations are provided to manipulate environments:

9.1. lu_env_init()

int  lu_env_init  (struct lu_env *env, __u32 tags);

Initialise the environment pointed to by env (it has already been allocated). This simply initialises le_ctx using the supplied context tags, tags, and enters that context. It also sets le_ses to NULL. Returns 0 on success.

9.2. lu_env_fini()

void lu_env_fini  (struct lu_env *env);

This exits and finalises the environment’s context le_ctx and sets le_ses to NULL.

9.3. lu_env_refill()

int  lu_env_refill(struct lu_env *env);

This refills the environment’s context le_ctx and also the session context pointed to by le_ses, if it is non-NULL. Returns 0 on success.

10. lu_context

lu_context objects act as containers for data that is local to some execution context (e.g. per-thread). The data stored in the lu_context is accessed via pre-defined keys (of type lu_context_key). As lu_context is used to store data for both server-side and client-side stacks, the keys are partitioned into sets that are identified by tag bits. A given context will be initialised to only hold values for those keys that have corresponding tag bits.

struct lu_context {
        /**
         * lu_context is used on the client side too. Yet we don't want to
         * allocate values of server-side keys for the client contexts and
         * vice versa.
         * To achieve this, set of tags in introduced. Contexts and keys are
         * marked with tags. Key value are created only for context whose set
         * of tags has non-empty intersection with one for key. Tags are taken
         * from enum lu_context_tag.
         */
        __u32                  lc_tags;
        /**
         * Pointer to the home service thread. NULL for other execution
         * contexts.
         */
        struct ptlrpc_thread  *lc_thread;
        /**
         * Pointer to an array with key values. Internal implementation
         * detail.
         */
        void                 **lc_value;
        enum lu_context_state  lc_state;
        /**
         * Linkage into a list of all remembered contexts. Only
         * `non-transient' contexts, i.e., ones created for service threads
         * are placed here.
         */
        struct list_head       lc_remember;
        /**
         * Version counter used to skip calls to lu_context_refill() when no
         * keys were registered.
         */
        unsigned               lc_version;
        /**
         * Debugging cookie.
         */
        unsigned               lc_cookie;
};

The object’s fields are:

lc_tags
the context key tag bits that this context supports. Only context keys that have one or more matching bits will have a value stored in this context. The context tag values are defined in lu_context_tag, which also defines some flags used by the context manipulation code.

enum lu_context_tag {
        /**
         * Thread on md server
         */
        LCT_MD_THREAD = 1 << 0,
        /**
         * Thread on dt server
         */
        LCT_DT_THREAD = 1 << 1,
        /**
         * Context for transaction handle
         */
        LCT_TX_HANDLE = 1 << 2,
        /**
         * Thread on client
         */
        LCT_CL_THREAD = 1 << 3,
        /**
         * A per-request session on a server, and a per-system-call session on
         * a client.
         */
        LCT_SESSION   = 1 << 4,

        /**
         * Set when at least one of keys, having values in this context has
         * non-NULL lu_context_key::lct_exit() method. This is used to
         * optimize lu_context_exit() call.
         */
        LCT_HAS_EXIT  = 1 << 28,
        /**
         * Don't add references for modules creating key values in that context.
         * This is only for contexts used internally by lu_object framework.
         */
        LCT_NOREF     = 1 << 29,
        /**
         * Key is being prepared for retiring, don't create new values for it.
         */
        LCT_QUIESCENT = 1 << 30,
        /**
         * Context should be remembered.
         */
        LCT_REMEMBER  = 1 << 31,
        /**
         * Contexts usable in cache shrinker thread.
         */

lc_thread
pointer to the PTLRPC service thread associated with this context (this appears to be assigned but otherwise used only in an assertion)

lc_value
pointer to the values stored in the context

lc_state
the state of the context, one of:

enum lu_context_state {
        LCS_INITIALIZED = 1,
        LCS_ENTERED,
        LCS_LEFT,
        LCS_FINALIZED
};

lc_remember
linkage into a list of all remembered contexts - this is used to be able to destroy key values in all of the remembered contexts when the key’s module is unloaded

lc_version
the current version of the context’s keys/values. A global version number (key_set_version) is incremented when keys are registered, deregistered, quiesced or revived. When a context’s values are refilled (with lu_context_refill()), the refill is skipped if the context’s version already equals key_set_version

lc_cookie
used for debugging

The following methods are provided to manipulate contexts - all take a pointer to the lu_context being worked on:

10.1. lu_context_init()

int lu_context_init  (struct lu_context *ctx, __u32 tags);

Initialises the context, ctx, as specified by the supplied tags. The context’s state is set to LCS_INITIALIZED. If tags has LCT_REMEMBER set, the context will be "remembered" by adding it to a global list. Finally, the values for all of the keys supported by the context are filled (memory allocated). Returns 0 on success.

10.2. lu_context_fini()

void  lu_context_fini (struct lu_context *ctx);

Finalises the context, ctx, by finalising (destroying) the context’s key values, removing the context from the global remembered list and setting the state to LCS_FINALIZED.

10.3. lu_context_enter()

void lu_context_enter (struct lu_context *ctx);

Sets the context’s state to LCS_ENTERED.

10.4. lu_context_exit()

void lu_context_exit (struct lu_context *ctx);

Sets the context’s state to LCS_LEFT. If the context’s lc_tags field has bit LCT_HAS_EXIT set, the lct_exit() method is invoked for each key that defines that method.
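The state changes made by the init, enter, exit and fini calls can be summarised in a small state machine. The sketch below uses hypothetical sk_ names that mirror the states named in the text; the single on_exit callback stands in for the per-key lct_exit() methods, and the rule that a context may be re-entered after it has been left is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* States mirroring those named in the text, under hypothetical names. */
enum sk_state { SK_INITIALIZED = 1, SK_ENTERED, SK_LEFT, SK_FINALIZED };

#define SK_HAS_EXIT (1u << 28)  /* same bit position as LCT_HAS_EXIT */

struct sk_ctx {
        unsigned      tags;
        enum sk_state state;
        /* stands in for the per-key lct_exit() methods */
        void        (*on_exit)(struct sk_ctx *ctx);
        int           exit_calls;
};

void sk_count_exit(struct sk_ctx *ctx)
{
        ctx->exit_calls++;
}

void sk_ctx_init(struct sk_ctx *ctx, void (*on_exit)(struct sk_ctx *))
{
        ctx->tags = on_exit != NULL ? SK_HAS_EXIT : 0;
        ctx->state = SK_INITIALIZED;
        ctx->on_exit = on_exit;
        ctx->exit_calls = 0;
}

void sk_ctx_enter(struct sk_ctx *ctx)
{
        /* Assumption: entry is legal after init or after a previous exit. */
        assert(ctx->state == SK_INITIALIZED || ctx->state == SK_LEFT);
        ctx->state = SK_ENTERED;
}

void sk_ctx_exit(struct sk_ctx *ctx)
{
        assert(ctx->state == SK_ENTERED);
        ctx->state = SK_LEFT;
        /* The exit hook is only consulted when the tag bit is set. */
        if (ctx->tags & SK_HAS_EXIT)
                ctx->on_exit(ctx);
}

void sk_ctx_fini(struct sk_ctx *ctx)
{
        ctx->state = SK_FINALIZED;
}

/* Enter and exit twice, then finalise. */
int sk_demo_states(void)
{
        struct sk_ctx ctx;

        sk_ctx_init(&ctx, sk_count_exit);
        sk_ctx_enter(&ctx);
        sk_ctx_exit(&ctx);
        sk_ctx_enter(&ctx);
        sk_ctx_exit(&ctx);
        sk_ctx_fini(&ctx);
        return (int)ctx.state * 10 + ctx.exit_calls;
}
```

The demo encodes the final state in the tens digit and the number of exit-hook calls in the units digit.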

10.5. lu_context_refill()

int lu_context_refill (struct lu_context *ctx);

If the context’s version is different from the global version (key_set_version), the context’s keys are refilled. This only creates values for new keys, it doesn’t change the values of existing keys. Returns 0 on success.
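The interplay between tag matching and the version check can be modelled in a few lines. In this sketch the key table, the sk_ names and the fixed table size are hypothetical; it shows only that values are created for keys whose tags intersect the context's tags, that existing values are left alone, and that the scan is skipped when the versions already agree.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SK_MAX_KEYS 4

/* Hypothetical key: tag bits plus the size of the value it owns. */
struct sk_key {
        unsigned tags;
        size_t   size;
};

/* Hypothetical context: one value slot per registered key. */
struct sk_ctx {
        unsigned tags;
        unsigned version;
        void    *value[SK_MAX_KEYS];
};

struct sk_key *sk_keys[SK_MAX_KEYS];    /* global key table */
unsigned       sk_key_set_version;      /* bumped on every registration */

int sk_key_register(struct sk_key *k)
{
        int i;

        for (i = 0; i < SK_MAX_KEYS; i++) {
                if (sk_keys[i] == NULL) {
                        sk_keys[i] = k;
                        sk_key_set_version++;
                        return i;
                }
        }
        return -1;
}

/* Create values for keys that match the context's tags and have none. */
int sk_ctx_refill(struct sk_ctx *c)
{
        int i;

        if (c->version == sk_key_set_version)
                return 0;               /* no key changes: skip the scan */
        for (i = 0; i < SK_MAX_KEYS; i++) {
                const struct sk_key *k = sk_keys[i];

                if (k == NULL || c->value[i] != NULL ||
                    (k->tags & c->tags) == 0)
                        continue;
                c->value[i] = calloc(1, k->size);
                if (c->value[i] == NULL)
                        return -1;
        }
        c->version = sk_key_set_version;
        return 0;
}

/* Returns the number of values held after a second refill, or -1. */
int sk_demo_refill(void)
{
        static struct sk_key md   = { 1u << 0, 16 };
        static struct sk_key cl   = { 1u << 3, 16 };
        static struct sk_key both = { (1u << 0) | (1u << 1), 32 };
        struct sk_ctx ctx;
        void *first;
        int i, n = 0, ok;

        memset(sk_keys, 0, sizeof(sk_keys));
        memset(&ctx, 0, sizeof(ctx));
        ctx.tags = 1u << 0;             /* context only wants "MD" values */

        ok = sk_key_register(&md) == 0 && sk_key_register(&cl) == 1 &&
             sk_ctx_refill(&ctx) == 0;
        first = ctx.value[0];
        ok = ok && sk_key_register(&both) == 2 && sk_ctx_refill(&ctx) == 0;
        /* The existing value is kept; the non-matching key gets none. */
        ok = ok && first != NULL && ctx.value[0] == first &&
             ctx.value[1] == NULL;

        for (i = 0; i < SK_MAX_KEYS; i++) {
                if (ctx.value[i] != NULL)
                        n++;
                free(ctx.value[i]);
        }
        return ok ? n : -1;
}
```

In the demo the context ends up with two values: one for the key registered before the first refill and one for the matching key registered afterwards. The client-only key never receives a value.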

11. lu_context_key

An lu_context_key is associated with each data item (slot) in a context. The data is opaque as far as the context is concerned but it must be smaller than CFS_PAGE_SIZE. The keys are used to gain access to the data.

struct lu_context_key {
        /**
         * Set of tags for which values of this key are to be instantiated.
         */
        __u32 lct_tags;
        /**
         * Value constructor. This is called when new value is created for a
         * context. Returns pointer to new value of error pointer.
         */
        void  *(*lct_init)(const struct lu_context *ctx,
                           struct lu_context_key *key);
        /**
         * Value destructor. Called when context with previously allocated
         * value of this slot is destroyed. \a data is a value that was returned
         * by a matching call to lu_context_key::lct_init().
         */
        void   (*lct_fini)(const struct lu_context *ctx,
                           struct lu_context_key *key, void *data);
        /**
         * Optional method called on lu_context_exit() for all allocated
         * keys. Can be used by debugging code checking that locks are
         * released, etc.
         */
        void   (*lct_exit)(const struct lu_context *ctx,
                           struct lu_context_key *key, void *data);
        /**
         * Internal implementation detail: index within lu_context::lc_value[]
         * reserved for this key.
         */
        int      lct_index;
        /**
         * Internal implementation detail: number of values created for this
         * key.
         */
        atomic_t lct_used;
        /**
         * Internal implementation detail: module for this key.
         */
        struct module *lct_owner;
        /**
         * References to this key. For debugging.
         */
        struct lu_ref  lct_reference;
};

The fields are:

lct_tags
the tag bits for this key (see lu_context_tag for possible values)

lct_init
a constructor function that is called to allocate a new value to be associated with this key

lct_fini
a destructor function that frees the storage associated with this key

lct_exit
if this method is non-NULL, it will be invoked when the context is exited (lu_context_exit() called)

lct_index
the index used to access the value for this key

lct_used
the number of instances of this key + 1

lct_owner
the module that defined this key

lct_reference
all the references to this key (used for debugging)

As the lct_init() and lct_fini() methods just allocate or free memory of the required size for a given key, and the code is essentially the same for every key, the methods may be defined using the macros LU_KEY_INIT(), LU_KEY_FINI() and LU_KEY_INIT_FINI(). These macros take a module name and the type of the data to be associated with the key.

Instances of lu_context_key can be defined using the macro LU_CONTEXT_KEY_DEFINE(), which takes a module name and tag bits.
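The idea of generating a constructor and destructor pair from a module name and a value type can be shown with a hypothetical macro. SK_KEY_INIT_FINI() below is not the Lustre macro and its generated functions have simplified signatures (no context or key arguments, and NULL instead of an error pointer on failure); it is only a sketch of the token-pasting technique.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical macro in the spirit of LU_KEY_INIT_FINI(): it generates a
 * constructor/destructor pair for values of the given type. The real
 * macros and their generated signatures are not reproduced here.
 */
#define SK_KEY_INIT_FINI(mod, type)                             \
void *mod##_key_init(void)                                      \
{                                                               \
        return calloc(1, sizeof(type));                         \
}                                                               \
void mod##_key_fini(void *data)                                 \
{                                                               \
        free(data);                                             \
}

/* Example per-thread data that a module might attach to a key. */
struct demo_thread_info {
        int  counter;
        char scratch[64];
};

/* Expands to demo_key_init() and demo_key_fini(). */
SK_KEY_INIT_FINI(demo, struct demo_thread_info)

/* Returns 1 if the freshly created value is zero-filled, -1 on failure. */
int sk_demo_key(void)
{
        struct demo_thread_info *info = demo_key_init();
        int zeroed;

        if (info == NULL)
                return -1;
        zeroed = info->counter == 0 && info->scratch[0] == '\0';
        demo_key_fini(info);
        return zeroed;
}
```

One macro invocation per module keeps the allocation code identical for every key, which is the motivation the text gives for the real macros.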

The following helper functions are provided to manipulate context keys - all take a pointer to the lu_context_key being worked on:

11.1. lu_context_key_register()

int lu_context_key_register(struct lu_context_key *key);

This function adds the context key, key, into the global array of known keys (lu_keys), sets the key’s index, initialises its used count to 1 and increments the global key version counter.

11.2. lu_context_key_deregister()

void  lu_context_key_degister(struct lu_context_key *key);

The context key, key, is first quiesced and then removed from lu_keys. The global key version counter is incremented.

11.3. lu_context_key_get()

void *lu_context_key_get (const struct lu_context *ctx,
                          const struct lu_context_key *key);

Returns the value associated with key in context ctx.

11.4. lu_context_key_quiesce()

void lu_context_key_quiesce (struct lu_context_key *key);

The key, key, is quiesced (destroyed) in all non-transient (remembered) contexts that reference it. The LCT_QUIESCENT bit is set in its lct_tags field to mark it quiesced. The global key version counter is incremented.

11.5. lu_context_key_revive()

void lu_context_key_revive (struct lu_context_key *key);

Clears the LCT_QUIESCENT bit in the supplied key. The global key version counter is incremented. This function does not appear to be used anywhere.

12. lu_site

Instances of lu_site hold collections of objects. The objects can be accessed associatively by FID and also via an LRU list. Each lu_site is associated with a device stack that is used to process the objects held by the site.

struct lu_site {
        /**
         * Site-wide lock.
         * lock protecting:
         *        - lu_site::ls_hash hash table (and its linkages in objects);
         *        - lu_site::ls_lru list (and its linkages in objects);
         *        - 0/1 transitions of object lu_object_header::loh_ref
         *        reference count;
         * yes, it's heavy.
         */
        rwlock_t              ls_guard;
        /**
         * Hash-table where objects are indexed by fid.
         */
        struct hlist_head    *ls_hash;
        /**
         * Bit-mask for hash-table size.
         */
        int                   ls_hash_mask;
        /**
         * Order of hash-table.
         */
        int                   ls_hash_bits;
        /**
         * Number of buckets in the hash-table.
         */
        int                   ls_hash_size;

        /**
         * LRU list, updated on each access to object. Protected by
         * lu_site::ls_guard.
         * "Cold" end of LRU is lu_site::ls_lru.next. Accessed object are
         * moved to the lu_site::ls_lru.prev (this is due to the non-existence
         * of list_for_each_entry_safe_reverse()).
         */
        struct list_head      ls_lru;
        /**
         * Total number of objects in this site. Protected by
         * lu_site::ls_guard.
         */
        unsigned              ls_total;
        /**
         * Total number of objects in this site with reference counter greater
         * than 0. Protected by lu_site::ls_guard.
         */
        unsigned              ls_busy;

        /**
         * Top-level device for this stack.
         */
        struct lu_device     *ls_top_dev;

        /**
         * Wait-queue signaled when an object in this site is ultimately
         * destroyed (lu_object_free()). It is used by lu_object_find() to
         * wait before re-trying when object in the process of destruction is
         * found in the hash table.
         * If having a single wait-queue turns out to be a problem, a
         * wait-queue per hash-table bucket can be easily implemented.
         * \see htable_lookup().
         */
        cfs_waitq_t           ls_marche_funebre;

        /** statistical counters. Protected by nothing, races are accepted. */
        struct {
                __u32 s_created;
                __u32 s_cache_hit;
                __u32 s_cache_miss;
                /**
                 * Number of hash-table entry checks made.
                 *       ->s_cache_check / (->s_cache_miss + ->s_cache_hit)
                 * is an average number of hash slots inspected during single
                 * lookup.
                 */
                __u32 s_cache_check;
                /** Races with cache insertions. */
                __u32 s_cache_race;
                /**
                 * Races with object destruction.
                 * \see lu_site::ls_marche_funebre.
                 */
                __u32 s_cache_death_race;
                __u32 s_lru_purged;
        } ls_stats;

        /**
         * Linkage into global list of sites.
         */
        struct list_head      ls_linkage;
        struct lprocfs_stats *ls_time_stats;
};

The fields are:

ls_guard
a site-wide lock that protects access to the hash table, the LRU list and manipulation of each object’s loh_ref value

ls_hash
the hash table that stores the site’s objects - indexed by FID

ls_hash_mask
one less than ls_hash_size

ls_hash_bits
the order of the hash table

ls_hash_size
the number of buckets in the hash table (1 << ls_hash_bits)

ls_lru
LRU list that orders the site’s objects - updated on each lookup of an object

ls_total
total number of objects contained in the site

ls_busy
total number of "busy" objects contained in the site - an object is busy if its loh_ref value is greater than 0

ls_top_dev
pointer to the top-level lu_device in the site’s device stack

ls_marche_funebre
a wait queue that is signalled whenever one of the site’s objects is destroyed - a thread will be added to this queue if it tries to look up an object that is currently being destroyed

ls_stats
statistics counters used to record performance of hashing

ls_linkage
linkage into a global list of all sites

ls_time_stats
related to reporting site times via procfs
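The relationship between ls_hash_bits, ls_hash_size and ls_hash_mask, and the way a mask selects a bucket, can be illustrated as follows. The sk_ names are hypothetical and the mixing function is an arbitrary placeholder chosen for this sketch; the hash that Lustre actually applies to a FID is not described in the text.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hash sizing fields of lu_site. */
struct sk_site {
        int hash_bits;          /* cf. ls_hash_bits */
        int hash_size;          /* cf. ls_hash_size */
        int hash_mask;          /* cf. ls_hash_mask */
};

/* Derive size and mask from the table order. */
void sk_site_hash_setup(struct sk_site *s, int bits)
{
        assert(bits > 0 && bits < 31);
        s->hash_bits = bits;
        s->hash_size = 1 << bits;
        s->hash_mask = s->hash_size - 1;
}

/*
 * Pick a bucket for an identifier made of two numbers. The mixing step is
 * an arbitrary placeholder, not the hash used by Lustre.
 */
int sk_bucket(const struct sk_site *s, uint64_t seq, uint32_t oid)
{
        uint64_t h = seq * 0x9E3779B97F4A7C15ULL + oid;

        /* Because size is a power of two, masking replaces a modulo. */
        return (int)((h >> 32) & (uint64_t)s->hash_mask);
}

/* Returns size * 1000 + mask for an order-8 table, or -1 on error. */
int sk_demo_site_hash(void)
{
        struct sk_site s;
        int b;

        sk_site_hash_setup(&s, 8);
        b = sk_bucket(&s, 0x200000400ULL, 7);
        if (b < 0 || b >= s.hash_size)
                return -1;
        return s.hash_size * 1000 + s.hash_mask;
}
```

For an order of 8 the table has 256 buckets and a mask of 255, and any bucket index produced by the mask falls within the table.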

Related functions:

12.1. lu_site_init()

int lu_site_init (struct lu_site *s, struct lu_device *d);

Initialise site s. The site’s top-level device is set to d. This function sets up all of the site’s data items (list heads, hashtable, etc.). It increments the top-level device’s reference count and sets its ld_site field to s. Returns 0 on success.

12.2. lu_site_init_finish()

int lu_site_init_finish (struct lu_site *s);

Called when the site’s device stack has been completed. Links the site into the global list.

12.3. lu_site_fini()

void lu_site_fini (struct lu_site *s);

Finalise site s. Frees up the hashtable (which must be empty) and decrements the top-level device’s reference count. Unlinks the site from the global list.