NVMe Linux driver series one: host side [fabrics.c]<66>

Global definitions

static LIST_HEAD(nvmf_transports);
static DECLARE_RWSEM(nvmf_transports_rwsem);

static LIST_HEAD(nvmf_hosts);
static DEFINE_MUTEX(nvmf_hosts_mutex);

static struct nvmf_host *nvmf_default_host;

This code defines some global data structures and variables for managing information related to the NVMe over Fabrics (NVMF) transport layer. Here’s what each variable does:

  • nvmf_transports: The head of a linked list that holds every registered NVMF transport. Each transport is linked into this list through the structure it registers with.

  • nvmf_transports_rwsem: A read-write semaphore that protects nvmf_transports. Any number of readers can walk the list at the same time, while a writer that adds or removes a transport holds the semaphore exclusively.

  • nvmf_hosts: The head of a second linked list, which holds the registered NVMF hosts. Each host is a struct nvmf_host linked in through its list member.

  • nvmf_hosts_mutex: A mutex that protects nvmf_hosts. Unlike the read-write semaphore, it admits one holder at a time, whether that holder reads the list or modifies it.

  • nvmf_default_host: A pointer to the default NVMF host, that is, the host identity (NQN and ID) the driver falls back on when a connection request does not specify its own.

These global data structures and variables are used to manage transport layer and host registration information in the NVMe over Fabrics subsystem. In this subsystem, this information can be used to perform operations such as registration, deregistration, and lookup of the transport layer and host.

nvmf_host_alloc

static struct nvmf_host *nvmf_host_alloc(const char *hostnqn, uuid_t *id)
{
    struct nvmf_host *host;

    host = kmalloc(sizeof(*host), GFP_KERNEL);
    if (!host)
        return NULL;

    kref_init(&host->ref);
    uuid_copy(&host->id, id);
    strscpy(host->nqn, hostnqn, NVMF_NQN_SIZE);

    return host;
}

This code defines nvmf_host_alloc, which allocates and initializes a struct nvmf_host. The structure represents an NVMF host and stores the information that identifies it.

Here are the key parts of this function:

  • hostnqn: Indicates the NQN (NVMe Qualified Name) of the NVMF host to be allocated, which is the unique identifier of the host.

  • id: Indicates a UUID (Universally Unique Identifier) used to identify the host.

  • The function first calls kmalloc to allocate sizeof(*host) bytes for the nvmf_host structure. GFP_KERNEL means the allocation is allowed to sleep.

  • kref_init then initializes the structure's reference count to 1.

  • uuid_copy copies the caller's id into the host->id field.

  • strscpy copies hostnqn into the host->nqn field, writing at most NVMF_NQN_SIZE bytes and always NUL-terminating the result.

Finally, the function returns a pointer to the allocated nvmf_host structure. If memory allocation fails, the function returns NULL.

This function is used to allocate and initialize a host object in the NVMF subsystem, which can then be added to the host linked list for management.
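The bounded copy in the last step can be illustrated outside the kernel. The following is a minimal userspace sketch that approximates the behavior of strscpy; bounded_copy is a hypothetical name, and it returns -1 on truncation where the kernel helper returns a negative errno.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace approximation of strscpy(): copy src into dst,
 * never write more than size bytes, and always NUL-terminate.
 * Returns the number of characters copied, or -1 if src was truncated.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    if (size == 0)
        return -1;

    len = strlen(src);
    if (len >= size) {
        /* src does not fit: keep the first size - 1 characters */
        memcpy(dst, src, size - 1);
        dst[size - 1] = '\0';
        return -1;
    }

    /* src fits: copy it together with its terminating NUL */
    memcpy(dst, src, len + 1);
    return (long)len;
}
```

With an 8-byte destination, copying "nqn.a" stores the whole string, while a longer source is cut to 7 characters plus the terminator.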

nvmf_host_add

static struct nvmf_host *nvmf_host_add(const char *hostnqn, uuid_t *id)
{
    struct nvmf_host *host;

    mutex_lock(&nvmf_hosts_mutex);

    /*
     * We have defined a host as how it is perceived by the target.
     * Therefore, we don't allow different Host NQNs with the same Host ID.
     * Similarly, we do not allow the usage of the same Host NQN with
     * different Host IDs. This'll maintain unambiguous host identification.
     */
    list_for_each_entry(host, &nvmf_hosts, list) {
        bool same_hostnqn = !strcmp(host->nqn, hostnqn);
        bool same_hostid = uuid_equal(&host->id, id);

        if (same_hostnqn && same_hostid) {
            kref_get(&host->ref);
            goto out_unlock;
        }
        if (same_hostnqn) {
            pr_err("found same hostnqn %s but different hostid %pUb\n",
                   hostnqn, id);
            host = ERR_PTR(-EINVAL);
            goto out_unlock;
        }
        if (same_hostid) {
            pr_err("found same hostid %pUb but different hostnqn %s\n",
                   id, hostnqn);
            host = ERR_PTR(-EINVAL);
            goto out_unlock;
        }
    }

    host = nvmf_host_alloc(hostnqn, id);
    if (!host) {
        host = ERR_PTR(-ENOMEM);
        goto out_unlock;
    }

    list_add_tail(&host->list, &nvmf_hosts);
out_unlock:
    mutex_unlock(&nvmf_hosts_mutex);
    return host;
}

This code defines a function nvmf_host_add, which is used to add a new host to the NVMF host list. This function checks against the host’s NQN and ID to ensure the host is unique.

Here are the key parts of this function:

  • hostnqn: Indicates the NQN of the NVMF host to be added.

  • id: Represents a UUID used to identify the host.

  • Inside the function, the mutex lock of nvmf_hosts_mutex is first acquired to ensure that no race conditions will occur when modifying the host list.

  • list_for_each_entry walks the host list and compares each entry's NQN and ID with the arguments:

    • Same NQN and same ID: the host already exists. Its reference count is incremented and the existing host is returned.

    • Same NQN but a different ID: not allowed. The function logs an error and returns ERR_PTR(-EINVAL).

    • Same ID but a different NQN: also not allowed, with the same error return.

  • If no entry matches on either field, nvmf_host_alloc is called to allocate and initialize a new host object.

  • The new host object is then appended to the host list with list_add_tail.

  • Every path leaves through the out_unlock label, which releases the mutex before the function returns.

If memory allocation fails, the function returns ERR_PTR(-ENOMEM). Because errors are encoded in the returned pointer, the result has to be tested with IS_ERR() rather than compared with NULL. This function is used to add a new host object in the NVMF subsystem and to enforce that each NQN is paired with exactly one ID.
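The uniqueness rule is easier to see without the locking and list plumbing. The sketch below is a simplified userspace model, not the kernel code: struct host, enum host_match, and host_classify are hypothetical names, and a plain array stands in for the linked list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified host record: a 16-byte ID and an NQN string. */
struct host {
    unsigned char id[16];
    char nqn[64];
    int refs;
};

enum host_match {
    HOST_NEW,       /* no entry shares the NQN or the ID */
    HOST_REUSE,     /* same NQN and same ID: take another reference */
    HOST_CONFLICT,  /* only one of the two matches: reject */
};

/*
 * Classify an (nqn, id) pair against a table of existing hosts.
 * On HOST_REUSE, *idx is set to the index of the matching entry.
 */
static enum host_match host_classify(const struct host *tbl, size_t n,
                                     const char *nqn,
                                     const unsigned char *id, size_t *idx)
{
    for (size_t i = 0; i < n; i++) {
        bool same_nqn = strcmp(tbl[i].nqn, nqn) == 0;
        bool same_id = memcmp(tbl[i].id, id, sizeof(tbl[i].id)) == 0;

        if (same_nqn && same_id) {
            *idx = i;
            return HOST_REUSE;
        }
        if (same_nqn || same_id)
            return HOST_CONFLICT;
    }
    return HOST_NEW;
}
```

A caller would take a reference on HOST_REUSE, report an error on HOST_CONFLICT, and allocate a new entry on HOST_NEW, which mirrors the three outcomes of the loop in nvmf_host_add.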

nvmf_host_default

static struct nvmf_host *nvmf_host_default(void)
{
    struct nvmf_host *host;
    char nqn[NVMF_NQN_SIZE];
    uuid_t id;

    uuid_gen(&id);
    snprintf(nqn, NVMF_NQN_SIZE,
        "nqn.2014-08.org.nvmexpress:uuid:%pUb", &id);

    host = nvmf_host_alloc(nqn, &id);
    if (!host)
        return NULL;

    mutex_lock(&nvmf_hosts_mutex);
    list_add_tail(&host->list, &nvmf_hosts);
    mutex_unlock(&nvmf_hosts_mutex);

    return host;
}

This code defines a function nvmf_host_default that creates and adds a default NVMF host. The NQN of the default host is automatically generated from the UUID and follows a specific format.

Here are the key parts of this function:

  • The function first generates a UUID that uniquely identifies the default host.

  • Use snprintf to format the UUID as the NQN of the default host. NQN is a string that follows a specific format like "nqn.2014-08.org.nvmexpress:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx".

  • Call the nvmf_host_alloc function to allocate and initialize a new host object, using the NQN and UUID generated above.

  • Obtain the mutex lock nvmf_hosts_mutex and add the newly created host object to the host list.

  • Finally the mutex is unlocked and the newly created default host object is returned.

In short, nvmf_host_default creates a host whose NQN is derived from a freshly generated UUID, adds it to the host list, and returns a pointer to it, or NULL if the allocation fails. Unlike nvmf_host_add, it does not scan the list for duplicates first.
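The NQN format can be reproduced in userspace, where the kernel-only %pUb conversion is not available. This is a minimal sketch that spells the UUID out byte by byte; format_uuid_nqn and NQN_BUF_SIZE are hypothetical names, and the value 223 is assumed to mirror NVMF_NQN_SIZE.

```c
#include <stddef.h>
#include <stdio.h>

#define NQN_BUF_SIZE 223  /* assumption: mirrors the kernel's NVMF_NQN_SIZE */

/*
 * Format 16 raw UUID bytes as
 * "nqn.2014-08.org.nvmexpress:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
 * printing the bytes in order as lowercase hex.
 * Returns what snprintf returns: the length of the full string.
 */
static int format_uuid_nqn(char *buf, size_t size, const unsigned char u[16])
{
    return snprintf(buf, size,
        "nqn.2014-08.org.nvmexpress:uuid:"
        "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
        "%02x%02x%02x%02x%02x%02x",
        u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
        u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```

The fixed prefix is 32 characters and the UUID text is 36, so the result always fits comfortably in a buffer of NQN_BUF_SIZE bytes.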

nvmf_host_destroy & nvmf_host_put


static void nvmf_host_destroy(struct kref *ref)
{
    struct nvmf_host *host = container_of(ref, struct nvmf_host, ref);

    mutex_lock(&nvmf_hosts_mutex);
    list_del(&host->list);
    mutex_unlock(&nvmf_hosts_mutex);

    kfree(host);
}

static void nvmf_host_put(struct nvmf_host *host)
{
    if (host)
        kref_put(&host->ref, nvmf_host_destroy);
}

This code defines the release and reference count management functions associated with NVMF host objects.

  • The nvmf_host_destroy function is the release callback that kref_put invokes when the reference count reaches zero. It uses container_of to recover the enclosing nvmf_host from the embedded struct kref, removes the host from the host list under nvmf_hosts_mutex, and frees it with kfree.

  • The nvmf_host_put function is used to decrement the reference count of the host object. If the host object exists (non-null), decrement the reference count and call nvmf_host_destroy to release the host object when the reference count reaches zero.

Together, these functions manage the lifecycle of host objects, ensuring that associated resources and memory are properly released when they are no longer needed. This is important to avoid memory leaks and resource exhaustion.
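The get/put pattern with a release callback can be modeled in a few lines of userspace C. The sketch below is an illustration only: struct ref and the ref_* helpers are hypothetical stand-ins for struct kref and its API, they are not thread-safe, and the freed_flag member exists purely so a caller can observe the release.

```c
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref (not thread-safe). */
struct ref {
    int count;
};

struct host {
    struct ref ref;    /* first member, so a struct ref * maps back to the host */
    int *freed_flag;   /* illustration only: set to 1 when the host is released */
};

static void ref_init(struct ref *r) { r->count = 1; }
static void ref_get(struct ref *r)  { r->count++; }

/* Drop one reference; call release() when the last one goes away. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
    if (--r->count == 0) {
        release(r);
        return 1;
    }
    return 0;
}

/* Release callback: recover the enclosing object and free it. */
static void host_release(struct ref *r)
{
    /* ref is the first member, so this cast plays the role of container_of */
    struct host *h = (struct host *)r;

    if (h->freed_flag)
        *h->freed_flag = 1;
    free(h);
}

/* Tolerates NULL, as nvmf_host_put does. */
static void host_put(struct host *h)
{
    if (h)
        ref_put(&h->ref, host_release);
}
```

A host that has been initialized once and had ref_get called once needs two host_put calls before host_release runs, which is the same accounting nvmf_host_add and nvmf_host_put perform on a shared host.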

nvmf_get_address


/**
 * nvmf_get_address() - Get address/port
 * @ctrl: Host NVMe controller instance which we got the address
 * @buf: OUTPUT parameter that will contain the address/port
 * @size: buffer size
 */
int nvmf_get_address(struct nvme_ctrl *ctrl, char *buf, int size)
{
    int len = 0;

    if (ctrl->opts->mask & NVMF_OPT_TRADDR)
        len += scnprintf(buf, size, "traddr=%s", ctrl->opts->traddr);
    if (ctrl->opts->mask & NVMF_OPT_TRSVCID)
        len += scnprintf(buf + len, size - len, "%strsvcid=%s",
                (len) ? "," : "", ctrl->opts->trsvcid);
    if (ctrl->opts->mask & NVMF_OPT_HOST_TRADDR)
        len += scnprintf(buf + len, size - len, "%shost_traddr=%s",
                (len) ? "," : "", ctrl->opts->host_traddr);
    if (ctrl->opts->mask & NVMF_OPT_HOST_IFACE)
        len += scnprintf(buf + len, size - len, "%shost_iface=%s",
                (len) ? "," : "", ctrl->opts->host_iface);
    len += scnprintf(buf + len, size - len, "\n");

    return len;
}
EXPORT_SYMBOL_GPL(nvmf_get_address);

This code defines a function called nvmf_get_address, which is used to get the address and port information of the NVMe controller, and format it as a string and store it in the output buffer. This function determines which information to obtain by examining various flag bits in the controller options, and then formats the information into a string containing the address, port, and other relevant information.

The parameters of the function include:

  • ctrl: A pointer to an NVMe controller instance with options for address and port information to get.
  • buf: Output buffer for storing formatted address/port information.
  • size: The size of the output buffer.

Function implementation steps:

  1. Test each flag in ctrl->opts->mask (NVMF_OPT_TRADDR, NVMF_OPT_TRSVCID, NVMF_OPT_HOST_TRADDR, NVMF_OPT_HOST_IFACE) to see which options were set.
  2. For each option that is set, call scnprintf to append a key=value pair to the output buffer, preceded by a comma when something has already been written.
  3. Append a trailing newline and return the total number of characters written.

Because scnprintf returns the number of characters actually stored, not the number it would have needed, len can never run past the end of the buffer.

This function is exported with EXPORT_SYMBOL_GPL, which makes the symbol available only to modules that declare a GPL-compatible license. Such modules can call it to get the address and port information of the NVMe controller.
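The append-with-running-length idiom works the same way in userspace. The following sketch assumes a non-empty buffer and handles only two of the four options; bounded_printf is a hypothetical approximation of scnprintf built on vsnprintf, and format_address is a hypothetical reduced version of the function above in which a NULL argument stands for an option that was not set.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace approximation of the kernel's scnprintf():
 * returns the number of characters actually stored (excluding the NUL),
 * which is never more than size - 1.
 */
static int bounded_printf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (size == 0)
        return 0;
    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    if (n < 0)
        return 0;
    return (size_t)n >= size ? (int)(size - 1) : n;
}

/*
 * Build "traddr=...,trsvcid=...\n". Either field may be NULL, meaning
 * the option was not set. Assumes size > 0.
 */
static int format_address(char *buf, int size,
                          const char *traddr, const char *trsvcid)
{
    int len = 0;

    if (traddr)
        len += bounded_printf(buf, size, "traddr=%s", traddr);
    if (trsvcid)
        len += bounded_printf(buf + len, size - len, "%strsvcid=%s",
                              len ? "," : "", trsvcid);
    len += bounded_printf(buf + len, size - len, "\n");
    return len;
}
```

When the buffer is too small, the output is truncated and later appends store nothing, but the returned length still describes exactly what is in the buffer.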

nvmf_reg_read32

/**
 * nvmf_reg_read32() - NVMe Fabrics "Property Get" API function.
 * @ctrl: Host NVMe controller instance maintaining the admin
 * queue used to submit the property read command to
 * the allocated NVMe controller resource on the target system.
 * @off: Starting offset value of the targeted property
 * register (see the fabrics section of the NVMe standard).
 * @val: OUTPUT parameter that will contain the value of
 * the property after a successful read.
 *
 * Used by the host system to retrieve a 32-bit capsule property value
 * from an NVMe controller on the target system.
 *
 * ("Capsule property" is an "PCIe register concept" applied to the
 * NVMe fabrics space.)
 *
 * Return:
 * 0: successful read
 * > 0: NVMe error status code
 * < 0: Linux errno error code
 */
int nvmf_reg_read32(struct nvme_ctrl *ctrl, u32 off, u32 *val)
{
    struct nvme_command cmd = { };
    union nvme_result res;
    int ret;

    cmd.prop_get.opcode = nvme_fabrics_command;
    cmd.prop_get.fctype = nvme_fabrics_type_property_get;
    cmd.prop_get.offset = cpu_to_le32(off);

    ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, &res, NULL, 0,
            NVME_QID_ANY, 0, 0);

    if (ret >= 0)
        *val = le64_to_cpu(res.u64);
    if (unlikely(ret != 0))
        dev_err(ctrl->device,
            "Property Get error: %d, offset %#x\n",
            ret > 0 ? ret & ~NVME_SC_DNR : ret, off);

    return ret;
}
EXPORT_SYMBOL_GPL(nvmf_reg_read32);

This code defines a function named nvmf_reg_read32 to read a 32-bit property value from the target system’s NVMe controller via the NVMe Fabrics “Property Get” API. This function is implemented by submitting a property read command on the admin queue. This function is used by the host system to retrieve property values from the target system’s NVMe controller.

The parameters of the function include:

  • ctrl: Pointer to the host-side NVMe controller instance whose admin queue carries the Property Get command to the controller on the target system.
  • off: The starting offset of the property to read (see the fabrics section of the NVMe standard).
  • val: Output parameter that receives the value of the property after a successful read.

Function implementation steps:

  1. Create a zero-initialized NVMe command structure cmd.
  2. Fill in the Property Get fields: the fabrics opcode, the command type, and the offset converted to little-endian.
  3. Call __nvme_submit_sync_cmd to submit the command synchronously on ctrl->fabrics_q and collect the result in res.
  4. If ret >= 0, convert the result from little-endian and store it in val. This branch also runs when ret > 0, so val is only meaningful when the function returns 0.
  5. If ret is nonzero, print an error message in the device log. For an NVMe status code, the NVME_SC_DNR bit is masked off before printing.

Function return value description:

  • 0: The property was read successfully.
  • > 0: NVMe error status code, which may have the NVME_SC_DNR (Do Not Retry) bit set.
  • < 0: Linux errno error code.

Finally, this function is exported with EXPORT_SYMBOL_GPL, which makes it available only to modules that declare a GPL-compatible license. Such modules can call it to read property values from the target system's NVMe controller.
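The three-way return convention and the DNR masking can be captured in a small helper. This is a hedged sketch: SC_DNR is assumed to mirror the value of the kernel's NVME_SC_DNR flag (0x4000), and classify_result and loggable_code are hypothetical names.

```c
/* Assumption: mirrors the kernel's NVME_SC_DNR ("Do Not Retry") flag. */
#define SC_DNR 0x4000

enum result_kind { RESULT_OK, RESULT_NVME_STATUS, RESULT_ERRNO };

/* Map the int return convention onto the three documented cases. */
static enum result_kind classify_result(int ret)
{
    if (ret == 0)
        return RESULT_OK;
    return ret > 0 ? RESULT_NVME_STATUS : RESULT_ERRNO;
}

/*
 * The value that would be logged: an NVMe status with the DNR flag
 * removed, or the errno unchanged.
 */
static int loggable_code(int ret)
{
    return ret > 0 ? (ret & ~SC_DNR) : ret;
}
```

For example, a return value of 0x4002 is an NVMe status whose logged form is 0x2, while -5 is passed through untouched as an errno.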

nvmf_reg_read64

/**
 * nvmf_reg_read64() - NVMe Fabrics "Property Get" API function.
 * @ctrl: Host NVMe controller instance maintaining the admin
 * queue used to submit the property read command to
 * the allocated controller resource on the target system.
 * @off: Starting offset value of the targeted property
 * register (see the fabrics section of the NVMe standard).
 * @val: OUTPUT parameter that will contain the value of
 * the property after a successful read.
 *
 * Used by the host system to retrieve a 64-bit capsule property value
 * from an NVMe controller on the target system.
 *
 * ("Capsule property" is an "PCIe register concept" applied to the
 * NVMe fabrics space.)
 *
 * Return:
 * 0: successful read
 * > 0: NVMe error status code
 * < 0: Linux errno error code
 */
int nvmf_reg_read64(struct nvme_ctrl *ctrl, u32 off, u64 *val)
{
    struct nvme_command cmd = { };
    union nvme_result res;
    int ret;

    cmd.prop_get.opcode = nvme_fabrics_command;
    cmd.prop_get.fctype = nvme_fabrics_type_property_get;
    cmd.prop_get.attrib = 1;
    cmd.prop_get.offset = cpu_to_le32(off);

    ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, &res, NULL, 0,
            NVME_QID_ANY, 0, 0);

    if (ret >= 0)
        *val = le64_to_cpu(res.u64);
    if (unlikely(ret != 0))
        dev_err(ctrl->device,
            "Property Get error: %d, offset %#x\n",
            ret > 0 ? ret & ~NVME_SC_DNR : ret, off);
    return ret;
}
EXPORT_SYMBOL_GPL(nvmf_reg_read64);

This code defines a function named nvmf_reg_read64 to read a 64-bit property value from the target system’s NVMe controller via the NVMe Fabrics “Property Get” API. This function is similar to the previous nvmf_reg_read32 function, and it is also implemented by submitting attribute read commands on the admin queue.

The parameters and description of the function are the same as the previous function:

  • ctrl: Pointer to the host-side NVMe controller instance whose admin queue carries the Property Get command to the controller on the target system.
  • off: The starting offset of the property to read (see the fabrics section of the NVMe standard).
  • val: Output parameter that receives the 64-bit value of the property after a successful read.

The implementation steps of the function are similar to the previous function:

  1. Create a zero-initialized NVMe command structure cmd.
  2. Fill in the Property Get fields: the fabrics opcode, the command type, and the offset. The difference from nvmf_reg_read32 is that attrib is set to 1 here, which requests an 8-byte property instead of a 4-byte one.
  3. Call __nvme_submit_sync_cmd to submit the command synchronously on ctrl->fabrics_q and collect the result in res.
  4. If ret >= 0, convert the result from little-endian and store it in val. As before, val is only meaningful when the function returns 0.
  5. If ret is nonzero, print an error message in the device log, with the NVME_SC_DNR bit masked off for NVMe status codes.

The return value specification of the function is the same as the previous function:

  • 0: The property was read successfully.
  • > 0: NVMe error status code, which may have the NVME_SC_DNR (Do Not Retry) bit set.
  • < 0: Linux errno error code.

Likewise, this function is exported with EXPORT_SYMBOL_GPL, which makes it available only to modules that declare a GPL-compatible license.
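The only difference between the 32-bit and 64-bit reads is how many bytes of the little-endian result are significant. The sketch below illustrates that with a byte-wise decoder; decode_property is a hypothetical helper, and the mapping of attrib 0 to 4 bytes and attrib 1 to 8 bytes is stated here as an assumption about the Property Get size encoding.

```c
#include <stdint.h>

/*
 * Decode a little-endian property value from a raw 8-byte result field.
 * Assumption: attrib 0 selects a 4-byte property, attrib 1 an 8-byte one.
 */
static uint64_t decode_property(const uint8_t raw[8], unsigned attrib)
{
    unsigned nbytes = (attrib & 0x7) == 1 ? 8 : 4;
    uint64_t v = 0;

    /* byte 0 is the least significant byte */
    for (unsigned i = 0; i < nbytes; i++)
        v |= (uint64_t)raw[i] << (8 * i);
    return v;
}
```

Decoding byte by byte gives the same answer on little-endian and big-endian hosts, which is the job le64_to_cpu does in the kernel code.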

nvmf_reg_write32

/**
 * nvmf_reg_write32() - NVMe Fabrics "Property Write" API function.
 * @ctrl: Host NVMe controller instance maintaining the admin
 * queue used to submit the property read command to
 * the allocated NVMe controller resource on the target system.
 * @off: Starting offset value of the targeted property
 * register (see the fabrics section of the NVMe standard).
 * @val: Input parameter that contains the value to be
 * written to the property.
 *
 * Used by the NVMe host system to write a 32-bit capsule property value
 * to an NVMe controller on the target system.
 *
 * ("Capsule property" is an "PCIe register concept" applied to the
 * NVMe fabrics space.)
 *
 * Return:
 * 0: successful write
 * > 0: NVMe error status code
 * < 0: Linux errno error code
 */
int nvmf_reg_write32(struct nvme_ctrl *ctrl, u32 off, u32 val)
{
    struct nvme_command cmd = { };
    int ret;

    cmd.prop_set.opcode = nvme_fabrics_command;
    cmd.prop_set.fctype = nvme_fabrics_type_property_set;
    cmd.prop_set.attrib = 0;
    cmd.prop_set.offset = cpu_to_le32(off);
    cmd.prop_set.value = cpu_to_le64(val);

    ret = __nvme_submit_sync_cmd(ctrl->fabrics_q, &cmd, NULL, NULL, 0,
            NVME_QID_ANY, 0, 0);
    if (unlikely(ret))
        dev_err(ctrl->device,
            "Property Set error: %d, offset %#x\n",
            ret > 0 ? ret & ~NVME_SC_DNR : ret, off);
    return ret;
}
EXPORT_SYMBOL_GPL(nvmf_reg_write32);

This code defines nvmf_reg_write32, which writes a 32-bit property value to the NVMe controller on the target system. The kernel comment calls this the “Property Write” API; as the code shows, the command it actually sends is the NVMe Fabrics Property Set command. Like the property read functions, it submits the command on the admin queue.

The parameters mirror those of the read functions, except that val is now an input:

  • ctrl: Pointer to the host-side NVMe controller instance whose admin queue carries the Property Set command to the controller on the target system.
  • off: The starting offset of the property to write (see the fabrics section of the NVMe standard).
  • val: Input parameter containing the 32-bit value to write to the property.

The implementation steps of the function are similar to the previous function:

  1. Create a zero-initialized NVMe command structure cmd.
  2. Fill in the Property Set fields: the fabrics opcode, the command type, attrib set to 0, and the offset and value converted to little-endian.
  3. Call __nvme_submit_sync_cmd to submit the command synchronously on ctrl->fabrics_q. No result buffer is passed, because nothing is read back.
  4. If ret is nonzero, print an error message in the device log, with the NVME_SC_DNR bit masked off for NVMe status codes.

The return value description of the function is the same as the previous function:

  • 0: The property was written successfully.
  • > 0: NVMe error status code, which may have the NVME_SC_DNR (Do Not Retry) bit set.
  • < 0: Linux errno error code.

Likewise, this function is exported with EXPORT_SYMBOL_GPL, which makes it available only to modules that declare a GPL-compatible license.

nvmf_log_connect_error


/**
 * nvmf_log_connect_error() - Error-parsing-diagnostic print out function for
 * connect() errors.
 * @ctrl: The specific /dev/nvmeX device that had the error.
 * @errval: Error code to be decoded in a more human-friendly
 * printout.
 * @offset: For use with the NVMe error code
 * NVME_SC_CONNECT_INVALID_PARAM.
 * @cmd: This is the SQE portion of a submission capsule.
 * @data: This is the "Data" portion of a submission capsule.
 */
static void nvmf_log_connect_error(struct nvme_ctrl *ctrl,
        int errval, int offset, struct nvme_command *cmd,
        struct nvmf_connect_data *data)
{
    int err_sctype = errval & ~NVME_SC_DNR;

    if (errval < 0) {
        dev_err(ctrl->device,
            "Connect command failed, errno: %d\n", errval);
        return;
    }

    switch (err_sctype) {
    case NVME_SC_CONNECT_INVALID_PARAM:
        if (offset >> 16) {
            char *inv_data = "Connect Invalid Data Parameter";

            switch (offset & 0xffff) {
            case (offsetof(struct nvmf_connect_data, cntlid)):
                dev_err(ctrl->device,
                    "%s, cntlid: %d\n",
                    inv_data, data->cntlid);
                break;
            case (offsetof(struct nvmf_connect_data, hostnqn)):
                dev_err(ctrl->device,
                    "%s, hostnqn \"%s\"\n",
                    inv_data, data->hostnqn);
                break;
            case (offsetof(struct nvmf_connect_data, subsysnqn)):
                dev_err(ctrl->device,
                    "%s, subsysnqn \"%s\"\n",
                    inv_data, data->subsysnqn);
                break;
            default:
                dev_err(ctrl->device,
                    "%s, starting byte offset: %d\n",
                    inv_data, offset & 0xffff);
                break;
            }
        } else {
            char *inv_sqe = "Connect Invalid SQE Parameter";

            switch (offset) {
            case (offsetof(struct nvmf_connect_command, qid)):
                dev_err(ctrl->device,
                    "%s, qid %d\n",
                    inv_sqe, cmd->connect.qid);
                break;
            default:
                dev_err(ctrl->device,
                    "%s, starting byte offset: %d\n",
                    inv_sqe, offset);
            }
        }
        break;
    case NVME_SC_CONNECT_INVALID_HOST:
        dev_err(ctrl->device,
            "Connect for subsystem %s is not allowed, hostnqn: %s\n",
            data->subsysnqn, data->hostnqn);
        break;
    case NVME_SC_CONNECT_CTRL_BUSY:
        dev_err(ctrl->device,
            "Connect command failed: controller is busy or not available\n");
        break;
    case NVME_SC_CONNECT_FORMAT:
        dev_err(ctrl->device,
            "Connect incompatible format: %d",
            cmd->connect.recfmt);
        break;
    case NVME_SC_HOST_PATH_ERROR:
        dev_err(ctrl->device,
            "Connect command failed: host path error\n");
        break;
    case NVME_SC_AUTH_REQUIRED:
        dev_err(ctrl->device,
            "Connect command failed: authentication required\n");
        break;
    default:
        dev_err(ctrl->device,
            "Connect command failed, error wo/DNR bit: %d\n",
            err_sctype);
        break;
    }
}

This code defines a function called nvmf_log_connect_error for error analysis and printing when a connection error occurs. This function will output the connection error information to the log of the device according to the incoming error code, offset value and related command data.

The parameters of the function are as follows:

  • ctrl: A pointer to an NVMe controller instance, representing the specific device where the connection error occurred.
  • errval: The error code to decode, either a negative Linux errno or a positive NVMe error status code.
  • offset: The offset value related to certain error codes, used to indicate the specific location of the error.
  • cmd: A pointer to the NVMe command structure, representing the SQE part (Submission Queue Entry) of the connection command.
  • data: A pointer to the connection data structure, representing the "Data" part of the connection data.

The function first checks the sign of errval and then, for NVMe status codes, branches on the status with the NVME_SC_DNR bit masked off. Here are the main steps of the function:

  1. If errval is negative, it is a Linux errno rather than an NVMe status. The function logs it with dev_err and returns immediately.

  2. Otherwise the masked status code err_sctype selects the message:

    • NVME_SC_CONNECT_INVALID_PARAM: Invalid connection parameters.
      • If offset >> 16 is nonzero, the invalid parameter is in the connect data. The low 16 bits of offset give its byte offset, which is matched against the cntlid, hostnqn, and subsysnqn fields.
      • If offset >> 16 is zero, the invalid parameter is in the connect command (the SQE), and offset is matched against the qid field.
      • In both cases an unrecognized offset is printed as a raw starting byte offset.
    • NVME_SC_CONNECT_INVALID_HOST: This host is not allowed to connect to the subsystem.
    • NVME_SC_CONNECT_CTRL_BUSY: The controller is busy or unavailable.
    • NVME_SC_CONNECT_FORMAT: The record format in the connect command is incompatible.
    • NVME_SC_HOST_PATH_ERROR: A host path error occurred.
    • NVME_SC_AUTH_REQUIRED: Authentication is required.
    • Any other status: The masked status code is printed as a number.

In short, this function is used to parse and output connection error information to the device's log to facilitate troubleshooting and debugging connection problems.
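The way offset is split into a "data or SQE" flag and a byte offset can be sketched on its own. In the example below, struct connect_data is a hypothetical, shortened layout used only to obtain distinct offsetof values (the real nvmf_connect_data has more fields and different sizes), and decode_offset and enum bad_field are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, shortened layout; not the real nvmf_connect_data. */
struct connect_data {
    uint8_t  hostid[16];
    uint16_t cntlid;
    char     subsysnqn[32];
    char     hostnqn[32];
};

enum bad_field {
    BAD_SQE,         /* upper 16 bits clear: parameter is in the SQE */
    BAD_CNTLID,
    BAD_SUBSYSNQN,
    BAD_HOSTNQN,
    BAD_DATA_OTHER,  /* in the data, but not a field we recognize */
};

/* Split offset into its flag (upper 16 bits) and byte offset (lower 16). */
static enum bad_field decode_offset(int offset)
{
    if (!(offset >> 16))
        return BAD_SQE;

    switch (offset & 0xffff) {
    case offsetof(struct connect_data, cntlid):
        return BAD_CNTLID;
    case offsetof(struct connect_data, subsysnqn):
        return BAD_SUBSYSNQN;
    case offsetof(struct connect_data, hostnqn):
        return BAD_HOSTNQN;
    default:
        return BAD_DATA_OTHER;
    }
}
```

Using offsetof in the case labels keeps the decoder in step with the structure layout, so reordering fields does not require touching any hard-coded numbers.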