Linux kernel: obtaining the struct inode that corresponds to a file inode number

Article directory

  • Preface
  • 1. Introduction
  • 2. iget_locked
    • 2.1 Introduction
    • 2.2 Used in the kernel
    • 2.3 LKM demo
  • 3. ext4_iget
    • 3.1 Introduction
    • 3.2 LKM demo

Preface

For background on file inode numbers and the struct inode structure, see:
Linux file path, directory entry, inode number association
Linux file system struct inode structure analysis

1. Introduction

In Linux, every file and directory is associated with an inode, and each inode has an inode number. The inode number identifies the inode within its file system, and the inode holds the metadata of the file or directory.

Within a single file system, inode numbers are unique: no two inodes in use at the same time share a number. Each file system maintains its own inode table, and an inode number is only meaningful relative to that table.

However, inode numbers can repeat across file systems. When several file systems are mounted on the same system, each one manages an independent inode number space. Two files with the same inode number on different file systems are therefore different files with no relation to each other.

Note that a file system implementation may restrict the range of inode numbers or the way they are allocated, depending on its design.

This situation arises, for example, in the following cases:
(1) Multiple independent physical disks or partitions: the file system on each disk or partition has its own inode number space.

(2) File system images: each separately mounted file system instance, such as copies of the same image attached through different loop devices, has its own inode number space, so the same numbers appear in each.

(3) Network File System (NFS): when a remote file system is mounted over NFS, the inode numbers of the local and remote file systems are independent of each other.

This design lets different file systems coexist in one operating system and be reached through mount points, with uniqueness guaranteed only inside each file system's own inode number space.

Because inode numbers can repeat across file systems, an inode number alone should not be used to identify a file. Across file systems, the number must be combined with something that identifies the file system; across machines, a file path or another identifier is more reliable and portable.

2. iget_locked

Next, we describe how to obtain the struct inode object from a file system (that is, a super block) plus a file inode number.

[Figure: flow chart of the lookup]

2.1 Introduction

As described above, files on different file systems can have the same inode number; the number is unique only within its own file system. In kernel code, looking up a struct inode therefore requires both the file system (super block) and the inode number. The inode number alone is not enough. The iget_locked function performs this lookup:

/**
 * iget_locked - obtain an inode from a mounted file system
 * @sb: super block of file system
 * @ino: inode number to get
 *
 * Search for the inode specified by @ino in the inode cache and if present
 * return it with an increased reference count. This is for file systems
 * where the inode number is sufficient for unique identification of an inode.
 *
 * If the inode is not in cache, allocate a new inode and return it locked,
 * hashed, and with the I_NEW flag set. The file system gets to fill it in
 * before unlocking it via unlock_new_inode().
 */
struct inode *iget_locked(struct super_block *sb, unsigned long ino)
{
	struct hlist_head *head = inode_hashtable + hash(sb, ino);
	struct inode *inode;
again:
	spin_lock(&inode_hash_lock);
	inode = find_inode_fast(sb, head, ino);
	spin_unlock(&inode_hash_lock);
	if (inode) {
		if (IS_ERR(inode))
			return NULL;
		wait_on_inode(inode);
		if (unlikely(inode_unhashed(inode))) {
			iput(inode);
			goto again;
		}
		return inode;
	}

	inode = alloc_inode(sb);
	if (inode) {
		struct inode *old;

		spin_lock(&inode_hash_lock);
		/* We released the lock, so.. */
		old = find_inode_fast(sb, head, ino);
		if (!old) {
			inode->i_ino = ino;
			spin_lock(&inode->i_lock);
			inode->i_state = I_NEW;
			hlist_add_head(&inode->i_hash, head);
			spin_unlock(&inode->i_lock);
			inode_sb_list_add(inode);
			spin_unlock(&inode_hash_lock);

			/* Return the locked inode with I_NEW set, the
			 * caller is responsible for filling in the contents
			 */
			return inode;
		}

		/*
		 * Uhhuh, somebody else created the same inode under
		 * us. Use the old inode instead of the one we just
		 * allocated.
		 */
		spin_unlock(&inode_hash_lock);
		destroy_inode(inode);
		if (IS_ERR(old))
			return NULL;
		inode = old;
		wait_on_inode(inode);
		if (unlikely(inode_unhashed(inode))) {
			iput(inode);
			goto again;
		}
	}
	return inode;
}
EXPORT_SYMBOL(iget_locked);

The iget_locked function is usually used in file systems to obtain an inode (index node) from a mounted file system.

The iget_locked function provides quick access to inode objects based on the inode number and superblock, the combination of which is unique system-wide.

The function works as follows:
First, the hash function computes a hash value from the super block sb and the inode number ino. This value selects a bucket (hash chain) in the inode hash table.

The function takes the inode_hash_lock spin lock, searches the chain (the hlist_head pointer head) for the inode with find_inode_fast, and then releases the lock.

If find_inode_fast returns an error pointer (checked with IS_ERR), iget_locked returns NULL. If it returns a valid inode, the function calls wait_on_inode to wait until any initialization still in progress on that inode (the I_NEW state) has finished.

If the inode turns out to be unhashed after the wait (that is, it is no longer in the hash table), the function drops its reference with iput and restarts the search.

If the inode is not in the cache, the function allocates a new one with alloc_inode, which creates and initializes a struct inode.

Because inode_hash_lock is not held during the allocation, another thread may have added the same inode in the meantime. The function therefore takes the lock again and repeats the search.

If nobody else has added the inode, the new inode is set up: i_ino is set to ino, i_state is set to I_NEW to mark the inode as new and not yet fully initialized, hlist_add_head inserts it into the hash table, and inode_sb_list_add adds it to the super block’s inode list.

The locks are then released and the inode is returned with I_NEW set. The caller is responsible for filling in the contents and then calling unlock_new_inode.

If another thread did add the inode in the meantime, the function destroys the one it just allocated with destroy_inode and uses the existing inode instead, applying the same wait_on_inode and unhashed checks as before.

Finally, the function returns the inode (newly allocated or existing) to the caller. If alloc_inode fails, the return value is NULL.
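The find-or-allocate pattern described above can be sketched in user space. This is a simplified, single-threaded toy model and not the kernel implementation: there is no locking, no waiting, and no freeing, and every name prefixed with toy_ or TOY_ is hypothetical. It only shows that the lookup key is the pair (super block, inode number) and that a freshly allocated entry is returned with a "new" flag for the caller to clear.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_HASH_BITS 6
#define TOY_HASH_SIZE (1u << TOY_HASH_BITS)
#define TOY_I_NEW     0x1u

struct toy_sb { int id; };            /* stands in for struct super_block */

struct toy_inode {                    /* stands in for struct inode */
    struct toy_sb *sb;
    unsigned long ino;
    unsigned int state;               /* TOY_I_NEW until the caller fills it in */
    unsigned int count;               /* reference count */
    struct toy_inode *next;           /* hash chain link */
};

static struct toy_inode *toy_table[TOY_HASH_SIZE];

/* Mix the super block pointer and the inode number into a bucket index. */
static unsigned int toy_hash(const struct toy_sb *sb, unsigned long ino)
{
    uintptr_t h = (uintptr_t)sb ^ ((uintptr_t)ino * 0x9E3779B1u);

    return (unsigned int)(h ^ (h >> TOY_HASH_BITS)) & (TOY_HASH_SIZE - 1);
}

/* Search one chain; both the inode number and the super block must match. */
static struct toy_inode *toy_find(struct toy_sb *sb, unsigned long ino)
{
    struct toy_inode *p;

    for (p = toy_table[toy_hash(sb, ino)]; p; p = p->next) {
        if (p->ino == ino && p->sb == sb) {
            p->count++;               /* take a reference on a cache hit */
            return p;
        }
    }
    return NULL;
}

/* Return the cached entry, or allocate, hash, and return a new one. */
static struct toy_inode *toy_iget(struct toy_sb *sb, unsigned long ino)
{
    struct toy_inode *inode;
    unsigned int bucket;

    assert(sb != NULL);
    inode = toy_find(sb, ino);
    if (inode)
        return inode;

    inode = calloc(1, sizeof(*inode));
    if (!inode)
        return NULL;
    bucket = toy_hash(sb, ino);
    inode->sb = sb;
    inode->ino = ino;
    inode->state = TOY_I_NEW;
    inode->count = 1;
    inode->next = toy_table[bucket];
    toy_table[bucket] = inode;
    return inode;
}

/* The caller clears the flag once it has filled in the entry. */
static void toy_unlock_new(struct toy_inode *inode)
{
    inode->state &= ~TOY_I_NEW;
}
```

Looking up the same (sb, ino) pair twice returns the same object, while the same inode number under a different super block yields a separate object, which is exactly why the inode number alone is not a sufficient key.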

/*
 * find_inode_fast is the fast path version of find_inode, see the comment at
 * iget_locked for details.
 */
static struct inode *find_inode_fast(struct super_block *sb,
				struct hlist_head *head, unsigned long ino)
{
	struct inode *inode = NULL;

repeat:
	hlist_for_each_entry(inode, head, i_hash) {
		if (inode->i_ino != ino)
			continue;
		if (inode->i_sb != sb)
			continue;
		spin_lock(&inode->i_lock);
		if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
			__wait_on_freeing_inode(inode);
			goto repeat;
		}
		if (unlikely(inode->i_state & I_CREATING)) {
			spin_unlock(&inode->i_lock);
			return ERR_PTR(-ESTALE);
		}
		__iget(inode);
		spin_unlock(&inode->i_lock);
		return inode;
	}
	return NULL;
}

find_inode_fast quickly finds the inode that matches the given inode number ino and super block sb in the given hash chain head.

The function uses the hlist_for_each_entry macro to iterate through each inode in the hash linked list. For each inode, it first checks if the inode number and superblock match the given one, if not, then continues traversing the next inode.

If a matching inode is found, the function will acquire the inode’s spin lock and then perform a series of checks:
● First, it checks whether the status of the inode is I_FREEING or I_WILL_FREE, which indicates that the inode is being released or is about to be released. If this is the case, the function will call the __wait_on_freeing_inode function to wait for the inode to be completely freed, and then start the search again from the beginning.

● Next, the function checks whether the inode is in the I_CREATING state, which means it is still being created. If so, the function releases the inode’s spin lock and returns the error pointer ERR_PTR(-ESTALE), marking the lookup result as stale.

● Finally, if the inode’s status is normal, the function will call the __iget function to increment the inode’s reference count, then release the inode’s spin lock, and return a pointer to the inode.

If no matching inode is found in the entire hash list, the function returns NULL.

This function is used to quickly find inodes in the inode cache to speed up access operations to inodes in the file system.
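find_inode_fast has three possible results: NULL (not found), an error pointer, or a valid inode. The sketch below is a user-space imitation of how an error code can be carried inside a pointer value so that one return value covers all three cases. It assumes, as the kernel convention does, that the top 4095 addresses are never valid pointers. The toy_ names are hypothetical stand-ins, not the kernel's ERR_PTR, PTR_ERR, and IS_ERR macros themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer in the top 4095 addresses. */
static void *toy_err_ptr(long error)
{
    assert(error < 0 && error >= -TOY_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers that fall in the reserved error range. */
static int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-TOY_MAX_ERRNO);
}

enum toy_lookup { TOY_LOOKUP_MISS, TOY_LOOKUP_ERROR, TOY_LOOKUP_HIT };

/* Classify a lookup result the way a caller of find_inode_fast must. */
static enum toy_lookup toy_classify(const void *result)
{
    if (result == NULL)
        return TOY_LOOKUP_MISS;
    if (toy_is_err(result))
        return TOY_LOOKUP_ERROR;
    return TOY_LOOKUP_HIT;
}
```

Note that NULL is not an error pointer under this scheme, which is why iget_locked tests for NULL and for IS_ERR separately.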

2.2 Used in the kernel

iget_locked is normally called by a specific file system's own iget routine. It returns the struct inode for the given super block and inode number from the inode cache. If the inode is not cached, iget_locked returns a newly allocated inode that is already hashed into the cache. The file system then reads the raw on-disk inode (for example the ext4 inode) and uses its values to initialize the struct inode, so that the next lookup is served from the cache.

For example, minix file system:

/*
 * The global function to read an inode.
 */
struct inode *minix_iget(struct super_block *sb, unsigned long ino)
{
	struct inode *inode;

	inode = iget_locked(sb, ino);
	if (!inode)
		return ERR_PTR(-ENOMEM);
	if (!(inode->i_state & I_NEW))
		return inode;

	if (INODE_VERSION(inode) == MINIX_V1)
		return V1_minix_iget(inode);
	else
		return V2_minix_iget(inode);
}

minix_iget first calls iget_locked to look up the struct inode in the inode cache by super block and inode number. If the inode was already cached (I_NEW is not set), it is returned directly. Otherwise iget_locked has allocated a new struct inode and added it to the cache, and minix_iget calls the version-specific function, such as V1_minix_iget, to read the on-disk minix_inode and initialize the struct inode from its fields.

/*
 * The minix V1 function to read an inode.
 */
static struct inode *V1_minix_iget(struct inode *inode)
{
	struct buffer_head * bh;
	struct minix_inode * raw_inode;
	struct minix_inode_info *minix_inode = minix_i(inode);
	int i;

	// Read the on-disk inode (minix_inode) for this super block and inode number
	raw_inode = minix_V1_raw_inode(inode->i_sb, inode->i_ino, &bh);
	if (!raw_inode) {
		iget_failed(inode);
		return ERR_PTR(-EIO);
	}

	// Initialize the struct inode from the on-disk minix_inode
	inode->i_mode = raw_inode->i_mode;
	i_uid_write(inode, raw_inode->i_uid);
	i_gid_write(inode, raw_inode->i_gid);
	set_nlink(inode, raw_inode->i_nlinks);
	inode->i_size = raw_inode->i_size;
	inode->i_mtime.tv_sec = inode->i_atime.tv_sec = inode->i_ctime.tv_sec = raw_inode->i_time;
	inode->i_mtime.tv_nsec = 0;
	inode->i_atime.tv_nsec = 0;
	inode->i_ctime.tv_nsec = 0;
	inode->i_blocks = 0;
	for (i = 0; i < 9; i++)
		minix_inode->u.i1_data[i] = raw_inode->i_zone[i];
	minix_set_inode(inode, old_decode_dev(raw_inode->i_zone[0]));
	brelse(bh);
	unlock_new_inode(inode);
	return inode;
}

This function reads an inode in the Minix V1 file system.

Function analysis:
Call minix_V1_raw_inode with the super block, the inode number, and the address of a buffer head pointer to obtain the raw on-disk inode in Minix V1 format (struct minix_inode). If this fails, iget_failed discards the new inode and ERR_PTR(-EIO) is returned.

Copy the fields of the raw inode into the struct inode: the mode (i_mode), user ID (i_uid), group ID (i_gid), link count (i_nlinks), and file size (i_size). The single on-disk timestamp i_time is used for the modification time (i_mtime), the access time (i_atime), and the change time (i_ctime).

Copy the zone (data block) numbers from the i_zone array of the raw inode into the u.i1_data array of the in-memory struct minix_inode_info.

Call minix_set_inode, passing the device number decoded from the first zone entry with old_decode_dev. minix_set_inode finishes the setup according to the file type; the device number is only meaningful for device special files.

Release the buffer with brelse, clear I_NEW with unlock_new_inode, and return the struct inode object.
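Both the minix code above and the ext4 code later in this article decode a device number stored in the first block slots of a device special file. The following user-space sketch mirrors my understanding of the kernel helpers old_decode_dev, new_decode_dev, and new_encode_dev: the old format packs an 8-bit major and an 8-bit minor into 16 bits, and the new format spreads a 12-bit major and a 20-bit minor over 32 bits. The toy_ names and the struct are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

struct toy_dev {
    unsigned int major;
    unsigned int minor;
};

/* Old 16-bit format: high byte is the major, low byte is the minor. */
static struct toy_dev toy_old_decode_dev(uint16_t val)
{
    struct toy_dev d;

    d.major = (val >> 8) & 255u;
    d.minor = val & 255u;
    return d;
}

/* New 32-bit format: bits 8..19 hold the major, the rest hold the minor. */
static struct toy_dev toy_new_decode_dev(uint32_t dev)
{
    struct toy_dev d;

    d.major = (dev & 0xfff00u) >> 8;
    d.minor = (dev & 0xffu) | ((dev >> 12) & 0xfff00u);
    return d;
}

/* Inverse of toy_new_decode_dev, used here to check the round trip. */
static uint32_t toy_new_encode_dev(unsigned int major, unsigned int minor)
{
    assert(major <= 0xfffu && minor <= 0xfffffu);
    return (minor & 0xffu) | (major << 8) | ((minor & ~0xffu) << 12);
}
```

The low 16 bits of the new format keep the old layout, so small device numbers such as major 8, minor 1 encode to the same value in both formats.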

For example, the ext2 file system:

struct inode *ext2_iget (struct super_block *sb, unsigned long ino)
{
	struct ext2_inode_info *ei;
	struct buffer_head * bh = NULL;
	struct ext2_inode *raw_inode;
	struct inode *inode;
	long ret = -EIO;
	int n;
	uid_t i_uid;
	gid_t i_gid;

	inode = iget_locked(sb, ino);
	if (!inode)
		return ERR_PTR(-ENOMEM);
	if (!(inode->i_state & I_NEW))
		return inode;

	ei = EXT2_I(inode);
	ei->i_block_alloc_info = NULL;

	raw_inode = ext2_get_inode(inode->i_sb, ino, &bh);
	if (IS_ERR(raw_inode)) {
		ret = PTR_ERR(raw_inode);
		goto bad_inode;
	}

	inode->i_mode = le16_to_cpu(raw_inode->i_mode);
	i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
	i_gid = (gid_t)le16_to_cpu(raw_inode->i_gid_low);
	if (!(test_opt (inode->i_sb, NO_UID32))) {
		i_uid |= le16_to_cpu(raw_inode->i_uid_high) << 16;
		i_gid |= le16_to_cpu(raw_inode->i_gid_high) << 16;
	}
	i_uid_write(inode, i_uid);
	i_gid_write(inode, i_gid);
	set_nlink(inode, le16_to_cpu(raw_inode->i_links_count));
	inode->i_size = le32_to_cpu(raw_inode->i_size);
	inode->i_atime.tv_sec = (signed)le32_to_cpu(raw_inode->i_atime);
	inode->i_ctime.tv_sec = (signed)le32_to_cpu(raw_inode->i_ctime);
	inode->i_mtime.tv_sec = (signed)le32_to_cpu(raw_inode->i_mtime);
	inode->i_atime.tv_nsec = inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec = 0;
	...
}

2.3 LKM demo

Next we give an LKM example to obtain the corresponding struct inode structure based on the inode number:

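#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>

#include <linux/fs.h>
#include <linux/fs_struct.h>
#include <linux/path.h>

/* Module parameter: the inode number to look up */
static unsigned long ino = 1837047;

module_param(ino, ulong, 0);

static int __init hello_init(void)
{
    struct path pwd;
    struct super_block *sb;
    struct inode *inode;

    /* Use the super block of the current working directory */
    get_fs_pwd(current->fs, &pwd);
    sb = pwd.dentry->d_sb;

    /* Look up the struct inode object in the inode cache */
    inode = iget_locked(sb, ino);
    if (inode) {
        if (inode->i_state & I_NEW) {
            /*
             * Not cached: iget_locked() returned a new, locked,
             * uninitialized inode. This demo cannot fill it in,
             * so discard it instead of leaving it locked.
             */
            iget_failed(inode);
        } else {
            printk(KERN_INFO "inode num = %lu\n", inode->i_ino);
            /* Drop the reference taken by iget_locked() */
            iput(inode);
        }
    }

    path_put(&pwd);

    /* Fail on purpose so that the module is never left loaded */
    return -1;
}

module_init(hello_init);

MODULE_LICENSE("GPL");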

Test run (insmod reports an error because hello_init returns -1 on purpose):

# ls -il 3.txt
1837252

# insmod hello.ko ino=1837252
insmod: ERROR: could not insert module hello.ko: Operation not permitted

# dmesg -c
inode num = 1837252

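Note that iget_locked only searches the inode cache. If the inode is not cached, it does not read the disk itself: it returns a newly allocated inode with I_NEW set, and the file system's own iget function (such as ext4_iget in the next section) is what reads the on-disk inode and fills it in. For a pure cache lookup that never allocates, the kernel also provides ilookup().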

3. ext4_iget

3.1 Introduction

Let’s take a brief look at how the ext4 file system obtains its struct inode structure based on the inode number:

// linux-5.4.18/fs/ext4/ext4.h

typedef enum {
	EXT4_IGET_NORMAL =	0,
	EXT4_IGET_SPECIAL =	0x0001, /* OK to get a system inode */
	EXT4_IGET_HANDLE =	0x0002	/* Inode # is from a handle */
} ext4_iget_flags;

extern struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
				 ext4_iget_flags flags, const char *function,
				 unsigned int line);

#define ext4_iget(sb, ino, flags) \
	__ext4_iget((sb), (ino), (flags), __func__, __LINE__)

The ext4 file system obtains the struct inode structure through the ext4_iget macro. The macro definition ext4_iget calls the __ext4_iget function to obtain the struct inode structure corresponding to the specified inode number.

This macro definition accepts three parameters:

sb: Pointer to the super_block structure, indicating the super block to be operated.
ino: Indicates the inode number to be obtained.
flags: represents the flag bit of type ext4_iget_flags, used to specify the behavior of the __ext4_iget function.

The macro passes these arguments to __ext4_iget and supplies __func__ and __LINE__ as the last two arguments, that is, the name of the function and the line number where the macro is used. This information is for debugging: when an error occurs, the message can name the exact call site.
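The call-site capture technique is plain C and can be tried in user space. In this minimal sketch, toy_lookup, toy_lookup_at, and toy_last_site are hypothetical names; only the use of __func__ and __LINE__ inside a wrapper macro corresponds to what ext4_iget does.

```c
#include <assert.h>
#include <stdio.h>

struct toy_call_site {
    const char *function;
    unsigned int line;
};

/* Remembers where the most recent failing call came from. */
static struct toy_call_site toy_last_site;

/* The real worker takes the call site as explicit arguments. */
static int toy_lookup_at(unsigned long ino, const char *function,
                         unsigned int line)
{
    assert(function != NULL);
    if (ino == 0) {
        toy_last_site.function = function;
        toy_last_site.line = line;
        fprintf(stderr, "%s:%u: bad inode number %lu\n",
                function, line, ino);
        return -1;
    }
    return 0;
}

/* The macro fills in the caller's function name and line number. */
#define toy_lookup(ino) toy_lookup_at((ino), __func__, __LINE__)
```

Because __func__ and __LINE__ are expanded where the macro is used, the error message points at the caller, not at the helper that detected the problem.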

For the third parameter flags:

typedef enum {
	EXT4_IGET_NORMAL =	0,
	EXT4_IGET_SPECIAL =	0x0001, /* OK to get a system inode */
	EXT4_IGET_HANDLE =	0x0002	/* Inode # is from a handle */
} ext4_iget_flags;

The enumeration type ext4_iget_flags represents the flags accepted by ext4_iget.

It contains three members:

EXT4_IGET_NORMAL: a normal lookup, used to get the inode of an ordinary file or directory.

EXT4_IGET_SPECIAL: allows getting a system (reserved) inode. These are special-purpose inodes that ext4 uses internally, such as the journal inode.

EXT4_IGET_HANDLE: used when the inode number comes from a file handle, for example one supplied by an NFS client. Such a number may be stale or invalid, so an illegal number is reported to the caller as an error instead of being treated as file system corruption.
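How the flags influence the lookup can be sketched as a small validity check. This is an illustrative user-space model based on my reading of the inode number check in __ext4_iget, which falls in the part of the function elided from the listing in this article, so treat every detail as an assumption: the constants 2 (root inode) and 11 (first non-reserved inode) are typical ext4 values, EUCLEAN stands in for the kernel's EFSCORRUPTED, and all toy_ and TOY_ names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

enum toy_iget_flags {
    TOY_IGET_NORMAL  = 0,
    TOY_IGET_SPECIAL = 0x0001,   /* OK to get a system inode */
    TOY_IGET_HANDLE  = 0x0002    /* inode number comes from a handle */
};

#define TOY_ROOT_INO  2UL        /* assumed root directory inode number */
#define TOY_FIRST_INO 11UL       /* assumed first non-reserved inode number */

/*
 * Returns 0 if the inode number may be looked up with these flags,
 * or a negative errno value otherwise.
 */
static int toy_check_ino(unsigned long ino, unsigned long inodes_count,
                         unsigned int flags)
{
    int reserved = ino < TOY_FIRST_INO && ino != TOY_ROOT_INO;

    assert(inodes_count >= TOY_FIRST_INO);
    if ((!(flags & TOY_IGET_SPECIAL) && reserved) ||
        ino < TOY_ROOT_INO || ino > inodes_count) {
        /* A bad number from a handle is merely stale, not corruption. */
        return (flags & TOY_IGET_HANDLE) ? -ESTALE : -EUCLEAN;
    }
    return 0;
}
```

Under these assumptions, a reserved inode number is accepted only with the SPECIAL flag, and the HANDLE flag changes the error that an illegal number produces.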

struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
			  ext4_iget_flags flags, const char *function,
			  unsigned int line)
{
	struct ext4_iloc iloc;
	struct ext4_inode *raw_inode;
	struct ext4_inode_info *ei;
	struct inode *inode;
	journal_t *journal = EXT4_SB(sb)->s_journal;
	long ret;
	loff_t size;
	int block;
	uid_t i_uid;
	gid_t i_gid;
	projid_t i_projid;

	...

	// Look up the struct inode in the inode cache
	inode = iget_locked(sb, ino);
	if (!inode)
		return ERR_PTR(-ENOMEM);
	if (!(inode->i_state & I_NEW))
		// Found in the inode cache and not newly allocated: return it directly
		return inode;

	ei = EXT4_I(inode);
	iloc.bh = NULL;

	// Not found in the inode cache: read the on-disk ext4_inode
	ret = __ext4_get_inode_loc(inode, &iloc, 0);
	if (ret < 0)
		goto bad_inode;
	raw_inode = ext4_raw_inode(&iloc);

	// Initialize the newly allocated struct inode from the on-disk ext4_inode,
	// for example i_mode, i_uid, i_gid
	inode->i_mode = le16_to_cpu(raw_inode->i_mode);
	i_uid = (uid_t)le16_to_cpu(raw_inode->i_uid_low);
	i_gid = (gid_t)le16_to_cpu(raw_inode->i_gid_low);
	...
	i_uid_write(inode, i_uid);
	i_gid_write(inode, i_gid);

	...

	// Set i_op and i_fop according to the file type of the inode
	if (S_ISREG(inode->i_mode)) {
		inode->i_op = &ext4_file_inode_operations;
		inode->i_fop = &ext4_file_operations;
		ext4_set_aops(inode);
	} else if (S_ISDIR(inode->i_mode)) {
		inode->i_op = &ext4_dir_inode_operations;
		inode->i_fop = &ext4_dir_operations;
	} else if (S_ISLNK(inode->i_mode)) {
		/* VFS does not allow setting these so must be corrupt */
		if (IS_APPEND(inode) || IS_IMMUTABLE(inode)) {
			ext4_error_inode(inode, function, line, 0,
					 "iget: immutable or append flags "
					 "not allowed on symlinks");
			ret = -EFSCORRUPTED;
			goto bad_inode;
		}
		if (IS_ENCRYPTED(inode)) {
			inode->i_op = &ext4_encrypted_symlink_inode_operations;
			ext4_set_aops(inode);
		} else if (ext4_inode_is_fast_symlink(inode)) {
			inode->i_link = (char *)ei->i_data;
			inode->i_op = &ext4_fast_symlink_inode_operations;
			nd_terminate_link(ei->i_data, inode->i_size,
				sizeof(ei->i_data) - 1);
		} else {
			inode->i_op = &ext4_symlink_inode_operations;
			ext4_set_aops(inode);
		}
		inode_nohighmem(inode);
	} else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
	      S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
		inode->i_op = &ext4_special_inode_operations;
		if (raw_inode->i_block[0])
			init_special_inode(inode, inode->i_mode,
			   old_decode_dev(le32_to_cpu(raw_inode->i_block[0])));
		else
			init_special_inode(inode, inode->i_mode,
			   new_decode_dev(le32_to_cpu(raw_inode->i_block[1])));
	} else if (ino == EXT4_BOOT_LOADER_INO) {
		make_bad_inode(inode);
	} else {
		ret = -EFSCORRUPTED;
		ext4_error_inode(inode, function, line, 0,
				 "iget: bogus i_mode (%o)", inode->i_mode);
		goto bad_inode;
	}
	...
}

The __ext4_iget function first calls iget_locked to look up the struct inode object in the inode cache by super block and inode number. If the inode is found in the cache (I_NEW is not set), it is returned directly. Otherwise iget_locked has allocated a new struct inode and hashed it into the cache; __ext4_iget then uses __ext4_get_inode_loc and ext4_raw_inode to read the on-disk ext4 inode, initializes the struct inode from it, and selects i_op and i_fop according to the file type. Later lookups of the same inode are then served from the cache.

3.2 LKM demo

Next we give an LKM example to obtain the corresponding struct inode structure based on the inode number:

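#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

#include <linux/fs.h>
#include <linux/fs_struct.h>

#include <linux/kallsyms.h>
#include <linux/err.h>
#include <linux/sched.h>
#include <linux/string.h>

/* Module parameter: the inode number to look up */
static unsigned long ino = 1837047;

module_param(ino, ulong, 0);

/* Copied from fs/ext4/ext4.h, which is not visible to external modules */
typedef enum {
    EXT4_IGET_NORMAL = 0,
    EXT4_IGET_SPECIAL = 0x0001, /* OK to get a system inode */
    EXT4_IGET_HANDLE = 0x0002 /* Inode # is from a handle */
} ext4_iget_flags;

struct inode *(*my__ext4_iget)(struct super_block *sb, unsigned long ino,
                               ext4_iget_flags flags, const char *function,
                               unsigned int line);

static int __init hello_init(void)
{
    struct path pwd;
    struct super_block *sb;
    struct inode *inode;

    /* Use the super block of the current working directory */
    get_fs_pwd(current->fs, &pwd);
    sb = pwd.dentry->d_sb;

    /* __ext4_iget() must only be called on an ext4 super block */
    if (strcmp(sb->s_type->name, "ext4") != 0)
        goto out;

    /*
     * __ext4_iget is not exported, so look it up by name.
     * kallsyms_lookup_name() is available to modules on the 5.4
     * kernel used here, but is no longer exported on newer kernels.
     */
    my__ext4_iget = (void *)kallsyms_lookup_name("__ext4_iget");
    if (!my__ext4_iget)
        goto out;

    inode = my__ext4_iget(sb, ino, EXT4_IGET_NORMAL, __func__, __LINE__);

    /* __ext4_iget() reports failure with ERR_PTR() rather than NULL */
    if (!IS_ERR(inode)) {
        printk(KERN_INFO "inode num = %lu\n", inode->i_ino);
        /* Drop the reference taken by __ext4_iget() */
        iput(inode);
    }

out:
    path_put(&pwd);

    /* Fail on purpose so that the module is never left loaded */
    return -1;
}

module_init(hello_init);

MODULE_LICENSE("GPL");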