Linux driver development – PCI device driver

Table of Contents

1. Introduction to PCI protocol

2. PCI and PCI-e

3. Linux PCI driver

4. PCI device driver examples

5. Bus device driver development exercises


1. Introduction to PCI protocol

The PCI (Peripheral Component Interconnect) local bus is a bus standard developed by Intel together with several other companies. It was originally intended to replace buses such as ISA and to solve the bandwidth problem that graphical displays faced at the time. Compared with the ISA bus, its main features are high bandwidth, burst transfers and plug and play (automatic configuration). The PCI 3.0 specification defines clock rates of 33 MHz and 66 MHz (the 133 MHz rate that is often quoted alongside them comes from the PCI-X extension), and the supported data widths are 32 bits and 64 bits. The minimum peak transfer rate is therefore 33 MHz x 32 bits = 132 MB/s, that is, 132 million bytes per second, which fully met the needs of graphics cards at the time. The address and data lines of PCI are multiplexed, which reduces the number of chip pins. In a burst transfer, the address is sent once at the start of the transaction and several data words then follow back to back, so a single transaction can move many bytes. Plug and play is similar to what was described earlier for USB: each device on the bus stores configuration information, and during initialization the host reads this information and allocates the resources the device needs. This is discussed in more detail later.

As the PCI local bus developed, its range of applications kept growing. Discrete network cards, sound cards, data acquisition cards and similar devices in PCs all use the PCI local bus. Later, a serial standard, PCI-Express, was introduced, with a much higher transfer rate. In the PCI-Express 3.0 specification the rate reaches 8 GT/s per lane, that is, 8 billion transfers per second. Because they are so widely used, PCI and PCI-Express buses also appear in some embedded systems.
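The bandwidth arithmetic above can be checked with a few lines of C. This is only a sketch: the function names are made up for illustration, and the 128b/130b line encoding used for the PCI-Express 3.0 figure is not mentioned in the text above, it is brought in here as an assumption about how 8 GT/s translates into payload bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Peak rate of a parallel PCI bus in MB/s: one transfer of width_bits
 * per clock, so clock (MHz) x width (bytes). Hypothetical helper. */
static uint32_t pci_peak_mbytes(uint32_t clock_mhz, uint32_t width_bits)
{
	assert(width_bits == 32 || width_bits == 64);
	return clock_mhz * (width_bits / 8);
}

/* Payload rate of one PCIe 3.0 lane in MB/s. Assumption: each transfer
 * carries one bit and 128b/130b encoding leaves 128 of every 130 bits
 * for payload. */
static double pcie3_lane_mbytes(void)
{
	const double transfers_per_sec = 8e9;	/* 8 GT/s */

	return transfers_per_sec * 128.0 / 130.0 / 8.0 / 1e6;
}
```

Calling pci_peak_mbytes(33, 32) reproduces the 132 MB/s figure, and pcie3_lane_mbytes() comes out a little under 1000 MB/s for a single lane.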
The following is a brief introduction to the parts of the PCI 3.0 specification that driver developers need to care about. The following figure is the connection block diagram of a PCI system (quoted from the PCI 3.0 specification).


The processor (Processor) is connected to PCI Local Bus #0 through the Host/PCI bridge (Bridge). On this local bus there are a sound card (Audio), motion video (Motion Video), a graphics card (Graphics), a network card (LAN), a SCSI controller and so on. A PCI-to-PCI bridge extends the system with PCI Local Bus #1, to which further PCI devices are connected. In addition, there is a PCI-ISA bridge that converts the PCI bus to a traditional ISA bus.

The PCI local bus also has a master-slave structure. In the PCI specification, the master device is called the initiator and the slave device is called the target. A transfer is started by the master device and answered by the slave device. A PCI device must implement the target function, and it may also implement the initiator function. In other words, a device can be a master at one moment and a slave at another. Several master devices are allowed on one bus, and an arbiter decides which of them gets control of the bus. Below we only discuss PCI slave devices.
PCI defines three physical address spaces, including memory address space, I/O address space and configuration address space.

The configuration address space is required. This address space is used to configure the device hardware. In order to better understand the access of these three address spaces, let’s first take a look at PCI’s typical write transfer timing diagram, as shown in the figure below (quoted from the PCI3.0 specification).


When the initiator wants to write to the target, it first drives FRAME# low. In the first clock cycle after that (the address phase), the AD bus carries the address and C/BE# carries the bus command, which identifies the specific kind of operation. DEVSEL# is the acknowledgement driven by the selected target. In the following cycles (the data phases), the AD bus carries the data to be written and C/BE# carries the byte enables, which indicate which bytes are valid. IRDY# and TRDY# are the ready signals of the initiator and the target respectively; whenever either one is deasserted, a wait cycle is inserted automatically. FRAME# is deasserted during the last data phase, and the transfer finally completes when IRDY# is deasserted after that. A PCI read transfer is basically the same as a write, except that the data flows in the opposite direction. The bus commands mentioned above are shown in the figure below (quoted from the PCI 3.0 specification).

I/O read, I/O write, memory read, memory write, configuration read and configuration write are the reading and writing of the three physical address spaces we mentioned earlier. Let’s first look at how the configuration space is addressed. The address structure is shown in the figure below.

The PCI specification defines two types of configuration space address: Type 0 selects a device on the local bus, and Type 1 passes the request on to another bus. The fields in the address have the following meanings.
Bus Number: 8-bit bus address, selects one of 256 PCI local buses.
Device Number: 5-bit device address, selects one of 32 physical devices on a bus.
Function Number: 3-bit function address, selects one of 8 functions on a physical device. In other words, PCI devices are similar to USB devices: one physical device can have several functions and so implement several logical devices.
Register Number: selects a 32-bit register in the configuration space.
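As a rough illustration of how these fields combine, the sketch below packs them into a 32-bit Type 1 configuration address. The bit positions (bus in bits 23:16, device in 15:11, function in 10:8, register in 7:2, and the low two bits set to 01 for Type 1) are an assumption taken from the usual reading of the PCI specification, so check them against the figure; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack a Type 1 configuration address (assumed layout):
 * [23:16] bus, [15:11] device, [10:8] function,
 * [7:2] register (dword aligned), [1:0] = 01 marks Type 1. */
static uint32_t pci_type1_addr(unsigned bus, unsigned dev,
			       unsigned fn, unsigned reg_off)
{
	assert(bus < 256 && dev < 32 && fn < 8 && reg_off < 256);
	return ((uint32_t)bus << 16) | ((uint32_t)dev << 11) |
	       ((uint32_t)fn << 8) | (reg_off & 0xFCu) | 0x1u;
}

/* The low two bits distinguish Type 1 (01) from Type 0 (00). */
static bool pci_addr_is_type1(uint32_t addr)
{
	return (addr & 0x3u) == 0x1u;
}

/* Field extractors, the inverse of the packing above. */
static unsigned pci_addr_bus(uint32_t addr) { return (addr >> 16) & 0xFFu; }
static unsigned pci_addr_dev(uint32_t addr) { return (addr >> 11) & 0x1Fu; }
static unsigned pci_addr_fn(uint32_t addr)  { return (addr >> 8) & 0x7u; }
static unsigned pci_addr_reg(uint32_t addr) { return addr & 0xFCu; }
```

For example, bus 1, device 2, function 3 and register offset 0x10 pack to 0x00011311. The field widths of 8, 5 and 3 bits are what limit a system to 256 buses, 32 devices per bus and 8 functions per device.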

In the PCI specification, each register in the configuration space has a specific definition. The predefined header occupies the first 64 bytes of the 256-byte configuration space. We do not need to know the meaning of every register; the most important ones are listed below (for the definitions and addresses of the other registers, please refer to the PCI specification).

Vendor ID: 16 bits, hardware vendor ID.
Device ID: 16 bits, device ID.
Class Code: 24 bits, the category of the peripheral, such as mass storage controller, network controller, display controller, etc. A value of 0 indicates that the device does not belong to a specific class.
Subsystem Vendor ID: 16 bits, subsystem vendor ID.
Subsystem ID: 16 bits, subsystem ID.

Base Address Registers: 32 bits each. During startup, all PCI devices are probed. One of the important steps is to find out how much memory space and I/O space each device uses and then to assign each of those spaces a base address, which is stored in a base address register. There are 6 base address registers in the configuration space; the Linux driver code refers to them as BARs (bar).
The above ID and Class are used to match the driver, and the base address is used for the driver to perform resource acquisition and mapping operations, which will be described in more detail later. With the base address register, the problem of accessing memory space and IO space is easily solved, because we only need to issue the corresponding memory address or I/O address to access the corresponding space.
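To make the register list above more concrete, here is a small user-space sketch that pulls those fields out of a raw 64-byte header. The offsets (0x00 Vendor ID, 0x02 Device ID, 0x09 to 0x0B Class Code, 0x10 for the first BAR, 0x2C and 0x2E for the subsystem IDs) and the meaning of bit 0 of a BAR follow the standard header layout; the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields used to match a driver. */
struct pci_ids {
	uint16_t vendor, device, sub_vendor, sub_device;
	uint32_t class_code;	/* 24 bits */
};

/* Configuration space is little-endian. */
static uint16_t le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static struct pci_ids parse_ids(const uint8_t hdr[64])
{
	struct pci_ids ids;

	ids.vendor = le16(hdr + 0x00);
	ids.device = le16(hdr + 0x02);
	/* Byte 0x08 is the Revision ID; the class code is the upper 24 bits. */
	ids.class_code = le32(hdr + 0x08) >> 8;
	ids.sub_vendor = le16(hdr + 0x2C);
	ids.sub_device = le16(hdr + 0x2E);
	return ids;
}

/* Raw value of base address register n (0..5). */
static uint32_t bar_raw(const uint8_t hdr[64], unsigned n)
{
	assert(n < 6);
	return le32(hdr + 0x10 + 4 * n);
}

/* Bit 0 set means the BAR describes I/O space, clear means memory. */
static bool bar_is_io(uint32_t bar)
{
	return (bar & 0x1u) != 0;
}

/* Strip the type bits to get the base address itself. */
static uint32_t bar_base(uint32_t bar)
{
	return bar_is_io(bar) ? (bar & ~0x3u) : (bar & ~0xFu);
}
```

A driver never parses the header by hand like this, since the kernel has already done it and filled in struct pci_dev, but the sketch shows where the IDs and base addresses discussed here actually live.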

2. PCI and PCI-e

PCI and PCIe are both interface standards used in computers to connect devices, but there are some important differences between them.

PCI (Peripheral Component Interconnect) is an early computer bus standard, which is designed to connect various high-speed peripheral devices, such as graphics cards, sound cards, network cards, etc. The PCI bus is a shared bus, which means all devices share the same bandwidth. Therefore, performance degradation or conflicts may occur when multiple devices attempt to use the bus at the same time.

In contrast, PCIe (Peripheral Component Interconnect Express) is a more modern computer bus standard that is also widely used to connect various high-speed devices. Unlike PCI, PCIe is a point-to-point interconnect protocol, which means each device has its own dedicated connection and does not share bandwidth with other devices. This allows PCIe to significantly outperform PCI in performance, especially in high-bandwidth applications and multi-device environments.

In general, PCIe is superior to PCI in terms of performance, flexibility, and scalability, which is why most computers and devices today use PCIe interfaces. However, it should be noted that due to the complexity and cost of PCIe, PCI may still be used in some low-bandwidth and low-cost applications.

PCI-e is compatible with PCI at the software level. PCI is parallel transmission, and PCI-e is serial point-to-point transmission.

3. Linux PCI driver

As before, only PCI slave devices are discussed here. A PCI device is represented by struct pci_dev. This structure has many members, which are not listed here; you can refer to include/linux/pci.h in the kernel source. In it you will find the ID, class and other members mentioned earlier, as well as the IRQ line used by the device. Device IDs are described by struct pci_device_id. A driver usually defines an array of these to represent the list of devices it supports, similar to the USB device list seen earlier. The main functions and macros related to the PCI device structure are as follows.

int pci_enable_device(struct pci_dev *dev);
void pci_disable_device(struct pci_dev *dev);
pci_resource_start(dev, bar)
pci_resource_end(dev, bar)
pci_resource_flags(dev, bar)
pci_resource_len(dev, bar)
int pci_request_regions(struct pci_dev *pdev, const char *res_name);
void pci_release_regions(struct pci_dev *pdev);

pci_enable_device: Enable the PCI device. A PCI device must be enabled before it is operated on.
pci_disable_device: Disable the PCI device.
pci_resource_start: Get the start address of the resource recorded in base address register number bar of dev.
pci_resource_end: Get the end address of the resource recorded in base address register number bar of dev.
pci_resource_flags: Get the flags of the resource recorded in base address register number bar of dev, which indicate whether it is a memory resource or an I/O resource.
pci_resource_len: Get the size of the resource recorded in base address register number bar of dev.
pci_request_regions: Request the memory resources and I/O resources of the PCI device pdev, under the name res_name.
pci_release_regions: Release the memory resources and I/O resources of the PCI device pdev.
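The start, end, flags and length values returned by these macros can also be observed from user space: on Linux, each PCI device has a sysfs file named resource with one "start end flags" line per BAR. The sketch below parses such a line and derives the same information. The helper names are invented, the two flag values are copied from the kernel's IORESOURCE_IO and IORESOURCE_MEM definitions, and the length rule mirrors the classic definition of pci_resource_len, so treat it as an approximation rather than the kernel's exact code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define RES_FLAG_IO	0x00000100ULL	/* same value as IORESOURCE_IO */
#define RES_FLAG_MEM	0x00000200ULL	/* same value as IORESOURCE_MEM */

/* One BAR as described by a line of the sysfs "resource" file. */
struct bar_res {
	unsigned long long start, end, flags;
};

/* Parse "0x<start> 0x<end> 0x<flags>"; returns false on malformed input. */
static bool parse_resource_line(const char *line, struct bar_res *r)
{
	assert(line != NULL && r != NULL);
	return sscanf(line, "%llx %llx %llx",
		      &r->start, &r->end, &r->flags) == 3;
}

/* Size of the region; an unused BAR (all zeroes) has length 0. */
static unsigned long long res_len(const struct bar_res *r)
{
	if (r->start == 0 && r->end == 0)
		return 0;
	return r->end - r->start + 1;
}

static bool res_is_io(const struct bar_res *r)
{
	return (r->flags & RES_FLAG_IO) != 0;
}

static bool res_is_mem(const struct bar_res *r)
{
	return (r->flags & RES_FLAG_MEM) != 0;
}
```

Feeding it a line such as "0x000000000000e000 0x000000000000e0e7 0x0000000000040101" (a made-up example) gives an I/O resource of 232 bytes, which is the kind of check a probe function performs with pci_resource_flags and pci_resource_len before mapping anything.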

The kernel uses struct pci_driver to represent a PCI device driver. The main related functions and macros are as follows.

void pci_unregister_driver(struct pci_driver *dev);
pci_register_driver(driver)
void pci_set_drvdata(struct pci_dev *pdev, void *data);
void *pci_get_drvdata(struct pci_dev *pdev);

pci_register_driver: Register the PCI device driver.
pci_unregister_driver: Unregister the PCI device driver dev.
pci_set_drvdata: Save the pointer data in the PCI device pdev.
pci_get_drvdata: Get the saved pointer back from the PCI device pdev.

The main functions of PCI device configuration space access are as follows.

int pci_read_config_byte(const struct pci_dev *dev, int where, u8 *val);
int pci_read_config_word(const struct pci_dev *dev, int where, u16 *val);
int pci_read_config_dword(const struct pci_dev *dev, int where, u32 *val);
int pci_write_config_byte(const struct pci_dev *dev, int where, u8 val);
int pci_write_config_word(const struct pci_dev *dev, int where, u16 val);
int pci_write_config_dword(const struct pci_dev *dev, int where, u32 val);

The above functions implement read and write operations on bytes, words (16 bits) and double words (32 bits) of the configuration space respectively.
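The same byte, word and double-word view of the configuration space can be tried out from user space, because Linux exposes each device's configuration space as a binary sysfs file named config. The sketch below is a user-space analogue, not the kernel implementation: the cfg_read_* names are invented, they take an already opened FILE, and they return 0 on success and -1 on failure.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Read len raw bytes at offset where from an open config-space file. */
static int cfg_read(FILE *cfg, long where, uint8_t *buf, size_t len)
{
	assert(cfg != NULL && buf != NULL);
	if (fseek(cfg, where, SEEK_SET) != 0)
		return -1;
	return fread(buf, 1, len, cfg) == len ? 0 : -1;
}

static int cfg_read_byte(FILE *cfg, long where, uint8_t *val)
{
	return cfg_read(cfg, where, val, 1);
}

/* Multi-byte registers are stored little-endian. */
static int cfg_read_word(FILE *cfg, long where, uint16_t *val)
{
	uint8_t b[2];

	if (cfg_read(cfg, where, b, sizeof(b)) != 0)
		return -1;
	*val = (uint16_t)(b[0] | (b[1] << 8));
	return 0;
}

static int cfg_read_dword(FILE *cfg, long where, uint32_t *val)
{
	uint8_t b[4];

	if (cfg_read(cfg, where, b, sizeof(b)) != 0)
		return -1;
	*val = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	return 0;
}
```

A typical use would be to fopen a path such as /sys/bus/pci/devices/0000:01:00.0/config in "rb" mode (the bus address here is only an example) and call cfg_read_word(f, 0, &vendor) to fetch the Vendor ID.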

4. PCI device driver examples

The PCI device used here is the CH368EVT evaluation board from Nanjing Qinheng (WCH). The board uses the company's CH368 PCI-Express interface chip. Although the chip speaks PCI-Express, PCI and PCI-Express are compatible at the driver level; PCI-Express is simply faster and supports more features. This board was chosen because it is cheap, domestically produced, and able to fully exercise read and write operations on all three address spaces.
The four LEDs L1~L4 display the state of bits D3~D0 of the I/O data port: a lit LED represents 1 and an unlit LED represents 0.
The configuration space definition of CH368 is shown in the figure below.


The vendor ID and device ID are what we care about most: the IDs in the driver's device list must match the ones shown here. The first base address register holds the base address of the I/O address space, which has 232 bytes and is defined as shown in the figure below. In addition, the memory space of the CH368 is 32 KB.

The Linux driver code of this device is as follows. In order to highlight the core of the PCI driver as much as possible, no code related to concurrency control has been added.

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

#include <linux/cdev.h>
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/pci.h>
#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/uaccess.h>

#include "ch368.h"

#define CH368_MAJOR 256
#define CH368_MINOR 11
#define CH368_DEV_NAME "ch368"

struct ch368_dev {
	void __iomem *io_addr;
	void __iomem *mem_addr;
	unsigned long io_len;
	unsigned long mem_len;
	struct pci_dev *pdev;
	struct cdev cdev;
	dev_t dev;
};

static unsigned int minor = CH368_MINOR;

static int ch368_open(struct inode *inode, struct file *filp)
{
	struct ch368_dev *ch368;

	ch368 = container_of(inode->i_cdev, struct ch368_dev, cdev);
	filp->private_data = ch368;

	return 0;
}

static int ch368_release(struct inode *inode, struct file *filp)
{
	return 0;
}

static ssize_t ch368_read(struct file *filp, char __user *buf, size_t count, loff_t *f_pos)
{
	int ret;
	struct ch368_dev *ch368 = filp->private_data;

	count = count > ch368->mem_len ? ch368->mem_len : count;
	ret = copy_to_user(buf, ch368->mem_addr, count); /* simplified: portable code copies through a kernel buffer with memcpy_fromio() */

	return count - ret;
}

static ssize_t ch368_write(struct file *filp, const char __user *buf, size_t count, loff_t *f_pos)
{
	int ret;
	struct ch368_dev *ch368 = filp->private_data;

	count = count > ch368->mem_len ? ch368->mem_len : count;
	ret = copy_from_user(ch368->mem_addr, buf, count); /* simplified: portable code copies through a kernel buffer with memcpy_toio() */

	return count - ret;
}

static long ch368_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
{
	union addr_data ad;
	struct ch368_dev *ch368 = filp->private_data;

	if (_IOC_TYPE(cmd) != CH368_MAGIC)
		return -ENOTTY;

	if (copy_from_user(&ad, (union addr_data __user *)arg, sizeof(union addr_data)))
		return -EFAULT;

	switch (cmd) {
	case CH368_RD_CFG:
		if (ad.addr > 0x3F)
			return -ENOTTY;
		pci_read_config_byte(ch368->pdev, ad.addr, &ad.data);
		if (copy_to_user((union addr_data __user *)arg, &ad, sizeof(union addr_data)))
			return -EFAULT;
		break;
	case CH368_WR_CFG:
		if (ad.addr > 0x3F)
			return -ENOTTY;
		pci_write_config_byte(ch368->pdev, ad.addr, ad.data);
		break;
	case CH368_RD_IO:
		ad.data = ioread8(ch368->io_addr + ad.addr);
		if (copy_to_user((union addr_data __user *)arg, &ad, sizeof(union addr_data)))
			return -EFAULT;
		break;
	case CH368_WR_IO:
		iowrite8(ad.data, ch368->io_addr + ad.addr);
		break;
	default:
		return -ENOTTY;
	}

	return 0;
}

static struct file_operations ch368_ops = {
	.owner = THIS_MODULE,
	.open = ch368_open,
	.release = ch368_release,
	.read = ch368_read,
	.write = ch368_write,
	.unlocked_ioctl = ch368_ioctl,
};

static int ch368_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	int ret;

	unsigned long io_start;
	unsigned long io_end;
	unsigned long io_flags;
	unsigned long io_len;
	void __iomem *io_addr = NULL;

	unsigned long mem_start;
	unsigned long mem_end;
	unsigned long mem_flags;
	unsigned long mem_len;
	void __iomem *mem_addr = NULL;

	struct ch368_dev *ch368;

	ret = pci_enable_device(pdev);
	if (ret)
		goto enable_err;

	io_start = pci_resource_start(pdev, 0);
	io_end = pci_resource_end(pdev, 0);
	io_flags = pci_resource_flags(pdev, 0);
	io_len = pci_resource_len(pdev, 0);

	mem_start = pci_resource_start(pdev, 1);
	mem_end = pci_resource_end(pdev, 1);
	mem_flags = pci_resource_flags(pdev, 1);
	mem_len = pci_resource_len(pdev, 1);

	if (!(io_flags & IORESOURCE_IO) || !(mem_flags & IORESOURCE_MEM)) {
		ret = -ENODEV;
		goto res_err;
	}

	ret = pci_request_regions(pdev, "ch368");
	if (ret)
		goto res_err;

	io_addr = ioport_map(io_start, io_len);
	if (io_addr == NULL) {
		ret = -EIO;
		goto ioport_map_err;
	}

	mem_addr = ioremap(mem_start, mem_len);
	if (mem_addr == NULL) {
		ret = -EIO;
		goto ioremap_err;
	}

	ch368 = kzalloc(sizeof(struct ch368_dev), GFP_KERNEL);
	if (!ch368) {
		ret = -ENOMEM;
		goto mem_err;
	}
	pci_set_drvdata(pdev, ch368);

	ch368->io_addr = io_addr;
	ch368->mem_addr = mem_addr;
	ch368->io_len = io_len;
	ch368->mem_len = mem_len;
	ch368->pdev = pdev;

	ch368->dev = MKDEV(CH368_MAJOR, minor++);
	ret = register_chrdev_region(ch368->dev, 1, CH368_DEV_NAME);
	if (ret < 0)
		goto region_err;

	cdev_init(&ch368->cdev, &ch368_ops);
	ch368->cdev.owner = THIS_MODULE;
	ret = cdev_add(&ch368->cdev, ch368->dev, 1);
	if (ret)
		goto add_err;

	return 0;

add_err:
	unregister_chrdev_region(ch368->dev, 1);
region_err:
	kfree(ch368);
mem_err:
	iounmap(mem_addr);
ioremap_err:
	ioport_unmap(io_addr);
ioport_map_err:
	pci_release_regions(pdev);
res_err:
	pci_disable_device(pdev);
enable_err:
	return ret;
}

static void ch368_remove(struct pci_dev *pdev)
{
	struct ch368_dev *ch368 = pci_get_drvdata(pdev);

	cdev_del(&ch368->cdev);
	unregister_chrdev_region(ch368->dev, 1);
	iounmap(ch368->mem_addr);
	ioport_unmap(ch368->io_addr);
	kfree(ch368);
	pci_release_regions(pdev);
	pci_disable_device(pdev);
}

static struct pci_device_id ch368_id_table[] =
{
	{0x1C00, 0x5834, 0x1C00, 0x5834, 0, 0, 0},
	{0,}
};
MODULE_DEVICE_TABLE(pci, ch368_id_table);

static struct pci_driver ch368_driver = {
	.name = "ch368",
	.id_table = ch368_id_table,
	.probe = ch368_probe,
	.remove = ch368_remove,
};

module_pci_driver(ch368_driver);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("name <e-mail>");
MODULE_DESCRIPTION("CH368 driver");

/* ch368.h */
#ifndef _CH368_H
#define _CH368_H

union addr_data {
	/* addr and data need separate storage, because the write commands pass both */
	struct {
		unsigned char addr;
		unsigned char data;
	};
};

#define CH368_MAGIC 'c'

#define CH368_RD_CFG _IOWR(CH368_MAGIC, 0, union addr_data)
#define CH368_WR_CFG _IOWR(CH368_MAGIC, 1, union addr_data)
#define CH368_RD_IO _IOWR(CH368_MAGIC, 2, union addr_data)
#define CH368_WR_IO _IOWR(CH368_MAGIC, 3, union addr_data)

#endif

Lines 19 to 27 of the code define the device structure. It contains the io_addr and mem_addr pointers that hold the mapped I/O address and memory address, the io_len and mem_len members that hold the sizes of the I/O address space and the memory address space, and the pdev pointer to the PCI device structure. The PCI device is implemented as a character device, so the structure also has cdev and dev members.
Lines 224 to 242 of the code are the definition, registration and unregistration of the PCI driver structure. ch368_id_table is a list of devices supported by the driver, and the ID numbers in it must be consistent with the ID numbers in the above figure.
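How an entry of the device list is compared with a detected device can be modelled in user space as follows. This is a simplified sketch of the matching rule, not the kernel source: the struct and function names are invented, ANY_ID stands in for the kernel's PCI_ANY_ID wildcard, and the table is assumed to end with an all-zero entry like ch368_id_table does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xFFFFFFFFu	/* wildcard, plays the role of PCI_ANY_ID */

/* Simplified stand-in for one device-list entry. */
struct id_entry {
	uint32_t vendor, device, subvendor, subdevice;
	uint32_t class_code, class_mask;
};

/* IDs read from the configuration space of a detected device. */
struct dev_ids {
	uint16_t vendor, device, subvendor, subdevice;
	uint32_t class_code;
};

/* Each ID must be equal or a wildcard; the class is compared only in
 * the bits selected by class_mask (a mask of 0 ignores the class). */
static bool id_matches(const struct id_entry *id, const struct dev_ids *d)
{
	return (id->vendor == ANY_ID || id->vendor == d->vendor) &&
	       (id->device == ANY_ID || id->device == d->device) &&
	       (id->subvendor == ANY_ID || id->subvendor == d->subvendor) &&
	       (id->subdevice == ANY_ID || id->subdevice == d->subdevice) &&
	       !((id->class_code ^ d->class_code) & id->class_mask);
}

/* Walk the table up to the all-zero sentinel; NULL means no match. */
static const struct id_entry *match_table(const struct id_entry *table,
					  const struct dev_ids *d)
{
	assert(table != NULL && d != NULL);
	for (; table->vendor || table->subvendor || table->class_mask; table++)
		if (id_matches(table, d))
			return table;
	return NULL;
}
```

With a single entry of {0x1C00, 0x5834, 0x1C00, 0x5834, 0, 0}, only a device reporting exactly these four IDs matches, whatever its class code, which is why the numbers in the table have to agree with the configuration space of the CH368.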
When a matching PCI device is detected, the ch368_probe function is called automatically. Line 134 of the code first enables the PCI device, and lines 138 to 146 obtain the physical address, flags and length of the I/O and memory resources. Lines 148 to 151 check the resource type recorded in the flags; if it is not what is expected, probing fails. Lines 153 to 167 request the resources declared by the PCI device and then map them to obtain the corresponding virtual addresses. Lines 169 to 180 allocate the struct ch368_dev structure, initialize its members, and use the pci_set_drvdata function to store the address of the structure in the PCI device structure, so that it can be retrieved from there later. The code after that registers the character device. The work done by ch368_remove is the reverse of what ch368_probe does.

ch368_open and ch368_release do not do much work. ch368_read and ch368_write read and write the memory space, but because no external device is attached to this memory space on the board, they have no practical significance. The more useful operations are in ch368_ioctl. The CH368_RD_CFG command reads data from the configuration space, and the CH368_WR_CFG command writes data to the configuration space. CH368_RD_IO and CH368_WR_IO read and write the I/O space respectively. union addr_data is used to pass in the address and to return the data, similar to the ADC driver example.
The test code for the application layer is as follows.
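The first check in ch368_ioctl, _IOC_TYPE(cmd) != CH368_MAGIC, works because the _IOWR macro packs the magic character, the command number, the transfer direction and the argument size into one integer. The user-space sketch below shows that encoding with stand-in names (demo_arg, DEMO_RD and DEMO_WR are made up so as not to clash with ch368.h); the _IOC_* macros themselves come from the kernel's ioctl header.

```c
#include <assert.h>
#include <stdbool.h>
#include <linux/ioctl.h>

/* Hypothetical argument type with separate address and data bytes. */
struct demo_arg {
	unsigned char addr;
	unsigned char data;
};
static_assert(sizeof(struct demo_arg) == 2, "addr and data must not overlap");

#define DEMO_MAGIC 'c'
#define DEMO_RD _IOWR(DEMO_MAGIC, 0, struct demo_arg)
#define DEMO_WR _IOWR(DEMO_MAGIC, 1, struct demo_arg)

/* Reject commands with a foreign magic, an unknown number or a
 * mismatched argument size before touching the user pointer. */
static bool demo_cmd_ok(unsigned int cmd)
{
	return _IOC_TYPE(cmd) == DEMO_MAGIC &&
	       _IOC_NR(cmd) <= 1 &&
	       _IOC_SIZE(cmd) == sizeof(struct demo_arg);
}
```

Because the size of the argument type is part of the command value, the driver and the application must be built against the same header, otherwise the command falls through to the default case of the switch and is rejected.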

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <errno.h>

#include "ch368.h"

int main(int argc, char *argv[])
{
	int i;
	int fd;
	union addr_data ad;
	unsigned char id[4];

	fd = open("/dev/ch368", O_RDWR);
	if (fd == -1)
		goto fail;

	/* Read the first 4 bytes of the configuration space: vendor ID and device ID */
	for (i = 0; i < sizeof(id); i++) {
		ad.addr = i;
		if (ioctl(fd, CH368_RD_CFG, &ad))
			goto fail;
		id[i] = ad.data;
	}

	printf("VID: 0x%02x%02x, DID: 0x%02x%02x\n", id[1], id[0], id[3], id[2]);

	/* Write 0~15 in turn to the first byte of the I/O space to drive the LEDs */
	i = 0;
	ad.addr = 0;
	while (1) {
		ad.data = i++;
		if (ioctl(fd, CH368_WR_IO, &ad))
			goto fail;
		i %= 16;
		sleep(1);
	}
fail:
	perror("pci test");
	exit(EXIT_FAILURE);
}

After opening the device, the above code first reads the first 4 bytes of the configuration space. According to the PCI specification, these 4 bytes are exactly the vendor ID and the device ID. Next, the while loop writes 0~15 in turn to the first byte of the I/O space, so the four LEDs on the PCI device light up following this pattern. As mentioned before, the 4 LEDs reflect the lower 4 bits of the data written to the I/O space: the LED for a bit that is 1 is lit, and the LED for a bit that is 0 is off.
Use the following commands to compile and test. Note that you need a physical machine with Linux installed, and the machine must have a suitable PCIe slot into which the device can be inserted for testing.

5. Bus device driver development exercises

1. The I2C bus protocol stipulates that the acknowledgement is sent by the ( ).
[A] Data sender
[B] Data receiver

2. The I2C bus protocol stipulates that all accesses are initiated by the ( ).
[A] Master device
[B] Slave device

3. SPI is a(n) ( ) bus.
[A] Synchronous
[B] Asynchronous

4. The SPI bus is ( ).
[A] Simplex
[B] Half duplex
[C] Full duplex

5. The SPI bus is ( ).
[A] Single master
[B] Multiple masters

6. USB transfer types are divided into ( ).
[A] Control transfer
[B] Isochronous transfer
[C] Interrupt transfer
[D] Bulk transfer

7. A USB interface is composed of multiple ( ).
[A] Configurations
[B] Pipes
[C] Endpoints

8. The PCI configuration space includes ( ) information.
[A] Vendor ID
[B] Device ID
[C] Base address
[D] Address space size

Answer: 1. B  2. A  3. A  4. C  5. A  6. ABCD  7. C  8. ABCD