data structure sequence table

1. Understand the sequence table

A sequence table is a linear structure that uses a storage unit with a continuous physical address to store data elements in sequence. Generally, it is stored in an array. The addition, deletion, checking, and modification of data are completed on the array.

1.1 distinction

Using simple language to describe the sequence list is actually an array, but it is also different from an array.

For arrays, you only need to find the corresponding subscript to save the data to the location you want to store it. That is, you can store it in any location without crossing the boundary.

For sequential tables, you cannot store them arbitrarily. You need to strictly abide by the order. Data must be inserted from the head of the table backwards to comply with the definition of a sequential table.

1.2 Types of sequence tables

1.2.1 Static sequence table

Use fixed-length arrays to store elements.

#define N 8 //Macro definition variables

typedef int SLDataType;

typedef struct SeqList
{

    SLDataType arr[N]; //fixed length array
    size_t size; //Number of valid data

}SeqList;

Use a macro to define a global variable as a storage space for sequential table data. Create an array in the structure to store the data, and size records the number of valid data.

So in practice, static sequence tables are not practical and have many problems. When you know how much space is needed to store data, N can be defined as the exact size. But in real life, many times you don’t know how much space is needed. When N is defined small, it will be wasted; if N is large, it will be wasted.

So the static sequence table is used for understanding.

1.2.2 Dynamic sequence table

Use dynamically allocated array storage

typedef int SLDataType;

typedef struct SeqList
{

    SLDataType* arr; //points to a dynamically opened array
    size_t size; //Number of valid data
    size_t capacity //The size of the capacity space

}SeqList;

Dynamic can use malloc and realloc to apply for space according to needs, and can also be expanded with the continuous insertion of data. Therefore, compared with static, the advantages of dynamic are obviously greater than static, and the application scenarios are wider.

2. Implementation of sequence table

2.1 Creation of structure

In sequence tables, structures are created to better manage data, including arrays and other variables.

typedef int SLDataType;

typedef struct SeqList
{

    SLDataType* arr; //points to a dynamically opened array
    size_t size; //Number of valid data
    size_t capacity //The size of the capacity space

}SL;

The size and capacity variables are introduced here: size records the number of valid data, which is the next position of the last data; and capacity is used to determine whether to expand the capacity. How much capacity is expanded at a time is a question, and expansion will cost a certain amount. , one at a time or how many? So you can expand a little more at a time. So how much to expand at a time, there are no strict regulations in the data structure. It follows a specific analysis of the specific situation, but under normal circumstances it will be expanded by 2 times or 1.5 times.

2.2 Common function implementation

Generally, 3 files are created here to implement the sequence table:

The header file of SeqList.h is used to declare functions and structures.

The source file of SeqList.c is used to implement the functions of the function.

The source file of test.c is used for calling and testing.

2.2.1 Initialization

First, let’s look at a classic negative teaching material

void TestSL1()
{
SL sl;
SLInit(sl);
}


void SLInit(SL s)
{
    s.a = NULL;
    //Function implementation
}

Here, the actual parameters passed to SLInit in the TestSL1 function have a copy relationship with the formal parameters accepted by SLInit, that is, the formal parameters are copies of the actual parameters and will not change the actual parameters. So if passing by value is not possible, then pass the address.

void TestSL1
{
    SL sl;
    SLInit( & amp;sl); //Get address
}

void SLInit(SL* psl)
{
    psl->a = NULL;
    //Function implementation
}

1. Create sequence table

void SLInit(SL* psl)
{

    psl->a = NULL; //First point the created array to NULL
    psl->size = 0; //Set the number of data to 0
    psl->capacity = 0; //Capacity is 0

}

2. Destruction sequence table

void SLDestory(SL* psl)
{
    if(psl->a != NULL)
    {
        free(psl->a); //Manually release space
        psl->a = NULL;
        psl->size = 0;
        psl->capacity = 0;
    }
}

2.2.2 Functional implementation – insertion and deletion of head and tail

1.Tail insertion

When there is enough space, you can put the data directly and let the size increase automatically. The code is as follows:

void SLPushBack(SL* psl , SLDataType x)
{

    SLCheckCapacity(psl) //Expansion function, subsequent code will also use this function for expansion

    psl->a[psl->size] = x; //Assign x to the location where the subscript is size in the a array
    psl->size + + ; //increase automatically
}

So the real question is, how to continue tail plugging when the space is insufficient? Therefore, expansion is required. Before expansion, it is also necessary to judge whether the space is insufficient. However, during initialization, the space capacity is set to 0, and trinocular is used here. Operator to deal with the situation when the initial value is 0, the sum is not 0, and the sum is not 0. In order to make the code more concise, the code that implements the expansion can be encapsulated into a function for each function to call.

void SLCheckCapacity(SL* psl)
{
  if (psl->size == psl->capacity) //Determine whether the space is full
    {

    int NewCapacity = psl->capacity == 0 ? 4 : psl->capacity * 2;
    //Ternary operator to solve the situation of initialization to 0 and not to 0,
    
    SLDataType* tmp = (SLDataType*)realloc(psl->a, sizeof(SLDataType) * NewCapacity);
    //Use intermediate variables to receive the newly opened address to avoid returning a null pointer when the development fails.
    //Two parameters of realloc: old address, how big to expand the capacity


    if (tmp == NULL) //Display error message
    {
      perror("realloc file");
      return;
    }
       psl->a = tmp;
       psl->capacity = NewCapacity;
    }
              
}

Next, you can use realloc to expand the capacity. realloc has in-situ expansion and off-site expansion. The former will determine whether the subsequent space of the space has been allocated to other places. Otherwise, the latter will be used. The latter will find a sufficient space and store the space that has been stored. The data is copied to the past, the original space is released, and finally the new address is returned. It should be noted that the former is more efficient than the latter. You can compare the return values of the two, that is, whether the addresses are the same to judge.

Determine the expansion method of realloc
int* p1 = (int*)malloc(40);
printf("%p\
", p1);

int* p2 = (int*)realloc(p1, 800); //The parameters of realloc are: old address, how big to expand the capacity
printf("%p\
", p2);

2.Head plug

When expanding the head plug, it cannot expand forward, but can only expand backward. Therefore, the head plug can only move the existing data backward, and then put the data that needs the head plug in.

void SLPushFront(SL* psl, SLDataType x)
{
    SLCheckCapacity(psl);

    int end = psl->size - 1; //Locate the last data
    while (end>=0)
    {
        psl->a[end + 1] = psl->a[end]; //Move data backward
        --end;
    }
    psl->a[0] = x;
    psl->size + + ;
}

It should be noted that there is no need to call the header frequently. When n pieces of data need to be header inserted, the time complexity used is O(n2), because when n, the number of times the data needs to be moved backward is an equal difference. Sequence. And inserting n numbers by tail insertion is O(n). So if you don’t need head insertion, you don’t need it.

3. Tail deletion

For tail deletion, as the name implies, it deletes the last data, so you only need to reduce the size by 1 to complete.

void SLPopBack(SL* psl)
{
    psl->size--;
}

You may have this question: Only decrement is performed and the size is moved forward, but the last data is not completely deleted. Will it affect subsequent insertions?

The answer is no. Because when the size is moved forward, the number of valid data is reduced by one, that is, only the data before the size can be accessed. When subsequent head-to-tail insertions are performed, the data will be overwritten and no data will be generated. Impact. At the same time, the space cannot be released, because the space applied for in the memory cannot be paid back in installments, and can only be paid in one lump sum.

When the size keeps decrementing and exceeds the sequence table and becomes -1 (or even smaller), no error will actually be reported because the program does not access it when executing. In this case, continue to insert 4 data at the end. , and then print it out, you will find that only 3 data are printed. The reason is that size is -1, and the valid data in the sequence table is the previous data starting from subscript 0 to size. When inserting data, size will be -1 The first inserted data is stored in the location, then incremented, and then the data is stored. Data with a size of -1 will not be accessed during final printing, so the program will not report an error.

You can debug this process by yourself and observe the changes in size and how the data is stored.

The above situation is that continued deletion after the sequence table has been deleted results in a negative size. How to avoid this? There are two ways to judge, write in the function.

Gentle inspection:

//Gentle inspection
if(psl->size == 0)
{
    return;
}

Violence check:

#include<assert.h>

assert(psl->size > 0); //Assert

4. Header deletion

Head deletion only needs to move the following data forward, overwrite the data that needs to be deleted, and then reduce the size. It should be noted that the data cannot be moved from back to front, as this will overwrite all the data as the last data, so Only from front to back.

void SLPopFront(SL* psl)
{
    assert(psl->size > 0);
    int begin = 1;

    while (begin < psl->size)
    {
        psl->a[begin - 1] = psl->a[begin]; //Move data
         + + begin;
    }

    psl->size--;
}

2.2.3 Insertion and deletion at any subscript position

1.Insert

Here, insert at the position of pos, but it should be noted that in accordance with the requirements of the sequence table to ensure the continuity of the data, it cannot exceed size. And the range of moving data is from pos to the previous data of size, which can also be equal to size. In this way, there is no need to move the data and it becomes tail-inserted data.

void SLInsert(SL* psl, int pos, SLDataType x)
{
    assert(psl); //Assert that the pointer is not null
    assert(pos > 0 & amp; & amp; pos <= psl->size);

    SLCheckCapacity(psl); //Check whether expansion is needed

    //Move data
    int end = psl->size - 1;
    while (end >= pos)
    {
        psl->a[end + 1] = psl->a[end];
        --end;
    }

    psl->a[pos] = x;
    psl->size + + ;
}

2.Delete

Different from insertion, pos here cannot be equal to size, because size is an invalid position. Therefore, it is the same as head deletion logic, and needs to be moved from front to back to cover.

void SLErase(SL* psl, int pos)
{
    assert(psl);
    assert(pos > 0 & amp; & amp; pos < psl->size);

    //Move coverage
    int begin = pos + 1;
    while (begin < psl->size)
    {
        psl->a[begin - 1] = psl->a[begin];
        begin++;
    }
    psl->size--;
}

2.3 Frequently Asked Questions

Out-of-bounds reading will basically not report an error, but out-of-bounds writing may report an error. This is because out-of-bounds checking is a spot check, just like checking for drunk driving. It will design some mark bits after the array to check whether these positions have been modified.

Three. Summary

For programmers, it is very clear how the program is used, but when it is used by others, they may not know how to operate and pass parameters. When the user sees that the parameter of the function is a pointer, he may pass in a null pointer. , this will cause the program to end abnormally when encountering a null pointer, so it can be asserted before the start of each function that psl is not a null pointer, which can effectively avoid error transmission.

Here is an explanation of why the array should start from 0 as the subscript of the first data instead of starting from 1. Because the essence of array access is a pointer, the array name is the address of the first element.

int a[10];

//The array name a is the address of the first element.

a[i] == *(a + i);

a[0] == *(a + 0);

This ends the explanation of the sequence table. You are welcome to leave comments and corrections in the comment area.