Application of dynamic programming in image compression

1. Basic principles

Image compression is a technique that reduces the size of image files in order to save storage space or speed up transfer.

Image representation

In computers, images are usually represented by a series of pixels. Each pixel has a grayscale value, usually ranging from 0 to 255, so 8 bits are needed to represent a pixel.

Image compression storage format

The purpose of image compression is to reduce the number of bits required to store or transmit an image. One approach described in this paragraph is to split the image into several contiguous segments of pixels, with all pixels in each segment represented by the same number of bits. Each pixel segment also requires some additional information to store, such as how many bits are used to represent the pixels in this segment, and how many pixels there are in this segment.

Storage space calculation

For each pixel segment, the number of bits that need to be stored is the number of pixels in the segment multiplied by the number of bits per pixel, plus some additional bits. The storage space required for the entire image is the sum of the storage space required for all pixel segments.

Dynamic programming solution

Dynamic programming is a problem-solving method that decomposes the problem into a series of interrelated sub-problems, first solves the sub-problems, and then uses the solutions to the sub-problems to construct a solution to the original problem.
In this image compression problem, define

[

]

s[i]

s[i] is the previous

Number of storage bits required for optimal segmentation of i pixels. Optimal segmentation means the fewest number of bits of storage required. This problem satisfies the optimal substructure property, that is, the optimal solution of the entire problem can be obtained by combining the optimal solutions of its subproblems.

Specifically, for the former

The optimal segmentation of i pixels can be obtained by considering the previous

i-k

The optimal segmentation of i?k pixels, plus the

i-k+1

i?k + 1 pixel to the

A segment of i pixels is obtained. By looping through all possible

k value, you can find the segmentation method that minimizes the total number of storage bits.

Specific calculation formula

[

]

min

{

[

]

max

(

)

}

s[i] = \min_k \{ s[i-k] + k \cdot b_{\text{max}}(i-k + 1, i) + 11 \}

s[i]=kmin?{s[i?k] + k?bmax?(i?k + 1,i) + 11}

in,

max

(

)

b_{\text{max}}(i-k + 1, i)

bmax?(i?k + 1,i) means starting from the

i-k+1

i?k + 1 pixel to the

The number of bits corresponding to the maximum gray value in this segment of i pixels, plus some additional bits to store the necessary information.

Through this method, the optimal segmentation of the entire image can be calculated step by step, thereby achieving image compression.

2. Main implementation code (refer to computer algorithm design and analysis textbook)

1. Various basic functions

int length(int i);
- Function: This function calculates the number of digits required in the binary representation of the number i.
- Parameter: an integer i.
- Return value: An integer representing the number of digits required for i in binary.
void Compress(int n, int p[], int s[], int l[], int b[]);
- Function: This function calculates the optimal compression method of the image based on the given image grayscale array p. It populates the s, l, and b arrays, where s represents the cumulative minimum storage for each pixel position Space, l represents the optimal segment length for each pixel position, and b represents the maximum number of bits for each pixel position.
- Parameters:
  - n: The length of the grayscale array p.
  - p[]: Grayscale array of the image.
  - s[]: The cumulative minimum storage space for storing each pixel position.
  - l[]: Stores the optimal segment length for each pixel position.
  - b[]: The maximum number of bits to store each pixel position.
- Return value: None.
void Traceback(int n, int & amp; i, int s[], int l[]);
- Function: This function finds the optimal segmentation position of the image through the backtracking method.
- Parameters:
  - n: The last pixel position to be traced back.
  - i: Reference parameter, used to record the number of segments.
  - s[]: The cumulative minimum storage space for storing each pixel position.
  - l[]: Stores the optimal segment length for each pixel position.
- Return value: None.
void Output(int s[], int l[], int b[], int n);
- Function: This function outputs the compression results of the image, including the length of each segment and the number of storage bits required.
- Parameters:
  - s[]: The cumulative minimum storage space for storing each pixel position.
  - l[]: Stores the optimal segment length for each pixel position.
  - b[]: The maximum number of bits to store each pixel position.
  - n: The length of the grayscale array.
- Return value: None.

2. Specific code implementation

//Import the standard input and output library
#include <iostream>
// Use the standard namespace so that we can use cout, cin, etc. directly without writing std::cout, std::cin.
using namespace std;

// Define N as 7, which represents the length of the grayscale sequence of the image.
const int N = 7;

// function declaration
int length(int i);
void Compress(int n,int p[],int s[],int l[],int b[]);
void Traceback(int n,int & amp; i,int s[],int l[]);
void Output(int s[],int l[],int b[],int n);

int main()
{<!-- -->
    //Define the grayscale array p of the image, and the subscripts start counting from 1.
    int p[] = {<!-- -->0,10,12,15,255,1,2};
    // Initialize the storage space array s, the segment length array l, and the number of storage bits required for each segment array b.
    int s[N],l[N],b[N];

    cout << "The grayscale sequence of the image is:" << endl;

    // Output the original grayscale sequence
    for(int i=1; i<N; i + + )
    {<!-- -->
        cout << p[i] << " ";
    }
    cout << endl;

    //Call compression function
    Compress(N-1, p, s, l, b);
    // Output the compressed result
    Output(s, l, b, N-1);
    return 0;
}

void Compress(int n, int p[], int s[], int l[], int b[])
{<!-- -->
    // Here we set two constants: the maximum number of pixels per segment is 256, and the header information of each segment requires 11 bits to store.
    int Lmax = 256, header = 11;
    
    // We set s[0] to 0, which means that at the beginning, there are no pixels, so the required storage space is 0.
    s[0] = 0;
    
    // This loop means we want to look at every pixel, starting with the first one.
    for(int i=1; i<=n; i + + )
    {<!-- -->
        // For each pixel, we need to calculate how many bits it requires to represent. This number is stored in the b array.
        b[i] = length(p[i]);
        
        // To facilitate calculations, we assume that the current pixel is the pixel that requires the most number of bits.
        int bmax = b[i];
        
        //Initial assumption: If only this one pixel is used as a segment, then the required storage space is all the previous ones plus this pixel.
        s[i] = s[i-1] + bmax;
        
        // And let's first assume that this pixel is a segment alone.
        l[i] = 1;

        // Then we need to consider adding more pixels to this segment to see if we can save more space.
        for(int j=2; j<=i & amp; & amp; j<=Lmax; j + + )
        {<!-- -->
            // We check whether there are new pixels that require more bits after adding new pixels.
            if(bmax < b[i-j + 1])
            {<!-- -->
                // If so, we update the maximum number of digits.
                bmax = b[i-j + 1];
            }
            
            // Next we have to see whether storage space can be saved if j pixels are added as a segment.
            if(s[i] > s[i-j] + j*bmax)
            {<!-- -->
                // If it can, we update the required storage space and the length of this segment.
                s[i] = s[i-j] + j*bmax;
                l[i] = j;
            }
        }
        
        // Finally, add the 11-bit header information at the beginning of each segment.
        s[i] + = header;
    }
}


// Calculate the number of digits required for the binary representation of the number i
int length(int i)
{<!-- -->
    //Here, we initialize k to 1. Because any non-zero number requires at least 1 digit to represent.
    int k = 1;

    // We divide i by 2 because we want to know how many times this number is divisible by 2.
    // This is because every time we divide by 2, we are actually removing the last digit of the binary representation of the number.
    i = i/2;

    // When i is greater than 0, we continue to execute the loop body.
    while(i > 0)
    {<!-- -->
        // Each time through the loop, we increase k by 1 because we have removed another binary digit of i.
        k++;

        // Continue dividing by 2 to prepare for the next loop.
        i = i/2;
    }

    // Finally, we return k, which is the number of binary digits required for number i.
    return k;
}


// Backtrack to find the specific location of the optimal segment
void Traceback(int n, int & amp; i, int s[], int l[])
{<!-- -->
    if(n==0)
        return;
    Traceback(n-l[n], i, s, l);
    //Storage segment location
    s[i + + ] = n-l[n];
}

void Output(int s[], int l[], int b[], int n)
{<!-- -->
    // Output the minimum storage space
    cout << "The minimum space after image compression is:" << s[n] << endl;
    int m = 0;
    Traceback(n, m, s, l);
    s[m] = n;
    cout << "Divide the original grayscale sequence into" << m << "Segment sequence segment" << endl;

    // Output the length of each segment and the required number of storage bits
    for(int j=1; j<=m; j + + )
    {<!-- -->
        l[j] = l[s[j]];
        b[j] = b[s[j]];
    }
    for(int j=1; j<=m; j + + )
    {<!-- -->
        cout << "Segment length:" << l[j] << ", required number of storage digits:" << b[j] << endl;
    }
}