Two-dimensional discrete cosine transform for grayscale images

1. Introduction

The discrete cosine transform (DCT) converts a spatial-domain signal into the frequency domain. In this article it converts a two-dimensional array of pixel values into a two-dimensional array of frequency coefficients.

2. Formula

2.1 Two-dimensional discrete cosine transform (2D DCT):

Given an input matrix $f$ of size N×M (usually M is equal to N; N is the horizontal length), its discrete cosine transform $F$ can be calculated by the following formula:

$F(u,v) = \frac{2}{\sqrt{MN}} \cdot C(u) \cdot C(v) \sum_{x=0}^{N-1}\sum_{y=0}^{M-1}f(x,y)\cos\left[\frac{(2x + 1)u\pi}{2N}\right]\cos\left[\frac{(2y + 1)v\pi}{2M}\right]$

Here $f(x,y)$ is the grayscale value of the original image at pixel $(x,y)$, and $F(u,v)$ is an element of the transformed coefficient matrix, representing the weight of the corresponding frequency component.

$C(u)$ and $C(v)$ are the orthonormalization coefficients:
$C(u)=\frac{1}{\sqrt{2}}$ when $u=0$ and $C(u)=1$ otherwise; $C(v)$ is defined in the same way.
The formula of the two-dimensional inverse discrete cosine transform (IDCT) is:
$f(x,y) = \frac{2}{\sqrt{MN}} \sum_{u=0}^{N-1}\sum_{v=0}^{M-1} C(u) \cdot C(v)F(u,v)\cos\left[\frac{(2x + 1)u\pi}{2N}\right]\cos\left[\frac{(2y + 1)v\pi}{2M}\right]$

These formulas define the two-dimensional discrete cosine transform and its inverse. The forward transform takes the input image from the spatial domain to the frequency domain, where compression or feature extraction can be carried out by adjusting the coefficients; the inverse transform takes the result back to the spatial domain. In practice, library functions or a matrix-based algorithm are normally used to perform both transforms.
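The two formulas can also be evaluated term by term. The following is a minimal sketch under that reading, not an efficient implementation; the names `norm_coeff`, `dct2_direct`, and `idct2_direct` are hypothetical, and the first array axis is assumed to correspond to $x$ and $u$.

```python
import numpy as np

def norm_coeff(k):
    """C(k) from the formula: 1/sqrt(2) for k == 0, otherwise 1."""
    return 1 / np.sqrt(2) if k == 0 else 1.0

def dct2_direct(f):
    """2D DCT evaluated literally from the double sum."""
    N, M = f.shape
    F = np.zeros((N, M))
    for u in range(N):
        for v in range(M):
            total = 0.0
            for x in range(N):
                for y in range(M):
                    total += (f[x, y]
                              * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                              * np.cos((2 * y + 1) * v * np.pi / (2 * M)))
            F[u, v] = 2 / np.sqrt(M * N) * norm_coeff(u) * norm_coeff(v) * total
    return F

def idct2_direct(F):
    """2D inverse DCT evaluated literally from the double sum."""
    N, M = F.shape
    f = np.zeros((N, M))
    for x in range(N):
        for y in range(M):
            total = 0.0
            for u in range(N):
                for v in range(M):
                    total += (norm_coeff(u) * norm_coeff(v) * F[u, v]
                              * np.cos((2 * x + 1) * u * np.pi / (2 * N))
                              * np.cos((2 * y + 1) * v * np.pi / (2 * M)))
            f[x, y] = 2 / np.sqrt(M * N) * total
    return f

# Round trip on a small test matrix; the residual is floating-point rounding error
sample = np.arange(16.0).reshape(4, 4)
residual = np.max(np.abs(idct2_direct(dct2_direct(sample)) - sample))
print(residual)
```

The four nested loops make this form slow for anything larger than a small block, which is one reason the matrix form is preferred in practice.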

2.2 Formula Interpretation

2.3 Formula usage

Evaluated directly in floating point, the formulas above reproduce the original matrix only up to rounding error, because the cos function is computed with finite precision. The matrix form of the transform is therefore used instead; the conversion code is attached at the end of the article, and the underlying mathematics is not derived here.

In this way, the DCT is performed.

In this way, the inverse DCT is performed.
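For reference, the matrix form can be written as follows. This is a sketch based on the standard orthonormal DCT matrix for a square $N \times N$ block; the symbol $A$ is introduced here for illustration and does not appear in the formulas of Section 2.1.

$A(u,x) = \sqrt{\frac{2}{N}} \cdot C(u) \cos\left[\frac{(2x + 1)u\pi}{2N}\right]$

$F = A f A^{T}, \qquad f = A^{T} F A$

Expanding $A f A^{T}$ entry by entry gives $\sum_{x}\sum_{y} A(u,x) f(x,y) A(v,y)$, which is the double sum of Section 2.1 with $M = N$. Because $A$ is orthogonal ($A A^{T} = I$), substituting the first identity into the second returns $f$.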

3. Visualizing the changes

This section uses the example from the CSDN blog post "DCT transformation and quantization table".

The initial 8*8 matrix, which represents the pixel values of the image, is the one assigned to img in the code of Section 5.
Applying the DCT gives the frequency-domain signal, and applying the inverse DCT to that result gives back the image signal. Comparing it with the original shows that very little is lost.
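The round trip can be checked numerically. The sketch below uses the 8*8 matrix from Section 5 and assumes the standard orthonormal DCT matrix (called `A` here, a name chosen for illustration). The coefficients are rounded to integers before the inverse transform, which is where the small loss comes from.

```python
import numpy as np

block = np.array([[-76, -73, -67, -62, -58, -67, -64, -55],
                  [-65, -69, -73, -38, -19, -43, -59, -56],
                  [-66, -69, -60, -15, 16, -24, -62, -55],
                  [-65, -70, -57, -6, 26, -22, -58, -59],
                  [-61, -67, -60, -24, -2, -40, -60, -58],
                  [-49, -63, -68, -58, -51, -60, -70, -53],
                  [-43, -57, -64, -69, -73, -67, -63, -45],
                  [-41, -49, -59, -60, -63, -52, -50, -34]], dtype=float)

N = block.shape[0]
k = np.arange(N)
# A[u, x]: row u is the cosine of frequency u sampled at positions x
A = np.sqrt(2 / N) * np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * N))
A[0, :] = 1 / np.sqrt(N)

coeffs = np.round(A @ block @ A.T)      # forward DCT, rounded to integers
restored = np.round(A.T @ coeffs @ A)   # inverse DCT, rounded back to pixel values
max_error = np.max(np.abs(restored - block))

print(coeffs[0, 0])
print(max_error)
```

The top-left coefficient is the sum of all 64 values divided by 8, that is -3323 / 8 = -415.375, which rounds to -415.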

4. Meaning of the frequency-domain values

Every 8*8 grayscale block can be seen as a weighted superposition of 64 basis blocks, each built from a pair of cosine waves. The frequency of each basis block is fixed, and the corresponding coefficient indicates how strongly that block contributes. This is comparable to the amplitude attached to each frequency in a Fourier transform: the larger the amplitude, the greater the influence. Keeping the low-frequency coefficients retains the main information of the picture.
Reference video: [jpeg-dct] https://www.bilibili.com/video/BV17v411p7gV?vd_source=53f17186c8480e141f43fa1db7910b6c
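The superposition can be made explicit in code. The sketch below builds the 64 basis blocks from the standard orthonormal DCT matrix (the names `A` and `basis` are chosen for illustration), rebuilds a block as a weighted sum of them, and then keeps only the 4*4 low-frequency corner. A random block is used as a stand-in for image data.

```python
import numpy as np

N = 8
k = np.arange(N)
# A[u, x]: row u is the cosine of frequency u sampled at positions x
A = np.sqrt(2 / N) * np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * N))
A[0, :] = 1 / np.sqrt(N)

# basis[u, v] is the 8*8 block for horizontal/vertical frequency pair (u, v)
basis = np.einsum('ux,vy->uvxy', A, A)

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(N, N)).astype(float)
coeffs = A @ block @ A.T

# Weighted sum of all 64 basis blocks reproduces the original block
rebuilt = np.einsum('uv,uvxy->xy', coeffs, basis)

# Keep only the 16 lowest-frequency coefficients
low = np.zeros_like(coeffs)
low[:4, :4] = coeffs[:4, :4]
approx = np.einsum('uv,uvxy->xy', low, basis)

print(np.allclose(rebuilt, block))
print(np.sqrt(np.mean((approx - block) ** 2)))
```

For random noise the low-frequency approximation is poor; natural image blocks tend to concentrate far more of their energy in the low-frequency corner.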

5. Code sharing

DCT transform and IDCT transform

import numpy as np

def dct(matrix):
    N = matrix.shape[0]
    M = matrix.shape[1]

    # Construct the coefficient matrix (assumes a square block, N == M).
    # C[i, j] is cosine j sampled at position i, so C is the transpose of
    # the usual DCT matrix whose rows are indexed by frequency.
    C = np.ones((N, M))
    for i in range(0, N):
        C[i, 0] = 1 / np.sqrt(N)
    for j in range(1, M):
        for i in range(0, N):
            C[i, j] = np.sqrt(2 / N) * np.cos((np.pi / N) * (i + 0.5) * j)
    # Forward DCT: C^T * f * C sums over the spatial index of both axes
    dct_data = np.dot(np.dot(np.transpose(C), matrix), C)
    # Rounding the coefficients to integers makes the round trip slightly lossy
    return np.around(dct_data, 0)


def idct(dct_data):
    N = dct_data.shape[0]
    M = dct_data.shape[1]

    # Construct the same coefficient matrix as in dct()
    C = np.ones((N, M))
    for i in range(0, N):
        C[i, 0] = 1 / np.sqrt(N)
    for j in range(1, M):
        for i in range(0, N):
            C[i, j] = np.sqrt(2 / N) * np.cos((np.pi / N) * (i + 0.5) * j)

    # Inverse DCT: C * F * C^T sums over the frequency index of both axes
    idct_data = np.dot(np.dot(C, dct_data), np.transpose(C))

    return np.around(idct_data,0)

Quantization and inverse quantization (used in JPEG compression; not part of the DCT itself)

# JPEG luminance quantization table (the example table from the JPEG standard)
quantization_matrix = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
                                [12, 12, 14, 19, 26, 58, 60, 55],
                                [14, 13, 16, 24, 40, 57, 69, 56],
                                [14, 17, 22, 29, 51, 87, 80, 62],
                                [18, 22, 37, 56, 68, 109, 103, 77],
                                [24, 35, 55, 64, 81, 104, 113, 92],
                                [49, 64, 78, 87, 103, 121, 120, 101],
                                [72, 92, 95, 98, 112, 100, 103, 99]])

def quantify(dct_data):
    N = dct_data.shape[0]
    M = dct_data.shape[1]
    # Divide each coefficient by its table entry and round to the nearest
    # integer (floor division would bias negative coefficients downward)
    return np.round(dct_data / quantization_matrix[:N, :M])

def i_quantify(dct_data):
    N = dct_data.shape[0]
    M = dct_data.shape[1]
    # Multiply each quantized level by its table entry to restore the scale
    return dct_data * quantization_matrix[:N, :M]
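How much quantization loses can be measured directly. The sketch below rounds to the nearest integer, which is the usual JPEG convention and an assumption here. The names `quantize` and `dequantize` are hypothetical, and the coefficients are synthetic values standing in for a real DCT block.

```python
import numpy as np

Q = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
              [12, 12, 14, 19, 26, 58, 60, 55],
              [14, 13, 16, 24, 40, 57, 69, 56],
              [14, 17, 22, 29, 51, 87, 80, 62],
              [18, 22, 37, 56, 68, 109, 103, 77],
              [24, 35, 55, 64, 81, 104, 113, 92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103, 99]], dtype=float)

def quantize(coeffs, table=Q):
    """Divide by the table and round to the nearest integer level."""
    return np.round(coeffs / table)

def dequantize(levels, table=Q):
    """Scale integer levels back to approximate coefficients."""
    return levels * table

rng = np.random.default_rng(0)
coeffs = rng.normal(scale=50.0, size=(8, 8))  # synthetic stand-in for DCT coefficients
levels = quantize(coeffs)
restored = dequantize(levels)
error = np.abs(restored - coeffs)

# With round-to-nearest, each coefficient is off by at most half its table entry
print(np.all(error <= Q / 2))
print(int(np.count_nonzero(levels == 0)))
```

The larger table entries toward the lower right mean that high-frequency coefficients are stored more coarsely and are more likely to become zero.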

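Filter function (zeroes small coefficients in the high-frequency region)

def fittler(dct_data):
    N = dct_data.shape[0]
    M = dct_data.shape[1]
    # Row i covers columns M-i..M-1, i.e. the positions with i + j >= M
    # (the lower-right triangle, where the high frequencies are stored)
    for i in range(0, N):
        for j in range(M-i, M):
            # Drop coefficients whose magnitude is at most 1
            if abs(dct_data[i][j]) <= 1:
                dct_data[i][j] = 0
    # Note: the input array is modified in place and also returned
    return dct_data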
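The region touched by the filter can also be expressed as a boolean mask. The following is a sketch of an equivalent vectorized version; the name `filter_high_freq` and the `threshold` parameter are hypothetical. Unlike the loop version, it returns a copy and leaves its input unchanged.

```python
import numpy as np

def filter_high_freq(coeffs, threshold=1.0):
    """Zero coefficients with i + j >= M whose magnitude is at most threshold."""
    N, M = coeffs.shape
    i, j = np.indices((N, M))
    mask = (i + j >= M) & (np.abs(coeffs) <= threshold)
    out = coeffs.copy()
    out[mask] = 0
    return out

# Number of positions in the high-frequency triangle of an 8*8 block
i, j = np.indices((8, 8))
print(int(np.count_nonzero(i + j >= 8)))
```

For an 8*8 block the triangle contains 1 + 2 + ... + 7 = 28 positions, so at most 28 of the 64 coefficients can be removed by this filter.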

Block-wise DCT and inverse DCT using array slicing in Python

import cv2 as cv
# img = cv.imread(r"D:\py_code\2023_10\1.jpg",cv.IMREAD_GRAYSCALE)
# Sample 8*8 block; its values are already centered around zero
img = np.array([[-76, -73, -67, -62, -58, -67, -64, -55],
                  [-65, -69, -73, -38, -19, -43, -59, -56],
                  [-66, -69, -60, -15, 16, -24, -62, -55],
                  [-65, -70, -57, -6, 26, -22, -58, -59],
                  [-61, -67, -60, -24, -2, -40, -60, -58],
                  [-49, -63, -68, -58, -51, -60, -70, -53],
                  [-43, -57, -64, -69, -73, -67, -63, -45],
                  [-41, -49, -59, -60, -63, -52, -50, -34]])
size = 8
print(img.shape)
print(img)
dct_data = np.zeros(img.shape)
img_dct = np.zeros(img.shape)

# Forward pass: transform and filter every size*size block
for i in range(img.shape[0] // size):
    for j in range(img.shape[1] // size):
        rows = slice(i * size, (i + 1) * size)
        cols = slice(j * size, (j + 1) * size)
        sub_dct_data = dct(img[rows, cols])
        sub_dct_data = fittler(sub_dct_data)
        dct_data[rows, cols] = sub_dct_data
print(dct_data)


# Inverse pass: transform every block back to pixel values
for i in range(img.shape[0] // size):
    for j in range(img.shape[1] // size):
        rows = slice(i * size, (i + 1) * size)
        cols = slice(j * size, (j + 1) * size)
        img_dct[rows, cols] = idct(dct_data[rows, cols])

print(img_dct)
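For larger images the per-block loops can be replaced by a single reshape. The sketch below is one possible vectorized variant, assuming the standard orthonormal DCT matrix; the names `dct_matrix`, `blockwise_dct`, and `blockwise_idct` are hypothetical. Rows and columns that do not fill a complete block are cropped, whereas the loop version leaves them as zeros.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT matrix: row u is the cosine of frequency u."""
    k = np.arange(n)
    A = np.sqrt(2 / n) * np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * n))
    A[0, :] = 1 / np.sqrt(n)
    return A

def blockwise_dct(img, size=8):
    """Apply the 2D DCT to every size*size block of img."""
    h, w = img.shape
    h, w = h - h % size, w - w % size  # crop to whole blocks
    A = dct_matrix(size)
    # blocks[p, q] is the block at block row p, block column q
    blocks = img[:h, :w].reshape(h // size, size, w // size, size).swapaxes(1, 2)
    coeffs = np.einsum('ux,pqxy,vy->pquv', A, blocks, A)  # A @ block @ A.T per block
    return coeffs.swapaxes(1, 2).reshape(h, w)

def blockwise_idct(coeffs, size=8):
    """Invert blockwise_dct; the shape must be a multiple of size."""
    h, w = coeffs.shape
    A = dct_matrix(size)
    blocks = coeffs.reshape(h // size, size, w // size, size).swapaxes(1, 2)
    pixels = np.einsum('ux,pquv,vy->pqxy', A, blocks, A)  # A.T @ block @ A per block
    return pixels.swapaxes(1, 2).reshape(h, w)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(16, 24)).astype(float)  # synthetic test image
restored = blockwise_idct(blockwise_dct(image))
print(np.allclose(restored, image))
```

No rounding or filtering is applied here, so the round trip is exact up to floating-point error.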

The complete script for the black-and-white Lena image consists of the snippets above combined in order, with img loaded through the commented-out cv.imread call, followed by one final line that writes the reconstructed image to disk:

cv.imwrite("6.jpg",img_dct)

Removing some of the high-frequency detail with the filter in this way compresses the image.

6. Principle of information hiding based on DCT transformation

Filtering out high-frequency coefficients between the DCT and the inverse DCT has little impact on picture quality. Information can therefore be hidden by modifying the coefficients in the high-frequency region.
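As an illustration of the idea, the sketch below hides one bit in a single high-frequency coefficient of an 8*8 block by forcing the parity of its quantized value. This is one simple scheme chosen for illustration, not necessarily the method intended above. The coefficient position `POS`, the step `DELTA`, and the function names are all hypothetical choices.

```python
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal DCT matrix: row u is the cosine of frequency u
A = np.sqrt(2 / N) * np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * N))
A[0, :] = 1 / np.sqrt(N)

POS = (6, 7)   # hypothetical choice of a high-frequency coefficient
DELTA = 16.0   # hypothetical embedding step; larger is more robust but more visible

def embed_bit(block, bit):
    """Return a modified block whose coefficient at POS encodes bit (0 or 1)."""
    G = A @ block @ A.T
    q = int(np.round(G[POS] / DELTA))
    if q % 2 != bit:
        q += 1                 # move to the neighbouring level with the right parity
    G[POS] = q * DELTA
    return A.T @ G @ A

def extract_bit(block):
    """Read the bit back from the parity of the quantized coefficient at POS."""
    G = A @ block @ A.T
    return int(np.round(G[POS] / DELTA)) % 2

rng = np.random.default_rng(0)
cover = rng.integers(64, 192, size=(N, N)).astype(float)  # synthetic cover block
stego = np.round(embed_bit(cover, 1))                     # pixels rounded to integers
print(extract_bit(stego))
print(np.max(np.abs(stego - cover)))
```

Rounding the pixels back to integers changes the coefficient by at most 4 (the Euclidean norm of 64 errors of at most 0.5 each), which is below DELTA / 2, so the bit survives the rounding. Clipping to the 0 to 255 range or JPEG recompression could still destroy it.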