[Image segmentation based on K-means, K-means++ and bisecting K-means] Data Mining Experiment 3

Article directory

I. Project task requirements
II. Principle description
- KMeans
- KMeans++
- Bisecting K-means
- Evaluation index-silhouette coefficient
III. Dataset description
IV. Specific implementation process
V. Result analysis
VI. Complete code
VII. Deep learning and image segmentation (supplement)
- CNN
  - 1. Convolutional Layer
  - 2. Activation Function
  - 3. Pooling Layer
  - 4. Fully Connected Layer
  - How it works
- Segmentation effect

I. Project task requirements

Image segmentation is an important part of image processing and computer vision and has been widely used in real life. For example:
- In medicine, it is used to measure tissue volume in medical images, three-dimensional reconstruction, surgical simulation, etc.;
- In remote sensing images, it can segment targets in synthetic aperture radar images, extract different cloud systems and backgrounds in remote sensing cloud images, locate roads and forests in satellite images, etc.
Image segmentation can also serve as a preprocessing step that transforms the initial image into more abstract forms that are easier for a computer to process. It retains the important feature information in the image while discarding useless data, which improves the accuracy and efficiency of subsequent image processing. For example:
- In terms of communication, the outline structure, regional content, etc. of the target can be extracted in advance to ensure that useful information is not lost and the image can be compressed in a targeted manner to improve network transmission efficiency;
- In the field of transportation, it can be used for contour extraction, recognition or tracking of vehicles, pedestrian detection, etc.
In general, anything related to target detection, extraction, and recognition requires the use of image segmentation technology. Therefore, it is of great significance to conduct in-depth research and discussion of image segmentation, whether from the perspective of image segmentation technology and algorithms, or from the impact on image processing, computer vision, and practical applications.
Clustering is an important family of image segmentation techniques. Feature space clustering methods, such as the K-means algorithm and its variants, belong to the partitioning-based clustering methods. To segment an image with feature space clustering, each pixel in image space is represented by a point in feature space, the feature space is partitioned according to how those points group together, and the resulting groups are mapped back to the original image space to obtain the segmentation result.
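The three steps described above (pixels to feature points, clustering in feature space, mapping back to image space) can be sketched as follows. This is a minimal illustration that assumes RGB values are the only features; the function name `segment_by_color` and the toy image are hypothetical and are not part of the experiment code.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_by_color(image, n_clusters, seed=0):
    """Cluster pixels in RGB feature space and return an (H, W) label map."""
    height, width, channels = image.shape
    # 1. Image space -> feature space: one row per pixel, one column per channel
    features = image.reshape(-1, channels).astype(np.float64) / 255.0
    # 2. Partition the feature space
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(features)
    # 3. Feature space -> image space: restore the pixel grid
    return labels.reshape(height, width)

# Hypothetical toy image: dark left half, bright right half
toy = np.zeros((4, 6, 3), dtype=np.uint8)
toy[:, 3:] = 255
label_map = segment_by_color(toy, n_clusters=2)
print(label_map)
```

Each entry of `label_map` is the cluster index of the pixel at that position, so the two halves of the toy image end up in different clusters.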

Task description

This experiment uses features of the image such as grayscale, color, texture and shape to divide it into several non-overlapping regions, so that these features are similar within a region and clearly different between regions. Regions with unique properties in the segmented image can then be extracted for different studies.
Experiment content: use three clustering methods (K-means, K-means++ and bisecting K-means) to segment images, and write an analysis of the experimental results.

Main task requirements

Briefly describe the three algorithm ideas and implementation principles.
Each group member prepares a picture they like (the types of pictures should be diverse).
Write an analysis of the experimental results:
- Description of the experimental operating environment, such as the development platform, programming language and parameter settings.
- Comparison and analysis of the image segmentation effects of three methods under different initial cluster numbers.
  - The comparison table style is as follows, and the segmentation renderings can be pasted into the corresponding positions;
  - Use silhouette coefficient (or other evaluation index) to evaluate the clustering effect of the three methods (see reference website);
  - Analyze and explain the experimental results based on the results of the comparison table (sample table below) and the evaluation results.

II. Principle description

KMeans

KMeans++
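K-means++ differs from plain K-means only in how the initial centers are chosen: each new center is sampled with probability proportional to its squared distance from the nearest center already chosen. The following is a minimal sketch of that seeding step under this usual formulation; the function name `kmeans_pp_init` and the toy data are illustrative assumptions, and the experiment code itself relies on scikit-learn's `init='k-means++'` option instead.

```python
import numpy as np

def kmeans_pp_init(points, k, rng):
    """Pick k initial centers with k-means++ style squared-distance sampling."""
    n = points.shape[0]
    centers = [points[rng.integers(n)]]  # first center: uniform at random
    # Squared distance from every point to its nearest chosen center
    closest_sq = np.sum((points - centers[0]) ** 2, axis=1)
    for _ in range(1, k):
        # Assumes at least one point differs from the chosen centers,
        # otherwise the sum below would be zero
        probs = closest_sq / closest_sq.sum()  # far points are more likely
        centers.append(points[rng.choice(n, p=probs)])
        new_sq = np.sum((points - centers[-1]) ** 2, axis=1)
        closest_sq = np.minimum(closest_sq, new_sq)
    return np.array(centers)

rng = np.random.default_rng(0)
# Hypothetical data: two tight groups of normalized pixel colors
points = np.vstack([np.full((5, 3), 0.1), np.full((5, 3), 0.9)])
centers = kmeans_pp_init(points, k=2, rng=rng)
print(centers)
```

Because points that coincide with an existing center have zero sampling probability, the second center in this toy example always comes from the other group.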

Bisecting K-means
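A minimal sketch of the bisecting idea, assuming the usual formulation: start with all points in one cluster, then repeatedly split the cluster with the largest sum of squared errors (SSE) using ordinary 2-means until k clusters exist. The function name `bisecting_kmeans` and the toy data are illustrative and do not come from the experiment code. scikit-learn 1.1 and later also ships a ready-made `sklearn.cluster.BisectingKMeans` estimator.

```python
import numpy as np
from sklearn.cluster import KMeans

def bisecting_kmeans(points, k, seed=0):
    """Return labels obtained by repeatedly splitting the cluster with the largest SSE."""
    labels = np.zeros(len(points), dtype=int)
    for new_label in range(1, k):
        # SSE of each current cluster around its own mean
        sse = [np.sum((points[labels == c] - points[labels == c].mean(axis=0)) ** 2)
               for c in range(new_label)]
        worst = int(np.argmax(sse))
        members = np.flatnonzero(labels == worst)
        # Split the chosen cluster in two with ordinary 2-means
        # (assumes it contains at least two distinct points)
        halves = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(points[members])
        labels[members[halves == 1]] = new_label
    return labels

# Hypothetical data: three groups of normalized pixel colors
points = np.vstack([np.full((4, 3), 0.1), np.full((4, 3), 0.5), np.full((4, 3), 0.9)])
labels = bisecting_kmeans(points, k=3)
print(labels)
```

Splitting one cluster at a time makes the result less sensitive to the initial centers than running K-means once with k centers.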

Evaluation index-silhouette coefficient
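For a sample i, the standard silhouette value is s(i) = (b(i) - a(i)) / max(a(i), b(i)), where a(i) is the mean distance from i to the other points in its own cluster and b(i) is the smallest mean distance from i to the points of any other cluster. Values close to 1 indicate compact, well-separated clusters. The sketch below evaluates this with scikit-learn's `silhouette_score`; the synthetic "pixels" and the sample size of 300 are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Hypothetical "pixels": two color groups with a little noise
pixels = np.vstack([rng.normal(0.2, 0.02, size=(500, 3)),
                    rng.normal(0.8, 0.02, size=(500, 3))])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)

# sample_size evaluates the score on a random subset, which keeps the
# pairwise-distance computation affordable for images with many pixels
score = silhouette_score(pixels, labels, sample_size=300, random_state=0)
print(f"silhouette on a 300-point sample: {score:.3f}")
```

The exact definition needs all pairwise distances, so its cost grows quadratically with the number of pixels; sampling is one way to keep it tractable on full images.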

III. Dataset description

IV. Specific implementation process

V. Result analysis

VI. Complete code

import numpy as np
import PIL.Image as Image
from sklearn.cluster import KMeans, Birch
from sklearn.metrics import silhouette_score, pairwise_distances
import matplotlib.pyplot as plt

# Configure a Matplotlib font that supports Chinese characters
plt.rcParams['font.sans-serif'] = ['SimHei'] # or ['Microsoft YaHei']
plt.rcParams['axes.unicode_minus'] = False # Render the minus sign (-) correctly with this font

# Todo: Prepare data set ----------------------------------------------------
# todo: Define function load_data to process images
# Constant used to scale 8-bit channel values into [0, 1)
NORMALIZATION_FACTOR = 256

# todo: function to load data
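def load_data(file_path):
    # Convert to RGB so that grayscale or RGBA inputs also yield three channels
    with Image.open(file_path) as img:
        pixels = np.asarray(img.convert('RGB'), dtype=np.float64)
    # NumPy order is (height, width); PIL's img.size is (width, height)
    rows, cols = pixels.shape[:2]
    # Flatten row by row so that labels can later be reshaped to (rows, cols)
    # without transposing the image
    data = pixels.reshape(-1, 3) / NORMALIZATION_FACTOR
    return data, rows, cols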

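# Todo: Function to calculate a centroid-based silhouette coefficient --------
def silhouette_coefficient(X, labels):
    # Note: this is not the standard pairwise definition implemented by
    # sklearn.metrics.silhouette_score; distances are measured to cluster
    # centers, which keeps the cost manageable for full-size images.
    # Assumes the labels are the integers 0..k-1.
    n_clusters = len(set(labels))
    cluster_centers = [np.mean(X[labels == i], axis=0) for i in range(n_clusters)]
    distances = pairwise_distances(X, cluster_centers)
    # a[i]: mean distance from the points of cluster i to their own center
    a = np.array([np.mean(distances[labels == i, i]) for i in range(n_clusters)])
    # b[i]: distance from center i to the nearest point of any other cluster
    b = np.array([np.min(distances[labels != i, i]) for i in range(n_clusters)])
    s = (b - a) / np.maximum(a, b)
    return np.mean(s)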

# Todo: Image segmentation using KMeans, KMeans++ and bisecting K-means ------
# Load images and process them
image_paths = ["data/demo3/input/img1.jpg",
               "data/demo3/input/img2.jpg",
               "data/demo3/input/img3.jpg",
               "data/demo3/input/img4.jpg"]

k_values = [2, 3, 4]

# Initialize the silhouette coefficient lists
kmeans_silhouette_scores = []
kmeans_plus_silhouette_scores = []
birch_silhouette_scores = []

print("Image segmentation using KMeans, KMeans++ and bisecting K-means")
# Segment each image, compute its silhouette coefficient and save the segmented image
for k in k_values:
    kmeans_scores = []
    kmeans_plus_scores = []
    birch_scores = []

    for idx, path in enumerate(image_paths):
        img_data, rows, cols = load_data(path)

        # Calculate the silhouette coefficient of KMeans clustering
        kmeans_labels = KMeans(n_clusters=k, init='random', n_init=10).fit_predict(img_data)
        kmeans_silhouette_avg = silhouette_coefficient(img_data, kmeans_labels.flatten())
        kmeans_scores.append(kmeans_silhouette_avg)

        # Calculate the silhouette coefficient of KMeans++ clustering
        kmeans_plus_labels = KMeans(n_clusters=k, init='k-means++', n_init=10).fit_predict(img_data)
        kmeans_plus_silhouette_avg = silhouette_coefficient(img_data, kmeans_plus_labels.flatten())
        kmeans_plus_scores.append(kmeans_plus_silhouette_avg)

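        # Calculate the silhouette coefficient of the third method.
        # Note: sklearn's Birch implements the BIRCH algorithm, not bisecting K-means.
        # It is kept here as written; scikit-learn >= 1.1 provides
        # sklearn.cluster.BisectingKMeans for actual bisecting K-means.
        birch_labels = Birch(n_clusters=k, threshold=0.01, branching_factor=50).fit_predict(img_data)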
        birch_silhouette_avg = silhouette_coefficient(img_data, birch_labels.flatten())
        birch_scores.append(birch_silhouette_avg)

        print(f"Using K-means clustering with {k} cluster centers, the silhouette coefficient of image {idx + 1} is: {kmeans_silhouette_avg}")
        print(f"Using K-means++ clustering with {k} cluster centers, the silhouette coefficient of image {idx + 1} is: {kmeans_plus_silhouette_avg}")
        print(f"Using Birch clustering with {k} cluster centers, the silhouette coefficient of image {idx + 1} is: {birch_silhouette_avg}")

        # Save the segmented images
        kmeans_output_path = f"data/demo3/output/KMeans{k}-{idx + 1}.jpg"
        kmeans_plus_output_path = f"data/demo3/output/KMeans++{k}-{idx + 1}.jpg"
        birch_output_path = f"data/demo3/output/Birch{k}-{idx + 1}.jpg"

        segmented_img_kmeans = np.reshape(kmeans_labels, (rows, cols))
        segmented_img_kmeans = (segmented_img_kmeans * 255 / k).astype(np.uint8)
        segmented_img_kmeans = Image.fromarray(segmented_img_kmeans)
        segmented_img_kmeans.save(kmeans_output_path)

        segmented_img_kmeans_plus = np.reshape(kmeans_plus_labels, (rows, cols))
        segmented_img_kmeans_plus = (segmented_img_kmeans_plus * 255 / k).astype(np.uint8)
        segmented_img_kmeans_plus = Image.fromarray(segmented_img_kmeans_plus)
        segmented_img_kmeans_plus.save(kmeans_plus_output_path)

        segmented_img_birch = np.reshape(birch_labels, (rows, cols))
        segmented_img_birch = (segmented_img_birch * 255 / k).astype(np.uint8)
        segmented_img_birch = Image.fromarray(segmented_img_birch)
        segmented_img_birch.save(birch_output_path)


    # Calculate the average silhouette coefficient
    mean_kmeans_score = np.mean(kmeans_scores)
    mean_kmeans_plus_score = np.mean(kmeans_plus_scores)
    mean_birch_score = np.mean(birch_scores)

    # Store the average silhouette coefficients
    kmeans_silhouette_scores.append(mean_kmeans_score)
    kmeans_plus_silhouette_scores.append(mean_kmeans_plus_score)
    birch_silhouette_scores.append(mean_birch_score)

# Todo: Visualize the silhouette coefficients of KMeans, KMeans++ and Birch clustering ------
print("[Average silhouette coefficient]")
print(f"KMeans: {kmeans_silhouette_scores}")
print(f"KMeans++: {kmeans_plus_silhouette_scores}")
print(f"Birch: {birch_silhouette_scores}")


plt.figure(figsize=(10, 6))
plt.plot(k_values, kmeans_silhouette_scores, marker='o', label='KMeans')
plt.plot(k_values, kmeans_plus_silhouette_scores, marker='o', label='KMeans++')
plt.plot(k_values, birch_silhouette_scores, marker='o', label='Birch')
plt.xlabel('Number of cluster centers')
plt.ylabel('Silhouette coefficient')
plt.title('Average silhouette coefficients of KMeans, KMeans++ and Birch clustering')
plt.xticks(k_values)
plt.legend()
plt.grid(True)
plt.show()

# Todo: Visualize clustering results ----------------------------------------
# One row per image and one column per k, so that no image is drawn over another
def show_segmentations(file_prefix, method_name):
    fig, axes = plt.subplots(len(image_paths), len(k_values), figsize=(12, 8), squeeze=False)
    for col, k in enumerate(k_values):
        for row in range(len(image_paths)):
            segmented_img = Image.open(f"data/demo3/output/{file_prefix}{k}-{row + 1}.jpg")
            axes[row, col].imshow(segmented_img, cmap='gray')
            axes[row, col].axis('off')
        axes[0, col].set_title(f'{method_name} clustering results\n(number of clusters: {k})')
    plt.show()

show_segmentations("KMeans", "KMeans")
show_segmentations("KMeans++", "KMeans++")
show_segmentations("Birch", "Birch")

VII. Deep learning and image segmentation (supplement)

CNN

A convolutional neural network (CNN) is a deep learning model designed for grid-like data such as images and videos. CNNs have achieved great success in tasks such as image recognition, object detection, and image generation. Their core feature is that they learn features automatically from data, without a manually designed feature extractor.

The following are the main components and working principles of CNN:

1. Convolutional Layer:

The convolutional layer is the core of a CNN. It uses convolution operations to extract features from images. A small window (the convolution kernel) slides over the input image; at each position, the values inside the window are multiplied element-wise by the kernel weights and summed, producing one entry of the output feature map. Because each output depends only on a small neighborhood, this operation captures local features in the image and is therefore well suited to image data.
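The sliding-window operation can be written out directly in NumPy. This is a minimal sketch for a single-channel image with no padding and stride 1; the function name `conv2d_valid` and the example kernel are illustrative assumptions. As in most deep learning frameworks, the kernel is applied without flipping, which is strictly a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(window * kernel)  # weighted sum of the window
    return feature_map

image = np.array([[0, 0, 1, 1]] * 4, dtype=float)  # vertical edge in the middle
kernel = np.array([[-1.0, 1.0]])                   # horizontal-difference filter
feature_map = conv2d_valid(image, kernel)
print(feature_map)  # every row is [0. 1. 0.]: the filter responds only at the edge
```

In a trained network the kernel weights are learned rather than fixed by hand, and many kernels are applied in parallel to produce many feature maps.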

2. Activation Function:

The activation function introduces nonlinearity, allowing the network to learn complex patterns. Commonly used activation functions include ReLU (Rectified Linear Unit), sigmoid and tanh.
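The three activation functions named above are simple element-wise formulas. A small NumPy sketch follows; the helper names `relu` and `sigmoid` are chosen for illustration.

```python
import numpy as np

def relu(x):
    """Keep positive values, replace negative values with zero."""
    return np.maximum(0.0, x)

def sigmoid(x):
    """Squash values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, 0.0, 2.0])
relu_out = relu(x)        # [0. 0. 2.]
sigmoid_out = sigmoid(x)  # the middle entry is exactly 0.5
tanh_out = np.tanh(x)     # values in (-1, 1), the middle entry is 0
print(relu_out, sigmoid_out, tanh_out)
```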

3. Pooling Layer:

The pooling layer is used to reduce the spatial size of the feature map while retaining important features. The most common pooling operation is Max Pooling, which selects the maximum value in each region as the output, thereby reducing the size of the feature map.
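A minimal NumPy sketch of max pooling with non-overlapping windows, assuming a single feature map; the function name `max_pool2d` is illustrative. Rows and columns that do not fill a complete window are dropped.

```python
import numpy as np

def max_pool2d(feature_map, size=2):
    """Take the maximum over non-overlapping size x size blocks."""
    h_out = feature_map.shape[0] // size
    w_out = feature_map.shape[1] // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    # Split each axis into (block index, position inside the block)
    blocks = trimmed.reshape(h_out, size, w_out, size)
    return blocks.max(axis=(1, 3))

feature_map = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool2d(feature_map)
print(pooled)  # [[ 5.  7.] [13. 15.]]
```

A 4 x 4 map becomes 2 x 2, so each spatial dimension is halved while the strongest response in each block is kept.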

4. Fully Connected Layer:

The fully connected layer converts the feature maps extracted by the previous convolutional layer and pooling layer into the final output of the network. Each neuron in the fully connected layer is connected to all neurons in the previous layer, and features are combined and classified by learning weights.
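A fully connected layer is a matrix-vector product plus a bias, applied to the flattened feature maps. The sketch below uses random weights purely for illustration; the sizes (two 3 x 3 feature maps, four output classes) and the softmax on top are assumptions, not part of the original text.

```python
import numpy as np

def dense(x, weights, bias):
    """Every output neuron is a weighted sum of all inputs plus a bias."""
    return weights @ x + bias

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
feature_maps = rng.random((2, 3, 3))  # hypothetical output of the pooling layer
x = feature_maps.reshape(-1)          # flatten to a vector of length 18
weights = rng.normal(size=(4, 18))    # 4 output classes, untrained weights
bias = np.zeros(4)
probs = softmax(dense(x, weights, bias))
print(probs)
```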

How it works:

Input data: The input of CNN is usually a three-dimensional array, representing the height, width and number of channels of the image (for example, RGB images have three channels, grayscale images have only one channel).
Convolution and activation: The input data passes through the convolution layer and activation function to obtain a series of feature maps, each feature map representing a different feature.
Pooling: The feature map is downsampled through the pooling layer to reduce the spatial size while retaining important features.
Fully connected: The output of the pooling layer is flattened into a one-dimensional vector, which is input to one or more fully connected layers to perform tasks such as classification or regression.
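The four stages above can be chained into a single forward pass. The sketch below tracks the array shapes at each stage for a hypothetical 8 x 8 RGB input, four 3 x 3 kernels and ten output scores; all sizes and the random, untrained weights are assumptions made for illustration.

```python
import numpy as np

def forward(image, kernels, weights, bias):
    """Convolution -> ReLU -> 2x2 max pooling -> flatten -> fully connected."""
    h, w, _ = image.shape
    n_kernels, kh, kw, _ = kernels.shape
    conv = np.zeros((h - kh + 1, w - kw + 1, n_kernels))
    for i in range(conv.shape[0]):
        for j in range(conv.shape[1]):
            window = image[i:i + kh, j:j + kw, :]  # (kh, kw, channels)
            # One weighted sum per kernel, taken over height, width and channels
            conv[i, j] = np.tensordot(kernels, window, axes=3)
    activated = np.maximum(0.0, conv)              # ReLU
    ph, pw = activated.shape[0] // 2, activated.shape[1] // 2
    pooled = (activated[:ph * 2, :pw * 2]
              .reshape(ph, 2, pw, 2, n_kernels)
              .max(axis=(1, 3)))                   # 2x2 max pooling
    flat = pooled.reshape(-1)                      # one-dimensional vector
    return weights @ flat + bias                   # fully connected layer

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))              # height, width, channels
kernels = rng.normal(size=(4, 3, 3, 3))    # 4 kernels of size 3x3 over 3 channels
weights = rng.normal(size=(10, 3 * 3 * 4)) # pooled output is 3x3x4 = 36 values
bias = np.zeros(10)
scores = forward(image, kernels, weights, bias)
print(scores.shape)  # (10,)
```

The shapes go from (8, 8, 3) to (6, 6, 4) after convolution, to (3, 3, 4) after pooling, to 36 after flattening, and finally to 10 scores.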

The main advantage of CNN is that it can automatically learn the spatial structure features in the input data without manually designing a feature extractor. This makes it perform very well on tasks such as image recognition.