[Image Fusion] Gaussian of Differences: a simple and effective universal image fusion method [for fusing infrared and visible images, multi-focus images, multi-modal medical images, and multi-exposure images] (Matlab code implementation)

Welcome to this blog

About this blog: the content aims to be as careful and logically organized as possible, for the convenience of readers.

Motto: On a journey of a hundred miles, ninety miles is only the halfway point.

The table of contents of this article is as follows:

Table of Contents

1 Overview

2 Running results

3 References

4 Matlab code and articles


1 Overview

From the literature:

Analyzing separately images obtained from a single source using different camera settings or spectral bands (whether from one or multiple sensors) is difficult. To solve this problem, the images are often combined to create a single image that contains all of the unique information from each source image, a process called image fusion. This paper proposes a simple and efficient pixel-based image fusion method that uses Gaussian filtering to weight the edge information of each pixel in all source images in proportion to the distance of its neighbors. The proposed Gaussian of Differences (GD) method is evaluated on multi-modal medical images, multi-sensor visible and infrared images, multi-focus images, and multi-exposure images, and is compared against existing state-of-the-art fusion methods using objective fusion quality metrics. The parameters of the GD method are further tuned with the pattern search (PS) algorithm, yielding an adaptive optimization strategy. Extensive experiments show that the proposed GD fusion method achieves the best average ranking among the compared methods in terms of both objective quality metrics and CPU time.
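As a rough illustration of this idea (not the authors' actual mfiltw implementation), the sketch below fuses two source images by measuring each pixel's edge content as the difference between the image and a Gaussian-blurred copy, smoothing that measure with a Gaussian of size k, and using the result as normalized fusion weights. The function name gd_sketch, the size-to-sigma mapping, and the use of imgaussfilt are assumptions.

% Rough sketch of a Gaussian-of-differences style pixel-weighted fusion of two
% source images. Not the paper's implementation; it only illustrates the idea.
function F = gd_sketch(A, B, k)             % k plays the role of GD5/GD10/GD15
    A = im2double(A);  B = im2double(B);
    sigma = k/4;                             % assumed mapping from size k to sigma
    dA = abs(A - imgaussfilt(A, sigma));     % high-frequency (edge) content of A
    dB = abs(B - imgaussfilt(B, sigma));     % high-frequency (edge) content of B
    wA = imgaussfilt(dA, sigma) + eps;       % smoothed weight map for A
    wB = imgaussfilt(dB, sigma) + eps;       % smoothed weight map for B
    F  = (wA.*A + wB.*B) ./ (wA + wB);       % normalized weighted sum
end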

The purpose of image fusion is to combine complementary information from multiple source images into a unified image [1, 2, 3, 4]. In multi-modal medical image fusion, two or more images from different imaging modalities are combined [5]. Magnetic resonance (MR) and computed tomography (CT) are two different medical imaging modalities with complementary strengths and weaknesses: CT images have high spatial resolution, making bones more visible, while MR images have high contrast resolution, showing soft tissues such as organs [6]. Visible and infrared image fusion is a computational technique that combines information from infrared- and visible-spectrum images to improve the visibility of objects and enhance image contrast, particularly for night-vision enhancement, remote sensing, and pan-sharpening [7, 8, 9, 10, 11, 12]. Multi-exposure image fusion integrates multiple images, each captured at a different exposure level, to generate high-dynamic-range (HDR) images; HDR images preserve details in both dark and bright areas, thereby improving image quality, visual fidelity, and image analysis in computer vision tasks [13, 14]. Multi-focus image fusion merges multiple images with different focus levels into a single composite image [15, 16, 17, 18, 19], which improves overall sharpness, extends the depth of field, and enhances visual perception [20]. These advantages enable more accurate analysis and interpretation of fused images in computer vision applications.

Image fusion methods in the literature can broadly be divided into two categories: pixel-domain and transform-domain methods [21]. Pixel-domain (or spatial-domain) techniques combine the source images directly using their grayscale or color pixel values; the best-known example is the arithmetic mean of the source images. Arithmetic averaging can be used to combine multi-sensor and multi-focus images, but its biggest disadvantage is reduced image contrast [22]. The basic idea of multi-scale, transform-based image fusion methods is to apply a multi-resolution decomposition to each source image, combine the decomposition results with various rules to create a unified representation, and finally apply the inverse multi-resolution transform [23]. Well-known examples of these methods include principal component analysis (PCA), the discrete wavelet transform (DWT), the Laplacian pyramid (LP), and other pyramid-based transforms [24]. In recent years, several image fusion algorithms based on machine learning and deep learning have been proposed [3, 25, 26, 27, 28]. These methods are robust and exhibit excellent performance; however, the training phase requires powerful high-performance computing systems and large amounts of training data, and running the trained models can be too time-consuming for real-time applications [29].
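For concreteness, the snippet below contrasts the two categories on a pair of registered grayscale source images: a plain arithmetic mean (pixel domain) versus a one-level DWT fusion (transform domain) that averages the approximation band and keeps the larger-magnitude detail coefficients. The file names are placeholders, the fusion rule is a textbook example rather than any method from the cited works, and dwt2/idwt2 require MATLAB's Wavelet Toolbox.

A = im2double(imread('source1.png'));    % placeholder grayscale source images
B = im2double(imread('source2.png'));

% Pixel-domain baseline: arithmetic mean (simple, but lowers contrast)
F_mean = (A + B) / 2;

% Transform-domain example: one-level DWT fusion
[cA1,cH1,cV1,cD1] = dwt2(A, 'db1');
[cA2,cH2,cV2,cD2] = dwt2(B, 'db1');
maxabs = @(x,y) x.*(abs(x) >= abs(y)) + y.*(abs(x) < abs(y));   % keep stronger detail
cA = (cA1 + cA2) / 2;                                           % average the approximation band
F_dwt = idwt2(cA, maxabs(cH1,cH2), maxabs(cV1,cV2), maxabs(cD1,cD2), 'db1');

figure; montage({F_mean, F_dwt});        % compare pixel-domain vs transform-domain results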

Image fusion can be performed at three levels: pixel level, feature level, and decision level. Pixel-level fusion directly integrates the raw data of the source images to produce a fused image that is more informative for both computer processing and human visual perception; compared with other fusion approaches, it aims to improve the visual quality and computational efficiency of the fused images. Li et al. proposed a pixel-based method that computes the visibility of each pixel in the source images [30]. Yang and Li proposed a multi-focus image fusion method based on spatial frequency and morphological operators [31]. Typically, in pixel-level image fusion, weights are determined from the activity levels of the individual pixels [32]. In related studies, neural networks [33] and support vector machines [34] were used to select the pixels with the most significant activity, using wavelet coefficients as input features. Ludusan and Lavialle proposed a variational pixel-level fusion method based on error estimation theory and partial differential equations to mitigate image noise [35]. In [36], a multi-exposure image fusion technique was introduced that involves two main stages: image features, including local contrast, brightness, and color differences, are computed to generate weight maps, which are then refined using recursive filtering; the fused image is subsequently formed as a weighted sum of the source images based on these refined weight maps. In addition to the many available pixel-level methods, region-based spatial methods using patches [37] or adaptive regions [38, 39] have also been proposed and shown to outperform existing methods.
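As a minimal sketch of this weight-map idea (not any specific method from the cited references), the code below uses local Laplacian energy as the activity measure for a pair of registered grayscale multi-focus images, softens the resulting decision map, and forms the fused image as a weighted sum. The file names and the choice of a 7x7 averaging window are placeholders.

% Pixel-level fusion driven by an activity-level weight map.
% Activity = local energy of the Laplacian response (a common focus measure).
A = im2double(imread('focus_near.png'));    % placeholder multi-focus pair
B = im2double(imread('focus_far.png'));

lap  = fspecial('laplacian');
actA = imfilter(abs(imfilter(A, lap, 'replicate')), ones(7)/49, 'replicate');
actB = imfilter(abs(imfilter(B, lap, 'replicate')), ones(7)/49, 'replicate');

W = double(actA >= actB);                   % pick the more "active" source per pixel
W = imgaussfilt(W, 2);                      % soften the decision map (cf. refined weight maps)
F = W.*A + (1 - W).*B;                      % weighted-sum fusion
figure; imshow(F);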

Within the framework of the image fusion algorithm based on anisotropic diffusion filtering (ADF), weight-map layers are formed by smoothing the images with an edge-preserving method; these weight-map layers undergo further processing before the fusion rules are applied to obtain the final output [40]. Kumar introduced the cross bilateral filter (CBF) method, which takes into account both the gray-level similarity and the geometric closeness of neighboring pixels without blurring edges; the source images are combined by a weighted average, with weights computed from detail images extracted from the sources using the CBF [41]. The fourth-order partial differential equation (FPDE) method first applies the differential equations to each source image to obtain approximation images; PCA is then used to obtain optimal weights for the detail images, which are combined to produce the final detail image, the final approximation image is obtained by averaging the set of approximation images, and the fused image is computed by merging the final approximation with the detail image [42]. The context-enhancement-based (GFCE) method preserves visible details of the input images and background scenes, and can therefore successfully transfer important infrared information into the synthesized image [43]. The gradient transfer fusion (GTF) method, based on gradient transfer and total variation (TV) minimization, attempts to preserve appearance information and thermal radiation simultaneously [44]. The hybrid multi-scale decomposition (HMSD) method uses a combination of bilateral and Gaussian filters to decompose the source images into fine-scale texture details and larger-scale edge features; this allows important infrared spectral features to be better captured and fine texture details to be separated from large edges [45]. The infrared feature extraction and visual information preservation (IFEVIP) method provides a simple, fast, yet effective fusion of infrared and visible images: the infrared background is first reconstructed using quadtree decomposition and Bezier interpolation, bright infrared features are then extracted by subtracting the reconstructed background from the infrared image, and a refinement step reduces redundant background information [46]. The multi-resolution singular value decomposition (MSVD) method is an image fusion technique based on a process similar to the wavelet transform: the signal is filtered separately by low-pass and high-pass finite impulse response (FIR) filters, and the output of each filter is decimated by a factor of two to obtain the first level of decomposition [47]. The VSMWLS method, a multi-scale fusion technique combining visual saliency maps (VSM) and weighted least squares (WLS) optimization, aims to enhance the transfer of important visual details while minimizing the inclusion of irrelevant infrared (IR) details or noise in the merged image [48]. Liu et al. proposed deep convolutional neural network (CNN)-based methods for infrared-visible image fusion [49] and multi-focus image fusion [50]. They successfully used Siamese convolutional networks to integrate pixel activity information from two source images into a weight map, addressing the key issues of activity-level measurement and weight assignment in image fusion [49].
On the other hand, traditional image fusion techniques sometimes struggle to perform satisfactorily because focus estimation and image fusion are treated as two separate problems. Liu et al. proposed a deep learning method that avoids separate focus estimation by learning a direct mapping between the source images and the focus map [50].

2 Running results

Part of the code:

% Gaussian of differences: a simple and efficient general image fusion method



function fuseimage = GD(images,ver)
% GD  Gaussian of Differences image fusion.
%   images : source images to be fused (in the format expected by mfiltw/mfiltw_opt)
%   ver    : selects the variant
%            ver=1: GD5   (fixed Gaussian size k=5)
%            ver=2: GD10  (fixed Gaussian size k=10)
%            ver=3: GD15  (fixed Gaussian size k=15)
%            ver=4: GDPSQABF (pattern-search optimization of the Qabf metric)
%            ver=5: GDPSQCD  (pattern-search optimization of the Qcb metric)
%            ver=6: GDPSQCV  (pattern-search optimization of the Qcv metric)
if ver==1
    k=5;
    fuseimage = mfiltw(images,k);
elseif ver==2
    k=10;
    fuseimage = mfiltw(images,k);
elseif ver==3
    k=15;
    fuseimage = mfiltw(images,k);
elseif ver==4
    fitmetric="Qabf";
    [fuseimage]=mfiltw_opt(images,fitmetric);
elseif ver==5
    fitmetric="Qcb";
    [fuseimage]=mfiltw_opt(images,fitmetric);
elseif ver==6
    fitmetric="Qcv";
    [fuseimage]=mfiltw_opt(images,fitmetric);
end
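A possible calling pattern is shown below. Note that the exact input format expected by mfiltw/mfiltw_opt is not visible in this listing, so the use of a cell array of registered grayscale source images (and the file names) is an assumption.

% Hypothetical usage of GD -- the input format is assumed, not taken from the paper.
img1 = im2double(imread('ir.png'));       % e.g. infrared source image
img2 = im2double(imread('vis.png'));      % e.g. visible-light source image
images = {img1, img2};                    % assumed: cell array of registered sources

F5    = GD(images, 1);                    % GD with fixed Gaussian size 5 (GD5)
Fqabf = GD(images, 4);                    % GD with pattern-search optimization of Qabf
figure; imshowpair(F5, Fqabf, 'montage');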


3 References

Some of the content in this article is quoted from the Internet; sources are indicated or listed as references where possible, though some information may inevitably be incomplete. If anything is inappropriate, please feel free to contact us and it will be removed.

4 Matlab code and articles