© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Medical images are affected by various complications such as noise and deficient contrast. To increase the quality of an image, it is highly important to increase the contrast and eliminate noise. In the field of image processing, image enhancement is one of the essential methods for recovering the visual aspects of an image. However, properly segmenting medical images such as brain magnetic resonance imaging (MRI) and lung computed tomography (CT) scans is a difficult task. In this article, a novel hybrid method is proposed for the enhancement and segmentation of lung images. The proposed method includes two steps. In the first step, lung images were enhanced. During enhancement, the images passed through several stages: de-hazing, complementing, channel stretching, coarse illumination estimation, and image fusion by principal component analysis (PCA). In the second step, a modified U-Net model was applied to segment the images. After the enhancement process, we evaluated the entropy of the input and output images, the peak signal-to-noise ratio (PSNR), the gradient magnitude similarity deviation (GMSD), and the multi-scale contrast similarity deviation (MCSD), and compared the results with the existing adaptive gamma correction with weighted distribution (AGCWD) method. For segmentation, we used both the original and the enhanced images and calculated the Dice-coefficient, obtaining 0.9695 for the original images and 0.9797 for the enhanced images.
enhancement, segmentation, dehazing, illumination, lung images, U-Net
Interpreting magnetic resonance imaging (MRI) and computed tomography (CT) medical images requires extensive training and expertise, because segmentation of body parts and lesions is implemented layer by layer [1]. If doctors perform segmentation manually, the process is tedious and time-consuming. Moreover, different doctors' subjective judgments can introduce discrepancies. However, segmenting images automatically is also difficult; for most medical applications, automatic segmentation remains an unsolved problem because of the broad range of image modalities, encoding parameters, and anatomical variability.
Before segmentation, images should be of high visual quality; otherwise, they may be misdiagnosed. Generally, CT and MRI images have poor contrast; therefore, enhancement techniques are applied to improve the contrast. There are many methods for enhancing images to achieve better visual clarity, perception, and investigation. The first category of image enhancement algorithms is based on histograms. Histogram-based algorithms include histogram equalization, adaptive histogram equalization, bi-histogram equalization, and contrast-limited adaptive histogram equalization, as well as other operations such as gray-level grouping [2]. Histogram equalization can be applied to improve the contrast and brightness of images; this approach expands the intensity range of the image. However, histogram equalization does not work properly in all situations and may add noise to the output images. Adaptive histogram equalization works differently from ordinary histogram equalization: many histograms are computed for separate sections of the image, and the intensity range is redistributed within each section. Adaptive histogram equalization is very effective for enhancing local contrast, and it enhances the intensity of edges in every section of an image. Brightness-preserving bi-histogram equalization splits the input image into two parts at the mean intensity [3], which yields two disjoint ranges of the histogram. After the two sub-histograms are obtained, equalization is applied to each. With this method, the original brightness of the image can be maintained to a certain degree. Another variant is contrast-limited adaptive histogram equalization. This method first converts an image from the RGB (red, green, blue) color space to the HSV (hue, saturation, value) color space. The value component is then processed without disturbing the hue or saturation. Pixels clipped from the original histogram are redistributed across the gray range, so that every histogram bin is limited to a prescribed maximum. In the final step, the altered image in the HSV color space is converted back to the RGB color space.
Image enhancement can be performed by increasing contrast, but this technique has many side effects. To overcome them, histogram equalization can be applied, but it does not preserve the initial brightness of the image. Chinegeram et al. [4] suggested a method for increasing the brightness of dimmed images via the adaptive gamma correction with weighted distribution (AGCWD) technique. This technique modifies the histogram and is used for histogram equalization. After enhancing the images, segmentation was applied using the Online Region-based Active Contour Model (ORACM). They applied their model to brain image segmentation and calculated the mean square error (MSE) and peak signal-to-noise ratio (PSNR).
A new method for segmenting the lung region was suggested by Abdullah et al. [5]. They used a thresholding-based technique for enhancing and segmenting lung images. They compared their proposed technique with the modified watershed segmentation technique and achieved an accuracy improvement of 0.02% to 3.5% in a quantitative study.
Gupta et al. [6] suggested a novel hybrid model for improving segmentation effectiveness in brain MR images. They used an adaptive filter to eliminate noise in the input images. They then combined extended K-means clustering with Fuzzy C-means clustering to form a hybrid segmentation model. They applied this model to single-channel T1 MR images to identify malignant tumors and multiform benign lesions, and they eliminated the constraint of a prefixed cluster size. They also performed non-linear operations to remove non-tumor tissues. Statistical parameters such as entropy, smoothness, mean, and standard deviation were evaluated. The authors achieved 98% segmentation effectiveness in their work.
Many authors have suggested models based on the U-Net architecture, which is frequently utilized for image segmentation. Yin et al. [1] proposed an improved U-Net architecture to solve many issues that commonly occur during the segmentation of medical images. The authors analyzed U-Net network dimensions, enhanced model variants, and variables such as kernel size, and presented the loss functions, assessment parameters, and modules generally used for medical image segmentation. Skourt et al. [7] suggested segmenting lung CT images using the U-Net architecture. With U-Net, it is not necessary to include irrelevant data from lung CT images, and correct segmentation can be attained with a Dice-coefficient index of 0.9502. Shamim et al. [8] proposed a segmentation technique to determine ground-glass haziness in regions of interest (ROIs) in CT images caused by the novel coronavirus. They used a modified U-Net method to categorize the ROIs at the pixel level. The authors reported accuracy (93.29%), precision (93.67%), and F1-score (93.34%), among other metrics, achieving an increase in each parameter compared with other U-Net models. Lee et al. [9] suggested a patch-wise U-Net model for brain image segmentation, designed to eliminate a shortcoming of the standard U-Net model. In this model, MRI scans are split into non-overlapping patches that are input into the U-Net model along with the corresponding ground-truth patches to train the model. They achieved a Dice similarity coefficient of 0.93 and performed 3% and 10% better than the traditional U-Net and SegNet-based models, respectively. The authors used two databases, the Internet Brain Segmentation Repository (IBSR) and the Open Access Series of Imaging Studies (OASIS). Saood and Hatem [10] utilized two models, SegNet and U-Net, for binary segmentation to distinguish between healthy and infected lung tissue, and for multiclass segmentation to determine the kind of infection in the lung. They used 72 images to train the model, 10 images for validation, and 18 images for testing.
Gite et al. [11] suggested U-Net++ for lung segmentation from X-ray images. If classification techniques work on segmented lung images instead of raw X-ray images, identifying tuberculosis (TB) is easier and more accurate. They achieved 98% accuracy with U-Net++. Nazir et al. [12] used lung images for cancer detection. They suggested an image fusion technique based on multi-resolution rigid registration (MRR) together with the discrete wavelet transform and principal component analysis (DWT-PCA). According to the authors, the MRR technique is highly precise compared with SRR. After applying MRR, the images are enhanced via the DWT-PCA fusion method. They used ResNet-18 for image classification and achieved 98.2% accuracy. Riaz et al. [13] suggested an upgraded hybrid neural network formed by combining two models, MobileNetV2 and U-Net. They applied this model to segment malignant lung tumors from CT images of the lungs. The authors used the Medical Segmentation Decathlon (MSD) 2018 challenge dataset and achieved a Dice score of 0.8793. Surono et al. [14] utilized the U-Net architecture for segmentation of lung images. They worked on a database with a different resolution for each image, performed experiments with several training-to-testing data ratios, and compared the effectiveness of the method on a single-resolution database with that on a multi-resolution database. Mique and Malicdem [15] suggested a model based on the ResNet architecture for semantic segmentation of lung images. They used a dataset of 562 chest X-ray images with lung mask images and a training-to-test data ratio of 70:30. The authors achieved a Dice-coefficient of 0.986.
Abdullah et al. [16] performed a comparative study of three segmentation methods, comparing the segmentation results with manual segmentations by an oncologist, which served as the reference for lung cancer detection. In this study, K-means clustering, Otsu thresholding, and watershed segmentation were used to segment the lung images. Among these techniques, the watershed segmentation method achieved the best results, with an accuracy of 99.85%. Khan et al. [17] also utilized K-means clustering for segmenting brain images and deep learning models for classifying brain tumors.
We propose a model in which images are enhanced first, after which the segmentation process is applied. The MATLAB platform is used for implementation. For image enhancement, the images pass through two phases. In the first phase, complement, dehazing, and complement operations are applied; in the second phase, illumination and reflectance are estimated from the images and subsequently used to enhance them. Finally, the outputs of both phases are combined using a PCA-based image fusion algorithm to generate the final enhanced images. After the images are enhanced, a modified U-Net is applied for segmentation. We applied this model to both datasets (the original image dataset and the enhanced image dataset). The U-Net was implemented in Python in a Jupyter notebook.
Table 1. Preprocessing, enhancement, segmentation techniques, and performance parameters of different research papers
| Author | Dataset | Preprocessing/Enhancement Technique | Segmentation Technique | Segmentation Accuracy | Recall | Precision | F-score | Other Parameters |
|---|---|---|---|---|---|---|---|---|
| Chinegeram et al. [4] | 4 brain images | Adaptive gamma correction via weighted distribution (AGCWD) | Online Region-based Active Contour Model (ORACM) | N/A | N/A | N/A | N/A | Iterations, CPU time, total area covered |
| Abdullah et al. [5] | Advanced Medical & Dental Institute (AMDI), Universiti Sains Malaysia (USM), Kepala Batas, Pulau Pinang; 1,155 soft tissue density images from 5 subjects | N/A | New segmentation method based on thresholding, masking, and enhancement | 99.9% | 99.8% | 99.9% | 99.74% | N/A |
| Skourt et al. [7] | Lung Image Database Consortium image collection (LIDC-IDRI) | N/A | U-Net architecture | N/A | N/A | N/A | N/A | Dice-coefficient index (0.9502) |
| Shamim et al. [8] | COVID-19 CT image dataset | N/A | Modified U-Net model | 93.29% | 93.01% | 93.67% | 93.34% | Dice-coefficient (92.46%) |
| Lee et al. [9] | Open Access Series of Imaging Studies (OASIS) and Internet Brain Segmentation Repository (IBSR) | N/A | Patch-wise U-Net model | N/A | N/A | N/A | N/A | Dice similarity coefficient (0.93) |
| Saood and Hatem [10] | Collection of the Italian Society of Medical and Interventional Radiology | N/A | SegNet and U-Net | SegNet: 95%; U-Net: 91% | SegNet: 95.6%; U-Net: 96.4% | N/A | SegNet: 0.861; U-Net: 0.856 | Dice score: SegNet 0.749; U-Net 0.733 |
| Gite et al. [11] | Montgomery and Shenzhen datasets | Image normalization | U-Net++ | 98% | 99.32% | 96.85% | N/A | Dice score: 0.9796 |
| Riaz et al. [13] | Medical Segmentation Decathlon Challenge (MSD) | Image normalization | MobileNetV2 and U-Net | N/A | 86.02% | 93% | N/A | Dice score: 0.8793 |
| Surono et al. [14] | NSCLC-Radiomics dataset | N/A | U-Net | 94.47% | N/A | N/A | N/A | N/A |
| Mique and Malicdem [15] | National Institute of Health (NIH) | N/A | U-Net | N/A | N/A | N/A | N/A | Dice score: 0.9496 |
| Hua et al. [18] | BrainWeb | N/A | Improved multi-view fuzzy C-means clustering algorithm | 80.32% | N/A | N/A | N/A | Dice-coefficient: 89.61% for 0% noise |
| Geetha Pavani et al. [19] | Montgomery dataset | Arithmetic mean filter | Chan-Vese active contour | 95.5% | 93.3% | N/A | N/A | Area under curve score: 95% |
| Hofmanninger et al. [20] | VISCERAL Anatomy3 (VISC-36), LTRC (LTRC-36), and (LCTSC-36) | N/A | U-Net method | N/A | N/A | N/A | N/A | Dice-coefficient (0.97) |
1.1 Performance parameters and experimental results of various segmentation methods applied to variety of datasets
Table 1 lists the preprocessing, enhancement, and segmentation techniques used in different research papers. U-Net or its modified versions are the most commonly used. Most of the related research papers report their results in terms of segmentation accuracy, F1-score, precision, and recall. In some papers, other parameters such as the Dice-coefficient, number of iterations, and CPU time are used.
This section explains how lung images are preprocessed and enhanced. During this process, the images pass through several operations. In this methodology, our aim is to extract the illumination and reflectance of a low-illumination image, further enhance the illumination, and reconstruct the final image. Once the images are preprocessed and enhanced, the segmentation method is applied to the enhanced images. The complete architecture is shown in Figure 1.
Figure 1. Complete architecture of methodology
2.1 Image preprocessing for enhancement
In this model we used lung images for enhancement. Our proposed methodology contains two phases. In the first phase, images were passed through three steps (complement, dehazing, and complement) to enhance their quality. In the second phase, to preserve naturalness, we estimated the illumination and reflectance of the images and finally enhanced the images using these estimated values. Each image is divided into three channels (R, G, B), and these three channels are passed individually through modified Gaussian filters. After this, we calculated the average of all the weights. This process improved the visual quality of the images. We multiplied the enhanced illumination by the reflectance to obtain the updated image. Finally, the outputs of both phases were combined with PCA-based image fusion. The suggested enhancement methodology is shown in Figure 2.
Figure 2. Suggested model for image enhancement
2.1.1 Phase-1: Complement, dehazing, and complementing
To enhance the quality of the images, we computed the complement of the image and then applied a dehazing technique. The decreased visibility of images due to atmospheric circumstances can be improved by dehazing, whose main purpose is to recover the brightness of a scene from a hazy image. We utilized MATLAB's imreducehaze algorithm, which offers two different dehazing methods, Simple Dark Channel Prior (DCP) and Approx DCP. The Simple DCP algorithm uses a per-pixel dark channel to estimate haze and quadtree decomposition to estimate the atmospheric light. The Approx DCP method uses both per-pixel and spatial blocks when estimating the dark channel and does not use quadtree decomposition. After the dehazing process, we again computed the complement of the dehazed image. This is the first output image.
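As an illustration, the following minimal Python sketch mirrors this complement-dehaze-complement pipeline (our implementation uses MATLAB's imreducehaze). The dark-channel-prior dehazer here is simplified: it estimates atmospheric light from the brightest dark-channel pixels rather than by quadtree decomposition, it assumes an RGB input in [0, 1], and the file name is hypothetical.

```python
import numpy as np
import cv2  # OpenCV

def dark_channel(img, patch=15):
    # Per-pixel minimum over channels, then a minimum filter over a local patch.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    return cv2.erode(img.min(axis=2), kernel)

def dehaze_dcp(img, omega=0.95, t0=0.1, patch=15):
    # Simplified dark channel prior dehazing (no quadtree refinement).
    dc = dark_channel(img, patch)
    n = max(1, int(dc.size * 0.001))  # brightest 0.1% dark-channel pixels
    idx = np.unravel_index(np.argsort(dc, axis=None)[-n:], dc.shape)
    A = img[idx].mean(axis=0)  # atmospheric light estimate
    t = np.clip(1.0 - omega * dark_channel(img / A, patch), t0, 1.0)
    return np.clip((img - A) / t[..., None] + A, 0.0, 1.0)

def phase1(img):
    # Complement -> dehaze -> complement (Section 2.1.1).
    return 1.0 - dehaze_dcp(1.0 - img)

img = cv2.imread("lung.png").astype(np.float64) / 255.0  # hypothetical file
output1 = phase1(img)  # first output image (O1)
```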
2.1.2 Phase-2: Estimation of illumination by using maximum of RGB
The aim of illumination estimation is to find the intensity, direction, and/or color of the lighting in an image. To calculate the initial coarse illumination, we utilized the maximum red, green, blue (max-RGB) method, which is a good method for determining the true illumination. The formula for the maximum RGB method is given in Eq. (1):
$I_{ci}(p, q)=\max _{c \in\{R, G, B\}}\left(\max _{(p, q) \in \Omega} I_c(p, q)\right)$ (1)
where, $I_{ci}(p, q)$ is the coarse illumination, $I_{c}(p, q)$ is a color channel of the image, and $\Omega$ is a local patch centered at (p, q). We applied this formula to all three channels (R, G, B). After calculating the coarse illumination for all three color channels, we used Eq. (2) to select the maximum channel among the three color channels.
$I_m(p, q)=\max \left(I_R(p, q), I_G(p, q), I_B(p, q)\right)$ (2)
where, $I_{m}(p, q)$ is the maximum channel illumination. After calculating the illumination for the R, G, and B channels, we applied a Gaussian filter as per Eq. (3).
$G(x, y)=A e^{\left(-\frac{x^2+y^2}{2 \sigma^2}\right)}$ (3)
where, the parameter $\sigma$ represents the scale (spread) of the Gaussian and A is a normalization constant determined by the Gaussian function integration, satisfying Eq. (4).
$\iint G(x, y)\, dx\, dy=1$ (4)
The final illumination is calculated as Eq. (5)
$I_f(p, q)=I_m(p, q) \times G(p, q)$ (5)
where, Im is the maximum channel illumination and If is the final illumination of the input image. We applied a multiscale Gaussian method with varied scales to obtain the estimated illumination components, and we applied weights to this method to retain the constant and original characteristics of the distinct illumination components. There are three standard parameters of the Gaussian function, shared by all color channels (R, G, B) of a particular image. Using these parameters, we calculated the final estimated illumination of the image, as given in Eq. (6):
$I_f(p, q)=\sum_{i=1}^j \beta_i\left(I_{m i}(p, q) \times G_i(p, q)\right)$ (6)
where, $\beta_i$ denotes the R, G, and B channel weight coefficients, which are generated by the Gaussian function, and j is the total number of image channels. After the illumination components were estimated at the standard parameters, the weight was set to 1/3 for each of the three standard parameters of the Gaussian function.
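The Python sketch below illustrates Eqs. (1)-(6), under our reading that the product with the Gaussian in Eqs. (5)-(6) denotes Gaussian smoothing (convolution), as in multiscale retinex. The patch size and the three scale values are assumptions, and each weight is fixed to 1/3 as stated above; scipy's gaussian_filter uses a normalized kernel, which satisfies Eq. (4).

```python
import numpy as np
from scipy.ndimage import grey_dilation, gaussian_filter

def coarse_illumination(img, patch=15):
    # Eq. (1): local maximum over a patch for each channel,
    # Eq. (2): pixel-wise maximum over the three channels.
    per_channel = np.stack([grey_dilation(img[..., c], size=(patch, patch))
                            for c in range(3)], axis=-1)
    return per_channel.max(axis=-1)

def final_illumination(img, sigmas=(15, 80, 250), betas=(1/3, 1/3, 1/3)):
    # Eq. (6): weighted sum of Gaussian-smoothed coarse illumination
    # at three assumed scales (sigmas), with weights beta_i = 1/3.
    I_m = coarse_illumination(img)
    return sum(b * gaussian_filter(I_m, s) for b, s in zip(betas, sigmas))
```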
2.1.3 Estimation of reflectance
The aim of estimating the reflectance of an image is to recover the intrinsic properties of the surfaces in the image, independent of lighting conditions. The reflectance is the ratio of the light reflected by a surface to the incoming light, so its values lie in the range 0 to 1. We divided the input image by the illumination of that image (Eq. (7)) to calculate the reflectance.
$\text{Reflectance} = \text{Input image} \,/\, \text{Illumination of the image}$ (7)
2.1.4 Image reconstruction
After calculating the illumination and reflectance, we reconstructed the image using the following formula (Eq. (8)).
Image reconstruction = Final estimated illumination $\times$ Reflectance (8)
The reconstructed image is our second output (output2).
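A small sketch of Eqs. (7)-(8) follows; the epsilon guarding against division by zero is an addition here, and the enhanced (final) illumination is passed in as an argument since it comes from the estimation step above.

```python
import numpy as np

def reconstruct(img, illum_est, illum_final, eps=1e-6):
    # Eq. (7): reflectance = input image / illumination (values near [0, 1]).
    reflectance = img / (illum_est[..., None] + eps)
    # Eq. (8): output2 = final estimated illumination x reflectance.
    return np.clip(illum_final[..., None] * reflectance, 0.0, 1.0)
```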
We then combined both outputs (the Phase-1 output and the Phase-2 output) using PCA-based image fusion to obtain the final image. We used PCA to calculate the weight coefficients (w1 and w2), which were used to form the weighted sum of the output images O1 and O2 obtained from Phase 1 and Phase 2, respectively. PCA also serves to reduce dimensionality. The formula for fusing the images is given in Eq. (9):
$\mathrm{I}_{\mathrm{Fus}}=\mathrm{w}_1 \mathrm{O}_1+\mathrm{w}_2 \mathrm{O}_2$ (9)
where, IFus is the fused image, w1 and w2 (as per Eq. (10) and Eq. (11)) are weight coefficients. O1 is phase-1 output image and O2 is the phase-2 output image.
$\mathrm{w}_1=\operatorname{EV}(1) /(\mathrm{EV}(1)+\mathrm{EV}(2))$ (10)
$\mathrm{w}_2=\mathrm{EV}(2) /(\mathrm{EV}(1)+\mathrm{EV}(2))$ (11)
where, EV is the principal eigenvector (the eigenvector of the covariance matrix of the two source images corresponding to the largest eigenvalue), and EV(1) and EV(2) are its components.
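A minimal sketch of the fusion step in Eqs. (9)-(11) is given below. It computes the weights from the principal eigenvector of the 2×2 covariance matrix of the two flattened outputs, a standard PCA-fusion recipe; the text does not spell out this detail, so treat it as an assumption.

```python
import numpy as np

def pca_fuse(o1, o2):
    cov = np.cov(np.stack([o1.ravel(), o2.ravel()]))  # 2x2 covariance matrix
    vals, vecs = np.linalg.eigh(cov)                  # ascending eigenvalues
    ev = np.abs(vecs[:, -1])                          # principal eigenvector EV
    w1 = ev[0] / ev.sum()                             # Eq. (10)
    w2 = ev[1] / ev.sum()                             # Eq. (11)
    return w1 * o1 + w2 * o2                          # Eq. (9): fused image
```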
Figure 3 shows the different stages of the enhancement process for one lung image.
Figure 3. Steps of image enhancement
2.2 Image segmentation using the U-Net model
We applied a modified U-Net model for segmentation of the original images as well as the enhanced images. The suggested model is shown in Figure 4. The U-Net model has a U shape and is designed for semantic segmentation [21]. It is the combination of two paths (a contracting path and an expansive path). In the contracting path, a typical CNN architecture is applied: there are multiple blocks of two 3×3 convolution operations, each followed by a ReLU function and a max-pooling operation (for down-sampling) with stride 2. During down-sampling, the number of feature channels is doubled. In the expansive path, the feature map is up-sampled using 2×2 up-convolution operations.
Figure 4. Modified U-Net model
We modified the U-Net model by using 19 convolution layers, 4 max-pooling layers, 4 convolution transpose layers, and 4 concatenation layers, with an input image size of 256×256. The parameters used in this model are described in Table 2.
Table 2. Modified U-Net model parameters
| Name of Parameter | Description |
|---|---|
| Convolution layers | 19 |
| Max-pooling layers | 4 |
| Stride | 2 |
| Convolution transpose layers | 4 |
| Concatenation layers | 4 |
| Activation function | ReLU |
| Total parameters | 7,759,521 |
| Trainable parameters | 7,759,521 |
| Optimizer | Adam |
| Image size | 256×256 |
In this U-Net model, the input size is 256×256. During down-sampling, every two convolutional layers are followed by a max-pooling layer; during up-sampling, convolution transpose layers and concatenation layers are used. Each concatenation layer connects the output of a convolution transpose layer with the output of the corresponding convolution layer from the down-sampling phase, and each concatenation layer is followed by two convolution layers.
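A Keras sketch consistent with Table 2 is given below. The filter widths (32, 64, 128, 256, 512) and the sigmoid output are assumptions not stated in the text, but with a 256×256×1 input they reproduce the 7,759,521 trainable parameters reported in Table 2, along with the 19 convolution, 4 max-pooling, 4 transpose, and 4 concatenation layers.

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, each followed by ReLU.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def build_modified_unet(input_shape=(256, 256, 1), base=32):
    inputs = layers.Input(input_shape)
    skips, x = [], inputs
    for f in (base, base * 2, base * 4, base * 8):          # contracting path
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(pool_size=2, strides=2)(x)  # 4 max-pooling layers
    x = conv_block(x, base * 16)                            # bottleneck
    for f, skip in zip((base * 8, base * 4, base * 2, base), reversed(skips)):
        x = layers.Conv2DTranspose(f, 2, strides=2, padding="same")(x)
        x = layers.concatenate([x, skip])                   # 4 concatenation layers
        x = conv_block(x, f)                                # expansive path
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # 19th conv layer
    return Model(inputs, outputs)

model = build_modified_unet()
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()  # reports 7,759,521 trainable parameters with base=32
```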
We worked on lung images from different datasets for enhancement. We selected lung images from the dataset available on GitHub [22]: this COVID-19 CT database contains 349 CT images with clinical findings of COVID-19, collected from 216 patients. We also selected some images from the IQ-OTH/NCCD lung cancer database [23], which includes 1,190 images showing CT scan slices of 110 patients. We also selected lung images from a dataset available on Kaggle [24].
The Kaggle dataset is a collection of 2-dimensional and 3-dimensional lung images with corresponding segmentation masks. We used only the 2-dimensional lung images from this dataset for enhancement and segmentation.
We applied our enhancement techniques to lung images from the different datasets to improve their quality. After applying the model, we calculated the discrete entropy of the input and enhanced images, the peak signal-to-noise ratio (PSNR), the gradient magnitude similarity deviation (GMSD), and the multi-scale contrast similarity deviation (MCSD).
4.1 PSNR
The PSNR is the peak signal-to-noise ratio calculated between two images and is expressed in decibels. This parameter reflects the difference in quality between the original and reconstructed images [25]. The formula is shown in Eq. (12).
$PSNR=10 \log _{10}\left(\frac{R^2}{MSE}\right)$ (12)
where, R denotes the maximum possible pixel value (dynamic range) of the input image and MSE is the mean squared error.
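For reference, a short Python sketch of Eq. (12) and of the discrete (Shannon) entropy used in Table 4 is given below, assuming 8-bit images (so R = 255).

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    # Eq. (12): R is the dynamic range (255 for 8-bit images).
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def discrete_entropy(img):
    # Shannon entropy (in bits) of the 8-bit intensity histogram.
    hist, _ = np.histogram(img, bins=256, range=(0, 256), density=True)
    p = hist[hist > 0]
    return -np.sum(p * np.log2(p))
```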
4.2 GMSD
In gradient magnitude similarity deviation (GMSD), the pixel-wise gradient magnitude similarity is calculated to determine local image quality [26]. The standard deviation of the complete GMS map is then used as the final image quality index. GMSD performs well in terms of both efficiency and accuracy.
The GMSD formula is given in Eq. (13):
$\mathrm{GMSD}=\sqrt{\frac{1}{\mathrm{M}} \sum_{\mathrm{i}=1}^{\mathrm{M}}\left(\mathrm{GMS}_{\mathrm{i}}-\mu_{\mathrm{GMS}}\right)^2}$ (13)
where, M denotes the total number of pixels in the image. $\mathrm{GMS}_{\mathrm{i}}$ is the gradient magnitude similarity (GMS) at pixel i, which measures the similarity between the gradient magnitudes of the reference and distorted images at that pixel. $\mu_{\mathrm{GMS}}$ is the mean value of the gradient magnitude similarity across all pixels. The term inside the square root is the variance of the GMS values across the image, which captures local quality variations.
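A compact sketch of Eq. (13) follows, using Prewitt gradients and the stabilizing constant from the original GMSD formulation; the original method also downsamples both images by a factor of 2 first, which is omitted here for brevity.

```python
import numpy as np
from scipy.ndimage import convolve

def gmsd(ref, dist, c=170.0):
    # Prewitt kernels for horizontal and vertical gradients.
    px = np.array([[1.0, 0.0, -1.0]] * 3) / 3.0
    py = px.T
    def grad_mag(x):
        return np.sqrt(convolve(x, px) ** 2 + convolve(x, py) ** 2)
    g1 = grad_mag(ref.astype(np.float64))
    g2 = grad_mag(dist.astype(np.float64))
    gms = (2.0 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)  # per-pixel GMS_i
    return gms.std()  # Eq. (13): deviation pooling
```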
4.3 MCSD
The MCSD extracts contrast features by means of a multi-scale representation [27]. The reasoning is that a multi-scale method integrates image details at different resolutions, and contrast is related to the viewing distance.
The MCSD formula is given in Eq. (14):
$\mathrm{MCSD}=\sqrt{\frac{1}{\mathrm{M}} \sum_{\mathrm{i}=1}^{\mathrm{M}}\left(\mathrm{CS}_{\mathrm{i}}-\mu_{\mathrm{CS}}\right)^2}$ (14)
where, M denotes the total number of pixels in the image. $\mathrm{CS}_{\mathrm{i}}$ is the contrast similarity at pixel i, which compares the contrast of the reference and distorted images at that pixel. $\mu_{\mathrm{CS}}$ is the mean contrast similarity across all pixels. The term inside the square root is the variance of the contrast similarity across the image.
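A simplified sketch of Eq. (14) is shown below, measuring contrast as the local standard deviation and applying deviation pooling at several scales. The window size, the stabilizing constant k, and the simple averaging across scales are assumptions; the exact settings in the original MCSD paper may differ.

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom

def local_contrast(x, size=3):
    # Local RMS contrast: square root of the local variance.
    mu = uniform_filter(x, size)
    return np.sqrt(np.clip(uniform_filter(x * x, size) - mu * mu, 0.0, None))

def mcsd(ref, dist, scales=3, k=200.0):
    r, d, scores = ref.astype(np.float64), dist.astype(np.float64), []
    for _ in range(scales):
        c1, c2 = local_contrast(r), local_contrast(d)
        cs = (2.0 * c1 * c2 + k) / (c1 ** 2 + c2 ** 2 + k)  # per-pixel CS_i
        scores.append(cs.std())            # Eq. (14) at this scale
        r, d = zoom(r, 0.5), zoom(d, 0.5)  # move to the next (coarser) scale
    return float(np.mean(scores))          # assumed pooling across scales
```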
4.4 AGCWD
The existing adaptive gamma correction with weighted distribution (AGCWD) model is an image enhancement technique commonly applied to enhance the contrast of biomedical images. AGCWD adjusts the gamma correction dynamically, based on the weighted distribution of pixel intensities, which helps in handling various illumination conditions. During this process, a histogram of pixel intensities is calculated and weighted to emphasize specific ranges. The model computes a per-intensity gamma value from the weighted histogram, and the resulting gamma correction is applied to each pixel, producing an image with improved contrast and visibility.
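A minimal sketch of this baseline for 8-bit grayscale images follows, based on the published AGCWD recipe; the weighting exponent alpha is a tunable assumption.

```python
import numpy as np

def agcwd(img, alpha=0.5):
    # img: uint8 grayscale image.
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    pdf = hist / hist.sum()
    pdf_w = pdf.max() * ((pdf - pdf.min()) / (pdf.max() - pdf.min())) ** alpha
    cdf_w = np.cumsum(pdf_w) / pdf_w.sum()   # weighted distribution
    gamma = 1.0 - cdf_w                      # per-intensity gamma value
    levels = np.arange(256) / 255.0
    lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
    return lut[img]                          # apply the gamma LUT per pixel
```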
Table 3. Enhancement using dehazing and estimation of illumination for lung images of different datasets
| Image | Input Image | Output 1 | Output 2 | Final Enhanced Image |
|---|---|---|---|---|
| Lungs-1 | [image] | [image] | [image] | [image] |
| Lungs-2 | [image] | [image] | [image] | [image] |
| Lungs-3 | [image] | [image] | [image] | [image] |
| Lungs-4 | [image] | [image] | [image] | [image] |
Table 3 shows the intermediate and final enhanced forms of lung images from different datasets. Table 4 compares the discrete entropy of the enhanced images and the PSNR, GMSD, and MCSD scores of our proposed method with those of the adaptive gamma correction with weighted distribution (AGCWD) method. With our proposed enhancement method, the entropy and GMSD scores are always better than those of the AGCWD method. High entropy is often desirable for images with complex textures, and a lower GMSD value indicates improved image quality.
Table 4. Evaluation of discrete entropy, PSNR, GMSD, and MCSD after enhancing the lung images and comparison with AGCWD method
| Images | Original Image Discrete Entropy | AGCWD: Enhanced Image Discrete Entropy | AGCWD: PSNR | AGCWD: GMSD | AGCWD: MCSD | Proposed: Enhanced Image Discrete Entropy | Proposed: PSNR | Proposed: GMSD | Proposed: MCSD |
|---|---|---|---|---|---|---|---|---|---|
| Lungs-1 | 6.2561 | 5.5019 | 21.56 | 0.9298 | 0.0681 | 6.433 | 14.0999 | 0.8493 | 0.1217 |
| Lungs-2 | 6.0206 | 5.96 | 13.3257 | 0.9423 | 0.0613 | 6.8206 | 15.019 | 0.8579 | 0.0645 |
| Lungs-3 | 5.8734 | 5.8101 | 13.6510 | 0.9351 | 0.0626 | 6.7059 | 14.9504 | 0.8749 | 0.0564 |
| Lungs-4 | 5.8492 | 5.7872 | 14.2557 | 0.9520 | 0.0577 | 6.5474 | 16.2763 | 0.8902 | 0.0605 |
| Lungs-5 | 5.774 | 5.7433 | 13.2609 | 0.9412 | 0.0679 | 6.6642 | 14.3944 | 0.8521 | 0.0643 |
| Lungs-6 | 2.1947 | 2.0573 | 47.6525 | 0.9929 | 0.0021 | 3.2907 | 19.1305 | 0.9084 | 0.128 |
| Lungs-7 | 2.4178 | 2.0583 | 41.39 | 0.9901 | 0.0016 | 2.8421 | 15.0056 | 0.8701 | 0.1509 |
| Lungs-8 | 2.1608 | 1.5544 | 43.35 | 0.9875 | 0.0037 | 3.0391 | 20.6371 | 0.9044 | 0.0814 |
| Lungs-9 | 2.7352 | 2.2333 | 42.3278 | 0.9926 | 0.0015 | 3.4020 | 14.0282 | 0.8269 | 0.2040 |
| Lungs-10 | 2.8059 | 2.32 | 42.0025 | 0.99 | 0.0023 | 3.587 | 14.4643 | 0.8261 | 0.1603 |
4.5 Dice-coefficient
For segmentation accuracy, we calculated the Dice-coefficient as per Eq. (15). It is a similarity metric commonly used in biomedical image segmentation to estimate the accuracy of segmentation algorithms by measuring the overlap between the predicted segmentation and the ground truth. Overall, it shows the similarity between two sets. The value of this coefficient ranges from 0 to 1.
Dice coefficient $=\frac{2|\mathrm{A} \cap \mathrm{P}|}{|\mathrm{A}|+|\mathrm{P}|}$ (15)
where, A is the actual (ground-truth) segmentation and P is the predicted segmentation.
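A minimal NumPy sketch of Eq. (15) for binary masks is given below; the small epsilon, an addition here, guards against empty masks.

```python
import numpy as np

def dice_coefficient(actual, predicted, eps=1e-7):
    # Eq. (15): 2|A ∩ P| / (|A| + |P|) for binary masks.
    a = actual.astype(bool)
    p = predicted.astype(bool)
    return (2.0 * np.logical_and(a, p).sum() + eps) / (a.sum() + p.sum() + eps)
```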
4.6 Modified U-Net model
After enhancing the images, we applied the modified U-Net to both the original and the enhanced images for segmentation. Before applying the U-Net model, all images were resized to 256×256. A comparison of both types of images (original and enhanced) revealed that the segmentation accuracy was better for the enhanced images.
Figure 5. Loss and Dice-coefficient diagrams for the original images (80:20)
Figure 6. Loss and Dice-coefficient diagrams for the enhanced images (80:20)
The diagrams in Figures 5 and 6 show the loss and the Dice-coefficient versus epochs for the original and enhanced images, respectively, with a training-to-test data ratio of 80:20. The diagrams in Figures 7 and 8 show the same relationships for the 75:25 ratio, and those in Figures 9 and 10 for the 70:30 ratio.
Figure 7. Loss and Dice-coefficient diagrams for the original images (75:25)
Figure 8. Loss and Dice-coefficient diagrams for the enhanced images (75:25)
Figure 9. Loss and Dice-coefficient diagrams for the original images (70:30)
Figure 10. Loss and Dice-coefficient diagrams for the enhanced images (70:30)
The U-Net model was applied to the lung images and their masks, which are available in the Kaggle dataset; the model was then applied to the corresponding enhanced images. We split the training and test data in different ratios (80:20, 75:25, and 70:30). After calculating the Dice-coefficient for each data split, we took the average of all values. The model was trained for 30 epochs. When the model was applied to the original dataset, the average Dice-coefficient was 0.9695 (Table 5); when the model was applied to the enhanced dataset, the average Dice-coefficient was 0.9797 (Table 6). We also compared our results with those of other authors' models; the comparison is shown in Table 7.
Table 5. Segmentation result for original lung images
| Data Set | Training:Testing Data Ratio | Epochs | Dice-Coefficient |
|---|---|---|---|
| Kaggle 2D lung images | 80:20 | 30 | 0.9706 |
| | 75:25 | 30 | 0.9723 |
| | 70:30 | 30 | 0.9658 |
| | Average | | 0.9695 |
Table 6. Segmentation result for enhanced lung images
| Data Set | Training:Testing Data Ratio | Epochs | Dice-Coefficient |
|---|---|---|---|
| Kaggle 2D lung images | 80:20 | 30 | 0.9783 |
| | 75:25 | 30 | 0.9883 |
| | 70:30 | 30 | 0.9726 |
| | Average | | 0.9797 |
Table 7. Result comparison with other works
| Author Name | Data Set | Model | Test Size | Dice-Coefficient Index |
|---|---|---|---|---|
| Skourt et al. [7] | Lung Image Database Consortium image collection (LIDC-IDRI) | U-Net | N/A | 0.9502 |
| Shamim et al. [8] | COVID-19 CT image dataset | Modified U-Net | 20 | 0.9246 |
| Saood et al. [10] | Collection of the Italian Society of Medical and Interventional Radiology | U-Net | 18 | 0.733 |
| Gite et al. [11] | Montgomery and Shenzhen datasets | U-Net++ | 20 | 0.9796 |
| Riaz et al. [13] | Medical Segmentation Decathlon Challenge (MSD) | MobileNetV2 and U-Net | 25 | 0.8793 |
| Mique and Malicdem [15] | National Institute of Health Dataset | U-Net | N/A | 0.9496 |
| Hofmanninger et al. [20] | VISCERAL Anatomy3 (VISC-36), LTRC (LTRC-36), and (LCTSC-36) | U-Net | 20 | 0.97 |
| Our Method | Kaggle Dataset | U-Net | 25 | 0.9883 (for enhanced images in 75:25 ratio) |
This paper reveals that when proper preprocessing and enhancement techniques are applied before segmentation, the segmentation results improve. We proposed a model for enhancing images in which complement, dehazing, estimation of illumination by the maximum RGB method, and image reflectance were computed; finally, PCA-based image fusion was applied. We calculated the discrete entropy, PSNR, GMSD, and MCSD of the enhanced images and compared these parameters with the existing AGCWD model; we found that our model performs better than the AGCWD model, especially on the entropy and GMSD parameters. Greater entropy is generally preferred, as it signifies richer information content, and GMSD assesses image quality with high accuracy by evaluating the perceptual quality of a distorted image in relation to a reference image. After enhancement, we applied the modified U-Net for segmentation of the original and enhanced images from the Kaggle dataset and found that the enhanced images outperform the original images with respect to the Dice-coefficient.
In conclusion, in this work some newly developed image preprocessing and segmentation methods have been applied to a particular publicly available standard lung cancer dataset. The proposed methodology still needs to be tested and validated on new datasets, especially from the Indian geographical region. Although our proposed model outperforms the existing deep learning models for CT-scan-image-based lung cancer classification, there is still scope to explore more new image preprocessing and segmentation methods to strengthen it. As another direction for future work, this model can also be applied or extended to other kinds of medical applications, such as brain tumor detection, anemia detection, and bone fracture detection.
[1] Yin, X.X., Sun, L., Fu, Y., Lu, R., Zhang, Y. (2022). U-Net-based medical image segmentation. Journal of Healthcare Engineering, 2022(1): 4189781. https://doi.org/10.1155/2022/4189781
[2] Rana, S.B., Rana, S.B. (2015). A review of medical image enhancement techniques for image processing. International Journal of Current Engineering and Technology, 5(2): 1282-1286. https://doi.org/10.14741/Ijcet/22774106/5.2.2015.121
[3] Wang, C., Ye, Z. (2005). Brightness preserving histogram equalization with maximum entropy: A variational perspective. IEEE Transactions on Consumer Electronics, 51(4): 1326-1334. https://doi.org/10.1109/TCE.2005.1561863
[4] Chinegeram, K., Kama, R., Reddy, G.R. (2020). Enhancement and segmentation of medical images using AGCWD and ORACM. International Journal of Online and Biomedical Engineering (IJOE), 16(13): 45-57. https://doi.org/10.3991/ijoe.v16i13.18501
[5] Abdullah, M.F., Sulaiman, S.N., Osman, M.K., Karim, N.A., Setumin, S., Isa, I.S. (2022). A new procedure for lung region segmentation from computed tomography images. International Journal of Electrical and Computer Engineering (IJECE), 12(5): 4978-4987. http://doi.org/10.11591/ijece.v12i5.pp4978-4987
[6] Gupta, K.K., Dhanda, N., Kumar, U. (2020). A novel hybrid method for segmentation and analysis of brain MRI for tumor diagnosis. Advances in Science, Technology and Engineering Systems Journal, 5(3): 16-27. http://doi.org/10.25046/aj050303
[7] Skourt, B.A., El Hassani, A., Majda, A. (2018). Lung CT image segmentation using deep neural networks. Procedia Computer Science, 127: 109-113. https://doi.org/10.1016/j.procs.2018.01.104
[8] Shamim, S., Awan, M.J., Mohd Zain, A., Naseem, U., Mohammed, M.A., Garcia-Zapirain, B. (2022). Automatic COVID-19 lung infection segmentation through modified Unet model. Journal of Healthcare Engineering, 2022(1): 6566982. https://doi.org/10.1155/2022/6566982
[9] Lee, B., Yamanakkanavar, N., Choi, J.Y. (2020). Automatic segmentation of brain MRI using a novel patch-wise U-Net deep architecture. Plos One, 15(8): e0236493. https://doi.org/10.1371/journal.pone.0236493
[10] Saood, A., Hatem, I. (2021). COVID-19 lung CT image segmentation using deep learning methods: U-Net versus SegNet. BMC Medical Imaging, 21: 1-10. https://doi.org/10.1186/s12880-020-00529-5
[11] Gite, S., Mishra, A., Kotecha, K. (2023). Enhanced lung image segmentation using deep learning. Neural Computing and Applications, 35(31): 22839-22853. https://doi.org/10.1007/s00521-021-06719-8
[12] Nazir, I., Haq, I.U., AlQahtani, S.A., Jadoon, M.M., Dahshan, M. (2023). Machine learning-based lung cancer detection using multiview image registration and fusion. Journal of Sensors, 2023(1): 6683438. https://doi.org/10.1155/2023/6683438
[13] Riaz, Z., Khan, B., Abdullah, S., Khan, S., Islam, M.S. (2023). Lung tumor image segmentation from computer tomography images using MobileNetV2 and transfer learning. Bioengineering, 10(8): 981. https://doi.org/10.3390/bioengineering10080981
[14] Surono, S., Rivaldi, M., Dewi, D.A., Irsalinda, N. (2023). New approach to image segmentation: U-Net convolutional network for multiresolution CT image lung segmentation. Emerging Science Journal, 7(2): 498-506. https://doi.org/10.28991/ESJ-2023-07-02-014
[15] Mique, E., Malicdem, A. (2020). Deep residual U-Net based lung image segmentation for lung disease detection. IOP Conference Series: Materials Science and Engineering, 803(1): 012004. https://doi.org/10.1088/1757-899X/803/1/012004
[16] Abdullah, M.F., Mansor, M.S., Sulaiman, S.N., Osman, M.K., Marzuki, N.N.S.M., Isa, I.S., Karim, N.K.A., Shuaib, I.L. (2019). A comparative study of image segmentation technique applied for lung cancer detection. In 2019 9th IEEE International Conference on Control System, Computing and Engineering (ICCSCE), Penang, Malaysia, pp. 72-77. https://doi.org/10.1109/ICCSCE47578.2019.9068574
[17] Khan, A. R., Khan, S., Harouni, M., Abbasi, R., Iqbal, S., Mehmood, Z. (2021). Brain tumor segmentation using K-means clustering and deep learning with synthetic data augmentation for classification. Microscopy Research and Technique, 84(7): 1389-1399. https://doi.org/10.1002/jemt.23694
[18] Hua, L., Gu, Y., Gu, X., Xue, J., Ni, T. (2021). A novel brain MRI image segmentation method using an improved multi-view fuzzy c-means clustering algorithm. Frontiers in Neuroscience, 15: 662674. https://doi.org/10.3389/fnins.2021.662674
[19] Geetha Pavani, P., Biswal, B., Sairam, M.V.S., Bala Subrahmanyam, N. (2021). A semantic contour based segmentation of lungs from chest X-rays for the classification of tuberculosis using Naïve Bayes classifier. International Journal of Imaging Systems and Technology, 31(4): 2189-2203. https://doi.org/10.1002/ima.22556
[20] Hofmanninger, J., Prayer, F., Pan, J., Röhrich, S., Prosch, H., Langs, G. (2020). Automatic lung segmentation in routine imaging is primarily a data diversity problem, not a methodology problem. European Radiology Experimental, 4: 1-13. https://doi.org/10.1186/s41747-020-00173-2
[21] Ronneberger, O., Fischer, P., Brox, T. (2015). U-Net: Convolutional networks for biomedical image segmentation. arXiv preprint arXiv:1505.04597.
[22] COVID-CTset: A Large COVID-19 CT Scans dataset containing 63849 images from 377 patients. https://github.com/mr7495/COVID-CTset.
[23] The IQ-OTH/NCCD lung cancer dataset. (2023). https://www.kaggle.com/datasets/hamdallak/the-iqothnccd-lung-cancer-dataset, accessed on Apr. 2025.
[24] 2D & 3D lung segmentation. (2020). https://www.kaggle.com/code/azaemon/2d-3d-lung-segmentation/input, accessed on Apr. 2025.
[25] Poobathy, D., Chezian, R.M. (2014). Edge detection operators: Peak signal to noise ratio based comparison. IJ Image, Graphics and Signal Processing, 10: 55-61. https://doi.org/10.5815/ijigsp.2014.10.07
[26] Benazir, B.A., Anandraj, D., Arunraj, G., Anitha, D., Swathi, S. (2017). Improving image quality using gradient magnitude similarity deviation. International Journal of Computer Science and Information Technology Research, 5(2): 18-23. https://www.researchpublish.com/papers/improving-image-quality-using-gradient-magnitude-similarity-deviation
[27] Wang, T., Zhang, L., Jia, H., Li, B., Shu, H. (2016). Multiscale contrast similarity deviation: An effective and efficient index for perceptual image quality assessment. Signal Processing: Image Communication, 45: 1-9. https://doi.org/10.1016/j.image.2016.04.005