Empirical Investigations to Skin Lesion Detection Using DenseNet Convolutional Neural Network

Kodepogu Koteswara Rao*, Kommuri Rohith, Mukkapati Rohith, Muttavarapu Saravana Chakradhar, Mukthineni Greeshmanth, Gaddala Lalitha Kumari, Yalamanchili Surekha

Department of CSE, PVP Siddhartha Institute of Technology, Vijayawada 520007, Andhra Pradesh, India

Corresponding Author Email: kkrao@pvpsiddhartha.ac.in
Page: 803-809 | DOI: https://doi.org/10.18280/ts.400242
Received: 1 December 2022 | Revised: 20 January 2023 | Accepted: 4 February 2023 | Available online: 30 April 2023

© 2023 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Teledermatology could substantially transform the delivery of dermatological services. It uses telecommunication technologies to transmit medical information to experts who investigate the disease. The goal of our research is to identify skin lesions by classifying image samples of skin lesions obtained from various patients. In this work, input data is taken from the “HAM10000” dataset on Kaggle. The input images are first resized using a computer vision library so that more emphasis falls on the lesion area. The dataset is then split, with 80% used for training and 20% used for testing. The proposed DenseNet model with five convolutional layers is trained for up to 100 epochs on the training dataset. The trained DenseNet model is then tested on the testing dataset, and its accuracy is measured and evaluated. Our experimental investigations focus on detecting the skin lesion type of the input image.

Keywords: 

DenseNet CNN, lesion, image

1. Introduction

Skin cancers are cancers that develop on the skin [1]. They arise from the growth of aberrant cells that have the potential to invade or spread to other parts of the body. The three primary kinds of skin cancer are melanoma, squamous-cell cancer, and basal-cell cancer (BCC). The latter two, together with a few less prevalent skin cancers, are referred to as nonmelanoma skin cancer (NMSC) [2]. Basal-cell carcinoma grows slowly and can harm nearby tissue, but it is unlikely to metastasize or be fatal. The typical symptom is a hard lump with a scaly top, although it can also turn into an ulcer [3]. Melanomas are the most invasive skin cancers. Warning signs include a mole that changes in size, shape, or color, has irregular margins or several colors, or is itchy or bleeding. Exposure to UV light from the Sun accounts for more than 90% of cases. Exposure during childhood is particularly harmful for melanomas and basal-cell cancers [4]. For squamous-cell skin cancers, total exposure matters more than the time at which it occurs. About 20-30% of melanomas develop from moles. People with lighter skin, and those with weakened immune systems due to HIV or medication, are also more vulnerable.

Nonmelanoma skin cancer is the most prevalent kind, affecting at least 2-3 million individuals annually [5]. Due to the lack of accurate numbers, this is only a preliminary approximation. About 80% of nonmelanoma skin cancers are basal cell cancers, while 20% are squamous cell cancers. Rarely do deaths from basal-cell and squamous-cell skin malignancies occur.

Approximately 95000 people in the US alone receive a skin cancer diagnosis every day, and on average two people die from the disease every day [6]. By the age of 70, one in two people will have cancer. The skin health department predicted 196060 recorded cases of melanoma in 2020, 95710 of them noninvasive, an increase of 2% over the current number. The death rate is, however, reduced when skin cancer is detected early. To identify melanoma from dermoscopic images, doctors continue to use traditional techniques. The two most widely used are the "ABCDE" rule and the "7-Point Checklist" [7], which rely primarily on judgments of asymmetry, border, color, size, evolution, inflammation, and changed sensation. However, because benign and malignant lesion tissues have similar pixels and textures, it can be challenging to make a diagnosis by sight alone, which results in a high error rate.

Dermoscopy is a popular image-based technique for skin cancer diagnosis [8]. A dermoscope is an in-vivo imaging device that takes pictures, which dermatologists then analyze in their clinics. Compared to conventional procedures, this imaging technology improves the diagnostic accuracy of skin lesion detection. A skin lesion is a region of skin that is distinct from the surrounding skin. Skin lesions are frequent and can develop as a result of trauma or other skin damage, such as sunburn. Occasionally, they are a symptom of underlying disorders such as infections or autoimmune diseases [9]. Although most skin lesions are benign and noncancerous, they can nonetheless indicate more serious conditions. In our approach, we use the DenseNet model to classify skin lesions accurately and in less time.

Benign Skin Lesions:

Benign skin lesions are noncancerous and generally harmless growths on the skin. Most benign lesions do not require treatment unless they cause irritation or the patient dislikes their appearance. Examples of benign skin lesions include the following (see Figure 1):

  • Acne
  • Birthmarks
  • Body hairs etc.

Figure 1. Benign keratoses

Melanoma Skin Lesions:

Skin cancer refers to malignant lesions of the skin, as shown in Figure 2. Skin cancer is the most prevalent cancer in the US. Some examples of warning signs are:

  • a wound that is chronic.
  • fresh skin growth.
  • alteration of an existing mole or growth.

Figure 2. Melanoma

Figure 3. Melanocytic nevi

Figure 4. Basal cell carcinoma

In our experimental investigations on the HAM10000 dataset, the seven kinds of skin lesions to be identified are the following, shown in Figures 1-7:

  • Melanocytic nevi
  • Melanoma
  • Benign Keratosis
  • Basal Cell Carcinoma
  • Actinic Keratoses
  • Vascular Lesions
  • Dermatofibroma

Figure 5. Actinic keratoses

Figure 6. Vascular lesions

Figure 7. Dermatofibroma

2. Literature Survey
  • Ghalejoogh et al. [10] presented an automated skin lesion detection system. Pre-processing was employed to remove hair from the lesion images. The lesion images were then segmented using Otsu thresholding. Features were extracted based on colour, shape and texture, and selected using wrapper methods. Hierarchical structure-based stacking was used to classify the skin lesions.
  • Al-Masni et al. [11] evaluated a skin lesion CAD system. The accuracy of classification improved when the model was fed with segmented lesion images. ResNet-50 provided better accuracy than the other CNN models.
  • Xie et al. [12] presented a model to classify skin lesions as malignant or benign. The model is designed to work on the incomplete lesion images present in the dataset. Dimensionality reduction using PCA eliminates the noisy features. An ensemble of a BP network and a fuzzy network is used for classification.
  • Hekler et al. [13] presented a model combining human intelligence and artificial intelligence. A CNN model was trained with 11,000 images, and 117 doctors classified the skin lesions. These diagnoses were combined into a classifier. The computation time of the resulting model is higher.
  • Sae-Lim et al. [14] presented a MobileNet CNN model for classification of skin lesions. The efficiency of the classifier is improved by data up sampling and data augmentation. The model was tested on the HAM10000 dataset and gave better accuracy, precision and F-score than existing methods.
  • Kim et al. [15] presented an approach for automatic detection of salient regions. The saliency map is constructed using a linear combination of colours. A trimap is constructed to overcome the limitation of the saliency map. The model performed better on three datasets.
  • Chen et al. [16] presented a method called ESSL, derived from the ELM classifier, to classify multiple classes of skin lesion. ESSL performed better than the SVM. The model can handle the memory issue and can classify skin lesions in large datasets.
  • Tang et al. [17] presented a global-part CNN model, which treats global and local information equally. The G-CNN model extracts the global information of dermoscopy images, and the P-CNN model extracts the local information of the lesion images.
  • Afza et al. [18] presented a model for skin lesion detection using statistical normal distribution and optimal feature selection. Statistical normal distribution is used for segmentation of lesion images. Histogram and colour features are extracted, and the selected features are fed into a CNN model. The best accuracy is obtained for the cubic function.
  • The existing models were tested on only one dataset each and may not work properly for imbalanced datasets. They also take more time for computation. The proposed model is therefore intended to perform accurately across datasets, including imbalanced ones.
3. Problem Statement

Skin cancers are cancers that develop on the skin. They arise from the growth of aberrant cells that have the potential to invade or spread throughout the body. The three main kinds of skin cancer are melanoma, squamous-cell cancer, and basal-cell cancer (BCC). The latter two, together with a few less prevalent skin cancers, are referred to as nonmelanoma skin cancer (NMSC). Basal-cell carcinoma grows slowly and can harm nearby tissue. Melanomas are the most invasive skin cancers.

Skin cancer is not the most dangerous cancer, but late identification can be fatal. It can be cured if it is detected at an early stage. The standard technique for detecting skin cancer is dermoscopy, which uses a dermoscope to examine the lesion. As an alternative to dermoscopy, a deep learning algorithm can be used. The deep learning algorithm used here for image processing and detection is the DenseNet convolutional neural network. Using it, we develop a model that detects the skin lesion type.

The model can be linked with teledermatology, in which medical services are delivered over telecommunication networks. With this technology, patients can learn about their condition at home, without visiting a doctor in person, and then contact the doctor for treatment.

The objective of the proposed research is to develop a low-cost model that can detect a lesion and classify its type quickly from an image taken by the patient on a mobile phone. The cancer can thus be identified at an early stage and the patient can be treated.

  • Dataset:

The dataset used is “HAM10000” [19]. It is a widely used dataset for skin lesion classification and is available on Kaggle. The HAM10000 dataset contains 10,015 images of seven different types of skin lesions. The model is trained with this dataset, a sample of which can be seen in Figure 8.

Figure 8. HAM10000 dataset

Attributes in HAM10000 dataset:

  • lesion_id: lesion_id is the serial number assigned to the skin lesion taken from the patient.
  • image_id: image_id is the id number that is assigned to the image of the skin lesion. There are 10,015 images in the dataset.
  • dx: dx means diagnosis. Diagnosis means the type of skin lesion diagnosed. There are 7 types of skin lesions in the dataset. They are:
  1. Melanocytic nevi represented as ‘nv’ in the dataset.
  2. Melanoma represented as ‘mel’ in the dataset.
  3. Benign Keratosis represented as ‘bkl’ in the dataset.
  4. Basal Cell Carcinoma represented as ‘bcc’ in the dataset.
  5. Actinic keratoses represented as ‘akiec’ in the dataset.
  6. Vascular lesions represented as ‘vasc’ in the dataset.
  7. Dermatofibroma represented as ‘df’ in the dataset.
  • Age: The age of the patient from whom the skin lesion image is taken.
  • Sex: The gender of the person from whom the skin lesion image is taken.
  • Localization: The part of the body from which the image of the skin lesion is collected.
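The attribute list above can be illustrated with a short sketch that parses metadata rows and counts the lesions per diagnosis code. The diagnosis codes and lesion names follow the list above; the lowercase column names, the three sample rows, and all ids and values in them are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Diagnosis codes and lesion names as listed in the attribute description.
DX_NAMES = {
    "nv": "Melanocytic nevi",
    "mel": "Melanoma",
    "bkl": "Benign Keratosis",
    "bcc": "Basal Cell Carcinoma",
    "akiec": "Actinic keratoses",
    "vasc": "Vascular lesions",
    "df": "Dermatofibroma",
}

# Hypothetical rows (made-up ids and values), only to show the layout.
SAMPLE_CSV = """lesion_id,image_id,dx,age,sex,localization
L_0001,IMG_0001,nv,45,male,back
L_0002,IMG_0002,mel,60,female,face
L_0003,IMG_0003,nv,30,female,trunk
"""


def count_by_diagnosis(csv_text):
    """Return a Counter mapping full lesion names to the number of rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(DX_NAMES[row["dx"]] for row in reader)


counts = count_by_diagnosis(SAMPLE_CSV)
```

With the real metadata file, the same counting step would reveal how many images each of the seven classes contributes.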
4. Problem Methodology and Solution

4.1 Proposed methodology

The entire process of the proposed work can be seen in Figure 9. In this work, input data is taken from the “HAM10000” dataset on Kaggle. The input images are first resized using a computer vision library so that more emphasis falls on the lesion area. The dataset is then split, with 80% used for training and 20% used for testing. The proposed DenseNet model with five convolutional layers is trained for up to 100 epochs on the training dataset. The trained DenseNet model is then tested on the testing dataset, and its accuracy is measured and evaluated. Our experimental investigations focus on detecting the skin lesion type of the input image.

Figure 9. Block diagram for proposed methodology

4.2 Modules

Importing the required libraries:

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

from tensorflow.keras.preprocessing.image import ImageDataGenerator

  • NumPy:

NumPy is a Python library for numerical computing. Numerous mathematical operations can be carried out on arrays with NumPy.

  • Pandas:

Pandas is an open-source library designed primarily for using relational or labelled data quickly and easily.

  • Matplotlib:

For 2D displays of arrays, the Python visualization tool Matplotlib is fantastic.

  • Tensorflow:

TensorFlow, an Open-Source deep learning and machine learning library, is useful for voice search, text-based applications, image identification, and many other things.

  • ImageDataGenerator:

ImageDataGenerator is used to apply transformations such as rotation and rescaling to the input images.

Reading the input data:

With the help of pandas, we read the HAM10000 dataset required for our project. HAM10000 (Human Against Machine) is an image dataset published by the Medical University of Vienna. It is used for classifying skin lesions into seven classes. The dataset contains 10,015 skin lesion image samples collected from patients of different age groups and from different parts of the body.

Resizing the images in the dataset:

The most crucial step is resizing the images. Each original image in the dataset is 500 x 500 pixels in size and shows the lesion area, the surrounding healthy skin, and hair. The image is resized so that the lesion region receives more emphasis: each original image is resized to 100 x 100 pixels. This step affects the accuracy of the classification. All 10,015 images must be resized and stored in an array. The resizing is performed with a computer vision library. Image resizing is the basic segmentation technique we use; with it, we focus more on the lesion area and leave out unwanted skin regions. The output of resizing an image can be seen in Figure 10.

Figure 10. Output of resizing the image
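To make the resizing step concrete, the following minimal sketch reduces a 500 x 500 grid to 100 x 100 with nearest-neighbour sampling. The text does not name the routine or interpolation mode of the computer vision library, so this is a stand-in technique chosen for illustration, and the function name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D grid of pixels with nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# A synthetic 500 x 500 "image" whose pixel value encodes its own position,
# so it is easy to see which source pixel each output pixel came from.
original = [[(r, c) for c in range(500)] for r in range(500)]
resized = resize_nearest(original, 100, 100)
```

Here every output pixel copies one source pixel out of each 5 x 5 neighbourhood; library routines usually offer smoother interpolation modes as well.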

Splitting the dataset:

In this step, the dataset is split into a training dataset and a testing dataset. The training dataset is used for training the DenseNet CNN model, and the trained DenseNet model is tested on the testing dataset. In order to avoid underfitting and overfitting, 80% of the dataset is used as the training dataset and 20% as the testing dataset.

The size of the total dataset is 26,115 images of dimension 100 x 100.

The size of our training set is 20,892 images.

The size of our testing dataset is 5,223 images.
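The reported sizes follow directly from the 80/20 rule applied to 26,115 images. A minimal sketch, assuming a random shuffle before splitting (the text does not say how samples were assigned; the seed and the function name `split_dataset` are illustrative):

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of samples and split it into train and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Integer ids stand in for the 26,115 resized images.
train_ids, test_ids = split_dataset(range(26115))
```

The two lists have 20,892 and 5,223 elements, matching the training and testing set sizes given above.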

Training DenseNet CNN model:

The architecture of DenseNet is shown in Figure 11. Convolution and pooling layers alternate, and the number of filters increases as the network moves from left to right. The last stage of the network contains one or more fully connected layers, which classify the output. The DenseNet architecture consists mainly of four types of layers: convolutional layers, dense blocks, pooling layers, and fully connected layers.

Figure 11. DenseNet CNN architecture

  • Dense Block: A dense block directly links all layers that have matching feature-map sizes. To maintain the feed-forward nature, each layer receives additional inputs from all preceding layers and passes its own feature maps to all subsequent layers.
  • Convolution: Convolution is used to identify the relevant features in the input image.
  • Pooling: Pooling is also known as down sampling. It is used to reduce the unwanted features that are extracted.
  • Fully connected: The classification is done by the fully connected layer. The outputs of convolution and pooling pass to the fully connected layer, where they are used to classify the input image.
  • ReLu: ReLu means “Rectified Linear Unit”. The “ReLu” activation function sets the negative features extracted by the model to zero. Positive features are passed unchanged to the next layers.
  • Batch Normalization: Batch normalization is a layer that allows every layer of the network to learn more independently. It normalises the output of the preceding layers, so that the inputs to the next layer are rescaled. Using batch normalization makes learning more effective.
  • Dropout: Dropout is the regularization method used to stop overfitting in the model. With dropout, a certain percentage of the network's neurons are switched off at random. When neurons are switched off, their incoming and outgoing connections are turned off as well. This helps the model learn more effectively.
  • SoftMax: The SoftMax layer is the output layer. It converts the scores produced by the fully connected layer into a probability for each lesion class, and the class with the highest probability is given as the classified output.
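Three of the building blocks in the list above can be written out in a few lines of plain Python. This is a sketch for illustration, not the authors' implementation: softmax is shown in its usual role of turning seven class scores into probabilities, and the dense block values (64 initial channels, growth rate 32, 4 layers) are assumed, since the text does not state them.

```python
import math


def relu(values):
    """Replace negative activations with zero and keep positive ones."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense_block_inputs(initial_channels, growth_rate, num_layers):
    """Input channel count seen by each layer of a dense block.

    Every layer receives the concatenated feature maps of all earlier
    layers, so the input width grows by growth_rate per layer.
    """
    return [initial_channels + i * growth_rate for i in range(num_layers)]


activations = relu([-1.5, 0.0, 2.0])
probabilities = softmax([2.0, 1.0, 0.1, 0.0, -1.0, 0.5, 0.3])  # 7 classes
predicted_class = probabilities.index(max(probabilities))
inputs_per_layer = dense_block_inputs(64, 32, 4)  # assumed sizes
```

The growing input width in `inputs_per_layer` is what distinguishes dense connectivity from a plain stack of convolutions, where each layer sees only the previous layer's output.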

The DenseNet model is trained on the training dataset. First, an input image of dimensions 100 x 100 x 3 is given to the first convolution layer, which has 96 filters. Its output is sent to a max pooling layer for down sampling. The second convolution layer, which contains 256 filters, takes the output of the max pooling layer as input. These operations are repeated until the feature maps are flattened. The dropout factor is 0.5. The activation function used in the hidden layers is ‘ReLu’ and the activation function used in the output layer is ‘SoftMax’.
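The effect of the first two convolution and pooling stages on the tensor shape can be traced with a short sketch. Only the input size and the filter counts (96 and 256) come from the text; the stride-1 'same' padding for convolution, the 2 x 2 pooling window, and pooling after the second convolution are assumptions made here for illustration.

```python
def conv_same(shape, filters):
    """Shape after a stride-1 convolution with 'same' padding (assumed)."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool_2x2(shape):
    """Shape after 2 x 2 max pooling with stride 2 (assumed)."""
    height, width, channels = shape
    return (height // 2, width // 2, channels)


shape = (100, 100, 3)          # input size stated in the text
trace = [shape]
for filters in (96, 256):      # filter counts stated in the text
    shape = conv_same(shape, filters)
    trace.append(shape)
    shape = max_pool_2x2(shape)
    trace.append(shape)

# Length of the vector if the feature maps were flattened at this point.
flattened_length = shape[0] * shape[1] * shape[2]
```

Under these assumptions the spatial size halves at each pooling stage while the channel count follows the filter count, which is why further stages are needed before flattening.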

Testing the model:

The trained model is tested on the testing dataset, which contains 5,223 images. The model correctly predicted 4,244 images.

Measuring the accuracy:

There are seven classes of lesions in the HAM10000 dataset. They are:

  • Melanocytic nevi
  • Melanoma
  • Benign-Keratoses
  • Basal cell carcinoma
  • Actinic keratoses
  • Vascular lesions
  • Dermatofibroma

The accuracy of each class is measured from the predicted output for each lesion. A prediction counts as correct if the predicted type matches the lesion type recorded in the dataset.
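The counting rule described above can be sketched as follows. The label lists are hypothetical examples using the dataset's diagnosis codes; they are not results from the experiments.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class: correct / total} for every class in true_labels."""
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in totals}


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Hypothetical labels for six test images (not results from the paper).
y_true = ["nv", "nv", "mel", "bkl", "mel", "nv"]
y_pred = ["nv", "mel", "mel", "bkl", "nv", "nv"]
class_acc = per_class_accuracy(y_true, y_pred)
total_acc = overall_accuracy(y_true, y_pred)
```

Reporting accuracy per class alongside the overall figure matters for imbalanced data, where a frequent class can dominate the overall number.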

Skin Lesion Detection:

The output of the model is the predicted type of the lesion. The result is displayed in Figure 12.

Figure 12. Skin lesion detection

5. Results

First, the images are resized to a standardized size of 100 x 100. The images are rotated through various angles. Each image is given as input to the first network layer. Operations such as convolution and max pooling are applied to extract features such as color and shape. These operations are repeated until the image is flattened. The batch size is 32 and the number of epochs is 100.
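The batch size and epoch count translate into a number of weight updates as sketched below. This assumes that each epoch passes once over the full training set of 20,892 images and that a final partial batch is kept; neither detail is stated in the text.

```python
import math

TRAIN_IMAGES = 20892   # training set size reported in Section 4.2
BATCH_SIZE = 32        # batch size stated above
EPOCHS = 100           # number of epochs stated above

# The last batch of an epoch may be smaller, hence the ceiling.
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_updates = steps_per_epoch * EPOCHS
```

Under these assumptions one epoch takes 653 steps, so 100 epochs correspond to 65,300 updates.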

Accuracy of the model with no. of epochs is mentioned in Table 1.

Table 1. Accuracy for no. of epochs

S.No | No. of epochs | Accuracy (%)
1    | 20            | 65.64
2    | 50            | 79.01
3    | 80            | 86.63
4    | 100           | 90.02

Figure 13 shows the accuracy of the model for 100 epochs. X-axis shows epochs and Y-axis shows accuracy.

Figure 13. Accuracy of the model

Figure 14 describes the accuracy of model for the type of skin lesion.

Table 2 compares the accuracy of the model with the existing models on three different datasets.

Figure 14. Accuracy for type of skin lesion

Table 2. Comparison of accuracy

Model                       | Dataset   | Accuracy
GP-CNN                      | ISIC 2016 | 89.6%
Proposed                    | ISIC 2016 | 90.02%
GP-CNN                      | ISIC 2017 | 87.56%
Proposed                    | ISIC 2017 | 88.52%
CNN model with Human and AI | HAM10000  | 82.95%
MobileNet CNN               | HAM10000  | 83.93%
Proposed                    | HAM10000  | 90.02%

6. Conclusion and Future Scope

6.1 Conclusion

In this work, a DenseNet CNN model is described for the detection of skin lesions. The HAM10000 dataset, which contains 10,015 skin lesion image samples collected from various people, is used. The original images are resized using a computer vision library and stored, and the DenseNet model is trained with the resized images. Finally, the results of the DenseNet CNN model are compared with the GP-CNN and MobileNet CNN models. The accuracy of DenseNet is higher than that of the GP-CNN and MobileNet CNN models: the accuracy obtained is 90.02% on ISIC 2016, 88.52% on ISIC 2017, and 90.02% on HAM10000.

6.2 Future scope

In this methodology, we used a CNN, and it took considerable time to produce results. To reduce the computational time and give more accurate results, the model can be extended by using a CNN for segmentation. The model can also be deployed as a web app where the user directly uploads an image of the lesion and learns its type, so that the patient can identify the cancer at an early stage.

  References

[1] Rogers, H.W., Weinstock, M.A., Feldman, S.R., Coldiron, B.M. (2015). Incidence estimate of nonmelanoma skin cancer (keratinocyte carcinomas) in the US population, 2012. JAMA Dermatology, 151(10): 1081-1086. https://doi.org/10.1001/jamadermatol.2015.1187

[2] Cancer facts and figures 2020. American Cancer Society. https://www.cancer.org/research/cancer-facts-statistics/all-cancer-facts-figures/cancer-facts-figures-2020.html, accessed on Jan. 2020. 

[3] Abdelhalim, I.S.A., Mohamed, M.F., Mahdy, Y.B. (2021). Data augmentation for skin lesion using self-attention based progressive generative adversarial network. Expert Systems with Applications, 165: 113922. https://doi.org/10.1016/j.eswa.2020.113922

[4] Liu, X., Chen, C.H., Karvela, M., Toumazou, C. (2020). A DNA-based intelligent expert system for personalised skin-health recommendations. IEEE Journal of Biomedical and Health Informatics, 24(11): 3276-3284. https://doi.org/10.1109/JBHI.2020.2978667

[5] Duggani, K., Nath, M.K. (2021). A technical review report on deep learning approach for skin cancer detection and segmentation. In: Khanna, A., Gupta, D., Pólkowski, Z., Bhattacharyya, S., Castillo, O. (eds) Data Analytics and Management. Lecture Notes on Data Engineering and Communications Technologies, vol 54. Springer, Singapore. https://doi.org/10.1007/978-981-15-8335-3_9

[6] Mitra, A., Khaitan, S., Abidi, A.I., Chakraborty, S. (2021). Diagnosing Alzheimer’s disease using deep learning techniques. In: Al-Turjman, F., Kumar, M., Stephan, T., Bhardwaj, A. (eds) Evolving Role of AI and IoMT in the Healthcare Market. Springer, Cham. https://doi.org/10.1007/978-3-030-82079-4_5

[7] Carnevale, L., Celesti, A., Fazio, M., Villari, M. (2020). A big data analytics approach for the development of advanced cardiology applications. Information, 11(2): 60. https://doi.org/10.3390/info11020060

[8] Fan, H., Xie, F., Li, Y., Jiang, Z., Liu, J. (2017). Automatic segmentation of dermoscopy images using saliency combined with Otsu threshold. Computers in Biology and Medicine, 85: 75-85. https://doi.org/10.1016/j.compbiomed.2017.03.025

[9] Zhou, S.K., Chellappa, R. (2005). Beyond one still image: Face recognition from multiple still images or a video sequence. Face Processing: Advanced Modeling and Methods, 547-567.

[10] Ghalejoogh, G.S., Kordy, H.M., Ebrahimi, F. (2020). A hierarchical structure based on stacking approach for skin lesion classification. Expert Systems with Applications, 145: 113127. https://doi.org/10.1016/j.eswa.2019.113127

[11] Al-Masni, M.A., Kim, D.H., Kim, T.S. (2020). Multiple skin lesions diagnostics via integrated deep convolutional networks for segmentation and classification. Computer Methods and Programs in Biomedicine, 190: 105351. https://doi.org/10.1016/j.cmpb.2020.105351

[12] Xie, F., Fan, H., Li, Y., Jiang, Z., Meng, R., Bovik, A. (2016). Melanoma classification on dermoscopy images using a neural network ensemble model. IEEE Transactions on Medical Imaging, 36(3): 849-858. https://doi.org/10.1109/TMI.2016.2633551

[13] Hekler, A., Utikal, J.S., Enk, A.H., Hauschild, A., Weichenthal, M., Maron, R.C., Berking, C., Haferkamp, S., Klode, J., Schadendorf, D., Schilling, B., Holland-Letz, T., Izar, B., von Kalle, C., Fröhling, S., Brinker, T.J. (2019). Superior skin cancer classification by the combination of human and artificial intelligence. European Journal of Cancer, 120: 114-121. https://doi.org/10.1016/j.ejca.2019.07.019

[14] Sae-Lim, W., Wettayaprasit, W., Aiyarak, P. (2019). Convolutional neural networks using MobileNet for skin lesion classification. In 2019 16th International Joint Conference on Computer Science and Software Engineering (JCSSE), pp. 242-247. https://doi.org/10.1109/JCSSE.2019.8864155

[15] Kim, J., Han, D., Tai, Y.W., Kim, J. (2015). Salient region detection via high-dimensional color transform and local spatial support. IEEE Transactions on Image Processing, 25(1): 9-23. https://doi.org/10.1109/TIP.2015.2495122

[16] Chen, C., Gan, Y., Vong, C.M. (2020). Extreme semi-supervised learning for multiclass classification. Neurocomputing, 376: 103-118. https://doi.org/10.1016/j.neucom.2019.09.039

[17] Tang, P., Liang, Q., Yan, X., Xiang, S., Zhang, D. (2020). GP-CNN-DTEL: Global-part CNN model with data-transformed ensemble learning for skin lesion classification. IEEE Journal of Biomedical and Health Informatics, 24(10): 2870-2882. https://doi.org/10.1109/JBHI.2020.2977013

[18] Afza, F., Khan, M.A., Sharif, M., Rehman, A. (2019). Microscopic skin laceration segmentation and classification: A framework of statistical normal distribution and optimal feature selection. Microscopy Research and Technique, 82(9): 1471-1488. https://doi.org/10.1002/jemt.23301

[19] Tschandl, P., Rosendahl, C., Kittler, H. (2018). The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Scientific Data, 5(1): 1-9. https://doi.org/10.1038/sdata.2018.161