A Binary Multi Class and Multi Level Classification with Dual Priority Labelling Model for COVID-19 and Other Thorax Disease Detection

Lakshmi Narayana Gumma* | Ramalingam Thiruvengatanadhan | Pattusamy Dhana Lakshmi | Kurakula LakshmiNadh

Department of Computer Science & Engineering, Annamalai University, Tamilnadu 608002, India

Department of Computer Science & Engineering Narasaraopeta Engineering College, Andhra Pradesh 522601, India

Corresponding Author Email: gummalakshminarayana62@gmail.com

Page: 657-664 | DOI: https://doi.org/10.18280/ria.360501

Received: 20 September 2022 | Revised: 10 October 2022 | Accepted: 20 October 2022 | Available online: 23 December 2022

© 2022 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Thorax diseases are most often diagnosed from medical images, and this process is manual and time-consuming. The recent COVID-19 pandemic has demonstrated that machine learning systems can be an excellent option for classifying such medical images. However, confident classification is needed in this context. During COVID-19, the first step is to detect and isolate COVID-19 patients. When it comes to diagnosing and preventing thoracic disorders, nothing beats the convenience and low cost of a chest X-ray. According to expert opinion on screening chest X-rays, abnormalities are most commonly found in the lungs and heart. In practice, however, acquiring region-level annotation is costly, and model training mostly depends on image-level class labels in a weakly supervised way, making computer-aided chest X-ray screening a formidable challenge. Hence, in this work, we propose a binary, multi-class, and multi-level classification model based on the transfer learning models ResNet-50, InceptionNet, and VGG-19. After binary classification, a multi-class classifier is used to identify the class to which an image most likely belongs. Finally, a multi-level classifier is used to determine how many diseases the patient suffers from. This research presents a Binary Multi Class and Multi Level Classification with Dual Priority Labelling (BMCMLC-DPL) model for COVID-19 and other thorax disease detection. Using a state-of-the-art deep neural network (ResNet-50), we show how accurately COVID-19, along with 14 other chest diseases, can be classified. Our classification technique achieved an average training accuracy of 98.6% and a test accuracy of 96.52% for the first level of binary classification. For the second level of 16-class classification, the technique achieved a maximum training accuracy of 91.22% and a test accuracy of 86.634% using ResNet-50. However, due to the lack of multi-level COVID-19 patient data, multi-level classification is performed on only 14 classes, showing the state-of-the-art accuracy of the system.

Keywords: 

chest X-ray images (CXR), CNN, COVID-19 pneumonia detection, deep learning, medical image

1. Introduction

The COVID-19 pandemic, which emerged in 2019, causes significant health concerns. Its symptoms range from mild to severe. Mild symptoms include fever, cough, tiredness, loss of taste and smell, sore throat, etc. Severe symptoms include difficulty in breathing or shortness of breath [1], chest pain, loss of speech and mobility, etc. In extreme cases, COVID-19 mainly affects the lungs [2]. The resulting lung damage, ARDS, pneumonia, etc., are life-threatening complications. The disease becomes especially deadly when the patient has other underlying illnesses, respiratory complications, or both [3].

Radiological imaging techniques, such as chest X-rays (CXR) and computed tomography (CT) scans [4], have been widely employed for diagnosis for quite some time. In contrast to CT scans, however, CXR is more convenient, less expensive, and safer for patients [5]. Predicting the presence of COVID-19 using CXR images can aid in the treatment of the individual patient and prevent the spread of the virus to other members of the community [6]. CXR has also been shown to be effective in the diagnosis of 14 other disorders, all of which we catalogue here. As a result, the patient's primary problems can be identified using multi-class categorization, which employs a total of 15 classes (including COVID-19 infection). We can also identify the other diseases that a patient with COVID-19 [7] is experiencing by using multi-level categorization [8]. As a whole, this type of model can aid in identifying and isolating patients, as well as in the various treatment procedures [9].

The effectiveness of deep learning for such categorization tasks has been demonstrated in prior research. Three transfer learning models with impressive track records on ImageNet [10] and other classification tasks are ResNet-50, InceptionNet, and VGG-19. Accordingly, this work employs these transfer learning models, with modifications, for binary, multi-class [11], and multi-level classification tasks [12]. To test the effectiveness of the proposed approach, we have used the NIH Chest X-ray dataset for 14 thorax disorders [13]. Images of COVID-19 and normal CXRs are sourced from the covid-chest-Xray (Kaggle) dataset, the Radiology Assistant dataset, and GitHub. By performing the necessary pre-processing and data augmentation, we generated normalised, large amounts of data with a wide range of characteristics [14]. First, we used binary classification (COVID-19 vs. the other classes, including normal) to identify COVID-19 patients. For a patient with COVID-19, we then performed multi-class and multi-level categorization [15]. There are 15 categories used in multi-class classification [16]. This step is useful in determining which disorder is causing the patient the most discomfort [17]. Then, we used multi-level classification to identify the patient's complete disease profile. We used 14 different categories of images to train our multi-level classifier [18]. Finally, ResNet-50, InceptionNet, and VGG-19 each underwent multiple runs to ensure reliable and accurate results, ruling out the possibility of random outcomes [19]. The results show that, across the board, ResNet-50 performs the best.

Chest X-ray screening is one of the most accessible and cost-effective radiological tests for screening and clinical diagnosis [20]. In clinical practice, the interpretation of chest X-ray images relies greatly on the skill and experience of radiologists. It takes a lot of time, and there is room for subjective evaluation errors [21]. As a result, a computer-aided disease diagnosis system to assist doctors is highly desirable. In recent years, many proposals have utilised deep learning to automatically diagnose thoracic disorders from chest X-ray images, with impressive results in areas such as disease classification [22], anomaly detection [23], chest X-ray segmentation [24], and disease prediction [25]. Our work focuses on the disease categorization task of computer-aided diagnosis for chest X-rays. Due to the low quality and poor sensitivity of chest X-ray images, the classification process is particularly difficult for computer-aided screening. The numerous thorax diseases are shown in Figure 1.

Figure 1. Thorax diseases representations

Artificial neural network (ANN) architectures [26] are increasingly being incorporated into the development of classification systems for medical diagnosis [27]. When compared to conventional pattern identification strategies, modern neural network architectures such as the multilayer neural network (MLNN) [28], probabilistic neural network (PNN) [29], learning vector quantization (LVQ) [30], generalized regression neural network (GRNN) [17], and radial basis function (RBF) network have proven superior in diagnosing diseases such as chest diseases. Different neural network-based classification algorithms have also been used in the field of diagnosing chest diseases, and studies of chest disorders using neural networks have been conducted previously. In a learning vector quantization neural network, unknown data is classified according to its proximity to the learned prototype models. There are two types of layers in a learning vector quantization neural network: the competitive layer and the linear output layer. The competitive layer is responsible for the classification of input vectors, while the linear output layer maps the classes of the competitive layer to the target classes specified by the user.

The rest of the paper is organized as follows: in Section 2, we have presented the related work. The proposed system is described in Section 3. In Section 4, the experimental setup and result analysis are presented. Finally, with the future work direction, the paper is concluded in Section 5.

2. Literature Survey

Recently, Artificial Intelligence-based systems, specifically deep learning-based systems, have attracted researchers' attention because of their significant capabilities. This section discusses such systems, which can detect COVID-19 to a large extent from normal and other chest-infected medical images.

The fuzzy colouring technique is used for pre-processing, then MobileNetV2 and SqueezeNet are used for feature extraction, and finally, Support Vector Machines (SVM) are used to classify images into three classes (normal, pneumonia, and COVID-19 positive) [1]. In COVID-Net, a deep 121-layer dense convolutional neural network [2] incorporated with CheXNet is used for three-class (normal, pneumonia, COVID-19) and four-class (normal, viral pneumonia, bacterial pneumonia, and COVID-19) classification of images. In CoroDet, a twenty-two-layer convolutional architecture with Sigmoid, ReLU, and Leaky ReLU activations is proposed for two-class (COVID-19 and non-COVID-19), three-class (normal, pneumonia, COVID-19), and four-class (normal, viral pneumonia, bacterial pneumonia, and COVID-19) [3] classification concerning COVID-19. A practical analysis of chest-related diseases using image and textual data is performed with a deep neural network model, along with Generative Adversarial Network (GAN) [4] based synthetic textual data processing, showing the model's contribution to COVID-19 detection. Apostolopoulos and Bessiana demonstrated that CNNs with VGG-19 [5] and MobileNet architectures could automatically detect and extract the essential COVID-19, pneumonia, and normal image features with 97.8% accuracy. Ozturk et al. showed that DarkCovidNet (based on you only look once (YOLO), with seventeen convolutional layers) can be used to classify X-ray images in real time into three classes (COVID-19, pneumonia, and normal) with 87% accuracy, and into two classes (COVID-19 and non-COVID-19) with 98.08% accuracy. A CNN-based model is used to differentiate normal and pneumonia X-ray images [6], and COVID-19 and pneumonia X-ray images. A deep convolutional neural network based on Decompose, Transfer, and Compose, called DeTraC, can be used for three-class classification of COVID-19, normal, and SARS images [7], exploiting its class decomposition advantages to deal with dataset irregularities [8].

While many studies have focused on using CT scans and X-rays to diagnose COVID-19, Mangal et al. [9] used a variety of deep learning approaches to compare X-ray images from the COVID-19 study. Accuracy ranges from 96% to 99% for DenseNet121, ResNet50, VGG16, and VGG19, with VGG16 and VGG19 claiming an overall accuracy of 99.33%. Moreover, Ozturk et al. [10] found that ResNet50 achieved the maximum accuracy of 98.22% when trained on X-ray datasets [11]. However, most studies only use modest sample sizes, and inefficiencies become more apparent as the dataset grows larger. Our research therefore measures the consequences of analysing the X-ray dataset at scale and evaluates its performance. Smit et al. [12] used multiple pre-trained convolutional neural networks (CNNs) on pulmonary X-rays to categorise them into two classes, pneumonia and non-pneumonia, by altering the parameters and number of layers of the CNN. They assessed the accuracy of four different pre-trained models (VGG16, VGG19, ResNet50, and Inception-v3), finding that it ranged from 87.28% to 88.46%.

In a similar vein, a comparison of different deep convolutional neural networks (DCNNs) for tuberculosis prediction reveals that DenseNet201 performs better than competing methods [13]. Further, Wang et al. [14] provide an examination of the use of various CNN algorithms for tuberculosis prediction under a variety of tuberculosis classification schemes. The performance of these algorithms in detecting COVID-19 patients was analysed in [15], which suggested a deep learning-based technique employing DenseNet-121 trained on radiology images produced by the CheXNet model. Two convolutional neural network models, Xception and VGG16, are used for the pneumonia investigation [16]. Martínez Chamorro et al. [17] reviewed the radiologic diagnosis of patients with COVID-19.

To automate the diagnosis of pulmonary tuberculosis from chest X-rays, Hwang et al. [18] trained a deep CNN. Two CNNs, AlexNet and GoogLeNet, were used for this classification task, and the dataset was cleaned prior to analysis. With an accuracy of 0.99, their model was very reliable. In addition to a high degree of sensitivity (97.3%), their model also demonstrated excellent specificity (100%).

3. Proposed Model

3.1 Dataset description

The COVID-19 chest X-ray images and normal-person CXR images are collected from three different sources (Kaggle 2020, Radiology Assistant 2020, and GitHub 2020). These datasets include frontal-view X-ray images of the lungs from three groups: normal patients, patients with bacterial or viral pneumonia, and COVID-19 infected patients.

For the other 14 chest diseases, CXR images are taken from the NIH Chest X-ray Dataset, provided by the National Institutes of Health (NIH). It comprises 112,120 CXR images of 30,805 unique patients, of which eight classes are extensions of the ChestX-ray8 dataset [14, 3]. Hence, in total, 16 classes of data are used: 1. Normal, 2. COVID-19, 3. Atelectasis, 4. Cardiomegaly, 5. Consolidation, 6. Edema, 7. Effusion, 8. Emphysema, 9. Fibrosis, 10. Hernia, 11. Infiltration, 12. Mass, 13. Nodule, 14. Pleural Thickening, 15. Pneumonia, 16. Pneumothorax.

3.2 Pre-processing and augmentation

We have pre-processed and augmented the images to clean and normalize the input images and to deal with class imbalance and image variety problems. Pre-processing is applied to both training and testing data, whereas augmentation is applied only to training data. The pre-processing and augmentation steps are enumerated below, and a brief code sketch follows the list:

1. Image resizing: CXR images are converted into square images of fixed size (224, 224) to be suitable as input for the deep learning system.

2. Image normalization: histogram equalization and contrast limited adaptive histogram equalization (CLAHE) are applied to produce darker and sharper (bone-enhanced) images. This step enhances the images by adjusting the contrast and brightness.

3. Position augmentation: flipping and rotation are applied to the images, and the generated images are saved. The images are flipped horizontally or vertically, and some of the images are rotated at various angles.

4. Random noise: we add salt-and-pepper noise, which converts random pixels to completely black or white, and save the image after noise addition. This reduces overfitting to spurious elements.

5. Color augmentation: the brightness, contrast, hue, and saturation of the image pixels are changed at different ratios, and the generated images are saved. Adjusting the brightness makes the resultant image darker or lighter than the original. The contrast changes the degree of separation between an image's darkest and brightest areas. Saturation is the separation between colours, and applying hue changes the shade in an image.
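The steps above can be sketched as follows; this is a minimal illustration using OpenCV and NumPy, where the parameter values and the split into two functions are assumptions rather than the authors' exact pipeline:

import cv2
import numpy as np

def preprocess(img_gray):
    # 1. Resize to the fixed square input size (224, 224).
    img = cv2.resize(img_gray, (224, 224))
    # 2. CLAHE normalization for sharper, bone-enhanced images.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(img)

def augment(img, rng=np.random.default_rng(0)):
    # 3. Position augmentation: random horizontal flip and small rotation.
    if rng.random() < 0.5:
        img = cv2.flip(img, 1)
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), rng.uniform(-10, 10), 1.0)
    img = cv2.warpAffine(img, m, (w, h))
    # 4. Random salt-and-pepper noise: some pixels become fully black or white.
    mask = rng.random(img.shape)
    img = img.copy()
    img[mask < 0.01] = 0
    img[mask > 0.99] = 255
    # 5. Colour augmentation: simple brightness/contrast jitter.
    return cv2.convertScaleAbs(img, alpha=rng.uniform(0.9, 1.1), beta=rng.uniform(-10, 10))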

The data collection includes 5,856 verified chest X-ray images. Both a training set and a testing set of images from separate patients are used. Each X-ray is labelled with one or more diagnoses, indicating possible relationships between conditions. Using horizontal and vertical gradients, the Sobel filter is used to identify borders between the top and bottom of the image, or the left and right sides of the frame, and reveals whether the image's transitions are sudden or gradual. The Scharr operator is an adjustment of the Sobel kernel that minimises the mean squared angular error, which is both an optimisation and an increase in sensitivity; the resulting derivative kernels more closely approximate Gaussian derivative filters for image pre-processing. The emphasised borders and increased visibility of the skeleton provide crucial visual data for the CNN model.
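For illustration, a gradient-based edge emphasis of this kind might look like the following OpenCV sketch; it is an assumption of how the described Sobel/Scharr filtering could be implemented, not the authors' exact code:

import cv2
import numpy as np

def edge_emphasis(img_gray, use_scharr=False):
    # Horizontal and vertical gradients; the Scharr kernel is the more
    # sensitive variant of the 3x3 Sobel kernel.
    if use_scharr:
        gx = cv2.Scharr(img_gray, cv2.CV_32F, 1, 0)
        gy = cv2.Scharr(img_gray, cv2.CV_32F, 0, 1)
    else:
        gx = cv2.Sobel(img_gray, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(img_gray, cv2.CV_32F, 0, 1, ksize=3)
    mag = cv2.magnitude(gx, gy)
    return cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)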

3.3 Classification model

The proposed system aims to identify whether a person is COVID-19 positive or not. If a person is COVID-19 positive, then, for diagnosis, we need to know the most dominant disease from which the person suffers and how many other diseases they suffer from. For this purpose, we have designed a three-level classification model comprising binary, multi-class and multi-level classification, as shown in Figure 2. First, a binary classifier is used to determine whether a person suffers from COVID-19 or not.

Figure 2. Overview of the proposed architecture

Next, a multi-class classifier is used to identify the dominant disease class. Finally, the multi-level classifier is used with 14 classes (excluding the COVID-19 positive and normal classes) to identify the other lung diseases from which a COVID-19 positive person is suffering.

Hence, an image can be classified into more than one class, with the classes independent of each other. Thus, in place of softmax activation, we use the sigmoid activation function on the final layer to convert each value to the range 0 to 1, independent of the other scores. If the obtained value is greater than or equal to 0.5, the image is assigned to that class. Thus, an image can fall into many classes because of the independent calculation. A one-hot (multi-hot) vector encoding and the binary cross-entropy loss function are used.
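As a small illustration of this multi-label scheme (the class list follows Section 3.1; the helper names are hypothetical):

import numpy as np

CLASSES = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Effusion",
           "Emphysema", "Fibrosis", "Hernia", "Infiltration", "Mass", "Nodule",
           "Pleural Thickening", "Pneumonia", "Pneumothorax"]

def encode_tags(tags):
    # Multi-hot target vector, e.g. ["Effusion", "Pneumothorax"] -> 14-dim 0/1 vector.
    y = np.zeros(len(CLASSES), dtype=np.float32)
    for t in tags:
        y[CLASSES.index(t)] = 1.0
    return y

def decode_scores(sigmoid_scores, threshold=0.5):
    # Independent per-class decision: every score >= 0.5 yields a predicted label.
    return [c for c, s in zip(CLASSES, sigmoid_scores) if s >= threshold]

Training against such targets with a sigmoid output layer and binary cross-entropy allows several labels per image.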

The components of convolutional neural networks (CNNs) are similar to those of artificial neural networks (ANNs) in that they are neurons, and these components are capable of optimising themselves via a phenomenon known as self-learning. In a CNN, each neuron takes in a standard form of information and processes it in a predetermined way. This network takes as its input a series of raw image vectors and produces as its output the class score for a given input vector. Every stage and node of this network carries a perceptive score weight function, and the terminal layer of the network is linked to the class-specific loss functions. CNNs retain all the standard features and capabilities of ANNs, but unlike ANNs, these networks are tailored specifically for recognising visual patterns. Due to the complexity of the calculations, classical ANNs have low computational efficiency for image-related data. In addition, a CNN requires much less preprocessing than other similar algorithms.

Nasopharyngeal swab reverse transcription polymerase chain reaction (RT-PCR) assays are the gold standard for diagnosing COVID-19 infections. Although a rapid diagnosis of infected patients is desirable, it may be hindered by factors such as a high false-negative rate, lengthy test times, and a lack of RT-PCR assay kits during the outbreak's early phases. Imaging the lung of a patient with a COVID-19 infection is best done with computed tomography (CT) or a chest X-ray (CXR). Unlike a swab test, CT and CXR can pinpoint the exact area of the disease or damage. Consolidation of the air spaces in the lungs, which appears as a hazy periphery on a chest X-ray, is the hallmark finding on a CXR. Benefits of imaging include high sensitivity, rapid results, and clear visualisation of the lung infection's extent. One drawback of imaging is that it lacks specificity, making it difficult to identify the cause of a lung infection and to determine its severity.

Three images are shown in Figure 3, where the first image is labelled with the Effusion and Pneumothorax tags, the second image is tagged with Consolidation, Effusion, and Infiltration, and the third image is tagged with no findings (normal image). Hence, the one-hot vector is utilized for multi-tag representation.

Figure 3. Images with multi-levels

VGG-19, InceptionNet, and ResNet-50 are used for training and testing. These models use 224 × 224 RGB images. We have used a transfer learning environment with modification of the last layers to produce the required output. First, pre-processing needs to be performed: we call each model's pre-processing unit, which converts RGB to BGR and zero-centres each colour channel with respect to the ImageNet dataset. These models are shown in Figure 4.

Figure 4. VGG-19, InceptionNet, and ResNet-50 architectures
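A minimal sketch of that pre-processing call, assuming the Keras applications API in which each model ships its own preprocess_input:

import numpy as np
from tensorflow.keras.applications import inception_v3, resnet50, vgg19

rgb_batch = np.random.randint(0, 256, size=(4, 224, 224, 3)).astype("float32")

x_vgg = vgg19.preprocess_input(rgb_batch.copy())         # RGB -> BGR, ImageNet mean subtraction
x_res = resnet50.preprocess_input(rgb_batch.copy())      # RGB -> BGR, ImageNet mean subtraction
x_inc = inception_v3.preprocess_input(rgb_batch.copy())  # scales pixel values to [-1, 1]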

The CNN can be broken down into five distinct phases. In the first step, known as the input layer, an image is used as the network's starting point. The convolution layer then convolves the image obtained from the previous step to pull out the important and distinguishing elements, such as edges, colours, and corners. A trainable parameter matrix is multiplied with each image segment matrix, and the outcome of this dot product is a feature matrix, a matrix that contains only the features necessary to represent the problem at hand. By further condensing the feature matrix in the pooling layer, only the most salient aspects of an image are used to form the final feature vector; all of this is carried out so that the system's computational efficiency can be maximised. The pooling layer provides this reduction with average pooling and max pooling. The image is then flattened, and the fully connected layer takes the reduced matrix and linearly transforms it into a single vector. The normalised output is then sent into a feed-forward neural network, with back-propagation utilised at each training iteration. A well-trained model will be able to tell the difference between relatively unimportant details and the crucial ones that tend to dominate an image. This feeds the final layer of categorization, the output layer. In the fields of computer vision [29] and image comprehension, CNNs have proven to be invaluable tools. CNNs learn from massive datasets, typically millions of training images, using sophisticated and efficient implementations, and are typically realised by a mix of simple nonlinear and linear filters such as convolution and rectification.
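A toy example of these five phases, written with Keras layers (the shapes and filter counts are arbitrary illustrations, not the architecture used in this work):

import tensorflow as tf

cnn = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),              # input layer
    tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),   # convolution: edges, colours, corners
    tf.keras.layers.MaxPooling2D((2, 2)),                    # pooling: keep only salient features
    tf.keras.layers.Flatten(),                                # flatten to a single vector
    tf.keras.layers.Dense(64, activation="relu"),             # fully connected layer
    tf.keras.layers.Dense(16, activation="softmax"),          # output layer (16 classes here)
])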

Algorithm BMCMLC-DPL

{

Input: X-ray Image Dataset {XDSet}

Output: Classification Set

Step-1: Initially, the X-ray images are loaded and feature extraction is performed on them. The extracted features are used for training the classification model. The feature extraction is performed as

$\operatorname{FSet}[\operatorname{XDSet}(F)] = \sum_{r=1}^{L} \frac{\operatorname{getattr}(r) + \operatorname{maxVal}(r)}{\operatorname{size}(\operatorname{XDSet})} + \operatorname{minVal}(r) + Th$

Here, getattr is used to retrieve the feature value from the dataset, and Th is the threshold value considered for feature normalization.

Step-2: The extracted features are used to design a binary classifier that identifies whether the patient is affected by COVID-19 or not. The binary classifier predicts the result as 1, representing positive, or 0, representing negative. The binary classifier is applied as

$\operatorname{BinClas}(\operatorname{FSet}(r)) = \lim_{r \to L}\left(\max(\operatorname{FSet}(r)) + \frac{\operatorname{size}(\operatorname{XDSet}) - 1}{\min(\operatorname{FSet}(r))}\right)^{2}, \quad \begin{cases} \operatorname{BinClas} = 1 & \text{if } \lim < Th \\ \operatorname{BinClas} = 0 & \text{otherwise} \end{cases}$

Step-3: A multi-level classifier is proposed that considers the previously classified set; the non-COVID prediction set is used for thorax disease detection. The multi-level classification is applied and the feature vector is generated as

$\operatorname{Fvector}(\operatorname{BinClas}(r)) = \sum_{r=1}^{L} \lambda(\operatorname{BinClas}(r)) + \frac{\operatorname{FVal}(\operatorname{BinClas}(r))}{\operatorname{size}(\operatorname{BinClas})} + \max(\operatorname{FSet}(r))$

Step-4: Dual priority labelling is performed on the COVID-19 classification set and the thorax disease set for accurate classification of the lung disease. The dual priority labelling is performed as

$\operatorname{DPlabel}[L] = \sum_{r=1}^{L} \min(\operatorname{Fvector}(r)) + \max(\operatorname{BinClas}(r)) + \frac{\lambda(\operatorname{Fvector}(r))}{\operatorname{size}(\operatorname{Fvector})}$

Step-5: The final-stage classification is performed based on the binary and multi-level classifiers to generate the final prediction set, using transfer learning for training. The process is performed as

$\operatorname{PredSet}[l] = \prod_{r=1}^{L} \operatorname{getVal}(\max(\operatorname{Fvector}(r)) - \min(\lambda)) + Th$

}
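The cascaded decision flow of Steps 1-5 can be sketched as follows. This assumes three already-trained Keras models and an illustrative priority rule (COVID-19 first, then the remaining diseases by descending confidence), since the exact labelling rule above is given only in equation form:

import numpy as np

def bmcmlc_dpl_predict(images, binary_model, multiclass_model, multilevel_model,
                       class_names, disease_names, threshold=0.5):
    covid_scores = binary_model.predict(images)        # Step 2: COVID-19 vs. not
    class_scores = multiclass_model.predict(images)    # dominant-disease scores
    level_scores = multilevel_model.predict(images)    # Step 3: multi-label scores
    results = []
    for covid, cls, lvl in zip(covid_scores, class_scores, level_scores):
        is_covid = bool(np.argmax(covid))               # assumes index 1 = positive, 0 = negative
        dominant = class_names[int(np.argmax(cls))]     # most dominant class
        diseases = [d for d, s in zip(disease_names, lvl) if s >= threshold]
        # Step 4 (dual priority labelling, assumed form): COVID-19 first,
        # then the remaining labels ordered by descending score.
        ordered = sorted(diseases, key=lambda d: -lvl[disease_names.index(d)])
        labels = (["COVID-19"] if is_covid else []) + ordered
        results.append({"covid": is_covid, "dominant": dominant, "labels": labels})
    return results                                      # Step 5: final prediction set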

4. Results

When assessing thoracic disorders, chest X-ray imaging is among the most accessible techniques, and medical image analysts have spent a lot of time thinking about how to use computers to screen chest X-rays. The recent public release of several large-scale hospital chest X-ray datasets is especially helpful for data-hungry deep learning models. Disease labels serve as the gold standard for model training in our investigations. Concurrently, we use the bounding boxes from comparative chest X-ray methods to pinpoint the location of problematic regions. The proposed model is compared with the traditional SMOTE (Synthetic Minority Oversampling Technique) algorithm.

This section presents the experimental setup and the obtained results with a detailed analysis. The binary, multi-class and multi-level classification of CXR images for COVID-19 patient detection and diagnosis is performed utilizing three state-of-the-art transfer learning models: VGG-19, GoogLeNet (InceptionNet), and ResNet-50. We conducted several runs with variations in the dataset and hyper-parameters to analyze their impact, and the best performing model is finally obtained. As no similar experimental setup has been performed on the same dataset, we did not perform a direct comparative analysis of the results. However, a comparison with existing systems based on their models and reported results is performed to show the performance of the obtained system.

Table 1. Training and validation accuracy and loss obtained

Classifier               Model          Training Loss   Training Accuracy   Validation Loss   Validation Accuracy
Binary Classifier        VGG19          0.05            0.97                0.06              0.97
                         InceptionNet   0.02            0.97                0.08              0.99
                         ResNet50       0.01            0.99                0.00              1.00
Multi-class Classifier   VGG19          0.19            0.86                0.02              0.95
                         InceptionNet   0.33            0.87                0.24              0.99
                         ResNet50       0.11            0.87                0.19              0.97
Multi-level Classifier   VGG19          0.12            0.84                0.01              0.98
                         InceptionNet   0.00            0.83                0.11              0.97
                         ResNet50       0.35            0.85                0.12              0.96

2,250 images from 15 classes (14 other diseases and the normal class) are taken, with one hundred and fifty images from each class to balance the classes. A ratio of 8:1:1 is used for training, validation, and test data. The second model is trained using 15 classes (COVID-19 and the 14 other disease classes), each containing 150 images, and the third model is trained using 14 classes of data. To train the third model, images and their corresponding classes are encoded using a one-hot vector; however, multiple ones may be present in such a vector.

The hyper-parameters used are: epochs = 20, batch size = 32, and class_mode = 'categorical'. The model is compiled using loss = 'categorical_crossentropy', optimizer = 'adam', and metrics = ['accuracy']. We have used a softmax classifier for binary and multi-class classification and a sigmoid classifier for multi-level classification. Ten runs for each model are conducted by varying the dataset and other parameters, and the average output over the runs is reported. A training sketch with these settings follows.
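This is a minimal training sketch with the stated hyper-parameters, assuming a Keras/TensorFlow implementation and a hypothetical directory "data/train" laid out with one sub-folder per class; it is illustrative, not the authors' exact training script:

import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

# Data generators: batch size 32, categorical class mode, 224 x 224 inputs.
gen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=preprocess_input, validation_split=0.1)
train = gen.flow_from_directory("data/train", target_size=(224, 224), batch_size=32,
                                class_mode="categorical", subset="training")
val = gen.flow_from_directory("data/train", target_size=(224, 224), batch_size=32,
                              class_mode="categorical", subset="validation")

# Transfer-learning backbone with a softmax head sized to the number of classes.
base = ResNet50(include_top=False, weights="imagenet",
                input_shape=(224, 224, 3), pooling="avg")
outputs = tf.keras.layers.Dense(train.num_classes, activation="softmax")(base.output)
model = tf.keras.Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train, validation_data=val, epochs=20)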

The obtained results, training loss and accuracy and validation loss and accuracy, are presented in Table 1. The table shows that the best performing model is ResNet-50. All three models performed well in binary classification; however, for multi-class and multi-level classification, the results deteriorated. We have compared our model with other models, but since none of them used similar strategies, a direct comparison would be inappropriate. Also, only a few systems used all 16 classes of data, and mostly for 2- or 3-class classification purposes. All these chest diseases have similar symptoms, but the models are trained with fewer samples.

Data redundancy can be minimised with the aid of feature extraction. Such data reduction aids model construction with less computational effort and speeds up the learning and generalisation phases of machine learning. The X-ray image features are extracted for accurate disease detection. The feature extraction time levels of the proposed and traditional models are shown in Figure 5.

In feature extraction, a large number of pixels are represented efficiently so that interesting sections of the image are recorded effectively with dimensionality reduction. The feature extraction accuracy levels of the proposed and traditional models are shown in Figure 6.

Figure 5. X-ray feature extraction time levels

Figure 6. X-ray feature extraction accuracy levels

Figure 7. Binary classification accuracy levels

A binary classification problem is one with only two possible outcomes. Classifying things into two categories, or bins, is a practical application of dichotomization. Many real-world binary classification problems have imbalanced classes, so it is often more useful to look at the distribution of the different types of errors than at the overall accuracy. The binary classification accuracy levels of the proposed and existing models are shown in Figure 7.

Figure 8. Multi level classification time levels

Figure 9. Dual priority labeling accuracy levels

Figure 10. Average training, and validation accuracy levels

As part of a classification task, previously collected training data is used to teach a classifier how to label fresh data: data is brought in from the original source and separated into training and test sets. When there are more than two possible categories into which an instance can be placed, a multi-class or multinomial classification task is performed, here at the feature extraction level and after the binary classification level. The multi-level classification time levels are shown in Figure 8.

Dual priority labelling considers the feature set in order to train the model in a sequence, with the priority order allocating high priority to the independent features. The dual priority labelling accuracy levels of the proposed and existing models are shown in Figure 9. The training and validation accuracy levels are shown in Figure 10.

5. Conclusions

People all over the world face major health challenges because of chest diseases. Early detection of these disorders allows patients to receive lifesaving treatment, and neural networks play a critical role in early disease prediction in the healthcare industry. Among the many modalities used to diagnose chest illnesses, X-rays are among the most reliable. In this work, we offer a method for detecting lung disease from chest X-ray images, specifically targeting the classification of COVID-19 and pneumonia. Deep learning, machine learning, and soft computing methods all underpin the proposed paradigm. The first step is the detection of COVID-19; subsequent steps address the presence of additional disorders. In this research, we use deep learning models with chest X-ray images to do this: a binary classifier to identify cases of COVID-19, a multi-class classifier to determine the primary illness, and a multi-level classifier to determine the total number of diseases affecting the patient. First, in the proposed system, images undergo pre-processing and augmentation. The next step is binary classification (COVID-19 against the 15 other classes) to determine whether or not a person is COVID-19 positive. After that, a multi-class classification (15 categories, excluding normal) is performed to identify the disease by which the person is mainly affected. Finally, the multi-level classifier is used to determine how many diseases the patient suffers from. These different levels of classification help in the detection and accurate diagnosis of a patient. However, multi-modal data such as images and text data (age, gender) can help in better classification; hence, this will be our future work direction.

  References

[1] Abbas, A., Abdelsamea, M.M., Gaber, M.M. (2021). Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network. Applied Intelligence, 51(2): 854-864. https://doi.org/10.1007/s10489-020-01829-7

[2] Venkataramana, L., Prasad, D.V.V., Saraswathi, S. Mithumary, C.M., Karthikeyan, R., Monika, N. (2022). Classification of COVID-19 from tuberculosis and pneumonia using deep learning techniques. Med Biol Eng Comput, 60: 2681–2691. https://doi.org/10.1007/s11517-022-02632-x 

[3] Albahli, S., Yar, G.N.A.H. (2021). Fast and accurate detection of COVID-19 along with 14 other chest pathologies using a multi-level classification: Algorithm development and validation study. Journal of Medical Internet Research, 23(2): e23693. https://doi.org/10.2196/23693

[4] Apostolopoulos, I.D., Mpesiana, T.A. (2020). COVID-19: Automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Physical and Engineering Sciences in Medicine, 43(2): 635-640. https://doi.org/10.1007/s13246-020-00865-4

[5] Baltruschat, I.M., Nickisch, H., Grass, M., Knopp, T., Saalbach, A. (2019). Comparison of deep learning approaches for multi-label chest X-ray classification. Scientific Reports, 9(1): 1-10. https://doi.org/10.48550/arXiv.1803.02315

[6] Hemdan, E.E.D., Shouman, M.A., Karar, M.E. (2020). COVIDX-Net: A framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv preprint arXiv:2003.11055.

[7] Hussain, E., Hasan, M., Rahman, M.A., Lee, I., Tamanna, T., Parvez, M.Z. (2021). CoroDet: A deep learning based classification for COVID-19 detection using chest X-ray images. Chaos, Solitons & Fractals, 142. https://doi.org/10.1016/j.chaos.2020.110495

[8] Jain, G., Mittal, D., Thakur, D., Mittal, M.K. (2020). A deep learning approach to detect COVID-19 coronavirus with X-ray images. Biocybernetics and Biomedical Engineering, 40(4): 1391-1405.

[9] Mangal, A., Kalia, S., Rajgopal, H., Rangarajan, K., Namboodiri, V., Banerjee, S., Arora, C. (2020). CovidAID: COVID-19 detection using chest X-ray. arXiv preprint arXiv:2004.09803.

[10] Ozturk, T., Talo, M., Yildirim, E.A., Baloglu, U.B., Yildirim, O., Acharya, U.R. (2020). Automated detection of COVID-19 cases using deep neural networks with X-ray images. Computers in Biology and Medicine, 121. https://doi.org/10.1016/j.compbiomed.2020.103792

[11] Sethy, P.K., Behera, S.K. (2020). Detection of coronavirus disease (COVID-19) based on deep features. https://doi.org/10.20944/preprints202003.0300.v1

[12] Smit, A., Jain, S., Rajpurkar, P., Pareek, A., Ng, A.Y., Lungren, M.P. (2020). CheXbert: Combining automatic labelers and expert annotations for accurate radiology report labeling using BERT. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1500-1519. https://doi.org/10.18653/v1/2020.emnlp-main.117

[13] Toğaçar, M., Ergen, B., Cömert, Z. (2020). COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches. Computers in Biology and Medicine, 121. https://doi.org/10.1016/j.compbiomed.2020.103805

[14] Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M., Summers, R. (2019). Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. In: IEEE CVPR, 7: 369-392. http://dx.doi.org/10.1007/978-3-030-13969-8_18

[15] Yoo, S.H., Geng, H., Chiu, T.L., Yu, S.K., Cho, D.C., Heo, J., Choi, M.S., Choi, I.H., Van, C., Nhung, N.V., Min, B.J., Lee, H. (2020). Deep learning-based decision-tree classifier for COVID-19 diagnosis from chest X-ray imaging. Frontiers in Medicine, 7: 427. http://dx.doi.org/10.3389/fmed.2020.00427

[16] X-Ray: Imaging Test Quickly Helps Diagnosis - Mayo Clinic. https://www.mayoclinic.org/tests- procedures/x-ray/about/pac20395303. Accessed 16 August 2021. 

[17] Chamorro, E.M., Tascón, A.D., Sanz, L.I., Vélez, S.O., Nacenta, S.B. (2020). Radiologic diagnosis of patients with COVID-19. Radiología (English Edition), 63(1): pp. 56–73. https://doi.org/10.1016/j.rxeng.2020.11.001

[18] Hwang, E.J., Kim, K.B., Kim, J.Y., Lim, J., Nam, J.G., Choi, H., Kim, H., Yoon, S.H., Goo, J.M., Park, C.M. (2021). COVID-19 Pneumonia on chest X-rays: Performance of a deep learning-based computer-aided detection system. PLOS ONE, 16(6): e0252440. https://doi.org/10.1371/journal.pone.0252440

[19] Health topics, Who. int, 2021. https://www.who.int/health-topics/, accessed: Sept. 12, 2021. 

[20] Stoppneumonia.org, 2021. https://stoppneumonia.org/wp-content/uploads/2019/11/India12.11.2019-Web.pdf, accessed on Sept. 12, 2021. 

[21] CDCTB. (2019) Tuberculosis (TB)- Basic TB Facts. Centers for Disease Control and Prevention. https://www.cdc.gov/tb/topic/basics/default.htm, accessed on June. 17, 2022.

[22] The U.S. Government and Global Tuberculosis Efforts. https://www.kff.org/globalhealth-policy/fact-sheet/the-u-s-government-and-global-tuberculosisefforts/, accessed on Jan. 17, 2022. 

[23] Khan, I.U., Nida A. (2020). A deep-learning-based framework for automated diagnosis of COVID-19 Using X-ray images. Information, 11(9): p. 419. https://doi.org/10.3390/info11090419. 

[24] Sethy, P.K., Behera, S.K., (2020). Detection of coronavirus disease (COVID-19) based on deep features. Preprints 2020, 2020030300. https://doi.org/10.20944/preprints202003.0300.v1

[25] Cohen, J.P., Morrison, P., Dao, L. (2020). COVID-19 image data collection. http://arxiv.org/abs/2003.11597. 

[26] Chest X-Ray Images (Pneumonia). https://kaggle.com/paultimothymooney/chest-xray-pneumonia, accessed Aug. 16, 2021. 

[27] Rachna, J., Nagrath, P., Kataria, G., Kaushik, V.S., Hemanth, D.J. (2020). Pneumonia detection in chest X-ray images using convolutional neural networks and transfer learning. Measurement, 165: 108046. https://doi.org/10.1016/j.measurement.2020.108046

[28] Ho, K., Gwak, J., Prakash, O., Song, J.I., Park, C.M. (2019). Utilizing pretrained deep learning models for automated pulmonary tuberculosis detection using chest radiography. Intelligent Information and Database Systems. http://dx.doi.org/10.1007/978-3-030-14802-7_34

[29] Jian, L., Yidi, H. (2020). Comparison of different CNN models in tuberculosis detecting. KSII Transactions on Internet and Information Systems, 14(8): 3519-3533. https://doi.org/10.3837/tiis.2020.08.021

[30] Rahman, T., Khandakar, A., Kadir, M.A., Islam, K.R., Islam, K.F., Mazhar, R., Hamid, T., Islam, M.T., Kashem, S., Mahbub, Z.B., Ayari, M.A., Chowdhury, M.E.H. (2020). Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization. IEEE Access, 8: 191586-191601. https://doi.org/10.1109/ACCESS.2020.3031384