Efficient Approach for Kidney Stone Treatment Using Convolutional Neural Network

Siddhesh Fuladi, Himakshi Chaturvedi, Musiri Kailasanathan Nallakaruppan, Veena Grover, Hani Alshahrani, Mohamed Baza*

School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu 632014, India

School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Tamil Nadu 632014, India

School of Management, Noida Institute of Engineering and Technology, Noida 201310, India

Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia

Department of Computer Science, College of Charleston, Charleston 29424, SC, USA

Corresponding Author Email: bazam@cofc.edu

Page: 929-937 | DOI: https://doi.org/10.18280/ts.410233

Received: 20 September 2023 | Revised: 7 December 2023 | Accepted: 16 January 2024 | Available online: 30 April 2024

© 2024 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Kidney stone treatment is a critical task because untreated kidney stones can lead to severe pain, kidney damage, and potentially life-threatening complications such as infections and blockages of the urinary tract. Earlier models perform poorly on both Time of Conversion (ToC) and diagnostic accuracy. According to the World Health Organization (WHO), 1 in 11 people is affected by kidney stones. Current diagnostic methods face challenges in identifying the affected area and the location of cysts and tumors. Elastic Net Regression (ENR), Logistic Regression (LR), and other machine learning models are less accurate in finding these anomalies. It is therefore essential to create a sophisticated kidney abnormality detection application. This research presents a Convolutional Neural Network (CNN) based approach for the classification of Computed Tomography (CT) kidney images into four categories: Normal, Cyst, Tumor, and Stone. The dataset, curated from different hospitals in Dhaka, Bangladesh, contains 12,446 images covering the Normal, Cyst, Tumor, and Stone categories. In terms of CNN architecture, our model comprises multiple convolutional layers, max-pooling layers, and fully connected layers. The convolutional layers apply learnable filters to detect patterns and features, followed by Rectified Linear Unit (ReLU) activation functions to introduce non-linearity. Max-pooling layers downsample feature maps, enhancing computational efficiency. Fully connected layers facilitate classification by learning complex patterns. The proposed methodology leverages deep learning to automate the recognition of kidney conditions, aiding radiologists in their diagnostic tasks. The methodology involves preprocessing of CT images, followed by feature extraction and classification using the CNN model.
The research evaluates the approach on a curated CT kidney dataset, achieving promising results, and discusses the potential for future improvements and applications in clinical practice. In comparison to existing literature, the proposed work demonstrates significant advancements in kidney abnormality detection. The model's performance measures, including Accuracy (99.57%), F1-score (99.34%), Recall (99.56%), and Precision (99.58%), surpass those reported for previous methodologies and are competitive with current models.

Keywords: 

CNN, CT scan, ReLU, kidney tumor, deep learning

1. Introduction

Medical imaging stands at the forefront of modern healthcare, shaping the diagnosis and treatment of many diseases. Among the various imaging modalities, CT has emerged as a powerful tool, offering detailed cross-sectional views of internal structures. Positioned within the retroperitoneal space, the kidneys play an indispensable role in maintaining homeostasis within the human body. Their functions encompass blood filtration, electrolyte regulation, and the elimination of waste products through urine formation. Accurate assessment of kidney health is therefore of paramount importance, as renal disorders, when left undiagnosed or untreated, can lead to severe complications, including renal failure. Because CT captures detailed cross-sectional images, it provides critical insight into the condition of the kidneys and has become a backbone of the diagnosis and treatment of renal disorders. The kidneys are susceptible to a spectrum of ailments, including cysts, tumors, and stones.

Renal cysts are fluid-filled sacs that can develop within kidney tissue, often requiring precise differentiation between benign and malignant entities. Renal tumors, encompassing renal cell carcinoma (RCC) and other malignancies, pose a significant health risk, necessitating early detection for optimal patient outcomes. Kidney stones, composed of mineral and acid salts, can obstruct the urinary tract, inducing severe pain. The ability to diagnose these kidney conditions swiftly and accurately is central to enhancing patient care. Traditional diagnostic methods, relying heavily on visual inspection of CT images by radiologists, come with inherent limitations. Human subjectivity, potential for error, and the increasing demand for healthcare services underscore the need for more efficient and accurate solutions. This is where deep learning, particularly Convolutional Neural Networks (CNNs), offers a transformative approach to automated image analysis. CNNs excel at extracting intricate patterns and features from images, a trait particularly valuable in the context of renal CT scans. By automating the recognition of kidney conditions, these neural networks hold potential to augment the capabilities of medical practitioners, enhance diagnostic accuracy, and expedite treatment initiation.

The integration of these advanced technologies into medical imaging heralds a new era in which the synergy between human expertise and artificial intelligence augments healthcare capabilities.

This research endeavors to harness the power of CNNs to create a robust and efficient system for classifying CT kidney images into four distinct categories: Normal, Cyst, Tumor, and Stone. The goal here is to develop a reliable diagnostic tool that can assist healthcare professionals in making timely and accurate decisions regarding patient care. Furthermore, by alleviating the burden of manual image analysis, this technology has the potential to streamline healthcare workflows and optimize resource allocation in healthcare facilities.

In this paper, Section 1 introduces kidney CT scans and the diagnosis process. Section 2 surveys recent advances in kidney cyst, stone, and tumor detection models. Section 3 explains the implementation and simulation process of the proposed CNN-based deep learning model. Section 4 presents the results of the proposed approach and their explanation. Section 5 discusses the potential for future findings and advancements. Section 6 concludes the paper by highlighting the key findings and implementation of the proposed method.

2. Literature Survey

Holback et al. [1] contributed to the Cancer Genome Atlas (TCGA) Ovarian Cancer collection by providing radiology data. While the paper itself may not contain specific findings, it played a crucial role in making radiology data available for research in the context of ovarian cancer, facilitating studies and insights into ovarian cancer using radiological imaging.

Rubin [2] discussed the creation and curation of a terminology for radiology through ontology modeling and analysis. The paper highlights the importance of standardized terminology in radiology reports. Standardization enhances data consistency, making it easier to exchange and interpret radiology data, which is crucial for patient care and research.

Lee and Kim [3] explored the role of computed tomography (CT) in the evaluation of renal tumors. The paper likely discusses the significance of CT scans in diagnosing and characterizing renal tumors. CT imaging provides detailed information about the size, location, and characteristics of renal tumors, aiding in their accurate diagnosis and treatment planning.

Cohen-Bacrie and Rouvière [4] discussed the radiologic classification of kidney tumors and proposed a paradigm shift for the age of artificial intelligence in this context. The paper may provide insights into the classification and diagnosis of kidney tumors using radiological imaging. It could discuss how AI and radiomics are transforming the way kidney tumors are identified and characterized.

Reginelli et al. [5] provided a comprehensive review of imaging in nephrourology, covering various aspects of kidney and urological imaging. The paper likely summarizes different imaging modalities such as ultrasound, CT, and MRI and their applications in diagnosing kidney-related conditions. It may discuss the strengths and limitations of each modality in detail.

Rathi et al. [6] discussed the physical principles and clinical applications of CT. While the focus may not be solely on kidney tumors, the paper likely provides foundational knowledge about CT imaging. It may explain how CT scans work, including their use of X-rays, and discuss how this technology is applied in various clinical scenarios, including the evaluation of kidney tumors.

LeCun et al. [7] introduced deep learning in their influential paper. While not specific to medical imaging, this paper laid the foundation for the development of deep learning techniques. Deep learning has since been widely adopted in medical image analysis, including the detection and characterization of abnormalities such as kidney tumors. However, limitations include a lack of specific implementation details, and limited coverage of recent advancements in the rapidly evolving field of deep learning.

Litjens et al. [8] conducted a survey on deep learning in medical image analysis. The paper likely summarizes the various applications of deep learning in medical imaging. It may discuss how deep learning methods have revolutionized the field by enabling automated detection, segmentation, and classification of medical conditions from images, which includes the analysis of radiological images.

Krizhevsky et al. [9] presented the ImageNet classification with deep convolutional neural networks. This work marked a significant advancement in deep learning and image classification. The development of deep convolutional neural networks has had a profound impact on medical image analysis, including the detection and classification of anomalies in radiological images. However, the limitations include a lack of detailed architectural insights and experimentation on datasets beyond ImageNet.

The Elements of Statistical Learning by Hastie et al. [10] provides a comprehensive overview of statistical learning methods. While not specific to medical imaging, the book covers essential concepts in machine learning and statistical modeling. These techniques are foundational to the development of algorithms used in medical image analysis, including the analysis of radiological data.

Liu et al. [11] introduced an innovative method for kidney layer segmentation in Whole Slide Imaging, integrating Convolutional Neural Networks and Transformers. This fusion of traditional and advanced deep learning architectures shows promise in improving diagnostic precision for complex renal structures.

Ronneberger et al. [12] proposed U-Net, a convolutional neural network architecture designed for biomedical image segmentation. They demonstrated that U-Net excels at segmenting anatomical structures in medical images, making it particularly valuable for tasks such as organ or lesion segmentation.

He et al. [13] introduced Mask R-CNN, a state-of-the-art deep learning model for instance segmentation. Their work showed that Mask R-CNN not only segments objects within images but also distinguishes individual instances of those objects. This model has broad applications, including in medical image analysis.

Rui et al. [14] focused on kidney disease detection using Convolutional Neural Networks, presented at the 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC). This work demonstrates the application of advanced neural networks for accurate diagnosis in the context of kidney diseases, contributing to the ongoing efforts in leveraging artificial intelligence for improved healthcare outcomes.

LeCun et al. [15] contributed to the development of gradient-based learning techniques applied to document recognition. While not specific to medical imaging, their research laid the foundation for the broader field of deep learning, which includes applications in medical image analysis. Its limitations include a relatively narrow focus on document recognition, making it less applicable to broader machine learning applications and the absence of discussions on hyperparameter tuning.

Kumar et al. [16] discussed radiomics and emphasized the process and challenges associated with extracting quantitative features from medical images. They highlighted the potential of radiomics in predicting patient outcomes and treatment responses based on image-derived data.

Tajbakhsh et al. [17] explored the use of convolutional neural networks (CNNs) for medical image analysis and posed the question of whether to perform full training or fine-tuning. The paper likely discussed the trade-offs between training CNNs from scratch and fine-tuning pre-trained models for medical image analysis. The limitations include a narrow focus on the comparison between full training and fine-tuning of CNNs and limited benchmarking of alternate approaches.

Anwar et al. [18] provided a comprehensive review of medical image analysis using convolutional neural networks (CNNs). Their review likely summarized the state-of-the-art in the field, including the use of CNNs for various tasks such as image classification, segmentation, and disease diagnosis.

Luna et al. [19] presented research on distributed optimization with arbitrary local solvers. While not directly related to medical imaging, their work addressed optimization techniques relevant to machine learning and deep learning algorithms used in medical image analysis.

Islam et al. [20] presented Vision Transformer and explainable transfer learning models for the automatic detection of kidney cysts, stones, and tumors from CT radiography. This research, published in Scientific Reports, underscores the potential of advanced deep learning techniques for enhancing the accuracy and interpretability of medical image analysis in identifying renal abnormalities.

As this survey shows, the existing literature provides a strong foundation, but gaps remain that the current research seeks to address, particularly in refining diagnostic precision and interpretability for renal abnormalities. The paper addresses these challenges by leveraging advanced deep learning techniques.

3. Proposed Method

The proposed methodology distinguishes itself from existing literature in kidney image analysis by offering a comprehensive approach that covers the entire process and places significant emphasis on data augmentation and deep learning through CNNs. The approach for classifying CT kidney images into four distinct categories (Normal, Cyst, Tumor, and Stone) follows a structured flow.

Figure 1. Architecture diagram

The architecture diagram in Figure 1 illustrates a holistic kidney image analysis process, including stone and tumor identification, feature extraction, data augmentation, CNN-based classification, and detection of kidney cysts, stones, and tumors, with impressive accuracy and performance metrics. The process begins with the input of kidney images, which are subsequently subjected to stone and tumor identification. Following this initial step, feature extraction techniques are applied to capture essential characteristics from the images. To enhance the dataset and improve the model's robustness, data augmentation is employed. The heart of the system lies in the Classification using Convolutional Neural Networks (CNNs), where the extracted features are utilized to classify the kidney images. Finally, the architecture culminates in the detection of kidney cysts, stones, and tumors.

3.1 Data collection

The foundation of this research is a comprehensive dataset of CT kidney images, curated to cover all four categories: Normal, Cyst, Tumor, and Stone. Histogram equalization can be applied to improve image contrast. The dataset was collected from the PACS (Picture Archiving and Communication System) of different hospitals in Dhaka, Bangladesh, where patients had already been diagnosed with kidney tumor, cyst, normal, or stone findings [18]. Both coronal and axial cuts were selected from contrast and non-contrast studies, with protocols for the whole abdomen and urogram. The dataset contains 12,446 unique images: 3,709 cyst, 5,077 normal, 1,377 stone, and 2,283 tumor.

The dataset encompasses both Coronal and Axial cuts, providing a comprehensive view of kidney conditions. The inclusion of both contrast and non-contrast studies with a protocol for the whole abdomen and urogram enhances the dataset's richness. This diversity in imaging protocols and views contributes to the model's robustness, enabling it to generalize well to various clinical scenarios. The dataset size of 12,446 unique images is substantial, ensuring an ample amount of data for training and evaluation.
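The class counts reported above can be checked with a few lines of arithmetic. The sketch below computes each class's share of the dataset and, as an illustrative assumption that is not described in the paper, inverse-frequency class weights of the kind sometimes used when classes differ in size.

```python
# Class counts as reported in the dataset description.
counts = {"Cyst": 3709, "Normal": 5077, "Stone": 1377, "Tumor": 2283}

total = sum(counts.values())
n_classes = len(counts)

# Share of each class in the dataset.
shares = {name: n / total for name, n in counts.items()}

# Inverse-frequency weights (an illustrative choice, not from the paper):
# weight = total / (n_classes * count), so rarer classes weigh more.
class_weights = {name: total / (n_classes * n) for name, n in counts.items()}

for name in counts:
    print(f"{name:6s} share={shares[name]:.3f} weight={class_weights[name]:.3f}")
```

The counts sum to the stated 12,446 images, and the Stone class, being the smallest, receives the largest weight under this scheme.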

3.2 Exploratory data analysis

Prior to model development, extensive exploratory data analysis (EDA) was conducted to gain insights into the dataset. EDA encompassed statistical analysis, visualization, and data preprocessing. This step was performed to identify potential outliers, data imbalances, and data quality issues that required attention before model training.

3.2.1 Data preprocessing

During EDA, data preprocessing steps were undertaken to ensure the quality and integrity of the dataset. Standardization and noise reduction techniques were applied to the collected CT kidney images. These preprocessing steps aim to create a uniform and clean input for the subsequent stages of feature extraction and model training.
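The paper does not specify how standardization was carried out. One common interpretation, shown here as a minimal sketch under that assumption, is per-image scaling of pixel intensities to zero mean and unit variance. The function name `standardize` and the sample patch are hypothetical.

```python
from statistics import mean, pstdev

def standardize(image):
    """Scale pixel values to zero mean and unit variance (per image)."""
    pixels = [p for row in image for p in row]
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:            # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - mu) / sigma for p in row] for row in image]

# Hypothetical 2x2 grayscale patch with 8-bit intensities.
patch = [[0, 64], [128, 255]]
normalized = standardize(patch)
```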

Figure 2 visualizes the number of images in each class of the dataset using a bar graph.

Figure 2. Dataset analysis

3.3 Data augmentation

In the domain of medical image analysis, particularly in the context of kidney CT-scan datasets for the identification of kidney stones, tumors, and cysts, the visualization of augmented images plays a crucial role. Data augmentation techniques are employed to artificially diversify the dataset by applying various transformations to the original CT-scan images. The core objective behind data augmentation is to strengthen the training of machine learning models by exposing them to a more comprehensive array of image variations that closely mirror the real-world conditions encountered in clinical practice.

Figure 3. Augmented images

In this work, the process of visualizing augmented images, generated as part of the model training pipeline, is illustrated. These augmented images represent a spectrum of variations in the original CT scans, encompassing alterations in rotation, scale, and contrast. Figure 3 depicts the alterations made to the input data through rotation. This visualization serves as a pivotal tool for developing diagnostic models for kidney pathologies.

By examining these augmented images, we can qualitatively assess the efficacy of data augmentation strategies in the context of kidney stone, tumor, and cyst detection. It allows for the validation of the augmentation techniques to ensure that they do not compromise the integrity of critical diagnostic features within the CT scans. In the case of kidney pathologies, preserving the distinctive characteristics of stones, tumors, and cysts is paramount, and visual inspection aids in confirming that these characteristics remain intact.

Moreover, visualizing augmented images offers the ability to fine-tune augmentation parameters, a process tailored to the intricacies of medical image analysis. Parameters such as rotation angles and the extent of introduced noise can be adjusted based on the visual feedback. This iterative approach guarantees that the augmented data faithfully captures the inherent variations present in kidney CT scans, thereby enhancing the model's capacity to generalize to diverse clinical scenarios. By enabling informed decisions about data augmentation strategies, the visual assessment of augmented images contributes significantly to the robustness and accuracy of kidney pathology detection models.
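The transformations mentioned above (rotation and contrast changes) can be illustrated on a tiny image. This is a simplified sketch rather than the augmentation pipeline used in the paper: rotation is restricted to 90-degree steps, and the helper names and parameter values are assumptions.

```python
def rotate90(image):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def adjust_contrast(image, factor):
    """Scale intensities about the image mean and clip to [0, 255]."""
    pixels = [p for row in image for p in row]
    mu = sum(pixels) / len(pixels)
    return [[min(255.0, max(0.0, mu + factor * (p - mu))) for p in row]
            for row in image]

# Hypothetical 2x2 grayscale patch.
patch = [[10, 20], [30, 40]]
rotated = rotate90(patch)                 # one augmented variant
stretched = adjust_contrast(patch, 2.0)   # another, with doubled contrast
```

Inspecting `rotated` and `stretched` side by side with `patch` is the small-scale analogue of the visual check described in this section: the pixel values are rearranged or rescaled, but the structure of the patch is preserved.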

3.4 Input kidney images

The collected and preprocessed CT kidney images serve as the primary input for the deep learning model. These images are fundamental as they represent the real-world data on which the model will be tested. Figure 4 depicts the images collected from the dataset.

The collected CT kidney images, after undergoing preprocessing during EDA, become the primary input data for the deep learning model. These images serve as the raw material upon which the model's learning is based. The preprocessing steps performed, such as standardization and potential noise reduction, ensure that the input data is in a suitable format for feature extraction and subsequent analysis. These images represent the real-world data that our model will encounter during deployment, making their quality and preparation of utmost importance.

Figure 4. Kidney images

3.5 Building a convolutional neural network (CNN)

At the core of our methodology lies the utilization of Convolutional Neural Networks (CNNs), a class of deep learning models specifically designed for image recognition tasks. Our custom-designed CNN architecture is tailored for the precise classification of kidney conditions. It incorporates multiple convolutional layers that function as feature extractors, capturing intricate patterns and structures within the CT images. Activation functions, such as Rectified Linear Unit (ReLU), introduce non-linearity to the model. Max-pooling layers downsample the features, enhancing computational efficiency, while fully connected layers combine extracted features for decision-making.

Figure 5. Training and validation metrics over epochs

Figure 5 visualizes the training and validation performance metrics, including loss, accuracy, and F1 score, over epochs. It highlights the epochs with the lowest validation loss and the highest validation accuracy.
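Picking out the epochs highlighted in such a plot is a simple search over the recorded history. The sketch below uses made-up validation values purely for illustration; the dictionary keys and the helper `best_epoch` are hypothetical and do not reflect the paper's training logs.

```python
# Hypothetical per-epoch validation history (illustrative values only).
history = {
    "val_loss":     [0.62, 0.31, 0.12, 0.05, 0.07],
    "val_accuracy": [0.78, 0.90, 0.96, 0.99, 0.98],
}

def best_epoch(values, mode):
    """Return the 1-based epoch with the minimum or maximum value."""
    pick = min if mode == "min" else max
    index = pick(range(len(values)), key=values.__getitem__)
    return index + 1

lowest_loss_epoch = best_epoch(history["val_loss"], "min")
highest_acc_epoch = best_epoch(history["val_accuracy"], "max")
```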

3.5.1 CNN architecture

The designed CNN architecture comprises multiple convolutional layers, each followed by Rectified Linear Unit (ReLU) activation functions to introduce non-linearity. Max-pooling layers are incorporated for downsampling, enhancing computational efficiency. The number of convolutional layers, filter sizes, and the architecture's depth were optimized through experimentation. The detailed architecture is illustrated in Figure 6.

Figure 6. CNN architecture diagram

Figure 6 depicts the simplified CNN architecture diagram. The diagram gives a comprehensive overview of the CNN's intricate layers, showcasing the convolutional, pooling, and fully connected layers, as well as any specialized components such as dropout layers or batch normalization.
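To make the effect of stacked convolution and pooling layers concrete, the sketch below traces how the spatial size of a 180x180 input (the size stated in Section 4.1.1) shrinks through three blocks. The number of blocks, the 3x3 kernels without padding, the 2x2 pooling windows, and the filter counts are all assumptions for illustration, since the text does not list them.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    """Output length of max-pooling along one spatial axis."""
    return (size - window) // stride + 1

size = 180                    # input is 180x180, per Section 4.1.1
filters = [32, 64, 128]       # hypothetical filter counts per block
for n in filters:
    # Assumed block: 3x3 convolution (no padding) then 2x2 max-pooling.
    size = pool_out(conv_out(size, kernel=3))
    print(f"block with {n} filters -> {size}x{size}x{n}")

# Number of values passed to the first fully connected layer.
flattened = size * size * filters[-1]
```

Under these assumptions the feature maps shrink from 180 to 89, 43, and finally 20 pixels per side, which shows why pooling reduces the computational cost of the later layers.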

3.5.2 Training process

The training process involves feeding the preprocessed CT kidney images into the CNN and adjusting the model's parameters to minimize a defined loss function. Training hyperparameters, including learning rate, batch size, and optimizer choice, were fine-tuned through iterative experiments to optimize convergence speed and accuracy.
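The loss function and optimizer are not named in the text. As a hedged illustration, the sketch below uses categorical cross-entropy, a common choice for softmax classifiers, and a plain gradient-descent update. The function names and all numeric values are hypothetical.

```python
import math

def cross_entropy(probs, true_index):
    """Categorical cross-entropy for one sample."""
    return -math.log(probs[true_index])

def sgd_step(params, grads, learning_rate):
    """One plain gradient-descent update: p <- p - lr * g."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

# A confident correct prediction gives a small loss, a wrong one a large loss.
loss_confident = cross_entropy([0.97, 0.01, 0.01, 0.01], true_index=0)
loss_wrong = cross_entropy([0.10, 0.70, 0.10, 0.10], true_index=0)

# Hypothetical parameters and gradients; the learning rate scales the step.
updated = sgd_step([0.5, -0.2], grads=[0.1, -0.4], learning_rate=0.5)
```

The learning rate controls how far each update moves the parameters, which is why it is among the hyperparameters tuned for convergence speed and accuracy.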

3.6 Detection of cyst, tumors or stones

The ultimate aim of our CNN-based methodology is the precise detection and classification of kidney conditions within CT images. Once a new CT image is input into the trained model, the model processes it and outputs a classification label corresponding to the detected condition. Whether it identifies a benign cyst, a malignant tumor, or a kidney stone, the model provides invaluable support to healthcare professionals, enabling them to make well-informed decisions regarding patient care. This automated detection and classification process is central to improving diagnostic accuracy and efficiency in the field of renal health.

4. Results

4.1 Experimental results

The experimental evaluation of our Convolutional Neural Network (CNN) model, designed to classify CT kidney images into Normal, Cyst, Tumor, and Stone categories, has yielded outstanding results.

4.1.1 Model architecture

Our CNN architecture is meticulously designed with layers optimized for image classification:

Input Layer: The input layer receives CT kidney images sized at 180x180 pixels with a single grayscale channel. This layer serves as the entry point for images to be processed by the neural network.

Convolutional Layers (Conv2D): Following the input layer, our model employs multiple convolutional layers, each with its own set of learnable filters (kernels). These layers apply convolution operations to the input images, effectively detecting patterns and features. Mathematically, the convolution operation calculates the output feature map $I_{\text {output }}(x, y)$ at each pixel coordinate $(x, y)$ using Eq. (1).

$\begin{gathered}I_{\text {output }}(x, y)=\sum_i \sum_j I_{\text {input }}(x+i, y+j) \cdot F_{\text {filter }}(i, j)\end{gathered}$                (1)

where $I_{\text {output }}$ represents the feature map, $I_{\text {input }}$ is the input image, $F_{\text {filter }}$ is the learnable filter (kernel), $x$ and $y$ are pixel coordinates, $i$ and $j$ iterate over the filter dimensions, and $\cdot$ denotes multiplication of each input pixel by the corresponding filter weight. Each convolutional layer is followed by the Rectified Linear Unit (ReLU) activation function to introduce non-linearity.
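Eq. (1) can be sketched directly in code. The example below treats $x$ as the row index and $y$ as the column index, uses no padding, and applies ReLU afterwards. The 3x3 image and the 2x2 filter are hypothetical values chosen so the result is easy to check by hand.

```python
def conv2d_valid(image, kernel):
    """Eq. (1): slide the kernel over the image without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[x + i][y + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for y in range(out_w)]
            for x in range(out_h)]

def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[-1, 0],
          [0, 1]]        # hypothetical 2x2 filter

feature_map = relu(conv2d_valid(image, kernel))
```

Each output value is the bottom-right pixel of a 2x2 window minus its top-left pixel, so every entry of `feature_map` is 4 for this image.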

Max-Pooling Layers (MaxPooling2D): Subsequent to the convolutional layers, our architecture integrates max-pooling layers. These layers downsample the feature maps obtained from the convolutional layers, effectively reducing spatial dimensions while retaining the most salient information. Mathematically, the output feature map $I_{\text {output }}(x, y)$ after max-pooling is calculated using the maximum value within each pooling window:

$I_{\text {output }}(x, y)=\max _{i, j}\left(I_{\text {input }}(2 x+i, 2 y+j)\right)$               (2)

where $(2x, 2y)$ is the top-left corner of the pooling window, which corresponds to a stride of 2, and $i$ and $j$ iterate over the window dimensions.
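A minimal sketch of Eq. (2), assuming a 2x2 window with a stride of 2 (the window size is not stated in the text). The sample feature map is hypothetical.

```python
def max_pool_2x2(feature_map):
    """Eq. (2): 2x2 window, stride 2; odd trailing rows/columns are dropped."""
    out_h = len(feature_map) // 2
    out_w = len(feature_map[0]) // 2
    return [[max(feature_map[2 * x + i][2 * y + j]
                 for i in range(2) for j in range(2))
             for y in range(out_w)]
            for x in range(out_h)]

fm = [[1, 3, 2, 0],
      [4, 2, 1, 5],
      [0, 1, 9, 8],
      [2, 6, 7, 3]]
pooled = max_pool_2x2(fm)
```

The 4x4 input is reduced to a 2x2 output that keeps only the largest value from each window.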

Fully Connected Layers (Dense): Following the convolutional and max-pooling layers, the model transitions into fully connected layers. These dense layers flatten the output from the previous layers and establish connections between every neuron within these layers. Mathematically, the output O of a fully connected layer can be expressed as a weighted sum of inputs, with an additional bias term:

$O=W \cdot X+b$                   (3)

where $O$ is the output, $W$ is the weight matrix, $X$ is the input vector, and $b$ is the bias vector. The ReLU activation function is applied to introduce non-linearity. These dense layers facilitate classification by learning complex patterns and relationships in the data.
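Eq. (3) followed by ReLU can be written as a short function. The weights, inputs, and biases below are hypothetical values chosen for easy verification.

```python
def dense(weights, inputs, bias, activation=None):
    """Eq. (3): O = W . X + b, with an optional element-wise activation."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, bias)]
    if activation == "relu":
        out = [max(0.0, o) for o in out]
    return out

W = [[1.0, -2.0], [0.5, 0.5]]   # hypothetical 2x2 weight matrix
X = [3.0, 1.0]                  # hypothetical input vector
b = [0.0, -3.0]                 # hypothetical bias vector

linear = dense(W, X, b)                       # before activation
activated = dense(W, X, b, activation="relu") # after ReLU
```

Here the second neuron's pre-activation value is negative, so ReLU sets it to zero while leaving the first neuron unchanged.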

Output Layer (Dense): The final layer in our architecture is the output layer. This layer consists of neurons equal to the number of classes we aim to classify, which is four in our case: Normal, Cyst, Tumor, and Stone. The softmax activation function is applied to obtain class probabilities for each image using Eq. (4):

$P(\text {Class}=i)=\frac{e^{O_i}}{\sum_j e^{O_j}}$                 (4)

These probabilities represent the model's confidence in assigning an image to a specific category, enabling us to make precise classifications.
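The softmax of Eq. (4) is usually computed with the largest output subtracted first, which leaves the result unchanged but avoids overflow. The sketch below follows that convention. The output-layer values are hypothetical, and the class order is an assumption.

```python
import math

def softmax(logits):
    """Eq. (4), computed with the max subtracted for numerical stability."""
    m = max(logits)
    exps = [math.exp(o - m) for o in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["Normal", "Cyst", "Tumor", "Stone"]   # assumed class order
logits = [2.0, 0.5, -1.0, 0.1]                   # hypothetical output-layer values
probs = softmax(logits)
predicted = classes[probs.index(max(probs))]
```

The probabilities sum to one, and the predicted label is the class with the highest probability.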

4.1.2 Evaluation metrics

The performance of our CNN model was assessed using a range of evaluation metrics, which demonstrate its exceptional capabilities.

Accuracy (Acc): The model achieved an accuracy of 99.57%, calculated using Eq. (5):

Acc $=\frac{\text { Number of Correct Predictions }}{\text { Total Number of Predictions }}$               (5)

This high accuracy underscores the effectiveness of our CNN architecture in distinguishing between different kidney conditions.

F1-Score (F1): The F1-Score, a measure of the model's balance between precision and recall, is 99.34%. It is calculated using Eq. (6):

$F 1=\frac{2 \cdot \operatorname{Prec} \cdot \operatorname{Rec}}{\operatorname{Prec}+\operatorname{Rec}}$         (6)

This metric highlights the model's ability to provide both high precision and recall in its classifications.

Recall (Rec): The recall value of 99.56% signifies that the model correctly identifies 99.56% of the actual positive instances of each class. It is calculated using Eq. (7):

Rec $=\frac{\text { True Positives }}{\text { True Positives }+ \text { False Negatives }}$            (7)

This demonstrates its proficiency in recognizing different kidney conditions.

Precision (Prec): The precision value of 99.58% indicates that 99.58% of the positive predictions made by the model are indeed accurate. It is computed using Eq. (8):

Prec $=\frac{\text { True Positives }}{\text { True Positives }+ \text { False Positives }}$             (8)

This reflects the model's precision in classifying kidney images.
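Eqs. (5) to (8) can be evaluated per class from a confusion matrix. The sketch below does this for a small hypothetical matrix; the numbers are not the paper's results, and the paper does not state how per-class values were averaged into the reported figures.

```python
def per_class_metrics(cm):
    """cm[a][p] = number of samples of actual class a predicted as p."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total      # Eq. (5)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp
        fp = sum(cm[a][k] for a in range(n)) - tp
        prec = tp / (tp + fp) if tp + fp else 0.0           # Eq. (8)
        rec = tp / (tp + fn) if tp + fn else 0.0            # Eq. (7)
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0  # Eq. (6)
        rows.append((prec, rec, f1))
    return accuracy, rows

# Hypothetical 2-class confusion matrix, not the paper's results.
cm = [[8, 2],
      [1, 9]]
accuracy, rows = per_class_metrics(cm)
```

For the first class in this example there are 8 true positives, 2 false negatives, and 1 false positive, giving a precision of 8/9 and a recall of 0.8.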

Figure 7 compares these evaluation metrics using a bar graph.

Figure 7. Evaluation metrics comparison

4.1.3 Confusion matrix heatmap

The confusion matrix heatmap provides a visual representation of the model's classifications. Figure 8 represents the confusion matrix heatmap.

Figure 8. Confusion matrix heatmap

In Figure 8, each cell corresponds to a pair of actual and predicted classes. The color intensity in each cell indicates the frequency of instances. The diagonal of the heatmap exhibits strong coloration, emphasizing the model's ability to correctly classify images within their respective classes. This visual representation offers a concise and insightful overview of the model's performance across all classes.
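The matrix behind such a heatmap is a simple tally of actual and predicted labels. The sketch below builds it for six hypothetical images; the labels and the class order are illustrative assumptions.

```python
def confusion_matrix(actual, predicted, classes):
    """Count samples per (actual, predicted) pair; rows are actual classes."""
    index = {name: k for k, name in enumerate(classes)}
    cm = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        cm[index[a]][index[p]] += 1
    return cm

classes = ["Normal", "Cyst", "Tumor", "Stone"]
# Hypothetical labels for six images.
actual    = ["Normal", "Cyst", "Tumor", "Stone", "Stone", "Cyst"]
predicted = ["Normal", "Cyst", "Tumor", "Stone", "Cyst",  "Cyst"]
cm = confusion_matrix(actual, predicted, classes)
```

Correct predictions accumulate on the diagonal, while the single Stone image predicted as Cyst appears off the diagonal in the Stone row.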

Figure 9. Model performance evolution

Figure 9 displays the training and validation curves for precision, recall, accuracy, and loss over epochs. Each subplot in the row represents one of these metrics, with the x-axis indicating the number of training epochs. The desired behavior, a consistent increase in accuracy and precision alongside decreasing loss, can be seen in the figure. The comparison between training and validation curves also helps to gauge the model's generalization performance.

4.1.4 Dependencies/Limitations

· Although the model achieved strong metrics and results, the data used in this research is limited.

· Access to, and availability of, CT kidney images covering diverse conditions is a constraint.

· The model's effectiveness in classifying known conditions does not guarantee its performance on entirely new or rare kidney conditions that were not represented in the training data. Generalizing to unseen conditions poses a challenge.

While the results of this work demonstrate promising performance, it is important to emphasize that the findings should not serve as a substitute for clinical expertise and advice. Medical decisions should always be made in consultation with qualified healthcare professionals.

5. Discussion

5.1 Model performance

The provided architecture and associated equations illustrate the inner workings of our CNN model. The high accuracy, F1-Score, recall, and precision values reinforce the model's strong performance in classifying kidney conditions.

5.2 Clinical application

The implications of these results for clinical practice are profound. Such a high-performing model can significantly aid radiologists in their diagnostic processes. Radiologists can use the model as a supportive tool to enhance the efficiency and accuracy of their diagnoses. Early detection and accurate classification of kidney conditions, including tumors and stones, can lead to more timely medical interventions and improved patient outcomes.

5.3 Future improvements

While the model's performance is highly promising, ongoing research and development efforts can further enhance its capabilities. Continuous refinement of the CNN architecture, exploration of advanced data augmentation techniques, and the expansion of the dataset can potentially lead to even higher accuracy and robustness. Moreover, real-world clinical testing is essential to validate the model's performance in practical healthcare settings.

6. Conclusions

The application of deep learning techniques, particularly Convolutional Neural Networks (CNNs), to medical image analysis has opened new avenues for the automated diagnosis and classification of complex medical conditions. In this study, we developed a CNN-based model to classify CT kidney images into four distinct categories: Normal, Cyst, Tumor, and Stone. Through a comprehensive evaluation of our model, we have demonstrated its exceptional performance and its potential to significantly impact the field of radiology and kidney disease diagnosis. Our model architecture, consisting of multiple convolutional and max-pooling layers followed by fully connected layers, is optimized for image classification tasks. This architecture, combined with the Rectified Linear Unit (ReLU) activation function and softmax output layer, enables the model to effectively extract features and make precise classifications.

With an accuracy of 99.57%, a precision of 99.58%, a recall of 99.56%, and an F1-Score of 99.34%, our CNN-based model exhibits remarkable performance in distinguishing between Normal, Cyst, Tumor, and Stone kidney conditions. These metrics emphasize the model's ability to provide both high precision in its positive predictions and high recall in identifying actual positive instances, which is crucial for clinical applications. The CNN-based methodology offers a robust and effective approach to classifying CT kidney images and holds promise for improving kidney disease diagnosis and patient care. While the results are promising, it is essential to acknowledge practical implementation challenges and ethical considerations. The transition from research to practical application may encounter hurdles related to real-world data variability, model interpretability, and ethical implications in handling sensitive medical information.

Looking ahead, there are opportunities for enhancements and future work to build on our contributions. Incorporating multi-modal data, exploring the integration of uncertainty metrics, and addressing interpretability challenges could further refine the model's capabilities. As we continue to refine and expand our research, we look forward to further advancing the capabilities of deep learning in medical imaging and healthcare.

Acknowledgment

The authors are grateful to the Deanship of Scientific Research at Najran University for funding this work under the Research Groups Funding Program (Grant Code NU/RG/SERC/12/27).

References

[1] Holback, C., Jarosz, R., Prior, F., Mutch, D.G., Bhosale, P., Garcia, K., Erickson, B.J. (2016). Radiology data from the cancer genome atlas ovarian cancer [TCGA-OV] collection. The Cancer Imaging Archive.

[2] Rubin, D.L. (2008). Creating and curating a terminology for radiology: ontology modeling and analysis. Journal of Digital Imaging, 21: 355-362. https://doi.org/10.1007/s10278-007-9073-0

[3] Lee, E., Kim, D. (2019). Role of computed tomography in evaluation of renal tumors. Korean Journal of Urology, 60(4): 228-237.

[4] Cohen-Bacrie, C., Rouvière, O. (2017). Kidney tumors: Radiologic classification and a paradigm shift for the age of artificial intelligence. Diagnostic and Interventional Imaging, 98(8): 499-500.

[5] Reginelli, J., Russo, A., Pinto, A., Cappabianca, S. (2015). Imaging in nephrourology: A comprehensive review. Insights into Imaging, 6(5): 559-579.

[6] Rathi, S.S., Fitzpatrick, R.D., Crean, H.L. (2014). Computed Tomography: Physical Principles and Clinical Applications. In Clinical Radiology: The Essentials, 3rd Edition, Saunders.

[7] LeCun, Y., Bengio, Y., Hinton, G. (2015). Deep learning. Nature, 521(7553): 436-444. https://doi.org/10.1038/nature14539

[8] Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., van der Laak, J.A.W.M., van Ginneken, B., Sánchez, C.I. (2017). A survey on deep learning in medical image analysis. Medical Image Analysis, 42: 60-88. https://doi.org/10.1016/j.media.2017.07.005

[9] Krizhevsky, A., Sutskever, I., Hinton, G.E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90.

[10] Hastie, T., Tibshirani, R., Friedman, J.H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer. 1-758.

[11] Liu, M., Qi, C., Bao, S., Liu, Q., Deng, R., Wang, Y., Zhao, S., Yang, H., Huo, Y. (2023). Evaluation kidney layer segmentation on whole slide imaging using convolutional neural networks and transformers. arXiv preprint arXiv:2309.02563. https://arxiv.org/abs/2309.02563

[12] Ronneberger, O., Fischer, P., Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, pp. 234-241. https://doi.org/10.1007/978-3-319-24574-4_28

[13] He, K., Gkioxari, G., Dollár, P., Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, pp. 2961-2969. https://doi.org/10.1109/ICCV.2017.322

[14] Rui, Q., Sinuo, L., Toe, T.T., Brister, B. (2023). Kidney diseases detection based on convolutional neural network. In 2023 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Bali, Indonesia, pp. 508-513. https://doi.org/10.1109/ICAIIC57133.2023.10067085

[15] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278-2324. https://doi.org/10.1109/5.726791

[16] Kumar, V., Gu, Y., Basu, S., Berglund, A., Eschrich, S. A., Schabath, M.B., Forster, K., Aerts, H.J.W.L., Dekker, A., Fenstermacher, D., Goldgof, D.B., Hall, L.O., Lambin, P., Balagurunathan, Y., Gatenby, R.A., Gillies, R.J. (2012). Radiomics: The process and the challenges. Magnetic Resonance Imaging, 30(9): 1234-1248. https://doi.org/10.1016/j.mri.2012.06.010

[17] Tajbakhsh, N., Shin, J.Y., Gurudu, S.R., Hurst, R.T., Kendall, C.B., Gotway, M.B., Liang, J. (2016). Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Medical Imaging, 35(5): 1299-1312. https://doi.org/10.1109/TMI.2016.2535302

[18] Anwar, S.M., Majid, M., Qayyum, A., Awais, M., Alnowami, M., Khan, M.K. (2018). Medical image analysis using convolutional neural networks: A review. Journal of Medical Systems, 42: 1-13. https://doi.org/10.1007/s10916-018-1088-1

[19] Luna, M., Nguyen, D., Candes, C. (2015). Distributed optimization with arbitrary local solvers. arXiv preprint arXiv:1502.02344.

[20] Islam, M.N., Hasan, M., Hossain, M.K., Alam, M.G.R., Uddin, M.Z., Soylu, A. (2022). Vision transformer and explainable transfer learning models for auto detection of kidney cyst, stone and tumor from CT-radiography. Scientific Reports, 12(1): 11440. https://doi.org/10.1038/s41598-022-15634-4