
An Optimal Prediction of Leaf Disease Based on Hybrid Deep Learnings and Metaheuristic Technique

Padmapriya Kondalsamy^* | Kavitha Kaliappan

Department of Electronics and Communication Engineering, AAA College of Engineering and Technology, Amathur Sivakasi 626005, India

Department of Electronics and Communication Engineering, Velammal College of Engineering and Technology, Madurai 625009, India

Corresponding Author Email:

padmapriya@aaacet.ac.in

Received:

7 April 2024

Revised:

16 July 2024

Accepted:

28 August 2024

Available online:

28 February 2025


OPEN ACCESS

Abstract:

Leaf diseases are a significant threat to plant growth, crop health, and productivity. Numerous methods have been proposed to address this problem, and deep learning (DL) methods in particular have been explored for disease prevention and prediction. This paper presents a hybrid of Long Short-Term Memory (LSTM), Generative Adversarial Network (GAN), and CATBOOST algorithms for effective leaf disease prediction. Nutcracker optimization is used to fine-tune the hyperparameters of the LSTM-GAN model, which is capable of capturing sequential patterns in the input data. In addition, the CATBOOST algorithm is integrated to enhance the classification performance and accuracy of the prediction model. Performance is evaluated using precision, recall, F1 score, specificity, accuracy, root mean square error (RMSE), and mean absolute error (MAE). Experimental results demonstrate that the proposed model outperforms the other methods compared.

Keywords:

leaf disease prediction, nutcracker optimization, hybrid model, CATBOOST model, LSTM-GAN model, performance metrics

1. Introduction

Leaf diseases, caused by factors such as bacteria, fungi, viruses, and environmental stress, are a major constraint on the growth and cultivation of plants and crops [1]. They damage crops, reduce yield and quality, and cause many problems for food production. Detecting leaf disease at an early stage is therefore necessary for effective agriculture and plant health management. With an early prediction, farmers can take preventive measures such as treatments, proper nutrient supplementation, and optimized irrigation strategies [2]. Early detection helps to reduce crop damage and increase overall productivity.

In existing practice, agricultural professionals and experts predict diseases by visual inspection. However, this approach has several limitations: it is subjective, time-consuming, and dependent on the expertise of the person inspecting the plants. Moreover, diseases may go unnoticed until visible symptoms appear, which could be too late for effective intervention [3, 4]. More recently, advanced imaging techniques such as computed tomography (CT) and magnetic resonance imaging (MRI) have been proposed for prediction. However, their application in plant pathology is limited by cost, complexity, and the need for specialized equipment.

These traditional techniques face several challenges, namely the need for manual intervention, limited scalability, and low accuracy in early-stage detection. Automated and efficient strategies are needed to overcome these limitations. Deep Learning (DL) techniques, which have brought significant advances in image processing and leaf disease prediction, are applied to address these issues [5, 6]. DL models can analyze large leaf image datasets, learn complex features and patterns, and accurately classify leaves as healthy or diseased. The benefits of DL-based prediction include: i) Increased accuracy: DL models can achieve high accuracy in disease detection, sometimes surpassing human experts; ii) Faster analysis: DL algorithms can process images rapidly, allowing real-time or near-real-time detection and response; iii) Scalability: once trained, DL models can be applied across many plant species and diseases, making them highly scalable; and iv) Non-invasive: unlike techniques such as CT or MRI scans, DL-based prediction methods require only leaf images, making them non-destructive and suitable for widespread use.

Several DL techniques have been successfully applied to leaf disease detection. Convolutional Neural Networks (CNNs) are commonly applied to image classification tasks; they learn and extract features automatically from leaf images, making them suitable for disease recognition. Transfer Learning (TL) [7, 8] takes CNN models pre-trained on large image datasets and adapts them to specific leaf disease classification tasks, which reduces the need for extensive training data and can improve performance. GANs [9] can generate synthetic images resembling real leaf disease samples, augmenting the training data and improving the model's ability to generalize. Recurrent Neural Networks (RNNs) [10, 11], specifically LSTM networks, can analyze sequential data such as time series information on disease progression or environmental factors affecting plant health.

In this work, leaf disease prediction is improved by combining these techniques into a GAN-LSTM-CatBoost method that is fine-tuned using nutcracker optimization. The GAN model generates synthetic leaf disease samples, the LSTM model analyzes disease progression over time, and the CatBoost technique enhances classification capability. The hyperparameters are then fine-tuned with nutcracker optimization to optimize the model's performance. This integrated approach aims to increase the precision and reliability of leaf disease prediction systems and to yield more precise and robust predictions, ultimately aiding effective plant health management and agricultural practices.

2. Related Works

Lanjewar and Panchbhai [12] used a CNN model to predict tea leaf disease, attaining remarkable accuracy across the training, validation, and test datasets. They also explored other deep CNN models such as Xception, ResNet50, and NASNetMobile for tea leaf disease detection, evaluating their effectiveness using multi-fold cross-validation and confusion matrices.

Bouni et al. [13] focused on identifying tomato leaf disease by employing a deep CNN (DCNN) and transfer learning. They used different CNN backbones including AlexNet, VGG-16, ResNet, and DenseNet. Comparing optimizers such as Adam and RMSProp, they showed that the DenseNet model with RMSProp reached the highest precision of 99.9%.

To enhance the reliability of measurements, Mahesh et al. [14] employed the MobileNetV2 architecture to diagnose leaf diseases. They evaluated the proposed model on ImageNet classification, object detection, and image segmentation, and attained 95% accuracy in identifying various plant diseases.

Bhandari et al. [15] aimed to identify nine different infectious diseases in tomato leaves, along with healthy leaves. They used the EfficientNetB5 model with a tomato leaf disease dataset, reaching average accuracies of 99.84%±0.10% in training, 98.28%±0.20% in validation, and 99.07%±0.38% in testing over 10 cross-validation folds.

Adem et al. [16] developed a hybrid DL method based on Faster R-CNN, SSD, VGG16, and YOLOv4 for disease diagnosis and severity identification. The models were trained and tested using 1040 images, and the most successful method attained a classification accuracy of 96.47%.

Stephen et al. [17] introduced a novel concept that uses a 3D2D CNN for feature extraction and an optimized deep GAN with an improved backtracking search (IBS) algorithm for classification. Their integrated 3D2D DCNN effectively extracted features related to rice diseases, achieving an improved accuracy of 98.7% and surpassing existing techniques.

In another work, Daniya and Srinivasan [18] presented a shuffled shepherd social optimization (SSSO) technique for leaf disease classification. This method used a deep maxout network for classification and an LSTM to attain a severity percentage detection of 7.24%, which the authors report as the most accurate among existing techniques.

Aufar et al. [19] developed a leaf disease classifier based on several neural network methods, namely InceptionResNetV2, ResNet50, DenseNet169, and MobileNetV2, with the aim of optimizing classification accuracy. The data were split in an 80:10:10 ratio for training, validation, and testing. In their results, InceptionResNetV2 achieved the highest accuracy of 100%.

To overcome the limitation of insufficient dataset samples, Ashwini and Sellam [20] presented a pre-processing method for corn leaf disease classification. The Ebola optimization search (EOS) method is used to decrease the classification errors of their 3D-CNN model, and the reported accuracy improves with the support of EOS. Tests were conducted to demonstrate the efficacy of the model.

Yang et al. [21] presented an adversarial training method with a collaborative multi-path feature aggregation network to predict leaf disease. The module is divided into several paths that enable feature extraction and the capture of long-range information. Their method achieved a 99.50% average precision on the Plant Village dataset, validating its effectiveness in prediction.

Lastly, Stephen et al. [22] presented several CNN models to identify and categorize healthy leaves and diseased leaves affected by Brown spot, Leaf Blast, and Hispa. They used ResNet50 and ResNet34 to avoid vanishing-gradient issues, and the self-attention ResNet34 model achieved a higher accuracy (98.54%) than the other CNN methods.

3. Preliminaries

The proposed method integrates three models: GAN, LSTM, and CatBoost. In this work, the GAN is used to generate realistic data, the LSTM to capture sequential dependencies, and CatBoost to handle categorical features efficiently. With this combination, the proposed method attains higher performance and accuracy in leaf disease prediction. This hybrid model offers opportunities to address complex tasks and extend the boundaries of DL capabilities. The basic idea of the three methods is given in the following:

3.1 GAN model

The GAN is a DL technique that consists of a generator and a discriminator. The generator module ‘G’ produces synthetic data samples, such as images or text, while the discriminator module ‘D’ classifies samples as real or fake. The modules are trained competitively: ‘G’ aims to generate samples realistic enough to deceive the discriminator, which in turn aims to categorize the samples correctly. This training process allows the generator to gradually produce more realistic data.

The generator network takes random noise as input and creates synthetic data samples. Initially, the generated samples are random and meaningless. During training, the generator learns to produce samples that resemble the real data by mapping the noise input to the target data distribution.

The discriminator network acts as a binary classifier. It receives real data samples from the training set and generated samples as input and learns to differentiate between them. Its goal is to classify the real data as real and the generated data as fake.

The GAN is trained through an adversarial loss function. The generator seeks to minimize this loss, while the discriminator seeks to maximize it. Optimization updates the weights of both networks through backpropagation, allowing them to learn and improve iteratively.
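The adversarial loss described above is commonly written as the minimax objective of the standard GAN formulation. The text does not state which loss variant was used in this work, so the form below is given as an assumption; $p_{\text{data}}$ denotes the real data distribution and $p_z$ the noise distribution, both symbols introduced here for illustration.

```latex
\min_{G} \max_{D} V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The first term rewards the discriminator for scoring real samples as real, and the second rewards it for scoring generated samples $G(z)$ as fake, which is the quantity the generator tries to reduce.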

3.2 LSTM model

The LSTM model is based on the RNN architecture and addresses a key limitation of standard RNNs, namely capturing long-term dependencies in sequential data. It is constructed specifically to retain and use information over extended sequences, enabling effective sequential data analysis.

The LSTM network comprises a memory cell with three distinct gates that regulate information flow: an input gate, a forget gate, and an output gate.

The input gate manages the new data written to the memory cell. The forget gate decides whether data from the previous time step is retained or discarded. The output gate controls which information is output from the memory cell. By selectively varying the data flow through these gates, LSTMs capture and remember intricate dependencies within sequential data. This capability to process sequential patterns has proven valuable in various tasks. Ongoing research continues to enhance the LSTM method and explore its potential across diverse areas of DL and data analysis.
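The gate mechanism described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration of the standard LSTM cell equations, not the implementation used in this work; the weight layout (gates stacked in the order input, forget, output, candidate) and all variable names are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*H, D) input weights, U: (4*H, H) recurrent weights, b: (4*H,) bias.
    Assumed gate order in the stacked arrays: input, forget, output, candidate.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])           # input gate: how much new data enters the cell
    f = sigmoid(z[H:2 * H])       # forget gate: how much old cell state is kept
    o = sigmoid(z[2 * H:3 * H])   # output gate: how much of the cell is exposed
    g = np.tanh(z[3 * H:4 * H])   # candidate values to write into the cell
    c_t = f * c_prev + i * g      # updated memory cell
    h_t = o * np.tanh(c_t)        # updated hidden state
    return h_t, c_t

# Run a short random sequence through the cell (toy sizes).
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(size=(4 * H, D)) * 0.1
U = rng.normal(size=(4 * H, H)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

With all weights set to zero, every gate evaluates to 0.5, so the cell state is simply halved at each step; this makes the role of the forget gate easy to see in isolation.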

3.3 CatBoost model

CatBoost, or Categorical Boosting, is a gradient boosting algorithm specifically constructed to handle categorical features in learning tasks. It incorporates an algorithm called "ordered boosting" and processes categorical variables without requiring extensive preprocessing. It handles categorical features by converting them to numerical values based on the target variable's statistics for each category. This approach enables CatBoost to use categorical information effectively during training.

It uses decision trees as weak learners and builds them into an ensemble in which each subsequent tree corrects the mistakes made by the preceding trees, yielding a stronger predictive model. By using gradient-based optimization, CatBoost trains efficiently and accurately.

A key feature is its ability to handle missing values automatically: missing data points are processed during the training phase without requiring explicit imputation. This, together with its efficient handling of categorical features and its high predictive accuracy, makes it a popular choice for real-world machine learning problems.
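The idea of encoding a categorical feature from target statistics can be illustrated with a small pure-Python sketch. It is a simplified version of an ordered target statistic: each row is encoded using only the targets of earlier rows with the same category, smoothed by a prior. The function name, the `prior` and `a` parameters, and the toy data are assumptions for illustration; the CatBoost library itself additionally uses random permutations and other details not shown here.

```python
def ordered_target_stats(categories, targets, prior=0.5, a=1.0):
    """Encode each category value from the targets of *earlier* rows only.

    encoded = (sum of earlier targets for this category + a * prior)
              / (count of earlier rows for this category + a)
    """
    sums, counts, encoded = {}, {}, []
    for cat, y in zip(categories, targets):
        s = sums.get(cat, 0.0)
        n = counts.get(cat, 0)
        encoded.append((s + a * prior) / (n + a))
        # Only after encoding the row is its own target added to the statistics,
        # so a row never sees its own label.
        sums[cat] = s + y
        counts[cat] = n + 1
    return encoded

# Toy example: a hypothetical categorical column and a binary target.
cats = ["blight", "mosaic", "blight", "blight", "mosaic"]
ys = [1, 0, 1, 0, 1]
enc = ordered_target_stats(cats, ys)
```

The first occurrence of each category receives the prior, and later occurrences move toward the running mean of the target for that category.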

4. Materials and Methods

This section describes the proposed workflow shown in Figure 1, which comprises the input dataset, pre-processing, the proposed Nutcracker-optimized LSTM-GAN-CATBOOST model, and performance evaluation. Each block is explained in detail below.


Figure 1. Proposed leaf disease detection system

4.1 Dataset description

In this research, the dataset is obtained from the Plant Village dataset. It contains 18,161 tomato leaf images with leaf segmentation masks [23]. These images are employed for both segmentation and classification training. The dataset is divided into ten classes: one healthy class and nine unhealthy classes, namely bacterial spot, leaf mould, early blight, septoria leaf spot, two-spotted spider mite, target spot, late blight, yellow leaf curl virus, and mosaic virus. The unhealthy classes are further grouped into bacterial, fungal, viral, mite, and mould diseases. The proposed model is coded in Python 3.7.0 with the supporting packages TensorFlow, scikit-learn, and boosting model packages.

4.2 Pre-processing

Image pre-processing prepares the dataset by adjusting the input images through a few essential steps. Once the image dataset is provided, techniques such as resizing, cropping, normalization, colour space conversion, filtering, enhancement, augmentation, histogram equalization, and feature extraction are applied. These techniques standardize the images and eliminate undesired variation, while enhancing significant features and augmenting the dataset. Pre-processing thus refines the input data to improve model training and the model's performance in capturing patterns and attaining accurate detection.

After the pre-processing stage, the data are divided into training and testing samples. The training set comprises three quarters of the data, and the remaining quarter is used for testing. The proposed optimized hybrid model is employed for classification in both the training and testing phases. This supports the reliability of the predictions and the effectiveness of leaf disease detection.
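As a rough illustration of two of the steps above, the following sketch normalizes pixel values and performs the 3:1 train/test split. It uses randomly generated stand-in arrays rather than the Plant Village images, and the function names, the shuffling before the split, and the fixed seed are assumptions; the text does not say how the split was drawn.

```python
import numpy as np

def normalize_images(images):
    """Scale 8-bit pixel values to floats in [0, 1]."""
    return images.astype(np.float32) / 255.0

def split_train_test(x, y, train_fraction=0.75, seed=0):
    """Shuffle the samples and split them into training and testing parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = int(round(train_fraction * len(x)))
    train_idx, test_idx = order[:cut], order[cut:]
    return x[train_idx], y[train_idx], x[test_idx], y[test_idx]

# Stand-in data: 40 random 8x8 RGB "images" with labels from ten classes.
rng = np.random.default_rng(1)
images = rng.integers(0, 256, size=(40, 8, 8, 3)).astype(np.uint8)
labels = rng.integers(0, 10, size=40)

x = normalize_images(images)
x_train, y_train, x_test, y_test = split_train_test(x, labels)
```

With 40 samples this yields 30 training and 10 testing samples. For a dataset with uneven class sizes, a stratified split would be a reasonable alternative.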

4.3 Training phase

Optimised Hybrid GAN-LSTM-CATBOOST Model

The LSTM-GAN-CATBOOST algorithm consists of two sub-models: LSTM-GAN and Dis1-CATBOOST. The first sub-model, LSTM-GAN, is trained using a dataset containing only normal class samples. It combines LSTM and GAN techniques to extract features from time series data [24]. The LSTM component captures temporal dependencies, while the GAN component generates synthetic samples with features similar to those of the normal class data. The LSTM-GAN model consists of a generator ‘G’ and a discriminator ‘D’.

The generator takes Gaussian noise as input and produces synthetic samples that resemble the normal class data. The discriminator, on the other hand, aims to differentiate between real and generated samples. It is trained to categorize the generated samples as fake, while the ‘G’ module aims to fool the ‘D’ module into classifying generated samples as real. The training data used for LSTM-GAN consists of the normal class dataset.

Once training is complete, the generator produces samples with features similar to those of the normal class training data. The discriminator becomes proficient in extracting these features and distinguishing between normal and abnormal data samples. The LSTM-GAN training architecture is illustrated in Figure 2.


Figure 2. Training of LSTM-GAN

As shown in Figure 2, the discriminator consists of Dis1 and Dis2. Dis1 extracts 128 features from the time series sample, while Dis2 is the discriminator’s final layer, which outputs an abnormality score based on the relationships among the extracted features. The features extracted by Dis1 are used to train the second sub-model, Dis1-CATBOOST [25].

The hierarchical structures of the generator and the discriminator are depicted in Figure 3. The generator primarily consists of three convolutional layers, while the discriminator incorporates one LSTM layer and three convolutional layers.


Figure 3. (a) Generator structure (b) Discriminator structure

Within the LSTM-GAN architecture, various layers and techniques are employed. Conv1D layers extract multidimensional features from the time series data, LeakyReLU layers keep a small gradient for inactive (negative) neurons to facilitate learning, and the LSTM layer captures temporal information. BatchNormalization layers keep the inputs to each layer at a consistent distribution, aiding the training process. Dropout layers are used to prevent overfitting.

To reach an optimal result when training the LSTM-GAN model, its hyperparameters are fine-tuned using the Nutcracker optimization method.

NutCracker Optimization Technique

This method is inspired by the behavior of nutcracker birds. It solves optimization problems by mimicking the foraging behavior of these birds and searches iteratively for the best solution. Nutcracker birds open nuts through pecking and cracking actions, and the exploration and exploitation phases of the optimizer are modelled on these behaviors.

This optimization model is applied to the hyperparameter tuning of the LSTM-GAN model. Hyperparameters are crucial settings that govern the behaviour and performance of the model. Finding their optimal values is a challenging task, and it has a large impact on the effectiveness of the LSTM-GAN model.

The optimizer must carry out exploration and exploitation of the search space efficiently. The hyperparameters of the LSTM-GAN model tuned here include the number of LSTM layers, hidden layers, dropout rate, learning rate, batch size, number of training iterations, loss function, and optimizer. Every nutcracker explores a specific combination of hyperparameters, similar to how a nutcracker bird selects a nut to crack open.

During the exploration phase, each nutcracker assesses its hyperparameters by evaluating the LSTM-GAN model, with performance measured by accuracy or loss. The nutcrackers aim to improve their performance by iteratively searching for better hyperparameters. This can be modelled mathematically as follows:

$x_{i,j}^{P1s1}=x_{i,j}+\mathrm{rand} \cdot\left(M_{i,j}-I_{i,j}\, x_{i,j}\right), \quad i=1,2,\ldots,N$ (1)

where $x_{i,j}^{P1s1}$ is the new position of the $i$th nutcracker in dimension $j$, rand is a random number between zero and one, $M$ denotes the selected population member, $I_{i,j}$ are random numbers that vary from one to two, and $N$ is the number of nutcrackers.

In the exploitation phase, the nutcrackers exchange the information and knowledge gained from their exploration. This information is used to search promising regions of the hyperparameter space and potentially find better solutions. By combining what they have learned, the nutcrackers can exploit the best hyperparameter configurations. On this basis, the hyperparameter tuning applies mutation and crossover operations. These mechanisms introduce diversity and enable the exploration of different hyperparameter combinations, preventing the algorithm from getting stuck in suboptimal solutions. This can be modelled mathematically as follows:

$x_{i,j}^{P2s1}=\begin{cases} x_{i,j}+\mathrm{rand} \cdot\left(x_{k,j}-I_{i,j}\, x_{i,j}\right), & F_k<F_i \\ x_{i,j}+\mathrm{rand} \cdot\left(x_{i,j}-x_{k,j}\right), & \text{otherwise} \end{cases}$ (2)

where $x_{i,j}^{P2s1}$ is the new position of the $i$th nutcracker, $x_{k,j}$ is the position of another nutcracker $k$, $F_i$ and $F_k$ are their fitness values, and rand is a random number between zero and one.

By iteratively applying the Nutcracker Optimization Method [26] and evaluating the LSTM-GAN model with different hyperparameter sets, the optimal hyperparameters are obtained for the specific task or dataset, as shown in Algorithm 1.

Algorithm 1: NutCracker Optimization-based LSTM-GAN model

Step 1: Set the population size and the maximum number of iterations, and define the search space for hyperparameters.

Step 2: Initialize a population of nutcrackers with random hyperparameter configurations.

Step 3: Evaluate the fitness of each nutcracker's hyperparameters using the LSTM-GAN model on training data.

Step 4: Keep track of the nutcracker with the best fitness (highest performance).

Step 5: Iterate for a specified number of iterations:

a. Perform local search operations to exchange information among nutcrackers.

b. Introduce diversity through mutation and crossover to maintain exploration.

c. Update each nutcracker's position using Eq. (1) and Eq. (2), and re-evaluate its fitness.

d. Retain the nutcracker with the best fitness.

Step 6: Return the hyperparameters with the best fitness as the optimized configuration.
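The steps of Algorithm 1 can be sketched as the loop below. This is an illustrative sketch under several assumptions that the text does not specify: fitness is minimized, $M$ is taken to be the current best member, $I$ is drawn from {1, 2}, a new position is accepted only if it improves fitness, positions are clipped to the search bounds, and the mutation and crossover operations of Step 5b are omitted. The `toy_fitness` function is a hypothetical stand-in for training and evaluating the LSTM-GAN model, and its target values are made up.

```python
import numpy as np

def nutcracker_search(fitness, lower, upper, pop_size=10, max_iter=30, seed=0):
    """Minimise `fitness` over a box using updates in the form of Eq. (1) and Eq. (2)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    # Steps 1-2: random initial population inside the search space.
    x = lower + rng.random((pop_size, dim)) * (upper - lower)
    # Step 3: evaluate the fitness of every nutcracker.
    f = np.array([fitness(p) for p in x])
    history = []

    def try_move(i, step):
        # Candidate position, kept inside the bounds (assumption).
        cand = np.clip(x[i] + rng.random(dim) * step, lower, upper)
        fc = fitness(cand)
        if fc < f[i]:  # greedy acceptance (assumption)
            x[i], f[i] = cand, fc

    # Step 5: iterate.
    for _ in range(max_iter):
        for i in range(pop_size):
            # Eq. (1): exploration, with M assumed to be the current best member.
            m = x[np.argmin(f)].copy()
            I = rng.integers(1, 3, size=dim)  # values in {1, 2}
            try_move(i, m - I * x[i])
            # Eq. (2): exploitation relative to a randomly chosen other member k.
            k = int(rng.choice([j for j in range(pop_size) if j != i]))
            I = rng.integers(1, 3, size=dim)
            step = x[k] - I * x[i] if f[k] < f[i] else x[i] - x[k]
            try_move(i, step)
        # Step 4 / 5d: track the best fitness so far.
        history.append(float(f.min()))
    # Step 6: return the best configuration found.
    best = int(np.argmin(f))
    return x[best].copy(), float(f[best]), history

def toy_fitness(params):
    """Hypothetical stand-in for a validation loss over (learning rate, dropout)."""
    learning_rate, dropout = params
    return (learning_rate - 0.01) ** 2 + (dropout - 0.3) ** 2

best_params, best_fitness, history = nutcracker_search(
    toy_fitness, lower=[1e-4, 0.0], upper=[0.1, 0.5]
)
```

Because a move is accepted only when it lowers the fitness, the best value recorded in `history` never increases from one iteration to the next. In practice each fitness call would involve training the model, so the population size and iteration count would need to be kept small.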

This optimization can significantly improve the performance of the LSTM-GAN model. In the training phase, the LSTM-GAN-CATBOOST algorithm employs the optimized LSTM-GAN model for feature extraction and generation of synthetic samples. The CATBOOST algorithm, applied as Dis1-CATBOOST, uses the features extracted by the discriminator to classify time series data. The hybrid of LSTM, GAN, and CATBOOST provides accurate classification and time series analysis to enable abnormal leaf detection.

Dis1-CATBOOST

This model serves as the second sub-model of the LSTM-GAN-CATBOOST model for leaf disease prediction. It relies on the trained discriminator component Dis1 from the first sub-model. The training process and data preprocessing for Dis1-CATBOOST are described below.

The Dis1-CATBOOST training set is depicted in Figure 4. To address the class imbalance problem, the dataset is preprocessed so that abnormal data and normal data each account for 50% of the samples. These samples are fed into Dis1 to obtain two feature sets: Feature Set 1 for normal data and Feature Set 2 for abnormal data. Both feature sets have 128 dimensions. The labels for normal data are set to 0 (Label Set 1), while the labels for abnormal data are set to 1 (Label Set 2). Feature Set 1, Label Set 1, Feature Set 2, and Label Set 2 together form the CATBOOST training dataset. The objective is to train CATBOOST to accurately classify normal and abnormal samples. The feature extraction performed by Dis1 on normal and abnormal data therefore plays a crucial role in constructing CATBOOST, since the extracted features are its input.
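The construction of the training set can be sketched as follows. The `dis1_features` function is a hypothetical placeholder for the trained Dis1 extractor (here a fixed random projection to 128 features), and balancing by undersampling the larger class is an assumption, since the text does not say how the 50/50 balance is obtained. The input data are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder weights standing in for the trained Dis1 layers (assumption).
PROJECTION = rng.normal(size=(32, 128))

def dis1_features(samples):
    """Hypothetical stand-in for Dis1: maps each sample to 128 features."""
    return np.tanh(samples @ PROJECTION)

def build_balanced_set(normal, abnormal, rng):
    """Return features and 0/1 labels with equal numbers of normal and abnormal rows."""
    n = min(len(normal), len(abnormal))
    # Undersample the larger class so each class contributes 50% (assumption).
    normal = normal[rng.choice(len(normal), size=n, replace=False)]
    abnormal = abnormal[rng.choice(len(abnormal), size=n, replace=False)]
    feature_set_1 = dis1_features(normal)      # Feature Set 1: normal data
    feature_set_2 = dis1_features(abnormal)    # Feature Set 2: abnormal data
    label_set_1 = np.zeros(n, dtype=int)       # Label Set 1: normal -> 0
    label_set_2 = np.ones(n, dtype=int)        # Label Set 2: abnormal -> 1
    features = np.vstack([feature_set_1, feature_set_2])
    labels = np.concatenate([label_set_1, label_set_2])
    return features, labels

# Stand-in inputs: 100 normal and 40 abnormal samples with 32 raw values each.
normal = rng.normal(size=(100, 32))
abnormal = rng.normal(loc=1.0, size=(40, 32))
features, labels = build_balanced_set(normal, abnormal, rng)
```

The resulting `features` and `labels` arrays are in the form a gradient boosting classifier would take as training input.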


Figure 4. Dis1-CATBOOST training set

5. Testing Phase

After training, the proposed model is evaluated in a testing phase, shown in Figure 5. The aim is to obtain detection results by scoring time series data for abnormality. As shown in Figure 5, the features of the testing data are extracted and used as input for the generator to reconstruct the data.


Figure 5. Testing process of hybrid model

The RMSE between the restored data and the original data is taken as the reconstruction residual (RR), the MAE is used to check the RR for abnormal values, and the discriminator's output on the testing data serves as the discrimination loss.

For normal test data, both the discrimination loss and the RR theoretically have values of 0. For abnormal test data, however, the RR is greater than 0 and the discrimination loss is 1. To balance the discrimination loss and the RR, a parameter alpha (α) with a value of 0.5 is introduced. The discrimination loss and the RR are combined using α as a weight to generate a leaf disease detection score. When this score exceeds the threshold beta (β), set to 0.5, the test data is classified as abnormal, that is, as diseased. In this way, the proposed method achieves leaf disease prediction based on time series data. Importantly, α and β influence each other and affect the prediction results.
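The scoring rule can be sketched in a few lines. The text states only that α weights the two terms, so the convex combination α·RR + (1 − α)·loss used below is an assumption, as are the function names and the toy inputs.

```python
import numpy as np

def detection_score(original, restored, disc_loss, alpha=0.5):
    """Combine the reconstruction residual and discrimination loss into one score.

    RR is the RMSE between original and restored data; the weighting
    alpha * RR + (1 - alpha) * disc_loss is an assumed form.
    """
    original = np.asarray(original, dtype=float)
    restored = np.asarray(restored, dtype=float)
    rr = np.sqrt(np.mean((original - restored) ** 2))
    return alpha * rr + (1.0 - alpha) * disc_loss

def is_diseased(score, beta=0.5):
    """Flag the sample as abnormal when the score exceeds the threshold beta."""
    return bool(score > beta)

x = np.array([0.2, 0.4, 0.6])
# Ideal normal case: perfect reconstruction and zero discrimination loss.
normal_score = detection_score(x, x, disc_loss=0.0)
# Abnormal case: reconstruction off by 0.5 everywhere and discrimination loss of 1.
abnormal_score = detection_score(x, x + 0.5, disc_loss=1.0)
```

In the ideal normal case the score is 0, while in the abnormal case above it is about 0.75, which exceeds β = 0.5. Under this assumed form, raising α puts more weight on reconstruction quality, which is one way α and β interact.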

6. Experimental Result

The performance evaluation and comparison of the proposed optimised hybrid model are presented in this section. The effectiveness of leaf disease prediction with the proposed approach is assessed using precision, recall, F1 score, specificity, accuracy, RMSE, and MAE.

Precision: It is the ratio of true positives (TP) to the sum of true positive and false positive (FP) predictions:

$\text{Precision}(\mathrm{P})=\mathrm{TP} /(\mathrm{TP}+\mathrm{FP})$ (3)

Recall: Also known as sensitivity or TP rate, it quantifies the model's capability to correctly identify positive instances. It is computed as the ratio of TP to the sum of TP and false negatives (FN):

$\operatorname{Recall}(\mathrm{R})=\mathrm{TP} /(\mathrm{TP}+\mathrm{FN})$ (4)

F1 Score: It combines ‘P’ and ‘R’ into a single value that gives a balanced measure of the model's performance:

$\text{F1 Score}=(2 \cdot \mathrm{P} \cdot \mathrm{R}) /(\mathrm{P}+\mathrm{R})$ (5)

Specificity: It measures the ability to correctly identify negative instances, computed as the ratio of true negatives (TN) to the sum of TN and FP:

$\text{Specificity}=\mathrm{TN} /(\mathrm{TN}+\mathrm{FP})$ (6)

RMSE: It is an evaluation metric used to assess the accuracy of continuous prediction models. It is the square root of the average squared difference between the predicted and actual values of leaf disease across all instances:

$\mathrm{RMSE}=\sqrt{\frac{1}{n} \sum_{a=1}^{n}\left(d_a^{\prime}-d_a\right)^2}$ (7)

MAE: It is another evaluation metric that measures the average of the absolute differences between the predicted and actual values of leaf disease across all instances:

$\mathrm{MAE}=\frac{1}{n} \sum_{a=1}^{n}\left|d_a^{\prime}-d_a\right|$ (8)

where $n$ represents the total number of instances, $d_a^{\prime}$ denotes the predicted value, and $d_a$ denotes the actual value of leaf disease.

Accuracy: It measures the overall correctness of predictions made by a model. It represents the percentage of instances that were correctly classified out of the total number of instances.
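The metrics defined in Eqs. (3) to (8), together with accuracy, can be computed directly from confusion-matrix counts and paired predictions. The sketch below uses made-up counts and values purely to illustrate the formulas; they are not results from this study, and the function names are chosen for the example.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F1, specificity, and accuracy from confusion counts."""
    precision = tp / (tp + fp)                           # Eq. (3)
    recall = tp / (tp + fn)                              # Eq. (4)
    f1 = 2 * precision * recall / (precision + recall)   # Eq. (5)
    specificity = tn / (tn + fp)                         # Eq. (6)
    accuracy = (tp + tn) / (tp + fp + tn + fn)           # correct / total
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
        "accuracy": accuracy,
    }

def rmse(predicted, actual):
    """Root mean square error, Eq. (7)."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def mae(predicted, actual):
    """Mean absolute error, Eq. (8)."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n

# Illustrative counts and values only.
m = classification_metrics(tp=90, fp=10, tn=85, fn=15)
example_rmse = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
example_mae = mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Since F1 is the harmonic mean of precision and recall, it always lies between the two, and RMSE is never smaller than MAE on the same data.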

Table 1 compares different approaches used for leaf disease prediction, along with their corresponding evaluation metrics. The proposed model outperforms all the other techniques across all metrics. In terms of the error metrics shown in Figure 7, the proposed model exhibits a low root mean square error (RMSE) of 1.56, indicating a small average magnitude of error between the predicted and actual values. Similarly, the mean absolute error (MAE) is also low at 1.35, further supporting the accuracy of the predictions.

As shown in Figure 6, the proposed model achieves a precision of 98%, indicating that a high percentage of the instances predicted as positive by the model are indeed correct (Figure 8(a)). It also achieves a recall of 97.2%, accurately identifying a significant portion of the actual positive instances (Figure 8(b)). The F1 score, which represents a balance between precision and recall, is reported as 98.3%, reflecting the overall accuracy of the proposed model in classifying positive instances (Figure 8(c)). Furthermore, the proposed model demonstrates a specificity of 97.5%, correctly identifying a large proportion of the actual negative instances (Figure 8(d)). With an accuracy of 98.6%, the model correctly classifies the vast majority of all instances (Figure 8(e)).

Compared with other techniques such as GAN, LSTM, RNN, CATBOOST, XGBOOST, CNN, and SVM, the proposed optimized hybrid LSTM-GAN-CATBOOST model consistently performs better in terms of precision, recall, F1 score, specificity, accuracy, RMSE, and MAE. These results highlight the effectiveness and reliability of the proposed model in predicting leaf diseases with high accuracy and minimal error.

Table 1. Comparison of the proposed model with other models

Techniques	Precision (%)	Recall (%)	F1 Score (%)	Specificity (%)	Accuracy (%)	RMSE	MAE
Proposed	98	97.2	98.3	97.5	98.6	1.56	1.35
GAN	96.2	96	95.5	94	97.2	2.3	2.66
LSTM	95.5	95	94.2	93.2	97	2.8	3.2
RNN	94	93.2	92.6	92.6	95.3	3.2	3.45
CATBOOST	93.4	91	90	91.8	94.5	3.5	3.9
XGBOOST	92	90	89.8	88.5	92	4.2	4.6
CNN	90	89.5	88	86	90.5	4.8	4.93
SVM	89	88	86.5	85	88.5	5.5	5.89
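The ranking reported in Table 1 can be checked directly from the tabulated values. The following is a minimal sketch; the dictionary layout and the helper names `best_per_metric` and `margin_over_best_baseline` are illustrative choices, not part of the paper.

```python
# Values transcribed from Table 1; column order follows the table header.
COLUMNS = ["Precision", "Recall", "F1", "Specificity", "Accuracy", "RMSE", "MAE"]
LOWER_IS_BETTER = {"RMSE", "MAE"}  # error metrics: smaller is better

TABLE_1 = {
    "Proposed": [98, 97.2, 98.3, 97.5, 98.6, 1.56, 1.35],
    "GAN": [96.2, 96, 95.5, 94, 97.2, 2.3, 2.66],
    "LSTM": [95.5, 95, 94.2, 93.2, 97, 2.8, 3.2],
    "RNN": [94, 93.2, 92.6, 92.6, 95.3, 3.2, 3.45],
    "CATBOOST": [93.4, 91, 90, 91.8, 94.5, 3.5, 3.9],
    "XGBOOST": [92, 90, 89.8, 88.5, 92, 4.2, 4.6],
    "CNN": [90, 89.5, 88, 86, 90.5, 4.8, 4.93],
    "SVM": [89, 88, 86.5, 85, 88.5, 5.5, 5.89],
}


def best_per_metric(table):
    """Return the best-scoring technique for each metric column."""
    best = {}
    for i, col in enumerate(COLUMNS):
        pick = min if col in LOWER_IS_BETTER else max
        best[col] = pick(table, key=lambda name: table[name][i])
    return best


def margin_over_best_baseline(table, target="Proposed"):
    """Return how far `target` is ahead of the best other technique per metric."""
    margins = {}
    for i, col in enumerate(COLUMNS):
        others = [row[i] for name, row in table.items() if name != target]
        if col in LOWER_IS_BETTER:
            margins[col] = round(min(others) - table[target][i], 2)
        else:
            margins[col] = round(table[target][i] - max(others), 2)
    return margins


print(best_per_metric(TABLE_1))
print(margin_over_best_baseline(TABLE_1))
```

A positive margin in every column means the proposed model leads the best baseline on that metric; in Table 1 the closest baseline on each metric is GAN.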

Figure 6. Performance metrics based on leaf disease prediction

Figure 7. Performance metrics of error evaluation

Figure 8. Performance charts for (a) precision, (b) recall, (c) F1 score, (d) specificity, and (e) accuracy

7. Conclusion

This work presented a hybrid model that combines LSTM-GAN and CATBOOST algorithms for efficient leaf disease prediction. Nutcracker optimization was used to fine-tune the LSTM-GAN model’s hyperparameters, and the CATBOOST algorithm was then applied to improve classification performance. The proposed model was evaluated on leaf disease prediction and achieved precision, recall, F1 score, specificity, accuracy, RMSE, and MAE values of 98%, 97.2%, 98.3%, 97.5%, 98.6%, 1.56, and 1.35, respectively. These results show that the proposed method outperforms the compared techniques (GAN, LSTM, RNN, CATBOOST, XGBOOST, CNN, and SVM), attaining high accuracy, reliability, and low error in detecting leaf diseases. Future research can focus on extending the proposed approach to different crop species and leaf disease types. Additionally, exploring other optimization techniques and ensemble models could further enhance predictive performance. Overall, continued improvement in leaf disease prediction techniques is crucial for ensuring crop health and agricultural productivity.

References

[1] Sudar, K.M., Nagaraj, P., Prakash, B., Reddy, M.M., Naidu, M.M., Kumar, H. (2022). Development of tomato leaf disease prediction system to the farmers by using artificial intelligent network. In 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, pp. 955-961. https://doi.org/10.1109/ICICCS53718.2022.9788189

[2] Vasavi, P., Punitha, A., Rao, T.V.N. (2022). Crop leaf disease detection and classification using machine learning and deep learning algorithms by visual symptoms: A review. International Journal of Electrical and Computer Engineering, 12(2): 2079. https://doi.org/10.11591/ijece.v12i2.pp2079-2086

[3] Peng, Y., Wang, Y. (2022). Leaf disease image retrieval with object detection and deep metric learning. Frontiers in Plant Science, 13: 963302. https://doi.org/10.3389/fpls.2022.963302

[4] Kaur, P., Harnal, S., Tiwari, R., Upadhyay, S., Bhatia, S., Mashat, A., Alabdali, A.M. (2022). Recognition of leaf disease using hybrid convolutional neural network by applying feature reduction. Sensors, 22(2): 575. https://doi.org/10.3390/s22020575

[5] Paymode, A.S., Malode, V.B. (2022). Transfer learning for multi-crop leaf disease image classification using convolutional neural network VGG. Artificial Intelligence in Agriculture, 6: 23-33. https://doi.org/10.1016/j.aiia.2021.12.002

[6] Harakannanavar, S.S., Rudagi, J.M., Puranikmath, V.I., Siddiqua, A., Pramodhini, R. (2022). Plant leaf disease detection using computer vision and machine learning algorithms. Global Transitions Proceedings, 3(1): 305-310. https://doi.org/10.1016/j.gltp.2022.03.016

[7] Sungheetha, A. (2022). State of art survey on plant leaf disease detection. Journal of Innovative Image Processing, 4(2): 93-102. https://doi.org/10.36548/jiip.2022.2.004

[8] Mohapatra, M., Parida, A.K., Mallick, P.K., Zymbler, M., Kumar, S. (2022). Botanical leaf disease detection and classification using convolutional neural network: A hybrid metaheuristic enabled approach. Computers, 11(5): 82. https://doi.org/10.3390/computers11050082

[9] Lu, X., Yang, R., Zhou, J., Jiao, J., Liu, F., Liu, Y., Su, B., Gu, P. (2022). A hybrid model of ghost-convolution enlightened transformer for effective diagnosis of grape leaf disease and pest. Journal of King Saud University-Computer and Information Sciences, 34(5): 1755-1767. https://doi.org/10.1016/j.jksuci.2022.03.006

[10] Sridevi, S., Kiran Kumar, K. (2024). Optimised hybrid classification approach for rice leaf disease prediction with proposed texture features. Journal of Control and Decision, 11(1): 84-97. https://doi.org/10.1080/23307706.2022.2141359

[11] Suresh, Seetharaman, K. (2023). Real-time automatic detection and classification of groundnut leaf disease using hybrid machine learning techniques. Multimedia Tools and Applications, 82(2): 1935-1963. https://doi.org/10.1007/s11042-022-12893-1

[12] Lanjewar, M.G., Panchbhai, K.G. (2023). Convolutional neural network based tea leaf disease prediction system on smart phone using paas cloud. Neural Computing and Applications, 35(3): 2755-2771. https://doi.org/10.1007/s00521-022-07743-y

[13] Bouni, M., Hssina, B., Douzi, K., Douzi, S. (2023). Impact of pretrained deep neural networks for tomato leaf disease prediction. Journal of Electrical and Computer Engineering, 2023(1): 5051005. https://doi.org/10.1155/2023/5051005

[14] Mahesh, T.R., Sivakami, R., Manimozhi, I., Krishnamoorthy, N., Swapna, B. (2023). Early predictive model for detection of plant leaf diseases using MobileNetV2 architecture. International Journal of Intelligent Systems and Applications in Engineering, 11(2): 46-54.

[15] Bhandari, M., Shahi, T.B., Neupane, A., Walsh, K.B. (2023). Botanicx-ai: Identification of tomato leaf diseases using an explanation-driven deep-learning model. Journal of Imaging, 9(2): 53. https://doi.org/10.3390/jimaging9020053

[16] Adem, K., Ozguven, M.M., Altas, Z. (2023). A sugar beet leaf disease classification method based on image processing and deep learning. Multimedia Tools and Applications, 82(8): 12577-12594.

[17] Stephen, A., Punitha, A., Chandrasekar, A. (2023). Optimal deep generative adversarial network and convolutional neural network for rice leaf disease prediction. The Visual Computer, 1-18. https://doi.org/10.1007/s11042-022-13925-6

[18] Daniya, T., Srinivasan, V. (2023). Shuffled shepherd social optimization based deep learning for rice leaf disease classification and severity percentage prediction. Concurrency and Computation: Practice and Experience, 35(4): e7523. https://doi.org/10.1002/cpe.7523

[19] Aufar, Y., Abdillah, M.H., Romadoni, J. (2023). Web-based CNN application for arabica coffee leaf disease prediction in smart agriculture. Jurnal Resti (Rekayasa Sistem Dan Teknologi Informasi), 7(1): 71-79. https://doi.org/10.29207/resti.v7i1.4622

[20] Ashwini, C., Sellam, V. (2023). EOS-3D-DCNN: Ebola optimization search-based 3D-dense convolutional neural network for corn leaf disease prediction. Neural Computing and Applications, 35(15): 11125-11139. https://doi.org/10.1007/s00521-023-08289-3

[21] Yang, W., Shen, P., Ye, Z., Zhu, Z., Xu, C., Liu, Y., Mei, L. (2023). Adversarial training collaborating multi-path context feature aggregation network for maize disease density prediction. Processes, 11(4): 1132. https://doi.org/10.3390/pr11041132

[22] Stephen, A., Punitha, A., Chandrasekar, A. (2023). Designing self attention-based ResNet architecture for rice leaf disease classification. Neural Computing and Applications, 35(9): 6737-6751. https://doi.org/10.1007/s00521-022-07793-2

[23] SpMohanty/PlantVillage-Dataset. https://github.com/spMohanty/PlantVillage-Dataset, accessed on Jan. 24, 2021.

[24] Lian, Y., Geng, Y., Tian, T. (2023). Anomaly detection method for multivariate time series data of oil and gas stations based on digital twin and MTAD-GAN. Applied Sciences, 13(3): 1891. https://doi.org/10.3390/app13031891

[25] Xu, X., Zhao, H., Liu, H., Sun, H. (2020). LSTM-Gan-xgboost based anomaly detection algorithm for time series data. In 2020 11th International Conference on Prognostics and System Health Management (PHM-2020 Jinan), Jinan, China, pp. 334-339. https://doi.org/10.1109/PHM-Jinan48558.2020.00066

[26] Abdel-Basset, M., Mohamed, R., Jameel, M., Abouhawwash, M. (2023). Nutcracker optimizer: A novel nature-inspired metaheuristic algorithm for global optimization and engineering design problems. Knowledge-Based Systems, 262: 110248. https://doi.org/10.1016/j.knosys.2022.110248
