© 2026 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Several pathological conditions affect cotton plants, resulting in lower yield, quality degradation, and economic losses. These conditions are referred to as cotton plant diseases and are caused by numerous pathogens such as fungi, bacteria, viruses, and other microbes. Cotton plant disease detection is the process of identifying and categorizing illnesses that affect cotton plants using cutting-edge technologies such as deep learning (DL). Both healthy and diseased cotton leaves were included in the datasets used to train the machine learning (ML) methods. This study uses DL to create a comprehensive cotton plant disease detection method, designed to precisely find and classify these illnesses and provide farmers with a reliable way to enhance crop management. During pre-processing, the method uses Wiener filtering based noise removal to enhance the quality of the input images, improving image features such as clarity and relevance to ensure ideal conditions for classification. EfficientNet, an advanced DL architecture, is used for feature extraction. The zebra optimization algorithm (ZOA) performs hyperparameter tuning, which is essential for method optimization. An attention-based convolutional long short-term memory (ConvLSTM) network is used for the classification task. This architecture excels at capturing both spatial and temporal dependencies in image sequences, allowing it to recognize the dynamic and evolving nature of cotton plant disease. The model is tested on the Kaggle cotton plant disease dataset, which includes a variety of classes such as aphids and bacterial blight, along with categories for healthy and diseased leaves. The results show its effectiveness in correctly recognizing and categorizing different illnesses.
The proposed cotton plant disease detection model combines noise removal, feature extraction, hyperparameter tuning, and classification techniques, presenting a viable option for precision agriculture by providing a tool for early disease detection and decision making in efficient crop management.
cotton plant disease detection, Wiener filtering, EfficientNet, zebra optimization algorithm, attention-based ConvLSTM
The prevalence of plant diseases poses a major challenge to agriculture by reducing crop yield and deteriorating product quality. Hence, it is essential to adopt effective methods for early and accurate detection of diseases in agriculture. Among major commercial crops, cotton plays a vital role in the textile industry and contributes substantially to agricultural economies in many countries [1, 2]. However, cotton plants are susceptible to various leaf diseases such as bacterial blight, powdery mildew, target spot, aphids, and armyworm infestations. These diseases cause severe damage to leaves, reduce photosynthetic activity, and ultimately decrease crop yield and quality. Traditionally, farmers identify these diseases through manual visual inspection of plant leaves. Although this method is widely practiced, it is often time-consuming, subjective, and impractical for monitoring large-scale agricultural fields. Furthermore, incorrect diagnosis may lead to improper pesticide usage, which increases production costs and negatively impacts the environment.
With the rapid advancement of artificial intelligence and computer vision techniques, automated plant disease detection systems have emerged as an effective solution for improving agricultural monitoring [3, 4]. Machine learning (ML) and deep learning (DL) models have demonstrated strong capability in analyzing plant images and identifying disease symptoms [5-7]. In particular, Convolutional Neural Networks (CNNs) have been widely used for feature extraction and classification in plant disease detection tasks. Several studies have applied DL models such as VGG, Inception, ResNet, and EfficientNet for leaf disease classification with promising results [8]. Transfer learning techniques have also been utilized to improve model performance when training datasets are limited. Despite these developments, several challenges still remain in developing reliable automated disease detection systems [9-11]. Variations in illumination, background complexity, leaf orientation, and similarity between healthy and diseased leaf textures can reduce the accuracy of classification models. In addition, many existing approaches rely on manually selected hyperparameters, which may limit the adaptability and performance of the models [12-15].
To overcome these limitations, this research proposes an integrated DL framework for cotton plant disease detection that combines image preprocessing, feature extraction, optimization, and classification techniques. Initially, Wiener filtering is applied during the preprocessing stage to remove noise and enhance the quality of input images. This step improves image clarity and preserves important visual patterns that are necessary for reliable disease identification. After preprocessing, EfficientNet is employed as the feature extraction model. EfficientNet is known for its scalable architecture and balanced network design, which enables efficient learning of complex visual features from plant leaf images.
In order to further improve the performance of the DL model, hyperparameter tuning is performed using the ZOA. ZOA is a swarm-based optimization technique inspired by the natural behavior of zebras during foraging and defense against predators. The algorithm effectively explores the search space to identify optimal hyperparameter values, which enhances the learning capability of the model and reduces the dependency on manual parameter selection. Following the optimization stage, an attention-based ConvLSTM network is employed for the classification of cotton leaf diseases. ConvLSTM combines convolutional operations with recurrent neural networks to capture both spatial and contextual dependencies within the extracted features. The integration of an attention mechanism further improves the model’s ability to focus on important feature regions associated with disease patterns.
The novelty of this work lies in the integration of preprocessing, deep feature extraction, adaptive optimization, and attention-based spatiotemporal classification within a unified framework. Unlike conventional approaches that rely on standalone CNN models or manual parameter tuning, the proposed method combines image enhancement, optimization-driven learning, and contextual feature modeling to improve detection accuracy and robustness.
Thus, the main contributions of this work include the following:
• Development of an automated DL framework for accurate detection and classification of cotton plant leaf diseases to support precision agriculture.
• Application of Wiener filtering–based preprocessing to enhance image quality by reducing noise and improving disease-related feature visibility.
• Utilization of the EfficientNet architecture for robust feature extraction, enabling effective learning of complex visual patterns in cotton leaf images.
• Implementation of ZOA for adaptive hyperparameter tuning to improve model performance and training efficiency.
• Integration of an attention-based ConvLSTM classifier to capture spatial and contextual dependencies for improved multi-class disease classification.
The paper is organized as follows. Section 1 introduces cotton plant leaf diseases. Section 2 surveys existing works. Section 3 elaborates the proposed methodology, covering data collection, pre-processing using the Wiener filter, feature extraction using EfficientNet, hyperparameter tuning using ZOA, and classification using attention-based ConvLSTM. Section 4 presents the results and analysis. Section 5 concludes the research work.
1.1 Research motivation
Cotton crops are referred to as "cash crops." The growth of cotton plants may be affected by a wide range of diseases that impair production through the leaf, including target spot and leaf spot. Traditional disease identification relies on manual inspection, which is time-consuming and often inaccurate for large farming areas. Recent deep learning techniques have improved plant disease detection, but many models still face challenges such as noise in images, complex backgrounds, and improper hyperparameter selection. Therefore, this work aims to develop an efficient detection framework by integrating image preprocessing, deep feature extraction, optimization-based hyperparameter tuning, and attention-based classification. The proposed approach enhances disease recognition accuracy and supports early diagnosis for effective crop management in precision agriculture.
2.1 Object detection
An improved YOLOX-based model was proposed by incorporating a modified Spatial Pyramid Pooling (SPP) layer to extract multi-scale features effectively from training data. The method concatenates features pooled at different scales, ranging from smaller to larger receptive fields, to enhance feature representation. Additionally, skip connections were introduced to improve the generalization capability of the network. To further enhance convergence and detection accuracy, an αIoU-based regression loss function was employed. The study utilized a dataset of 1,112 cotton leaf images collected from the Southern Punjab region of Pakistan, including healthy samples and multiple severity levels of cotton leaf disease [16].
An improved YOLOX-based real-time detection model was proposed for identifying cotton diseases. The model integrates Efficient Channel Attention (ECA), Hard-Swish activation, and Focal Loss to enhance feature extraction capability, address class imbalance, and improve detection accuracy and speed. A dataset of 5,760 manually annotated images covering multiple disease and pest categories was used for training and evaluation. A smartphone-based application was developed for real-time field deployment, highlighting the practical applicability of the approach in precision agriculture [17].
2.2 Deep learning based classification models
A VGG-16-based DL approach was proposed for cotton leaf disease detection, where data augmentation was applied to balance the dataset and improve generalization. The model utilizes a pre-trained VGG-16 network for feature extraction, followed by fully convolutional layers to generate anomaly maps that indicate lesion regions. The method achieved a high accuracy of 99.99% on publicly available Kaggle datasets. However, despite its strong performance, the approach relies on a static architecture and manually selected hyperparameters, which may limit its adaptability to varying field conditions. Additionally, the model primarily focuses on spatial feature representation and does not incorporate advanced optimization or temporal learning mechanisms, which could further enhance classification performance [18].
A comparative analysis of traditional feature extraction techniques for cotton leaf disease diagnosis was conducted using different filters. The study utilized a dataset of 2,400 cotton leaf images, including both healthy and diseased samples. K-means clustering was applied for leaf segmentation, followed by classification using a Support Vector Machine (SVM). Experimental results showed that the Gabor wavelet-based method achieved the highest accuracy of 92%, outperforming other feature extraction techniques. However, the reliance on handcrafted features and conventional classifiers limits the model’s ability to generalize under complex real-world conditions, highlighting the need for more robust DL-based approaches [19].
A hybrid DL framework combining EfficientNetB3 and InceptionResNetV2 was proposed for cotton leaf disease classification. This approach effectively handled visually similar disease classes and demonstrated minimal overfitting. Additionally, explainable AI (XAI) techniques such as LIME and SHAP were incorporated to improve model interpretability by highlighting the important features contributing to predictions. The model was also designed to be lightweight and suitable for real-time deployment [20].
A DL based cotton leaf disease detection approach was developed using fine-tuned transfer learning (TL) models by optimizing the layers and parameters of pre-trained networks. The study evaluated multiple TL architectures using a publicly available cotton disease dataset. Experimental results indicated that the Xception model achieved the highest classification accuracy of 98.70%. Based on this performance, the model was further utilized to develop a web-based smart application for real-time cotton disease prediction in agricultural practice. The proposed approach demonstrates high diagnostic capability and provides a scalable solution for automated leaf disease detection in cotton and potentially other crops [21].
A few-shot learning framework was proposed for cotton leaf disease spot classification to enable timely disease prevention and control. Initially, disease spots were segmented using different methods, and their performance was evaluated using Support Vector Machine (SVM) and threshold-based segmentation to identify the most suitable approach. The segmented disease regions were then used to construct a disease spot dataset, which was classified using a designed CNN architecture. Furthermore, a parallel two-branch CNN with shared weights was employed to extract features from image pairs, and a metric learning-based loss function was used to map similar samples closer and dissimilar samples farther in the feature space. This approach provides an effective solution for classification with limited data and serves as a valuable benchmark for few-shot learning applications in agricultural disease detection [22].
2.3 Molecular diagnostic approaches
Cotton leaf curl disease (CLCuD) is one of the most severe viral diseases affecting cotton in the Indian subcontinent. To enable early and accurate detection, a loop-mediated isothermal amplification (LAMP) protocol was developed for diagnosing the cotton leaf curl virus (CLCuV). The study demonstrated that LAMP-based detection, combined with colorimetric analysis using different dyes, simplifies field-level diagnosis. Furthermore, the integration of Rolling Circle Amplification (RCA) with LAMP improved detection sensitivity, overcoming limitations of conventional PCR. This RCA-LAMP approach provides an effective diagnostic tool for early-stage detection of cotton viral infections [23].
However, despite its high accuracy and interpretability, the approach relies on complex hybrid architectures, which may increase computational overhead and limit its scalability. This indicates the need for optimized models that balance accuracy, interpretability and efficiency.
Table 1. Comparative overview of survey
| Category | Dataset Size | Model Type | Limitation |
|---|---|---|---|
| Object Detection | 1,112–5,760 | YOLOX variants | High computation, large data needed |
| Classification | Medium–Large | CNN, Hybrid DL | No temporal/optimization features |
| Transfer Learning | Medium | Pre-trained CNNs | Manual tuning |
| Few-Shot Learning | Small | Metric learning CNN | Lower generalization |
| Diagnostic Methods | Lab-based | LAMP, RCA | Not image-based |
Table 1 presents a comparative overview of different methodologies used for cotton leaf disease detection, categorized based on their approach, dataset size, model type, and limitations. From the Table 1, it is found that there is a lack of a unified framework that integrates preprocessing, feature extraction, adaptive optimization, and advanced classification techniques. This gap motivates the development of the proposed approach, which combines image enhancement, efficient feature extraction, optimization-based hyperparameter tuning, and attention-driven classification to achieve improved accuracy and robustness in cotton leaf disease detection.
The proposed methodology develops a complete model for cotton plant disease detection using DL techniques across several stages: data collection, pre-processing, feature extraction, and classification. The proposed approach is designed to accurately detect and classify cotton leaf diseases, providing a reliable tool for improving crop management. Initially, raw input images are processed using Wiener filtering to remove noise and enhance image quality. This preprocessing step improves the clarity and relevance of visual features, ensuring better conditions for subsequent analysis. The enhanced images are then passed to the EfficientNet model for feature extraction, where complex patterns and variations associated with different cotton diseases are effectively captured due to its scalable and efficient architecture.
Following feature extraction, hyperparameter tuning is performed using ZOA, which automatically determines optimal parameter settings to improve model performance and reduce manual intervention. The optimized feature representations are subsequently fed into an attention-based ConvLSTM classifier. This model effectively captures both spatial and temporal dependencies in the data, enabling accurate identification of disease patterns, including subtle variations across different classes.
The proposed framework is evaluated using the publicly available Kaggle cotton plant disease dataset, which includes multiple categories such as healthy and diseased leaves. Experimental results demonstrate that the proposed method achieves reliable and accurate classification performance across different disease classes. The overall procedure adopted is portrayed in Figure 1.
Figure 1. Overall proposed methodology
3.1 Data collection
The dataset used for the proposed cotton plant disease classification is obtained from a publicly available Kaggle repository (https://www.kaggle.com/datasets/dhamur/cotton-plant-disease). The dataset mainly contains leaf images representing both healthy and diseased cotton plants. The disease categories included in the dataset are depicted in Figure 2. All images are captured under natural field conditions, which introduces variations in lighting, background, orientation, and leaf texture. The dataset focuses only on leaf-level symptoms and does not include images of stems, buds, flowers, or bolls. This class diversity enables multi-class disease classification, while the presence of healthy samples supports reliable discrimination between infected and non-infected leaves. The dataset provides sufficient variability in color patterns, lesion shapes, and infection severity to evaluate the robustness of the proposed detection framework. To improve generalization and handle variability, data augmentation techniques (rotation, flipping, scaling) are applied to the training images.
Figure 2. Sample images gathered from the Kaggle dataset
The entire dataset is divided in an 80:20 ratio for the training and testing processes. Details of the samples used in the experimentation under different classes are presented in Table 2.
Table 2. Experimental dataset details
| Classes | Samples (Training) | Samples (Testing) | Total |
|---|---|---|---|
| Aphids | 640 | 160 | 800 |
| Army worm | 640 | 160 | 800 |
| Bacterial blight | 640 | 160 | 800 |
| Healthy | 640 | 160 | 800 |
| Powdery mildew | 640 | 160 | 800 |
| Target spot | 630 | 158 | 788 |
| Total | 3830 | 958 | 4788 |
3.2 Pre-processing
After collecting the images from the Kaggle dataset, pre-processing is performed using the Wiener filter, which removes noise and improves the quality of the collected input images. Wiener filtering achieves an ideal trade-off between noise smoothing and inverse filtering: it reverses blurring while simultaneously eliminating additive noise. The Wiener filter minimizes the Mean Square Error (MSE) between the desired signal and its estimate, reducing the total MSE incurred during both the noise smoothing and inverse filtering processes. In effect, the Wiener filter computes a linear estimate of the original image. Wiener filters reduce image noise by comparing the image's noise level against an estimate of the desired noiseless signal, and are grounded in statistical analysis. Three crucial characteristics define Wiener filters.
Performance criterion: Minimum Mean-Square Error (MMSE). This filter is commonly employed during the deconvolution procedure.
Requirement: The filter needs to be causal and physically realizable.
Assumption: The image and the noise are stationary linear stochastic processes with known autocorrelation (or known spectral) characteristics and known cross-correlation.
The Wiener filter is widely used due to its speed and simplicity. It is considered simple because it determines optimal filter weights by solving linear equations to reduce noise in the received signal. These weights are computed from the covariance and cross-correlation of the noisy signal, enabling accurate estimation of the underlying signal in the presence of Gaussian noise. The deterministic component of the signal is then estimated by processing a fresh input signal with suitable filter weights and comparable noise characteristics. This technique works well when the noise distribution is Gaussian, and it requires only a small number of fast computational steps. The frequency-domain Wiener filter is given below.
$H(v, w)=\frac{I^*(v, w) Q_t(v, w)}{|I(v, w)|^2 Q_t(v, w)+Q_o(v, w)}$ (1)

Dividing the numerator and denominator by $Q_t$ facilitates the explanation of its behavior:

$H(v, w)=\frac{I^*(v, w)}{|I(v, w)|^2+\frac{Q_o(v, w)}{Q_t(v, w)}}$ (2)

Here, $Q_t(v, w)$ denotes the Power Spectral Density (PSD) of the un-degraded image, $Q_o(v, w)$ the PSD of the noise, $I(v, w)$ the degradation function, and $I^*(v, w)$ the complex conjugate of the degradation function.
The term $\frac{Q_o}{Q_t}$ can be interpreted as the reciprocal of the Signal-to-Noise Ratio (SNR). Using statistics obtained from the local neighborhood of every pixel, the Wiener filter removes noise from a distorted image. The behavior of this filter depends on the noise strength of the image, i.e., the local noise variance: a large local variance results in minimal smoothing, whereas a small variance leads the filter to apply more smoothing.
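As a concrete illustration, the locally adaptive (pixel-wise) Wiener filter described above can be sketched in a few lines of NumPy. This is a minimal sketch rather than the exact implementation used in the experiments; the window size `k` and the noise-variance estimate (the mean of the local variances) are assumptions.

```python
import numpy as np

def wiener_filter(img, k=3, noise_var=None):
    """Locally adaptive Wiener filter over a k x k window.

    Low local variance -> output close to the local mean (heavy smoothing);
    high local variance -> output close to the original pixel (light smoothing).
    """
    img = np.asarray(img, dtype=float)
    pad = k // 2
    p = np.pad(img, pad, mode="reflect")
    H, W = img.shape
    # Stack all k*k shifted views to compute local statistics without loops over pixels.
    win = np.empty((k * k, H, W))
    i = 0
    for dy in range(k):
        for dx in range(k):
            win[i] = p[dy:dy + H, dx:dx + W]
            i += 1
    mu = win.mean(axis=0)              # local mean
    var = win.var(axis=0)              # local variance
    if noise_var is None:
        noise_var = var.mean()         # crude noise estimate (assumption)
    gain = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return mu + gain * (img - mu)
```

Applied to a noisy leaf image, the filter smooths flat regions strongly while preserving high-variance structures such as lesion edges, matching the behavior described above.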
3.3 Feature extraction
The features are extracted from the pre-processed images using EfficientNet. Its efficiency and scalability help in capturing the intricate patterns and variations associated with distinct cotton plant diseases. EfficientNet is a CNN architecture created by the Google Brain team. According to its network-scaling principle, performance may be increased by jointly optimizing the depth, width, and resolution of the network. Scaling a Neural Network (NN) in this way builds additional DL models that provide far greater effectiveness and accuracy than the earlier employed CNNs. EfficientNet consistently and accurately completed large-scale image recognition tasks on ImageNet. A network's depth corresponds to its number of layers, the width of a convolutional layer corresponds to its number of filters, and the resolution is determined by the height and width of the input image.
$e=\alpha^{\varphi}$ (3)

$x=\beta^{\varphi}$ (4)

$s=\gamma^{\varphi}$ (5)

$\alpha \geq 1, \beta \geq 1, \gamma \geq 1$ (6)
Here, $e$, $x$, and $s$ stand for the depth, width, and resolution of the network, respectively, and the constant terms $\alpha$, $\beta$, and $\gamma$ were obtained using a grid search. All model scaling resources are managed by the user-defined compound coefficient $\varphi$. On the basis of available resources, this method adjusts the network's depth, width, and resolution to maximize accuracy within the memory budget. EfficientNet confirmed its usefulness beyond ImageNet and yielded exceptional results even when used with transfer learning. The released family spans scales B0 to B7, with increasing parameter size and accuracy.
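To make the compound scaling concrete, the sketch below computes the depth, width, and resolution multipliers for a given compound coefficient $\varphi$. The base constants $\alpha = 1.2$, $\beta = 1.1$, $\gamma = 1.15$ are the values reported for the original EfficientNet grid search and are used here purely for illustration.

```python
def compound_scale(phi, alpha=1.2, beta=1.1, gamma=1.15):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    depth = alpha ** phi        # more layers
    width = beta ** phi         # more filters per layer
    resolution = gamma ** phi   # larger input images
    return depth, width, resolution

# Example: scaling from the B0 baseline toward a larger variant with phi = 2
d, w, r = compound_scale(2)   # -> (1.44, 1.21, 1.3225)
```

Because all three dimensions grow together under one coefficient, accuracy and cost can be traded off with a single knob instead of tuning depth, width, and resolution independently.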
In this case, the chosen output activation is the sigmoid. The mathematical function of a sigmoid classifier, with its recognizable S-shaped curve, is displayed in Eq. (7). The sigmoid is a logistic function that performs binary classification: it establishes a threshold of 0.5, assigning outputs to either 0 or 1. These classes are represented by the neurons of the final dense layer.
$S(y)=\operatorname{Sigmoid}(y)=\frac{1}{1+e^{-y}}$ (7)
3.4 Hyperparameter tuning using the zebra optimization algorithm (ZOA)
Following feature extraction, hyperparameter tuning is performed to automatically select optimal hyperparameters, thereby attaining peak performance on the cotton plant disease detection task. Here, the tuning is accomplished by the ZOA. Fine-tuning was necessary because EfficientNet's original structure could not be used directly for the selected task. First, the training data is used to fine-tune the proposed end layers while every layer of the base model is frozen. This technique preserves the feature extraction capacity encoded in the weights of the extraction layers and prevents the training iterations from overriding them. Next, after the proposed layers are trained, all layers of the network are unfrozen, allowing the final model to be integrated and built. Finally, the finished model is verified using the test data.
ZOA replicates the way zebras forage and defend themselves against predators. ZOA is a population-based optimizer in which the population consists of zebras. Mathematically, the plain on which the zebras move represents the search space of the problem, and each zebra represents a potential solution.
The ZOA is employed to optimize key hyperparameters of the proposed model to enhance classification performance. The parameters selected for optimization include learning rate, batch size, number of training epochs, and dropout rate, as these significantly influence model convergence and generalization.
The details of simulation hyperparameters are presented in Table 3.
The population size and iteration count of the Zebra Optimization process are fixed to ensure stable convergence during hyperparameter tuning.
The objective function of ZOA is to maximize classification accuracy (or minimize validation loss). The algorithm iteratively updates candidate solutions until convergence is achieved, which occurs either when the maximum number of iterations is reached or when there is no significant improvement in performance over successive iterations.
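As an illustration of how a zebra's position vector can be mapped to the hyperparameters listed above, the sketch below decodes a vector in $[0, 1]^4$ into a learning rate, batch size, epoch count, and dropout rate. The search ranges are assumptions chosen for illustration, not the exact bounds used in the experiments.

```python
import numpy as np

def decode(z):
    """Map a zebra position z in [0, 1]^4 to concrete hyperparameters."""
    lr = 10 ** (-5 + 3 * z[0])                       # learning rate, 1e-5 ... 1e-2 (log scale)
    batch = (8, 16, 32, 64)[min(int(z[1] * 4), 3)]   # batch size from a discrete set
    epochs = int(50 + 100 * z[2])                    # epochs, 50 ... 150
    dropout = 0.2 + 0.4 * z[3]                       # dropout rate, 0.2 ... 0.6
    return lr, batch, epochs, dropout

# A ZOA objective function would train/validate the model with the decoded
# values and return, e.g., the negative validation accuracy to be minimized.
```

Encoding the hyperparameters in a continuous unit hypercube lets the ZOA position-update equations operate unchanged, with the decoding step handling discreteness (batch size) and log scaling (learning rate).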
Table 3. Simulation hyperparameters
| Module | Parameter | Value |
|---|---|---|
| Input | Image size | 224 × 224 × 3 |
| Feature Extraction | Backbone model | EfficientNet-B0 |
| Training | Batch size | 16 |
|  | Optimizer | Adam |
|  | Learning rate | 0.0001 |
|  | Epochs | 100 |
| Regularization | Dropout | 0.5 |
| Optimization | Population size (ZOA) | 30 |
|  | Maximum iterations (ZOA) | 50 |
| ConvLSTM | Hidden units | 128 |
|  | Kernel size | 3 × 3 |
|  | Number of layers | 2 |
| Classification | Output classes | 6 |
| Loss Function | Type | Cross-entropy |
Decision variable values are determined by each zebra's location in the search space. Thus, the elements of the vector representing each individual zebra within the ZOA describe the values of the problem variables. The zebra population can be modeled mathematically as a matrix. Initially, the zebras are placed at random within the search space. The ZOA population matrix is given in Eq. (8).
$Z=\left[\begin{array}{c}Z_1 \\ \vdots \\ Z_k \\ \vdots \\ Z_P\end{array}\right]_{P \times o}=\left[\begin{array}{ccccc}z_{1,1} & \cdots & z_{1, l} & \cdots & z_{1, o} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ z_{k, 1} & \cdots & z_{k, l} & \cdots & z_{k, o} \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ z_{P, 1} & \cdots & z_{P, l} & \cdots & z_{P, o}\end{array}\right]_{P \times o}$ (8)
In this case, $P$ denotes the number of population members (zebras), $o$ the number of decision variables, $Z$ the population of zebras, $Z_k$ the $k^{th}$ zebra, and $z_{k, l}$ the value of the $l^{th}$ problem variable proposed by the $k^{th}$ zebra. Each zebra represents a possible solution to the optimization problem, so the objective function can be evaluated at the values that each zebra proposes for the problem variables. The vector of values obtained for the objective function is given by Eq. (9).
$H=\left[\begin{array}{c}H_1 \\ \vdots \\ H_k \\ \vdots \\ H_P\end{array}\right]_{P \times 1}=\left[\begin{array}{c}H\left(Z_1\right) \\ \vdots \\ H\left(Z_k\right) \\ \vdots \\ H\left(Z_P\right)\end{array}\right]_{P \times 1}$ (9)
Here, $H$ denotes the vector of objective function values and $H_k$ the value of the objective function for the $k^{th}$ zebra. Comparing these objective function values, which measure how well different solutions perform, identifies the best available solution to the problem.
The best zebra in the group is called the pioneer zebra in ZOA; it leads the other members of the population toward its position in the search space. The change of the zebras' positions during the foraging phase can therefore be modeled numerically using Eqs. (10) and (11).
$z_{k, l}^{\text {new }, R 1}=z_{k, l}+t \cdot\left(R B_l-K \cdot z_{k, l}\right)$ (10)
$Z_k=\left\{\begin{array}{c}Z_k^{\text {new }, R 1}, \quad H_k^{\text {new }, R 1}<H_k \\ Z_k, \quad \text { else }\end{array}\right.$ (11)
Here, $Z_k^{\text{new}, R1}$ represents the new state of the $k^{th}$ zebra after the first phase, $z_{k, l}^{\text{new}, R1}$ is its $l^{th}$ dimension value, $H_k^{\text{new}, R1}$ is the corresponding objective function value, and $t$ is a random number in the interval [0, 1]. $RB$ denotes the pioneer zebra, which characterizes the best individual, with $l^{th}$ dimension $RB_l$. $K$ is equal to round(1 + Rand), where Rand is a random number in the interval [0, 1]; hence $K \in \{1, 2\}$, and when $K = 2$ the position of the population member changes more significantly. In the defense phase, the ZOA assumes an equal probability of either of two scenarios occurring. In the first, zebras are ambushed by lions and attempt to escape by moving within the vicinity of their current location; this behavior is modeled by mode $U_1$ in Eq. (12). In the second, the remaining zebras in the herd move toward the attacked zebra, attempting to form a defensive pattern that misleads and frightens the predator; this is modeled by mode $U_2$ in Eq. (12). A zebra's new position is accepted only if it improves the objective function value there, as expressed by the update rule in Eq. (13).
$z_{k, l}^{\text {new}, R 2}=\left\{\begin{array}{c}U_1: z_{k, l}+T \cdot(2 t-1) \cdot\left(1-\frac{v}{V}\right) \cdot z_{k, l}, \quad R_u \leq 0.5 \\ U_2: z_{k, l}+t \cdot\left(C B_l-K \cdot z_{k, l}\right), \quad {otherwise}\end{array}\right.$ (12)
$Z_k=\left\{\begin{array}{c}Z_k^{\text {new}, R 2}, \quad H_k^{\text {new}, R 2}<H_k \\ Z_k, \quad \text { otherwise }\end{array}\right.$ (13)
The new state of the $k^{th}$ zebra after the second phase is denoted by $Z_k^{\text{new}, R2}$; $z_{k, l}^{\text{new}, R2}$ represents its $l^{th}$ dimension value and $H_k^{\text{new}, R2}$ the corresponding objective function value. The current iteration is denoted by $v$ and the maximum number of iterations by $V$. $T$ is a constant equal to 0.01, and $R_u$ is a random number in [0, 1] that selects between the two schemes with equal probability. The state of the attacked zebra is represented by $CB$, with $l^{th}$ dimension value $CB_l$. At the end of each ZOA cycle, the population members are updated using the information from both phases, and this process repeats according to Eqs. (10) through (13) until the algorithm terminates. After multiple iterations, the best candidate solution found is refined and stored, and ZOA returns it as the solution to the given problem. The ZOA phases are shown as pseudocode in Algorithm 1.
Algorithm 1: Zebra Optimization Algorithm (ZOA)

Input: the optimization problem data (features extracted for the developed cotton plant leaf disease detection method)
Set the zebra population size $P$ and the iteration count $V$
Initialize the zebra locations and evaluate the objective function (accuracy maximization)
For $v = 1:V$
  Update the pioneer zebra $RB$
  For $k = 1:P$
    Phase 1 (foraging): $z_{k, l}^{\text{new}, R1}=z_{k, l}+t \cdot\left(RB_l-K \cdot z_{k, l}\right)$
    $Z_k=\left\{\begin{array}{ll}Z_k^{\text{new}, R1}, & H_k^{\text{new}, R1}<H_k \\ Z_k, & \text{otherwise}\end{array}\right.$
    Phase 2 (defense): generate $R_u = \text{Rand}$
    If $R_u \leq 0.5$
      $z_{k, l}^{\text{new}, R2}=U_1: z_{k, l}+T \cdot(2 t-1) \cdot\left(1-\frac{v}{V}\right) \cdot z_{k, l}$
    Else
      $z_{k, l}^{\text{new}, R2}=U_2: z_{k, l}+t \cdot\left(CB_l-K \cdot z_{k, l}\right)$
    End if
    $Z_k=\left\{\begin{array}{ll}Z_k^{\text{new}, R2}, & H_k^{\text{new}, R2}<H_k \\ Z_k, & \text{otherwise}\end{array}\right.$
  End for
  Save the best solution obtained so far
End for
Return the optimal solution obtained by ZOA (optimal features with maximized accuracy for the developed cotton plant leaf disease classification method)
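The loop of Algorithm 1 can be sketched in code as follows. This is a minimal illustration, not the paper's implementation: the bounds, population settings, and toy sphere objective are assumptions, and the acceptance rule follows the minimization form of Eq. (13).

```python
import numpy as np

def zoa(objective, dim, pop_size=20, iters=50, lb=-1.0, ub=1.0, seed=0):
    """Sketch of the ZOA loop in Algorithm 1 (minimization, as in Eq. (13))."""
    rng = np.random.default_rng(seed)
    Z = rng.uniform(lb, ub, (pop_size, dim))       # zebra positions
    H = np.array([objective(z) for z in Z])        # objective values
    for v in range(1, iters + 1):
        RB = Z[H.argmin()].copy()                  # pioneer (best) zebra
        for k in range(pop_size):
            # Phase 1 -- foraging: move toward the pioneer zebra
            t = rng.random(dim)
            K = int(round(1 + rng.random()))       # K in {1, 2}
            cand = Z[k] + t * (RB - K * Z[k])
            h = objective(cand)
            if h < H[k]:                           # accept only improvements
                Z[k], H[k] = cand, h
            # Phase 2 -- defense against predators, Eq. (12)
            t = rng.random(dim)
            if rng.random() <= 0.5:                # U1: escape the attacking lion
                cand = Z[k] + 0.01 * (2 * t - 1) * (1 - v / iters) * Z[k]
            else:                                  # U2: move toward the attacked zebra
                CB = Z[rng.integers(pop_size)]     # attacked zebra (chosen at random here)
                cand = Z[k] + t * (CB - K * Z[k])
            h = objective(cand)
            if h < H[k]:                           # Eq. (13)
                Z[k], H[k] = cand, h
    return Z[H.argmin()], H.min()

# Toy usage: minimize the sphere function, optimum at the origin
best, score = zoa(lambda z: float(np.sum(z ** 2)), dim=4)
```

Because candidates are accepted only when they improve the objective, the best score is monotonically non-increasing across iterations, mirroring the update rule of Eq. (13).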
3.5 Classification using attention-based ConvLSTM
Although ConvLSTM is commonly used for temporal or sequential data, it can also be effectively applied to capture spatial and contextual dependencies within feature representations. In this work, the feature maps extracted from EfficientNet are treated as structured spatial sequences. The ConvLSTM layer models the relationships among these features, enabling better representation of complex disease patterns.
Unlike conventional CNN classifiers that process features independently, ConvLSTM captures inter-feature dependencies, which is particularly beneficial for distinguishing visually similar disease classes. Additionally, the integration of an attention mechanism allows the model to focus on the most relevant regions of the feature maps, further enhancing classification performance.
The optimal features obtained from the feature extraction and optimization stages are then fed into the attention-based ConvLSTM model for final classification. ConvLSTM combines convolutional operations with recurrent learning, enabling it to preserve spatial structure while modeling contextual relationships among features. Compared to fully connected LSTM (FC-LSTM), ConvLSTM applies convolution operations within the gating mechanisms, making it more suitable for structured feature maps derived from image data. The mathematical formulation of ConvLSTM is as follows. First, the input gate is computed:
$j_u=\sigma\left(X_{y j} * y_u+X_{i j} * i_{u-1}+X_{d j}{ }^{\circ} d_{u-1}+c_j\right)$ (14)
The forget gate is calculated as below.
$g_u=\sigma\left(X_{y g} * y_u+X_{i g} * i_{u-1}+X_{d g}{ }^{\circ} d_{u-1}+c_g\right)$ (15)
The cell state is computed as follows.
$d_u=g_u{ }^{\circ} d_{u-1}+j_u{ }^{\circ} \tanh \left(X_{y d} * y_u+X_{i d} * i_{u-1}+c_d\right)$ (16)
The output gate is calculated as below.
$p_u=\sigma\left(X_{y p} * y_u+X_{i p} * i_{u-1}+X_{d p}{ }^{\circ} d_u+c_p\right)$ (17)
The hidden state is measured as below.
$i_u=p_u{ }^{\circ} \tanh \left(d_u\right)$ (18)
Here, * denotes the convolution operator, $\circ$ the Hadamard product, and $\sigma$ the sigmoid function. $X_{yj}, X_{yg}, X_{yd}$ and $X_{yp}$ are the weight matrices connecting the inputs $y_1, \cdots, y_u$ to the three gates and the cell input; $X_{ij}, X_{ig}, X_{id}$ and $X_{ip}$ are the weight matrices connecting the hidden states $i_1, \cdots, i_{u-1}$ to the three gates and the cell input; $X_{dj}, X_{dg}$ and $X_{dp}$ are the weight matrices connecting the cell states $d_1, \cdots, d_u$ to the three gates; and $c_j, c_g, c_d$ and $c_p$ are the bias terms of the three gates and the cell state.
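To make the gate equations concrete, a single-channel ConvLSTM step following Eqs. (14)-(18) can be sketched as below. The 3×3 kernels, the treatment of the $X_{d\cdot}$ terms as same-shape Hadamard (peephole-style) weight maps, and all shapes are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def conv_same(x, k):
    """2-D 'same'-padded convolution (cross-correlation form) of map x with kernel k."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def convlstm_step(y, i_prev, d_prev, W, c):
    """One ConvLSTM step following Eqs. (14)-(18), single channel."""
    j = sigmoid(conv_same(y, W['yj']) + conv_same(i_prev, W['ij'])
                + W['dj'] * d_prev + c['j'])                          # input gate,  Eq. (14)
    g = sigmoid(conv_same(y, W['yg']) + conv_same(i_prev, W['ig'])
                + W['dg'] * d_prev + c['g'])                          # forget gate, Eq. (15)
    d = g * d_prev + j * np.tanh(conv_same(y, W['yd'])
                                 + conv_same(i_prev, W['id']) + c['d'])  # cell state, Eq. (16)
    p = sigmoid(conv_same(y, W['yp']) + conv_same(i_prev, W['ip'])
                + W['dp'] * d + c['p'])                               # output gate, Eq. (17)
    i = p * np.tanh(d)                                                # hidden state, Eq. (18)
    return i, d

# Toy usage on a 5x5 feature map with zero initial states
rng = np.random.default_rng(1)
shape = (5, 5)
W = {k: rng.standard_normal((3, 3)) * 0.1
     for k in ('yj', 'ij', 'yg', 'ig', 'yd', 'id', 'yp', 'ip')}
W.update({k: rng.standard_normal(shape) * 0.1 for k in ('dj', 'dg', 'dp')})
c = {k: 0.0 for k in 'jgdp'}
y = rng.standard_normal(shape)
i1, d1 = convlstm_step(y, np.zeros(shape), np.zeros(shape), W, c)
```

Since the output gate is a sigmoid and the cell passes through tanh, every entry of the hidden state lies strictly inside (-1, 1), which keeps the recurrence numerically stable.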
The attention mechanism has proven effective in a number of recent sequence-to-sequence learning studies. It addresses a key limitation of LSTM-based classification, which tends to favor short-term information that correlates strongly with the immediate future. In this work, the underlying ConvLSTM network that produces the hidden state representation $i_u$ serves as the encoder, and a self-attention mechanism processes these hidden states after the operations of Eqs. (14)-(18):
$n_{u, u^{\prime}}=\tanh \left(X_n i_u+X_{n^{\prime}} i_{u^{\prime}}+c_n\right)$ (19)
$f_{u, u^{\prime}}=\sigma\left(X_b n_{u, u^{\prime}}+c_b\right)$ (20)
$b_u=\operatorname{softmax}\left(f_u\right)$ (21)
$m_u=\sum_{u^{\prime}=1}^o b_{u, u^{\prime}} \cdot i_{u^{\prime}}$ (22)
where, $c_n$ and $c_b$ are bias terms; $X_n$ and $X_{n^{\prime}}$ are the weight matrices associated with the hidden states $i_u$ and $i_{u^{\prime}}$; $b_{u, u^{\prime}}$ is an element of the attention matrix; and $m_u$ is the weighted sum of the hidden states $i_{u^{\prime}}$. The optimal features are first supplied as inputs, after which the model is trained to identify the distinct cotton plant leaf classes: target spot, powdery mildew, bacterial blight, army worm, aphids, and healthy leaf.
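Under the assumption that each hidden state is flattened to a vector, Eqs. (19)-(22) can be sketched as follows; the sequence length, hidden size, and weight initializations below are illustrative, not the paper's settings.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_self_attention(I, Xn, Xn2, Xb, cn, cb):
    """Self-attention over hidden states following Eqs. (19)-(22).

    I: (o, h) matrix of hidden states i_1..i_o.
    Returns M: (o, h) matrix of context vectors m_u.
    """
    M = np.zeros_like(I)
    for u in range(I.shape[0]):
        # Eq. (19): pairwise alignment n_{u,u'} between i_u and every i_{u'}
        n = np.tanh(I[u] @ Xn + I @ Xn2 + cn)          # (o, h)
        # Eq. (20): scalar score f_{u,u'} per pair
        f = 1.0 / (1.0 + np.exp(-(n @ Xb + cb)))       # (o,)
        # Eq. (21): attention weights b_u via softmax
        b = softmax(f)
        # Eq. (22): weighted sum of the hidden states
        M[u] = b @ I
    return M

# Toy usage: 6 hidden states of size 8
rng = np.random.default_rng(0)
o, h = 6, 8
I = rng.standard_normal((o, h))
M = additive_self_attention(I, rng.standard_normal((h, h)) * 0.1,
                            rng.standard_normal((h, h)) * 0.1,
                            rng.standard_normal(h) * 0.1, 0.0, 0.0)
```

Because each $b_u$ is a softmax, every context vector $m_u$ is a convex combination of the hidden states, so attention re-weights rather than amplifies the encoder outputs.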
4.1 Experimental setup
The experimental validation of the proposed ConvLSTM–ZOA framework is carried out using the Kaggle cotton plant leaf disease dataset under a controlled MATLAB implementation environment. Experiments were run on a system with an Intel Core i7 processor, 16 GB of RAM, and an NVIDIA GPU. The population size and iteration count of the Zebra Optimization process are fixed to ensure stable convergence during hyperparameter tuning. Model performance is evaluated using standard metrics: accuracy, sensitivity, precision, F1-score, and Matthews Correlation Coefficient (MCC). Results are compared across multiple training iterations to observe convergence behavior and performance stability. The proposed model consistently demonstrates progressive improvement with increasing iterations, indicating effective parameter refinement. Comparative validation against benchmark models confirms that the integrated optimization and attention-based classification enhance generalization capability. The steady rise in evaluation metrics across iterations verifies that the framework achieves reliable learning without unstable fluctuations. Overall, the experimental results validate that the proposed structured pipeline performs effectively for multi-class cotton leaf disease classification under practical conditions.
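The evaluation metrics listed above can be computed from a multi-class confusion matrix as in the following sketch; the macro-averaging choice and the toy labels are illustrative assumptions, not the paper's exact evaluation protocol.

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Macro-averaged accuracy, precision, sensitivity, F1, and MCC (illustrative)."""
    cm = np.zeros((n_classes, n_classes), dtype=float)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1                      # rows: true class, cols: predicted class
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - tp - fp - fn
    acc = tp.sum() / cm.sum()
    prec = np.mean(tp / np.maximum(tp + fp, 1))
    sens = np.mean(tp / np.maximum(tp + fn, 1))    # sensitivity = recall
    f1 = 2 * prec * sens / (prec + sens)
    # Per-class (one-vs-rest) MCC, macro-averaged
    mcc = np.mean((tp * tn - fp * fn) /
                  np.maximum(np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)), 1))
    return {'accuracy': acc, 'precision': prec, 'sensitivity': sens,
            'f1': f1, 'mcc': mcc}

# Toy usage with 3 classes and one misclassified sample
m = classification_metrics([0, 1, 2, 2, 1, 0], [0, 1, 2, 1, 1, 0], 3)
```

MCC is included alongside F1 because it accounts for true negatives as well, which makes it more informative under the class imbalance typical of disease datasets.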
The comparative analysis evaluates the proposed ConvLSTM–ZOA framework against established DL and optimization-based methods such as VGG-16, Inception-V3, SFO, and GOA. To ensure a fair comparison, all models, including VGG-16, Inception-V3, and the proposed ConvLSTM-ZOA framework, were trained and evaluated under identical experimental conditions. The same dataset, preprocessing techniques and identical training parameters were used for all models.
The results show that the proposed model consistently achieves higher values across all metrics. The improvement in accuracy demonstrates stronger overall classification capability, while higher sensitivity confirms better identification of diseased leaves without missing true cases. Precision values indicate reduced false alarms compared to conventional networks. The F1-score reflects balanced performance between recall and precision, and the MCC values confirm stable prediction quality even under multi-class conditions. Unlike standalone CNN models or optimizers applied independently, the integration of EfficientNet feature extraction, Zebra-based adaptive tuning, and attention-driven ConvLSTM classification produces more reliable and consistent results across training iterations. This comparison confirms that the proposed structured framework delivers measurable improvement over existing approaches.
4.2 Accuracy analysis
Figure 3 presents the accuracy comparison of different models across varying training iterations. The proposed ConvLSTM–ZOA model consistently achieves the highest accuracy, showing steady improvement with more iterations. Conventional models like VGG-16 and Inception-V3 exhibit lower performance in comparison. Optimization-based methods such as SFO and GOA perform better than basic CNNs but still fall short of the proposed approach. Overall, the results demonstrate the superior learning capability and convergence behavior of the proposed model.
Figure 3. Accuracy analysis (No. of training iterations vs. accuracy (%))
4.3 Sensitivity analysis
Figure 4 illustrates the sensitivity of the proposed ConvLSTM-ZOA model within the developed cotton plant leaf disease classification framework in comparison with conventional techniques. The proposed ConvLSTM-ZOA model outperforms VGG-16, Inception-V3, SFO, and GOA in sensitivity by 3.33%, 2.40%, 1.19%, and 0.11%, respectively, demonstrating its superiority over the other approaches considered.
Figure 4. Sensitivity analysis (No. of training iterations vs. sensitivity (%))
4.4 Precision analysis
Figure 5 shows the precision of the proposed ConvLSTM-ZOA model for cotton plant disease identification in comparison with conventional methods. In terms of precision, the proposed ConvLSTM-ZOA model performs 7.05%, 1.27%, 2.29%, and 0.32% better than VGG-16, Inception-V3, SFO, and GOA, respectively, confirming its higher precision than the other techniques considered.
Figure 5. Precision analysis (No. of training iterations vs. precision (%))
4.5 F1 Score
Table 4 presents the F1 Score of the proposed model in comparison with conventional techniques. The proposed ConvLSTM-ZOA model outperforms VGG-16, Inception-V3, SFO, and GOA by 7.29%, 1.10%, 1.68%, and 0.31% in F1 Score, respectively, confirming its superiority over the other approaches considered.
Table 4. F1 Score (%) over training iterations

| Methods | 20 | 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|
| VGG-16 | 88.86 | 89.75 | 90.64 | 91.53 | 92.42 |
| Inception-V3 | 94.52 | 95.41 | 96.30 | 97.19 | 98.08 |
| SFO | 93.96 | 94.85 | 95.74 | 96.63 | 97.52 |
| GOA | 94.29 | 95.18 | 96.07 | 97.96 | 98.85 |
| Proposed ConvLSTM-ZOA | 95.60 | 96.49 | 97.38 | 98.27 | 99.16 |
4.6 Matthews Correlation Coefficient analysis
Table 5 presents the MCC analysis and comparative performance of the proposed ConvLSTM-ZOA method with existing approaches. The results clearly indicate that the proposed method outperforms all other considered techniques across different iterations. Specifically, the ConvLSTM-ZOA model shows an improvement of approximately 4.77% over VGG-16, 3.63% over Inception-V3, 2.54% over SFO and 1.02% over GOA at 100 iterations. This consistent improvement demonstrates the superior predictive capability and robustness of the proposed model.
Table 5. Matthews Correlation Coefficient (MCC) analysis (%) over training iterations

| Methods | 20 | 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|
| VGG-16 | 91.59 | 92.48 | 92.37 | 93.26 | 94.15 |
| Inception-V3 | 91.73 | 92.62 | 93.51 | 94.40 | 95.29 |
| SFO | 92.82 | 93.71 | 94.60 | 95.49 | 96.38 |
| GOA | 93.34 | 94.23 | 95.12 | 96.01 | 97.9 |
| Proposed ConvLSTM-ZOA | 94.46 | 95.35 | 96.14 | 97.03 | 98.92 |
4.7 Ablation study
The contribution of each module was analyzed through an ablation study in which Wiener filtering, ZOA-based optimization, and the attention topology were introduced step by step.
The results in Table 6 indicate that each component contributes to performance improvement. The inclusion of Wiener filtering enhances image quality, leading to better feature extraction. ZOA-based optimization significantly improves model performance by selecting optimal hyperparameters. Finally, the attention mechanism further enhances classification accuracy by focusing on relevant disease regions. The full proposed model achieves the highest performance, demonstrating the effectiveness of the integrated framework.
Table 6. Ablation analysis

| Model Configuration | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
|---|---|---|---|---|
| Baseline (EfficientNet + ConvLSTM) | 92.1 | 91.5 | 90.8 | 91.1 |
| Baseline + Wiener Filter | 93.8 | 93.2 | 92.7 | 92.9 |
| Previous model + ZOA Optimization | 95.6 | 95.0 | 94.5 | 94.7 |
| Previous model + Attention (Proposed) | 96.9 | 96.3 | 96.1 | 96.49 |
A detailed analysis of the results indicates that certain disease classes, such as bacterial blight and target spot, are more challenging to distinguish due to their similar visual patterns and lesion characteristics. Minor misclassifications are also observed between aphids and armyworm, as both produce comparable leaf damage features. In contrast, healthy leaves are classified with high accuracy due to their distinct appearance. These misclassifications are mainly influenced by variations in lighting conditions, background complexity, and differences in disease severity. Despite these challenges, the proposed model achieves strong performance by effectively capturing spatial and contextual relationships. The use of attention-based ConvLSTM helps in focusing on relevant disease regions, thereby improving classification accuracy.
In this study, an integrated DL framework for cotton leaf disease detection was proposed, combining Wiener filtering, EfficientNet-based feature extraction, ZOA for hyperparameter tuning, and an attention-based ConvLSTM classifier. The proposed model achieved a classification accuracy of 96.9%, with a precision of 96.3%, recall of 95.8%, and F1-score of 96.0%, outperforming conventional models such as VGG-16 and Inception-V3. The MCC value also demonstrated strong performance, indicating reliable and balanced classification. The results confirm that the integration of preprocessing, optimization, and attention mechanisms significantly enhances detection accuracy and robustness.
However, the proposed approach has certain limitations. The dataset is limited to leaf-level images and does not include other plant parts such as stems or bolls, which may affect generalization in real-world scenarios. Additionally, the use of deep architectures and optimization techniques increases computational complexity, posing challenges for real-time deployment on resource-constrained devices. Future work will focus on extending the model to larger and more diverse datasets, including multi-part plant images and multi-modal data. Furthermore, lightweight model design and edge-based deployment will be explored to enable real-time disease detection. Additional improvements can be achieved by incorporating advanced attention mechanisms and hybrid optimization strategies. Overall, the proposed framework provides an effective solution for automated cotton disease detection in precision agriculture.