Comparative Analysis of Classical Machine Learning and Deep Learning Methods for Fruit Image Recognition and Classification

ABSTRACT


INTRODUCTION
Fruits play a multifaceted role in human life, with dietary, cultural, and ecological importance [1,2]. One important task is the visual identification and classification of fruit varieties using computer vision and machine learning (ML) techniques, commonly referred to as fruit recognition or classification [3]. Sorting fruits by hand and eye according to their visual characteristics incurs substantial time, cost, and labor losses [3,4]. Small visual differences between fruits make the task difficult [5,6], and current fruit classification methods, particularly in the agriculture sector, are limited by lengthy training and testing times or by inaccuracies that hinder real-time use [7,8]. To address these gaps, this work extracts and preprocesses image features from the Fruit-360 dataset using Principal Component Analysis (PCA), color, and texture features. Classical machine learning methods, namely support vector machine (SVM), k-nearest neighbors (KNN), and decision tree (DT), and the deep learning (DL) classification network AlexNet are then used to classify the fruit varieties [9,10]. The classification results are analyzed to identify the most effective ML and DL models for the Fruit-360 dataset. The remainder of this paper is organized as follows: Section 2, "Related Work," reviews prior research. Section 3 presents the underlying theory and the proposed system. Section 4 presents and discusses the results. Section 5 concludes the paper.

LITERATURE REVIEW
Fruit identification is difficult because of factors such as lighting conditions, occlusion, and varied surface characteristics, and numerous studies have therefore treated fruit recognition as an image segmentation problem. One study [11] applies ML and pattern recognition methods to build automatic date fruit classifiers, emphasizing the need for sorting and quality control in the food industry. Another study [12] proposes a CNN-based fruit recognition system that applies DL techniques to the Fruits-360 dataset. Although that system achieved an accuracy of 99.79%, a more comprehensive investigation of ML approaches is still needed, particularly with respect to accuracy and computation time; one such study focuses on the recognition of thirteen apple varieties [13]. Dimensionality reduction using PCA, which creates latent variables known as principal components (PCs), has also been explored. This technique [14] transforms pixel images into a lower-dimensional representation, which reduces dimensionality and can improve the independence of the variables. Additionally, a study [15] categorizes the maturity of oil palm fresh fruit bunches (FFB) using texture and color characteristics with PCA-based feature selection, applying an ANN for classification. A comprehensive evaluation of Indian fruit types [16] uses SVM classifiers together with deep features extracted from a CNN model. The same line of work introduces an alternative method based on transfer learning and compares six deep learning architectures. Despite the notable performance of SVM classifiers combined with deep features, the present work aims to contribute by critically analyzing these methodologies and exploring the potential for improved fruit classification models. Enhancements to the AlexNet convolutional neural network [17] for categorizing apple varieties improve its recognition ability. The findings show that the modified AlexNet model is viable for fruit categorization, with increased accuracy and reduced training time. Additionally, a study [18] proposes a method for determining the quality of peanut pods that uses ResNet18 with the CSPNet module for enhanced accuracy and real-time performance. A reliable system [19] that uses RGB-depth cameras to identify fruit-bearing branches in litchi clusters simultaneously and accurately over large areas is a further advance. Furthermore, a study [20] explores the fusion of sensor data from hyperspectral imaging (HSI) and acoustic signals for identifying codling moth (CM) infestations in apples. The findings suggest that the fusion technique can significantly enhance CM detection, indicating potential for detecting infestations in pome fruits. Relative to these studies, our work seeks to build on existing methodologies by critically analyzing their limitations and aiming for improvements in accuracy, computation time, and real-time performance. Table 1 summarizes and compares the related research discussed in this section.

Table 1. Summary of related work

Year | Ref. | Dataset | Features | Classifier | Objective and result
- | [11] | Originally 898 samples of seven types of date fruit | Shape, color, and morphology | SVM, NB, LMT, KNN, and MLP | To assess the effectiveness of five ML algorithms in categorizing date fruits by their external attributes. MLP performed best on the original dataset, whereas SVM was the best classifier on the undersampled dataset.
2019 | [12] | Fruits-360 | Convolutional layers | CNN | To attain high classification accuracy by exploring different combinations of hidden layers and epochs across various scenarios and comparing the results. The study reports a test accuracy of 100% and a training accuracy of 99.79%.
2020 | [14] | Fruit dataset collected through an online search engine | Color, GLCM texture, shape, and PCA | SVM | The suggested system is applicable across sectors including the food, pharmaceutical, and cosmetic industries, as well as fruit quality evaluation. The experiments showed a classification accuracy of 87.06%.
2021 | [15] | Images of oil palm fresh fruit bunches from the Paser District of East Kalimantan, Indonesia | Color, texture, and PCA | Naive Bayes, SVM, and ANN | To create an automated machine vision approach for categorizing the ripeness stage of oil palm fresh fruit bunches. Naive Bayes achieved 96.7% accuracy, while SVM and ANN both achieved 98.3%.
2020 | [16] | Indian Fruits-40 | Fully connected layer (CNN) | SVM | To combine precision and speed in identifying fruits, with potential benefits for uses such as fruit monitoring and classification within the production process. The SVM reached 100% accuracy.
2023 | [17] | Fruits-360 | Fully connected layer with a global average pooling layer | AlexNet | To tackle the expensive and inefficient manual sorting of apples. A computer vision-based approach reached an accuracy of 98.88%.
- | [18] | - | - | Improved ResNet18 incorporating the CBAM, KRSNet, and CSPNet modules | To propose a cost-effective method for assessing peanut pod quality using RGB images and deep learning. The improved ResNet18 with the CBAM, KRSNet, and CSPNet modules outperformed the original ResNet18 in accuracy and other evaluation metrics.
2020 | [19] | Manually labelled dataset | RGB-D | Deeplabv3 | To create a dependable algorithm that identifies and locates fruit-bearing branches within litchi clusters in their natural environment. The Deeplabv3 algorithm achieved an accuracy of 79.46%.
2023 | [20] | Organic Gala, Fuji, and Granny Smith apple samples bought from a commercial market in Princeton, KY, USA | PCA-HSI | AdaBoost | To address the challenges of detecting codling moth (CM) infestation in apples; the study suggests a method with the potential to enhance detection accuracy and efficiency.

Proposed system
The proposed system has three primary components: preprocessing, feature extraction, and classification. Figure 1 shows the flow diagram of the system. To obtain the features needed for classification, the fruit region must first be isolated from the image background using segmentation techniques. In the preprocessing stage, the image is first resized from 100×100 to 150×150 pixels. To remove unwanted noise, the image is then converted to grayscale, thresholded, and finally binarized (black and white). The resulting fruit mask is used to extract the region of interest of the segmented fruit, as shown in Figure 2.
Feature extraction is a technique for discovering more precise information about the significant elements in an image [21].
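The preprocessing steps described above (resize, grayscale conversion, thresholding, and masking) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the nearest-neighbour resize, the BT.601 grayscale weights, the threshold value of 240, and the assumption of a near-white background are all choices made here for the example, and the function names are hypothetical.

```python
def resize_nearest(img, new_h, new_w):
    """Resize an image (list of rows of pixels) with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def to_grayscale(rgb):
    """Convert rows of (R, G, B) tuples to gray levels in 0..255."""
    # ITU-R BT.601 luma weights: an assumed, commonly used choice.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def binarize(gray, threshold=240):
    """Binary fruit mask: 1 where the pixel is darker than the threshold.

    Assumes a near-white background; the threshold value is illustrative.
    """
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def apply_mask(rgb, mask):
    """Keep fruit pixels and set background pixels to black."""
    return [[px if m else (0, 0, 0) for px, m in zip(row, mrow)]
            for row, mrow in zip(rgb, mask)]


# Tiny 2x2 example: two white background pixels and two reddish fruit pixels.
image = [
    [(255, 255, 255), (200, 30, 30)],
    [(190, 40, 35), (255, 255, 255)],
]
resized = resize_nearest(image, 3, 3)
mask = binarize(to_grayscale(image))
roi = apply_mask(image, mask)
```

In practice an image library would handle resizing and thresholding; the pure-Python version only makes each step explicit.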
Analyses with a large number of variables frequently yield a classifier that overfits the training data and performs badly on new samples. Feature extraction combines the variables to get around these problems while still characterizing the data accurately [22]. Pattern recognition features frequently include context, PCA, shape, color, texture, or grayscale data. In image processing and machine vision, the original pattern measurements, or a particular sequence of them, are transformed into a new pattern feature [23]. Pattern classification is the process of assigning an object to a category or class using higher-level data or characteristics extracted from the object. It builds the categorization strategy while also automatically identifying objects in the image [24]. Finding patterns, especially visual and aural patterns, is a major focus of pattern recognition in computer science, which draws on methods from fields such as ML and statistics [24].
Principal component analysis (PCA). Principal component analysis is a dimensionality reduction technique that can be used to solve compression and recognition problems. PCA is also known as the Hotelling transform or eigenspace projection [25]. PCA converts the original data space or image into a subspace spanned by principal components (PCs), in which the first orthogonal dimension captures most of the variation between the images. The final dimension of this subspace exhibits the smallest amount of variation between the images according to the statistical characteristics of the targets [26]. Representing the original vector with the orthogonal (uncorrelated) components produced by this transformation minimizes the mean square reconstruction error. The components produced by transform techniques such as PCA are not directly tied to any single feature of the original sample. By extracting features, PCA can identify the elements of the sample data that show the greatest variation, which can then be used to select a few features of interest from the full feature set.
The PCA method finds the projection matrix $W_{opt}$ that maximizes the determinant of the total scatter matrix of the projected samples [27]:

$$W_{opt} = \arg\max_{W} \left| W^{T} S_{T} W \right|$$

where the total scatter matrix $S_T$ is

$$S_{T} = \sum_{i=1}^{c} (x_i - \mu)(x_i - \mu)^{T}$$

Here $\mu$ denotes the mean feature vector over all samples in the training set, $x_i$ is the feature vector of the i-th sample, and $c$ is the number of training samples.
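As an illustration of the scatter-matrix formulation above, the following sketch builds $S_T$ for a small set of feature vectors and estimates its leading eigenvector (the first principal component) by power iteration. It is a simplified example under stated assumptions, not the procedure used in this study: power iteration recovers only the first component, the sample data are invented, and the function names are hypothetical.

```python
import math


def mean_vector(samples):
    """Mean feature vector mu over all samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def scatter_matrix(samples):
    """Total scatter matrix S_T = sum_i (x_i - mu)(x_i - mu)^T."""
    mu = mean_vector(samples)
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for x in samples:
        diff = [xi - mi for xi, mi in zip(x, mu)]
        for a in range(d):
            for b in range(d):
                s[a][b] += diff[a] * diff[b]
    return s


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: approximates the eigenvector of the largest eigenvalue.

    Assumes the start vector is not orthogonal to that eigenvector.
    """
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def project(samples, direction):
    """Project mean-centred samples onto one principal direction."""
    mu = mean_vector(samples)
    return [sum((xi - mi) * di for xi, mi, di in zip(x, mu, direction))
            for x in samples]


# Invented 2-D feature vectors that vary mostly along the diagonal.
samples = [[2.0, 1.9], [4.0, 4.1], [6.0, 5.9], [8.0, 8.1]]
w1 = leading_eigenvector(scatter_matrix(samples))
scores = project(samples, w1)
```

For real feature vectors a numerical library's eigendecomposition or SVD would be the usual choice; it returns all principal components at once.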
Color feature. Fruit color has been widely used as a defining characteristic for classification. In this study, RGB color features are extracted from the fruit region to obtain color information.
RGB color model. The RGB model, based on the three primary colors, is made up of three color channels: red, green, and blue. Combining the brightness of the three channels produces a wide range of colors. Each channel has a limited range, typically 0 to 255. 24-bit color images are typically used [28], with eight bits representing each color component. The total number of colors that the three components can express is therefore $2^{8+8+8} = 2^{24} = 16{,}777{,}216$. This representation covers essentially all the colors that the human eye can distinguish, and RGB is the most widely used color model.
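One simple way to turn the RGB values of the fruit region into a feature vector is to compute per-channel statistics over the masked pixels. The sketch below uses the mean and standard deviation of each channel; this particular choice of statistics is an assumption made for illustration, since the text does not specify which RGB features were computed, and the function name is hypothetical.

```python
import math

# Number of colors representable with 8 bits per channel.
TOTAL_COLORS = 2 ** (8 + 8 + 8)


def rgb_color_features(rgb, mask):
    """Return [mean_R, std_R, mean_G, std_G, mean_B, std_B] over masked pixels."""
    pixels = [px for row, mrow in zip(rgb, mask)
              for px, m in zip(row, mrow) if m]
    if not pixels:
        raise ValueError("mask selects no pixels")
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [px[channel] for px in pixels]
        mean = sum(values) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
        features.extend([mean, std])
    return features


# Two fruit pixels (mask = 1) and two white background pixels (mask = 0).
image = [
    [(255, 255, 255), (200, 30, 30)],
    [(180, 50, 30), (255, 255, 255)],
]
mask = [[0, 1], [1, 0]]
features = rgb_color_features(image, mask)
```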
The RGB color space is a simple color representation. However, it does not match human visual experience and is not intuitive [29][30][31]. It is difficult to infer the resulting color from the values of the three primary components. Additionally, all three color components must change in order to alter the color of the image. Consequently, the representation does not directly reflect color as the human eye perceives it [30].
In statistical texture analysis, texture features are computed from the statistical distribution of the observed combinations of intensities at specified positions relative to one another in the image. Depending on the number of intensity points (pixels) in each combination, the statistics are categorized as first-order, second-order, or higher-order. In this work, local binary pattern (LBP) texture features are investigated.
Local binary pattern (LBP). Most texture models have an excessively high computational complexity. The local binary pattern (LBP) operator was therefore selected as a straightforward texture model [32]. The operator labels the pixels of an image block by thresholding the neighborhood of each pixel with the value of the center pixel and interpreting the result as a binary number (the LBP code):

$$LBP_{P}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p}$$

where $g_c$ is the gray value of the center pixel $(x_c, y_c)$ and $g_p$ are the gray values of the $P$ neighborhood pixels. The function $s(x)$ is defined as follows:

$$s(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases}$$

A similarity measure $h$ is defined to test for similarity between the texture of an image block and the background texture. It compares the LBP texture of pixel $(x_c, y_c)$ in the background with the LBP texture $G_t(x_c, y_c)$ of the same pixel in the video frame at time $t$; $h$ is close to one if the two textures are very similar [33]. Figure 4 shows an example.
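The LBP code defined above can be computed directly from its definition. The following sketch handles the common 8-neighbour case (P = 8, radius 1) on a grayscale image stored as nested lists and builds a 256-bin histogram that can serve as a texture feature vector. The neighbour ordering and the use of a histogram as the final feature are assumptions for illustration; the function names are hypothetical.

```python
def lbp_code(gray, y, x):
    """8-neighbour LBP code for the pixel at row y, column x."""
    center = gray[y][x]
    # Neighbour offsets, clockwise from top-left. The starting point and
    # direction are a convention (assumed here); any fixed order works.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for p, (dy, dx) in enumerate(offsets):
        # s(g_p - g_c) = 1 when the neighbour is at least as bright as the center.
        if gray[y + dy][x + dx] - center >= 0:
            code += 2 ** p
    return code


def lbp_histogram(gray):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, y, x)] += 1
    return hist


# 3x3 patch with center value 6.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
```

Border pixels are skipped here because they lack a full neighbourhood; padding the image is a common alternative.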

ML classification method
The choice of classifier is a crucial step, since the same set of features may yield different results with different classification approaches. In this study, a variety of fruits were classified using ML and DL algorithms, and the effectiveness of the ML approaches (KNN, DT, and SVM) was compared with that of the DL approach (AlexNet). Although many other methods have been explored in the literature, these were selected because they are popular and widely accepted as successful.

SVM
SVMs are also called support vector networks [34]. The trained model is a non-probabilistic binary linear classifier that assigns each new sample to one of two categories. In an SVM, the data can be seen as points in space that are mapped to another space and then separated by a hyperplane. New test data are mapped into the same space and classified according to the side of the hyperplane on which they fall. By using a kernel, an SVM can also perform non-linear classification: the kernel implicitly maps the data into a high-dimensional feature space in which separation becomes easier [35,36].

KNN
The KNN algorithm is a widely recognized statistical method for pattern recognition [36]. It is both simple and efficient. The classifier has no explicit training phase: it simply stores the training set, so training involves no computational complexity. Nevertheless, the size of the training set significantly influences the computational complexity of KNN at prediction time. The approach is particularly effective for separating samples whose class domains intersect or overlap heavily, and can outperform other methods in that setting [37].
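The behaviour described above, with no training step and a prediction cost that grows with the training set, is visible in a minimal KNN sketch. The Euclidean distance, the value k = 3, and the toy feature vectors are assumptions chosen for the example, not settings reported in this study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours."""
    # Every prediction scans the whole training set, which is why the
    # training-set size dominates the cost of KNN.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Invented 2-D feature vectors for two classes.
train_x = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
           [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
train_y = ["apple", "apple", "apple", "banana", "banana", "banana"]

label_a = knn_predict(train_x, train_y, [1.1, 1.0])
label_b = knn_predict(train_x, train_y, [5.0, 5.1])
```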

DT
A DT categorizes instances by sorting them according to the values of their features. Each node of a decision tree represents a feature of the instance to be categorized, and each branch represents a value that the node can take. Instances are classified starting at the root node and sorted according to their feature values. In data mining and ML, DT learning uses a decision tree as a predictive model that maps observations about an object to conclusions about its target value. Such tree models are called classification trees or regression trees [38]. Decision tree classifiers frequently use post-pruning, in which the performance of the tree is evaluated on a validation set as it is pruned. Any node may be removed and assigned the most common class of the training instances sorted to it [38].
In essence, SVM, KNN, and DT were selected for their distinctive advantages: SVM performs precise binary classification and supports non-linear separation; KNN is simple and effective, especially where class domains overlap; and DT offers a transparent hierarchical framework for decisions based on image features. Evaluating all three makes it possible to exploit the individual strengths of each method and thereby improve overall accuracy and performance in fruit image recognition and classification.

DL classification method
In contrast to traditional machine learning algorithms such as SVM and KNN, deep learning requires less manual feature engineering. For instance, CNNs, a class of deep learning models, excel at feature extraction. CNNs map the input data through multiple layers and learn at each layer, which enables them to extract meaningful features from large datasets. Deep learning classification models typically combine convolutional, pooling, and fully connected layers [39]. The convolutional layers primarily extract image features: the shallow layers retrieve edge and texture information, the intermediate layers extract intricate texture information and part of the semantic details, and the deep layers capture high-level semantic features. A max-pooling layer following a convolutional layer preserves the crucial information in the image [40]. The high-level semantic features produced by the feature extractor are categorized by a classifier composed of fully connected layers at the end of the model. In this study, the input image dimensions were set to 270×270×3; along the depth direction the input is divided into several slices, each containing a large number of neurons. Square convolution kernels, such as 16×16, 9×9, or 5×5, act as the weights of these neurons, each of which is connected to a specific local region of the image from which it extracts features.
Let W denote the input image size, F the convolution kernel size, and S the stride of the convolution kernel (typically S=2). Padding P is used to fill the boundary of the input image, with P usually set to 0. The size of the image after convolution is then (W-F+2P)/S+1. Each output feature map combines several input maps through convolution, which can typically be expressed by Eq. (6):

$$x_j^{l} = f\Big(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\Big) \tag{6}$$

where $l$ denotes the l-th layer, $k_{ij}$ is the convolutional kernel, $b_j$ is the bias, $M_j$ is the set of input maps, and $f$ is the activation function. In practice, CNN implementations commonly use a sigmoid or tanh activation function together with an additive bias. For instance, the value of the unit at position $(x, y)$ in the j-th feature map of the i-th layer, denoted $v_{ij}^{xy}$, is given by Eq. (7):

$$v_{ij}^{xy} = \sigma\Big(b_{ij} + \sum_{p=0}^{P_i-1} \sum_{q=0}^{Q_i-1} w_{ij}^{pq}\, v_{(i-1)}^{(x+p)(y+q)}\Big) \tag{7}$$

where $\sigma(\cdot)$ is the sigmoid function, $b_{ij}$ is the bias for the feature map, $P_i$ and $Q_i$ are the height and width of the kernel, and $w_{ij}^{pq}$ is the kernel weight at position $(p, q)$ connected to the j-th feature map of the i-th layer.
AlexNet has eight learned layers, five convolutional and three fully connected, and serves as a foundational framework for contemporary deep neural networks. The convolutional layers, which extract features and identify objects, use filters of various sizes, including 11×11 and 5×5. Max-pooling layers after the first, second, and fifth convolutional layers reduce the spatial dimensions. The fully connected layers perform object categorization: the first two have 4096 neurons each, and the third feeds the final softmax layer, which predicts the object classes in the images. This architectural design, together with the specified filter sizes, has become a cornerstone of deep learning [41,42].
Figure 5 depicts the entire course of this study. For the ML techniques, features were extracted manually from the preprocessed fruit dataset. Manual feature extraction was unnecessary for the DL classifier, which extracts features automatically. The DL network was trained on the preprocessed images and the ML classifiers on the extracted features. The trained models obtained at the end of the training procedure were then used to classify the test dataset [42].
Essentially, the strengths of deep learning, exemplified by AlexNet, originate from its ability to learn intricate features autonomously, to capture hierarchical abstractions, and to provide greater flexibility and scalability. This, in turn, leads to superior performance in image recognition compared with conventional ML approaches.
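The output-size formula (W-F+2P)/S+1 given earlier in this section can be checked with a few lines of code. The sketch below applies it to the 270-pixel input and the 16×16 kernel mentioned in the text with S=2 and P=0; the second call uses the widely cited first layer of the standard AlexNet (227-pixel input, 11×11 kernel, stride 4), which is included only as a familiar reference and is not taken from this study. The function name is hypothetical.

```python
def conv_output_size(w, f, s=1, p=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1.

    Raises ValueError when the kernel does not tile the padded input exactly.
    Deep learning frameworks usually floor the division instead.
    """
    span = w - f + 2 * p
    if span < 0 or span % s != 0:
        raise ValueError("kernel size and stride do not fit the padded input")
    return span // s + 1


# Values from the text: W = 270, F = 16, S = 2, P = 0.
size_from_text = conv_output_size(270, 16, s=2, p=0)

# Standard AlexNet first layer (reference values, not from this study).
size_alexnet_conv1 = conv_output_size(227, 11, s=4, p=0)
```

With W=270 and S=2, the 9×9 and 5×5 kernels do not divide evenly under P=0, which is the case the ValueError guards against.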

Data set
The dataset used in this paper is the Fruit-360 dataset [43], which comprises 131 categories of fruits and vegetables and a total of 90,483 images. The training set consists of 67,692 images and the test set of 22,688 images, with each image showing a single fruit or vegetable. All images in the dataset have been standardized to 100×100 pixels to ensure uniformity. To enhance classification accuracy, the fruits and vegetables were isolated from their backgrounds, because lighting conditions vary. To build an effective, accurate model, the data must be distributed in a balanced way across the training and test sets. A comparison of the two sets reveals a similar image distribution ratio across all categories, which can be verified by calculating the ratios for specific classes in both sets. Figure 6 shows sample images of various fruits from the Fruit-360 dataset.

Confusion matrix
The confusion matrix is a vital tool in ML and DL for evaluating the performance of classification models. It provides a comprehensive analysis of how a model's predictions compare with the actual labels derived from the ground truth. Although it extends to multi-class settings, it is most easily interpreted for binary (two-class) problems [44].
The confusion matrix consists of four essential components:
(1). True Positives (TP): the model correctly predicted the positive class.
(2). True Negatives (TN): the model correctly predicted the negative class.
(3). False Positives (FP): the model predicted the positive class when the actual class was negative.
(4). False Negatives (FN): the model predicted the negative class when the actual class was positive.
The confusion matrix is frequently depicted as a table of the following form:

 | Predicted positive | Predicted negative
Actual positive | TP | FN
Actual negative | FP | TN

The accuracy of a classification model is the proportion of correctly classified samples among all samples:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$

Precision is TP divided by the total number of items predicted as positive (TP plus FP), as stated in formula (9). High precision indicates that the positive predictions of the model are more reliable.

$$Precision = \frac{TP}{TP + FP} \tag{9}$$

Recall, which measures the sensitivity of the model, is TP divided by the total number of items that truly belong to the positive class (TP plus FN), as stated in formula (10).

$$Sensitivity = \frac{TP}{TP + FN} \tag{10}$$
The F1-score is the harmonic mean of sensitivity and precision:

$$F1 = 2 \times \frac{Precision \times Sensitivity}{Precision + Sensitivity} \tag{11}$$

Using the confusion matrix and the related metrics, data scientists and ML practitioners can assess how well their models are working and make informed decisions about model modification and deployment [45].
The models were trained on two subsets obtained from the Fruit-360 dataset. The first consisted of 40 fruit classes, as shown in Figure 7, and the second of 59 fruit classes. After preprocessing, the first experimental dataset contains a total of 24,636 images of the 40 fruit classes. As shown in Table 2, the preprocessed dataset was divided into training and testing subsets of 80% and 20%, respectively. The DL model extracted fruit features automatically, through a series of convolutional operations and without manual extraction, from the second dataset, which contained 30,392 images of various fruits. For the ML algorithms, however, feature extraction must be carried out manually, so the manually derived features were used only for the ML techniques. The classification outcomes of the DL and ML algorithms were evaluated using the accuracy, precision, sensitivity, and F1-score metrics explained in the previous section.
The comparison in Table 3 shows that, among all the feature extraction methods used with the KNN classifier, LBP obtained the best precision, sensitivity, and F1-score, at 96.339%, 95.960%, and 96.072%, respectively. This demonstrates that the quality of the extracted features has a direct impact on the final classification outcome, as also indicated by the LBP value of 96.072% in the accuracy column. Tables 4 and 5 report the performance of the SVM and DT algorithms with the various feature extraction techniques. Each row represents a distinct feature extraction technique, and the columns give the performance metrics (accuracy, precision, sensitivity, and F1-score) attained with that technique. This makes it possible to evaluate the effectiveness of the feature extraction methods and to decide which is best suited to SVM and DT classification.
Table 6 shows that the metrics of the evaluated DL network were superior to those of the evaluated ML algorithms. For instance, the accuracy of the tested ML techniques was 96.07% (KNN), 90.40% (SVM), and 88.58% (DT), whereas the accuracy of the tested DL algorithm, AlexNet, was close to 100% (see Table 6). As Table 6 indicates, AlexNet performs well at identifying objects in images without manual feature extraction and is able to tell apart fruits with similar structures. Among the three ML techniques examined, KNN produced the best classification results, followed by SVM and then DT.
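The four metrics used in this evaluation follow directly from the confusion-matrix counts defined in the previous section. The sketch below computes them for a binary case; the counts in the example are invented for illustration and are not results from this study, and the function name is hypothetical. For a multi-class problem such as fruit classification, the same quantities are typically computed per class and then averaged.

```python
def binary_metrics(tp, tn, fp, fn):
    """Accuracy, precision, sensitivity (recall), and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Invented counts for illustration only.
metrics = binary_metrics(tp=40, tn=45, fp=5, fn=10)
```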

RESULTS AND DISCUSSION
Because the results of each ML classifier vary across the feature extraction methods, with each of Tables 3 to 5 containing a highest and a lowest rate, we select the best accuracy, precision, sensitivity, and F1-score obtained by each classifier and compare them with the DL classification technique in Table 6.

Discussion
In fruit recognition, both ML and DL approaches offer distinct advantages, and each comes with its own considerations. ML algorithms such as KNN, DT, and SVM are effective at classifying fruits based on external characteristics and are interpretable, so users can follow the decision-making process. However, these models are limited in handling the complexity of unstructured image data, which can lead to failures in intricate fruit recognition tasks. DL models, particularly CNNs, excel at handling complex and unstructured image data. They learn intricate features automatically, eliminating the need for manual feature engineering. The trade-off is that they require larger training datasets and are less interpretable because of their complex architectures, which makes model decisions harder to understand. The choice between ML and DL therefore depends on factors such as the specific fruit recognition task, the available data, and the desired balance between interpretability and accuracy. In light of these considerations, it is crucial to provide more insight into model limitations, potential failures, and avenues for improvement. Addressing these limitations and enhancing model robustness will be pivotal for advancing fruit recognition methodologies.
In this study, confusion matrix plots were used together with training and validation plots. The classification result plots show the outcomes for each fruit category with the various ML and DL algorithms, and the confusion matrices show the best result of each of the three ML algorithms together with the corresponding feature. For the KNN algorithm, the best result, 96.07%, is obtained with LBP; for the SVM method, the highest rate is 90.40% with the RGB feature; and for the DT algorithm, the RGB feature also gives the top rate. The rates for every combination of feature and algorithm are given in the tables above.
As seen in Figure 11, the classification of 59 fruit types requires more epochs to stabilize, and the model requires more training because of the high similarity between different fruit types. The AlexNet network reaches an accuracy of nearly 100% on the dataset used, which consists of 30,392 images of various fruits. The superior accuracy of the AlexNet (CNN) model in fruit recognition, compared with traditional ML algorithms, has significant real-world applications and consequences. This increased accuracy stems from the model's ability to learn intricate features directly from image data. AlexNet's complex architecture, with millions of parameters, allows it to capture nuanced relationships within the data, surpassing traditional algorithms even when they rely on meticulous feature engineering. The model's proficiency in image-related tasks is particularly noteworthy in practical scenarios, where its effective handling of spatial hierarchies supports the recognition and categorization of fruits based on visual characteristics. Additionally, transfer learning from pre-trained models enhances its adaptability and performance in fruit recognition, especially when substantial data is available for training and fine-tuning.
These results have broad implications across various domains. Firstly, the high accuracy of the AlexNet model makes it a powerful tool for automating fruit recognition tasks, leading to increased efficiency and reduced reliance on human intervention. The model's generalization capabilities make it applicable to a broader range of fruits, fostering versatility in agricultural and industrial settings. The research advances facilitated by this technology have the potential to drive innovations in fruit-related studies and applications. Moreover, the economic and social impact is noteworthy, as automated and accurate fruit recognition can streamline processes in agriculture, food industry quality control, and nutritional assessment, contributing to enhanced productivity and healthier dietary practices. Overall, the practical applications of the AlexNet model in fruit recognition represent a transformative technology with diverse implications for various sectors and for societal well-being.

CONCLUSION AND FUTURE WORK
In conclusion, our research compared deep learning (DL), represented by the convolutional neural network (CNN) architecture AlexNet, with traditional machine learning (ML) algorithms (DT, KNN, and SVM) for fruit recognition and classification on the Fruit-360 dataset. The AlexNet model performed best, achieving an accuracy of 99.85%, a precision of 99.92%, a sensitivity of 99.86%, and an F1 score of 99.89% in categorizing diverse fruits. The advantage of DL, exemplified by AlexNet, lies in its handling of complex image data through automatic feature extraction, without the manual feature engineering that traditional ML algorithms often require. While traditional ML algorithms remain strong in many domains, our study underscores the potency of DL in image-based classification tasks, as demonstrated on the diverse fruit types and intricate fruit appearances of the Fruit-360 dataset. DL models such as AlexNet handled these complexities efficiently, suggesting that they can significantly raise the accuracy and efficiency of fruit recognition and classification systems that work with image data.
Looking ahead, fruit classification can be expanded to cover a more extensive array of fruit types within the same dataset. This expansion would support the exploration and development of advanced CNN models that surpass the capabilities of AlexNet. A larger dataset allows researchers to build models that capture the distinguishing features of a broader range of fruits, potentially advancing the state of the art in fruit recognition and classification and delivering more precise and robust solutions for a wider spectrum of real-world scenarios.

Figure 1. Flow diagram
3.1.1 Pre-processing
Using segmentation techniques, it is crucial to isolate the fruit region from the background of the image in order to obtain the features needed for classification. In the pre-processing stage, the image is first resized from 100×100 to 150×150 pixels. The image is then converted to grayscale, a gray-level threshold is applied, and the result is converted to black and white, which removes unwanted noise. The final fruit mask provides the region of interest of the segmented fruit, as shown in Figure 2.
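The grayscale, threshold, and mask steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the luminance weights (0.299, 0.587, 0.114), the fixed threshold of 200, the assumption of a light background, and all function names are choices made for this example only, since the paper does not state how the threshold is selected.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of gray levels in 0..255."""
    # Assumed luminance weights; the paper does not specify the conversion.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]


def binary_mask(gray_image, threshold=200):
    """Return 1 for pixels darker than the threshold (assumed fruit), else 0.

    The fixed threshold and the light background are assumptions.
    """
    return [[1 if value < threshold else 0 for value in row]
            for row in gray_image]


def apply_mask(rgb_image, mask, fill=(0, 0, 0)):
    """Keep the pixels inside the mask and replace the rest with `fill`."""
    return [[pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
            for img_row, mask_row in zip(rgb_image, mask)]


# Hypothetical 3x3 image: one red "fruit" pixel on a white background.
WHITE = (255, 255, 255)
RED = (200, 30, 30)
image = [
    [WHITE, WHITE, WHITE],
    [WHITE, RED, WHITE],
    [WHITE, WHITE, WHITE],
]

gray = to_grayscale(image)
mask = binary_mask(gray)
roi = apply_mask(image, mask)
print(mask)
```

In this toy input only the center pixel falls below the threshold, so the mask isolates it and the region of interest keeps its original color while the background is filled.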

Figure 2. Pre-processing steps
3.1.2 Feature extraction
Feature extraction is a technique for discovering more

Figure 4. Bilinear interpolation is used to estimate the values of nearby neighbors
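As an illustration of the operation named in the Figure 4 caption, the sketch below estimates a value at a fractional position from its four nearest pixels. It shows bilinear interpolation in general, under the assumption of a grayscale image stored as nested lists; the function name and the sample patch are hypothetical and are not taken from the paper.

```python
import math


def bilinear(image, x, y):
    """Estimate the gray level at fractional column x and row y."""
    x0, y0 = math.floor(x), math.floor(y)
    # Clamp the right and bottom neighbors so border positions stay valid.
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two rows, then along y between them.
    top = (1 - dx) * image[y0][x0] + dx * image[y0][x1]
    bottom = (1 - dx) * image[y1][x0] + dx * image[y1][x1]
    return (1 - dy) * top + dy * bottom


# Hypothetical 2x2 patch of gray levels.
patch = [[10, 20],
         [30, 40]]
print(bilinear(patch, 0.5, 0.5))
```

At integer coordinates the function returns the pixel itself, and halfway between the four pixels it returns their average.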

Figure 7. 2×2 confusion matrix
4.1 Results of tested ML/DL algorithms
First, several manually extracted features were used to train the same classifier on the different fruit types in order to determine which features gave the highest classification performance. In this way, we could investigate the effects of both the feature extraction techniques and the classifiers on the classification outcomes. We used three popular feature extraction methods for fruit identification and classification: LBP, RGB, and PCA. Table 3 displays the classification outcomes of the KNN classifier with these three feature extraction techniques for 40 classes of fruit. The first column in Table 3 indicates the feature extraction method, and the remaining columns give the precision, sensitivity, F1 score, and accuracy obtained with that method under the KNN classification algorithm. The comparison in Table 3 shows that, among all the feature extraction methods, LBP obtained the best results, with a precision, sensitivity, and F1 score of 96.339%, 95.960%, and 96.072%, respectively. This demonstrates that the quality of the extracted features had a direct impact on the final classification outcome, as indicated by the LBP value of 96.072% in the accuracy column. Tables 4 and 5 report the corresponding performance rates for the SVM and DT algorithms with the same feature extraction techniques. Each row represents a distinct feature extraction technique, and the columns give the performance metrics (accuracy, precision, sensitivity, and F1 score) attained with that technique. This makes it possible to evaluate the effectiveness of the feature extraction methods and to decide which one is best suited to the SVM and DT classification tasks.
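The four metrics reported in Tables 3 to 5 can be derived from the entries of a 2×2 confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and the function name is chosen for this example only. How the paper averages these metrics over its 40 classes is not stated, so this is not a reproduction of the reported numbers.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute precision, sensitivity, F1 score and accuracy from 2x2 counts."""
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)      # correct among predicted positives
    sensitivity = ratio(tp, tp + fn)    # correct among actual positives (recall)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    accuracy = ratio(tp + tn, tp + fp + fn + tn)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for one fruit class against all others.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the F1 score is the harmonic mean of precision and sensitivity, it always lies between the two.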

Figure 10. DT
Figures 8-10 display the confusion matrix plots of the three tested ML algorithms, obtained on the dataset of 40 fruit types with the different feature extraction methods. In each confusion matrix plot, the ordinate represents the predicted label and the abscissa represents the true label. The diagonal of the confusion matrix contains the instances that were correctly identified, whereas the values above and below the diagonal are the examples that were misclassified. Figures 8-10 show, for each of the three ML methods (KNN, SVM, and DT), the confusion matrix of its best-performing feature. For the KNN algorithm, the best result is with LBP, at 96.07%; for the SVM method, the highest rate is 90.40% with the RGB feature; and for the DT algorithm, the RGB feature also has the top rate. The rates of each feature with each algorithm are given in the tables above.
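The layout described above, with predicted labels on the ordinate (rows) and true labels on the abscissa (columns), can be sketched as follows. The labels and function names are hypothetical and serve only to show how the diagonal yields the accuracy; note that some libraries use the transposed convention.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Build a matrix with rows = predicted label and columns = true label."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[predicted]][index[true]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for three fruit classes.
classes = ["apple", "banana", "cherry"]
true = ["apple", "apple", "banana", "banana", "cherry", "cherry"]
pred = ["apple", "cherry", "banana", "banana", "cherry", "apple"]

cm = confusion_matrix(true, pred, classes)
for row in cm:
    print(row)
print(f"accuracy: {accuracy(cm):.4f}")
```

Here four of the six samples fall on the diagonal, and the two off-diagonal entries show an apple predicted as cherry and a cherry predicted as apple.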

Figure 11. Training and validation plots over 5 epochs for recognition on the 59-fruit dataset

Table 1. Summary and comparison of the related work outlined in Section 2

Table 2. Name and number of each fruit

Table 3. The outcomes for three feature extraction techniques using the KNN classifier

Table 4. The outcomes for three feature extraction techniques using the SVM classifier

Table 5. The outcomes for three feature extraction techniques using the DT classifier

Table 6. Results for the tested ML and DL algorithms