Hybrid Enhanced Featured AlexNet for Milled Rice Grain Identification

Nabin Kumar Naik | Prabira Kumar Sethy* | Appari Geetha Devi | Santi Kumari Behera

Department of Electronics, Sambalpur University, Burla 768019, India

Department of Electronics and Communication Engineering, Prasad V. Potluri Siddhartha Institute of Technology, Vijayawada 520007, India

Department of Computer Science and Engineering, Veer Surendra Sai University of Technology, Burla 768018, India

Corresponding Author Email: 
Received: 3 March 2023
Revised: 23 April 2023
Accepted: 3 May 2023
Available online: 30 June 2023

© 2023 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).



Rice is a widely cultivated grain with numerous genetic variants that can be distinguished by their unique texture, shape, and color characteristics. Accurate classification and evaluation of seed quality depend on the ability to identify these traits. In this study, we propose a novel Hybrid Enhanced Featured AlexNet model for identifying eight varieties of milled rice, including arborio, basmati, ipsala, jasmine, jhili, masoori, HMT, and karacadag. Our approach combines the use of a pre-trained AlexNet model with multilayer feature fusion to extract deep features, which are then supplied to a Support Vector Machine (SVM) for classification. Our proposed model achieves an impressive accuracy of 99.63%, sensitivity of 99.63%, specificity of 99.95%, precision of 99.64%, and an F1 score of 99.63%. Our methodology has significant potential for application in the food processing sector to determine the price of various milled rice varieties.


AlexNet, deep learning, feature fusion, identification, milled rice

1. Introduction

Rice is a widely produced and consumed crop globally, ranking third after wheat and corn. It is a rich source of carbohydrates and starch that is essential for human nourishment. The price of rice depends on several factors, including texture, shape, color, and fracture rate. Algorithms can determine these parameters and perform categorization by analyzing digital photos of rice products [1]. Machine learning and deep learning (DL) algorithms offer fast and reliable data analysis, making them an ideal tool for developing automated, inexpensive, efficient, and non-destructive ways to increase rice quality and meet food safety guidelines.

Consumers demand high-quality, safe agricultural and food goods, making strong food safety laws and mandatory product testing necessary to ensure their safety. As such, the food industry seeks fast and accurate technologies to develop harmless products that meet market quality standards. Recent research utilizes image processing and machine learning to achieve this goal. For example, a study identified the projection areas of wheat, barley, corn, chickpeas, lentils, beans, kidney beans, and soy using image processing. Image processing can accurately determine small grain projection zones [1].

Recent studies employ computerized image attributes to classify and grade rice based on geometric parameters such as length, perimeter, area, major axis, and minor axis, as well as fracture rate, whiteness, and rice grain cracks. Image-processing systems can extract grain product characteristics, and deep convolutional neural networks are particularly effective for feature extraction and picture categorization [1].

Rice quality is judged on physical appearance, cooking qualities, aroma, and taste, and evaluating these attributes efficiently is difficult. From a consumer's standpoint, physical appearance is the first attribute that stands out in packaged rice varieties [2]. After production, the demand for automated approaches increases, since manual calibration, variety determination, and separation by quality are inefficient and time-consuming, especially for high-volume producers.

Recent studies on cereal products that employ machine vision systems and image processing analyze color, texture, quality, and size. Yadav and Jindal [3] estimated milled rice quantities by extracting the perimeter, length, and shape of the grains. Visen et al. [4] employed an artificial neural network to analyze images of barley, rye, oats, wheat, and durum wheat. Using the collected images, they built a classifier that recognizes grains based on over 150 color and texture attributes. Baykan et al. [5] studied wheat grains and extracted nine morphological attributes from the images. They classified five species with 72.62% accuracy, which increased to 82.65% after removing a difficult-to-classify species. Dubey et al. [6] studied three varieties of bread wheat and achieved 88% classification accuracy by extracting 45 morphological characteristics. Zapotoczny et al. [7] used image analysis to classify five barley varieties with methods such as principal component analysis and linear and nonlinear discriminant analysis.

Various classification approaches have also been used to categorize cereal crops based on their morphological and color features. For instance, Chen et al. [8] extracted 13 shape, 17 geometric, and 28 color features from five corn cultivars and achieved 90% accuracy among corn types using LDA as the classification approach. Babalik et al. [9] used nine geometric and three color features to classify five wheat species with multi-class SVM (M-SVM) and Binary Particle Swarm Optimization (BPSO), attaining 91.5% accuracy with M-SVM and 92.2% accuracy with BPSO. Ouyang et al. [10] created an automatic image capture band system to distinguish five varieties of rice seeds using backpropagation-based classification, which achieved a success rate of 86.65%.

Farahani [11] applied linear discriminant analysis to five feature clusters by extracting morphological data from five cultivars of durum wheat, and achieved 67.66% classification accuracy using 11 morphological features. Silva and Sonnadara [12] identified rice varieties using neural networks by extracting 13 morphological, six color, and 15 texture features from images of nine rice varieties, and achieved a 92% success rate when classifying with all features. Texture features performed better than morphological and color features when each group was classified separately. Kaur and Singh [13] classified and graded rice with multiclass SVMs using geometric features, and achieved over 86% success. Abirami et al. [14] classified basmati rice using image processing and neural network pattern recognition. They extracted morphological properties of rice grains using filtering, thresholding, and edge detection, and then categorized them using neural network pattern recognition. Pazoki et al. [15] identified five rice varieties using 24 color, 11 morphological, and four shape factor features, and obtained 99.46% success. Szczypiński et al. [16] investigated barley cultivar determination based on shape, color, and texture, and achieved success rates of 67% to 86% using linear discriminant analysis and neural networks as classification approaches.

AlexNet, the first CNN to win the ImageNet challenge (in 2012), achieved a top-5 error rate of 16.4% and made use of ReLU activations. It consists of five convolutional, three max-pooling, and three fully connected layers, and accepts a 227 × 227 × 3 input image. AlexNet can represent a 227 × 227 image as a 4096-dimensional feature vector [17]. Both transfer learning (TL) and deep learning (DL) approaches have been used to evaluate AlexNet for rice classification, and its architecture has been enhanced with feature fusion. Among pre-trained CNN models, only AlexNet, VGG16, and VGG19 support multilayer feature fusion. AlexNet's architecture and low loss make it well suited to this research. Although VGG has more layers and a deeper architecture than AlexNet for feature extraction, it is slower and yields a higher loss than AlexNet, increasing model complexity and cost. Therefore, AlexNet with multilayer feature fusion has been used to balance model complexity and performance [18, 19].

Boser et al. [20] proposed a training procedure that maximizes the margin between the training patterns and the decision boundary. Kaya and Saritas [21] used 236 morphological, color, wavelet, and gaborlet features to classify vitreous and starchy durum wheat kernels and foreign objects. They trained several Artificial Neural Networks (ANN) on different feature subsets chosen from a feature ranking obtained with an ANOVA test, achieving a classification accuracy of 93.46%.

SVM is a supervised binary classification method that, given a training set, constructs a hyperplane that maximizes the margin between the classes. For linearly separable two-class data, many separating hyperplanes can exist, and the SVM selects the one with the greatest distance to the support vectors. However, real-world data are often not linearly separable, so a linear classifier cannot always be used. This led to the development of the kernel-based nonlinear SVM classifier. Although the kernel technique is powerful, choosing the right kernel for a specific application or feature combination remains a challenge. In this study, a linear kernel function without any optimization was used [22].
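For reference, the maximum-margin hyperplane described above is commonly written as the optimization problem below. The notation is an assumption introduced here and does not appear in the article: the x_i are feature vectors, the y_i in {-1, +1} are class labels, w is the hyperplane normal, and b is the offset. This is the textbook hard-margin form for linearly separable data, for which the margin width is 2/||w||.

```latex
\min_{\mathbf{w},\, b} \; \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1,
\qquad i = 1, \dots, n
```

With a linear kernel, the decision function reduces to the sign of w^T x + b.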

The major contributions of this research are as follows:

• An open-access milled rice grain image dataset was created for this study.

• The TL and DL methods were applied for the first time to the classification of milled rice grain images.

• A multi-layer feature fusion approach was developed for classifying eight varieties of milled rice grain.

• In the DL approach, the deep features of the three fully connected layers of AlexNet are evaluated. Feature fusion is then performed by merging the two best-performing sets of features.

• A linear SVM is used to classify the deep features.

The highest classification accuracy was 99.63%, obtained from AlexNet with the combination of two feature sets, fc6 and fc7, and an SVM.

The remainder of the article is organized as follows. Section 2 describes the materials and methods, including the dataset and the proposed methodology. Section 3 presents and analyzes the findings of the research. Finally, Section 4 concludes the article with future scope.

2. Material and Method

This section details the dataset and the proposed methodology in the corresponding subsections.

2.1 About dataset

The dataset includes eight varieties of rice: Arborio, Basmati, Ipsala, Jasmine, Karacadag, Jhili, HMT (Sona Masoori), and Masoori. It contains 112,000 images of milled rice, 14,000 per variety. The images were captured with a 64-megapixel smartphone camera in normal daylight against a black background, with extra care taken to avoid shadows. The dataset is publicly available at https://data.mendeley.com/datasets/c5y6gjwdzh/1.

Some samples of milled rice grain are illustrated in Figure 1.

2.2 Proposed methodology

With AlexNet, classification can be carried out in two ways: the TL approach and the DL approach. The two differ in how the model learns to make its predictions. In the TL approach, the pre-trained AlexNet is restructured so that it classifies images into eight categories. In the DL approach, the deep features extracted from each fully connected layer (fc6, fc7, and fc8) are supplied individually to an SVM for classification. The TL approach was observed to perform less effectively than the DL approach. After the image features contributed by each of the three fully connected layers are analyzed, the two best-performing feature sets are combined, which increases the dimension of the feature vector. Figure 2 illustrates the evaluation procedure used in both the TL and DL approaches.

Figure 1. Samples of milled rice

Figure 2. Evaluation of the transfer learning and deep learning approaches of AlexNet for milled rice identification

In the DL approach, the deep features derived from fc6, fc7, and fc8 are supplied to the SVM in turn. After each configuration is analyzed, feature fusion is introduced. This technique combines the features of the two layers that contribute most to classification. In this step, the fc6 and fc7 features are extracted and merged by concatenation. The combined feature vector is then used for both training and prediction. Figure 3 provides a visual representation of this process.

Figure 3. Feature fusion of AlexNet and SVM for milled rice identification
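The concatenation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `fuse_features` is hypothetical, NumPy is used for convenience, and the activations are random placeholders standing in for real fc6 and fc7 outputs, which would come from the pre-trained network.

```python
import numpy as np

FC_DIM = 4096  # size of the fc6 and fc7 outputs, as stated in the article


def fuse_features(fc6_feats: np.ndarray, fc7_feats: np.ndarray) -> np.ndarray:
    """Concatenate per-image fc6 and fc7 activations along the feature axis.

    Hypothetical helper: both inputs have shape (num_images, 4096).
    """
    if fc6_feats.shape[0] != fc7_feats.shape[0]:
        raise ValueError("fc6 and fc7 must describe the same number of images")
    return np.concatenate([fc6_feats, fc7_feats], axis=1)


# Placeholder activations for 5 images (random values, not real features).
rng = np.random.default_rng(0)
fc6 = rng.standard_normal((5, FC_DIM))
fc7 = rng.standard_normal((5, FC_DIM))

fused = fuse_features(fc6, fc7)
print(fused.shape)  # (5, 8192)
```

Each row of `fused` is the 8192-dimensional vector that would be passed to the classifier for training and prediction.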

3. Result and Discussion

The experiment is carried out in two phases. In the first phase, both the TL approach and the DL approach are applied to AlexNet. AlexNet did not perform well under the TL approach, achieving an accuracy of only 44.32%. The DL approach uses a support vector machine for classification rather than the softmax layer. In this approach, the eight types of rice are classified using deep features extracted from fc6, fc7, and fc8. Each method is evaluated on accuracy, sensitivity, specificity, precision, and F1 score. Table 1 reports this performance evaluation. In the DL approach, both the fc6 and the fc7 features of AlexNet combined with SVM performed well. In the second phase, the deep features of fc6 and fc7 are fused by concatenation. The fc6 and fc7 layers each yield 4096 features, so the fused feature vector has dimension 8192, twice that of either layer alone. An SVM trained on the fused features then classifies the eight varieties of rice. The SVM was trained with the 'fitcecoc' function (fit error-correcting output codes). Error-correcting output codes (ECOC) reframe a multi-class classification problem as multiple binary classification problems, so that binary classification models can be used directly. The function returns a fully trained multiclass ECOC model. The 'fitcecoc' function uses K(K-1)/2 binary SVM models with a one-versus-one coding design, where K is the number of classes. The fused fc6 and fc7 features of AlexNet combined with SVM performed well and improved the classification accuracy in identifying the eight types of rice.
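A learner count of K(K-1)/2 corresponds to a one-versus-one coding design, in which each binary learner separates one pair of classes and ignores the rest. The sketch below builds such a coding matrix for K = 8 in plain Python. It illustrates the coding scheme only and is not the 'fitcecoc' implementation; the function name `one_vs_one_coding` is hypothetical.

```python
from itertools import combinations


def one_vs_one_coding(num_classes: int) -> list[list[int]]:
    """Build a one-versus-one ECOC coding matrix.

    Rows are classes and columns are binary learners. An entry is +1 for the
    learner's positive class, -1 for its negative class, and 0 when the class
    is ignored by that learner.
    """
    pairs = list(combinations(range(num_classes), 2))
    matrix = [[0] * len(pairs) for _ in range(num_classes)]
    for col, (pos, neg) in enumerate(pairs):
        matrix[pos][col] = 1
        matrix[neg][col] = -1
    return matrix


K = 8  # eight rice varieties
coding = one_vs_one_coding(K)
num_learners = len(coding[0])
print(num_learners)  # 28, i.e. K * (K - 1) / 2
```

For eight varieties this gives 28 binary SVMs, each trained on the images of two varieties only.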

Table 1. Performance evaluation of AlexNet in transfer learning and deep learning approach

Performance metrics in (%) | AlexNet in transfer learning approach | AlexNet in Deep learning
F1 score

Table 2. Evaluation indicators (%) of AlexNet in feature fusion approach for rice classification

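Performance metrics in (%) | AlexNet (fc6 + fc7) with SVM
Accuracy | 99.63
Sensitivity | 99.63
Specificity | 99.95
Precision | 99.64
F1 score | 99.63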

Table 2 reports the performance metrics of the fused features (fc6 + fc7) of AlexNet with SVM. The fused features require more computational time than the individual features, but the other metrics are high, which is the desired outcome.
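The five reported metrics can all be derived from a multi-class confusion matrix. The sketch below shows one common way to do so, treating each class one-versus-rest and macro-averaging across classes. The averaging scheme is an assumption, since the article does not state how the per-class values were combined, and the confusion matrix is a made-up three-class toy example, not the study's results.

```python
def macro_metrics(cm):
    """Macro-averaged metrics from a square confusion matrix.

    cm[i][j] counts the samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    sens, spec, prec, f1 = [], [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # missed samples of class c
        fp = sum(cm[r][c] for r in range(k)) - tp  # other classes labeled c
        tn = total - tp - fn - fp
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        sens.append(recall)
        spec.append(tn / (tn + fp) if tn + fp else 0.0)
        prec.append(precision)
        f1.append(2 * precision * recall / denom if denom else 0.0)
    return {
        "accuracy": sum(cm[c][c] for c in range(k)) / total,
        "sensitivity": sum(sens) / k,
        "specificity": sum(spec) / k,
        "precision": sum(prec) / k,
        "f1": sum(f1) / k,
    }


# Toy 3-class confusion matrix with made-up counts.
toy_cm = [
    [9, 1, 0],
    [0, 10, 0],
    [1, 0, 9],
]
scores = macro_metrics(toy_cm)
for name, value in scores.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this scheme, specificity tends to exceed accuracy in a multi-class problem, because each class's true negatives include all correctly handled samples of the other classes.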

4. Conclusion

Our study aimed to develop an automated method for identifying milled rice that can be used in the food processing sector. The proposed methodology was applied to eight types of rice, namely Arborio, Basmati, Ipsala, Jasmine, Karacadag, Jhili, HMT (Sona Masoori), and Masoori, and achieved an accuracy, sensitivity, specificity, precision, and F1 score of 99.63%, 99.63%, 99.95%, 99.64%, and 99.63%, respectively. The outcome was analyzed and found to be satisfactory.


[1] Kikuchi, M., Haneishi, Y., Tokida, K., Maruyama, A., Asea, G., Tsuboi, T. (2016). The structure of indigenous food crop markets in sub-Saharan Africa: The rice market in Uganda. The Journal of Development Studies, 52(5): 646-664. http://doi.org/10.1080/00220388.2015.1098629 

[2] Manjunatha, G.A., Elsy, C.R., Joseph, J., Francies, R.M. (2021). Molecular characterization and genetic diversity analysis of aromatic rice (Oryza sativa L.) landraces using SSR markers. Electronic Journal of Plant Breeding, 12(2): 576-582. http://doi.org/10.37992/2021.1202.081 

[3] Yadav, B.K., Jindal, V.K. (2001). Monitoring milling quality of rice by image analysis. Computers and Electronics in Agriculture, 33(1): 19-33. http://doi.org/10.1016/S0168-1699(01)00169-7 

[4] Visen, N.S., Paliwal, J., Jayas, D., White, N.D.G. (2004). Image analysis of bulk grain samples using neural networks. Canadian Biosystems Engineering, 46: 11-15. 

[5] Baykan, O., Babalik, A., Botsalı, F. (2005). Recognition of wheat species using artificial neural network. International Symposium on Advanced Technologies, Konya, pp. 28-30.

[6] Dubey, B., Bhagwat, S.G., Shouche, S.P., Sainis, J.K. (2006). Potential of artificial neural networks in varietal identification using morphometry of wheat grains. Biosystems Engineering, 95(1): 61-67. https://doi.org/10.1016/j.biosystemseng.2006.06.001

[7] Zapotoczny, P., Zielinska, M., Nita, Z. (2008). Application of image analysis for the varietal classification of barley: Morphological features. Journal of Cereal Science, 48(1): 104-110. https://doi.org/10.1016/j.jcs.2007.08.006

[8] Chen, X., Xun, Y., Li, W., Zhang, J. (2010). Combining discriminant analysis and neural networks for corn variety identification. Computers and Electronics in Agriculture, 71: S48-S53. https://doi.org/10.1016/j.compag.2009.09.003

[9] Babalik, A., Baykan, Ö.K., İşcan, H., Babaoğlu, İ., Fındık, O. (2010). Effects of feature selection using binary particle swarm optimization on wheat variety classification. In International Conference on Advances in Information Technology, Bangkok, Thailand, pp. 11-17. https://doi.org/10.1007/978-3-642-16699-0_2 

[10] Ouyang, A.G., Gao, R.J., Sun, X.D., Pan, Y.Y., Dong, X.L. (2010). An automatic method for identifying different variety of rice seeds using machine vision technology. In Sixth International Conference, Yantai, China, pp. 84-88. http://doi.org/10.1109/ICNC.2010.5583370

[11] Farahani, L. (2012). Discrimination of some cultivars of durum wheat (Triticum durum Desf.) using image analysis. International Research Journal of Applied and Basic Sciences, 3(7): 1375-1380.

[12] Silva, C.S., Sonnadara, U. (2013). Classification of rice grains using neural networks. Proceedings of Technical Sessions, 29: 9-14.

[13] Kaur, H., Singh, B. (2013). Classification and grading rice using multiclass SVM. International Journal of Scientific and Research Publications, 3(4): 624-628.

[14] Abirami, S., Neelamegam, P., Kala, H. (2014). Analysis of rice granules using image processing and neural network pattern recognition tool. International Journal of Computer Applications, 96(7): 20-24.

[15] Pazoki, A.R., Farokhi, F., Pazoki, Z. (2014). Classification of rice grain varieties using two Artificial Neural Networks (MLP and Neuro-Fuzzy). The Journal of Animal & Plant Sciences, 24(1): 336-343.

[16] Szczypiński, P.M., Klepaczko, A., Zapotoczny, P. (2015). Identifying barley varieties by computer vision. Computers and Electronics in Agriculture, 110: 1-8. https://doi.org/10.1016/j.compag.2014.09.016

[17] Krizhevsky, A., Sutskever, I., Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems, Lake Tahoe, CA, USA, pp. 1097-1105.

[18] Krizhevsky, A., Sutskever, I., Hinton, G.E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6): 84-90. https://doi.org/10.1145/3065386

[19] Sethy, P.K. (2022). Identification of wheat tiller based on AlexNet-feature fusion. Multimedia Tools and Applications, 81(6): 8309-8316. https://doi.org/10.1007/s11042-022-12286-4

[20] Boser, B.E., Guyon, I.M., Vapnik, V.N. (1992). A training algorithm for optimal margin classifiers. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, PA, USA, pp. 144-152. https://doi.org/10.1145/130385.130401

[21] Kaya, E., Saritas, İ. (2019). Towards a real-time sorting system: Identification of vitreous durum wheat kernels using ANN based on their morphological, colour, wavelet and gaborlet features. Computers and Electronics in Agriculture, 166: 105016. https://doi.org/10.1016/j.compag.2019.105016

[22] Sun, C., Liu, T., Ji, C., Jiang, M., Tian, T., Guo, D., Liang, X. (2014). Evaluation and analysis the chalkiness of connected rice kernels based on image processing technology and support vector machine. Journal of Cereal Science, 60(2): 426-432. https://doi.org/10.1016/j.jcs.2014.04.009