An Automated Mucilage Detection Model Using Deep Convolutional Neural Network: TuncerNeXt

Mert Gurturk*, Veysel Yusuf Cambay, Rena Hajiyeva, Sengul Dogan, Turker Tuncer

Civil Engineering Department, Engineering Faculty, Adiyaman University, Adiyaman 02020, Turkey

Department of Digital Forensics Engineering, College of Technology, Firat University, Elazig 23119, Turkey

Department of Information Technologies, Western Caspian University, Baku AZ1001, Azerbaijan

Corresponding Author Email: sdogan@firat.edu.tr
Page: 1333-1342 | DOI: https://doi.org/10.18280/ts.420310
Received: 24 July 2024 | Revised: 23 October 2024 | Accepted: 8 November 2024 | Available online: 30 June 2025

© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).

OPEN ACCESS

Abstract: 

Convolutional Neural Networks (CNNs) are distinguished for their exceptional performance in image classification. A number of these models have been developed, drawing inspiration from seminal works. This research introduces an innovative CNN model that integrates attention mechanisms specifically tailored for detecting mucilage on the ocean surface. To facilitate this research, a comprehensive dataset was assembled from 15 disparate ports, segmented into three distinct categories: the presence of mucilage, sea surface without waves, and sea waves. The rationale for including the sea wave category is to augment the accuracy of the proposed CNN model by accounting for the morphological similarities between sea waves and mucilage. The developed model, termed TuncerNeXt, comprises four principal components: a stem, TuncerNeXt blocks, downsampling stages, and an output phase. The novelty of TuncerNeXt resides in its fusion of attention mechanisms with residual blocks, taking cues from the structural design of ConvNeXt's principal block. This innovative approach has resulted in TuncerNeXt being a streamlined CNN model, boasting approximately 2.1 million adjustable parameters, rendering it an efficacious approach for image classification endeavors. Upon evaluation with the compiled dataset, TuncerNeXt achieved a validation accuracy of 97.60% and a test accuracy of 98.66%.

Keywords: 

automatic mucilage mapping, computer vision, image classification, mucilage detection, TuncerNeXt

1. Introduction

Mucilage, a jelly-like substance produced by various marine organisms, poses an ecological problem [1, 2]. Marine organisms produce it primarily as a defense mechanism against predators or environmental stressors, and it can significantly impact marine ecosystems [3]. It consists of a complex mixture of organic and inorganic substances originating from phytoplankton, zooplankton, and bacteria [4]. The formation of mucilage causes clogging of fishing nets and boat propellers. It also seriously threatens marine life by choking coral reefs and other habitats [5]. For example, it can hinder the respiratory function of fish and other marine creatures by clogging their gills and depleting oxygen levels in marine habitats, thus compromising their survival [6]. The formation of mucilage events is complex and not fully understood. However, several contributing factors have been identified, including nutrient pollution from excess nitrogen and phosphorus, which promotes the growth of mucilage-producing organisms. Climate change, manifested as higher water temperatures and increased ocean acidification, also affects the proliferation of these organisms. Overfishing of mucilage-consuming species can lead to uncontrolled increases in mucilage production [7, 8].

Monitoring mucilage events poses significant challenges due to the laborious and expensive nature of traditional methods such as ship-based surveys and satellite image analysis [9, 10]. In contrast, automatic mucilage detection models are emerging as a more efficient and cost-effective solution. With advancing technology, advanced Convolutional Neural Networks (CNNs) have made significant progress, especially in marine conservation [11]. CNNs can achieve great success in image recognition and classification [12-14]. In particular, CNNs are used to detect complex natural formations such as mucilage on the sea surface [15]. A CNN-based approach was proposed in this study due to its ability to detect complex patterns and structures in images. From simple edge detection to increasingly complex object recognition features, CNNs can delve deeper into images, performing layer-by-layer feature extraction [16]. This approach can even distinguish subtle differences between mucilage and other natural formations on the water surface, vital for monitoring and managing marine pollution. Therefore, a CNN-based model provides a powerful, effective, and accurate solution for this task [17].

In this study, we developed TuncerNeXt to detect mucilage, a jelly-like substance produced by some marine plants and bacteria that poses problems for marine life and human activities. TuncerNeXt is an advanced Convolutional Neural Network (CNN) that analyzes ocean images to identify mucilage accurately. To train TuncerNeXt, a large dataset of ocean images from 15 different locations was compiled. Our results demonstrate the high accuracy of TuncerNeXt, making it a useful tool for researchers and others monitoring ocean health. This advance contributes to the conservation of marine ecosystems by providing new information for understanding and managing mucilage.

1.1 Related works

New machine learning models have been proposed in the literature to solve problems in different disciplines [18-20]. Some machine learning-based mucilage detection models developed in the literature are presented as follows. Yilmaz et al. [21] employed a cloud-free Sentinel-2 image and innovative water-related spectral indices (NDTI, NDWI, and AMEI) within a CNN model to detect marine mucilage formations in the Dardanelles Strait, Turkey, achieving classification accuracies up to 98%. Using the AMEI index notably enhanced mucilage detection accuracy, as validated by explainable AI techniques such as SHAP and integrated gradients, demonstrating the potential of integrating remote sensing and deep learning for environmental monitoring. Kikaki et al. [22] focused on utilizing the MADOS benchmark dataset, which comprised high-resolution multispectral Sentinel-2 data collected between 2015 and 2022. Their dataset included approximately 1.5 million annotated pixels across 174 scenes, encompassing a diverse range of pollutants, sea surface features, and global water-related thematic classes under various weather conditions. Their deep learning framework, MariNeXt, based on recent architectural advancements for semantic segmentation, was proposed and demonstrated to outperform all baselines by at least 12% in F1 and mIoU metrics. Colkesen et al. [23] suggested a comprehensive approach for mapping floating algal blooms in Lake Burdur, Turkey, by analyzing seven Sentinel-2 images selected through time series analysis on the Google Earth Engine platform. Their methodology integrated both index-based mapping, using indices like FAI, AFAI, SABI, and ABDI, and classification-based mapping with algorithms such as RF, XGBoost, and LSTM. This approach successfully detected high-density floating algae formations with an accuracy exceeding 99% for both methods, highlighting pixel-based classification's particular efficacy for low-density blooms. 
The findings underscore the utility of merging spectral indices and machine learning techniques in environmental monitoring tasks, notably for the precise mapping of algal blooms in freshwater bodies. Figueroa et al. [24] explored phytoplankton detection in freshwater using deep learning, employing Faster R-CNN and RetinaNet on a dataset of 293 images capturing diverse species and conditions. Their study demonstrated Faster R-CNN's superiority in precision and recall, achieving up to 95.35% recall and 94.68% precision for specific phytoplankton types. Tokatlı et al. [25] assessed organic contaminants in the Çanakkale Strait Basin, Turkey, to understand the mucilage threat, analyzing water samples for eight parameters in spring 2023. They used indices like NPI, WQI, HQ, and HI for water quality and health risk assessment. Their research found increased organic pollution from upstream to downstream, with Çanakkale Stream being the most polluted, yet indicated minimal non-carcinogenic health risks for humans.

1.2 Literature gaps

The identified literature gaps according to the reviewed literature are:

  • There is stagnation in proposing new-generation CNN [26-28] models since many researchers have used transformers to classify images due to advancements in large language models, vision transformers [29], and Swin transformers [30]. Moreover, champion CNNs already achieve very high classification performances. Therefore, most researchers have not proposed new-generation CNNs.
  • Mucilage is a global problem for the seas. Some researchers have proposed mucilage detection models using deep learning and machine learning, but the number of these studies is limited.
  • Mucilage detection models have generally been applied with two classes.

1.3 Motivation and study outline

The objective of this study is twofold: to enhance mucilage detection on the ocean's surface and to contribute significantly to advancing the domain of image classification within marine science. Mucilage, a gelatinous substance secreted by marine algae, poses increasing threats to marine ecosystems by obstructing the movement of marine organisms, reducing sunlight penetration, and decreasing oxygen levels in aquatic environments. These adverse effects highlight the critical need for effective detection and monitoring strategies, thereby motivating the development of a sophisticated model adept at identifying mucilage under the complex conditions of the sea surface.

In response to the limitations of existing detection methods and the imperative for enhanced accuracy, this research undertakes the compilation of a comprehensive dataset that accurately reflects the intricacies of marine settings. Unparalleled in its breadth and specificity, the dataset encompasses images of mucilage, unblemished sea surfaces, and sea waves sourced from 15 distinct ports across the globe. Such varied data ensures that our TuncerNeXt model benefits from exposure to a broad spectrum of maritime conditions, thereby improving its precision and reliability.

TuncerNeXt marks an innovation in CNN architecture, challenging the current stagnation in the field by merging the robust features of CNNs with the sophisticated capabilities of transformer models. By integrating a novel attention mechanism and a modified ConvNeXt block [31], TuncerNeXt accentuates essential attributes within image data, leading to markedly accurate classification results. The model's design emphasizes scalability, allowing adjustments to meet diverse computational demands without sacrificing efficacy. This scalability, coupled with the incorporation of transformer-like features, establishes TuncerNeXt as an avant-garde approach to addressing the complex challenge of detecting marine mucilage.

1.4 Novelties and contributions

The innovative aspects and contributions of this research are:

Novelties:

  • A comprehensive dataset of mucilage images has been compiled, serving as a foundational resource for future endeavors in mucilage detection research.
  • An attention-based CNN model has been introduced, with its classification efficacy validated on the newly curated mucilage image dataset.

Contributions:

  • Recognizing the challenges in computer vision, this study employs a novel deep-learning strategy to address an environmental concern: the detection of mucilage. Given the extensive areas of seas and the limited resolution of satellite images for accurate mucilage identification, our investigation presents a specialized detection model suited for deployment on Unmanned Aerial Vehicles (UAVs). This approach utilizes a distinctive dataset of mucilage images acquired through UAVs, incorporating sea wave images to bolster the model's detection robustness.
  • Additionally, this research contributes a significant methodological advancement to the field of CNNs by creating TuncerNeXt, an innovative CNN model. Despite being designed for minimal computational load, with roughly 2.1 million parameters, TuncerNeXt showcases remarkable accuracy, achieving 98.66% on test datasets and 97.60% on validation datasets utilizing the assembled mucilage images. This achievement highlights TuncerNeXt's operational efficiency and its potential impact on environmental monitoring tasks.
2. Mucilage Image Dataset

For this investigation, we developed a unique dataset consisting of mucilage imagery collected from 15 different ports with a particular focus on the Sea of Marmara. This focus was strategically chosen due to the Sea of Marmara's characteristics as a semi-enclosed sea, where instances of mucilage have been increasingly reported, largely attributed to pollution from Istanbul and surrounding urban locales. The dataset organizes the images into three distinct categories: mucilage presence, clear sea surfaces, and sea wave scenarios, aiming to support a thorough examination of mucilage detection across varied maritime conditions.

This dataset embodies a heterogeneous assortment of images, systematically classified into the aforementioned three categories, as detailed in Table 1.

Table 1. The distribution of the collected mucilage image dataset

| No. | Class | Train | Test | Total |
|---|---|---|---|---|
| 1 | Mucilage | 2051 | 680 | 2731 |
| 2 | Clean sea cover | 3374 | 1120 | 4494 |
| 3 | Sea wave | 1560 | 515 | 2075 |
| | Total | 6985 | 2315 | 9300 |

As shown in Table 1, the dataset displays class imbalance, a prevalent issue in machine learning research, which our methodology seeks to mitigate. Additionally, Figure 1 presents exemplary images from each category, visually representing the dataset's diversity.
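The class imbalance noted above can be quantified directly from the counts in Table 1. The following is a minimal sketch; the function name `class_shares` and the imbalance indicator (largest class share divided by smallest) are illustrative choices, not part of the original study.

```python
# Image counts per class as (train, test), copied from Table 1.
counts = {
    "Mucilage": (2051, 680),
    "Clean sea cover": (3374, 1120),
    "Sea wave": (1560, 515),
}


def class_shares(table):
    """Return each class's share of all images as a fraction of the grand total."""
    grand_total = sum(train + test for train, test in table.values())
    return {name: (train + test) / grand_total for name, (train, test) in table.items()}


train_total = sum(train for train, _ in counts.values())
test_total = sum(test for _, test in counts.values())
shares = class_shares(counts)

# Ratio between the largest and smallest class: a simple imbalance indicator
# (an illustrative measure, not one reported in the paper).
imbalance_ratio = max(shares.values()) / min(shares.values())

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"train={train_total}, test={test_total}, imbalance ratio={imbalance_ratio:.2f}")
```

The column totals computed here agree with the totals row of Table 1, and the clean sea cover class is a little over twice the size of the sea wave class.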

Figure 1. Sample images of the collected dataset: (a) Mucilage, (b) Clean sea cover, (c) Sea wave

Beyond mere collection and categorization, substantial effort was expended on the precise segmentation of the images into their respective classes. This process utilized automated methodologies and manual validation to guarantee the dataset's integrity and applicability for training and evaluative purposes. Focusing on the Sea of Marmara reflects the geographical significance of mucilage occurrences and augments the dataset's specificity, facilitating the development of more effective detection models. The compilation of this dataset constitutes a pivotal advancement in enhancing both scientific comprehension and technological proficiency in tackling the ecological issue of mucilage within marine settings.

3. The Proposed CNN Model: TuncerNeXt

The principal innovation of this study is the introduction of TuncerNeXt, a new-generation CNN model. To design TuncerNeXt, a strategic roadmap was developed, beginning with inspiration from attention mechanisms known to enhance classification performance significantly. In the second phase, we employed a modified ConvNeXt block and integrated it within an attention framework. Subsequently, the model incorporates a block inspired by transformer technology, specifically an inverted bottleneck block, to further refine its capabilities. To elucidate the architecture of the main block within TuncerNeXt, Figure 2 provides a graphical representation, offering a clear visual outline of its structure.

The mathematical definition of the proposed main block of the TuncerNeXt is given below:

$Att=BN\left( Sigmoid\left( C\left( In,1,F \right) \right) \right)\times BN\left( Sigmoid\left( C\left( In,1,F \right) \right) \right)$                    (1)

Figure 2. Graphical explanation of the presented main block. Herein, F: Number of filters, Concat: Depth concatenation, Grouped: Grouped convolution, BN: Batch normalization, GELU: Gaussian error linear unit

Herein, the attention ($Att$) output is obtained using 1×1 convolutions, batch normalizations, sigmoid activations, and an element-wise multiplication. Here, $In$ is the input data and $C\left( .,.,. \right)$ is the convolution function, which takes three parameters: (i) input, (ii) filter size, and (iii) number of filters. Next, a modified ConvNeXt block is applied so that ConvNeXt features can be added to these attention features.

 $NeX{{t}_{1}}=BN\left( C\left( In,3,F \right) \right)$                 (2)

$NeX{{t}_{2}}=GELU\left( C\left( NeX{{t}_{1}},1,4F \right) \right)$                   (3)

$NeX{{t}_{3}}=BN\left( C\left( NeX{{t}_{2}},1,F \right) \right)$                    (4)

$NeXt=NeX{{t}_{3}}+In$                       (5)

$Ou{{t}_{1}}=Concat\left( NeXt,~Att \right)$                       (6)

The 3×3, 1×1, and 1×1 convolutions form an inverted bottleneck. The resulting ConvNeXt features ($NeXt$) are then concatenated with the attention features to produce the first output ($Ou{{t}_{1}}$).

The third step of the proposed TuncerNeXt block is a transformer-like block inspired by the Swin transformer. It is a modified version of the Swin transformer block that uses a 3×3 depth-wise convolution and a 1×1 pixel-wise convolution. A convolution-based residual block is also used in this step.

$T{{r}_{1}}=GELU\left( BN\left( Ou{{t}_{1}} \right)+C\left( BN\left( Ou{{t}_{1}} \right),3,2F \right)+Ou{{t}_{1}} \right)$                  (7)

$Out=GELU\left( BN\left( C\left( T{{r}_{1}},1,F \right) \right)+In \right)$                    (8)

where $T{{r}_{1}}$ is the first transformer output and $Out$ is the output of the presented main block.
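To make the channel bookkeeping in Eqs. (1)-(8) easier to follow, the sketch below traces the number of feature channels at each step for an input with F channels. It is only a shape trace under the assumption that every convolution preserves the spatial size; it performs no actual convolution, and the function name `main_block_channels` is hypothetical.

```python
def main_block_channels(f):
    """Trace channel counts through Eqs. (1)-(8) for an input with f channels."""
    trace = {"In": f}
    trace["Att"] = f        # Eq. (1): product of two 1x1 conv branches, F filters each
    trace["NeXt1"] = f      # Eq. (2): 3x3 conv with F filters
    trace["NeXt2"] = 4 * f  # Eq. (3): 1x1 conv expands to 4F (inverted bottleneck)
    trace["NeXt3"] = f      # Eq. (4): 1x1 conv projects back to F

    # Eq. (5): residual addition needs matching channel counts.
    assert trace["NeXt3"] == trace["In"]
    trace["NeXt"] = trace["NeXt3"]

    # Eq. (6): depth concatenation adds the channel counts, giving 2F.
    trace["Out1"] = trace["NeXt"] + trace["Att"]

    # Eq. (7): the 3x3 conv uses 2F filters, so it can be summed with Out1.
    conv_channels = 2 * f
    assert conv_channels == trace["Out1"]
    trace["Tr1"] = conv_channels

    # Eq. (8): the 1x1 conv projects back to F so that In can be added.
    trace["Out"] = f
    assert trace["Out"] == trace["In"]
    return trace


for name, channels in main_block_channels(64).items():
    print(f"{name}: {channels}")
```

With F = 64, the concatenated tensor $Ou{{t}_{1}}$ has 128 channels and the block output returns to 64, which is why the block can be stacked without changing the feature-map dimensions.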

By using this main block, we have proposed a new CNN model, which is termed TuncerNeXt. The graphical explanation of the presented TuncerNeXt is depicted in Figure 3.

As illustrated in Figure 3, the architecture of TuncerNeXt comprises four main blocks and three downsampling blocks to derive feature vectors. The output block utilizes pixel-wise convolution, a convolution-based residual layer, global average pooling, a fully connected layer, and a Softmax function for classification outcomes. Furthermore, the mathematical representation of TuncerNeXt is detailed in Table 2.

Table 2 gives the mathematical definition of the proposed TuncerNeXt. Per Table 2, the total number of trainable parameters is approximately 2.1 million. The phases are explained below.

Stem: This initial layer preprocesses the input image (224 × 224 pixels, RGB). It applies a 7×7 convolution with 64 filters and a stride of 4, followed by batch normalization (BN) and GELU activation, downsampling the input to 56 × 56 × 64.

Figure 3. The graphical depiction of the proposed TuncerNeXt. Here, GAP: Global average pooling

Table 2. The mathematical depiction of the presented TuncerNeXt

| Layer | Input | Operation | Output |
|---|---|---|---|
| Stem | 224 × 224 × 3 | 7 × 7, 64, BN + GELU, stride: 4 | 56 × 56 × 64 |
| Main 1 | 56 × 56 × 64 | $\left[ \begin{matrix} 3\times 3,64 \\ 1\times 1,256 \\ 1\times 1,64 \\ \end{matrix} \right]\oplus \left[ \left( 1\times 1,64 \right)\otimes \left( 1\times 1,64 \right) \right]$ | 56 × 56 × 64 |
| Downsampling | 56 × 56 × 64 | 3 × 3, 128, BN + GELU, stride: 2 | 28 × 28 × 128 |
| Main 2 | 28 × 28 × 128 | $\left[ \begin{matrix} 3\times 3,128 \\ 1\times 1,512 \\ 1\times 1,128 \\ \end{matrix} \right]\oplus \left[ \left( 1\times 1,128 \right)\otimes \left( 1\times 1,128 \right) \right]$ | 28 × 28 × 128 |
| Downsampling | 28 × 28 × 128 | 3 × 3, 256, BN + GELU, stride: 2 | 14 × 14 × 256 |
| Main 3 | 14 × 14 × 256 | $\left[ \begin{matrix} 3\times 3,256 \\ 1\times 1,1024 \\ 1\times 1,256 \\ \end{matrix} \right]\oplus \left[ \left( 1\times 1,256 \right)\otimes \left( 1\times 1,256 \right) \right]$ | 14 × 14 × 256 |
| Downsampling | 14 × 14 × 256 | 3 × 3, 512, BN + GELU, stride: 2 | 7 × 7 × 512 |
| Main 4 | 7 × 7 × 512 | $\left[ \begin{matrix} 3\times 3,512 \\ 1\times 1,2048 \\ 1\times 1,512 \\ \end{matrix} \right]\oplus \left[ \left( 1\times 1,512 \right)\otimes \left( 1\times 1,512 \right) \right]$ | 7 × 7 × 512 |
| Output size | 7 × 7 × 512 | 1 × 1, 1024, BN + GELU, fully connected layer, Softmax, classification | Number of classes |
| Total learnable parameters | | | 2.1 million |

Main 1: This layer processes the output of the Stem layer (56 × 56 × 64) using two branches:

Branch 1: A modified version of the ConvNeXt block (3×3 with 64 filters, 1×1 with 256 filters, 1×1 with 64 filters).

Branch 2: An attention branch built from two 1×1 convolutions with 64 filters each, whose outputs are multiplied element-wise (⊗), as in Eq. (1) and Table 2.

The outputs of the two branches are then combined (⊕ in Table 2, defined as depth concatenation in Eq. (6)), and the block output maintains the input dimensions (56 × 56 × 64).

Downsampling: This layer reduces the spatial size of the feature maps while increasing the number of feature channels. It processes the output of Main 1, applying a 3×3 convolution with 128 filters and a stride of 2, followed by BN and GELU activation. This results in a smaller output (28 × 28 × 128).

Main 2: Structurally similar to Main 1, this layer operates on the downsampled output. It maintains the input dimensions of 28 × 28 × 128, with the numbers of filters in its two branches adjusted accordingly.

Downsampling (Repeated): These layers progressively reduce spatial dimensions further, following the pattern of the first downsampling layer, and prepare the data for subsequent Main blocks.

Main 3 & Main 4: These layers mirror the structure of the earlier Main blocks, operating on the increasingly downsampled feature maps.

Output size: This final phase processes the output of Main 4 for classification. It includes a 1×1 convolution (1024 filters), BN, GELU, a fully connected layer, and Softmax activation. The output dimension equals the number of classes in the dataset (three in this study).
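The feature-map sizes listed in Table 2 can be reproduced with a short shape trace. This sketch assumes that each strided convolution uses "same"-style padding, so the output size is the input size divided by the stride and rounded up; the padding mode is not stated in the text, and the names `same_conv_size`, `STAGES`, and `trace_shapes` are illustrative.

```python
import math


def same_conv_size(size, stride):
    """Spatial size after a strided convolution with 'same' padding (assumed)."""
    return math.ceil(size / stride)


# (stage name, stride, output channels) following Table 2.
# Main blocks keep both the spatial size and the channel count.
STAGES = [
    ("Stem", 4, 64),
    ("Main 1", 1, 64),
    ("Downsampling 1", 2, 128),
    ("Main 2", 1, 128),
    ("Downsampling 2", 2, 256),
    ("Main 3", 1, 256),
    ("Downsampling 3", 2, 512),
    ("Main 4", 1, 512),
]


def trace_shapes(input_size=224):
    """Return (name, height, width, channels) after every stage."""
    size = input_size
    shapes = []
    for name, stride, channels in STAGES:
        size = same_conv_size(size, stride)
        shapes.append((name, size, size, channels))
    return shapes


for name, height, width, channels in trace_shapes():
    print(f"{name}: {height} x {width} x {channels}")
```

Under this assumption the trace passes through 56, 28, 14, and 7 pixels per side, matching the Output column of Table 2.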

The design of the TuncerNeXt model is inspired by systematic experimentation and the latest generation of CNN and transformer architectures. We began with a modified ConvNeXt block and gradually integrated attention mechanisms inspired by vision transformers in a fully convolutional manner to build the architecture. Each design choice, including the number of filters and blocks, was validated through ablation studies to maximize classification accuracy on a challenging multi-class mucilage detection dataset. In the stem and main blocks, 3×3 convolutions with varying filter sizes were used to efficiently capture spatial information while keeping computational requirements low. The choice of filter size and the inclusion of residual connections across these blocks were informed by performance evaluations on validation data and insights from the ConvNeXt architecture. Specifically, the model uses 1×1 convolution within attention blocks to selectively emphasize features, optimizing focus on regions of interest. Our primary goal with this model is to achieve high performance using a small number of learnable parameters. As a result of the ablation studies, we improved feature extraction compared to traditional CNN blocks by employing bottleneck and attention mechanisms, while maintaining computational efficiency with only 2.1 million learnable parameters.

4. Experimental Results

In this study, we introduce a novel Convolutional Neural Network (CNN) model, TuncerNeXt, and detail its training process using the MATLAB Deep Network Designer. The training was conducted on a personal computer equipped with 128 gigabytes of main memory, a 3.6 GHz processor, and an NVIDIA GeForce RTX 4090 graphics processing unit. The design of our model was from scratch, incorporating 121 operations (including convolution, batch normalization (BN), activations, global average pooling (GAP), etc.) and 149 connections. Additionally, the code for TuncerNeXt is provided in the appendix.

The dataset utilized in this research was divided into two main directories: train and test. We trained TuncerNeXt on the training dataset using the default parameters provided by the MATLAB Deep Network Designer, without performing any fine-tuning operations. Specifically, the following hyperparameters were chosen:

Solver: Stochastic Gradient Descent with Momentum (SGDM) was selected for its balance between convergence speed and stability, enabling the model to avoid local minima effectively.

Initial Learning Rate: Set to 0.01, this value allowed for gradual learning without overshooting the minima. It is a moderate rate commonly used for models that incorporate complex architectures, ensuring that learning occurs steadily.

Maximum Epochs: The number of epochs was capped at 30 to avoid overfitting while allowing the model sufficient exposure to the data.

L2 Regularization: The weight decay was set to 0.0001 to prevent overfitting by penalizing large weights while still allowing the model to learn significant features from the dataset.

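Training and Validation Split Ratio: The training directory was divided 70:30 into training and validation subsets, which corresponds to the 52.5:22.5:25 training:validation:test ratio reported in Table 5.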

Augmentation: No data augmentation was applied to show TuncerNeXt’s raw performance on real-world data.

This configuration provided a strong baseline for training without extensive fine-tuning, highlighting TuncerNeXt's capability to achieve high accuracy on mucilage detection tasks with minimal adjustments to default settings. Utilizing these parameters, the training and validation performance of the model is illustrated in Figure 4.
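For readers unfamiliar with the solver, the sketch below shows one common form of the SGDM update with L2 regularization, applied to a single scalar weight. The learning rate (0.01) and weight decay (0.0001) follow the values listed above; the momentum coefficient of 0.9 is an assumption, since the text does not report it, and the quadratic toy objective is purely illustrative.

```python
def sgdm_step(weight, grad, velocity, lr=0.01, momentum=0.9, l2=0.0001):
    """One SGDM update with the L2 penalty folded into the gradient.

    lr and l2 follow the values stated in the text; momentum=0.9 is an
    assumed value that the text does not report.
    """
    regularized_grad = grad + l2 * weight
    velocity = momentum * velocity - lr * regularized_grad
    return weight + velocity, velocity


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, v = 0.0, 0.0
for _ in range(200):
    w, v = sgdm_step(w, 2.0 * (w - 3.0), v)
print(f"w after 200 steps: {w:.4f}")
```

The velocity term accumulates past gradients, which smooths the trajectory, while the small L2 term pulls the weight slightly toward zero on every step.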

Figure 4. Training and validation curves of the presented TuncerNeXt on the collected mucilage image dataset

Based on the training outcomes, the final validation accuracy achieved by the model is 97.80%, with a final loss value recorded at 0.2915.

The test results, computed from the confusion matrix depicted in Figure 5, are summarized in Table 3.

Table 3 illustrates that the model achieved an overall classification accuracy of 98.66%, an unweighted average recall of 98.46%, an unweighted average precision of 98.33%, and an overall F1-score of 98.38%. Notably, the Clean Sea class exhibited the highest accuracy across recall, precision, and F1-score metrics.

Metrics such as classification accuracy, recall, precision, and F1-score were employed to evaluate the classification performance of the proposed model. These metrics were calculated using the test image dataset to derive the test results. The computations for these metrics were facilitated by the confusion matrix presented in Figure 5.
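As an illustration of how these metrics follow from a confusion matrix, the sketch below computes accuracy, per-class recall, precision, and F1-score, plus their unweighted (macro) averages. The 3×3 matrix is a hypothetical toy example, not the matrix from Figure 5, and taking the overall F1-score as the mean of the per-class F1-scores is an assumption about how the overall value was obtained.

```python
def per_class_metrics(cm):
    """Compute metrics from cm, where cm[i][j] counts class-i samples predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total

    rows = []
    for i in range(n):
        tp = cm[i][i]
        actual = sum(cm[i])                          # all samples of class i
        predicted = sum(cm[r][i] for r in range(n))  # all samples predicted as i
        recall = tp / actual if actual else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((recall, precision, f1))

    # Unweighted (macro) averages over classes: recall, precision, F1.
    macro = tuple(sum(row[k] for row in rows) / n for k in range(3))
    return accuracy, rows, macro


# Hypothetical toy confusion matrix (rows: true class, columns: predicted class).
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
accuracy, rows, macro = per_class_metrics(toy_cm)
print(f"accuracy={accuracy:.4f}")
for index, (recall, precision, f1) in enumerate(rows, start=1):
    print(f"class {index}: recall={recall:.2f} precision={precision:.2f} f1={f1:.2f}")
print("macro recall/precision/F1:", tuple(round(value, 4) for value in macro))
```

Macro averaging weights every class equally regardless of its size, which matters for an imbalanced dataset such as the one in Table 1.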

Furthermore, we explored the transfer learning capabilities of the proposed model through deep feature engineering. Utilizing the pretrained TuncerNeXt, we extracted features using its Global Average Pooling (GAP) layer, yielding 1024 features per image. For feature selection, the Iterative Neighborhood Component Analysis (INCA) [32] feature selector was employed, an advanced version of the NCA feature selector that utilizes a range of iterations (100-1024) and a loss value computation function (SVM classifier [33, 34] with 10-fold cross-validation). The classification was performed using an SVM classifier. This deep feature engineering approach was applied to the test images. Figure 6 graphically represents the deep feature engineering model employing the advanced TuncerNeXt.
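The iterative selection idea can be sketched as follows, assuming that the features have already been ranked (for example by NCA weights) and that a loss function scores each candidate subset. The function name `inca_select`, the toy ranking, and the toy loss are hypothetical; in the study the loss comes from an SVM classifier with 10-fold cross-validation, which is not reproduced here.

```python
def inca_select(ranked_features, loss_fn, start=100, stop=1024):
    """Return the top-k prefix of ranked_features with the lowest loss.

    ranked_features: feature indices sorted from most to least important.
    loss_fn: callable that maps a list of feature indices to a loss value.
    start, stop: inclusive range of subset sizes to evaluate.
    """
    stop = min(stop, len(ranked_features))
    if start > stop:
        raise ValueError("start exceeds the number of available features")

    best_k, best_loss = start, float("inf")
    for k in range(start, stop + 1):
        loss = loss_fn(ranked_features[:k])
        if loss < best_loss:
            best_k, best_loss = k, loss
    return ranked_features[:best_k], best_loss


# Toy example: 10 ranked features and a loss that is smallest for 6 features.
toy_ranking = list(range(10))


def toy_loss(subset):
    return (len(subset) - 6) ** 2


selected, loss = inca_select(toy_ranking, toy_loss, start=3, stop=10)
print(len(selected), loss)
```

Because every candidate size is evaluated with the classifier, the cost grows with the width of the search range, which is one reason to bound it (100 to 1024 in the text).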

Table 3. The computed test classification results

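| No. | Class | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) |
|---|---|---|---|---|---|
| 1 | Mucilage | - | 96.62 | 99.10 | 97.84 |
| 2 | Clean sea | - | 99.73 | 99.47 | 99.60 |
| 3 | Sea wave | - | 99.03 | 96.41 | 97.70 |
| | Overall | 98.66 | 98.46 | 98.33 | 98.38 |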

Figure 5. The computed test confusion matrix. Herein, 1: Mucilage, 2: Clean sea, 3: Sea wave

Figure 6. The presented deep feature engineering model based on the recommended TuncerNeXt

In the deep feature engineering approach, the INCA feature selector was implemented to enhance the performance of TuncerNeXt. The iterative process of feature selection employed by INCA is illustrated in Figure 7.

According to Figure 7, the optimal feature vector comprises 751 features. These features were classified using an SVM classifier with the following parameters:

Kernel Function: Cubic (third-degree polynomial),

Kernel Scale: Automatic,

Box Constraint: 1,

Coding Scheme: One-vs-all,

Validation Method: 10-fold cross-validation.
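Two of the listed settings can be illustrated with a few lines of code: the cubic kernel and the 10-fold partition of the test images. The sketch assumes the common polynomial kernel form $(1 + x \cdot y)^3$ with the inputs divided by a kernel scale; the contiguous, unshuffled folds are a simplification, since practical cross-validation usually shuffles or stratifies the samples. Both function names are hypothetical.

```python
def cubic_kernel(x, y, kernel_scale=1.0):
    """Third-degree polynomial kernel (1 + <x/s, y/s>)^3 (assumed form)."""
    dot = sum((a / kernel_scale) * (b / kernel_scale) for a, b in zip(x, y))
    return (1.0 + dot) ** 3


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


print(cubic_kernel([1.0, 2.0], [3.0, 4.0]))

# Partition the 2315 test images (Table 1) into 10 folds.
folds = list(k_fold_indices(2315, k=10))
print([len(val_idx) for _, val_idx in folds])
```

Each image appears in exactly one validation fold, so every test image is scored once by a model that did not see it during fitting.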

Utilizing this configuration (Cubic SVM [35]), the resulting confusion matrix of the TuncerNeXt-based deep feature engineering model is presented in Figure 8, and the classification results derived from it are summarized in Table 4.

Figure 7. Iterative feature selection process

Figure 8. The confusion matrix of the presented deep feature engineering model

Table 4. The computed test classification results of the deep feature engineering model

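| No. | Class | Accuracy (%) | Recall (%) | Precision (%) | F1-Score (%) |
|---|---|---|---|---|---|
| 1 | Mucilage | - | 98.82 | 98.82 | 98.82 |
| 2 | Clean sea | - | 99.73 | 99.82 | 99.78 |
| 3 | Sea wave | - | 98.45 | 98.26 | 98.35 |
| | Overall | 99.18 | 99.00 | 98.97 | 98.98 |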

The deep feature engineering model achieved a classification accuracy of 99.18%, an overall recall of 99.00%, an overall precision of 98.97%, and an overall F1-score of 98.98%.

5. Discussions

In this study, we introduced a novel dataset of mucilage images and proposed a new Convolutional Neural Network (CNN) model named TuncerNeXt, an original contribution to deep learning models. The proposed TuncerNeXt model achieved a test classification accuracy of 98.66% and a validation accuracy of 97.80% on the collected dataset. Furthermore, we developed a new deep feature engineering approach to enhance the test classification performance of TuncerNeXt. This approach employs INCA and SVM to improve classification performance. INCA was utilized to select the optimal 751 features out of the generated 1024 features, and the best-performing classifier, SVM, was chosen for classification. To establish a benchmark for comparison with other classifiers, we presented the classification accuracies of Decision Tree (DT) [36], Linear Discriminant Analysis (LDA) [37], Naïve Bayes (NB) [38], SVM [33, 34], k-nearest neighbors (kNN) [39], Bagged Tree (BT) [40], Multilayer Perceptron (MLP) [41, 42], and Logistic Regression Kernel (LRK) [43]. The classification accuracies of these classifiers are illustrated in Figure 9.

Based on the findings illustrated in Figure 9, the SVM emerged as the superior classifier, achieving the highest classification accuracy of 99.18% on the test image dataset. Conversely, Naïve Bayes (NB) was identified as the least effective classifier with a classification accuracy of 97.54%. The Logistic Regression Kernel (LRK) classifier was the second most effective, attaining a classification accuracy of 98.83%.

The implementation of the deep feature engineering approach significantly enhanced the test classification performance of the TuncerNeXt model. This approach achieved a classification accuracy of 99.18%, compared to the 98.66% classification accuracy obtained by the TuncerNeXt model without deep feature engineering.

Furthermore, the performance of the presented model was evaluated against other models, with comparative results detailed in Table 5.

Figure 9. The classification performances of the shallow classifiers

Table 5. Comparative results

| Study | Method | Number of Samples | Split Ratio | Acc (%) |
|---|---|---|---|---|
| Hacıefendioglu et al. [44] | ResNet-50 | 1635 satellite images | 80:20 | 100.0 |
| Kavzaoglu et al. [45] | CNN | Unspecified | 60:20:20 | 99.49 |
| Sanver and Yesildirek [46] | ResNet-50 | 2250 | 60:40 | 96.09 |
| Our study | TuncerNeXt | 9300 | 52.5:22.5:25 | 98.66 |
| Our study | TuncerNeXt-based deep feature engineering | 2315 test images out of the 9300 images | 10-fold CV for test images (2315 images) | 99.18 |

The compared studies analyzed satellite images but did not distinguish between waves and mucilage (sea snot), so the regions they identified as mucilage may also contain waves. Clearer images are therefore needed to separate waves from mucilage. Our dataset contains three classes, whereas the datasets used for comparison contain only two, and it is larger than the other datasets. Additionally, we have proposed deep learning and deep feature engineering models, both of which achieved test accuracies of over 98.5%. Table 5 indicates that the proposed TuncerNeXt model achieved satisfactory outcomes in mucilage detection.

Further discussions on the findings, advantages, limitations, and future research directions are presented in the subsequent sections.

Findings:

  • The proposed TuncerNeXt-based mucilage detection model achieved remarkable performance with a validation accuracy of 97.60% and a test accuracy of 98.66%.
  • The model's integration of attention mechanisms and residual blocks proved effective in enhancing classification efficacy, showcasing its potential for real-world applications in environmental monitoring and marine science.
  • The sea wave class achieved the highest accuracy because it has the most distinct features and the largest number of training images.
  • The deep feature engineering model built on the presented TuncerNeXt achieved a classification accuracy of 99.18%.
  • For the deep feature engineering model, the length of the selected best feature combination is 751.
  • The SVM classifier is the best classifier among the tested shallow classifiers for the presented deep feature engineering model.
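The 10-fold cross-validation used to score the shallow classifiers can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the study's pipeline: the features are synthetic Gaussian clusters rather than the 751 selected deep features, and a 1-nearest-neighbor rule stands in for the SVM. All function names (`make_toy_features`, `k_fold_indices`, `predict_1nn`, `cross_val_accuracy`) are hypothetical.

```python
import random


def make_toy_features(n_per_class=60, n_features=5, seed=0):
    """Synthetic stand-in for selected deep features: three Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):  # three classes, mirroring the dataset's class count
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train_X, train_y, x):
    """Return the label of the closest training vector (squared Euclidean)."""
    best = min(
        range(len(train_X)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_X[j], x)),
    )
    return train_y[best]


def cross_val_accuracy(X, y, k=10, seed=0):
    """Pooled accuracy over k folds: every sample is tested exactly once."""
    folds = k_fold_indices(len(X), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct += sum(predict_1nn(train_X, train_y, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


X, y = make_toy_features()
folds = k_fold_indices(len(X), k=10)
accuracy = cross_val_accuracy(X, y, k=10)
print(f"10-fold CV accuracy on toy data: {accuracy:.4f}")
```

Pooling the correct predictions over all folds, rather than averaging per-fold accuracies, gives the same value when the folds have equal size and keeps the sketch short.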

Advantages:

  • TuncerNeXt is a novel CNN architecture that seamlessly integrates attention mechanisms and residual blocks, offering a sophisticated approach to image classification.
  • The model is trained and evaluated on a meticulously curated dataset comprising diverse images of sea waves and mucilage. This ensures comprehensive coverage of real-world scenarios and enhances the model's adaptability.
  • By incorporating sea wave images alongside mucilage data, TuncerNeXt demonstrates robust performance in distinguishing mucilage amidst challenging maritime conditions, facilitating accurate detection even in dynamic ocean environments.
  • TuncerNeXt exhibits strong transfer learning capabilities, enabling the extraction and utilization of valuable features from pre-trained models. This enhances the model's versatility and efficiency in adapting to new datasets or domains with minimal additional training data.
  • The proposed TuncerNeXt has approximately 2.1 million parameters yet attained test classification accuracies above 98.5%. In terms of parameter count, it is lighter than MobileNetV2.
  • The proposed TuncerNeXt-based model attained satisfactory classification results on the collected three-class dataset.
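As a rough indication of what a 2.1-million-parameter model means in practice, the sketch below estimates the storage needed for the weights alone at several numeric precisions. It assumes dense storage with no compression and ignores activations, optimizer state, and file-format overhead; the precisions listed are illustrative and are not taken from the study.

```python
# Assumption: weights are stored densely; 2.1 million is the approximate
# parameter count stated in the text.
N_PARAMS = 2_100_000
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}


def weight_size_mb(n_params: int, dtype: str) -> float:
    """Storage for the weights alone, in megabytes (1 MB = 10**6 bytes)."""
    return n_params * BYTES_PER_VALUE[dtype] / 1e6


sizes = {dtype: weight_size_mb(N_PARAMS, dtype) for dtype in BYTES_PER_VALUE}
for dtype, megabytes in sizes.items():
    print(f"{dtype}: {megabytes:.1f} MB")
```

At 32-bit precision this comes to about 8.4 MB of weights.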

Limitations:

  • The study does not yet provide explainable results for the model's decisions.
  • A larger and more diverse dataset could enhance the model's applicability in ocean engineering. However, we tested the TuncerNeXt model on the largest possible image dataset specifically curated for this study. Unlike most other research, which typically utilizes satellite images, we assembled a high-resolution dataset to maximize the model's capability.
  • The TuncerNeXt model could also be evaluated on widely recognized datasets such as ImageNet or CIFAR-10 to assess its generalization performance on broader image classification tasks.

Future directions:

  • We plan to use techniques like attention visualization and saliency maps to illuminate the presented TuncerNeXt's decision-making process.
  • Expanding and diversifying the training dataset with images from varied geographical locations and environmental conditions is expected to enhance the model's ability to adapt to real-world scenarios and provide more reliable detection.
  • It can be beneficial to incorporate additional data sources, such as oceanographic data, satellite imagery, or environmental sensor readings, to provide valuable complementary information that may improve detection accuracy and resilience to environmental changes.
  • Techniques like neural network pruning, feature attribution methods, or model distillation are planned for investigation to increase the interpretability of the TuncerNeXt-based model without sacrificing performance.
  • Addressing challenges related to real-time processing, resource constraints, and integration with existing systems will be crucial for transitioning TuncerNeXt to real-world monitoring on UAVs or marine platforms, enabling continuous surveillance of mucilage outbreaks.

6. Conclusions

The proposed TuncerNeXt-based mucilage detection model classifies mucilage accurately under challenging maritime conditions. It achieves a validation accuracy of 97.60% and a test accuracy of 98.66%, facilitated by the integration of attention mechanisms and residual blocks across its 2.1 million parameters. Including sea wave images alongside mucilage data strengthens the model's adaptability and accuracy, even in dynamic marine environments. Moreover, the TuncerNeXt-based deep feature engineering model achieves an improved classification accuracy of 99.18% by selecting an optimal combination of 751 features. This result highlights the TuncerNeXt model's capacity for feature extraction.

Although these findings indicate the TuncerNeXt model's effectiveness as a potential tool for automated mucilage detection, further validation is required. Our research provides a foundational step towards automated mucilage monitoring, with applications in environmental monitoring and marine science.

Author Contributions

Conceptualization, MG, VYC, RH, SD, TT; methodology, MG, VYC, RH, SD, TT; software, SD, TT; validation, MG, VYC, RH, SD, TT; formal analysis, MG, VYC, RH, SD, TT; investigation, MG, VYC; resources, MG; data curation, MG, VYC, RH, SD, TT; writing—original draft preparation, MG, VYC, SD, TT; writing—review and editing, MG, VYC, RH, SD, TT; visualization, MG, VYC, RH; supervision, TT; project administration, TT. All authors have read and agreed to the published version of the manuscript.

  References

[1] Waghmare, R., Moses, J.A., Anandharamakrishnan, C. (2022). Mucilages: Sources, extraction methods, and characteristics for their use as encapsulation agents. Critical Reviews in Food Science and Nutrition, 62(15): 4186-4207. https://doi.org/10.1080/10408398.2021.1873730

[2] Karakulak, F.S., Kahraman, A.E., Uzer, U., Gül, B., Doğu, S. (2023). Effects of mucilage on the fisheries in the Sea of Marmara. In Mucilage Problem in the Sea of Marmara, Istanbul University Press, Istanbul, pp. 1-9. https://doi.org/10.26650/B/LS32.2023.003.09

[3] Cembella, A.D. (2003). Chemical ecology of eukaryotic microalgae in marine ecosystems. Phycologia, 42(4): 420-447. https://doi.org/10.2216/i0031-8884-42-4-420.1

[4] Del Negro, P., Crevatin, E., Larato, C., Ferrari, C., Totti, C., Pompei, M., Umani, S.F. (2005). Mucilage microcosms. Science of the Total Environment, 353(1-3): 258-269. https://doi.org/10.1016/j.scitotenv.2005.09.018

[5] Rinaldi, A., Vollenweider, R.A., Montanari, G., Ferrari, C.R., Ghetti, A. (1995). Mucilages in Italian seas: The Adriatic and Tyrrhenian seas, 1988–1991. Science of the Total Environment, 165(1-3): 165-183. https://doi.org/10.1016/0048-9697(95)04550-K

[6] Rolton, A., Rhodes, L., Hutson, K.S., Biessy, L., Bui, T., MacKenzie, L., Smith, K.F. (2022). Effects of harmful algal blooms on fish and shellfish species: A case study of New Zealand in a changing environment. Toxins, 14(5): 341. https://doi.org/10.3390/toxins14050341

[7] Caronni, S., Calabretti, C., Cavagna, G., Ceccherelli, G., Delaria, M.A., Macri, G., Panzalis, P. (2017). The invasive microalga Chrysophaeum taylorii: Interactive stressors regulate cell density and mucilage production. Marine Environmental Research, 129: 156-165. https://doi.org/10.1016/j.marenvres.2017.05.005

[8] Burkholder, J.M. (2000). Critical needs in harmful algal bloom research. Opportunities for environmental applications of marine biotechnology national academy of sciences. National Research Council, Washington, DC, pp. 126-149.

[9] Yagci, A.L., Colkesen, I., Kavzoglu, T., Sefercik, U.G. (2022). Daily monitoring of marine mucilage using the MODIS products: A case study of 2021 mucilage bloom in the Sea of Marmara, Turkey. Environmental Monitoring and Assessment, 194(3): 170. https://doi.org/10.1007/s10661-022-09831-x

[10] Tuzcu Kokal, A., Olgun, N., Musaoğlu, N. (2022). Detection of mucilage phenomenon in the Sea of Marmara by using multi-scale satellite data. Environmental Monitoring and Assessment, 194(8): 585. https://doi.org/10.1007/s10661-022-10267-6

[11] Kavzoglu, T., Goral, M. (2022). Google Earth engine for monitoring marine mucilage: Izmit Bay in Spring 2021. Hydrology, 9(8): 135. https://doi.org/10.3390/hydrology9080135

[12] Sun, Y., Xue, B., Zhang, M., Yen, G.G., Lv, J. (2020). Automatically designing CNN architectures using the genetic algorithm for image classification. IEEE Transactions on Cybernetics, 50(9): 3840-3854. https://doi.org/10.1109/TCYB.2020.2983860

[13] Traore, B.B., Kamsu-Foguem, B., Tangara, F. (2018). Deep convolution neural network for image recognition. Ecological Informatics, 48: 257-268. https://doi.org/10.1016/j.ecoinf.2018.10.002

[14] Huertas-Tato, J., Martín, A., Fierrez, J., Camacho, D. (2022). Fusing CNNs and statistical indicators to improve image classification. Information Fusion, 79: 174-187. https://doi.org/10.1016/j.inffus.2021.09.012

[15] Colkesen, I., Kavzoglu, T., Sefercik, U.G., Ozturk, M.Y. (2023). Automated mucilage extraction index (AMEI): A novel spectral water index for identifying marine mucilage formations from Sentinel-2 imagery. International Journal of Remote Sensing, 44(1): 105-141. https://doi.org/10.1080/01431161.2022.2158049

[16] Gambín, Á.F., Angelats, E., González, J.S., Miozzo, M., Dini, P. (2021). Sustainable marine ecosystems: Deep learning for water quality assessment and forecasting. IEEE Access, 9: 121344-121365. https://doi.org/10.1109/ACCESS.2021.3109216

[17] Zhang, K., Zuo, W., Zhang, L. (2018). FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Transactions on Image Processing, 27(9): 4608-4622. https://doi.org/10.1109/TIP.2018.2839891

[18] Abdalla, G., Özyurt, F. (2021). Sentiment analysis of fast food companies with deep learning models. The Computer Journal, 64(3): 383-390. https://doi.org/10.1093/comjnl/bxaa131

[19] Tuncer, T., Aydemir, E., Ozyurt, F., Dogan, S. (2022). A deep feature warehouse and iterative MRMR based handwritten signature verification method. Multimedia Tools and Applications, 81: 3899-3913. https://doi.org/10.1007/s11042-021-11726-x

[20] Özyurt, F. (2021). Automatic detection of COVID-19 disease by using transfer learning of light weight deep learning model. Traitement du Signal, 38(1): 147-153. https://doi.org/10.18280/ts.380115

[21] Yilmaz, E.O., Tonbul, H., Kavzoglu, T. (2024). Marine mucilage mapping with explained deep learning model using water-related spectral indices: A case study of Dardanelles Strait, Turkey. Stochastic Environmental Research and Risk Assessment, 38(1): 51-68. https://doi.org/10.1007/s00477-023-02560-8

[22] Kikaki, K., Kakogeorgiou, I., Hoteit, I., Karantzalos, K. (2024). Detecting marine pollutants and sea surface features with deep learning in sentinel-2 imagery. ISPRS Journal of Photogrammetry and Remote Sensing, 210: 39-54. https://doi.org/10.1016/j.isprsjprs.2024.02.017

[23] Colkesen, I., Ozturk, M.Y., Altuntas, O.Y. (2024). Comparative evaluation of performances of algae indices, pixel-and object-based machine learning algorithms in mapping floating algal blooms using Sentinel-2 imagery. Stochastic Environmental Research and Risk Assessment, 38(4): 1613-1634. https://doi.org/10.1007/s00477-023-02648-1

[24] Figueroa, J., Rivas-Villar, D., Rouco, J., Novo, J. (2024). Phytoplankton detection and recognition in freshwater digital microscopy images using deep learning object detectors. Heliyon, 10(3): e25367. https://doi.org/10.1016/j.heliyon.2024.e25367

[25] Tokatlı, C., Varol, M., Uğurluoğlu, A. (2024). Ecological risk assessment, source identification and spatial distribution of organic contaminants in terms of mucilage threat in streams of Çanakkale Strait Basin (Türkiye). Chemosphere, 353: 141546. https://doi.org/10.1016/j.chemosphere.2024.141546

[26] Barua, P.D., Chan, W.Y., Dogan, S., Baygin, M., Tuncer, T., Ciaccio, E.J., Acharya, U.R. (2021). Multilevel deep feature generation framework for automated detection of retinal abnormalities using OCT images. Entropy, 23(12): 1651. https://doi.org/10.3390/e23121651

[27] Tasci, B., Acharya, M.R., Baygin, M., Dogan, S., Tuncer, T., Belhaouari, S.B. (2023). InCR: Inception and concatenation residual block-based deep learning network for damaged building detection using remote sensing images. International Journal of Applied Earth Observation and Geoinformation, 123: 103483. https://doi.org/10.1016/j.jag.2023.103483

[28] Arslan, S., Kaya, M.K., Tasci, B., Kaya, S., Tasci, G., Ozsoy, F., Tuncer, T. (2023). Attention TurkerNeXt: Investigations into bipolar disorder detection using OCT images. Diagnostics, 13(22): 3422. https://doi.org/10.3390/diagnostics13223422

[29] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Houlsby, N. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.

[30] Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Guo, B. (2021). Swin transformer: Hierarchical vision transformer using shifted windows. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, pp. 10012-10022. https://doi.org/10.1109/ICCV48922.2021.00986

[31] Liu, Z., Mao, H., Wu, C.Y., Feichtenhofer, C., Darrell, T., Xie, S. (2022). A convnet for the 2020s. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, pp. 11976-11986. https://doi.org/10.1109/CVPR52688.2022.01167

[32] Tuncer, T., Dogan, S., Özyurt, F., Belhaouari, S.B., Bensmail, H. (2020). Novel multi center and threshold ternary pattern based method for disease detection method using voice. IEEE Access, 8: 84532-84540. https://doi.org/10.1109/ACCESS.2020.2992641

[33] Vapnik, V. (1998). The support vector method of function estimation. In Nonlinear Modeling: Advanced Black-Box Techniques, Boston, MA, USA, pp. 55-85. https://doi.org/10.1007/978-1-4615-5703-6_3

[34] Vapnik, V. (1999). The Nature of Statistical Learning Theory. Springer Science & Business Media.

[35] Jain, U., Nathani, K., Ruban, N., Raj, A.N.J., Zhuang, Z., Mahesh, V.G. (2018). Cubic SVM classifier based feature extraction and emotion detection from speech signals. In 2018 International Conference on Sensor Networks and Signal Processing (SNSP), Xi'an, China, pp. 386-391. https://doi.org/10.1109/SNSP.2018.00081

[36] Safavian, S.R., Landgrebe, D. (1991). A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man, and Cybernetics, 21(3): 660-674. https://doi.org/10.1109/21.97458

[37] Kim, K.S., Choi, H.H., Moon, C.S., Mun, C.W. (2011). Comparison of k-nearest neighbor, quadratic discriminant and linear discriminant analysis in classification of electromyogram signals based on the wrist-motion directions. Current Applied Physics, 11(3): 740-745. https://doi.org/10.1016/j.cap.2010.11.051

[38] Ng, A., Jordan, M. (2001). On discriminative vs. generative classifiers: A comparison of logistic regression and naive bayes. Advances in Neural Information Processing Systems, 14: 841-848.

[39] Maillo, J., Ramírez, S., Triguero, I., Herrera, F. (2017). kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data. Knowledge-Based Systems, 117: 3-15. https://doi.org/10.1016/j.knosys.2016.06.012

[40] Hothorn, T., Lausen, B. (2003). Bagging tree classifiers for laser scanning images: A data-and simulation-based strategy. Artificial Intelligence in Medicine, 27(1): 65-79. https://doi.org/10.1016/S0933-3657(02)00085-4

[41] Tolstikhin, I.O., Houlsby, N., Kolesnikov, A., Beyer, L., Zhai, X., Unterthiner, T., Dosovitskiy, A. (2021). Mlp-mixer: An all-mlp architecture for vision. Advances in Neural Information Processing Systems, 34: 24261-24272.

[42] Windeatt, T. (2006). Accuracy/diversity and ensemble MLP classifier design. IEEE Transactions on Neural Networks, 17(5): 1194-1211. https://doi.org/10.1109/TNN.2006.875979

[43] Zhu, J., Hastie, T. (2005). Kernel logistic regression and the import vector machine. Journal of Computational and Graphical Statistics, 14(1): 185-205. https://doi.org/10.1198/106186005X25619

[44] Hacıefendioğlu, K., Başağa, H.B., Baki, O.T., Bayram, A. (2023). Deep learning-driven automatic detection of mucilage event in the Sea of Marmara, Turkey. Neural Computing and Applications, 35(9): 7063-7079. https://doi.org/10.1007/s00521-022-08097-1

[45] Kavzoglu, T., Yilmaz, E., Colkesen, I., Sefercik, U., Gazioglu, C. (2023). Detection and monitoring of mucilage formations using pixel based convolutional neural networks: The case study of Izmit Gulf, Turkey. In Mucilage Problem in the Sea of Marmara. Istanbul University Press.

[46] Sanver, U., Yesildirek, A. (2023). An autonomous marine mucilage monitoring system. Sustainability, 15(4): 3340. https://doi.org/10.3390/su15043340