© 2025 The authors. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).
OPEN ACCESS
Internet of Things is a developing technology in the modern period with uses in wide area monitoring, healthcare, smart cities, and other areas. Wireless sensor networks (WSN) form the core of these IoT. The sensors within the WSN encounter numerous obstacles, including limited battery life, Security and Privacy and Synchronization etc. This shortens the lifetime of the network, thus energy must be used wisely. In this study, the Reinforcement Learning algorithm and K-Means are used to build clusters and to elect cluster heads. Additionally, a mobile sink is proposed to collect data from each cluster head (CH), saving energy on data transmission from nodes to base station. The proposed novel clustering and cluster head election algorithms increases energy efficiency by 75% and the time complexity is reduced to O(n/2) using reinforcement algorithm. By considering the shortest edges of obstacles, energy consumption during the mobile sink’s routing is reduced, ensuring secure data transmission. The results illustrate a comparison of the time complexity between the initial cluster head election, conducted using K-Means, and the subsequent cluster head election is performed using Q-Learning. These algorithms are compared with K-Means and Fuzzy C-Means. Our proposed approach demonstrates superior performance compared to the other methods. The outcomes of the proposed work also reveal that in comparison to Clustered Routing algorithm based on forwarding mechanism optimization (CRFMO), Low Energy Adaptive Clustering Hierarchy with Improved Adaptive Cluster Adjustment (LEACH-IASA) and Improved LEACH, enhances network lifetime by increasing the number of surviving nodes and network coverage. The proposed scheme also outperforms in terms of latency compared to Software Defined Networking with Reinforcement Learning (SDN-RL) and Energy Efficient Rendezvous Ponts Selection using Deep Policy Dynamic Programming (EERPS-DPDP).
Internet of Things, mobile sink, path planning, reinforcement algorithm, K-Means algorithm
The rapid expansion of the Internet of Things (IoT) has led to an increased demand for efficient and reliable data collection methods in large-scale sensor networks. One promising approach to enhancing the performance of these networks is the use of mobile sinks [1], which can dynamically traverse the network to collect data directly from sensor nodes. This strategy can significantly reduce the energy consumption and communication overhead associated with multi-hop data transmission to a mobile sink.
In cluster-based IoT networks, where sensor nodes are organized into clusters [2] with designated cluster heads, the efficient routing of mobile sinks to collect the data from these CHs becomes crucial. Cluster- based architectures help to manage the network’s scalability and reduce energy consumption by localizing data aggregation and communication within clusters. However, the dynamic nature of mobile sinks introduces new challenges in terms of routing and coordination, necessitating advanced strategies to optimize their movement and data collection routes.
A mobile sink can choose optimal routes [3] in the network. There are many reasons behind the optimal route selection, one of those is presence of an Obstacle. Existence of an obstacle in the way of mobile sink makes the connection failures and data loss. In the presence of the obstacle the mobile sink need to find out the shortest path [4] to reach the cluster head. In the proposed work the shortest path to travel is chosen depending on the shortest edge of the obstacle. This approach allows optimized routing decisions that can effectively respond to the changing network conditions and demands.
Existing studies have primarily focused on static or semi dynamic scenarios, leaving a significant gap in understanding how mobile sinks can dynamically [5] adapt their route in response to real-time network states. While mobile sink applications offer considerable potential advantages, they remain unexplored in the context of dynamic scenarios. Most existing research has concentrated on static approaches, such as cluster head election and mobile sink path planning [6]. These strategic algorithms however, are inadequate for handling the dynamic changes that occur in networks. There dynamic algorithms are necessary.
In this research, we intend to create and implement sophisticated dynamic algorithms that utilize artificial intelligence (AI) and machine learning (ML) to enhance two primary functions within clustered networks: cluster head selection and mobile sink path planning. The cluster head election will be based on intelligent decision-making systems to ensure energy effective communication and data collection. For this in proposed work K-Means algorithm and Q-Learning are used. This elects the first cluster head as centroid. As the first cluster head is centroid the distance between all cluster members for exchanging data will be less. So the energy is efficiently utilized. The upcoming cluster head are elected using Q-Learning algorithm. As Q-Learning elects the cluster head based on highest energy as well its ability of providing feedback will provide prolonged network lifetime. Also energy is saved during cluster head election by considering only neighbor nodes to elect the cluster head instead all cluster members. In this case other algorithms Fuzzy C-Means Hierarchical clustering etc. is not suitable. Additionally, mobile sink route planning will incorporate real-time environmental considerations, such as obstacles, to ensure smooth network navigation. By applying AI and ML [7] in these areas, our solution aims to improve the efficiency and accuracy of secure data gathering from cluster heads, reducing delays and optimizing resource usage in even the most complex network conditions(well suitable for Military applications, Areas which are untouched by Humans). The key contributions of this work include:
The rest of this article is organized as follows. In Section 2, we give a summary of the most current research on WSN routing protocols in IoT. In Section 3, we illustrate proposed work with architecture flow diagram and algorithms. The performance analysis is discussed in Section 4. In Section 5, we describe the conclusion part.
1.1 Related work
Agarwal et al. [8] suggest a mobile sink-based intelligent data routing plan for wireless sensor networks. For ideal cluster formation, Particle Swarm Optimization is first applied. Following cluster creation, the ideal number of meeting locations is determined, and a tour is designed for the mobile sink to collect data. To collect the data from the cluster heads, a mobile sink-based intelligent data collection technique is also suggested. Rahman and Wahid [9] recommend a real time lightweight dynamic clustering algorithm for a WSN. This algorithm is compatible with two different scenarios: 1. mobile DSs and static SN’s and 2. Mobile SNs and static DSs. The suggested algorithm is predicated on a long-range (LoRa) interface’s residual energy as well as the signal-to-noise ratio and received signal strength indicator. Wei et al. [10] propose DSTMS routing algorithm based on 2 things. Firstly, based on LEACH protocol secondly a multi-layer transmission framework with dynamic MST to optimize data transmission route. The dynamic rendezvous layer in this framework is built by the rendezvous points that are chosen based on the mobile sink’s motion parameters, which limits the hierarchical transmission of DSTMS.
Kaur and Bhattacharya [11] suggest a mechanism for selecting and rotating rendezvous points based on AI, which enables fault-tolerant and environmentally friendly data collection in sensor networks. Additionally, a smart mobile sink-based scheduling scheme for data collection is suggested, which fixes problems with network partitioning and energy holes. Al-Sadoon et al. [12] propose a novel routing protocol to enhance MWSN performance, based on virtual network zones and the dual-tier clustering idea. The network is divided into virtual zones by the proposed protocol “Dual Tier Cluster-Based Routing” (DTC-BR), and a cluster-head mechanism chooses the most suitable SN to serve as the cluster head. Moreover, the research demonstrates scientific rigor by using MATLAB for deploying and evaluating the protocol under 3 assumptions: energy consumption, network lifetime, and scalability. Vaezian and Darmani [13] suggest using MSE, a mobility support technique, as an expansion of RPL. With the exception of the root node, all nodes can move, and the MSE maintains a seamless connection throughout. It also controls the circumstance in which a physical barrier forms between two matched nodes in a changing setting. In order to do this, it makes use of a dynamic trickle timer with two distinct ranges, a quality table for neighbor links, a function to choose the ideal parent in the event of mobility, confidence, crucial areas as well as a blacklist.
In the study [2], the authors presented for large-scale WSN, FC-CRA, or fast changes in clustering and routing algorithm, is proposed to prevent energy hole problem. In order to dynamically adapt to changes in node energy and distribution, inter- (set to prevent premature death of nodes near BS) and intra-clusters (based on path energy function to save and balance node energy) based routing is used. This enhances node lifetime, network throughput, effective node energy use, and data transmission reliability in wide area Internet of things applications. Reference [14] primarily concentrate on load distribution and balancing among base stations to ensure network endurance. To increase network longevity, this paper suggests a new SDN-based architecture. In order to reduce both traffic and energy consumption, a novel algorithm is suggested and deployed in each individual node after deployment to identify the neighboring nodes, topology, and adjacent BS. They were attempting to divide control, data, and application here. The primary node responsible for managing the network is the controller node. Hajian et al. [15] created the multi-objective new algorithm known as LSR, which takes energy efficiency, QOS, and security into account. Using network lifespan, a QOS model is suggested to choose a source node based on parameters such as average routing delay, PDR, and energy efficiency. To provide security based on average routing delay and PDR, trust values are computed by incorporating trust models. To increase connectedness between nodes, an adaptive connectivity model is suggested, and its effectiveness is gauged. Additionally, a deployment approach was suggested to solve the energy-hole issue.
Pathak et al. [16] create a routing technique that increases the WSN network’s lifespan. Energy balancing is the main focus, subject to the need that packets arrive to the base station with a predetermined probability. In order to do this, they have put forth two-hop and multi-hop routing algorithms, which determine the most optimal path for the nodes to take when communicating with the base station in order to relay the monitored data. Muhammad [17] built the software defined WSN routing algorithm, a routing method based on SDN architecture. These days, the RL algorithm is used by the SDN controller to manage routing throughout the entire network. RL has a unique feature that awards points for every activity. Thus, the next step in the routing process will be determined by each present reward, and the optimal routing path will be created by drawing on past experiences [18] sensor-oriented The scarcity of energy sources is a challenge that IoT must overcome. To address this issue, researchers have developed the Energy Harvesting technology, a routing protocol that consists of two main parts: routing operations and distributed neighbor detection. Using EH techniques, it has an effective connection selection mechanism that chooses the connection depending on the angle nearest to the destination node, extending the lifetime of the network. The experimental results demonstrate successful routing coupled with promising outcomes in terms of energy and usage approaches.
Zeb et al. [19] Introduced a green routing method, FJAPSO, to extend the sensor network's life. FJAPSO operates at two levels for self-optimization: determining the optimal number of control nodes and the best control node clustering. The experiments’ results demonstrated that FJAPSO outperforms other advanced techniques, significantly increasing the sensor network’s lifespan, offering a promising solution for practical applications. [20] CLRP-MMSs is a recommended technique that divides nodes into clusters and selects a CH using the CHCF method. This method plays a crucial role in the CH selection process. When applied to networks with moving nodes, the risk of data loss due to detaching nodes from CH nodes is significantly reduced. This not only minimizes energy loss but also optimizes the energy and data rate received. Sadrishojaei et al. [21] recommend the use of the EEMSR protocol in Internet of Things networks. This protocol not only reduces the substantial communication cost that arises from the scalability of these networks but also significantly enhances inter-cluster routing. This improvement is crucial, especially in networks that are supported by a plethora of diverse IoT entities and services as well as heterogeneous IoT networks. The EEMSR protocol's employment of multiple trust levels, such as data perception trust, data fusion trust, and communication trust, ensures a robust risk management system by computing the trust factor on the clustering and routing.
An energy-efficient routing protocol based on multi-threshold segmentation (EERPMS) algorithm were proposed to minimize the energy consumption during cluster formation and cluster head election [22]. Simulation results show that EERPMS can improve the distribution uniformity of cluster heads, prolong the network lifetime and save up to 64.50%, 58.60% and 56.15% network energy as compared to RLEACH, CRPFCM and FIGWO protocols respectively. The optimal route is determined by two energy-related metrics, 1. Transmission distance energy and 2 [23]. Average battery us- age between intermediate nodes, according to the created RDEC. SDN controllers reduce energy waste, streamline decision-making, and lengthen network lifespan. ER-SR finds energy residual node which dynamically acts as source node and finds the optimal source routing path for each common node and involves partial nodes in data transmission balances energy consumption in sensor nodes [24]. ER-SR gives higher energy efficiency but lesser performance, packet delivery ratio, and delivery delay, compared with other routing protocols in WSN’s.
Optimizes cluster heads (CHs) and relay points (RPs) selection, reducing energy consumption and preventing premature network failure [25]. By selecting energy-efficient CHs and optimizing data collection paths using the DPDP algorithm, the scheme enhances network stability and lifetime. Simulation results demonstrate that EERPS-DPDP not only outperforms existing methods in stability, energy efficiency, reduced data loss, and collection delay, but also improves network lifetime by up to 54% and reduces energy consumption by 55%. This superior performance showcases the effectiveness of EERPS-DPDP for sustainable WSN operations.
The Lie hypergraph cluster-based routing technique is proposed to extend the network's lifetime and enhance data transmission efficiency in WSNs [26]. A hypergraph represents the network, with clusters as hyperedges, and CH selection leverages hypergraph transversal properties and node-sink distances. Lie commutators from UTM Lie algebra optimize inter- and intra-cluster routing, conserving CH energy by selecting intermediate nodes for data transmission. Simulations show that the approach outperforms existing protocols in network lifetime, energy efficiency, and data delivery to the base station.
The CRFMO cluster routing algorithm does optimize forwarding in homogeneous WSNs [27]. It improves intracluster and intercluster forwarding by introducing a new formula for rational CH election and defining rules for efficient relay forwarding between CHs. These mechanisms not only balance node load, prevent early node death, and reduce energy consumption from long-distance transmissions, but also pave the way for more efficient and sustainable wireless sensor networks. The algorithm uses a single-hop transmission mode for intracluster communication. Simulations show that CRFMO enhances network efficiency, extends lifetime, and maintains high coverage compared to existing methods, offering a promising future for network optimization.
This section proposes a machine learning-based scheme for clustering and path selection in a network with a mobile sink [28, 29]. The scenario focuses on a cluster-based network structure. In a distributed network, each node is required to transmit its collected data to the base station, which can deplete the nodes’ energy in a multi hop network. The notations used in the proposed scheme are given in Table 1.
To address this issue, a cluster-based approach is adopted. Where the sensor nodes Nn (where N represents Sensor node and n represents number of sensor nodes), are loaded with Sid and key k0 and are deployed into the network. As the network is cluster based, these sensors are initially clustered and a cluster head is elected using K-Means algorithm. Due to communication burden the first CH energy will be ruined. Once the first cluster head energy is devastated next cluster heads are elected based on the Q-Learning algorithm to achieve energy efficiency and increase the network lifetime.
Table 1. Description of symbols used
|
Symbols |
Description |
Symbols |
Description |
|
CHj |
Cluster head where j=1… k |
PDR |
Packet delivery ratio |
|
CM |
Cluster member |
QOF |
Quality of service |
|
BS |
Base station |
Ni |
Sensor Nodes where i=1… n |
|
n |
number of nodes |
At |
Activity |
|
Pht |
State |
Rt |
Reward |
|
h |
Hash function |
K |
Number of cluster heads/clusters |
|
sn |
Number of surrounding |
Dth |
Distance threshold |
|
V |
Current cluster head |
L |
Level |
|
nn |
Neighbor node |
D |
Distance |
|
K |
Key |
E |
Encryption |
|
D |
Decryption |
T |
Time stamp |
Here the cluster heads are elected based on the key parameters such as Residual energy, Distance and obstacles. Not all the nodes in the network are elected as cluster heads. But the nearby nodes to current CH will be elected as next CH. so that the data transmission from nodes to CH will be done energy efficiently. Otherwise, if any end node is selected as CH, then the data transmission to long distance may decrease the energy of nodes as well as CH. There may be possibility of data loss due to long distance. Moving to the Obstacle free CH election process. If there is any obstacle near the CH it may block the data transmission and trouble MS. So, during CH selection itself an obstacle free CH is elected.
The cluster heads try to forward the collected data from sensors to base station. Where the energy of the CHs got devastated due to far transmission of data. To avoid energy draining of the CH this proposed scheme uses a Mobile sink to collect the data from cluster heads. During this process the path planning to MS plays a key role. The path may contain obstacles. Due to these obstacles the path length may prolong and the data loss may happen. The obstacle avoidance is very important. Later it finds the best optimized path to Mobile sink along with the shortest edge of the obstacle and collects the data from the CH.
Figure 1 explains the architecture of the proposed work. Where the sensor nodes are deployed into the network. Clusters are formed and the cluster heads are elected using initially K-Means later on, using Q- Learning algorithms. Where the nearby nodes to current CH is only considered for CH election depending on the key parameters such as remaining energy, distance and so on. To achieve energy efficiency a mobile sink is used to collect the data from CHs instead of sending the data individually (all nodes) to base station. The MS path is efficiently planned depending on the obstacles. When there is an obstacle the shortest edge of the obstacle is considered for the travel and mobile sink is moved in an optimized way. Later a secure data transmission happens in-between CHs and MS.
Figure 2 illustrates the data flow diagram of the proposed method. Initially, the nodes are deployed, and clusters are formed with the cluster head (CH) being selected through the K-Means algorithm. The CH is responsible for communicating with and gathering data from the other nodes in its cluster. Once the energy of the first CH is depleted, nearby nodes are considered for the election of the second CH. The selection is ased on several key factors, including neighboring nodes, residual energy, distance, and the number of packets transmitted.
Figure 1. Integrated cluster-sink network/mobile-driven cluster network
Figure 2. Data flow diagram of the proposed system
This obstacle-free CH election is managed using the Q-Learning algorithm, ensuring the CH with the highest energy is chosen. The mobile sink (MS) then moves along an optimized path to collect data from the CH’s, eventually transmitting the data to the base station (BS). The proposed scheme has 4 phases: 1. Clustering phase 2. CH election phase 3. MS path planning phase and 4. Data collection phase.
2.1 Clustering phase
The first phase in the current work is Clustering phase, where the network is clustered. Algorithm 1 describes clustering as well as overall flow of the current work. Here sensor nodes N1, N2...Nn are deployed into the network area $\mathrm{M} * \mathrm{~N}$.
The network is clustered and a CH is elected using K-Means algorithm. K-Means takes few random points (sensors) from the network and assigns remaining sensor nodes (points) to that selected points by calculating the Euclidean distance.
The elected CHs (random points) are centroids of each cluster to make the communication easier with cluster members. Here the value of K is decided by the number of clusters that are formed in the network. Later the flow of the work goes with selection of neighbor nodes for next CHs election. The next cluster heads are elected only in these neighbor nodes using RL algorithm. The data exchange happens in between CH and cluster members. At the end MS visits each and every CH in an optimized path to collect the data and forwards it to BS or end user.
|
Algorithm 1. Cluster formation and overall flow of the work |
|
1: Deployment of set of N1, N2… Nn sensors into $\mathrm{M} *\mathrm{N}$ network area 2: Formation of clusters and initial CH election using K-Means algorithm 3: Fixing the threshold values and selecting neighbor nodes, process is done (Algorithm 2) 4: CH election process using SNs is done (Algorithm 3) 5: Routing the data from cluster members to cluster head 6: Mobile sink visits each CH using optimized path and collects the data (Algorithm 4) 7: The collected data is forwarded to the end user |
2.2 CH election phase
The second phase of the current work is CH election phase. Algorithm 2 refers the same. As initially centroid is elected as CH, once the energy of it drains the next CH need to be elected using Q-Learning algorithm. For this sake first the neighbor nodes need to be selected. These nodes must be within the fixed distance dth. If there were no nodes in the threshold distance dth then the dth is increased by 1unit (1mtr) and the nodes are listed into neighbor nodes. Once Nn nodes are listed RL algorithm is used to elect next CH with highest energy efficiency among them. So the complexity of the algorithm is reduced to O(n/2). The reason behind chosing the centroid and its neighbour nodes as CH is, boundary nodes are far away from each other so it will be difficult and also it also disturbs to plan the path for MS. Since we are assuming MS moves from one CH to another CH to collect the data the distance it travels will be exceeded. In our work we will be assuming all sensor nodes are solar powered.
|
Algorithm 2. Cluster head election |
|
1: INPUT: Group of sensors 2: OUTPUT: List of cluster heads CH1, CH2, ..., CHk 3: Deployment of N1, N2, …, Nn sensor nodes and forming the topology. 4: Cluster formation using K-Means and head election using RL. CH1, CH2, …, CHk among the nodes in the cluster 5: If distance < dth Elect all nodes within distance dth and add to surrounding nodes list 6: Else increment threshold dth by 1 unit (1mtr) and repeat the above step 5 7: Applying reinforcement algorithm to Surrounding nodes process shown in Algorithm 3 |
Algorithm 3 describes the CH election process using RL algorithm. The state action pairs are given as input to RL algorithm which are initially set to 0. Energy count of each neighbour node is collected and the node with high energy will get the good reward. Immediate reward will be awarded with 0 and delayed reward will be awarded nearby 1. Update $\mathrm{Q}(\mathrm{ph}, \mathrm{A})$ values. Both $\propto$ (Learning rate) and $\gamma$ (Discount factor) are the sensitive parameters that signifacantly influence the behaviour and performance of Q-Learning algorithm. $\propto$ Controls the speed of learning by determining how much new information overwrites old information in the Q -value updates. As lower the $\propto$ value is tuned, more stable updates we get. The higher values of $\propto$ lead to faster learning but risk instability. When $\propto=0.1$ will change $Q$ values gradually and retains the past experiences longer. When $\propto=0.5$ the Q values will be quicker with new information. $\propto$ values are calculated as follows: $\propto_t=\frac{1}{1+t}$ where $t$ is a non-negative integer which indicates iteration number in the running process. It starts from 0 and incremented by 1 .
$\gamma$ determines the importance of future rewards relative to immediate reward. The lower $\gamma$ values focus on short term reward and takes faster decisions. For example, $\gamma=0.9$ prioritizes immediate rewards bit it may lead in ignoring longterm benefits. The large the $\gamma$ value i.e., $\gamma=0.99$ risks slow convergence and useful for tasks requiring long term planning, which is useful for increasing the network lifetime indirectly.
|
Algorithm 3. Reinforcement based cluster formation |
i. $$\mathrm{Q}_{\mathrm{t}+1}\left(\mathrm{ph}_{\mathrm{t}}, \mathrm{~A}_{\mathrm{t}}\right)=(1-\propto) \mathrm{Q}_{\mathrm{t}}\left(\mathrm{ph}_{\mathrm{t}}, \mathrm{~A}_{\mathrm{t}}\right)+\propto\left[\mathrm{R}^{\mathrm{t}+1}+\gamma \max _{\mathrm{A}^{\prime}} \mathrm{Q}_{\mathrm{t}}\left(\mathrm{ph}_{\mathrm{t}+1}, \mathrm{~A}^{\prime}\right)\right]$$ Where $\gamma$ is a constant lies between $0.9 \leq \gamma<0.99, \propto$ is a constant lies between $0.1<\propto<0.5, \mathrm{~A}^{\prime}$ indicates all possible actions in the sate ii. $\mathrm{ph}=\mathrm{ph}^{\prime}$ (state will change from current CH to the next CH) iii. Select action $\left(\mathrm{ph}_{\mathrm{i}}\right)=\operatorname{argmax}_{\mathrm{A}} \gamma \mathrm{Q}\left(\mathrm{ph}_{\mathrm{i}}, \mathrm{A}\right)$. Once the value is calculated next CH is elected and neighbour nodes are explored as follows: Exploration $=\frac{\mathrm{kQ}(\mathrm{ph}, \mathrm{A})}{\sum \mathrm{kQ}(\mathrm{ph}, \mathrm{A})}$ where k is a constant and $k>0$ which determines how strongly the selection favours actions. |
State is updated from ph to $\mathrm{ph}^{\prime}$. Here ph indicates the state at old CH and $\mathrm{ph}^{\prime}$ indicates the new states at new CHs. Once the next neighbour node is elected as CH the action is selected based on the $\operatorname{argmax}_A \gamma Q\left(p h_i, A\right)$ value and the neighbour nodes are explored using softmax exploration function. It is Effective in Complex Environments where small differences in Q-values matter, as actions are weighted proportionally to their Q-values. In exploration constant $\mathrm{k}>0$ which determines how strongly the selection favours actions, adjusts how sensitive the action probabilities are to the differences in Q-values. High k encourages exploration with more uniform probabilities and low k focuses on exploitation by favoring high-Q-value actions. The CHs are dynamically elected with higher Q-values, higher energy, lesser distance from $\mathrm{N}_{\mathrm{n}}$. Once the new CH is elected the state action pair moves from $\mathrm{ph}_{\mathrm{t}}\left(\mathrm{E}_{\mathrm{h}}, \operatorname{ld}_{\mathrm{Nn}}\right)$ to $\mathrm{ph}_{\mathrm{t}+1}\left(\mathrm{E}_{\mathrm{h}}, \operatorname{ld}_{\mathrm{Nn}}\right)$.
We can also use $\varepsilon$-greedy exploration here as it is simple and computationally efficient but it is suitable for simple environments. Its random exploration can waste effort on poor actions, and the transition from exploration to exploitation may be abrupt. So in this environment softmax exploration is well suitable.
2.3 MS path planning phase
The cluster heads try to forward the collected data from sensors to base station. Where the energy of the CH’s got devastated due to far transmission of data. To avoid energy draining of the CH’s, this proposed scheme uses a Mobile sink to collect the data from cluster heads. During this process the path planning to MS plays a key role. The path may contain obstacles. Due to these obstacles the path length may prolong and the data loss may happen therefore obstacle avoidance is very important. So here, the proposed scheme finds out the shortest edge of the obstacle first. Later it finds the best optimized path to Mobile sink along with the shortest edge of the obstacle. During this process if the CH is in the communication range of MS then CH computes the distance $\mathrm{d}_{\mathrm{j}}$. If the distance $\mathrm{d}_{\mathrm{j}}{ }^{\prime}$ is equal to $\sqrt{2 \mathrm{~d}}$ then the CH authenticates MS and transfers the data to the mobile sink. This mode in the proposed scheme is called as normal mode.
If the distance $\mathrm{d}_{\mathrm{j}}^{\prime}$ is greater than $\sqrt{2 \mathrm{~d}}$ from MS to CH, it means there exists an obstacle. Now MS tries to find the shortest path to reach the CH. The shortest path can be traced by finding out the shortest edge of the obstacle. And MS will travel through this shortest edge of the obstacle by finding the optimized path for secure data transmission. This mode in proposed scheme is called as Obstacle mode. Algorithm 4 describes path planning phase. In the beginning nodes are loaded with $\mathrm{S}_{\mathrm{id}}$ and key $\mathrm{K}_0$ and deployed into the network. MS is loaded with $\mathrm{MS}_{\mathrm{id}}$, key $\mathrm{K}_0$ and $4^{\mathrm{L}}+4^{\mathrm{L}-1} \mathrm{CH}$ 's depending on level L. MS starts moving in MN-Curve level trajectory from origin $\mathrm{CH}_0(0,0)$ with time stamp $\mathrm{T}_{\mathrm{CH}_0}$ in MN-Curve level trajectory to collect data from all cluster heads. The path length to CH from MS is $\sqrt{2 \mathrm{~d}}$. The current distance $\mathrm{d}_{\mathrm{j}}{ }^{\prime}$ is calculated. The current distance $\mathrm{d}_{\mathrm{j}}{ }^{\prime}$ is calculated. If the path distance $\sqrt{2\mathrm{d}}$ and current distance $\mathrm{d}_{\mathrm{j}}{ }^{\prime}$ both are same then the mobile sink will travel in the same path. If $d_j$ is greater than $\sqrt{2 d}$ then there may be an obstacle. Where the shortest edge of the obstacle is found and the optimized path selection is done for MS. After path planning Algorithm 5 is called for data collection.
During this process if the CH is in the communication range of MS then CH computes the distance $\mathrm{d}_{\mathrm{j}}$. If the distance $\mathrm{d}_{\mathrm{j}}$ is equal to $\sqrt{2 \mathrm{~d}}$ then the CH authenticates MS and transfers the data to the mobile sink. This mode in this proposed scheme is called as normal mode. If the distance $d_j$ is greater than $\sqrt{2\mathrm{d}}$ then the MS, it means there exists an obstacle. To cross obstacle first MS uses SLAM technique for localization and mapping. SLAM is a core technique in autonomous systems that enables vehicle to build a map of an unknown environment while simultaneously determining its position within that map. Accordingly, MS tries to find the shortest path to reach the CH. The shortest path can be traced by finding out the shortest edge of the obstacle. Obstacle shortest edge is calculated based on Euclidean distance and Curvature. Whenever the obstacle is in curved shape it evaluates the distance using curvature and finds out shortest path as follows, The curvature $k$ at a point on a 2D curve defined by $y=f(x)$ is, $k=\frac{\left|f^{\prime \prime}(x)\right|}{\left(1+\left(f^{\prime}(x)^2\right)\right)^{\frac{3}{2}}}$ where $f^{\prime}(x)$ is the first derivative of the function $f(x)$ (slope), $f^{\prime \prime}(x)$ is the second derivative of the function f(x).
The sharp edge obstacle shortest distance is calculated based on Euclidian distance as follows, $\operatorname{dis}_{\mathrm{pq}}(\mathrm{o})=$ $\sqrt{\left(x_q-x_p\right)^2+\left(y_q-y_p\right)^2}$ where $x$, $y$ are the coordinates of the obstacle and $p, q$ are the vertices of the obstacle (varies from 1 to 4 as obstacle has 4 vertices). Here o represents number obstacles in the MS path.
|
Algorithm 4. MS Path Planning Algorithm |
|
1: Load $\mathrm{S}_{\mathrm{id}}$ and the key $\mathrm{k}_0$ to every sensor node, deploy them in the network region 2: Load MSID and a key $\mathrm{k}_0$ to a MS 3: MS is preloaded with $4^{\mathrm{L}}+4^{\mathrm{L}-1}$ CH’s depending on level L 4: MS starts moving in the trajectory from origin $\mathrm{CH}_0(0,0)$. 5: The time stamp of origin point $\mathrm{CH}_0$ is considered as $\mathrm{T}_{\mathrm{CH}_0}$ 6: MS moves in MN-Curve level trajectory 7: At every jth CH, MS performs the following steps 8: for j = 1 to number of CHs do 9: if MS communication range lies with V then 10: V computes the distance $\mathrm{d}_{\mathrm{j}}$ 11: If $\mathrm{d}_{\mathrm{j}}^{\prime}=\sqrt{2 \mathrm{~d}}$ then 12: V authenticates MS and transfer data to MS in normal mode. Go to algorithm-5. 13: else if $\mathrm{d}_{\mathrm{j}}^{\prime}>2 \mathrm{~d}$ then 14: MS finds the shortest edge of obstacle for path as follows,
Later V authenticates MS transfers data to MS in obstacle mode. Go to algorithm-5. 15: end if 16: end if 17: end for |
And MS will travel through this shortest edge of the obstacle by finding the optimized path. Later MS sends a hello message to CH and CH authenticates MS. So the data transmission continues.
2.4 Data collection phase
In data transmission phase, secure data exchange will happen in-between MS and CHs using Keys. Algorithm 5 depicts the data collection from CHs to MS. Once MS reaches the cluster head points then MS sends an encrypted message to CH to transfer the data. CH first decrypts the message using key $\mathrm{K}_0$ and authenticates weather it is malicious or not. Once the authentication is done the CH will send a key $\mathrm{K}_{\mathrm{CH}_{\mathrm{j}}}=$ $\mathrm{K}_{\mathrm{CH}_{\mathrm{j}-1}} \oplus \mathrm{~T}_{\mathrm{CH}_{\mathrm{j}-1}}$ to MS to decrypt the Collected data.
|
Algorithm 5: Data Collection Algorithm |
|
1: V receives encrypted hello message $E_{K_{C H} j}$ from MS 2: V decrypts $D_{K_{C H} j}$ using secret key K0 3: V computes hash on h′ = h (data) 4: if h′ = h then 5: V computes the secret key as $\mathrm{K}_{\mathrm{CH}_{\mathrm{j}}}=\mathrm{K}_{\mathrm{CH}_{\mathrm{j}-1}} \oplus \mathrm{~T}_{\mathrm{CH}_{\mathrm{j}-1}}$ 6: V forwards key $\mathrm{K}_{\mathrm{CH}_{\mathrm{i}}}$ to MS 7: V transfer the data to MS 8: MS decrypts the data using $\mathrm{K}_{\mathrm{CH}_{\mathrm{i}}}$ Later CH transfers the collected data from sensors to mobile sink. Finally, MS will transfer the collected data to the base station. |
The proposed optimized mobile sink routing algorithm was conducted on 100 to 500 nodes in a 500×500 area as presented in Table 2. The nodes are deployed randomly as shown in Figure 3 and the initial cluster head is elected using K-Means as shown in Figure 4 and next cluster heads are elected using Q-Learning algorithms with 5% of CHs. The performance is measured with the time complexity and parameters such as distance and energy to analyze the energy consumed for cluster heads election. The proposed work is compared with the existing work as follows.
The work presented in the study [4] uses Q-Learning algorithm for AI-based RP selection and rotation. That means the CH’s elected are energy efficient nodes in spite of distance and time complexity (Weather it is a centroid or at the end of the cluster). When the CH is longer the distance the energy consumed to exchange the data will be more and also the Time Complexity will be more. To overcome these drawbacks the proposed work addresses K-Means algorithm to elect first CH. Through which centroid is elected as first CH and the energy efficiency is achieved by making first CH available lesser distance to remaining nodes. So that the time complexity of the first CH election is reduced, this is depicted in Figure 5.
Table 2. Network simulation parameters
|
Parameter Name |
Parameter Value |
|
Network Area |
500m×500m |
|
Number of Nodes |
500 |
|
Sink node Coordinates |
(250m, 250m) |
|
Node Initial Energy |
0.5J |
|
Energy consumption of transmitting $\mathrm{E}_{\mathrm{tx}}$ and receiving circuit $\mathrm{E}_{\mathrm{rx}}$ |
50nJ/bit |
|
Packet Size |
4000bit |
Figure 3. K-Means clustering
Figure 4. CH-election- using K-means
Figure 5. Time complexity of the first cluster head election
Figure 6. Surrounding nodes selection
Figure 7. Dynamic CH election using RL
Figure 8. Time complexity of the second cluster head election
Figure 9. Optimized data collection path for MS
Once the energy of the first Cluster Head (CH) is depleted, a second CH is elected for data exchange. In existing works, such as the study presented in the study [12], the CH is elected using LEACH enhanced with Minimum Spanning Tree (MST) concepts. However, this approach increases time complexity due to frequent MST updates. Similarly, in the study [11], the time complexity associated with CH election is not addressed. But in the current work the CH’s which are elected from second head onwards are elected using Q-Learning algorithm by considering only nearby nodes (shown in Figure 6) to the before CH. Through the Nn node concept the next CH’s are dynamically elected (shown in Figure 7) and the time complexity to elect CH is automatically reduced to O (n/2) using Q-Learning algorithm. Figure 8 shows the comparison of the time complexity of second CH election, where Q-Learning gives the best results. This Nn concept not only reduces time complexity but also increases energy efficiency by 75% while exchanging the data from the CMs as the current CH will be almost centroid of the current cluster. So that the distance to transfer the data will be comparatively less.
Whenever a new CH is elected the data need to be collected from the old CH. This data is collected through mobile sink. The work presented in the study [5] uses static Sink and they have not at all addressed about the obstacles in path. Proposed work clearly plans the shortest path for mobile sink as shown in Figure 9 and also addresses about mobile sink. In the studies [4, 30], they have used mobile sink but they assumed there is no obstacle in the path. But in the proposed work the obstacle path planning is explained in detail. Here the path is planned using the shortest path of the obstacle dynamically. This is shown in Figure 10.
Figure 10. Optimized data collection path for MS during obstacles
Moving to the energy level comparison of the nodes. The energy level of the nodes and CHs drain frequently due to data exchange. After every CH election the average energy drained is calculated to measure the performance. Figure 8 shows the nodes, average energy level comparison after each CH election.
Referring to [27, 31, 32] the simulation parameters are as follows, the variation of the number of surviving nodes with the number of rounds is a key indicator of the network’s lifetime. As depicted in Figure 11, the algorithm presented in this article consistently maintains a high number of surviving nodes over an extended period. Notably, the time of occurrence of the first dead node and the time when half of the nodes die outperform those of IMP-LEACH and LEACH-IACA. Even though the rate of node death accelerates after 670 rounds, at this point, approximately 80% of the nodes in the WSN have perished, and the network is effectively in a failed state, unable to meet the demand for sensing data acquisition. However, the Q-Learning algorithm effectively enhances the network’s lifetime, facilitating the transmission of more data within the network.
The total remaining energy of the nodes in the network gradually decreases as the number of rounds of operation increases and reaches zero when all nodes are dead. Reducing the consumption of energy can extend the lifespan of the network. A comparison of energy consumption between this algorithm and IMP-LEACH, LEACH-IACA and CRFMO is illustrated in Figure 12, which shows the overall trend of remaining energy in the network. From the graph, it is evident that the energy consumption trend of the proposed algorithm is significantly slower than the other three algorithms for the majority of the time. Additionally, the rate of change in energy consumption is relatively gentle. Although the remaining energy in the IMP-LEACH algorithm is slightly ahead in the later stages, by this time, the number of surviving nodes in the network is severely insufficient, rendering this portion of remaining energy of limited practical value.
Figure 11. Comparison of the number of survival nodes in the network
Figure 12. Network residual energy
Figure 13. Comparison of network coverage rates
WSNs cannot be moved once deployed. When designing routing algorithms, it’s essential to consider both prolonging the network’s overall lifespan and ensuring that the majority of nodes remain alive most of the time to maintain detection coverage. Figure 13 depicts the variation in coverage rates over rounds of operation for four algorithms. Although the CRFMO algorithm boasts the longest network lifespan among the three, its coverage rate is relatively mediocre, resulting in less-than-ideal monitoring effectiveness. Proposed algorithm with Q-Learning maintains a higher coverage rate most of the time, making it more suitable for practical deployment and application requirements in sensor networks. The ability of proposed algorithm to sustain high coverage rates for extended periods indicates that the current algorithm achieves more balanced node burden and more reasonable routing through the election of CHs and the establishment of forwarding rules.
Lesser the latency the performance will be high. Figure 14 shows the latency comparision between SDN-RL, DPDP and the proposed work. The proposed work achieves lesser latency compare to other algorithms. In SDN-RL, DPDP all nodes present in the network will communicate each other so the the data collection delay will be high. But in proposed work assuming that approximately half of the Nn (O(n/2)) are participating in the CH election, so the data collection delay lesser compare to other two algorithms.
Figure 14. Comparison of latency
This paper purely concentrates on 4 major things, 1. Clustering 2. Cluster head election 3. MS path planning and 4. Secure data transmission. Clusters are formed using k means algorithm. Initial cluster heads also elected using the same algorithm, where the cluster heads are centroids of the clusters in the network. Once energy of the CH ruins, dynamic cluster heads are elected using Q-Learning algorithm based on parameters such as energy and distance. The data collected by these cluster heads are securely collected using a mobile sink. The path to collect the data from all cluster heads is planned. If any obstacle is found during the tour the shortest edge is considered for the traversal. Moving to results, the initial cluster head using K-Means algorithm is compared with Q-Learning algorithm, where our algorithm achieves energy efficiency during data transmission as the cluster head is a centroid. Next the second CH is elected from neighbor nodes using Q-Learning algorithm and it is compared with K-Means and fuzzy C-Means algorithms. The results show our algorithm with Q-Learning achieves 75% of energy efficiency and the time complexity is reduced to $\mathrm{O}(\mathrm{n} / 2)$ than other algorithms. Future work: In the proposed work one mobile sink is used. When the monitoring area is large a single mobile sink cannot meet the real time requirement hence in future it is necessary to use more mobile sinks. In path planning process this paper considered shortest edge of the obstacle in future ML algorithms can be used to improve the MS path optimization.
[1] Singh, M.K., Amin, S.I., Choudhary, A. (2021). Genetic algorithm based sink mobility for energy efficient data routing in wireless sensor networks. AEU-International Journal of Electronics and Communications, 131: 153605. https://doi.org/10.1016/j.aeue.2021.153605
[2] Fan, B., Xin, Y. (2024). A clustering and routing algorithm for fast changes of large-scale WSN in IoT. IEEE Internet of Things Journal, 11(3): 5036-5049. https://doi.org/10.1109/JIOT.2023.3302874
[3] Gowda, C.S., Jayasree, P.V.Y. (2021). Rendezvous points based energy-aware routing using hybrid neural network for mobile sink in wireless sensor networks. Wireless Networks, 27(4): 2961-2976. https://doi.org/10.1007/s11276-021-02630-1
[4] Jindal, A., Agarwal, V., Chanak, P. (2021). Emergency evacuation system for clogging-free and shortest-safe path navigation with IoT-enabled WSNs. IEEE Internet of Things Journal, 9(13): 10424-10433. https://doi.org/10.1109/JIOT.2021.3123189
[5] Gutam, B.G., Donta, P.K., Annavarapu, C.S.R., Hu, Y.C. (2023). Optimal rendezvous points selection and mobile sink trajectory construction for data collection in WSNs. Journal of Ambient Intelligence and Humanized Computing, 14(6): 7147-7158. https://doi.org/10.1007/s12652-021-03566-2
[6] Agarwal, V., Tapaswi, S., Chanak, P. (2021). A survey on path planning techniques for mobile sink in IoT-enabled wireless sensor networks. Wireless Personal Communications, 119(1): 211-238. https://doi.org/10.1007/s11277-021-08204-w
[7] Redhu, S., Hegde, R.M. (2020). Cooperative network model for joint mobile sink scheduling and dynamic buffer management using Q-learning. IEEE Transactions on Network and Service Management, 17(3): 1853-1864. https://doi.org/10.1109/TNSM.2020.3002828
[8] Agarwal, V., Tapaswi, S., Chanak, P. (2022). Energy-efficient mobile sink-based intelligent data routing scheme for wireless sensor networks. IEEE Sensors Journal, 22(10): 9881-9891. https://doi.org/10.1109/JSEN.2022.3164944
[9] Rahman, G.M., Wahid, K.A. (2021). LDCA: Lightweight dynamic clustering algorithm for IoT-connected wide-area WSN and mobile data sink using LoRa. IEEE Internet of Things Journal, 9(2): 1313-1325. https://doi.org/10.1109/JIOT.2021.3079096
[10] Wei, Q., Bai, K., Zhou, L. (2022). An improved approach for wireless sensor networks with mobile sink using dynamic minimum spanning tree. IEEE Sensors Journal, 22(11): 10918-10930. https://doi.org/10.1109/JSEN.2022.3166942
[11] Kaur, G., Bhattacharya, M. (2024). Green fault tolerant aiot-enabled mobile sink data collection scheme in sensor networks. IEEE Transactions on Vehicular Technology, 73(10): 15385-15394. https://doi.org/10.1109/TVT.2024.3400880
[12] Al-Sadoon, M.E., Jedidi, A., Al-Raweshidy, H. (2023). Dual-tier cluster-based routing in mobile wireless sensor network for IoT application. IEEE Access, 11: 4079-4094. https://doi.org/10.1109/ACCESS.2023.3235200
[13] Vaezian, A., Darmani, Y. (2022). MSE-RPL: Mobility support enhancement in RPL for IoT mobile applications. IEEE Access, 10: 80816-80832. https://doi.org/10.1109/ACCESS.2022.3194273
[14] Hajian, E., Khayyambashi, M.R., Movahhedinia, N. (2022). A mechanism for load balancing routing and virtualization based on SDWSN for IoT applications. IEEE Access, 10: 37457-37476. https://doi.org/10.1109/ACCESS.2022.3164693
[15] Pathak, A., Al-Anbagi, I., Hamilton, H.J. (2022). An adaptive QoS and trust-based lightweight secure routing algorithm for WSNs. IEEE Internet of Things Journal, 9(23): 23826-23840. https://doi.org/10.1109/JIOT.2022.3189832
[16] Ekler, P., Levendovszky, J., Pasztor, D. (2022). Energy aware IoT routing algorithms in smart city environment. IEEE Access, 10: 87733-87744. https://doi.org/10.1109/ACCESS.2022.3199757
[17] Younus, M.U., Khan, M.K., Bhatti, A.R. (2021). Improving the software-defined wireless sensor networks routing performance using reinforcement learning. IEEE Internet of Things Journal, 9(5): 3495-3508. https://doi.org/10.1109/JIOT.2021.3102130
[18] Zeb, H., Ghani, A., Gohar, M., Alzahrani, A., Bilal, M., Kwak, D. (2023). Location centric energy harvesting aware routing protocol for IoT in smart cities. IEEE Access, 11: 102352-102365. https://doi.org/10.1109/ACCESS.2023.3317268
[19] Kumar, N., Vidyarthi, D.P. (2018). A green routing algorithm for IoT-enabled software defined wireless sensor network. IEEE Sensors Journal, 18(22): 9449-9460. https://doi.org/10.1109/JSEN.2018.2869629
[20] Sadrishojaei, M., Navimipour, N.J., Reshadi, M., Hosseinzadeh, M. (2021). A new preventive routing method based on clustering and location prediction in the mobile internet of things. IEEE Internet of Things Journal, 8(13): 10652-10664. https://doi.org/10.1109/JIOT.2021.3049631
[21] Zhang, Y., Ren, Q., Song, K., Liu, Y., Zhang, T., Qian, Y. (2021). An energy-efficient multilevel secure routing protocol in IoT networks. IEEE Internet of Things Journal, 9(13): 10539-10553. https://doi.org/10.1109/JIOT.2021.3121529
[22] Yao, Y.D., Li, X., Cui, Y.P., Wang, J.J., Wang, C. (2022). Energy-efficient routing protocol based on multi-threshold segmentation in wireless sensors networks for precision agriculture. IEEE Sensors Journal, 22(7): 6216-6231. https://doi.org/10.1109/JSEN.2022.3150770
[23] Almuntasheri, S.A., Alenazi, M.J. (2023). RDEC: Routing decisions through energy-cost estimation for IIoT and IWSNs in SDN-managed industry 4.0. IEEE Access, 11: 144244-144257. https://doi.org/10.1109/ACCESS.2023.3344450
[24] Xu, C., Xiong, Z.Y., Zhao, G.F., Yu, S. (2019). An energy-efficient region source routing protocol for lifetime maximization in WSN. IEEE Access, 7: 135277-135289. https://doi.org/10.1109/ACCESS.2019.2942321
[25] Ojha, A., Chaudhari, S.M., Chanak, P. (2024). A deep policy dynamic programming based intelligent data routing scheme for IoT-enabled wireless sensor networks. IEEE Transactions on Sustainable Computing, 10(3): 451-463. https://doi.org/10.1109/TSUSC.2024.3462512
[26] Sridharan, S., Venkatraman, S., Raja, S.P. (2023). A novel lie hypergraph based lifetime enhancement routing protocol for environmental monitoring in wireless sensor networks. IEEE Transactions on Computational Social Systems, 11(2): 2070-2080. https://doi.org/10.1109/TCSS.2023.3262273
[27] Sun, Q., Pang, J.L., Wang, X.Y., Zhao, Z.Y., Li, J. (2024). A clustered routing algorithm based on forwarding mechanism optimization. IEEE Sensors Journal, 24(22): 38071-38081. https://doi.org/10.1109/JSEN.2024.3467055
[28] Srivastava, H.K., Dwivedi, R.K. (2020). Energy efficiency in sensor based IoT using mobile agents: A review. In 2020 International Conference on Power Electronics & IoT Applications in Renewable Energy and its Control (PARC), Mathura, India, pp. 314-319. https://doi.org/10.1109/PARC49193.2020.236617
[29] Al-Behadili, H.A., AlWane, S.K., Al-Yasir, YI., Parchin, N O., Olley, P., Abd-Alhameed, R.A. (2020). Use of multiple mobile sinks in wireless sensor networks for large—Scale areas. IET Wireless Sensor Systems, 10(4): 175-180. https://doi.org/10.1049/iet-wss.2019.0208
[30] Kim, J.W., In, J.S., Hur, K., Kim, J.W., Eom, D.S. (2010). An intelligent agent-based routing structure for mobile sinks in WSNs. IEEE Transactions on Consumer Electronics, 56(4): 2310-2316. https://doi.org/10.1109/TCE.2010.5681105
[31] Mohammed, F.A., Mekky, N., Suleiman, H.H., Hikal, N.A. (2022). Sectored LEACH (S-LEACH): An enhanced LEACH for wireless sensor network. IET Wireless Sensor Systems, 12(2): 56-66. https://doi.org/10.1049/wss2.12036
[32] Wang, J., Du, Z.Z., He, Z.K., Wang, X.Y. (2020). A cluster-head rotating election routing protocol for energy consumption optimization in wireless sensor networks. Complexity, 2020(1): 6660117. https://doi.org/10.1155/2020/6660117