Spectrum Signals Handoff in LTE Cognitive Radio Networks Using Reinforcement Learning

Kolluru Suresh Babu, Srikanth Vemuru

Department of CSE, K L University, Vaddeswaram, Vijayawada-522502, Andhra Pradesh, India

Corresponding Author Email: suresh11.kolluru@gmail.com

Pages: 119-125 | DOI: https://doi.org/10.18280/ts.360115

Received: 17 November 2018 | Revised: 20 January 2019 | Accepted: 29 January 2019 | Available online: 30 April 2019

© 2019 IIETA. This article is published by IIETA and is licensed under the CC BY 4.0 license (http://creativecommons.org/licenses/by/4.0/).


Abstract: 

In this paper we develop a cognitive radio network (CRN) testbed to demonstrate the use of reinforcement learning and transfer learning schemes for spectrum handoff decisions. By considering the channel status (idle or busy) and channel condition (in terms of packet error rate), the sender node performs the learning-based spectrum handoff. Optimal power allocation for spectrum-sharing users is performed by the Galactic Swarm Optimization (GSO) algorithm. In reinforcement learning, the number of network observations required to reach the optimal decisions is often prohibitively high because of the complex CRN environment. When a node encounters new channel conditions, the learning process is restarted from scratch even when a similar channel condition has been experienced before. To mitigate this issue, a transfer learning based spectrum handoff scheme is implemented, which enables a node to learn from its neighboring node(s) to improve its performance. The experimental results show that the machine learning based spectrum handoff performs better in the long term and effectively utilizes the available spectrum.

Keywords: 

cognitive radio network, long-term evolution, spectrum handoff, galactic swarm optimization, reinforcement learning

1. Introduction

Cognitive Radio (CR) is considered an innovative solution to mitigate the spectrum scarcity problem by enabling Dynamic Spectrum Access (DSA), intended to reconcile the current conflict between ever-growing spectrum demand and presently inefficient spectrum use [1-2]. The fundamental idea of DSA is to provide suitable mechanisms that allow sharing the radio spectrum among several radio communication systems, improving overall spectrum utilization [3]. With the advent of CR as a key enabler of DSA, several papers have proclaimed the need for Cognitive Radio Networks (CRNs), which allow a wireless communication system to follow the so-called cognitive cycle of observing the environment, acting, and learning in order to improve its performance [4]. A CRN is defined as a wireless network with the capabilities of radio environment awareness, autonomous and adaptive reconfiguration of its infrastructure, and intelligent learning from experience in a continuously changing environment, in order to address the challenges of efficient spectrum use and excellent end-to-end performance [5-6].

The available Radio Frequency (RF) spectrum required for broadband radio communication is shrinking as the number of users and the number of wireless devices increase [7]. Current spectrum regulations, under which spectrum bands are licensed to specific Primary Users (PUs) for exclusive use, make spectrum resources scarce for other users who need to communicate. It has been shown that the legacy primary users underutilize their assigned resources [8-9]. Underutilization of resources and an explosion in spectrum demand have led to the idea of spectrum sharing. Although there are different variants of spectrum sharing, in general the PU holds the highest priority for a given channel, but the same spectrum can be used by Secondary Users (SUs) when the PU is not transmitting. This symbiotic relationship requires that the SUs continuously sense the spectrum and intelligently access it when and where available [10, 11].

The Cognitive Radio (CR) concept involves acquiring knowledge about the surrounding environment in order to optimize and reconfigure the main communication parameters [12]. There is great interest in applying the CR concept to the development of Dynamic Spectrum Allocation (DSA) algorithms, which enable an autonomous channel resource assignment procedure on a cognitive device [13]. A challenging research area is the real-world implementation of theoretical CR concepts: Software Defined Radio (SDR) is regarded as the enabling technology for CR design, given the possibility of implementing systems that are entirely configurable in software [14]. Since the practical implementation of CR concepts on real platforms is often affected by several software and hardware constraints, the effective design of SDR-based testbeds is currently a topic under investigation [15-18].

The rest of the paper is organized as follows. Section 2 reviews the related literature. In Section 3, we describe our proposed optimal power allocation and spectrum handoff method in detail. Based on the proposed spectrum handoff scheme, Section 4 presents the network parameter analysis and the experimental results. Finally, Section 5 gives the concluding remarks of this paper.

2. Related Works

Sumin D. Joseph et al. [19] designed a reconfigurable antenna based cognitive radio testbed. For the effective utilization of the available frequency bandwidth, cognitive radio technology has been advancing over the last few years. In this approach, the spectrum is shared among primary and secondary users. Secondary users, also called unlicensed users, make use of the parts of the spectrum that are not used by the primary (licensed) users.

Haipeng Du et al. developed a high-fidelity LTE cellular network testbed for mobile video streaming. They first design and implement an end-to-end networked testbed for LTE cellular networks called LTE-EMU. The testbed is a hybrid arrangement that combines simulation and emulation techniques. They then show, using the fidelity model presented in the paper, that the fidelity of LTE-EMU is significantly higher than that of an LTE simulation environment. The baseline for assessing fidelity is the real end-to-end transmission characteristics collected on a live LTE deployment through field tests. Finally, using a case study, they demonstrate the value of LTE-EMU for evaluating mobile video streaming services. The testbed is also freely available for other wireless protocol and application evaluations.

Muhammad Alam et al. [20] proposed an architecture for a context-aware node together with a testbed platform for analyzing the energy consumption of heterogeneous cooperative communications. The testbed includes an access point, which provides coverage in infrastructure mode, as well as nodes capable of communicating through short-range ultra-wideband links. The testbed incorporates a context-awareness module that provides and stores information related to the various nodes in the system. The paper shows how context information can be used to save the energy of mobile devices and extend their battery lifetime using short-range communications. The testbed serves as a proof of concept for the practical implementation of the cooperative communications idea.

Vuk Marojevic et al. [21] implemented an LTE spectrum sharing research testbed, an integrated research instrument for investigating these and other research issues; it allows analyzing the severity of the problem, designing and rapidly prototyping solutions, and evaluating them with standard-compliant equipment and test procedures. The modular testbed integrates general-purpose software-defined radio hardware, LTE-specific test equipment, RF components, free open-source and commercial LTE software, a configurable RF network, and recorded radar waveform samples. It supports both RF channel emulation and over-the-air transmission modes. The testbed can be remotely accessed and configured. An RF switching network allows designing a wide range of experiments that can involve a variety of real and virtual radios with support for Multiple Input Multiple Output (MIMO) antenna operation.

Amirshahram Hematian et al. [22] present an implemented SDR testbed consisting of four complete SDR nodes. Using the designed testbed, they conducted two case studies, the first of which is intended to support video transmission over mobile LTE standards.

Cognitive Radio (CR) has been promoted to make effective use of the scarce radio frequency spectrum. It introduces adaptiveness and intelligence to traditional radios. A CRN, in general, is a network composed of CR nodes with smart networking capabilities. The general concept of cognitive networking was introduced a few years ago. Although there have been many research works on CR and CRNs, to the best of our knowledge a genuine real-time CR system has never been demonstrated. To date, prior works have developed algorithms for CR on spectrum sensing, cooperative spectrum sensing, channel state prediction, and spectrum allocation. These algorithms provide basic capabilities for CR systems. By executing these algorithms, as well as conventional communication algorithms, on a cognitive radio network testbed (CRN testbed) in real time, a CR system can be demonstrated. In this paper we build a reinforcement learning based spectrum handoff testbed in cognitive radio networks.

3. Reinforcement Learning with Spectrum Handoff in LTE Cognitive Networks

In this paper we develop a Cognitive Radio Network (CRN) testbed to demonstrate the use of reinforcement learning and transfer learning schemes for spectrum handoff decisions. By considering the channel status (idle or busy) and the channel condition (in terms of packet error rate), the sender node performs the learning-based spectrum handoff. In reinforcement learning, the number of network observations required to reach the optimal decisions is often prohibitively high because of the complex CRN environment.

When a node encounters new channel conditions, the learning process is restarted from scratch even when a similar channel condition has been experienced before. To mitigate this issue, a transfer learning based spectrum handoff scheme is implemented, which enables a node to learn from its neighboring node(s) to improve its performance. In transfer learning, the node looks for an expert node in the network. If an expert node is found, the node requests the Q-table from the expert node for making its spectrum handoff decisions. If an expert node cannot be found, the node learns the spectrum handoff strategy on its own by using reinforcement learning. The experimental results show that the machine learning based spectrum handoff performs better in the long run and effectively utilizes the available spectrum.
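A minimal sketch of this expert lookup in Python, assuming hypothetical node objects with `is_expert` and `q_table` attributes (neither is an API defined in this paper):

```python
def obtain_q_table(neighbors, states, actions):
    """Transfer learning step: reuse an expert neighbor's Q-table when one
    exists; otherwise fall back to reinforcement learning from scratch."""
    for peer in neighbors:
        if peer.is_expert:             # a neighbor whose learning has converged
            return dict(peer.q_table)  # copy the expert's knowledge
    # No expert reachable: start with an empty Q-table and learn on our own.
    return {(s, a): 0.0 for s in states for a in actions}
```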

3.1 System model

In this section, the general formulation of the spectrum sharing problem in cognitive radio networks is presented. This formulation is then adapted to incorporate the key aspects used in developing the proposed solution.

3.1.1 General spectrum sharing problem

Consider a set of secondary users and a set of primary user channels. Each primary user channel is in one of two states, Idle or Busy, depending on primary user activity:

$S=\{s_1,s_2,\dots,s_{|S|}\}$, $P=\{p_1,p_2,\dots,p_{|P|}\}$

Each user $u_i$ observes the channel states according to its sensing capability, which depends on its probabilities of detection and false alarm.

Let $O$ be the matrix that represents each user's observed channel occupancy states. Element $O_{ij}$, the state of the $j$th channel as observed by the $i$th user, is 0 if the channel is Idle and 1 if it is Busy.
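A minimal sketch of building this observation matrix under imperfect sensing; the detection and false alarm probabilities below are illustrative values, not figures from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def observe_occupancy(channel_busy, n_users, p_d=0.9, p_fa=0.1):
    """Build O, where O[i, j] is user i's view of channel j (0 = idle,
    1 = busy). A busy channel is detected with probability p_d; an idle
    channel triggers a false alarm with probability p_fa."""
    n_channels = len(channel_busy)
    O = np.zeros((n_users, n_channels), dtype=int)
    for i in range(n_users):
        for j, busy in enumerate(channel_busy):
            if busy:
                O[i, j] = int(rng.random() < p_d)   # 0 here is a missed detection
            else:
                O[i, j] = int(rng.random() < p_fa)  # 1 here is a false alarm
    return O

# Example: 5 users sensing 4 channels, two of which are truly busy.
O = observe_occupancy(channel_busy=[1, 0, 0, 1], n_users=5)
```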

Figure 1. Cooperative sensing in cognitive radio environment

The transmitter in each secondary user pair $u_i$ uses $p_{ik}$ as its transmission power on channel $c_k$, where $0\le p_{ik}\le p_k^{\max}$. Assume that the primary user transmission power on channel $c_k$ is much larger than the maximum transmission power of the secondary users ($p_k \gg p_k^{\max}$). The channel gain from the transmitter in secondary user pair $u_j$ to the receiver in secondary user pair $u_i$ on channel $c_k$ is denoted by $g_{ij}^{k}$, while the channel gain from the primary user transmitter to the receiver in secondary user pair $u_i$ is denoted by $g_k$.

In Figure 2, $h_{s,i}$, $h_{i,k}^{s}$, $h_{p,k}$ and $h_{k,i}^{p}$ respectively represent the channel gains between the $i$th SU transmitter and receiver, the $i$th SU transmitter and the $k$th PU receiver, the $k$th PU transmitter and receiver, and the $k$th PU transmitter and the $i$th SU receiver.

Moreover, ${\left\{ h_{j,i}^{c} \right\}}_{i\ne j}$ represent the interference channel gains between SUs. $y_{s,i}$ and $y_{p,k}$ are the received signals at the receiver of the $i$th SU and the $k$th PU, respectively:

$y_{s,i}=x_{s,i}h_{s,i}+\sum\limits_{j=1,j\ne i}^{N}{h_{j,i}^{c}x_{s,j}}+\sum\limits_{k=1}^{M}{h_{k,i}^{p}x_{p,k}}+n_{s,i}$ (1)

$y_{p,k}=x_{p,k}h_{p,k}+\sum\limits_{i=1}^{N}{h_{i,k}^{s}x_{s,i}}+n_{p,k}$ (2)

Figure 2. Fading channels between PUs and SUs
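A minimal sketch evaluating the signal model of Eqs. (1)-(2); the array shapes and function name are our own conventions:

```python
import numpy as np

def received_signals(x_s, x_p, h_s, h_p, h_c, h_ps, h_sp, n_s, n_p):
    """Eqs. (1)-(2) for N SUs and M PUs. x_s (N,), x_p (M,): transmitted
    signals; h_s (N,): direct SU gains h_{s,i}; h_p (M,): direct PU gains
    h_{p,k}; h_c (N, N): SU-to-SU gains h_{j,i}^c; h_ps (M, N): PU-to-SU
    gains h_{k,i}^p; h_sp (N, M): SU-to-PU gains h_{i,k}^s; n_s, n_p: noise."""
    N, M = len(x_s), len(x_p)
    y_s = np.empty(N, dtype=complex)
    for i in range(N):                                   # Eq. (1)
        su_interf = sum(h_c[j, i] * x_s[j] for j in range(N) if j != i)
        pu_interf = sum(h_ps[k, i] * x_p[k] for k in range(M))
        y_s[i] = x_s[i] * h_s[i] + su_interf + pu_interf + n_s[i]
    y_p = np.empty(M, dtype=complex)
    for k in range(M):                                   # Eq. (2)
        su_interf = sum(h_sp[i, k] * x_s[i] for i in range(N))
        y_p[k] = x_p[k] * h_p[k] + su_interf + n_p[k]
    return y_s, y_p
```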

To protect the primary users, the aggregate interference produced by the SU transmissions on each PU channel must remain below a threshold $Z_k$, where $p_i$ is the transmit power of SU $i$ and $f_{i,k}^{s}$ the corresponding interference channel gain towards PU $k$:

$\sum\limits_{i=1}^{N}{p_{i}f_{i,k}^{s}}\le Z_{k},\quad k=1,2,\dots,M$ (3)

We additionally limit the power of every user to guarantee a minimum QoS. In other words, we define a minimum signal to interference and noise ratio (SINR) for each user:

$\mu_{i}=\frac{p_{i}h_{s,i}}{\sum\nolimits_{j=1,j\ne i}^{N}{p_{j}h_{i,j}^{c}}+\sum\nolimits_{k=1}^{M}{p_{p,k}h_{k,i}^{p}}+\sigma _{s,i}^{2}}$ (4)

where $\sigma_{s,i}^{2}$ is the noise variance. Our goal is to satisfy the following condition for every user:

$\mu_{i}\ge \mu_{i,\min},\quad \forall i\in \{1,\dots,N\}$ (5)
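The SINR of Eq. (4) and the feasibility check of Eq. (5) translate directly into code; this sketch treats all gains as power gains and uses our own array conventions:

```python
import numpy as np

def sinr(p, h_s, h_c, p_pu, h_ps, noise_var):
    """Eq. (4). p (N,): SU powers; h_s (N,): direct gains h_{s,i};
    h_c (N, N): SU cross gains h_{i,j}^c; p_pu (M,): PU powers;
    h_ps (M, N): PU-to-SU gains h_{k,i}^p; noise_var (N,): sigma_{s,i}^2."""
    N, M = len(p), len(p_pu)
    mu = np.empty(N)
    for i in range(N):
        su_interf = sum(p[j] * h_c[i, j] for j in range(N) if j != i)
        pu_interf = sum(p_pu[k] * h_ps[k, i] for k in range(M))
        mu[i] = p[i] * h_s[i] / (su_interf + pu_interf + noise_var[i])
    return mu

def qos_satisfied(mu, mu_min):
    """Eq. (5): every SU must reach its minimum SINR."""
    return bool(np.all(mu >= np.asarray(mu_min)))
```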

3.1.2 Non-orthogonal multiple access (NOMA) for cooperative spectrum sharing

Multiple access schemes allow several mobile users to share the same spectrum. It was once widely believed that superposition coding is the way to achieve maximum capacity, dominating orthogonal schemes such as time division (TD) or frequency division (FD). However, TD, FD and superposition coding have the same capacity region under the sum-power constraint in the multiple access channel (MAC), so NOMA is not a candidate for the 5G uplink.

In the downlink, the two-user model can be expressed as

$y_1=h_1 (x_1+x_2 )+n_1$ (6)

$y_2=h_2 (x_1+x_2 )+n_2$ (7)

where $x_1$ and $x_2$ are the signals of users one and two, $h_1$ and $h_2$ are the channel gains, $n_1$ and $n_2$ are white noise, and $y_1$ and $y_2$ are the received signals. In the case of a symmetric channel, i.e. $h_1=h_2$, such a broadcast channel is equivalent to a MAC, and NOMA achieves no capacity gain over orthogonal schemes.
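The symmetric-channel claim can be checked numerically. Below is a sketch of the two rate computations under the usual NOMA assumptions (decoding order by channel strength, successive interference cancellation at the strong user); function names and parameters are ours:

```python
import numpy as np

def noma_rates(p1, p2, h1, h2, n0=1.0):
    """Rates for the downlink model of Eqs. (6)-(7) under superposition
    coding, assuming |h1| <= |h2|: user 1 treats x2 as noise, user 2
    cancels x1 before decoding."""
    g1, g2 = abs(h1) ** 2, abs(h2) ** 2
    r1 = np.log2(1 + p1 * g1 / (p2 * g1 + n0))  # weak user
    r2 = np.log2(1 + p2 * g2 / n0)              # strong user after SIC
    return r1, r2

def td_rates(alpha, p, h1, h2, n0=1.0):
    """Orthogonal time-division baseline with time fraction alpha for user 1."""
    r1 = alpha * np.log2(1 + p * abs(h1) ** 2 / n0)
    r2 = (1 - alpha) * np.log2(1 + p * abs(h2) ** 2 / n0)
    return r1, r2
```

For $h_1=h_2$ with gain $g$, the NOMA sum rate $r_1+r_2$ collapses to $\log_2(1+(p_1+p_2)g/n_0)$, matching the time-division sum rate when the full power $p_1+p_2$ is used in each slot, which is the equivalence noted above.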

3.2 Galactic swarm optimization for optimal power allocation

The GSO algorithm mimics the motion of galaxies and superclusters of galaxies in the universe. First, all individuals (solutions) in each subpopulation are attracted towards the best solutions of their subpopulation. Then each subpopulation is represented by the best solution it has found, and these representatives are treated as a superswarm. The superswarm is built from the best solutions of all subpopulations. In this way, all individuals are eventually attracted towards the global best solution. In this paper we use the GSO algorithm for optimal power allocation among spectrum sharing users.

Level 1

Instead of a single swarm exploring in one particular direction, several swarms produce a synergistic effect that results in better exploration.

Each subswarm explores the search space independently. The velocities and positions of the particles are updated by the following expressions:

$v_{j}^{(i)}\leftarrow \omega_{1}v_{j}^{(i)}+c_{1}r_{1}\left( p_{j}^{(i)}-x_{j}^{(i)} \right)+c_{2}r_{2}\left( g^{(i)}-x_{j}^{(i)} \right)$ (8)

$x_{j}^{(i)}\leftarrow x_{j}^{(i)}+v_{j}^{(i)}$ (9)

where $v_{j}^{(i)}$ is the velocity of particle $j$ in subswarm $i$, $p_{j}^{(i)}$ is the best solution found by that particle, $g^{(i)}$ is the best global solution of subswarm $i$, $x_{j}^{(i)}$ is the current particle position, $c_1$ and $c_2$ are acceleration constants, $\omega_1$ is the inertia weight, and $r_1$ and $r_2$ are random numbers.

Level 2

The global best solutions of the subswarms form the superswarm at the next level of clustering. A new superswarm $Y$ is created from the collection of the global best solutions of the subswarms $x_i$:

$y^{(i)}\in Y:\ i=1,2,\dots,M$ (10)

$y^{(i)}=g^{(i)}$ (11)

The superswarm uses the global best solutions already computed by the subswarms and thus exploits previously acquired information. The global solutions of the subswarms influence the superswarm, but there is no feedback or flow of information from the superswarm back to the subswarms, which preserves the diversity of solutions. The velocity and position of the superswarm at the second clustering level are updated by the equations shown below:

$v^{(i)}\leftarrow \omega_{2}v^{(i)}+c_{3}r_{3}\left( p^{(i)}-y^{(i)} \right)+c_{4}r_{4}\left( g-y^{(i)} \right)$ (12)

$y^{(i)}\leftarrow y^{(i)}+v^{(i)}$ (13)

where $v^{(i)}$ is the velocity associated with $y^{(i)}$, $p^{(i)}$ is the best solution found by $y^{(i)}$, $g$ is the global best solution of the superswarm, $c_3$ and $c_4$ are acceleration constants, $\omega_2$ is the inertia weight, and $r_3$ and $r_4$ are random numbers.
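Putting Eqs. (8)-(13) together, a compact two-level GSO sketch follows. Swarm sizes, iteration counts, bounds and coefficients are illustrative; for power allocation, `dim` would be the number of SU transmit powers and `cost` could penalize violations of Eqs. (3) and (5):

```python
import numpy as np

rng = np.random.default_rng(1)

def gso(cost, dim, n_subswarms=5, n_particles=10, iters=(50, 50),
        bounds=(0.0, 1.0), w=(0.7, 0.7), c=(1.5, 1.5, 1.5, 1.5)):
    """Minimize `cost` over `dim`-dimensional vectors with two-level GSO."""
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_subswarms, n_particles, dim))
    v = np.zeros_like(x)
    pbest, pbest_f = x.copy(), np.apply_along_axis(cost, 2, x)
    # Level 1: each subswarm explores independently, Eqs. (8)-(9).
    for _ in range(iters[0]):
        for i in range(n_subswarms):
            g = pbest[i, np.argmin(pbest_f[i])]           # subswarm best g^(i)
            r1, r2 = rng.random((2, n_particles, dim))
            v[i] = (w[0] * v[i] + c[0] * r1 * (pbest[i] - x[i])
                    + c[1] * r2 * (g - x[i]))
            x[i] = np.clip(x[i] + v[i], lo, hi)
            f = np.apply_along_axis(cost, 1, x[i])
            better = f < pbest_f[i]
            pbest[i][better], pbest_f[i][better] = x[i][better], f[better]
    # Level 2: the subswarm bests form the superswarm, Eqs. (10)-(13).
    y = np.array([pbest[i, np.argmin(pbest_f[i])] for i in range(n_subswarms)])
    vy = np.zeros_like(y)
    py, py_f = y.copy(), np.apply_along_axis(cost, 1, y)
    for _ in range(iters[1]):
        g = py[np.argmin(py_f)]                           # superswarm best g
        r3, r4 = rng.random((2, n_subswarms, dim))
        vy = w[1] * vy + c[2] * r3 * (py - y) + c[3] * r4 * (g - y)
        y = np.clip(y + vy, lo, hi)
        f = np.apply_along_axis(cost, 1, y)
        better = f < py_f
        py[better], py_f[better] = y[better], f[better]
    return py[np.argmin(py_f)]

# Example: minimize a simple quadratic over 4 power variables.
best_powers = gso(lambda z: float(np.sum(z ** 2)), dim=4)
```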

Figure 3. Channel State information for reinforcement learning

Heterogeneous secondary users select the appropriate transmission power based on locally acquired information about the surrounding wireless environment. In other words, each user would like to clear its queue within a specific maximum time frame ($\tau_i^{\max}$) to achieve the required quality of service level. Consequently, the transmission power should be allocated such that the current queue is cleared before the end of the time frame. Noise is estimated from the channel state information (CSI), while $\tau_i^{\max}$ can be computed from several factors such as the delay sensitivity of the communication.

3.3 Reinforcement learning

3.3.1 Action representation

Reinforcement learning provides a formal framework concerned with how agents take actions in an environment so as to maximize some notion of cumulative reward. Formally, RL defines a set of actions $A$ that an agent takes to achieve its goal; a set of states $S$ that represents the agent's knowledge of the current environment; and a reward function $R$ from which an optimal policy is learned to guide the agent's actions based on its states.

Table 1. Q-table description and rewards for each state-action

| State | Channel condition | Reward (best action) | Reward (wrong action) |
|-------|-------------------|----------------------|-----------------------|
| [0,0] - 1 | Very poor | 5 (spectrum handoff) | 1 (transmit) |
| [0,1] - 2 | Poor | 5 (spectrum handoff) | 1 (transmit) |
| [1,0] - 3 | Poor | 5 (spectrum handoff) | 1 (transmit) |
| [1,1] - 4 | Very good | 10 (transmit) | -5 (handoff) |
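The reward structure of Table 1 can be encoded directly; reading each state as a (channel status bit, channel condition bit) pair is our interpretation of the table, and the action names are ours:

```python
# Rewards from Table 1, keyed by state pair.
REWARDS = {
    (0, 0): {"handoff": 5,   "transmit": 1},   # state 1: very poor
    (0, 1): {"handoff": 5,   "transmit": 1},   # state 2: poor
    (1, 0): {"handoff": 5,   "transmit": 1},   # state 3: poor
    (1, 1): {"transmit": 10, "handoff": -5},   # state 4: very good
}

def reward(state, action):
    """Look up the Table 1 reward for a state-action pair."""
    return REWARDS[state][action]
```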

3.3.2 Reward representation

The agent transforms its current state according to a set of actions. The action space is given by $\{\varnothing, 1, \dots, M\}$, and each action makes a discrete change relative to the current state. The reward function $r(a, s \to s')$ is defined for an agent that takes action $a$ to move from state $s$ to $s'$.

$r\left( s,a,s' \right)=\begin{cases} 0 & \text{channel state detection} \\ E\left[ \sum\limits_{t=0}^{\infty}{\gamma^{t}r_{t}} \right] & \text{otherwise} \end{cases}$ (14)

where $\gamma \in \left( 0,1 \right)$ is the discount factor for future rewards. With the action set, state set and reward function defined, deep Q-learning is applied to learn the optimal policy. A masking design, applied after the trigger action is taken, allows effective detection of multiple instances of the same class. Finally, a post-hoc fuzzy rule set is applied to all windows in the trajectory to boost performance.

3.3.3 Policy representation

During final runtime use, i.e., once learning has completed, the actor associated with the highest predicted Q-value is selected and its proposed action is applied for the given state.

${{\nu }^{*}}=\arg \,\max {{Q}_{\nu }}\left( s \right)$ (15)

$\pi \left( s \right)={{X}_{{{\nu }^{*}}}}\left( s \right)$ (16)

The inputs are standardized before being used by the network. The mean and standard deviation for this standardization are determined using data gathered during an initial random action sequence. The outputs are likewise followed by the inverse of a similar transformation, so as to let the network learn standardized outputs. We apply the following transformation:

$X_{\nu}\left( s \right)=\sum{\bar{X}\left( s \right)}+\beta_{\nu}$ (17)

where $\bar{X}\left( s \right)$ is the output of each actor subnetwork and remains approximately within $[-1, 1]$.

Reinforcement learning algorithm steps

i. Set $Q(i,a)=0$, $V(i,a)=0$, $\forall i\in S, a\in A(i)$.

ii. Set $k=0$ and choose $k_{\max}$ and the constant $A$.

iii. Determine the initial state $i$.

iv. Perform action $a$, observe the resulting state $j$ and reward $r(i,a,j)$, and do the following updates: $V(i,a)\leftarrow V(i,a)+1$, $\alpha = A/V(i,a)$.

v. Update the Q-factor of state $i$ and action $a$ as: $Q(i,a)\leftarrow (1-\alpha)Q(i,a)+\alpha \left[ r(i,a,j)+\lambda \max_{b\in A(j)}Q(j,b) \right]$.

vi. Set $k=k+1$, $i=j$. If $k\le k_{\max}$, go back to step iv; otherwise go to step vii.

vii. Compute the decision at every state as: $a^{*}(i)\in \arg\max_{b\in A(i)}Q(i,b)$.

In the reinforcement learning algorithm, $Q(i,a)$ is the state–action value, $V(i,a)$ is the number of times action $a$ has been selected at state $i$, $S$ is the set of states, and $A(i)$ is the set of admissible actions at state $i$. $k_{\max}$ is the maximum number of learning iterations, which equals the number of fuzzy combination rules.
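A minimal implementation of steps i-vii, with caller-supplied environment stubs `step(i, a)` (returns the next state) and `reward(i, a, j)`; the uniform-random exploration policy and the constants are illustrative choices, not specified by the paper:

```python
import random

def q_learning(states, actions, step, reward, k_max=1000, A=1.0, lam=0.9):
    """`states` is a list; `actions[i]` lists the admissible actions at i."""
    Q = {(i, a): 0.0 for i in states for a in actions[i]}     # step i
    V = {(i, a): 0 for i in states for a in actions[i]}
    i = random.choice(states)                                 # step iii
    for _ in range(k_max):                                    # steps iv-vi
        a = random.choice(actions[i])                         # exploratory action
        j = step(i, a)
        r = reward(i, a, j)
        V[(i, a)] += 1
        alpha = A / V[(i, a)]                                 # decaying step size
        target = r + lam * max(Q[(j, b)] for b in actions[j])
        Q[(i, a)] = (1 - alpha) * Q[(i, a)] + alpha * target  # step v
        i = j
    # Step vii: greedy decision at every state.
    return {i: max(actions[i], key=lambda b: Q[(i, b)]) for i in states}
```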

Figure 4. Reinforcement learning flow diagram

4. Results

In order to evaluate the performance of the proposed CR scheme, we perform extensive simulation experiments in a heterogeneous spectrum environment with single channels and multiple channels. The simulation experiments were carried out with a discrete event simulation (MATLAB) tool to analyze the total and the marginal delay of the secondary users for both scenarios.

In general, the average total service time of the system in the simulation experiments can be estimated using the following formula:

$E[T]=E[X_{s}]+E[N]\cdot \frac{\sum{\text{Handoff Delays}}}{\text{Number of Interruptions}}+T_{sw}+T_{ha}$ (18)

where $T_{ha}$ is the handshaking time and $T_{sw}$ represents the switching delay, both of which are assumed to be negligible, while $E[X_s]$ and $E[N]$ refer to the mean service time and the average number of interruptions of the secondary users, respectively. Thus, $E[N]$ can be written as:

$E[N]=\frac{\text{Number of Interruptions}}{\text{Number of SU Arrivals}}$ (19)
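Both quantities reduce to simple counter arithmetic over the simulation trace; a sketch under our own naming, with the switching and handshaking terms defaulting to zero as assumed above:

```python
def average_total_service_time(service_times, handoff_delays,
                               n_interruptions, n_su_arrivals,
                               t_sw=0.0, t_ha=0.0):
    """Estimate E[T] from simulation counters via Eqs. (18)-(19)."""
    e_xs = sum(service_times) / len(service_times)        # mean service time E[X_s]
    e_n = n_interruptions / n_su_arrivals                 # Eq. (19)
    mean_handoff = sum(handoff_delays) / n_interruptions  # delay per interruption
    return e_xs + e_n * mean_handoff + t_sw + t_ha        # Eq. (18)
```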

In general, for the simulation experiments, the total average waiting time spent in the queue before receiving service can be computed analogously. The simulation parameters are listed in Table 2.

Table 2. Simulation parameters

| Parameter | Value |
|-----------|-------|
| Number of SUs | 5 |
| Number of PUs | 3 |
| Available data channels | 4 |
| Data rate | 720 kbps |
| Packet size | 1500 bytes |
| PU arrival rate | 0.05-0.30 |
| SU arrival rate | 1.0-6.0 |
| PU service rate | 0.60 |
| SU service rate | 0.40 |
| PU arrival rate increment | 0.05 |

As shown in Figure 5, an experiment was conducted to investigate the performance of the new spectrum handoff scheme, implemented using reinforcement learning, against the existing non-switching handoff scheme. The comparison was made in terms of average handoff delay against the primary users' arrival rate. The results demonstrate that the proposed reinforcement learning based scheme outperforms the non-switching handoff schemes.

Figure 5. Average handoff delay with respect to PUs arrival rate

The average waiting delay in the queue is plotted against the secondary users' arrival rate, as can be seen in Figure 6. As expected, as the SU arrival rate increases, the corresponding delay increases as well. Moreover, the graphs show that in the simulation results our proposed scheme outperforms the existing non-switching handoff scheme.

Figure 6. Average waiting delay with respect to SUs arrival rate

The results in Figure 7 show that as the SUs' arrival rates increase, the packet errors increase as well, which in turn increases the packet error rate in the queue for the SUs. From the graphs, the packet error rate of our proposed scheme is lower than that of the existing non-switching handoff schemes.

Figure 7. Packet Error Rate with respect to SUs arrival rate

This holds because, in the separate queue case, it is possible to find some interrupted users waiting for service in one queue while other channels are in the idle state, or a queue may hold only secondary users in the low-priority queue waiting for service. Clearly this wastes some transmission opportunities for the interrupted users, which leads to an increase in the handoff delay and in the total service time as well. On the other hand, in the shared queue case, interrupted users in the common queue are served by the first channel that becomes idle. Moreover, no secondary user is served while any interrupted user remains in the system. Obviously, the total service time is lower when using a shared queue compared with separate queues.

Figure 8. Total Service Time with respect to SUs arrival rate

As can be seen from Figure 8, a comparison study was conducted to investigate the performance of the new spectrum handoff scheme, implemented using reinforcement learning, against the existing non-switching handoff scheme. The comparison was made in terms of total service time against the primary users' arrival rate. The results demonstrate that the proposed reinforcement learning based scheme outperforms the non-switching handoff schemes.

Figure 9. Total packet error rate with respect to SUs delay constraint

As can be seen from Figure 9, a simulation study was conducted to investigate the total packet error rate of the reinforcement learning based scheme with respect to the secondary users' delay constraints. The comparison was made across different delay deadlines. The results show that the packet error rate is zero when there is no delay constraint.

The proposed handoff scheme efficiently reduces the PU handoff delay and the SU (Secondary User) waiting delay caused by spectrum handoff. In addition, we consider the effects of channel conditions in terms of packet error rate (PER) with respect to the arrival rate of the SUs. To improve the quality of service, it is important to consider both the handoff delay and the transmission channel quality when choosing channels for spectrum handoff.

5. Conclusion

The proposed reinforcement learning based handoff method uses NOMA and multiuser CR, taking advantage of both in terms of spectrum usage. Moreover, by suitably combining NOMA with multicast CR networking, further performance improvement in terms of high efficiency can be achieved. The proposed handoff method aims to reduce the SU (Secondary User) handoff delay caused by spectrum handoff. In this paper we considered the effects of channel conditions in terms of packet error rate (PER). To improve the quality of service of applications, it is important to consider both the handoff delay and the transmission channel quality when choosing channels for spectrum handoff. The results were evaluated based on the overall waiting time of all SU connections and the total service time of all PU connections.

References

[1] Liao Y, Song L, Han Z, Li Y. (2015). Full duplex cognitive radio: A new design paradigm for enhancing spectrum usage. IEEE Communications Magazine 53(5): 138-45. https://doi.org/10.1109/MCOM.2015.7105652

[2] Arafat AO, Al-Hourani A, Nafi NS, Gregory MA. (2017). A survey on dynamic spectrum access for LTE-advanced. Wireless Personal Communications 97(3): 3921-41. https://doi.org/10.1007/s11277-017-4707-0

[3] Hawa M, AlAmmouri A, Alhiary A, Alhamad N. (2017). Distributed opportunistic spectrum sharing in cognitive radio networks. International Journal of Communication Systems 30(7): 20-25. https://doi.org/10.1002/dac.3147

[4] Raschellà A, Umbert A. (2016). Implementation of cognitive radio networks to evaluate spectrum management strategies in real-time. Computer Communications 79: 37-52. https://doi.org/10.1016/j.comcom.2015.11.009

[5] Kumar R, Darak SJ, Sharma AK, Tripathi R. (2016). Two-stage decision making policy for opportunistic spectrum access and validation on USRP testbed. Wireless Networks 1-5. https://doi.org/10.1007/s11276-016-1420-y

[6] Pérez-Romero J, Raschellà A, Sallent O, Umbert A. (2016). A belief-based decision-making framework for spectrum selection in cognitive radio networks. IEEE Transactions on Vehicular Technology 65(10): 8283-96. https://doi.org/10.1109/TVT.2015.2508646

[7] Jia M, Liu X, Gu X, Guo Q. (2017). Joint cooperative spectrum sensing and channel selection optimization for satellite communication systems based on cognitive radio. International Journal of Satellite Communications and Networking 35(2): 139-50. https://doi.org/10.1002/sat.1169

[8] Hassan MR, Karmakar GC, Kamruzzaman J, Srinivasan B. (2017). Exclusive use spectrum access trading models in cognitive radio networks: A survey. IEEE Communications Surveys & Tutorials 19(4): 2192-231. https://doi.org/10.1109/COMST.2017.2725960

[9] Kumar K, Prakash A, Tripathi R. (2016). Spectrum handoff in cognitive radio networks: A classification and comprehensive survey. Journal of Network and Computer Applications 61: 161-88. https://doi.org/10.1016/j.jnca.2015.10.008

[10] Chen YS, Cho CH, You I, Chao HC. (2011). A cross-layer protocol of spectrum mobility and handover in cognitive LTE networks. Simulation Modelling Practice and Theory 19(8): 1723-44. https://doi.org/10.1016/j.simpat.2010.09.007

[11] Dixit S, Periyalwar S, Yanikomeroglu H. (2013). Secondary user access in LTE architecture based on a base-station-centric framework with dynamic pricing. IEEE Transactions on Vehicular Technology 62(1): 284-96. https://doi.org/10.1109/TVT.2012.2221753

[12] Ramzan MR, Nawaz N, Ahmed A, Naeem M, Iqbal M, Anpalagan A. (2017). Multi-objective optimization for spectrum sharing in cognitive radio networks: A review. Pervasive and Mobile Computing 41: 106-31. https://doi.org/10.1016/j.pmcj.2017.07.010

[13] Mohamedou A, Sali A, Ali B, Othman M, Mohamad H. (2017). Bayesian inference and fuzzy inference for spectrum sensing order in cognitive radio networks. Transactions on Emerging Telecommunications Technologies 28(1): 12-15. https://doi.org/10.1002/ett.2916

[14] El Tanab M, Hamouda W. (2017). Resource allocation for underlay cognitive radio networks: A survey. IEEE Communications Surveys & Tutorials 19(2): 1249-76. https://doi.org/10.1109/COMST.2016.2631079

[15] Park JS, Yoon H, Jang BJ. (2016). SDR-based frequency interference analysis test-bed considering time domain characteristics of interferer. Advanced Communication Technology (ICACT) 517-521. https://doi.org/10.1109/ICACT.2016.7423453

[16] Castro-Hernandez D, Paranjape R. (2017). Optimization of handover parameters for LTE/LTE-A in-building systems. IEEE Transactions on Vehicular Technology. 5260–5273. https://doi.org/10.1109/TVT.2017.2711582

[17] Won SH, Cho S, Shin J. (2017). Virtual antenna mapping MIMO techniques in a massive MIMO test-bed for backward compatible LTE mobile systems. Advanced Communication Technology (ICACT) 971-978. https://doi.org/10.23919/ICACT.2017.7890253

[18] Malkowsky S, Vieira J, Liu L, Harris P, Nieman K, Kundargi N, Wong IC, Tufvesson F, Öwall V, Edfors O. (2017). The world’s first real-time testbed for massive MIMO: Design, implementation, and validation. IEEE Access 5: 9073-88. https://doi.org/10.1109/ACCESS.2017.2705561

[19] Joseph SD, Manoj S, Waghmare C, Nandakumar K, Kothari A. (2017). UWB sensing antenna, reconfigurable transceiver and reconfigurable antenna based cognitive radio test bed. Wireless Personal Communications 96(3): 3435-62. https://doi.org/10.1007/s11277-017-4117-3

[20] Alam M, Trapps P, Mumtaz S, Rodriguez J. (2017). Context-aware cooperative test bed for energy analysis in beyond 4G networks. Telecommunication Systems. 64(2): 225-44. https://doi.org/10.1007/s11235-016-0171-5

[21] Marojevic V, Nealy R, Reed JH. (2017). LTE spectrum sharing research testbed: integrated hardware, software, network and data. arXiv preprint arXiv:1710.02571. https://doi.org/10.1145/3131473.3131484

[22] Hematian A, Nguyen J, Lu C, Yu W, Ku D. (2017). Software defined radio testbed setup and experimentation. In Proceedings of the International Conference on Research in Adaptive and Convergent Systems Sep 20, pp. 172-177. https://doi.org/10.1145/3129676.3129690