A comparative analysis: optimal node selection in large data block transmission in VANET using various node relay optimization algorithms

VANETs (Vehicular Ad-hoc Networks) are a kind of Mobile Ad-hoc Network (MANET) with increased mobility as well as occasional geographic changes. Clustering is used in VANETs to divide networks into groups of adaptable vehicles in order to optimize routing and information gathering. Large data block transmission comprises several stages, and one of the most important is optimal relay node selection: the features selected at this stage make the subsequent data transmission stage much more effective. This stage also brings energy optimization and improves the effectiveness of the optimal path for data transmission. This paper presents a comparative analysis of optimal relay node selection. We use the Linear Scaling based Red Colobuses Monkey technique and contrast it with other state-of-the-art models over various measures. The analysis shows that the proposed optimal relay node selection technique improves the data transmission model.


Introduction
The Vehicular Ad hoc Network (VANET), which aims to enable seamless Internet connectivity between vehicles, serves as the backbone for Intelligent Transportation Systems (ITSs). Owing to developments in intelligent vehicles and new-generation wireless communication protocols, vehicles equipped with wireless interfaces can supply ITS services [1] such as traffic monitoring, vehicle navigation, nearby data services, and mobile vehicular cloud computing.
Because of the high mobility and unequal spatial distribution of vehicles in VANETs, establishing a reliable network and managing communication is the most difficult challenge (Figure 1). Clustering has proven to be a promising technique for improving routing reliability and scalability by grouping comparable vehicles into virtual clusters [2]. Each cluster contains a leading vehicle called the cluster head, which is in charge of the cluster's communication. Vehicles within a cluster contact one another through intra-cluster communication. Vehicles from different clusters interact with each other through their cluster heads [3].
Recently, Vehicular Ad hoc Networks (VANETs) have been investigated as a viable option for establishing a long-term virtual network for IoT-based data sharing and transfer applications [4]. The opportunistic network, for instance, permits vehicles to retain data and transfer it to other linked vehicles using their low-power communication abilities.
For example, VANETs are frequently used to share information with all neighbouring devices via broadcast [4][5][6][7][8][9]. However, there are various obstacles to delivering data from a source to a destination, including uncertain vehicle motion, constrained storage, broadcast storms, and network congestion. The vehicles used throughout the procedure are primarily volunteers that are not guided by data sources or destinations, so they move at random. The key problem is determining how to use the available data, including the number of vehicles within every vehicle's range, their position, and/or their speed, to select vehicles that can transport the messages without altering the destination in any manner. A second barrier is the ability to adapt the protocols to the nature of the environment, such as vehicle density and the distance between vehicles. Addressing these issues, this research provides a comparative analysis of node selection methods that improve and optimize the data transmission path in VANETs.

Key highlights
This paper focuses on a comparative analysis of optimal node selection, with the following key objectives:
• A comparative analysis is carried out between various state-of-the-art models and the Linear Scaling based Red Colobuses Monkey technique.
• The proposed optimal node selection technique outperforms the other models over various measures and improves the optimization and energy consumption of data transmission.

Organization of the paper:
Section 1 gave an overview of VANETs and their respective stages. Section 2 presents the literature review, Section 3 discusses the methodology, Section 4 presents the performance analysis, and Section 5 concludes the paper.

Literature review
Usha and Ramakrishnan [10] combined the improved Optimized Link State Routing Protocol (MMPR-OLSR) with the GSA-PSO (Gravitational Search-Particle Swarm Optimization) scheme and a cognitive radio method to develop an enhanced routing protocol. Vehicular Sensor Networks can benefit from this strategy. The GSA-PSO optimization makes it easier for the MMPR-OLSR protocol to find the best member nodes by employing an optimal search strategy.
Elhosoney and Shankar [11] proposed a K-Medoid clustering approach for clustering vehicle nodes and identifying energy-efficient nodes for effective communication. To achieve energy-efficient communication, a metaheuristic, the Enhanced Dragonfly Algorithm (EDA), which optimizes for the least energy consumption in the VANET, is used to locate efficient nodes in each cluster. Compared with existing systems, the results reveal that V2V communication enhances energy efficiency across every vehicle node while also taking less time to execute.
Manimozhi et al. [12] proposed that real-time High Definition (HD) video streaming be provided along with both entertainment and necessary security information. Although this offers many advantages, it also has many drawbacks that make it difficult. The video streaming quality is determined by temporal as well as spatial parameters. Taking these considerations into account, the study proposes a new protocol that deploys receiver-based relay node selection and congestion control techniques.
For effective communication between vehicles, Gorai and Banerjee [13] offered a robust forwarding node selection mechanism. A path that avoids obstructions is first chosen using the Delaunay Triangulation. The path is then optimized by deleting Torricelli points that were previously within the Delaunay Triangulation.
Gawaw et al. [14] demonstrated a Cross-Layer based Reliable Vehicular Routing model (CL-RVR) that promotes reliable routing in VANETs by combining physical and network layer factors to meet the QoS requirements of targeted applications. All the important performance criteria are analyzed using extensive experimental simulations.

Methodology
Figure 2 shows the overall structure of the suggested system, with the following stages: (a) cluster head selection, where cluster heads are chosen using the Swap Displacement and Reversion - Fertile Field Algorithm (SDR-FFA); (b) clustering vehicles, where SD-Kmeans is used and the clustering stage assists in determining the distance between vehicles and the cluster head. The (c) ideal relay node is then determined using Linear Scaling based Red Colobuses Monkey (LS-RCM) and several other state-of-the-art models to enhance energy efficiency. The Linear Scaling technique is introduced into the Red Colobuses algorithm to increase selection performance. The relay node is chosen from among the vehicles. After that, the useful features are extracted, such as Speed, Distance, and Link Residual Energy.
Once these features have been extracted, they are passed to the Tanh-Adaptive Neuro-Fuzzy Inference System (Th-ANFIS) for (d) optimal path discovery. The tanh activation function is used in the existing ANFIS algorithm to minimize training errors. If there are any problems with the transmission timing, the backup path is considered. To improve energy economy and transmission time, the large files are (e) divided into blocks and encoded using the Base64 technique. The sender receives a chunk after sending the encoded blocks; if the sender does not receive the chunk, the connection is considered lost. As a result, the backup path is chosen for retransmission. Finally, all of the split encoded file blocks are reassembled and decoded using Base64.
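Stage (e) can be illustrated with a short sketch. The Python code below is a minimal illustration, assuming fixed-size blocks and in-order delivery; the function names and the block size of 1024 bytes are hypothetical choices for the example and are not taken from the proposed system.

```python
import base64

def split_and_encode(data: bytes, block_size: int) -> list[str]:
    """Split the payload into fixed-size blocks and Base64-encode each one."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [base64.b64encode(block).decode("ascii") for block in blocks]

def decode_and_reassemble(encoded_blocks: list[str]) -> bytes:
    """Decode every Base64 block and concatenate the results in order."""
    return b"".join(base64.b64decode(block) for block in encoded_blocks)

# 10 240 bytes standing in for a large file (illustrative data only).
payload = bytes(range(256)) * 40
encoded = split_and_encode(payload, block_size=1024)
restored = decode_and_reassemble(encoded)
```

Each block is decoded independently, so a block that has to be resent over the backup path can be handled without touching the others. Base64 text is roughly one third larger than the raw bytes it encodes, which is worth keeping in mind when choosing the block size.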
The stages of data transmission in a VANET are described below; the optimal relay node selection stage and its associated approaches are the focus of this work.

Clustering vehicles
As illustrated in Figure 3, the Cluster Head (CH) performs an important function throughout the formation of a cluster in VANET clustering. Based on the input metrics, a cluster can be constructed in a variety of ways. Cluster Members (CMs) are the vehicles that make up a cluster. Apart from the CH and CMs, Cluster Gateways (CGs) are CMs used to interact with other clusters on behalf of the CH. Unless otherwise specified, every cluster member is referred to as a CM. A cluster can have one CH, zero, one, or two CGs, and any number of CMs. Within a VANET cluster, the cluster head serves as a mobile router, whereas cluster members serve as mobile nodes. The CG plays a role between the CH and the CMs. The cluster is constructed using parameters such as the vehicle's average relative velocity, acceleration, position, direction, vehicle degree, vehicle density, transmission range, and so on. The cluster head is chosen from among the most stable participating vehicles. The remaining vehicles become CMs and join the cluster [15,16]. As a result, CH selection is integrated into the cluster creation method, and no additional CM selection criteria are required. The CH and the CMs each retain a routing table containing data about the cluster's CH and CMs for intra-cluster communication. A CM, on the other hand, does not maintain other clusters' routing tables; these are tracked by the CH if necessary.
As a result, a large network is defined as a collection of small networks or clusters. On this basis, an ideal cluster head is chosen using the Swap Displacement and Reversion Fertile Field Algorithm (SDR-FFA) to increase cluster performance. In addition, to improve clustering accuracy, the supremum distance is incorporated into the K-Means algorithm.

Cluster head selection
The Swap Displacement and Reversion Fertile Field Algorithm (SDR-FFA) was designed for cluster head selection in the proposed system. Positions are modified using the Swap, Displacement and Reversion technique to improve the fertility evaluation in the fertile field algorithm. Plants have a long history on this planet. Their evolved survival pattern can serve as a source of inspiration for the development of an evolutionary optimization algorithm. Plants can adapt to a variety of climate conditions to survive and thrive. Seeding is the most prevalent way of reproduction for all plants. Some seeds fall under the plants during the seeding process, while others are disseminated to other zones by natural forces such as wind and animals. In this way, adequate growth and development possibilities are supplied for new plants in various portions of the field [17,18].
Fertility has a direct relationship with plant development in the field. However, the fertility of all field zones is rarely equal. When a seed is planted in a fertile zone of the field, it has access to all of the necessary conditions for growth and development. As a mature plant, it may contribute seeds or pollination beyond the growth period. New plants can then be produced in the fertile zones; if the area is not fertile, the seed will not be able to grow fast enough to reproduce and will be wasted. As a result, it is naturally eliminated from the life development cycle. After consecutive generations of fresh plant growth, plants are denser in the more productive locations. An evolutionary optimization method can be created using this natural pattern as inspiration: the distribution of plants in different sections of the field is replicated according to the fertility levels of the zones. In the proposed procedure, the fertility of a point in the field is taken to be equal to the objective function value at that point. As a nature-inspired optimization method, the Fertile Field algorithm comprises (a) the initial seeding procedure, (b) fertility evaluation of the spots based on the objective function, (c) seed regeneration, (d) the seed dispersal process, and (e) convergence criteria evaluation.
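The five stages (a) to (e) can be pictured with a deliberately simplified, one-dimensional sketch. It is not the SDR-FFA itself: the swap, displacement and reversion operators are omitted, and the survivor fraction, the Gaussian dispersal, the tolerance and all names (`fertile_field_minimize`, `spread`) are assumptions made only for illustration. Fertility is taken as a lower objective value.

```python
import random

def fertile_field_minimize(objective, bounds, n_seeds=20, n_generations=50,
                           spread=0.1, tolerance=1e-8, rng=None):
    """Toy 1-D seed-based search following stages (a)-(e); assumptions throughout."""
    rng = rng or random.Random(0)
    low, high = bounds
    # (a) Initial seeding: scatter seeds uniformly over the field.
    seeds = [rng.uniform(low, high) for _ in range(n_seeds)]
    best = min(seeds, key=objective)
    for _ in range(n_generations):
        # (b) Fertility evaluation: a lower objective value means a more fertile spot.
        seeds.sort(key=objective)
        survivors = seeds[: n_seeds // 2]
        # (c) + (d) Regeneration and dispersal: each survivor drops one seed nearby.
        offspring = []
        for parent in survivors:
            child = parent + rng.gauss(0.0, spread * (high - low))
            offspring.append(min(max(child, low), high))  # keep the seed in the field
        seeds = survivors + offspring
        candidate = min(seeds, key=objective)
        if objective(candidate) < objective(best):
            best = candidate
        # (e) Convergence check (assumed criterion: objective below a tolerance).
        if objective(best) < tolerance:
            break
    return best

result = fertile_field_minimize(lambda x: (x - 3.0) ** 2, bounds=(-10.0, 10.0))
```

Because the survivors are carried over unchanged, the best spot found so far is never lost, which mirrors the observation that plants become denser in the more fertile zones over successive generations.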

Clustering using SD-KMEANS
Bansal et al. [10] offer a k-means-based clustering approach to separate the vehicles into clusters. Three parameters are considered to form the clusters: x dimension and y dimension, i.e. the position of the vehicles. The number of clusters is specified as an input, and the vehicles are divided into clusters using a modified k-means method. The cluster's centroid, as well as various security concerns, are used to select the CH. To boost security, the packets are encrypted or decrypted using a hashing algorithm. Following the selection of the centroid as CH, the remaining vehicles join the cluster as CMs [19]. Since the clusters in intelligent clustering with the k-means algorithm cannot overlap, there is no need for a separate maintenance phase. The suggested technique improves PDR and throughput, while increasing routing overhead, when compared to the original k-means algorithm. Nevertheless, the number of clusters is supplied as an input to this algorithm, whereas the density and the number of vehicles may change across scenarios. As a result, the number of clusters must be treated as an independent variable that can rise or fall in response to vehicle density and the number of vehicles. Clustering is also done using the k-means method in [20], which takes into account distance and direction, as well as message size, validity, and type. Two kinds of control techniques are employed: open-loop and closed-loop. Closed-loop techniques regulate congestion after it has been detected, whereas open-loop solutions avoid congestion before it occurs. Rather than clustering vehicles, the RSU clusters messages, taking the message characteristics, the number of clusters, and the number of iterations as input. The number of clusters, however, is allotted in advance, and the initial centroids are assigned on a first-come, first-served basis, which is unsuitable for cluster stability and longevity. As a result, SD-KMeans clustering is presented below, along with its algorithm.
Algorithm 2. SD-KMEANS algorithm
Step 1: Select the cluster heads randomly.
Step 2: Using the supremum distance metric (equation 1), compute the distance between the vehicles and the cluster heads.
Step 3: Each vehicle point is assigned to the cluster head at the shortest distance from it.
Step 4: The new point of the ith cluster is calculated as the mean of its vehicle points, $c_i = \frac{1}{N}\sum_{j=1}^{N} x_j$, where $x_j$ are the vehicle points of the ith cluster and N denotes the number of vehicle points in the ith cluster.
Step 5: Each vehicle point's distance to the newly obtained point is recalculated.
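Steps 1 to 5 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the supremum distance is the usual Chebyshev distance $d(p, q) = \max_k |p_k - q_k|$, that the new point in Step 4 is the mean of the cluster's points, and that iteration stops when the heads no longer move. The parameter `initial_heads` is a hypothetical convenience for reproducible runs.

```python
import random

def supremum_distance(p, q):
    """Supremum (Chebyshev) distance: the largest coordinate-wise difference."""
    return max(abs(a - b) for a, b in zip(p, q))

def sd_kmeans(points, k, initial_heads=None, max_iterations=20, seed=0):
    """K-means with the supremum distance; points are tuples of coordinates."""
    # Step 1: select the cluster heads randomly unless they are supplied.
    heads = list(initial_heads) if initial_heads else random.Random(seed).sample(points, k)
    clusters = []
    for _ in range(max_iterations):
        # Steps 2-3: assign every vehicle point to its nearest cluster head.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: supremum_distance(p, heads[i]))
            clusters[nearest].append(p)
        # Step 4: the new point of each cluster is the mean of its members.
        new_heads = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else heads[i]
            for i, members in enumerate(clusters)
        ]
        # Step 5: distances are recomputed on the next pass; stop once heads are stable.
        if new_heads == heads:
            break
        heads = new_heads
    return heads, clusters

vehicles = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
heads, clusters = sd_kmeans(vehicles, k=2, initial_heads=[(0, 0), (10, 10)])
```

As with any k-means variant, the outcome depends on the initial heads; a poor random start can leave the algorithm in a weaker grouping, which is one reason the source pairs clustering with a separate cluster head selection stage.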

Linear scaling based red colobuses monkey (LS-RCM)
The RCM algorithm mimics the behaviour of red colobus monkeys. To imitate their interactions, each group of monkeys must move over the search space [27].

Position update.
The position of each red monkey in a group is updated depending on the position of the group's best red monkey. This behaviour is expressed in terms of the following quantities:
• PB denotes the monkey's body power (a number between −5 and 5);
• PA indicates the monkey's battle power (a number between 0 and 1);
• W leader signifies the leader's weight;
• W i specifies the monkey's weight, taken as a random value;
• X i symbolizes the red monkey's position;
• X best represents the leader's position;
• rand denotes a random number in the range [0, 1].
The position of the red monkey's children is updated in terms of the following quantities:
• PBch stands for the child's body power rate;
• PAch represents the child's fighting power rate;
• WCh leader indicates the weight of the leader's child;
• Wchi represents the child's weight, with all weights taken as random values in the range [4, 6];
• Xch indicates the child's position;
• Xchbest defines the position of the leader child;
• rand denotes a random number in the range [0, 1].
In addition, this update must be repeated in every iteration.

Artificial-bee colony
The artificial honey bee colony in ABC is split into 3 categories: employed bees, onlooker bees, and scout bees. The employed bees exploit food sources, the onlookers select food sources depending on the likelihood of continued exploitation, and the scout bees' principal goal is to identify another food source once the search has been trapped in a local optimum. The position of a random food source corresponds to a stochastic solution of the optimization problem, and the amount of nectar of the food source represents the fitness value. The number of employed bees is equal to the number of onlooker bees; there is only one scout bee. Suppose that in D-dimensional space, SN indicates the number of food sources, and Xi = (Xi1, Xi2, . . . , XiD), i ≤ SN, is the position of the ith food source. The following steps are part of the process of using an artificial bee colony to search for the best food sources.
In the basic ABC algorithm, every employed bee and onlooker bee generates a new food source in the vicinity of its current position using the search equation. An onlooker bee chooses a food source according to the roulette wheel selection strategy throughout the search phase, and the employed bees share information about the food sources. The selection probability of a food source is computed from $Fit(V_{i,j})$, the fitness value of the food source for the link $V_{i,j}$, where the Fit function compares the food source i, j with the objective function. If a food source is not improved within the maximum number of trials, the scout bee generates a new one:
$x_{i,j} = l_{i,j} + \mathrm{rand}(0, 1)\,(u_{i,j} - l_{i,j})$
where $l_{i,j}$ and $u_{i,j}$ denote the lower and upper limits of $x_{i,j}$, respectively. The objective function can be characterized by the equation used when the ABC algorithm is applied to hub restriction.
When addressing the issue of hub restriction, the ABC technique is an iterative technique for searching for an optimal solution within the solution space, and it eliminates the dependency on the initial value.
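The two mechanisms named above, roulette wheel selection by the onlooker bees and re-initialization by the scout bee, can be sketched as follows. The selection probability is assumed here to be the fitness of a source divided by the total fitness, which is the usual roulette wheel form; the source does not reproduce its own expression, so this is an assumption, as are the function names.

```python
import random

def selection_probabilities(fitness):
    """Assumed roulette wheel form: each probability is fitness / total fitness."""
    total = sum(fitness)
    return [f / total for f in fitness]

def roulette_select(probabilities, rng):
    """Return the index of the food source picked by one spin of the wheel."""
    r = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return index
    return len(probabilities) - 1  # guard against floating-point rounding

def scout_reset(lower, upper, rng):
    """Scout step: x_j = l_j + rand(0, 1) * (u_j - l_j) for every dimension j."""
    return [l + rng.random() * (u - l) for l, u in zip(lower, upper)]

rng = random.Random(1)
probs = selection_probabilities([1.0, 3.0])
chosen = roulette_select(probs, rng)
new_source = scout_reset([0.0, -1.0], [1.0, 1.0], rng)
```

Sources with higher fitness occupy a larger slice of the wheel and are therefore visited more often by onlookers, while the scout step restores diversity by drawing a fresh position uniformly within the limits.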

PSO
PSO is a metaheuristic swarm intelligence (SI) technique that optimizes using a stochastic population. It is based on real-life social behaviour such as fish schooling and birds flocking while searching for food. It imitates the physical movements of individual particles in a swarm, where each particle is led by its own best position as well as the best position of the whole swarm. It compares the current solution with all of its adjacent neighbours to identify the optimal solution in each iteration. Each swarm particle moves according to its personal best and the global best.
The packet is forwarded to the neighbouring vehicle with the highest fitness value, which is picked as the next forwarding vehicle. The operation is repeated iteratively until the packet reaches the destination vehicle Dv. Each particle's fitness is determined from its position and velocity [15,16]. The fitness function is given by Eq. (1).
Here w1, w2 are weight parameters, ni, nj are vehicles, dist(ni, nj) is the Euclidean distance between nearby vehicles, and speed(ni, nj) is the ratio of the average speed of the vehicles to the maximum speed. Each particle is treated as a vehicle, with its position referred to as the particle position and determined using Eq. (1). V1 and V2 indicate the particle's velocity, which is given in Eq. (2).
Here p(t + 1) represents the particle's updated position, V1(t + 1) denotes the particle's updated velocity in the current iteration, V1(t) denotes the particle's velocity in the prior iteration, w is the inertia weight, c1 and c2 are acceleration coefficients, r1 and r2 are uniform random numbers in the range 0-1, xdiff specifies the particle's previous best position (local best), and Xdiff g denotes the swarm's best position (global best). Eq. (3) is used to determine a vehicle's speed.
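The velocity and position updates described above can be illustrated with the canonical PSO form, $v(t+1) = w\,v(t) + c_1 r_1 (p_{best} - x) + c_2 r_2 (g_{best} - x)$ and $x(t+1) = x(t) + v(t+1)$. This is a sketch of the standard update, assumed to correspond to Eq. (2) of the source; the default values of `w`, `c1` and `c2` and the function name are illustrative assumptions.

```python
import random

def pso_step(position, velocity, personal_best, global_best,
             w=0.7, c1=1.5, c2=1.5, rng=None):
    """One canonical PSO update for a single particle (lists of floats)."""
    rng = rng or random.Random(0)
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()  # uniform random numbers in [0, 1)
        # Inertia term + pull towards the local best + pull towards the global best.
        v_next = w * v + c1 * r1 * (pb - x) + c2 * r2 * (gb - x)
        new_velocity.append(v_next)
        new_position.append(x + v_next)
    return new_position, new_velocity

# A particle already sitting on both bests only keeps its damped momentum.
pos, vel = pso_step([1.0], [1.0], personal_best=[1.0], global_best=[1.0])
```

In the routing setting of the source, each particle stands for a candidate vehicle, so the position with the highest fitness after the update would indicate the next forwarding vehicle.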

Gravitation search algorithm
GSA is based on Isaac Newton's laws. It is an optimization technique built on the law of gravity and mass interaction, established by Rashedi. GSA treats each agent as an object, and all objects interact with one another through the gravitational force (GF). The performance of each object is measured by its mass [8]. The motion of a lighter object towards a heavier object owing to GF, where the heavier object has the best fitness, corresponds to the ideal move in the search space. The GF "F" is related to the fitness values of the particles in the search space. The distance between two objects in the search space is denoted by "R". The gravitational constant (GC) "G" in generation "k" is expressed in terms of G0 and the exponent "a", both of which are constants initialized at the beginning. "K" denotes the number of value objects present in the search space, while "k" indicates the total amount of value objects visited in the search space to date. Newton's Law of Gravity is used to measure the magnitude of the gravitational force "F"; this law relates the magnitude of the GF "F" to the masses of the particles. Every object's mass is divided into three categories: inertial mass, active gravitational mass (GM), and passive GM. The "Mi" value indicates the object's inertial mass, the "Ma" value denotes the object's active GM, and the "Mp" value signifies the object's passive GM. Fij is the gravitational force with which the active GM Ma of the chosen "object j" acts on the passive gravitational mass Mp of the chosen "object i". The Gravitational Force is calculated as follows:
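For orientation, the standard GSA expressions from the literature can be sketched in code: the gravitational constant decays as $G(k) = G_0\, e^{-a k / K}$ and the force on object i due to object j is $F_{ij} = G\, \frac{M_{p,i}\, M_{a,j}}{R_{ij} + \varepsilon}\,(x_j - x_i)$. These are assumed to match the equations intended by the source; in particular, treating k as the current generation and K as the total number of generations follows the standard formulation and is an assumption here, as is the small constant `eps`.

```python
import math

def gravitational_constant(g0, a, k, k_max):
    """Standard GSA decay of the gravitational constant: G0 * exp(-a * k / K)."""
    return g0 * math.exp(-a * k / k_max)

def gravitational_force(g, m_passive_i, m_active_j, x_i, x_j, eps=1e-12):
    """Force vector acting on object i due to object j.

    R is the Euclidean distance between the two objects; eps avoids a
    division by zero when the objects coincide.
    """
    r = math.dist(x_i, x_j)
    scale = g * m_passive_i * m_active_j / (r + eps)
    return [scale * (b - a) for a, b in zip(x_i, x_j)]

g_start = gravitational_constant(100.0, 20.0, k=0, k_max=50)
g_end = gravitational_constant(100.0, 20.0, k=50, k_max=50)
force = gravitational_force(1.0, 2.0, 3.0, (0.0, 0.0), (3.0, 4.0))
```

The decaying constant makes objects interact strongly in early generations, which favours exploration, and weakly in later ones, which favours fine adjustment around heavy (fit) objects.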

Grey wolf optimization (GWO)
The GWO algorithm can predict the position of the prey as well as the positions of unseen NLOS nodes, letting their positions be estimated and surrounded (localized). It is advantageous for adjusting the position of reference nodes across a network, which take steps to keep moving towards an unknown node. In general, grey wolves (search agents) attack the prey when it stops moving (in terms of the search population, the search solutions no longer change). The vector "vc" is used to quantitatively model this property of the grey wolves (search agents). Based on Equation 12, the random vector ranges between [−vc] and [vc], with the value of vc decreasing from 2 to 0 over the iterations.
If "vc" < 1, the search agents concentrate on the exploitation of the search space. If "vc" > 1, on the other hand, the search agents focus on discovering the effective location of NLOS nodes using reference nodes. The Alpha, Beta, and Delta wolf search agents are used to carry out this procedure of locating unknown NLOS nodes.
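A sketch of the standard GWO mechanics may help here: the control parameter decreases linearly from 2 to 0, a random coefficient is drawn between −vc and vc, and each agent moves to the average of three estimates guided by the Alpha, Beta and Delta wolves. The code follows the usual GWO formulation from the literature and is assumed, not confirmed, to match the source's equations; the function names are illustrative.

```python
import random

def control_parameter(iteration, max_iterations):
    """Linearly decreases from 2 (first iteration) to 0 (last iteration)."""
    return 2.0 * (1.0 - iteration / max_iterations)

def coefficient(vc, rng):
    """Random coefficient in [-vc, vc]; small magnitudes favour exploitation."""
    return 2.0 * vc * rng.random() - vc

def gwo_update(position, alpha, beta, delta, vc, rng):
    """Move one search agent towards the average of three leader-guided estimates."""
    new_position = []
    for x, leaders in zip(position, zip(alpha, beta, delta)):
        estimates = []
        for leader in leaders:
            a_coef = coefficient(vc, rng)
            c_coef = 2.0 * rng.random()
            distance = abs(c_coef * leader - x)
            estimates.append(leader - a_coef * distance)
        new_position.append(sum(estimates) / 3.0)
    return new_position

rng = random.Random(0)
start = control_parameter(0, 100)
end = control_parameter(100, 100)
# With vc = 0 the agent lands exactly on the (coinciding) leaders.
landed = gwo_update([5.0], [1.0], [1.0], [1.0], vc=0.0, rng=rng)
```

As vc shrinks, the coefficient's magnitude falls below 1 and the agents close in on the leaders, which corresponds to the exploitation regime described above.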

Biogeography based optimization
Dan Simon introduced BBO in 2008. Each geographical zone in BBO is characterized by the Habitat Suitability Index (HSI). The geographical distribution of biological species is the driving idea behind BBO. The Suitability Index Variable (SIV) is an additional index used to describe the habitat region and living conditions. The habitat's fitness value is given by the HSI rating and the number of species. The features of a higher-HSI solution are adopted to improve a lower-HSI solution. The habitat's immigration rate is denoted by λ and its emigration rate by μ, as specified by the single-species model. Equations (2) and (3) give the immigration and emigration rates, respectively.
Here I represents the highest immigration rate, k is the count of species inside the habitat, and sn is the maximal count of species within the habitat.
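The linear migration model commonly used with BBO can be sketched as $\lambda_k = I\,(1 - k / s_n)$ and $\mu_k = E\,k / s_n$. This is the standard form from the BBO literature and is assumed to correspond to Equations (2) and (3); the maximum emigration rate E is not named in the text above and is introduced here as an assumption.

```python
def immigration_rate(k, s_n, max_immigration=1.0):
    """Linear model: immigration falls from I to 0 as the habitat fills up."""
    return max_immigration * (1.0 - k / s_n)

def emigration_rate(k, s_n, max_emigration=1.0):
    """Linear model: emigration rises from 0 to E as the habitat fills up."""
    return max_emigration * k / s_n

# Rates for a habitat that can hold at most 10 species.
rates = [(immigration_rate(k, 10), emigration_rate(k, 10)) for k in (0, 5, 10)]
```

A habitat with many species (a high-HSI solution) therefore shares its features readily and accepts few changes, while a sparsely populated habitat does the opposite.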

Dragonfly algorithm
Dragonflies are small predators in the wild, hunting down practically every other tiny organism. Static and dynamic swarming behaviours are at the heart of the DA algorithm's inspiration. These two swarming behaviours are substantially the same as the two main phases of meta-heuristic optimization: exploration and exploitation. Enhanced Function: The Cauchy mutation operator is well known for its ability to escape a local optimum. It is effective at reducing the chances of being trapped in a nearby optimum. When this new mutation probability is applied, the mutation probability of the global best is zero, and it increases as fitness decreases. vk depicts the velocity of the k-th individual.
V− denotes the position of the enemy. V+ denotes the position of the food source. (iii) Updating process: Two vectors are used to update the position of artificial dragonflies within the search space and to imitate their movement: the step (ΔV) and the position (V). The step vector is quite similar to the velocity vector in Particle Swarm Optimization (PSO). The DA method is built upon the structure of the PSO algorithm. The step vector depicts the direction of the dragonfly's movement and is labelled.
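The updating process can be illustrated with the step and position update of the standard dragonfly algorithm from the literature, in which the step combines separation, alignment, cohesion, attraction towards food and distraction from the enemy, plus an inertia term. The behaviour weights `s`, `a`, `c`, `f`, `e`, the inertia weight `w` and their default values are illustrative assumptions; the enhanced (Cauchy mutation) variant is not reproduced.

```python
def dragonfly_step(position, step, neighbours, neighbour_steps, food, enemy,
                   s=0.1, a=0.1, c=0.7, f=1.0, e=1.0, w=0.9):
    """One step/position update of the standard dragonfly algorithm."""
    n = len(neighbours)
    new_position, new_step = [], []
    for d in range(len(position)):
        x = position[d]
        separation = -sum(x - nb[d] for nb in neighbours)        # avoid crowding
        alignment = sum(ns[d] for ns in neighbour_steps) / n     # match neighbour velocity
        cohesion = sum(nb[d] for nb in neighbours) / n - x       # move to neighbourhood centre
        attraction = food[d] - x                                 # towards the food source
        distraction = enemy[d] + x                               # away from the enemy
        delta = (s * separation + a * alignment + c * cohesion
                 + f * attraction + e * distraction) + w * step[d]
        new_step.append(delta)
        new_position.append(x + delta)
    return new_position, new_step

# A lone dragonfly at the origin with food at 1.0 moves straight to the food.
pos, stp = dragonfly_step([0.0], [0.0], neighbours=[[0.0]], neighbour_steps=[[0.0]],
                          food=[1.0], enemy=[0.0])
```

The inertia term `w * step[d]` plays the role that the previous velocity plays in PSO, which is the structural similarity noted above.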

Optimal node selection
The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a hybrid model that integrates neural networks with fuzzy logic. The parameters associated with the membership functions in ANFIS change as the learning process progresses. A gradient vector is used to compute these parameters; it assesses how effectively the fuzzy inference system models the relationship between input and output for a given set of data. After obtaining the vector, an optimization routine can be used to minimize the error. As a result, the Fuzzy Inference System (FIS) is capable of predicting the output for a new input. The block diagram of an Adaptive Neuro-Fuzzy Inference System is shown in Figure 4. ANFIS generates the fuzzy rules automatically and picks the rules with the highest firing strength.
Once the optimal relay node has been selected by the proposed system, the large data block transmission has to be performed. To do that, several efficient and discriminative features, namely speed, distance, and link residual energy, are extracted for all vehicle points. With the extracted features, multiple paths are created, among which the optimal and energy-efficient path has to be identified for efficient transmission of the data blocks. For optimal path selection, the rule-based Tanh-ANFIS algorithm is introduced in the proposed system, in which the hyperbolic tangent (tanh) activation function is incorporated into the Adaptive Neuro-Fuzzy Inference System (ANFIS) [28].
Table 5 presents the performance analysis of the various optimization methods in terms of end-to-end delay, measured from the instant of sending to the instant of reception. Figure 6(a-d) gives a graphical representation of the various methods under transmission ranges of 50, 100, 200, and 300 m. Table 6 shows the overall analysis of the various models in terms of packet delivery ratio. Figure 7(a-d) gives the graphical representation of the various models over the packet delivery ratio.
Table 7 presents the overall analysis of the various optimization methods in terms of cluster head lifetime, i.e. the time for which a vehicle performs the duty of cluster head. A high velocity decreases the lifetime of the CH. Figure 8(a-d) gives a graphical representation of the various models over CH lifetime.

Conclusion
This paper presented a comparative analysis of optimal node selection. We proposed a novel method, LS-RCM, and contrasted it with other state-of-the-art models. Clustering and cluster head selection are performed first. Once these are done, optimal relay node selection takes place; it is the key stage for energy optimization. It also affects the later stages, such as feature extraction and data block transmission in the VANET. The experimental evaluation covers various node selection methods, among which the proposed model performs best over various measures. This paper should also help other researchers to understand the problem in depth and to develop even more effective algorithms for better performance.

Figure 2. The architecture of the proposed framework.

Figure 5. Throughput of optimization models vs a) Density, b) Velocity, c) Packet size.
(i) Initialization: Set up the dragonfly population (vehicle nodes) in terms of $V_i$:
$V_i = V_1, V_2, V_3, \ldots, V_n$, where $i = 1, 2, 3, \ldots, n$. (3)
(ii) DA Behaviour Analysis: Dragonfly behaviour can be explained by five steps: separation, alignment, cohesion, attraction towards a food source, and distraction radially outward from an enemy. Description of the parameters: In separation, $Se_i$ represents the separation of the i-th individual, V represents an individual's present position, $V_k$ represents the position of the k-th individual, and N indicates the total count of neighbouring individuals inside the search space. $Al_i$ denotes the alignment of the i-th neighbouring individual, while

Figure 6.End to end delay of optimization models vs a) Transmission 50 m, b) Transmission 100 m, c) Transmission 200 m, d) Transmission 300 m.

Figure 7. Packet delivery ratio of Models vs a) transmission 50 m, b) transmission 100 m, c) transmission 200 m, d) transmission 300 m.

Figure 8. CH lifetime of models vs a) transmission 50 m/s, b) transmission 100 m/s, c) transmission 200 m/s, d) transmission 300 m/s.

Table 3. Benchmark instances of BBO and DA.

Table 4. Throughput analysis of various optimization methods.

Table 5. Overall analysis of methods under end-to-end delay.

Table 6. Overall analysis under packet delivery ratio.