Cloud based multicasting using fat tree data confidential recurrent neural network

ABSTRACT With the progress of cloud computing, more users are attracted by its strong and cost-effective computing potential. Nevertheless, whether Cloud Service Providers (CSP) can efficiently protect Cloud Users' (CU) data confidentiality (DC) remains a demanding issue. A CU may execute several applications with multicast needs, and different techniques have been used in the cloud to provide DC under multicast requirements. In this work, we aim at ensuring DC in the cloud. This is achieved using a two-step technique, called Fat Tree Data Confidential Recurrent Neural Network (FT-DCRNN). The first step constructs a Fat Tree based on the multicast model. The Fat Tree is combined with the multicast model because the multicast model propagates traffic on multiple links. With the Degree Restrict Multicast Fat Tree construction algorithm, a reference function measures the minimum average delay between two links. Multicast is performed over the links selected in this way, which in turn improves the throughput and efficiency of the cloud service. Then, to provide DC for the multicast data or messages, the DCRNN model is applied. With the Non-linear Recurrent Neural Network using the Logistic Activation Function, complex non-linear relationships are handled and the average response time is reduced.


Introduction
Current progress in data-as-a-service (DaaS) and cloud computing gathers data from numerous sources into very large repositories for queries. However, because the CSP is frequently a third-party delegate of the data owner, the integrity of query results is not guaranteed and the results must therefore be authenticated. The Privacy-Preserving Authentication framework for Aggregate queries (PAPA) [1] was designed for authenticated aggregate queries. PAPA ensures the integrity of query results and preserves DC. It comprises an authenticated index, the Merkle Grid tree (MG-tree), which supports primitive multiset operations, combined with aggregate query processing algorithms. Thus, it lessens computation and communication overheads for server and client. However, the current cloud environment constructs a virtual overlay network to enable communication amongst several virtual machines. This overlay in turn connects the physical machines, wasting network resources and creating bottlenecks, which compromises the throughput and efficiency of a cloud service. A potential solution is to design the multicast model using a Fat Tree structure, which is well suited to propagating traffic on multiple links.
Another main topic receiving remarkable attention is Network Virtualization. Multicast in a cloud computing environment has accordingly become a distinguished research area, with particular attention to the resource allocation issue of Multicast Virtual Networking (MVNs). The Multicast Virtual Networking Embedding (MVNE) problem in [2] was formulated as an Integer Linear Programming (ILP) model with the objective of balancing the load in the network.
With this, a 3-step approach was performed: pruning the network, finding sub-graphs with potentially feasible solutions, and node mapping. Finally, a Dynamic Programming (DP) approach was proposed to provide an optimal solution in polynomial time when resource requirements are homogeneous. However, with complex non-linear relationships, the average response time increases because the load in the network is improperly balanced.
A potential solution is to introduce a Deep Neural Network (DNN), a machine learning method that, as opposed to task-oriented algorithms, models complex non-linear relationships. Measures were taken to ensure DC for sensing big data streams using the Selective Encryption method [3]. Although DC was provided, multicasting was not ensured. In [4], cloud-based data centers were exploited to address multicasting while minimizing the mapping cost.
Recently, several DC techniques have been designed for cloud computing environments. However, communication between several virtual machines and requests with non-linear relationships remain major issues to be addressed. Therefore, we present a novel DC technique. First, the problem of the virtual overlay network, which connects physical machines and wastes network resources, is addressed by introducing a Fat Tree Data Structure that propagates traffic on multiple links, thereby improving the throughput of the cloud service.
Next, the design of the multicast model with the Fat Tree Data Structure, using the Degree Restrict Multicast Fat Tree construction algorithm, improves the efficiency of the cloud service. Then, to attain DC, the Data Confidential Recurrent Neural Network model is used. Here, DC is attained by creating only a simulated packet for each CU. Finally, using the Logistic Activation Function, the average response time is further improved.
This paper is ordered as follows. Section 2 describes the related work. Section 3 describes the DC technique. Section 4 formally analyzes the experiments and focuses on simulation and evaluation. Finally, section 5 concludes the work.

Related works
In the past few years, with the swift deployment of bandwidth-sensitive cloud-based applications, traffic demand in Internet data centre networks has been increasing exponentially, which creates the need for multicasting. In [5], adaptive modulation selection was applied with a fixed approximation ratio to improve the quality of transmission in cloud-based applications. However, the mapping cost remained unaddressed. To solve this issue, Survivable Multicast Service Oriented Virtual Network Mapping (SMVNM) [6] was examined to minimize the mapping cost and mapping time. Despite the minimization achieved in mapping, DC was not addressed. An attribute-based encryption and decryption algorithm was designed in [7] based on a hierarchical authorization structure, ensuring scalability and fine-grained access control.
With the increasing availability of diverse remote sensors, it has become possible to acquire a wide variety of information about materials on the Earth. Examples include spectral information acquired through passive sensors, and height and shape information acquired through light detection and ranging (LiDAR) sensors for a cloud. In [8], a Convolutional Neural Network with Deep Learning was investigated for accurate classification in cloud analysis. Another DNN, using a Hierarchical Fused Fuzzy component, was presented in [9] for data classification.
Ensuring reliability support for services hosted in cloud computing environments is a well-studied area that has gained the attention of researchers in recent years. However, the existing methods did not provide mechanisms for the communication mode that these hosted services may exhibit. In [10], an ILP model was designed with the objective of restoring failed services, measured in terms of restorable ratio. However, the failure-prone nature of the CSP poses threats. To address this issue, an Optimal Polynomial Time algorithm was investigated in [11] for multicast services residing in cloud networks. Another multicast tree was designed in [12] for maintaining the requested Quality of Service (QoS), ensuring the restoration ratio and resulting in considerably fast execution time.
In recent years, the requirement for resources has been increasing exponentially, and hence processing and maintaining enormous amounts of data in several fields remains a major concern. Acquiring these resources and conserving them is a tremendous task. With cloud computing, these resources are obtained on demand with little management effort. However, ensuring the confidentiality of data remains a challenge. In [13], homomorphic encryption was applied using RSA to ensure DC. Certain issues related to DC, privacy, and integrity were presented in [14].
The cloud computing environment lessens computational cost and overhead for the CU. However, supplementary technical safeguards are needed to keep CU data secure and confidential. In [15], cryptographic functionality was exploited for user privacy, data minimization and authentication of stored and processed data.
An asymmetric scalar-product-preserving encryption scheme was studied in [16] to control the privacy of data. Successful processing and analysis of data for Geological Information Services was addressed in [17] with the aid of aggregation. Challenges related to security and privacy were analyzed in [18]. However, as the search space increased, the computation overhead also increased. To address this issue, Oblivious Similarity-based Search was applied in [19], encrypting only the relevant data and therefore avoiding unnecessary computation. As a result, computation overhead was minimized by applying an encrypted Bloom filter and probabilistic homomorphic encryption. Another method, based on parallelization, was introduced in [20] to address DC and the computation overhead involved in the same process. This was achieved using a hypothesis model.
A telemetry system for the diagnosis of asthma and chronic obstructive pulmonary disease (COPD) was presented in [21]. However, the time taken to diagnose the disease was high. A recovery algorithm for wormhole attacks and the isolation of black hole attacks was introduced in [22] to enhance the performance of the network. However, the attack detection rate was not improved. A secure framework for storing data on the cloud was developed based on an encryption scheme in [23]. However, the throughput remained unaddressed.
Nevertheless, the flexibility of DC has not been fully explored for multicast provisioning. Therefore, in this work, DC is addressed by modifying the existing multicast virtual network in a cloud computing environment with the aid of the FT-DCRNN technique.

Methodology
This section presents a new technique named FT-DCRNN, which increases DC by modifying the existing multicast virtual network method for efficient multicast in a cloud computing environment. The proposed algorithm provides advantages in terms of average response time, throughput, efficiency and adaptability during multicasting for cloud provisioning.
Here, intelligent nodes, namely primary cloud users, relay nodes and edge nodes, are used to construct the Fat Tree based on the average delay and to find a low-cost multicast model. Figure 1 shows the block diagram of the FT-DCRNN technique.
Next, with the constructed Fat Tree, the hidden layer vector and the output vector are each identified with their own activation function to ensure DC. This is done by evaluating the activation function at different time intervals, which improves DC. The FT-DCRNN technique is implemented in the CloudSim simulator, and the results confirm that the proposed algorithm affords better performance than the existing algorithms.

Fat tree data structure
There are numerous types of content to be distributed in a cloud computing environment, such as news feeds, social updates, weather forecasting information and so on. The CSPs distribute the content to only a small fraction of cloud users. These CU then move around, communicate with other CU strategically and exchange the information through several means.
As far as throughput and efficiency of the cloud service are concerned, designing an efficient data confidential method that delivers the information to the intended CU becomes a multicast problem in a cloud computing environment. In a cloud computing environment, a relay node $RN = \{rn_1, rn_2, \ldots, rn_n\}$ is an encountered node that stores, carries and forwards the message or information to the cloud servers (CS) or destinations swiftly and with a small overhead. Therefore, designing a satisfactory formula or benchmark to identify good relay nodes becomes a demanding issue in cloud service multicast provisioning. In the proposed method, this is done using a Fat Tree data structure.
In an ordinary tree data structure, every branch has the same thickness or breadth, regardless of its position in the hierarchy. In other words, every branch is skinny, i.e. low-bandwidth. In a Fat Tree data structure, branches near the top of the hierarchy are thicker than branches further down the hierarchy. In the cloud environment, the branches are referred to as data links, and the heterogeneous thickness or breadth (i.e. bandwidth) of the data links allows for ensuring DC. This is attained by enabling communication between several virtual machines, thereby improving the throughput and efficiency of the cloud service. Figure 2 shows the Fat Tree data structure.
As illustrated in Figure 1, the Fat Tree Data Structure includes a three-degree tier, involving primary cloud user "CU", relay node "RN" and edge node "EN" respectively. The three-degree tier has a primary node or primary CU in the root of the tree, an assembling node or relay node (i.e. relay cloud user) in the middle and an edge node at the third level of the fat tree. In this work, the multicast problem in a cloud computing environment using Fat Tree Data Structure is considered.
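The three-tier layout described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `Node` class, the `build_three_tier` and `uplink_bandwidth` helpers, and the rule that an inner link carries the sum of the bandwidths beneath it are all assumptions made for the example (the paper only states that links near the root are thicker).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    tier: str  # "CU" (root), "RN" (relay) or "EN" (edge)
    children: List["Node"] = field(default_factory=list)


def uplink_bandwidth(node: Node, edge_bandwidth: float = 1.0) -> float:
    """Bandwidth of the link above `node`.

    Assumption: an edge node's link has `edge_bandwidth`, and every other
    link carries the sum of the links below it, so links get "fatter"
    toward the root.
    """
    if not node.children:
        return edge_bandwidth
    return sum(uplink_bandwidth(child, edge_bandwidth) for child in node.children)


def build_three_tier(relays: int, edges_per_relay: int) -> Node:
    """Build a primary CU root, a layer of relay nodes, and edge-node leaves."""
    root = Node("cu", "CU")
    for r in range(1, relays + 1):
        relay = Node(f"rn{r}", "RN")
        relay.children = [
            Node(f"rn{r}-en{e}", "EN") for e in range(1, edges_per_relay + 1)
        ]
        root.children.append(relay)
    return root


tree = build_three_tier(relays=2, edges_per_relay=3)
for relay in tree.children:
    # Each relay link aggregates its three unit-bandwidth edge links.
    print(relay.name, uplink_bandwidth(relay))
```

Under these assumptions a relay link is three times as wide as an edge link, which is the "thicker near the top" property the section relies on.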
Multicast is a service where a cloud user $CU = \{cu_1, cu_2, \ldots, cu_n\}$ sends the message or information $M = m_1, m_2, \ldots, m_n$ to all of the cloud service providers $CSP = \{csp_1, csp_2, \ldots, csp_n\}$ or the destinations. In the FT-DCRNN technique, the problem of multicast is solved by proposing a Degree Restrict Multicast Fat Tree construction algorithm. The pseudo-code representation of the Degree Restrict Multicast Fat Tree construction algorithm is given below.

Algorithm 1. Degree Restrict Multicast Fat Tree construction algorithm
3: Measure average delay 'Avg_Delay' using equation (1)
4: Measure the number of relay nodes using equations (3) and (4)
5: Measure total number of data links using equation (5)
6: If message 'M' arrived then
7:   If data link ∈ SET 'EN − RN' then
8:     Let 'DN -> Destination node'
9:     Let 'SN -> Source node'
10:    Let next node = 'Lookup(DN, SN)' using equation (3)
11:  Else
12:    If link ∈ SET 'RN − CU' then
13:      Let 'NextDataLink' = unused available link
14:      Let next node = 'Lookup(NextDataLink)' using equation (3)

As given in Algorithm 1 above, in this Fat Tree data structure there is no single bottleneck for replication and traffic transmission when a relay node is used. In addition, the Fat Tree data structure reduces the traffic overhead on the cloud network and controls the maximum and average delay, thereby increasing the throughput and efficiency of the cloud service. Let us measure the average delay during multicast between $(i, j)$ and $(m, n)$. It is mathematically formulated as given below.
From the above equations (1 and 2), the average delay "Avg Delay " is measured using the relay index "RI(R j i , R j+1 i )", the "jth" index node on the path from the CU "i.e. source" to CS. Now let us rewrite "D a i " and "D t i " as follows.
From the above equations (3 and 4), the number of relay nodes during the entire period of transmitting a packet, message or information from the CU to the CS is given by "D a i ". On the other hand, "D t i " symbolizes the total number of links during the entire period of transmitting a packet, message or information from the CU to the CS. Then, the above equations (1 and 2) are rewritten as follows.
The function "Lookup", which occurs in step 10, step 14 and step 17, is the most important in this algorithm. It is a reference function used to select the minimum average delay between two data links. It is mathematically formulated as given below.

$$\mathrm{Lookup} = \min\left(Avg_{Delay}(i, j),\; Avg_{Delay}(m, n)\right) \quad (6)$$

From the above equation (6), the result of the lookup function is the minimum of the average delays of two data links. The data link with the minimum average delay is taken as the final data link through which multicast is performed. In this way, the throughput and efficiency of the cloud service are improved.
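The selection rule of equation (6) can be illustrated with a minimal Python sketch. The `lookup` signature, the `Link` type, the delay table and the tie-breaking rule (prefer the first link) are hypothetical and are not taken from the paper; the average delays themselves would come from equation (1).

```python
from typing import Dict, Tuple

Link = Tuple[str, str]  # hypothetical representation: (from_node, to_node)


def lookup(avg_delay: Dict[Link, float], link_a: Link, link_b: Link) -> Link:
    """Return whichever of the two data links has the smaller average delay.

    Mirrors equation (6): the minimum of the two average delays decides the
    link used for multicast. Ties go to `link_a` (an assumption).
    """
    return link_a if avg_delay[link_a] <= avg_delay[link_b] else link_b


# Hypothetical average delays in milliseconds.
avg_delay = {("cu1", "rn1"): 4.2, ("cu1", "rn2"): 3.1}
chosen = lookup(avg_delay, ("cu1", "rn1"), ("cu1", "rn2"))
print(chosen, avg_delay[chosen])
```

Returning the link rather than the bare delay value keeps the result directly usable as the "final data link" for forwarding.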

Data confidential recurrent neural network model
Preserving DC is a critical issue when disseminating microdata related to geographical, oceanographic and surface meteorological readings for public use in a cloud computing environment. Although throughput and efficiency are achieved using the Fat Tree data structure, preserving DC in the presence of complex non-linear relationships still has to be addressed. In this work, the problem is approached by using a Recurrent Neural Network model that produces simulated data closely related to the raw data given by the CU but different for each item. As only simulated data are produced for the closely related raw data, both average response time and DC rate are improved. Figure 3 shows the Data Confidential Recurrent Neural Network model. The figure involves the input layer, hidden layer, and output layer. For the four inputs $x_1, x_2, x_3, x_4$, the two hidden layers $H_1$ and $H_2$ exploit complex non-linear relationships, resulting in the output layer $y_1, y_2$. The pseudo-code representation of the Data Confidential Recurrent Neural Network is given below.

Algorithm 2. Data Confidential Recurrent Neural Network algorithm
3: Measure hidden layer vector using equation (7)
4: Measure output vector using equation (8)
5: Measure activation function for hidden layer vector using equation (9)
6: Measure activation function for output vector using equation (10)
7: End for
8: End

As given in the above algorithm, let $x(t)$ and $y(t)$ represent the input and output layers for time series $t$ respectively, with weight matrices $W_{ih}$, $W_{oh}$ and $W_{hh}$. Here, $W_{ih}$ corresponds to the weight between the input and hidden layers, $W_{oh}$ corresponds to the weight between the output and hidden layers, and $W_{hh}$ corresponds to the weight between two hidden layers. The hidden layer vector and output vector are symbolized as below.
From equations (7 and 8), the hidden layer vector $H_t$ and the output vector $Y_t$ are obtained from the activation function of the hidden vector $AF_H$ and of the output vector $AF_Y$ respectively. The previous hidden vector and the input vector $x_t$ are also taken into account, where "W", "Y" and "b" correspond to the parameter indices. Figure 4 shows the non-linear recurrent neural network structures at different time intervals. As shown in the figure, the Non-linear Recurrent Neural Network is employed to send the data to the cloud users with higher confidentiality. The input layer consists of a time series with a weight matrix. The first layer has a weight acquired from the input layer, and each subsequent layer acquires its weight from the previous layer. The Logistic Activation Function is used in the hidden layer and the output layer to safeguard CU data with the objective of reducing average response time. It is mathematically represented as given below.
From the above equations (9 and 10), with separate activation function for both the hidden layer vector and output layers at different time intervals and also using simulated data, DC is said to be achieved.
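As a rough illustration of one recurrent step with logistic activations, the following pure-Python sketch assumes the standard Elman form $H_t = \sigma(W_{ih} x_t + W_{hh} H_{t-1} + b_h)$ and $Y_t = \sigma(W_{oh} H_t + b_y)$. That form, the `rnn_step` function, the bias names `b_h` and `b_y`, the hidden size of 3 and the example weights are assumptions; they are not the paper's equations (7) to (10).

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def logistic(z: float) -> float:
    """Numerically stable logistic (sigmoid) function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix `m` (list of rows) by vector `v`."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_step(x_t: Vector, h_prev: Vector, W_ih: Matrix, W_hh: Matrix,
             W_oh: Matrix, b_h: Vector, b_y: Vector) -> Tuple[Vector, Vector]:
    """One time step: new hidden vector and output vector (assumed Elman form)."""
    pre_h = [a + b + c
             for a, b, c in zip(matvec(W_ih, x_t), matvec(W_hh, h_prev), b_h)]
    h_t = [logistic(z) for z in pre_h]
    pre_y = [a + b for a, b in zip(matvec(W_oh, h_t), b_y)]
    y_t = [logistic(z) for z in pre_y]
    return h_t, y_t


# Four inputs and two outputs as in the text; hidden size 3 is hypothetical.
W_ih = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.4, -0.1, 0.2], [0.5, 0.1, 0.0, -0.3]]
W_hh = [[0.2, 0.0, -0.1], [0.0, 0.3, 0.1], [-0.2, 0.1, 0.0]]
W_oh = [[0.3, -0.4, 0.2], [0.1, 0.2, -0.5]]
h = [0.0, 0.0, 0.0]
for x in ([1.0, 0.5, -0.5, 2.0], [0.0, 1.0, 1.0, -1.0]):
    h, y = rnn_step(x, h, W_ih, W_hh, W_oh, [0.0] * 3, [0.0] * 2)
    print(y)
```

Because the logistic function maps every pre-activation into the interval (0, 1), each output component stays bounded regardless of the input scale.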

Simulation and performance analysis
This section analyzes the impact of the proposed FT-DCRNN technique using the CloudSim simulator. CloudSim provides a global and extensible simulation framework, in which the proposed FT-DCRNN technique is compared with the existing PAPA [1] and MVNE [2]. The experiment is conducted to measure and evaluate the FT-DCRNN technique on factors such as throughput, efficiency, average response time and DC. The dataset used is the ENSO dataset, which comprises Wind, Oscillation Index, Sea Surface Temperature and Outgoing Long Wave Radiation. The dataset consists of oceanographic and surface meteorological readings acquired from a series of buoys located in the equatorial Pacific. With this dataset, CU spread throughout the world interact with other CU through the CSP in predicting weather conditions, the variables with the greatest effect on climate variations, and so on.

Performance evaluation
Performance of FT-DCRNN technique is evaluated in this section and compared with state-of-the art methods.

Impact of throughput and efficiency
The throughput and efficiency of the FT-DCRNN technique are evaluated against PAPA [1] and MVNE [2], using the number of user requests in the cloud environment as input to the comparison experiments. Throughput and efficiency are the two key factors considered for evaluating the infrastructure services afforded by the cloud environment. Throughput is the number of CU requests or tasks finished by the CSP in a particular interval of time. Several factors influence the rate of throughput and hence have a positive or negative impact on task execution.
Let us consider a scenario in which a CU has $i$ tasks or user requests, which are submitted for execution on $j$ machines from the CSP. Let $Exe_{time}$ be the execution time of $i$ tasks on $j$ machines, and let $OH_{time}$ be the overhead time, owing to factors such as cloud service environment initiation delays and inter-task communication delays. The throughput of the cloud service is then given below.
From the above equation (11), the throughput value $T$ is measured based on the execution time $Exe_{time}$ and the overhead time $OH_{time}$. Table 1 provides the throughput and efficiency distribution of the three techniques, FT-DCRNN, PAPA [1] and MVNE [2]. This experiment is set up to verify the throughput achieved for the CU requests placed in a cloud environment. In this experimental environment, the Fat Tree Data Structure is combined with the multicast model to achieve high throughput. The throughput is compared with PAPA [1] and MVNE [2] using the same number of CU tasks. Table 1 provides the comparison of throughput for different numbers of CU requests in the setup phase, from which it is inferred that: 1) the throughput increases with the number of CU requests, and 2) for the same number of CU requests, the FT-DCRNN technique achieves higher throughput than PAPA and MVNE. From the table, the rate of throughput increases with the increase in the number of CU requests and shows only insignificant gaps when the number of CU requests is greater than 40. That is to say, the rate of throughput of the FT-DCRNN technique is higher than that of PAPA and MVNE. This is because of the application of the Fat Tree Data Structure in the FT-DCRNN technique. The reasons are twofold. First, the existing techniques store each data item in the CS using linear programming methods over various substrate networks, whereas in the FT-DCRNN technique cloud service multicast provisioning is applied to measure the relay index. The relay index is measured using the Degree Restrict Multicast Fat Tree construction algorithm, depending on the agreement between CU and CSP.
Second, for performing multicast, the existing PAPA and MVNE techniques use a virtual overlay network for communication between CU and CSPs, which is affected under heavy traffic. In the FT-DCRNN technique, on the other hand, multicast is first performed based on the average delay, and the Fat Tree Data Structure is then constructed with the minimum average delay. This optimization, in turn, improves the throughput of the FT-DCRNN technique by 3% compared to PAPA and 6% compared to MVNE.
Next, the cloud system efficiency of the FT-DCRNN technique is evaluated and compared with PAPA and MVNE based on the number of CU requests in a cloud environment. Cloud system efficiency indicates how effectively the leased services are used. Hence, higher efficiency denotes lower overhead. System efficiency is mathematically given as below.

$$SE = \frac{Exe_{time}(i, j)}{Exe_{time}(i, j) + OH_{time}} \quad (12)$$

From the above equation (12), both the execution time $Exe_{time}$ and the overhead time $OH_{time}$ are considered when measuring the cloud system efficiency $SE$. Cloud system efficiency during multicast is one of the challenges to be addressed when rendering a cloud service. The system efficiency of the FT-DCRNN technique is measured and compared with PAPA and MVNE in Table 1. The results reported in Table 1 show that, as the number of CU requests increases, the efficiency increases. As provided in Table 1, the FT-DCRNN technique performs relatively well when compared to the two other techniques, PAPA and MVNE. The system efficiency of the FT-DCRNN technique is improved by comparing the requirements of the CU requests with those of the data link based on the relay node and edge node. The advantage of selecting the data link using the Degree Restrict Multicast Fat Tree construction algorithm lies in deriving the next node or next data link, which avoids a single CU request being held up by the CS and leaves scope for other CU requests. Every user is thus ensured message transmission, which improves the overall efficiency of the cloud using the FT-DCRNN technique by 3% compared to PAPA and 6% compared to MVNE.
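Equation (12) is simple enough to check by hand; the short Python sketch below evaluates it for hypothetical timings (the function name `system_efficiency` and the 90 ms / 10 ms values are illustrative only and do not come from Table 1).

```python
def system_efficiency(exe_time: float, oh_time: float) -> float:
    """Equation (12): SE = Exe_time / (Exe_time + OH_time)."""
    if exe_time < 0 or oh_time < 0:
        raise ValueError("times must be non-negative")
    total = exe_time + oh_time
    if total == 0:
        raise ValueError("execution time and overhead time cannot both be zero")
    return exe_time / total


# Hypothetical run: 90 ms of useful execution, 10 ms of overhead.
print(system_efficiency(90.0, 10.0))
# With no overhead the efficiency reaches its upper bound of 1.0.
print(system_efficiency(90.0, 0.0))
```

The ratio makes the statement "higher efficiency denotes lower overhead" concrete: for a fixed execution time, SE approaches 1 as the overhead time shrinks.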

Impact of average response time
The efficiency of cloud service availability is evaluated with average response time. It is evaluated as given below.
From the above equation (13), the average response time $AvgRes_{time}$ is measured from the time between the moment cloud user $i$ requests a service or sends a message to another CU and the moment the service is actually accessible. The total number of service requests is symbolized by $n$. It is evaluated in milliseconds (ms). As the number of cloud users increases, the data to be allocated to the CU increases and therefore the average response time also increases. Hence, the average response time grows with the number of CU. The lower average response time of the FT-DCRNN technique is due to the application of the Data Confidential Recurrent Neural Network model. By applying this model, the CSP presents only the simulated data closely related to the input raw data given by the cloud user, and the simulated data produced by the CSP is different for different cloud users. With this generated simulated data, the average response time using the FT-DCRNN technique is reduced by 15% compared to PAPA and 29% compared to MVNE.
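A minimal sketch of how such a metric could be computed is shown below. It assumes that the average response time is the arithmetic mean, over the $n$ requests, of the interval between request and availability; this reading, the function name `average_response_time` and the timestamps are assumptions, since only the verbal description of equation (13) is available here.

```python
from typing import Sequence, Tuple


def average_response_time(requests: Sequence[Tuple[float, float]]) -> float:
    """Mean of (available_at - requested_at) over all requests, in ms.

    Each request is a (requested_at, available_at) pair of timestamps in
    milliseconds. Taking the plain arithmetic mean is an assumption.
    """
    if not requests:
        raise ValueError("at least one request is required")
    return sum(done - sent for sent, done in requests) / len(requests)


# Hypothetical timestamps in milliseconds for three service requests.
samples = [(0.0, 12.0), (5.0, 20.0), (10.0, 19.0)]
print(average_response_time(samples))
```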

Impact of data confidentiality
DC in the cloud computing environment or cloud service provisioning refers to protecting information or messages from being accessed by unauthorized cloud users. In other words, only a CU who is authorized to do so can gain access to sensitive information or messages. Table 2 summarizes the DC achieved using the FT-DCRNN technique, PAPA and MVNE respectively. In this table, DC is reported for numbers of CU ranging from 20 to 200.
As provided in the above table, the FT-DCRNN technique provides a better DC level than PAPA [1] and MVNE [2]. As the number of cloud users increases, DC also increases, although the DC level dropped at 100 CU for FT-DCRNN, at 120 CU for PAPA and at 100 CU for MVNE. The comparative analysis nevertheless shows that DC using the FT-DCRNN technique is improved compared to the other existing techniques. This is owing to the Data Confidential Recurrent Neural Network algorithm in the FT-DCRNN technique, where only simulated data are obtained by the CSP using the Recurrent Neural Network method. Finally, with a separate activation function for the hidden layer vector and the output layer at different time intervals, DC using the FT-DCRNN technique is improved by 7% when compared to PAPA and 8% when compared to MVNE.

Conclusion
The FT-DCRNN technique is designed for attaining DC in a cloud environment. The technique improves throughput and efficiency by performing multicasting with the help of the Fat Tree data structure. It uses the Degree Restrict Multicast Fat Tree construction algorithm for optimal communication amongst several virtual machines. By applying the multicast model in the FT-DCRNN technique, traffic is readily propagated on multiple links. Finally, by applying the Recurrent Neural Network, simulated data are produced and then sent to the cloud users, ensuring DC and minimizing average response time. A series of experiments is conducted to test the throughput, efficiency of cloud service, average response time and DC. The throughput and efficiency of the cloud service are improved by 9%, with the average response time being reduced by 29%, when compared with the state-of-the-art methods.