Multi-keyword fuzzy search encryption supporting dynamic update in an intelligent edge network

In an intelligent edge network, data owners encrypt their data and outsource it to edge servers to prevent the leakage of data and user information. How to search and update the ciphertext stored on the edge servers efficiently remains an open research problem. To address it, we construct a fuzzy multi-keyword search scheme based on a two-level tree index, where the first level stores keyword tags and the second level stores encrypted file identifiers and counting Bloom filters (CBFs). The two-level tree index enables efficient multi-keyword search and also gives a clear time advantage for single-keyword search. In addition, the proposed scheme supports index updating by introducing the CBF, and it realises fuzzy search by computing the inner product between the CBF in the index and the search trapdoor. Finally, the proposed scheme is proved to be semantically secure under the known ciphertext attack model. Theoretical analysis and simulation tests indicate that the proposed scheme incurs lower computation overhead than related schemes, especially in the search phase: as the number of matching files or query keywords grows, the search time stays low, at around 2–10 ms.


Introduction
As mobile devices and the Internet of Things develop, mobile communications have placed huge storage and computing pressure on cloud servers, and new server models such as the intelligent edge network have sprung up. Information perception has spread into mobile devices and the Internet of Things, and storing and processing large amounts of data in the intelligent edge network is a hot topic (Hu et al., 2021; Mao et al., 2017; Zhao et al., 2021). Intelligent edge computing can solve the data processing problems of the cloud centre and the edge servers, which relieves the computing and storage pressure of the cloud server, but privacy leakage has become an urgent problem for the intelligent edge.
In some scenarios, the edge server is honest but curious, so the security of the private data and identity information outsourced to such semi-honest edge servers cannot be guaranteed.
Edge servers process the data as required, but they are also able to record and analyze private information. Data encryption has become a common and effective method to prevent privacy leakage. However, it does not support exact retrieval over the ciphertext stored on edge servers. Searchable encryption (Dawn et al., 2002) protects data privacy while allowing authorised users to search the ciphertext on the cloud server.
In edge computing, data and tasks are processed by the edge server closest to the data source, which may be any entity on the path between the data source and the cloud. The data owner outsources the encrypted data to the edge centre, and the nearest edge server receives and processes requests locally, which reduces the computational burden on the edge centre. Besides relieving the edge centre, storing only secure indexes on the edge servers also reduces the possibility of information leakage. In recent years, researchers have done excellent work on searchable encryption, focusing not only on security and efficiency but also on functionality, especially multi-keyword fuzzy search.
We propose a multi-keyword fuzzy search scheme supporting dynamic updates in the intelligent edge network, which solves the problems of encrypted data retrieval and dynamic data update. A query request is submitted to and processed by the nearest edge server, which reduces latency and avoids information leakage during transmission. In addition, our scheme provides a secure search service for edge computing, while edge computing speeds up query processing and relieves the pressure on the edge centre. The main contributions of this paper are as follows.
• We develop a novel two-level tree index based on the counting Bloom filter (CBF). We place the file identifiers containing the same keyword under the corresponding leaf node and generate a CBF for each file identifier node. With the two-level tree index, the proposed scheme provides a more efficient search service for multiple keywords. In particular, when searching for a single keyword, the proposed scheme only needs to search the first-level index to obtain the results.
• We use the CBF to realise multi-keyword fuzzy search; moreover, users can set a threshold to control the accuracy of the fuzzy search. Because the CBF is mutable, CBFs can be added to and deleted from the index. To demonstrate that the proposed scheme supports dynamic update, the correctness of the add and delete operations is proved.
• The proposed scheme is proved to be semantically secure under the known ciphertext and known background models. Moreover, theoretical analysis and simulation results indicate that the proposed scheme has a low overhead in the search phase.
The remainder of this paper is organised as follows. Section 2 introduces the related works. Section 3 gives some preliminaries including 2-gram, Locality-Sensitive Hashing and CBF. Section 4 shows the system model and the security model. Then, we present the details of the proposed scheme in Section 5, and prove the correctness and security of the proposed scheme in Section 6. Section 7 analyzes the performance through theory and experiment. Section 8 presents the conclusions of our scheme.

Related work
There are two main approaches to fuzzy search at present: one pre-generates a fuzzy keyword set for each keyword, and the other quantifies the similarity between keywords. Xue et al. propose a fuzzy search scheme that constructs a fuzzy keyword set for each keyword by introducing the cuckoo filter (Xue & Chuah, 2015). Ge et al. construct a linked list as the secure index, which generates a fuzzy set for each index vector (Ge et al., 2018). Although these two schemes achieve high search accuracy and efficiency, the time and storage required for index generation are huge.
To reduce the generation time and storage overhead of indexes, some schemes dedicated to quantifying the similarity between keywords have been proposed. These schemes improve not only efficiency but also security. Chen et al. propose mapping keywords into a Bloom filter and then calculating the inner product of keywords to quantify their similarity. In addition, their scheme constructs forward and inverted indexes and stores the keywords in Bloom filters, which improves search efficiency but does not support dynamic updates. Zhang et al. propose using the Tanimoto distance to quantify the similarity between keywords and employ the word2vec algorithm to construct a fuzzy matching model, but it is only suitable for small data sets. In addition, Guo et al. propose a similarity search scheme over encrypted non-uniform datasets, which achieves higher recall and precision and improves security by hiding the distribution of the query set (Guo et al., 2020). Zhang et al. propose measuring keyword similarity by edit distance, controlling the accuracy of search results with a threshold. However, Zhang's scheme needs a private cloud to rank the search results. Liu et al. realise approximate keyword matching by exploiting the indecomposable property of primes, and their extended scheme realises semantic search by matrix multiplication between keywords (Liu et al., 2021). Because this scheme constructs an index matrix for each document, its index generation is costly.
To support dynamic update, Liu et al. propose a verifiable searchable encryption scheme and construct a two-keyword index to realise dynamic update (Liu et al., 2018). However, this scheme must determine the authorised user set at initialisation and cannot authorise users in subsequent phases. Wang et al. propose a fuzzy searchable encryption scheme that constructs a dynamic index by pseudo-random padding and realises fuzzy search with an extended fuzzy Bloom filter, which supports querying high-dimensional data (Wang et al., 2018). Wang et al. put forward a dynamic and verifiable search scheme, which achieves dynamic updates by adding and deleting tree-index nodes and constructs a Merkle tree to check the completeness and correctness of results (Wang et al., 2020). However, building and updating the Merkle tree is time-consuming. Li et al. propose a dynamic searchable symmetric scheme in which the index of the original documents needs no change during updates, so the cost of the update process is constant. However, this scheme requires a re-encryption operation, which incurs a certain overhead.
In short, it is difficult for existing fuzzy multi-keyword search schemes to achieve dynamic updates while ensuring high efficiency and security. Accordingly, we develop an efficient multi-keyword fuzzy search scheme that supports dynamic update.

2-gram
n-gram is a technique for dividing a given string into segments of length n (Gervás, 2016; Lavanya et al., 2020). Our scheme chooses 2-gram, which cuts a keyword into strings of 2 characters. For example, for the keyword "search", the 2-gram sequence is {se, ea, ar, rc, ch}. The sequence is then stored in an array of size 26 × 26.
For each element of the sequence, its first and second characters select, according to alphabetical order, a row and a column of the matrix, and the corresponding entry is set to 1. In this way, the sequence is converted into a 26 × 26 binary matrix. For example, for the element "se", s is the nineteenth letter of the alphabet and e is the fifth, so the entry in the nineteenth row and fifth column is set to 1. Following this method, a keyword is converted into a matrix.
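As an illustration, the 2-gram encoding above can be sketched in Python; the function names are ours, not part of the scheme:

```python
def bigrams(keyword: str):
    """Split a lowercase keyword into overlapping 2-character segments."""
    return [keyword[i:i + 2] for i in range(len(keyword) - 1)]

def to_bit_matrix(keyword: str):
    """Encode a keyword as a 26 x 26 binary matrix via its 2-gram sequence."""
    matrix = [[0] * 26 for _ in range(26)]
    for c1, c2 in bigrams(keyword):
        # Row of the first character, column of the second, set to 1.
        matrix[ord(c1) - ord('a')][ord(c2) - ord('a')] = 1
    return matrix

m = to_bit_matrix("search")   # bigrams: se, ea, ar, rc, ch
# "se": 's' is letter 19, 'e' is letter 5 -> entry (19, 5), i.e. m[18][4] in 0-based indexing.
```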

Locality-sensitive hashing
Locality-sensitive hashing (LSH) (Wong et al., 2009) is a text similarity measurement algorithm that can select similar elements from a large collection. LSH maps similar inputs into the same bucket with high probability. An LSH function family H is (P1, P2, R1, R2)-sensitive if, for any two points m and n: if d(m, n) ≤ R1, then Pr[h(m) = h(n)] ≥ P1; and if d(m, n) ≥ R2, then Pr[h(m) = h(n)] ≤ P2, where d(m, n) denotes the distance between m and n, R1 and R2 denote distance thresholds, and P1 and P2 denote probability thresholds. A common construction is based on p-stable distributions. Our scheme uses a p-stable LSH family, defined as h(s) = ⌊(a · s + t)/k⌋, where s is a vector, a is a random vector whose entries are drawn from a p-stable distribution, t is randomly selected from [0, k], and k ∈ R is the bucket width. The p-stable LSH family puts similar vectors into the same bucket, thus separating similar vectors from the others.
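The p-stable construction can likewise be sketched; drawing a from a Gaussian gives a 2-stable family, and the dimension and bucket width below are illustrative choices:

```python
import random

def make_pstable_lsh(dim: int, bucket_width: float, seed: int = 0):
    """One hash from the 2-stable (Gaussian) LSH family:
    h(s) = floor((a . s + t) / k), with a ~ N(0, 1)^dim and t ~ U[0, k)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    t = rng.uniform(0.0, bucket_width)

    def h(s):
        projection = sum(ai * si for ai, si in zip(a, s))
        return int((projection + t) // bucket_width)

    return h

h = make_pstable_lsh(dim=4, bucket_width=8.0)
# Close vectors tend to land in the same bucket.
same_bucket = h([1.0, 0.0, 1.0, 0.0]) == h([1.0, 0.1, 1.0, 0.0])
```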

Counting bloom filter
Bloom filter (BF) (Lavanya et al., 2020) is an m-bit array used to detect whether an item belongs to a set by mapping the item to the binary vector via k hash functions. As shown in Figure 1, the CBF (Lavanya et al., 2020; Wu et al., 2021) is a data structure based on the Bloom filter, which extends each bit of the BF with a counter. To determine whether an item belongs to the set, the item is processed by the k hash functions and the counters at the resulting positions are checked. When an element is inserted, the value of each corresponding counter is increased by 1; when an element is deleted, the value of each corresponding counter is decreased by 1.

System model
Figure 2 shows the system model; the system is composed of three entities: the data owner, the data user and the edge network. First, the data owner extracts the keyword set from the file set, encrypts the local files and outsources them to the edge centre. At the same time, the data owner constructs a secure index over the keywords and files and uploads it to the edge network. Second, the data user submits a search trapdoor to the nearest edge server, which performs the search operation. The edge server then requests the corresponding encrypted documents from the edge centre and sends them to the data user. Finally, the data user uses the symmetric key to decrypt the ciphertext result returned by the edge server.

Security model
It is assumed that the edge server is honest but curious: it honestly executes every step of the algorithms, but it may record and leak file contents, keywords and other information. In view of the need for privacy protection, two main threat models are analyzed as follows (Qi et al., 2020). Known ciphertext model: The edge server can only know and record the encrypted files, secure indexes, search trapdoors and search results.
Known background model: The edge server can obtain additional background information, such as the correlation between two search trapdoors. By analyzing such correlations on a large scale, the edge server can infer the keywords in a query. When the adversary A knows the size of the file set |D|, the number of keywords |W|, the inner product of the search trapdoor and the index, and the encrypted files C_k, a simulated game between a simulator B and an adversary A is used to prove that the scheme is secure. We define a leakage function to describe what is leaked: LI = {|D|, |C|, CBF_q · I_2[id], C_k}. The simulation game is defined as follows.
Real π A (λ): Challenger C initialises the system, then adversary A sends file set D to the challenger C. According to the encryption algorithm, the challenger C generates the security index I and the encrypted file C k , then sends them to A. Adversary A performs a polynomial number of queries Q = (w 1 , w 2 , . . . , w q ), and then the challenger C generates a trapdoor T w j for each query w j ∈ Q and sends them to A. Finally, adversary A outputs I, C k and T, where T = (T w 1 , T w 2 , . . . , T w q ).
Ideal_π,A,B(λ): The adversary A outputs a file set D and is given the leakage function LI. The simulator B generates an index I' and an encrypted file set C'_k, then sends them to the adversary A. On receiving I' and C'_k, A performs a polynomial number of queries Q = (w_1, w_2, . . . , w_q). Based on the leakage function LI, the simulator B returns a corresponding search trapdoor T'_{w_j} for each query w_j ∈ Q and sends them to the adversary A. Finally, adversary A outputs I', C'_k and T', where T' = (T'_{w_1}, T'_{w_2}, . . . , T'_{w_q}).

Overview
In this section, we construct a multi-keyword fuzzy search scheme with a dynamic update, which is based on a two-level tree and CBF. In addition, the files uploaded by the data owner are encrypted by symmetric encryption. We encrypt the files by the AES encryption algorithm with a 256-bit key, which ensures that the encrypted files are secure.

Scheme detail
Our scheme consists of six polynomial-time algorithms, i.e. Setup(λ), GenIndex(D, SK, K), GenTrapdoor(SK, W_q), Search(I, Tw_q), Decrypt(K, C_k) and Update(d', SK, K). The detailed construction is as follows. (1) Setup(λ): The data owner generates two random invertible matrices M_1, M_2 ∈ R^{m×m} and a random vector S ∈ R^m, where m denotes the length of the CBF. Given a security parameter λ, the algorithm generates K_1, K_2 ← {0, 1}^λ and outputs the secret key SK = {K_1, K_2, S, M_1, M_2}. The data owner also generates a secret key K ← {0, 1}^λ to encrypt the file set, and selects the LSH function f(·) used to process keywords; n denotes the number of files.
(2) GenIndex(D, SK, K): As shown in Figure 3, the data owner generates a two-level tree. The algorithm maps the file set and the keywords extracted from it onto the two-level tree, and then encrypts the tree to generate a secure index I = (I_1, I_2). The construction process of the two-level tree index is as follows.
For each keyword w_i ∈ W, w_i is converted to a temporary value v_i by computing the 26 × 26 bit array generated from its 2-gram sequence and applying an LSH function f(·); v_i is then transformed into a keyword tag tag_i by a pseudo-random function, i.e. tag_i = F(K_1, v_i). After all keywords are processed, the keyword tag set Tag = {tag_1, tag_2, . . . , tag_n} is stored in the first-level tree index, where the position of a keyword tag is related to the number of corresponding files: the fewer the files, the farther to the left the keyword tag is placed. For each keyword tag tag_i ∈ Tag, let F_i be the set of files containing the keyword w_i corresponding to tag_i. For each identifier fid_j (j ∈ [1, |F_i|]) of a file in F_i, the data owner encrypts fid_j with a symmetric encryption algorithm to obtain the encrypted identifier e_j = Enc(K_e, fid_j), where K_e = F(K_2, v_i). The data owner then stores e_j in the second-level tree index, i.e. e_j → I_1[tag_i]. The data owner attaches a CBF to each e_j node and employs the k hash functions to map the temporary values of the keywords contained in the corresponding file into that CBF.

(3) GenTrapdoor(SK, W_q): A data user first selects the query keywords W_q, then processes each keyword into a temporary value v_i using the same method as in GenIndex().
Choosing one temporary value v at random, the data user transforms v into the keyword tag tag in the same way as in GenIndex(D, SK, K), and maps all temporary values other than v into one CBF by the k hash functions, setting CBF_q[j] = 1 for j ∈ {h_1(v_i), . . . , h_k(v_i)}. The data user then initialises two m-dimensional vectors CBF'_q and CBF''_q and splits CBF_q with the secret vector S: for each position j, if S[j] = 1, set CBF'_q[j] = CBF''_q[j] = CBF_q[j]; otherwise, split CBF_q[j] randomly so that CBF'_q[j] + CBF''_q[j] = CBF_q[j].
The data user sends the search trapdoor Tw_q = {tag, M_1^{-1} · CBF'_q, M_2^{-1} · CBF''_q, K_e, thr} to the edge server. However, when searching for a single keyword, the data user only sends tag and K_e as the trapdoor.
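The effect of this matrix encryption can be illustrated with a small, dependency-free sketch. For simplicity we use permutation matrices, whose inverse is just the transpose, in place of the scheme's random invertible matrices M_1 and M_2, and a toy CBF length m = 4; the point is only that the split-and-encrypt construction preserves inner products:

```python
import random

m = 4                                    # toy CBF length
S = [1, 0, 1, 0]                         # secret split vector
# Toy invertible matrices: permutations, whose inverse equals the transpose.
# The real scheme uses random invertible matrices M1, M2 over R^{m x m}.
P1 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
P2 = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]]

def transpose(M): return [list(r) for r in zip(*M)]
def matvec(M, v): return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
def dot(u, v): return sum(a * b for a, b in zip(u, v))

def split_index(p):
    """Index split: where S[j] = 1 split randomly, else copy."""
    p1, p2 = p[:], p[:]
    for j in range(m):
        if S[j] == 1:
            r = random.random()
            p1[j], p2[j] = r, p[j] - r
    return p1, p2

def split_query(q):
    """Query split: where S[j] = 0 split randomly, else copy."""
    q1, q2 = q[:], q[:]
    for j in range(m):
        if S[j] == 0:
            r = random.random()
            q1[j], q2[j] = r, q[j] - r
    return q1, q2

cbf_idx, cbf_qry = [2, 1, 0, 1], [1, 1, 1, 0]
p1, p2 = split_index(cbf_idx)
q1, q2 = split_query(cbf_qry)
I = (matvec(transpose(P1), p1), matvec(transpose(P2), p2))   # encrypted index: M^T parts
# Trapdoor uses the inverses; for permutation matrices, inverse == transpose.
T = (matvec(transpose(P1), q1), matvec(transpose(P2), q2))   # M^{-1} parts
recovered = dot(I[0], T[0]) + dot(I[1], T[1])                # equals cbf_idx . cbf_qry
```

Because the encrypted inner product equals the plaintext one, the edge server can compare it against thr without ever seeing the underlying CBFs.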
(4) Search(I, Tw_q): This algorithm is executed by the edge server. On receiving the search trapdoor, the edge server matches the trapdoor's tag against the tags in the first-level tree index. If the trapdoor contains only the keyword tag, the algorithm returns all encrypted file identifiers under that tag into FID_q, where FID_q denotes the encrypted file identifier set. When the trapdoor contains more than the keyword tag, the edge server finds the matching encrypted file identifiers by comparing the trapdoor's CBF with the index's CBFs: for each encrypted file identifier e_i under the matched tag, the edge server calculates the inner product of the trapdoor's CBF and the identifier's CBF, and if the inner product is at least the trapdoor's thr, e_i is matched successfully and is added to FID_q. For each encrypted file identifier's CBF under the matched tag, the matching operation is as follows.
If I^T · CBF_q ≥ thr, the match is successful. For each e_i ∈ FID_q, the edge server decrypts e_i with K_e to obtain the file identifier fid_i, i.e. fid_i = Dec(e_i, K_e), and collects the encrypted file corresponding to fid_i into the set C_k. Finally, the edge server returns C_k to the data user.
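This matching step can be sketched in the plaintext domain as follows; the dictionary layout and names are hypothetical stand-ins for the two-level tree:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def search(index, tag, cbf_q=None, thr=0):
    """index: {tag: [(enc_id, cbf), ...]} -- a flattened two-level tree.
    With only a tag, return every identifier under it; otherwise keep the
    identifiers whose CBF has inner product >= thr with the query CBF."""
    if tag not in index:
        return []
    if cbf_q is None:                       # single-keyword query: tag only
        return [e for e, _ in index[tag]]
    return [e for e, cbf in index[tag] if dot(cbf, cbf_q) >= thr]

index = {"tag_a": [("e1", [1, 0, 2, 1]), ("e2", [0, 1, 0, 0])]}
hits = search(index, "tag_a", cbf_q=[1, 0, 1, 1], thr=3)   # e1: 1+0+2+1 = 4 >= 3
```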
(5) Decrypt(K, C k ): When receiving C k , the data user executes the algorithm and decrypts it by the secret key K.
(6) Update(d', SK, K): When adding a file d', the data owner extracts its keywords and generates the corresponding keyword tags, encrypted identifier and CBFs as in GenIndex(), encrypting them as an update index I'_2. The edge server adds the update index to the existing index to obtain the new index, i.e. I_up,2 = I_2 + I'_2 = {M_1^T(I_2 + I_i), M_2^T(I_2 + I_i)}; this component-wise addition is possible because the matrix encryption is linear.
Analogously, when deleting a file d', the data owner extracts its keywords W = (W', W'') from d', where W' denotes the keywords existing only in d' and W'' denotes the keywords existing in several files. For each w_i ∈ W', the data owner deletes all child nodes belonging to the keyword tag tag_{w_i} converted from w_i. For each w_j ∈ W'', the data owner deletes e_{d'} and the corresponding CBF under the tag tag_{w_j} converted from w_j. Similarly, the edge server subtracts the index generated by the deletion operation from the existing index to obtain the new index.

Correctness analysis
Since the CBF is used when constructing both the index and the search trapdoor, the scheme naturally supports keyword updates. Taking the addition operation as an example, the index generated by the addition is merged into the leaf nodes of the two-level tree index to produce the updated encrypted index. Here we prove that the CBF supports keyword updates; the analysis is as follows.
For each position j ∈ {0, 1, . . . , m − 1}, the counter of the updated CBF equals the sum of the counters contributed by the existing keywords and by the newly added keywords, so mapping both the updated and the existing keywords into a CBF yields exactly the updated index CBF. Traversing all j ∈ {0, 1, . . . , m − 1}, the final result equals Q · I_up,2.
Similarly, it can be proved that the result also holds when the data owner deletes a file. Hence the index constructed with CBFs supports both the addition and the deletion of keywords.
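A minimal counting-Bloom-filter sketch illustrates the property this correctness argument relies on: deleting an element exactly undoes its insertion, counter by counter (the hash construction here is illustrative):

```python
import hashlib

M, K = 64, 3                                     # CBF length, number of hashes

def positions(item: str):
    """K illustrative hash positions derived from one SHA-256 digest."""
    d = hashlib.sha256(item.encode()).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(K)]

def insert(cbf, item):
    for p in positions(item):
        cbf[p] += 1

def delete(cbf, item):
    for p in positions(item):
        cbf[p] -= 1

cbf = [0] * M
insert(cbf, "search")
snapshot = cbf[:]
insert(cbf, "update")                            # add a keyword ...
delete(cbf, "update")                            # ... then remove it again
restored = (cbf == snapshot)                     # all counters fully restored
```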

Security analysis
Theorem 6.1: Since F(·) is a secure pseudo-random function, the proposed scheme is semantically secure under the known ciphertext model. The edge server knows only the number of keywords and files, the inner products of the search trapdoors with the secure index, and the encrypted files; it cannot obtain any other information.
Proof: To prove Theorem 6.1, we construct a simulator B whose output cannot be distinguished from that of the actual scheme. The security game between adversary A and challenger C under the known ciphertext model is defined as follows.
(1) Adversary A outputs a file set D. (2) C randomly selects b ∈ {0, 1}: if b = 0, C generates the ciphertexts, index and trapdoors as in Real_π,A(λ), otherwise as in Ideal_π,A,B(λ), and sends them to A. (3) A outputs a guess b'; the scheme is secure if |Pr[b' = b] − 1/2| ≤ negl(λ), where negl(λ) is negligible. Next, A analyzes the ciphertexts, the index and the search trapdoors generated by B in an attempt to win the game; we show below that its advantage is negligible, which proves Theorem 6.1.
(1) Simulated encrypted file set C'. B uniformly at random generates a simulated encrypted file set C' = {C'_1, C'_2, . . . , C'_n}, where C'_i has length |D_i| bits, i ∈ [1, n]. Because the files are encrypted by a symmetric encryption algorithm, C' and C_b are indistinguishable, i.e. |Pr[Enc(D, K) = C_b] − Pr[C' = C_b]| ≤ negl(λ). (2) Simulated secure index I'. Because the CBF is processed by the secure KNN algorithm, whose security under the known ciphertext model was proved in Chen's scheme, I_2 and I'_2 are indistinguishable. Next, the process of simulating I'_1 is defined as follows.
(a) Initialise I'_1 as a two-level tree structure; (b) for each w_i ∈ W, generate a random tag'_i, i ∈ [1, |W|], store tag'_i in a first-level tree node, and initialise the corresponding second-level tree nodes as random sequences e'_ij. Since a pseudo-random function is used to process the keywords, tag_i is secure; moreover, because a symmetric encryption algorithm is used to encrypt fid_j, e_ij is secure. Without the secret key SK, A cannot distinguish between I and I', i.e. |Pr[GenIndex(D, SK, K) = I_b] − Pr[I' = I_b]| ≤ negl(λ).

(3) Simulated search trapdoor Tw'. For each q_i ∈ Q, i ∈ [1, n], generate K_i ← {0, 1}^λ, transform q_i into tag_{R[i]}, encrypt CBF_q with the KNN encryption algorithm to obtain Enc(CBF_q), and set thr_i = Min(ip)_i, where ip is the inner product of CBF_q and I_2[Fid]. Finally, generate the search trapdoor Tw'_i = (tag_{R[i]}, Enc(CBF_i), K_{R[i]}, thr_i), where thr_i is the smallest inner product of CBF_q and I_2[Fid]. Since the keywords are processed by the secure pseudo-random function F(·) and CBF_q is encrypted by the KNN encryption algorithm, tag_i and CBF_q are secure. Without the secret key SK, A cannot distinguish between Tw'_i and Tw_b, i.e. |Pr[GenTrapdoor(SK, W_q) = Tw_b] − Pr[Tw' = Tw_b]| ≤ negl(λ).

As A tries to analyze the ciphertexts, index and search trapdoors to win the game, let Adv(A(C)) denote the advantage of A in distinguishing real from simulated encrypted files, Adv(A(I)) its advantage in distinguishing the real index from the simulated index, and Adv(A(Tw)) its advantage in distinguishing real from simulated search trapdoors. The probability that A wins the game is then bounded by Pr[A wins] ≤ 1/2 + Adv(A(C)) + Adv(A(I)) + Adv(A(Tw)) ≤ 1/2 + negl(λ).

Theorem 6.2: The proposed scheme is semantically secure under the known background model.

Proof:
Under the known background model, the edge servers are allowed to record and analyze some query trapdoors, and they analyze the connections between the trapdoors they have collected. When a user submits the same query twice, with high probability different keywords are selected to generate the keyword tags, so the CBFs generated by the two queries differ. In addition, in the trapdoor generation phase, different random values t are chosen for the KNN encryption, so different trapdoors are generated for the same query. No PPT adversary can perform a linear analysis of the KNN encryption algorithm. Therefore, the trapdoors are unlinkable, which ensures that edge servers cannot tell whether two queries are identical (Xiao et al., 2020). This demonstrates that the proposed scheme is semantically secure under the known background model.

Theoretical analysis
As shown in Table 1, we compare the proposed scheme with Chen's, Fu's and Zhong's schemes in terms of functionality and computation cost (Fu et al., 2017; Zhong et al., 2020). Chen's scheme supports fuzzy search but does not realise dynamic updates. Fu's and Zhong's schemes implement fuzzy search and dynamic update, but their efficiency is lower than that of the proposed scheme.
In the comparison of computation cost, n indicates the number of files, |n| represents a full traversal of all files, m represents the number of keywords extracted from each file, H represents the time of one LSH operation, E_1 represents the time of one symmetric encryption, and B represents the time needed to build a Bloom filter. C denotes the time needed to build a CBF, where C is slightly larger than B. G denotes the average number of steps of a depth-first search. T represents the time needed to calculate the correlation between a keyword and a file, M represents the time of one inner-product computation, s denotes the number of query keywords, and t denotes the average number of files containing the same keyword, t ≪ n. In index generation, Fu's scheme requires LSH to process all keywords, computes the relevance score of each keyword with the files, and then generates one Bloom filter per file. Chen's scheme uses LSH to generate a tag for each keyword, encrypts all tags symmetrically, and then generates a Bloom filter for each file. Zhong's scheme needs to calculate the relevance of keywords and documents to realise top-k search, which consumes extra index generation time. Our scheme is similar to Chen's, but we use the CBF in the index generation stage. As C is slightly larger than B, the index cost of our scheme is slightly larger than Chen's. However, the CBF supports dynamic update, so it is worth sacrificing a little time overhead to gain the update function.
In trapdoor generation, both Fu's and Zhong's schemes perform an LSH operation on each query keyword, map the query into a BF, and calculate the relevance score of each query keyword. In Chen's scheme, if a user queries a single keyword, only a keyword tag needs to be generated; if the user queries multiple keywords, an LSH operation is performed on all keywords and a BF is generated. Our scheme is similar to Chen's, except that we use a CBF instead of a BF.
In the search phase, Fu's scheme computes the inner product of the trapdoor's BF with all files, which requires a large computational cost. Chen's scheme requires t symmetric decryption operations at the first level of the index, traverses |n| entries to find the corresponding file identifiers, and then computes the inner products of the trapdoor's BF and the index's BFs. Zhong's scheme computes the inner product of the trapdoor's BF and the index's BFs following a depth-first strategy. In the proposed scheme, when querying a single keyword, the edge server returns all file identifiers under the tag, so the computation cost is constant. When querying multiple keywords, the edge server only needs to compute t inner products between the index's CBFs and the trapdoor's CBF.
In the update phase, Fu's and Zhong's schemes need to perform LSH operations on the keywords added to or deleted from a file and calculate the correlation scores between those keywords and the file. Our scheme performs an LSH operation on the keywords of the added or deleted files, generates the CBFs and encrypts the file identifiers. Therefore, the computation cost of our update phase is lower than that of Fu's scheme.
In summary, considering index generation, trapdoor generation, the search phase, dynamic update and functionality, the proposed scheme has obvious advantages in both functionality and efficiency.

Efficiency
The theoretical analysis above suggests that the proposed scheme has clear advantages in overall functionality and efficiency. To assess the efficiency more accurately, we compare our scheme with Fu's, Chen's and Zhong's schemes in terms of index generation, trapdoor generation and the search phase.
The data set used in the experiments contains 129,000 abstracts describing NSF basic research awards from 1990 to 2003. A total of 3,000 abstracts are selected as experimental data, and stop words are removed using the NLTK library. The most frequent keywords are extracted from each file; each experiment is repeated 10 times and the average of the results is taken as the final result. The simulation is implemented in PyCharm; the experimental results are obtained using Python on a Windows 10 server with a Core12 CPU running at 3.0 GHz.

Figure 4(a) shows how the index generation time changes with the number of files. We extract 10 keywords from each file because the index construction time is related to the number of extracted keywords. Under the same number of files, our scheme takes less time to build an index than Fu's and Zhong's schemes. In addition, as the number of files increases, the index generation time grows much more slowly than in Fu's and Zhong's schemes. Note that although the index generation cost of our scheme is slightly higher than Chen's, we implement dynamic file updates on the edge server.

Figure 4(b) shows how the trapdoor generation time changes with the number of query keywords. Fu's and Zhong's schemes need to generate a BF for the query keywords, perform LSH operations, and calculate the relevance score of each keyword, which costs a large amount of time. Our scheme and Chen's are extremely efficient when generating single-keyword trapdoors. As the number of query keywords increases, the time of our scheme grows slowly, while that of Fu's scheme grows rapidly. In particular, when the user queries 10 keywords, our trapdoor generation takes over three-quarters less time than Fu's scheme.
Since our scheme is based on a two-level tree index generated from all files, the factors affecting the search time include the numbers of files, query keywords and extracted keywords, so these factors are analyzed separately by experiments.

Figure 4(c) shows how the search time changes with the number of files; 10 keywords are extracted from each file and 3 keywords are queried. Fu's scheme spends about 400 ms when the number of files is 1,000. Zhong's scheme spends about 180 ms with its depth-first search. When searching the same number of files, Chen's scheme needs 10 ms, while our scheme takes only about 5 ms. Moreover, in our scheme, the search time increases very slowly as the number of files grows.

Figure 4(d) shows how the search time changes with the number of query keywords; 1,500 files are searched and 10 keywords are extracted from each file. When searching a single keyword, Fu's and Zhong's schemes follow the multi-keyword procedure, while Chen's scheme needs about 4 ms and ours only about 2 ms. When searching multiple keywords, the search time of our scheme remains relatively stable despite the increase in query keywords.

Figure 4(e) shows the search time as the number of extracted keywords increases. As more keywords are extracted, more results match in the first-level index. The search time of our scheme grows linearly and slowly; it is much lower than that of Fu's and Zhong's schemes and grows more slowly than Chen's.

Figure 4(f) presents the search time when querying a single keyword. The search time of our scheme is constant and hardly affected by the number of files. Compared with Chen's scheme, ours is more efficient because no decryption is needed after the first-level index lookup.
Based on the above analysis, the actual performance of our scheme is consistent with the theoretical analysis. Compared with Fu's and Zhong's schemes, our scheme is more efficient in index generation, trapdoor generation, the search phase and the update phase. Compared with Chen's scheme, although our index generation is not faster, our scheme has clear advantages in search efficiency, as demonstrated by the experiments above. In addition, the extra time consumed in index generation is worthwhile, because it builds the CBFs that enable dynamic updates. Therefore, the proposed scheme has obvious advantages in both functionality and performance.

Result accuracy
Since the proposed scheme supports fuzzy search, the accuracy of the search results must be considered. The precision P is used to represent the accuracy of the search results, P = s_p / (s_p + t_p), where s_p denotes the number of true positives and t_p the number of false positives. To obtain fuzzy keywords, some keywords are selected at random and modified.
For the proposed scheme, two factors affect result accuracy: the number of query keywords and the threshold thr in the trapdoor. We set thr = k(|W_q| − 1), where k is the number of hash functions and |W_q| is the number of query keywords. Figure 5 shows the precision of the search results. For exact search, as the number of query keywords grows from 1 to 10, the precision declines slightly from 100% to 93% due to the occurrence of false positives. For fuzzy search, when searching one keyword, the precision reaches up to 91%, higher than in the multi-keyword case, because it is not affected by false positives. When two keywords are searched, the precision decreases because of the combined effect of false positives and LSH mapping. As the number of query keywords grows from 2 to 10, the precision slowly increases from about 85% to 89%, because as the number of query keywords increases, the impact of the false positives generated by fuzzy keywords decreases.
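The role of the threshold thr = k(|W_q| − 1) can be seen in a toy example: every query keyword present in a file contributes at least k to the inner product, so one missed keyword is tolerated. The hashing below is illustrative; the actual scheme maps LSH values of keywords rather than raw strings:

```python
import hashlib

M, K = 4096, 4                                   # CBF length, number of hashes

def positions(word):
    d = hashlib.sha256(word.encode()).digest()
    return [int.from_bytes(d[4 * i:4 * i + 4], "big") % M for i in range(K)]

def build_cbf(words):
    cbf = [0] * M
    for w in words:
        for p in positions(w):
            cbf[p] += 1
    return cbf

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

file_cbf = build_cbf(["search", "encrypt", "index", "update"])
query = ["search", "encrypt", "updaet"]          # one misspelled keyword
thr = K * (len(query) - 1)                       # tolerate one missed keyword
matched = dot(file_cbf, build_cbf(query)) >= thr
```

The two exactly-matching keywords alone contribute at least K each to the inner product, so the file still matches despite the misspelling.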
Based on the above analysis, the proposed scheme could meet the precision demand of searching in an intelligent edge network.

Conclusion
In this paper, we focus on how to realise multi-keyword fuzzy search over encrypted data in an intelligent edge network. First, based on existing techniques, we construct a two-level tree index with CBFs, which not only realises multi-keyword fuzzy search but also guarantees search efficiency and supports dynamic updates on the edge server. Second, we prove the correctness and security of the scheme under the honest-but-curious edge server model. Finally, we compare the scheme in terms of functionality and efficiency and carry out simulations to show its advantages.