Secure medical digital twin via human-centric interaction and cyber vulnerability resilience

As a fundamental service of the near future, a medical digital twin (MDT) is the virtual replica of a person. MDTs apply new technologies such as IoT, AI and big data to predict the state of health and offer clinical suggestions. Securing medical digital twins requires a deep understanding of digital twin design and the application of a new vulnerability-tolerant approach. In this paper, we present a new medical digital twin that systematically combines haptic-AR navigation and deep learning techniques to achieve virtual replication and cyber–human interaction. We report an innovative study of cyber–human interaction performance in different scenarios. With a focus on cyber resilience, a new vulnerability-tolerant solution is a must in real-world MDT scenarios. We propose a novel scheme for recognising and fixing MDT vulnerabilities, in which a new CodeBERT-based neural network is applied to better understand risky code and capture cybersecurity semantics. We develop a prototype of the new MDT and collect several real-world datasets. In the empirical study, a number of well-designed experiments are conducted to evaluate the performance of the digital twin, cyber–human interaction and vulnerability detection. The results confirm that our new platform works well, can support clinical decisions and has great potential in cyber resilience.


Introduction
The digital twin will be an important 6G service in the near future (Samsung, 2020), aggregating new technologies such as artificial intelligence (AI) (Boden, 2017), the Internet of Things (IoT) and big data analytics. Digital twins can be the virtual replicas of many physical entities, such as systems, devices, people and even places. One benefit of a digital twin is that, before actual actions are performed on physical entities, professionals can run simulations and tests using the digital twin (Zhang et al., 2020). We can see digital twin applications in the manufacturing, automotive and healthcare sectors, driven by the explosion of IoT sensors. Digital twins can significantly reduce maintenance burdens with their real-time monitoring capability.
A medical digital twin (MDT) is a virtual replica of a person. Employing a life-long data record and AI-powered models, an MDT can predict the health state and give clinical suggestions (Corral-Acero et al., 2020). For example, an MDT can predict the outcome of a lung cancer diagnosis. In the real world, a GE MDT platform combined discrete-event and agent-based methods to deal with the nuances of care delivery and predict patient flow. An MDT can track a patient's records, crosscheck them against known patterns, and analyse the disease features through AI. With the support of real-time data acquisition and processing, advanced data analytics and machine learning algorithms can produce accurate outcomes (Cao et al., 2021; Topol, 2019). An MDT can deliver long-term value for healthcare organisations through dramatic cost savings. It can also improve preventive care in healthcare organisations, e.g. predicting cardiopulmonary failure, respiratory arrest and lung cancer.
We aim to develop a new medical digital twin (MDT) with a focus on lung biopsy. Traditional lung biopsy navigation systems have two stages, preoperative and intraoperative. The intraoperative stage has two concerns: the operational tool's real-time location and the next step of the operation. Throughout the process, each guiding clue should be available to the surgeon in real time. For professional surgeons, real-time force feedback to the hands may be more sensitive and important than image-based guiding clues. In the new MDT, haptic sensation is introduced in the intraoperative stage to assist image guiding for lung biopsy. We propose a new haptic-AR-enabled guiding method with deep learning to improve the robustness of the lung MDT.
In recent years, video-assisted thoracoscopic surgery (VATS) has become a popular method for lung cancer surgery. VATS is a minimally invasive operation, which requires only a few small incisions in the patient's skin. It reduces the exposure of internal organs to external pollutants during surgery. Compared with traditional surgery, VATS offers better cosmetic results, shorter postoperative recovery time and a reduced need for painkillers. Therefore, our new lung MDT takes minimally invasive surgery as its reference. Recently, machine-human interaction through virtual reality has attracted considerable research. For example, Delp et al. proposed the application of VR technology in medical practice (Delp et al., 1990). However, there is a lack of quantitative research on machine-human interaction in MDT with VATS. The well-known laparoscopic surgery simulator and evaluation method (Veronesi et al., 2016) does not consider the key characteristics of a thoracoscopic surgery MDT: the narrow space, dangerous operations and complex details. In this paper, we bridge this gap through new MDT research.
Cyber resilience (Chen et al., 2020; Miao et al., 2022; Samsung, 2020; Zhang et al., 2021) is a major concern in real-world critical applications, including the development and application of MDT, where vulnerability tolerance is a fundamental requirement (Ahamad & Pathan, 2021; Liu et al., 2018). Exploitable MDT vulnerabilities are serious security threats to healthcare organisations and affect a large number of users (Qiu et al., 2021). To deal with the limitations of conventional static and dynamic techniques, machine learning (ML) has become popular and has been applied in software vulnerability detection. Compared to traditional techniques, ML-based software vulnerability detection has great potential for discovering unknown vulnerabilities and their variants (Sun et al., 2019). Existing ML-based methods use programming code features, traditional ML algorithms (Coulter et al., 2020a) and programming-related data such as developer activities and code commits (Lin et al., 2018) to build a detection model. However, feature engineering in traditional ML depends on programming experience, vulnerability expertise and IoT domain knowledge (Coulter et al., 2020b), which is unreliable. To address this problem, deep learning has been proposed and has demonstrated excellent performance in various scenarios (Liu et al., 2019; Wang et al., 2020). Existing methods include applying FCNs, CNNs and RNNs to model code characteristics and recognise vulnerability patterns in software. However, in the real world, we need an end-to-end solution that protects MDT while taking human-centric factors into account.
The innovation and contribution of the paper are threefold.
• We developed a novel Haptic-AR intervention MDT for advanced lung biopsy.
• We conducted a new study on XR-based machine-human interaction for lung biopsy prediction.
• We proposed a novel vulnerability tolerance scheme using CodeBERT for cyber resilience of the lung MDT.
The healthcare industry can use the new lung-biopsy-focused MDT to revolutionise clinical processes and hospital management. It will enhance medical care with digital tracking and advanced modelling of the human body. In industrial applications, the innovative haptic sensation can assist image guiding for lung biopsy. Moreover, the new vulnerability detection technique will secure industrial software and significantly improve the cyber resilience of the healthcare industry.
We organise the rest of the paper as follows. Section 2 reviews the related work. A new MDT framework and a new cyber vulnerability resilience solution are presented in Section 3. Section 4 reports the experiments and results to demonstrate the effectiveness of the new MDT and techniques. Section 5 concludes this paper.

Related work
This section reviews the work related to our research in three parts: MDT with XR and AI, MDT and its security, and neural models for vulnerability detection. Recently, XR and AI have been applied to build MDTs and provide intelligence and new functions. Cyberattacks have targeted networked MDTs to compromise critical medical infrastructure. It is urgent to research cyber resilience when developing new MDTs, which motivates the work reported in this paper.

MDT with XR and AI
MDT with XR (Ratcliffe et al., 2021) can support doctors with more information during the operation, e.g. rebuilding a three-dimensional virtual patient for 3D surgical guiding. There are two key components, three-dimensional reconstruction from CT/MRI images and model-patient registration. Recent research focuses on two critical problems affecting system performance, physiological patient motion and surgical instrument interactions (Villanueva et al., 2020). Existing commercial systems adopt either visual-guide or optical-guide mechanisms based on the infrared NDI Polaris. A precise registration between the reconstructed model and the patient is difficult because of posture impurity and the heterogeneity of lesions. IR-based navigation is significantly affected by signal blocking during actual operations. We research high-performance MDT with XR and provide a new proposal.
MDT systems use deep learning for tasks such as detecting COVID-19 pathogens (Burrer et al., 2020) and defending against cyber-attacks. For example, the Internet of Things was proposed to collect real-time physiological data, with the data encrypted to protect patient privacy. Researchers studied new anonymous IoT models and developed RFID proof-of-concept systems. In the relevant scenarios, the blockchain technique was applied to protect contract deployment and function execution. AI plays an important role in risk prediction and prognosis treatment in MDT. To relieve the burdens on hospital staff and other health personnel, blockchain-based applications were developed to monitor and manage COVID-19 patients digitally. 5G technology can support the improvement of virus tracking, patient monitoring, data collection and analysis. Along with other synchronous technologies such as IoT and AI, we can see a great potential to revolutionise healthcare (Ting et al., 2020).

MDT and its security
The number of deployed MDTs has been constantly increasing. Although the application of MDT has huge potential, practical constraints must be taken into consideration. For example, vending machines can be combined with IoT technology to facilitate a healthy lifestyle. However, cyber-attacks on MDT will lead to terrible consequences for critical medical infrastructure. The widespread application of MDT poses a severe threat to the secure operation not only of medical devices but of the entire MDT ecosystem (Coulter et al., 2020a).
To protect MDT security, cyber researchers and professionals have been working on secure systems and solutions to combat the increasing cyber-attacks (Wang et al., 2020; Xie et al., 2021). Extensive efforts have been made to ensure MDT security and privacy, providing practical guidance for medical industries. Recent survey papers discussed the opportunities and possible threats that MDTs face at home and in hospital, and the classification of MDT-targeted cyber-attacks (Fu et al., 2017; Yang et al., 2017). A new method (Boejen & Grau, 2011) utilised unmanned aerial vehicles to launch an attack in a smart hospital environment, which can compromise wearable healthcare sensors. A deep learning approach was proposed for real-time cyber-attack detection for MDT security protection, reporting high accuracy and significant time savings (Sethuraman et al., 2020).
In recent years, it has become widespread in MDT to combine virtual reality (VR) technology with medical specialities. Unity software was combined with a brain-computer interface to control the VR environment and MDT devices for future medical applications (Coogan & He, 2018). To improve the operation performance of the entire medical platform, our new research proposes to integrate VR technology, IoT and smart medical devices to replicate the clinical process of lung cancer with pulmonary embolism. In the education field, the new solution can significantly improve the learning experience by seamlessly combining conceptual learning with practical experience.

Neural models for vulnerability detection
The fully connected network (FCN) was first applied to code representation learning in software vulnerability detection. Compared with traditional machine learning, an FCN is able to build a complex model that captures nonlinear characteristics from a large dataset (Liu et al., 2021). Researchers have used FCNs to analyse source code and identify the invariance of security vulnerabilities. An FCN can take various forms of input data, i.e. it is input-structure independent, which enables researchers to explore different handcrafted features and information for software vulnerability discovery.
However, with existing FCN-based methods, the vulnerability detection performance relies on manual feature engineering. An FCN is not good at processing sequentially dependent data, such as program code, yet it is normally the joint action of context in the code control flow that leads to vulnerable programs. Hence, researchers have utilised context-aware neural network structures for software vulnerability detection. Considering that program code shares semantic and syntactic similarity with natural languages, natural language processing (NLP) techniques have been applied to learn the patterns of vulnerable code. With the capability of dealing with structured spatial data (Ramsundar & Zadeh, 2018), the convolutional neural network (CNN) was proposed to recognise semantically similar pixels for image classification. CNNs have also achieved success in text classification by capturing semantics within a context window. This motivates researchers to design CNN-based solutions to learn context-aware code semantics.
RNNs are naturally suited to modelling context dependence and processing sequential data, so they have been applied in vulnerability pattern recognition. LSTM, a variant of RNN, was used to predict vulnerabilities in binary programs, and the results show that LSTM outperforms traditional multi-layer perceptrons in this task. Considering that LSTM only captures one-directional code relationships, a bidirectional LSTM (Bi-LSTM) network was proposed for detecting vulnerabilities with 'code gadgets' (Li et al., 2016). The new model has some ability to capture the semantics of software vulnerabilities, e.g. buffer overflow, that are associated with multiple code fragments. Other researchers preferred to extract abstract syntax trees (ASTs) from source code and use the ASTs in representation learning through the Bi-LSTM network (Lin et al., 2017, 2018, 2021).

Human centric smart MDT with security
In a future era with AI, 6G and smart sensors, the healthcare system can seamlessly link the real-world patient and the digital replica through MDT to achieve precision medicine (Hamet & Tremblay, 2017; Saad et al., 2019; Wang et al., 2020). For example, future doctors will have the capability to explore and monitor the patient, and to detect and predict health problems remotely. Figure 1 shows a new secure MDT linking the real-world patient, AI prediction and telemedicine through a data-driven approach. Specifically, deep learning is applied to support intelligence in human-centric training and vulnerability-tolerant security. The new MDT provides trustworthiness through intelligent detection of software vulnerabilities. In the following three subsections, we introduce the technical details of AI for MDT, machine-human interaction, and secure MDT with advanced vulnerability detection.

Data-driven intelligent MDT
A real-world dataset is created by collecting clinical data of lung cancer patients from three hospitals in Yunnan and Chongqing. We conducted essential data analysis for evidence-based medicine, including risk factor analysis, meta-data analysis and epidemiology analysis. The result is a set of features consisting of age, gender, leukocyte count, pathological type, risk factors, d-dimer level, ECG, TNM stage, pathological location and thoracic CT findings.
The new MDT supports smart telemedicine and surgery navigation. We achieve visual rendering through a mesh reconstructed from patient-specific CT images, using a physics-based rendering algorithm. Our new method builds a microfacet model with more flexibility during visual rendering. Due to the new Fresnel term, the diffuse and specular reflections in our visual rendering have different forms and can accurately simulate the specific glistening effects of human organs. Moreover, we apply Unity, a multi-platform integrated game development tool, to build an application that allows healthcare managers to easily create interactive content. Our Unity application targets multiple platforms, e.g. Android, iOS, PC and the Web. It can deploy a VR project as an APK to an Android device, run the project and display it through a headset.
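The paper's customised Fresnel term is not specified in detail. As a point of reference, the standard Schlick approximation used by many microfacet renderers can be sketched as follows; the base reflectance value for organ tissue is an assumption for illustration only.

```python
def fresnel_schlick(cos_theta, f0):
    # Schlick's approximation of the Fresnel reflectance term;
    # f0 is the reflectance at normal incidence (view along the normal),
    # and reflectance rises towards 1.0 at grazing angles
    return f0 + (1.0 - f0) * (1.0 - cos_theta) ** 5

# assumed dielectric base reflectance for moist tissue (illustrative value)
f0 = 0.04
print(fresnel_schlick(1.0, f0))   # normal incidence: returns f0
print(fresnel_schlick(0.0, f0))   # grazing angle: approaches 1.0
```

A production renderer would combine this term with the microfacet distribution and geometry terms; this sketch isolates only the Fresnel factor discussed above.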
Our MDT uses a customised CNN to predict anomalies, e.g. pulmonary embolism in lung cancer. The neural network takes the labelled clinical data as input and builds a prediction model to support treatment decisions. It has five layers covering convolution, full connection and sigmoid activation. Compared to relevant work, our technique uses relatively little pre-processing for visual rendering. Instead of hand-engineered filtering and construction, the new method optimises the filters through automated learning and improves simulation performance. This data-driven approach is independent of prior knowledge and human intervention and has a significant advantage in feature extraction.
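To illustrate the shape of such a model, the following NumPy sketch runs a forward pass through a small convolution-plus-dense network ending in a sigmoid. The layer sizes, kernel count and random weights are illustrative assumptions, not the authors' trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    # valid 1D convolution: x has shape (n,), kernels has shape (k_out, k_len)
    k_out, k_len = kernels.shape
    n_out = x.size - k_len + 1
    out = np.empty((k_out, n_out))
    for i in range(k_out):
        for j in range(n_out):
            out[i, j] = np.dot(x[j:j + k_len], kernels[i])
    return out

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(features, params):
    h = relu(conv1d(features, params["conv"]))    # convolution layer
    h = h.ravel()
    h = relu(params["fc1"] @ h + params["b1"])    # fully connected layer
    z = params["fc2"] @ h + params["b2"]          # output layer
    return sigmoid(z)[0]                          # anomaly probability

# 10 standardised clinical features (age, gender, leukocyte, d-dimer, ...)
x = rng.standard_normal(10)
params = {
    "conv": rng.standard_normal((4, 3)) * 0.1,    # 4 kernels of width 3
    "fc1": rng.standard_normal((8, 4 * 8)) * 0.1,
    "b1": np.zeros(8),
    "fc2": rng.standard_normal((1, 8)) * 0.1,
    "b2": np.zeros(1),
}
p = predict(x, params)
print(f"predicted pulmonary-embolism risk: {p:.3f}")
```

In practice the weights would be learned from the labelled clinical dataset rather than drawn at random.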

Machine-human interaction
Considering machine-human interaction through the proposed MDT, we implement VatsSim-XR, a surgery simulator that integrates VR, AR and MR modes for clinical training. The training system consists of a laptop computer, two force feedback devices, a helmet-mounted display and two positioners. The computer-simulated virtual environment integrates cognitive and motor skills to achieve VR-based clinical training. AR simulation and high-quality cameras are used to implement the digital twin and achieve comfortable machine-human interaction. We use a panoramic camera to record the natural environment of an operating room and combine it with the virtual environment shown in Oculus helmets for MR-based doctor training. There are three learning modules in our MDT. The virtual simulator has two primary functions, visual rendering and tactile simulation. The visual rendering reconstructs the surgical instruments and environment. The tactile simulation utilises force feedback to implement tactile-visual interaction through timely displacement and force updates; contact with the surgical instrument triggers the force feedback device to generate the corresponding force.
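The force update in the tactile loop can be illustrated with a simple penalty-based spring-damper contact model. The stiffness and damping constants below are assumed values; the simulator's actual haptic model is not described in the paper.

```python
def contact_force(penetration_depth, stiffness=800.0, damping=2.5, velocity=0.0):
    # penalty-based haptic contact (illustrative assumption): when the virtual
    # tool penetrates tissue, a spring-damper force opposes the penetration
    if penetration_depth <= 0.0:
        return 0.0                      # no contact, no force
    force = stiffness * penetration_depth - damping * velocity
    return max(force, 0.0)              # the device never pulls the tool inward

# 2 mm penetration at rest: spring term only
print(contact_force(0.002))
```

A real haptic loop would evaluate this at around 1 kHz, feeding each displacement reading from the device into the force output.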
We invite dozens of doctors to participate in the real-world experiments. A t-test analysis is performed on the data collected in the experiments. The novice doctors receive two weeks of training after their first try and then repeat the experiments for a comparative study. All novice and expert doctors must learn how to operate the MDT simulator before officially participating in the experiments. They must understand the operating rules of the three training experiments: peg transfer, blood vessel cutting and shearing, and rope perforation. All doctors conduct the experiments in three modules: VR, AR and MR.
The participating doctors are given the operating requirements and the scoring specification before conducting the experiments of machine-human interaction in the MDT. In the module of peg transfer, the task is to use the left surgical forceps to pick up the left object and transfer it to the right forceps; the participant then performs the same task starting with the right forceps. The number of failures to correctly move the objects is recorded for analysis. In the module of blood vessel trimming, the task is first to use the left surgical clip to clamp the left end of the blood vessel, and the same task is then repeated with the other surgical clamp. Finally, the task is to use the right surgical clamp to cut the blood vessel in the middle and catch the rope at the other end of the small tunnel. The number of drops in the experiments is recorded for analysis. In the module of rope perforation, the task is to use the left surgical clamp to grab one end of the rope and pass it through the small hole; the task is then repeated using the right surgical clamp.

Secure MDT with advanced vulnerability detection
Our new research methodology is shown in Figure 2. Several real-world software projects and a large number of synthetic vulnerable functions are used in the learning and validation of deep neural models. We process the source code and transform it to be compatible with advanced deep embedding models such as CodeBERT. The source code files are loaded and processed to generate sequence data and labels, which are passed to the next stage to produce code embedding vectors in a high-dimensional space. These vectors are then partitioned and fed to a neural network such as a GRU for the deep learning process. We feed CodeBERT with the synthetic data to obtain a fine-tuned model, which outputs the vulnerability probabilities. Based on the fine-tuned model, we investigate the impact of several key parameters, such as the length of the input sequence, batch size, epoch count and learning rate.
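The loading-and-sequencing stage can be sketched roughly as below. The regex lexer is a crude stand-in for CodeBERT's subword tokenizer, and the maximum sequence length of 256 is an assumed setting (the paper tunes this parameter in the experiments).

```python
import re

MAX_LEN = 256            # assumed input-sequence length
PAD, UNK = "<pad>", "<unk>"

def tokenize(source):
    # crude lexer standing in for CodeBERT's subword tokenizer:
    # identifiers, integer literals, and single punctuation characters
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]", source)

def encode(tokens, vocab, max_len=MAX_LEN):
    ids = [vocab.get(t, vocab[UNK]) for t in tokens[:max_len]]  # truncate
    ids += [vocab[PAD]] * (max_len - len(ids))                   # pad
    return ids

# toy vulnerable function in the CWE-120 (unchecked strcpy) style
func = "void copy(char *dst, char *src) { strcpy(dst, src); }"
tokens = tokenize(func)
vocab = {PAD: 0, UNK: 1}
for t in tokens:
    vocab.setdefault(t, len(vocab))

ids = encode(tokens, vocab)
label = 1   # vulnerable
print(len(ids), ids[:8])
```

The fixed-length id sequences and their labels are what the embedding stage consumes to produce the high-dimensional vectors described above.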
Our research addresses the following critical questions. What is the impact of deep code embedding on the vulnerability resilience of MDT? We design a new scheme to facilitate different embedding models, including CodeBERT, Word2vec, GloVe and FastText; the novelty involves code processing, transformation and representation. Can synthetic data help to improve the robustness and effectiveness of embedding models? We apply a new approach that combines a real-world dataset and a synthetic dataset to achieve fine-tuned models such as fine-tuned CodeBERT, and we explore the nature of the data and model dependence. How does the input sequence length affect code embedding? We analyse the impact of various sizes of input code sequences on the detection performance of deep neural models.
Context information is crucial for analysing various vulnerable functions: the same variable can have different meanings in different contexts. CodeBERT can be customised to extract high-level code representations for the task of detecting vulnerable C functions. Conventional embedding models do not consider the value variation of a variable and convert a word into a fixed vector, which limits the captured code semantics. CodeBERT has two training objectives, masked language modelling and replaced token detection. For vulnerability detection, we take advantage of the second objective and use a large amount of unimodal source code. We leverage the encoder to retain useful information through a residual connection design. To learn the patterns of the C language, we incorporate transfer learning into our new scheme and obtain relevant syntactic and semantic information from the synthetic data of other languages.
Checking contiguous information alone is generally insufficient to generate semantically rich vector representations. Vulnerable functions may contain declarations, assignments, control flow and other operational logic. Multi-head attention enables the model to focus on multiple key points, which facilitates the capture of potentially vulnerable functions. In addition, long-distance contextual information is crucial for vulnerability detection, so a positional encoding layer is added to our new scheme. A vulnerable code fragment usually has constituent parts linked to previous or subsequent code, or even to both. Hence, the new design of our neural embedding network takes this into consideration and detects long-term dependencies in both the forward and backward directions.
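The paper does not give the form of its positional encoding layer. A common choice, sketched here, is the sinusoidal encoding from the original Transformer, sized to CodeBERT's 768-dimensional hidden states; the sequence length of 256 is an assumption matching the truncation setting discussed later.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # sinusoidal encoding: even dimensions use sine, odd dimensions use cosine,
    # with geometrically spaced wavelengths so each position gets a unique code
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model)[None, :]              # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

pe = positional_encoding(256, 768)   # CodeBERT-sized hidden dimension
print(pe.shape)
```

These encodings are added element-wise to the token embeddings before the attention layers, giving the model access to the long-distance positional relationships discussed above.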
We propose a fine-tuning solution that allows a deep neural model to learn the syntax and structure of the C programming language and understand more code semantics. Through transfer learning, the customised CodeBERT can learn the characteristics of the C language from the synthetic functions. Our synthetic dataset has sufficient quantity and diversity, avoiding the scarcity of real-world open-source projects and their time-consuming labelling. It includes basic code patterns and syntax, with accurate labels for balanced training and optimisation.
In our fine-tuning method, we use CodeBERT as the pre-trained source model on the source dataset. Then, we build a target deep neural network, which replicates all source model structures and parameters except the output layer. Next, we add a fully connected output layer to the target model, whose output size is the number of classes in the target dataset. Finally, we train the target model on the target dataset to achieve the fine-tuning purpose.
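The head-replacement step above can be sketched as follows, with a parameter dictionary standing in for the real pre-trained CodeBERT weights; the layer names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# stand-in for the pre-trained source model: named parameter arrays
source_model = {
    "encoder.layer0.weight": rng.standard_normal((768, 768)),
    "encoder.layer1.weight": rng.standard_normal((768, 768)),
    "output.weight": rng.standard_normal((50265, 768)),  # original LM head
}

def build_target(source, n_classes):
    # replicate every structure and parameter except the output layer...
    target = {k: v.copy() for k, v in source.items()
              if not k.startswith("output.")}
    # ...then attach a freshly initialised fully connected output layer
    # sized to the number of classes in the target dataset
    target["output.weight"] = rng.standard_normal((n_classes, 768)) * 0.02
    return target

target_model = build_target(source_model, n_classes=2)  # vulnerable / safe
print(target_model["output.weight"].shape)
```

Training then proceeds on the target dataset, updating the copied encoder parameters and the new head together.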
Source code functions have varying lengths, which exerts a tremendous influence on detection performance. Over-long code needs to be truncated, balancing excessively long vectors against information loss. There are also long-distance dependencies, given that some vulnerable statements may lie many sentences away from their locus of attention. Without a bidirectional structure, earlier information may slip from the memory of recent embedding models.
We conduct a new study on the length of input sequences for deep neural models, focusing precisely on the key points of function vulnerability.

Experiments and results
We design and carry out different experiments to evaluate the performance and security of the new MDT and demonstrate the effectiveness of proposed techniques.

How does our MDT affect surgical training?
Our MDT has built-in metrics in the different modules of surgical training and performs data recording and analysis. The peg transfer module records and analyses the total procedure time and instrument pathway. The module of blood vessel clipping and cutting uses the metrics of total procedure time, instrument pathway, and the error of clipping and cutting. In the rope perforation module, we consider the total procedure time and instrument pathway. In addition, we also investigate some other factors, including the number of successfully transferred small objects, the number of times the rope successfully passes through the small holes, and the number of failures. A large number of experiments are conducted to evaluate the effectiveness of the new MDT through comprehensive analysis of the data from the novice group and the expert group, and the data collected before and after the training. Our questionnaire uses a 5-point Likert scale to assess the visual and tactile sensations of the four training methods; it is filled out by senior doctors.
In our study, descriptive statistical methods are used to analyse the questionnaire data. First, we conduct reliability analysis on the questionnaire data and consider it credible when Cronbach's alpha coefficient is greater than 0.7. In terms of construct validation, an independent-sample t-test is applied, and the simulator effect is considered significant when the p-value is less than 0.05. Regarding improvement, we use histograms to show whether each doctor improves his or her skills after training. The entropy method is used to determine the index weight of each evaluation item in the different modules. We apply the range method to remove the dimensions of each evaluation item, eliminating the influence of physical quantities, and compute the coefficient of variation to obtain the index weights.
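The entropy weighting described above can be sketched as follows: range (min-max) normalisation removes each metric's units, and the entropy of each column then determines its index weight. The score matrix and metric choices below are made-up illustrations, not data from the study.

```python
import numpy as np

def entropy_weights(scores):
    # scores: (n_participants, n_metrics); cost-type metrics such as time
    # would normally be inverted before normalisation (omitted for brevity)
    x = scores.astype(float)
    # range normalisation eliminates the physical dimensions of each item
    x = (x - x.min(axis=0)) / (x.max(axis=0) - x.min(axis=0) + 1e-12)
    p = x / (x.sum(axis=0) + 1e-12)                  # each participant's share
    n = x.shape[0]
    e = -np.sum(p * np.log(p + 1e-12), axis=0) / np.log(n)  # entropy per metric
    d = 1.0 - e                                      # degree of divergence
    return d / d.sum()                               # index weights, summing to 1

# hypothetical per-doctor scores: procedure time (s), pathway length (m), errors
scores = np.array([
    [52, 1.8, 2],
    [47, 1.6, 1],
    [61, 2.2, 4],
    [40, 1.4, 0],
    [55, 2.0, 3],
    [45, 1.5, 1],
])
w = entropy_weights(scores)
print(w, w.sum())
```

Metrics that spread participants out (low entropy) receive larger weights, which is why the entropy method can highlight the evaluation items that best discriminate skill levels.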
To demonstrate that our new MDT system works well, the participating doctors conduct the required operations step by step and the MDT outputs their performance based on pre-defined criteria. The performance computed automatically by the MDT can accurately differentiate novices from experts. It also shows that novice doctors can improve their surgical skills after training with our MDT. In the module of small object transferring, expert doctors perform better than in the other three training modes. Compared with the VR-based training, the AR-based training can enhance the reference to the environment, but it may have the opposite effect and increase the difficulty of the operation. The MR-based training can improve the doctor's perception of the operating room and gives a higher sense of immersion compared to the VR-based training. We take various indicators, including environment reference and simulation time, into consideration when analysing the experiment results. In the module of rope perforation, the performance of expert doctors is significantly better than that of novice doctors due to the high difficulty of the operation. Wearing a helmet display may cause some discomfort, so the experts take a longer time in this training module. The novice doctors' performance gradually becomes consistent with that of the expert doctors. The influence of different training methods is reported in Figure 3. We can see that the novice group improves their surgical skills through training. After training, the movement trajectory of the surgical clip operated by the novice doctors is better.
We analyse the results obtained in the module of peg transfer. Figure 4 shows that the novice group improves their surgical skills through training. Some novice doctors achieve the expert level in the operation. Considering the time to complete the training experiments, the AR-based training takes the longest time while the Box-based training takes the shortest; the VR-based and MR-based training take similar times. The results suggest that the actual operating environment can affect a doctor's operation performance. Turning to the moving distance of the surgical clip, the distance is the longest in the AR-based training, followed by the VR-based training. Combining different objects as position references may affect the judgment of novice doctors and negatively affect their performance. The motion trajectory of the novice doctors' surgical clip is more concentrated after training. In terms of the number of drops of small objects, the Box-based and AR-based training have more drops, while the MR-based training can significantly decrease the average number of drops for novice doctors. Based on the entropy analysis method, the MR-based training can quickly improve this particular operation skill of novice doctors.
We investigate the impact of the various training modes on the operation of vessel cutting. Figure 4 shows that the novice group improves their vessel cutting and shearing skills through training. Some novice doctors achieve the expert operation level after training. The Box-based training takes less time than the other three training approaches but has the most significant error affecting blood vessel cutting and shearing. The authenticity of the Box module is lower than the others, which suggests that module authenticity may affect operation performance. We also consider the movement distance of the surgical clip. Regarding the time to complete, the improvement of the VR-based training is the largest, and the AR-based training takes the shortest time. The module authenticity and the environment can speed up a doctor's judgment, but may affect operation accuracy. The AR-based training can improve thinking quality and increase operation accuracy, so the trajectory movement distance is shorter and the errors are minor. The shorter time spent on MR shows that experts need less adaptation in a familiar environment. The results suggest that vascular clipping and shearing errors are related to time and the movement distance of the surgical clip. The MR-based training is suitable for medical training, which is proven in part by the movement trajectory of the surgical clip. The doctor can use the actual object as a reference standard in the MR module, as supported by the entropy analysis.
We analyse the results of the rope experiments. As shown in Figure 4, the novice group improves their surgical skills through training compared with the data of expert doctors. The rope module presents a different field of vision in the competing approaches: the VR-based and AR-based training provide a broader field of vision than MR, so the AR-based training takes the shortest time while the MR-based training takes the longest. The novice doctors spend similar time in the AR-based and VR-based training. Rope piercing is difficult, and the expert doctors do not demonstrate significantly better performance than the novice doctors. The MR-based training still brings the largest improvement in the time to complete for novice doctors, and the fidelity of the environment has a great impact on this improvement. The trajectory of the VR-based training is the most concentrated because its field of view is the widest. Through training, novice doctors achieve performance close to that of experts.
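The entropy analysis used above to judge trajectory concentration can be sketched as follows. This is an illustrative implementation under our own assumptions (grid binning with Shannon entropy), not the authors' exact method: trajectory points are binned into grid cells, and a lower entropy indicates a more concentrated, steadier instrument trajectory.

```python
import math
from collections import Counter

def trajectory_entropy(points, cell=1.0):
    """Shannon entropy of a 2-D trajectory binned on a square grid.

    Lower entropy means the trajectory is more concentrated,
    a proxy for steadier instrument motion.
    """
    bins = Counter((int(x // cell), int(y // cell)) for x, y in points)
    total = sum(bins.values())
    return -sum((n / total) * math.log2(n / total) for n in bins.values())

# A concentrated trajectory stays in one cell (entropy 0);
# a scattered one visits many cells (higher entropy).
concentrated = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1)]
scattered = [(0.5, 0.5), (3.5, 1.5), (7.2, 4.8), (9.9, 0.1)]
print(trajectory_entropy(scattered))  # 2.0 (4 equally visited cells)
```

With this measure, a novice's post-training trajectory would register a lower entropy than the pre-training one, matching the qualitative observation in the text.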

What vulnerabilities could be detected through advanced neural learning?
Two datasets (Lin et al., 2021) are used in the empirical study to evaluate the performance of advanced neural learning for MDT vulnerability detection. The real-world dataset consists of 12 open-source software projects and libraries, including Asterisk, Httpd, Imagemagick, LibPNG, LibTIFF, OpenSSL, Pidgin, qemu, samba, VLCPlayer and Xen. It is a dual-granularity vulnerability detection dataset, providing labelled file-level and function-level vulnerabilities according to the information from the public NVD (https://nvd.nist.gov/) and CVE (https://cve.mitre.org/). The NVD is the U.S. government repository of standards-based vulnerability management data; the mission of the CVE program is to identify, define and catalogue publicly disclosed cybersecurity vulnerabilities. Our experiments use about 2000 vulnerable functions and 130,000 non-vulnerable functions for performance evaluation. Our synthetic vulnerability dataset contains function samples from the SARD project: artificially constructed code fragments based on known vulnerability patterns.
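With roughly 2000 vulnerable against 130,000 non-vulnerable functions, the dataset is heavily imbalanced (about 1.5% positives). A common remedy, sketched below under our own assumptions (the paper does not specify its weighting scheme), is to weight each class inversely to its frequency during training:

```python
def inverse_frequency_weights(counts):
    """Per-class weights inversely proportional to class frequency,
    normalised so the weights average to 1 across classes."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}

# Approximate class sizes reported for the real-world dataset.
counts = {"vulnerable": 2_000, "non_vulnerable": 130_000}
weights = inverse_frequency_weights(counts)
print(weights["vulnerable"])      # 33.0
print(weights["non_vulnerable"])  # ~0.508
```

The rare vulnerable class then contributes roughly 65 times more per sample to the loss than the majority class, so the detector is not rewarded for predicting "non-vulnerable" everywhere.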
We implement several methods and conduct a set of experiments to evaluate the impact of different neural embedding techniques on the vulnerability detection performance. Figure 5 shows the results of detecting vulnerable functions in the top-k returned results. The embedding technique affects the detection precision significantly: it varies from about 30% to about 60% when we check the top 1% of ranked functions. The recall of CodeBERT is much better than that of Word2Vec, GloVe and FastText; the gap reaches around 10% for the top 1% of returned functions.
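The top-k metric used in Figure 5 can be computed as below, a minimal sketch assuming each function has a predicted vulnerability score and a binary ground-truth label:

```python
def topk_precision_recall(scores, labels, k_percent=1.0):
    """Precision and recall over the top k% highest-scored functions.

    scores: predicted vulnerability scores, one per function.
    labels: 1 for a vulnerable function, 0 otherwise.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    k = max(1, round(len(ranked) * k_percent / 100))
    hits = sum(label for _, label in ranked[:k])
    return hits / k, hits / sum(labels)

# Toy example: 10 functions, 2 of them vulnerable.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05, 0.02, 0.01]
labels = [1,   0,   1,   0,   0,   0,   0,   0,    0,    0]
precision, recall = topk_precision_recall(scores, labels, k_percent=30)
print(precision, recall)  # 2/3 precision, 1.0 recall in the top 30%
```

Ranking-based evaluation like this reflects the analyst's workflow: only the highest-ranked fraction of functions is ever inspected manually.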
A set of experiments is designed and carried out to evaluate the impact of fine-tuning on CodeBERT. To fine-tune the model, we add a fully connected layer and apply the synthetic SARD dataset for parameter optimisation. As shown in Figure 5, our fine-tuning method improves the precision by up to 10% when retrieving 1% of vulnerable functions. In particular, our system can discover most vulnerable functions by checking the top 15% of ranked functions. Compared to the other detection models, our fine-tuning method significantly improves the performance, by about 25% in precision and 20% in recall.
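The added fully connected layer is essentially a logistic classification head on CodeBERT's pooled output. The transformer itself is too large to reproduce here, so the sketch below trains only such a head over pre-computed embeddings; the 768-dimensional width matches CodeBERT's pooled vector, but the embeddings are synthetic random stand-ins for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_head(X, y, lr=0.1, epochs=200):
    """Train a single fully connected layer (sigmoid output) with
    binary cross-entropy on frozen pooled embeddings X, labels y."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid activation
        w -= lr * X.T @ (p - y) / len(y)        # cross-entropy gradient
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic stand-in for CodeBERT pooled outputs: 768-d vectors where
# the positive class is shifted along the first dimension.
X = rng.normal(size=(200, 768))
y = (rng.random(200) < 0.5).astype(float)
X[:, 0] += 3.0 * y
w, b = train_head(X, y)
pred = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
print((pred == y).mean())  # high training accuracy on the toy data
```

In the real pipeline the head's gradients also flow back into CodeBERT's layers during fine-tuning, which is what adapts the pretrained representations to vulnerability semantics.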
We conducted a set of experiments to choose a suitable sequence length for software vulnerability detection in MDT. For the C language, the results show the best and most reliable performance when the sequence length is 256. The precision for the setting of 128 is inferior, only 25%. The essential information of a vulnerable function includes the header file, variable declarations, parameters, logic code and return value. If the sequence length is too short, the model may miss critical information and produce poor results. When the sequence length is too long, the performance is poor due to high noise: the sequences of vulnerable functions are automatically padded with 1s during the embedding process, and too much irrelevant information misleads the focus of the detection model.
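The truncation and padding behaviour described above can be sketched as follows. The pad token id of 1 follows the paper's description; the tokeniser that produces the token ids is abstracted away:

```python
def pad_or_truncate(token_ids, max_len=256, pad_id=1):
    """Fix a token sequence to max_len: truncate long functions,
    pad short ones with pad_id (the sequences are padded with 1s)."""
    if len(token_ids) >= max_len:
        return token_ids[:max_len]
    return token_ids + [pad_id] * (max_len - len(token_ids))

short = pad_or_truncate([7, 8, 9], max_len=6)
long_ = pad_or_truncate(list(range(10)), max_len=6)
print(short)  # [7, 8, 9, 1, 1, 1]
print(long_)  # [0, 1, 2, 3, 4, 5]
```

This makes the trade-off concrete: a short `max_len` discards the tail of a long function (possibly the vulnerable logic), while an overly long one fills short functions with pad tokens that add noise.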
Compared to the baseline of Flawfinder, which ranks functions according to their vulnerability level, our neural embedding approach ranks functions by their vulnerability probability. The experiment results show that the deep neural model with an optimised sequence length substantially outperforms Flawfinder. Our model achieves a precision of 70% and a recall of about 50% in the top 1% of returned functions, where Flawfinder shows markedly lower precision and recall.
We carried out a set of experiments to evaluate the performance of Word2Vec, GloVe, FastText and CodeBERT for vulnerability detection. Word2Vec performs word embeddings according to word co-occurrence and proximity in space, but it does not consider context. GloVe and FastText improve on Word2Vec, yet all three use a single vector to represent a word and suffer from the issue of polysemy. Contextual word embeddings were proposed to capture the context of a word within a sentence. A vulnerability pattern may span several lines, exhibiting long-distance dependence. CodeBERT generates different vectors for the same code token in different contexts instead of a one-to-one correspondence. Figure 5 reports the performance of the different embedding methods and shows their varying capacities in learning vulnerability patterns; CodeBERT demonstrates the ability to generate high-level, deeply context-dependent representations. We use the synthetic dataset and an additional fully connected layer to fine-tune CodeBERT and improve vulnerability detection accuracy. Compared to the other three models, our method achieves over 10% improvement in precision and recall. We also examine the impact of the batch size, epoch count and input sequence length on the detection performance. An optimised input sequence length benefits the model and reduces information loss, improving the detection capability: in the experiments, the precision improves from 7% to 44% in the top 1% of returned functions. The results confirm that our new method learns complex patterns and generates high-level representations better than conventional embedding methods.
Active MDTs are increasingly connected to the Internet to enhance their functionality and capability. Increasingly, MDTs can be controlled via a mobile phone, and data can be transmitted remotely to support new applications. The connectivity of MDTs to the Internet facilitates information sharing and treatment delivery, but it also exposes MDTs to potential cybersecurity threats. These threats can be reduced and managed by implementing a vulnerability resilience strategy, and the responsibility for implementing and maintaining the cyber resilience of MDTs falls upon all stakeholders. Our research demonstrates a novel technique for MDT cyber resilience that can significantly benefit the medical industry.

Conclusion
This paper presented a new medical digital twin (MDT) that combines XR navigation and deep learning to achieve cyber–human interaction for clinical training and cybersecurity. We designed a new system and three training modules to support cyber–human interaction and improve clinical operation performance. We developed a new CodeBERT-based neural network to better understand risky code and capture cybersecurity semantics, so as to detect MDT vulnerabilities effectively. A large number of well-designed experiments were carried out to demonstrate the effectiveness and efficiency of the new MDT. The experiment results and comprehensive analysis show that the proposed techniques work well and that the new MDT can support clinical decisions and has great potential in cyber resilience.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by National Natural Science Foundation of China.