Automated threat modelling and risk analysis in e-Government using BPMN

Recent progress integrates security requirements into BPMN, enhancing its framework. Extensions aim to seamlessly embed security concepts, yet the inherent ambiguity of security terms may lead to misinterpretations and vulnerabilities. Unfortunately, many business process experts lack the expertise to accurately interpret and integrate vital security concepts. In this study, we present an innovative automated methodology tailored to assist business process experts in identifying security threats and conducting risk assessments, particularly in the context of e-Government processes. Our approach streamlines the process, requiring only a business specialist to annotate BPMN entities with high-level, non-security-related information. Based on these annotations, potential threats to the system can be automatically identified. To develop our methodology, we leverage the standard BPMN annotation mechanism. From the annotated BPMN, the methodology utilises the ENISA Threat Landscape knowledge base for threat identification and employs the OWASP Risk Rating Methodology for risk assessment. To demonstrate the effectiveness of our approach, we applied it to a straightforward case study within the e-Government domain. Through this example, we illustrate how our methodology can be employed to ensure compliance with the General Data Protection Regulation and meet the mandatory Data Protection Impact Assessment requirements.


Introduction
The recent European Cybersecurity Act (2019) outlined that the use of network and information systems by citizens, organisations and businesses across the Union is now pervasive, that cyberattacks are on the increase, and that a connected economy and society that is more vulnerable to cyber threats and attacks requires stronger defences. This is one of many new laws and regulations that move in the direction of imposing the adoption of correct security countermeasures in the design of private and public systems. Other examples are the GDPR and the Network and Information Security (NIS) Directive. The recent Covid-19 pandemic has accelerated the adoption of Internet-based services and, at the same time, the associated risks, increasing the need for adequate and secure infrastructures. In particular, e-Government services are increasingly used, also due to the Covid-19 pandemic. As shown in some works (Raza et al., 2020; Sharfuddin, 2020), due to social isolation, citizens began to use e-Government systems to pay bills, rates and taxes or to obtain official documents (e.g. certificates). The need to use these systems is therefore strongly felt, but what holds them back the most are doubts regarding the security and privacy problems (Liang et al., 2023) that the systems may have. Citizens' doubts about e-Government security are, however, not unmotivated. As an example, Zhao and Zhao (2010) identified opportunities and threats to U.S. state e-Government sites, including privacy problems. Some e-Government systems, as proven by Thompson et al. 
(2020), have a low security level. In fact, the authors highlight how e-Government sites are subject to common attacks such as SQL injection and cross-site scripting. However, security is not only a matter of technological solutions or the adoption of dedicated tools: it affects the policies devoted to managing the infrastructures and the organisational processes of a system. As a consequence, there is a clear need to identify and model security aspects when describing the processes of an information system. Among others, Leitner et al. (2013) stated that accounting for security aspects during the early modelling stage of a business process is highly beneficial. They also performed a survey to evaluate the comprehensibility of multiple BPMN security extensions, concluding that, although the considered sample of business experts was able to identify and interpret many symbols in the business process context, participants found it difficult to interpret symbols where domain knowledge was required. The comprehensibility of security concepts integrated within business process models is crucial to avoid misinterpretation and consequent security problems.
As a matter of fact, the term security is an umbrella that covers many different concepts, models, and technologies. NIST provides the Cyber-Security Framework, recently updated to version 2.0 (Barrett, n.d.), which represents one of the most comprehensive and up-to-date systematic overviews of cyber-security activities and concepts. This most recently released version of the framework incorporates the latest advances in the field and provides a more uniform interpretation of the roles of existing security standards. Expecting business experts to take all such concepts into account from the very early design phase may result in an impractical approach: the core consideration at the basis of this work is that business modellers should focus only on the organisational aspects of the processes, while security aspects should be derived as an independent and possibly automated step.
The purpose of this work is to define an automated technique to identify the security threats that characterise a particular process in the e-Government field and rate the associated risk in an almost fully automated way.
In practice, instead of modelling security inside the business processes, we ask the business modeller to include some extra information in their models (checked through a validation step) to enable our technique to automatically identify the menaces that may affect the process. Accordingly, the modeller will be able to change the process and/or add the suggested countermeasures to the process, without having specific competencies in the security field.
Contrary to other existing approaches, the proposed methodology requires the modeller only to enumerate the technologies leveraged by the process. No information about the security of the process components, nor about the security properties that must be guaranteed, is required, as these are automatically retrieved.
It is worth noticing that the risk analysis process heavily relies on the experience of the analyst and on human-based decisions. However, in a complex system it is very hard (and costly) to involve senior security experts all along the system life cycle. As a matter of fact, our methodology aims at automating all the steps that do not need an experienced analyst, limiting the (costly and time-expensive) work of security experts to a validation step, and offering a pre-constituted security evaluation base. The proposed approach was developed focusing on the e-Government context: as demonstrated in the state-of-the-art section (Section 2), citizens' demands for security guarantees strongly affect the concrete adoption of digital solutions and, at the same time, the existing infrastructures often show major limits in terms of adoption of security best practices. The technique was implemented through a web-based proof-of-concept that helps in performing the process modelling and the security analysis automation. The tool is available as open-source 1 and is offered as a service on our servers. 2 To demonstrate the effectiveness of the suggested approach, we applied it to a straightforward e-Government case study. We tailored the method's steps to generate a Data Protection Impact Assessment (DPIA), which is a GDPR requirement. However, it is worth noting that the comprehensiveness of the DPIA may be limited, as it is also influenced by the opinions and subjective considerations of the Data Protection Officer (DPO).
Such an exercise, even if applied to a simple case study, aims at demonstrating the flexibility of the proposed methodology and how it can be concretely adopted in a real environment.
In summary, the main contributions of this paper are:
• an innovative technique for automated threat modelling based on BPMN annotations (technology annotations, not security-related annotations);
• the identification of a set of properties, understandable to non-security experts, associated with the ENISA (European Union Agency for Cybersecurity) catalogue of threats;
• a fully automated threat modelling and risk rating technique;
• a customisation of the technique to produce a GDPR-compliant Data Protection Impact Assessment;
• a proof-of-concept tool that demonstrates the technique and was tested against real case studies.
The remainder of this work is organised as follows: Section 2 presents the state of the art related to security modelling in BPMN, motivating the work and outlining the need for the proposed technique. Section 3 describes the proposed technique in detail. Section 4 illustrates the application of the technique through a simple example. Finally, Section 5 summarises our conclusions and possible future works.

State of the art
The need for modelling security concepts in BPMN is a matter of fact, outlined in the above introduction by the new regulations and legal requirements that impose the adoption of security-by-design principles and of security best practices in system design and maintenance. However, how BPMN should model security is still an open research topic, which can be synthesised in two simple research questions, analysed in the following:
• Q1: What are the security concepts that should be integrated into the BPMN model?
• Q2: How should a modeller integrate security concepts into a BPMN model?

What are the security concepts that should be integrated into the BPMN?
In the literature, the reply to the first question varies considerably, depending on the way in which experts model security and on the goals of the BPMN model.
As a starting point, the most common approach in the literature addressing security in BPMN assumes that the BPMN should model security requirements. However, there is no common agreement on how to express such requirements, nor on exactly which ones they are.
The most common approach is to simply consider a list of predefined requirements and enable their insertion into the BPMN model. As an example, in 2007 Rodríguez et al. discussed in Rodriguez et al. (2007) an extension to the BPMN meta-model to include a predefined set of high-level cybersecurity requirements in the Business Process Diagrams, enabling business analysts to express their security needs. They included the concepts of non-repudiation, attack harm detection, integrity, privacy and access control, although they did not consider other important security requirements such as availability, confidentiality and auditability.
Another meta-model extension was introduced by Brucker et al. (2012) with regard to privacy (Li et al., 2017) requirements (access control, separation of duties, binding of duty and need to know). Beyond the SecureBPMN security language, the authors proposed a tool to model the security concepts during the modelling phase and also to enforce them at runtime. However, the proposed approach lacks many important security requirements: it is specific to guaranteeing access control, even automatically, but does not consider, for example, confidentiality and data integrity requirements, or service availability. In 2012, Cherdantseva et al. introduced in Cherdantseva et al. (2012) a BPMN extension to include the Information Assurance & Security (IAS) requirements. They enriched the BPMN model with IAS modelling capabilities by developing SecureBPMN, a graphical security modelling extension for BPMN 2.0. However, the proposed approach covers only a few security aspects. Sang and Zhou (2015) also extended the BPMN meta-model with new security elements, which can also be represented within the BPMN diagrams. The extension is highly focused on the three main CIA (Confidentiality, Integrity, Availability) security indicators and is applied to a healthcare case study. In Salnitri et al. (2014), Salnitri et al. proposed a framework to express security requirements in terms of BPMN annotations. They introduced the SecBPMN language to describe system information, whereas the security policies are defined through SecBPMN-Q, a query language for BPMN. The annotated security requirements derive from the Reference Model of Information Assurance and Security (RMIAS) (Cherdantseva & Hilton, 2013) and include accountability, auditability, authenticity, availability, confidentiality, integrity, non-repudiation and privacy.
An alternative approach focuses on considering security requirements as constraints on the model: Mülle et al. proposed in Mülle et al. (2011) a language to formulate security constraints and used BPMN artefacts as containers for security annotations. The authors extended an open-source business-process-management system (BPMS) from the business process modelling, through the configuration and up to the runtime phase, taking into account the concepts of authorisation, authentication, auditing, confidentiality, and data. A related work (2022) used the ArchiMate language (and some transformation rules) to model security in the business layer of enterprise architectures. However, the security language is expert-oriented and business experts could find it hard to understand. Maines et al. argued in Maines et al. (2015) that no accurate study had been conducted on BPMN security requirements, and introduced a new comprehensive cybersecurity ontology for specifying security requirements within BPMN, identifying a total of 79 security concepts. A year later, the authors of Maines et al. (2016) proposed to represent the BPMN security requirements in a third dimension. However, the extension is described only theoretically. In Chergui and Benslimane (2020), Chergui et al. proposed a BPMN meta-model extension based on the security requirements derived from the cybersecurity ontology in Maines et al. (2015). The extension is fully BPMN compliant and, in contrast to the other works, leverages the BPMN meta-model extension mechanism introduced in BPMN version 2.0. Also, they proposed a web tool to facilitate collaboration between business and security experts and provided an XML schema extension for integration with the existing BPMN modeller tools.
A completely different way of addressing security in BPMN relies on the idea of modelling threats in the BPMN, in order to identify possible malicious behaviour. Meland and Gjaere (2012) related the concept of threat modelling to the concept of business process modelling, presenting four different ways for threat specification at design time within BPMN. In particular, they discuss the pros and cons of threat representation as (i) error events, (ii) escalation events, (iii) annotations, and (iv) meta-model extensions.
A threat profile security framework was proposed as a BPMN extension by Zareen et al. (2020) in 2020. The framework is focused on security goals and provides a methodology for the systematic analysis of multiple security requirements. It is based on the Software Quality Requirements Engineering (SQUARE) and Software Requirements Engineering Process (SREP) processes, elaborating SQUARE using Common Criteria. The authors leveraged the extension mechanism provided in BPMN 2.0 to model the threat-based security requirements and introduced several graphical components for BPMN diagrams.
Similarly, in Altuhhova (2012), Altuhhova et al. proposed an extension for security risk management based on the alignment of BPMN to the Information System Security Risk Management (ISSRM) concepts. They extended the BPMN meta-model and BPMN diagram components to express assets, risks, and risk treatments. However, the work is limited to descriptive modelling. Table 1 summarises the proposed approaches and the papers that adopted each of them.

How to integrate security concepts in BPMN?
Regarding the second question, how to integrate security concepts in BPMN, the literature illustrates three approaches to extend the BPMN meta-model: (i) the adoption of additional, non-compliant components, (ii) leveraging the BPMN annotation system, or (iii) using the BPMN extension mechanism introduced in 2011 with BPMN 2.0. Table 2 summarises the papers, described above, that adopted each of the three approaches.
The first approach, which freely customises the BPMN notation, was fairly common with older versions of the standard and has the clear disadvantage of being non-standard: tools must be explicitly customised to support the new concepts, and interoperability is limited. However, it offers high flexibility with respect to the concepts that can be expressed. The second approach, which relies on the standard concept of annotation, has the opposite advantages and disadvantages: it is standard-compliant and every BPMN tool automatically supports such annotations, but it is limited in its capability to express new concepts and may rely on natural language, making it harder to use in an automated process. The last approach relies on the new BPMN 2.0, which tries to explicitly address the extension problem in general, offering a standard extension mechanism. While such an approach solves the interoperability problem while maintaining the flexibility of the first approach, tools must explicitly support the extension, for example through a plugin system, and this could be a limitation.

Considerations on the research questions
The proposed analysis of the state of the art, limited for space reasons, outlines a multiplicity of different ways to address security in BPMN. However, as already pointed out, business experts are generally not also security experts and may have difficulty understanding many security concepts and/or the correct way of using the proposed BPMN extensions.
Potentially, this could lead to incorrect or inaccurate security modelling, with severe consequences.BPMN security extensions try to address security requirements in business process modelling, although many of them are heavily security-centric, in the sense that they add a considerable number of security-related BPMN components, or are excessively verbose and, thus, difficult to manage (Mülle et al., 2011;Salnitri et al., 2014;Zareen et al., 2020).
Additionally, most of the proposed extensions still fail to express security concepts in a format that is fully comprehensible to business experts, while others propose only a theoretical or descriptive extension (Maines et al., 2016; Meland & Gjaere, 2012). Another ambiguity lies in the definitions of security goal, security objective, and security requirement, which are interpreted differently or used interchangeably, and in the completeness of the security information. Several extensions address only a reduced part of the security goals and objectives (Brucker et al., 2012; Cherdantseva et al., 2012; Mülle et al., 2011; Rodriguez et al., 2007), although others leverage ontologies and taxonomies to identify the security concepts (Chergui & Benslimane, 2020; Maines et al., 2015, 2016).
In the literature, threats are described in different ways depending on the system, the involved technologies and the interactions between systems, but, to the best of our knowledge, most of the existing threat modelling methodologies are applied to the software system and not to the business logic (Granata & Rak, 2023; Granata et al., 2021). Therefore, we propose a fundamentally different approach: a business model should solely encompass the business logic, as the expert can directly comprehend and manage it. Security threats, risks, and security requirements should be derived from the model, rather than being directly integrated into it. Accordingly, our paper presents a methodology capable of verifying that a model contains sufficient data to accurately identify security aspects and conduct an automated risk analysis. As a result, a set of security requirements (modelled in terms of standard security controls) is proposed. The business logic may include tasks that implement specific countermeasures, for example, a task that ensures that data are deleted after a given interval of time.
It is worth noticing that, as outlined in the future works, we are working on BPMN patterns that model a specific set of controls; such an approach may enable the automated verification that a BPMN model correctly implements a specific set of controls.
In our forthcoming research, our objective is to demonstrate a methodology for translating security requirements into terms of business logic and seamlessly integrating them into an established framework, with a strong emphasis on automation. We have initiated this effort, and an initial exploration of this approach can be found in the cited work, Rak et al. (2022).

Automated threat modelling technique
The state of the art outlines that: (i) there is a need for considering security requirements during business process modelling, (ii) business logic experts are very rarely also security experts and can hardly manage security concepts or the current state of the art of security extensions, and (iii) the existing solutions rarely prove complete in terms of the capability of modelling security (as outlined in the analysis in Table 2), also due to the ambiguity of the literature on security terminology and the lack of standard security requirements.
Accordingly, we defined a methodology to derive security requirements in an automated way (except for some information requested from the modeller), requiring the business logic expert to annotate the model only with information that does not require security expertise, but that we are able to use to derive security requirements.
Our methodology relies on four main phases, summarised in Figure 1: (i) the system modelling, (ii) the model refinement, (iii) the threat modelling, and (iv) the risk analysis.

System modelling and model refinement
Our approach assumes as a starting point a model of the (e-Government) system, described through the BPMN standard, in particular BPMN version 2.0. 3 Accordingly, each process is modelled through a set of activities, each identified by a task and a particular type (task type), namely Send, Receive, Service, Script, Business Rule, User, and Manual. Note that we considered both automated tasks and tasks that involve human interaction: security involves both organisational and technological activities.
The model refinement phase consists of two steps: Model Structural Refinement and Model Annotation Refinement.
The Model Structural Refinement step checks the model structure (i.e. it may affect the contained tasks and may require a structural change in the BPMN model), to ensure that all needed information to perform the security assessment is reported.It is worth noticing that a (formally correct) BPMN process may neglect to report some activities, considered trivial or not relevant by the business expert.
Accordingly, we defined the following properties that must be verified: (i) the modeller must outline the data (through a BPMN data object) handled by each task; (ii) for each persistent data object, each actor (defined by the pool name) must specify a service task for storing it, where it is stored (through a data store object), and a task for deleting the data, in addition to a time event object indicating how long the data is stored before deletion. It should be noted that if the storing and deleting process is not carried out by the actor who processes the data, another pool must be created (e.g. External Provider); (iii) for each data object, the modeller must indicate who is responsible for the data.
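These structural checks can be sketched as a simple validation routine. The sketch below assumes a simplified dictionary representation of a pool and its persistent data objects; all field names (`store_task`, `retention_event`, etc.) are hypothetical and not the actual data model of our tool.

```python
def check_structural_refinement(pool):
    """Return a list of human-readable violations (empty if the pool satisfies
    the structural-refinement properties for its persistent data objects)."""
    violations = []
    for data in pool.get("persistent_data", []):
        name = data["name"]
        if not data.get("store_task"):
            violations.append(f"{name}: no service task stores this data")
        if not data.get("delete_task"):
            violations.append(f"{name}: no task deletes this data")
        if not data.get("retention_event"):
            violations.append(f"{name}: no time event bounds the retention period")
        if not data.get("controller"):
            violations.append(f"{name}: no responsible actor declared for this data")
    return violations

# Illustrative pool: the delete task is missing, so refinement must flag it.
pool = {
    "name": "Municipality",
    "persistent_data": [
        {"name": "CitizenRecord", "store_task": "Store record",
         "delete_task": None, "retention_event": "P5Y", "controller": "DPO"},
    ],
}
print(check_structural_refinement(pool))
# ['CitizenRecord: no task deletes this data']
```

A real implementation would derive this information by parsing the BPMN XML rather than from hand-built dictionaries.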
The Model Annotation Refinement step, instead, does not modify the BPMN structure but only requests annotations for the existing BPMN objects, needed to derive information related to security aspects. In practice, during this phase, the user identifies a set of annotation types for all the tasks (including the ones automatically created during the structural refinement phase). As outlined in Section 2, the goal is to extract security attributes without directly extending the BPMN standard with security concepts. Accordingly, we defined a set of additional annotation types and values, in compliance with the standard, to annotate the assets (i.e. the values to be protected), which, according to our choices, are the BPMN concepts of task (process) and data object.
It is worth noticing that the introduced annotations do not focus on security, but they explicate, among a set of predefined values, the way in which the tasks take place or the type of data that the processes elaborate.As a consequence, our model fully complies with the standard version of BPMN and does not require any change to the available BPMN support tools.
Table 3 summarises the proposed annotations, listing, for each of them, (i) the task type that must have such an annotation in order to automatically derive the possible threats and (ii) the values that each annotation can have according to the annotation type it belongs to.The task type, the annotation type and the annotation values will be used during the automated threat modelling step to select the applicable threats.
Note that, according to our analysis, the Business Rule, the Script and the Manual tasks do not need additional annotations, since the task type information is enough to select the possible threats that may affect the process.
The Communication Type annotation, which applies to the Send and Receive task types, describes how messages are exchanged, in particular evidencing the adoption of secure (PEC) or insecure mail exchange, the adoption of old-style post-office communication, or an exchange of messages based on an interoperability protocol, i.e. a technology-based message exchange whose security will depend heavily on the infrastructure.
The tasks that require human interaction supported by software (User) may affect security differently, according to how such activities take place. The interaction type values are online and offline: these text annotations indicate whether the actor manually interacts with the supporting software over the Internet. Note that the Manual Task, which also relies on human interaction, by definition excludes the adoption of software, so no additional information is required.
The service type annotation differentiates whether the considered service includes the concept of status or not, which is relevant from a security point of view (e.g.service states may be maliciously altered).
In addition to the properties that specify the task typology, tasks must have an annotation indicating the log level (i.e. how much detail is available to reconstruct the operations performed by the task) with a Low-Medium-High value. The log level is indicated only for Service Tasks, Manual Tasks, User Tasks and Script Tasks, as these activities require a logged user to perform the operations (digitally or on a physical register). We consider three log levels: no log (low), approximate log (medium) and detailed log (high).
It is worth noting that an approximate log value means that the actions performed on that task are possibly traceable. In addition to the task properties, some annotations must be added to the data on which each task operates, as shown in Table 3. For each data object, the user must indicate through a BPMN annotation whether it is personal (i.e. identifies or makes identifiable, directly or indirectly, a person). A Load Dependence annotation must also be indicated for each data object, describing how many copies of the data the specific pool lane has available.
Another piece of information to be taken into account is the amount of data that is processed.This is provided by two annotation types: size (integer) and order of size (unit symbol).The concatenation of both annotations gives us the size of the data (e.g. 10 MB).
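The concatenation of the two annotations can be illustrated with a small hypothetical helper that turns them into a byte count; the unit symbols shown are assumptions for illustration, not the tool's fixed vocabulary.

```python
# Hypothetical order-of-size symbols mapped to multipliers (decimal units).
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

def data_size_bytes(size, unit):
    """Combine the 'size' (integer) and 'order of size' (unit symbol)
    annotations into a single byte count, e.g. (10, 'MB') -> 10000000."""
    if unit not in UNITS:
        raise ValueError(f"unknown order-of-size symbol: {unit}")
    return size * UNITS[unit]

print(data_size_bytes(10, "MB"))  # 10000000
```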
It is worth noticing that, as outlined in Section 2, the business expert who builds the model, at this very early stage, never explicitly adopts any security concept: he/she only has to add annotations from a limited, already available set that is close to his/her competences.
The model refinement step is implemented in our publicly available tool. 4 The tool parses the BPMN, identifies the tasks and the personal data produced by each task, and asks the modeller to add the requested annotations if they are not already included in the model. Then it automatically enriches the model with the processes required for GDPR compliance, as described above. Finally, the tool validates the model, enabling the next steps of the methodology.

Threat modelling
The main advantage of the proposed technique is the automation of the threat modelling process (e.g. Granata et al., 2021; Mallouli et al., 2023) and of the risk analysis (e.g. Granata et al., 2022), enabling the security assessment of the (e-Government) processes.
These activities rely on two key inputs: the annotated model and our annotated ENISA threat catalogue. We adopted the ENISA Threat Landscape, in particular the 2017 version, as a basis for the analysis. For this first work, we focused only on the first level of threats and excluded a few threat categories (e.g. Disaster Recovery and Legal), which we are planning to address in the future. It is important to note that the choice of the ENISA catalogue is due to the fact that ENISA threats are high-level and, therefore, more suitable for describing the security problems of a business process. However, this does not represent a constraint of the methodology, as the source of the threats can easily be replaced.
For each task type (information given by the standard) we identified ex ante the applicable threats in terms of threats and malicious behaviours. A threat represents a high-level overview of malicious behaviour, whereas a threat behaviour provides a more detailed and comprehensive description. Then, we identified the type of information (i.e. annotation type) needed for a given threat to make sense for a target task type. Last, but not least, we listed the different values that the annotation could have. As a final result, we produced an annotated catalogue (in the form of a relational database) that contains all the ENISA threats, each associated with an annotation type and a value.
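A minimal sketch of how such an annotated catalogue could be stored and queried, using an in-memory SQLite table; the schema and the sample row are illustrative assumptions, not the actual catalogue content.

```python
import sqlite3

# Illustrative relational schema: one row per (task type, annotation, threat).
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE catalogue (
    task_type        TEXT,   -- e.g. Send, Receive, Service, User
    annotation_type  TEXT,   -- e.g. Communication Type
    annotation_value TEXT,   -- e.g. mail, post office, PEC
    threat           TEXT,   -- ENISA-style high-level threat
    behaviour        TEXT)""")
con.executemany("INSERT INTO catalogue VALUES (?,?,?,?,?)", [
    ("Send", "Communication Type", "mail",
     "Loss of (integrity of) sensitive information",
     "Attacker intercepts and modifies a message in transit"),
])

def applicable_threats(task_type, annotation_type, annotation_value):
    """Select the threats matching a task's type and annotation."""
    return [row[0] for row in con.execute(
        "SELECT threat FROM catalogue WHERE task_type=? "
        "AND annotation_type=? AND annotation_value=?",
        (task_type, annotation_type, annotation_value))]

print(applicable_threats("Send", "Communication Type", "mail"))
print(applicable_threats("Send", "Communication Type", "PEC"))  # [] - no threat applies
```

Note how a PEC-annotated Send task matches no row, reflecting the fact that certified email removes the integrity threat.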
As a methodological note and for the readers' comprehension, it is worth noticing that the list of attributes, previously discussed and listed in Table 3, was built after the identification of the threats, and by choosing the values of the right attributes for the threats.
The threat modelling automation algorithm (Algorithm 1, GenerateThreatModel(BPMN)), described as pseudo-code, proceeds as follows: first, we select a threat agent list based on the threat agent selection technique described in Granata and Rak (2021). According to this technique, threat agents are selected on the basis of a questionnaire that identifies their characteristics. Then, for each task of the refined BPMN, we select all the malicious behaviours that are associated with the task type and have the same annotations as the selected task.
Each triple < ThreatAgent, Task, Behaviour > (i.e. a threat) is added to the Threat Model (i.e. a list of all the possible threats that affect the system or the process), obtaining as a result the list of standard threats applicable to each of the activities of the BPMN model.
We conducted a time complexity analysis of the algorithm in order to evaluate its efficiency. Considering that the number of threat agents cannot vary and is not an input variable (i.e. it is not part of the model), the complexity depends only on the number of assets and the number of threats. Defining n as the number of assets and m as the number of threats, the complexity of the algorithm is O(n * m).
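The selection loop of Algorithm 1 can be sketched as follows, assuming simplified dictionary representations for tasks and catalogue entries; names and structures are illustrative, not the internals of our tool.

```python
def generate_threat_model(tasks, catalogue, threat_agents):
    """Sketch of GenerateThreatModel: for every agent and task, select the
    catalogue entries matching the task type and one of its annotations."""
    threat_model = []
    for agent in threat_agents:          # fixed list, chosen via questionnaire
        for task in tasks:               # n assets
            for entry in catalogue:      # m catalogue entries -> O(n * m)
                if (entry["task_type"] == task["type"]
                        and entry["annotation"] in task["annotations"]):
                    # Each triple <ThreatAgent, Task, Behaviour> is a threat.
                    threat_model.append((agent, task["name"], entry["behaviour"]))
    return threat_model

# Illustrative inputs: one annotated Send task and one matching catalogue entry.
tasks = [{"name": "Send certificate", "type": "Send",
          "annotations": {("Communication Type", "mail")}}]
catalogue = [{"task_type": "Send",
              "annotation": ("Communication Type", "mail"),
              "behaviour": "Intercept and alter message in transit"}]
print(generate_threat_model(tasks, catalogue, ["Cyber criminal"]))
# [('Cyber criminal', 'Send certificate', 'Intercept and alter message in transit')]
```

Since the agent list has a bounded, model-independent size, the nested loops over tasks and catalogue entries give the O(n * m) cost discussed above.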
To give an example of the process, consider the Loss of (integrity of) sensitive information threat, which may occur when an attacker intercepts and modifies some improperly secured information during a transmission. This threat could apply to a Send task via mail or the post office. The use of a PEC (certified email), on the other hand, makes the threat inapplicable, as the integrity of the message is guaranteed. The selection of the threats relates both to the text annotations and to the asset types. An Erroneous use of devices and systems threat is applicable, for example, to a User Task but not to a Manual Task, as it requires interaction with software.
The final result is an extensive table that lists, for each asset type, all the applicable threats, a brief threat description, and a list of the associated annotation values. The information contained in the table has been checked by multiple experts and validated through our case studies (processes from Regione Campania and the University of Campania Luigi Vanvitelli), which are not reported in this paper for brevity. The annotated threat catalogue can be requested from the authors of this paper and is available through our open-source tool.

Risk analysis
The Threat Modelling process identifies the threats that menace the SuA (System under Analysis), while the Risk Analysis process evaluates the probability that each threat materialises, prioritising the countermeasures accordingly.
It is worth noticing that Risk Analysis is inherently qualitative rather than quantitative, and largely subjective. However, we adopted a well-accepted methodology (OWASP Risk Rating) that offers guidelines for the selection of the quantitative parameters. For example, to calculate the Loss of Accountability parameter, OWASP expects the analyst to answer the following question: Loss of Accountability - Are the threat agents' actions traceable to an individual? Fully traceable (1), possibly traceable (7), completely anonymous (9).
As a matter of fact, it is the expertise of the security evaluator that affects the result. Our methodology helps both the business expert and the security expert to identify the assets that should be analysed. In practice, the evaluations are still subjective, but it is possible to easily understand which assets and evaluations affect the overall result, helping the experts to correct them. Moreover, our tools suggest the initial values, report the criteria in the graphical interface and support decision-making.
In order to quantitatively measure the risks (i.e. the probability that a threat can be carried out), we adopted the Risk Rating Methodology proposed by OWASP (Williams, 2020) that, as commonly happens, evaluates the risk through the composition of two indicators (Likelihood and Impact), each estimated through eight factors (sixteen in total).
The OWASP methodology offers descriptive criteria to assign each of the above factors a number between 1 and 10. The Likelihood and the Impact assume a level of risk (Low, Medium or High) if the average of the values of their factors falls respectively in the range 1-3, 4-6 or 7-10. The risk value of a threat is then assigned through a table that maps the Likelihood and Impact levels to a final risk level.
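The averaging and lookup just described can be sketched as follows. The banding boundaries follow the 1-3 / 4-6 / 7-10 ranges stated above, and the lookup table is the standard OWASP overall risk severity matrix (which also includes the Note and Critical extremes); both the boundary handling at non-integer averages and the table values are assumptions in this sketch.

```python
# A sketch of the OWASP Risk Rating aggregation: a factor set averages
# to a Low/Medium/High level, and a table combines the two levels.
# Boundary handling and table values are assumptions of this sketch.
def factor_level(factors):
    avg = sum(factors) / len(factors)
    if avg <= 3:
        return "Low"
    if avg <= 6:
        return "Medium"
    return "High"

# OWASP "overall risk severity" lookup, keyed (likelihood, impact)
RISK_TABLE = {
    ("Low", "Low"): "Note",     ("Low", "Medium"): "Low",      ("Low", "High"): "Medium",
    ("Medium", "Low"): "Low",   ("Medium", "Medium"): "Medium", ("Medium", "High"): "High",
    ("High", "Low"): "Medium",  ("High", "Medium"): "High",     ("High", "High"): "Critical",
}

def overall_risk(likelihood_factors, impact_factors):
    return RISK_TABLE[(factor_level(likelihood_factors),
                       factor_level(impact_factors))]
```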
To automate the full process, we first ask the expert for a few (structured) pieces of information that cannot be expressed as annotations, 5 then we assign a value to each factor of the OWASP methodology leveraging:
• the above-cited general information;
• the values of the annotations;
• default values previously evaluated and stored in our catalogues.

Likelihood rating
The Likelihood is an indicator that expresses how likely a threat is to be carried out. OWASP gives it a quantitative representation by considering two sets of factors, associated with the Threat Agents and with the type of Vulnerabilities involved.
The first set of factors describes the group of threat agents by considering: Skill Level, Motive, Opportunity and Size. In Granata and Rak (2021), we proposed a technique that automates the identification of the threat agents and the evaluation of the associated factors, asking only four very simple questions. The selection relies on 21 categories of possible threat agents and on 9 attributes that were proposed in Casey (n.d.). For reasons of space, we invite the interested reader to check the referenced paper for the details.
On the other hand, the Vulnerability Factors are related to the vulnerabilities needed to exploit a specific threat and take into account the Ease of Discovery, the Ease of Exploit, the Awareness and the Intrusion Detection mechanisms.
In this context, the vulnerabilities are related to the threat behaviours described in our annotated threat catalogue. Accordingly, we were able to enrich our catalogue with a predefined value of each factor for every pair [Asset, Threat] in the catalogue. As noted above, in our model the assets, i.e. anything that has a value and must be protected, are the tasks and the data objects. In practice, this set of values is predefined and evaluated off-line by experts in our threat catalogue, and is simply retrieved by querying the DB with the right pair [Asset, Threat].
As a matter of fact, the only inputs requested for the Likelihood evaluation are the four questions identified in Granata and Rak (2021). The final result is the likelihood evaluation for each of the threats produced in the Threat Model, on the scale (Low, Medium or High).

Impact rating
Impact evaluation is an original extension of our Risk Analysis technique. In OWASP, the Impact relies on two sets of factors, namely the technical and the business factors.
The Business factors take into account what is important to the company running the application, considering the Financial damage, Reputation damage, Non-compliance and Privacy violation. We stored a set of default values for each of the ENISA threats in our catalogue, based on the threat descriptions. Our tools enable (expert) analysts to change such values if needed, taking into account the characteristics of the system under analysis.
The most critical part is the estimation of the technical impact, which takes into account how a threat affects the security requirements of the asset in terms of the Loss of Confidentiality, the Loss of Integrity, the Loss of Availability and the Loss of Accountability.
In order to calculate the Loss of Confidentiality, Integrity, Availability and Accountability, we analysed how these parameters are interpreted by OWASP (Williams, 2020). According to the methodology, these parameters strongly depend on the data managed by the system, i.e. the data treatment register. We have therefore developed an algorithm that takes into account how much data each threat compromises and whether it is sensitive or not. The algorithm is reported in pseudo-code in Algorithm 2 (Daniele et al., 2023). The algorithm takes the BPMN and the TM as input and evaluates the Technical Impact parameters for each threat of the threat model. It determines which security requirements the threat compromises and, for each compromised requirement, calculates the corresponding parameter through a specific function. For example, the Failure of Device threat only compromises the Availability requirement, so only the LossOfAvailability (LoA) parameter is calculated. In this case, the LossOfConfidentiality (LoC) and the LossOfIntegrity (LoI) parameters are set to the minimum value (1). Finally, the algorithm calculates the LossOfAccountability (LoAc) parameter by evaluating the global log level of the compromised tasks for each threat. The LoAc risk parameter is calculated as the complement between the global log level of the compromised tasks and the maximum log level. The functions invoked to calculate the LoC, LoI and LoA parameters take a specific threat as input and calculate the parameter through the Technical Impact Matrix (TIM), which we have generated from the OWASP risk rating process. The TIM is shown in Figure 4. Each element of the matrix describes the value of LossOfC/I/A depending on whether the threat compromises: (i) minimal non-critical data, (ii) minimal critical data, (iii) extensive non-critical data, (iv) extensive critical data, or (v) all data.
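The dispatch logic of Algorithm 2 can be sketched as follows. The TIM lookup is stubbed, and the "complement" used for LoAc is interpreted here as the difference between the maximum and the actual log level, floored at the OWASP minimum of 1; both are assumptions of this sketch.

```python
# A simplified sketch of Algorithm 2: only the parameters of the security
# requirements compromised by the threat are computed via the TIM; the
# others default to the minimum value (1). tim_lookup is a stub, and the
# LoAc "complement" (max - actual log level, floored at 1) is an assumption.
def technical_impact(threat, compromised_requirements, tim_lookup,
                     log_level, max_log_level):
    loc = tim_lookup(threat, "C") if "Confidentiality" in compromised_requirements else 1
    loi = tim_lookup(threat, "I") if "Integrity" in compromised_requirements else 1
    loa = tim_lookup(threat, "A") if "Availability" in compromised_requirements else 1
    # Detailed logging (log_level == max) makes agents traceable -> LoAc at minimum
    loac = max(1, max_log_level - log_level)
    return {"LoC": loc, "LoI": loi, "LoA": loa, "LoAc": loac}
```

With a threat that only compromises Availability and a fully detailed log, this reproduces the (1, 1, LoA, 1) pattern of the Failure of Device example above.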
A pseudo-code for the GetLossOfAvailabilityFromTIM function is shown in Algorithm 3. The function counts the number of personal data items connected to each task and the total size in kilobytes of the data compromised by the specific threat. Once these parameters are calculated, the algorithm evaluates a personal data compromise rate as the percentage of personal data over the number of data items in the business processes (system). If this rate is greater than 25%, the threat compromises a large amount of personal data (the Critical row of the TIM); otherwise it compromises little, and the value (i.e. LossOfAvailability) is taken from the Not Critical row of the table. A similar approach is applied to the rateCompromisedData: if it is below 25%, the first column of the TIM (Minimal) is selected; above 25%, the second (Extensive); finally, if it is greater than 90%, the third column of the TIM (all data compromised) is taken into account. It is worth noting that these threshold values were arbitrarily chosen, but can be adjusted by an expert during the tool configuration.
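The threshold logic of Algorithm 3 can be sketched as below. The numeric cell values in the example matrix are hypothetical placeholders in the spirit of the TIM (the real values come from the OWASP-derived matrix in Figure 4), and the thresholds are the defaults stated above.

```python
# A sketch of GetLossOfAvailabilityFromTIM. The cell values below are
# hypothetical placeholders; the real ones come from the TIM (Figure 4).
TIM_AVAILABILITY = {
    ("not_critical", "minimal"): 1, ("not_critical", "extensive"): 5,
    ("not_critical", "all"): 7,
    ("critical", "minimal"): 5, ("critical", "extensive"): 7,
    ("critical", "all"): 9,
}

def loss_of_availability_from_tim(personal_data, total_data, compromised_data,
                                  critical_threshold=0.25, all_threshold=0.90):
    """Pick the TIM cell from the personal-data and compromised-data rates."""
    # Row: does the threat touch mostly personal ("critical") data?
    row = ("critical" if personal_data / total_data > critical_threshold
           else "not_critical")
    # Column: how much of the system's data does the threat compromise?
    rate = compromised_data / total_data
    if rate > all_threshold:
        col = "all"
    elif rate > critical_threshold:
        col = "extensive"
    else:
        col = "minimal"
    return TIM_AVAILABILITY[(row, col)]
```

The thresholds are passed as parameters so that, as noted above, an expert can adjust them during tool configuration.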

Case study
To provide a better understanding of the proposed methodology, we present a simple case study about a citizen's request for a certificate from a municipality. A common and useful application of our methodology, as the following paragraphs demonstrate, is the implementation of a DPIA (Data Protection Impact Assessment) process (WP29, 2017). This process is mandatory according to Article 35, paragraph 1 of the General Data Protection Regulation (GDPR). 6 However, the regulation does not impose how to execute the DPIA, and different frameworks exist. As a reference, we adopted the framework implemented by the Information Commissioner's Office (i.e. the UK's independent authority), 7 whose steps are compatible with our security assessment technique. We invite interested readers to check the framework documents for more details.
Accordingly, we produced a mapping between the phases of the DPIA process proposed above and the ones we defined in the methodology, as shown in Figure 2.
As the guidelines (WP29, 2017) underline, in the first phase the system owner should identify the need for a DPIA by identifying the system goals. To produce the documentation, the guidelines suggest seeking the advice of the Data Protection Officer (DPO). In our approach, the system modeller provides this information by filling in the template.
The second step of the DPIA is a description of the nature, scope, context and purpose of the processing performed by the system, providing some flow charts. In the proposed approach, this step is performed automatically, as the system is modelled using BPMN, which describes all the system processes and how they interact. Since the BPMN provides information about the processes, in addition to providing information on the tasks, the modeller must provide the input or output data for each task during the refinement phase. In order to do this, as stated above, the modeller must provide a BPMN data object for each piece of requested data, enriched by some additional information: how much data is used and the type of data (i.e. personal or not). Moreover, in order to assess GDPR compliance, the modeller must enrich the model by specifying where (and whether) each data item is stored and for how long. This is modelled through the Stored Data object provided by the BPMN standard and through a set of predefined BPMN patterns (described in more detail in the case study).
The third step of the DPIA framework is Consider Consultation, in which the modeller describes when and how to seek the individuals' views. We have implemented this phase through the DPIA template. In the next step, the system modeller should assess the compliance and proportionality measures. In our approach, this phase is implemented through a questionnaire provided by the template. By specifying this additional information, it is possible to extract the system treatment register, containing information about the data: who processes them, and where and whether they are stored. The Identify and assess risk phase maps directly onto the threat modelling and Risk Analysis phases described above. Similarly, to perform a data protection impact assessment, the process should identify the measures to reduce the risk; this phase is carried out by the policy definition step. In the last stages of a DPIA, the DPO must document the results of the previous stages and propose a plan to implement the countermeasures. This corresponds to the risk management phase.

A simple eGov case study: certificate request
To illustrate the approach, we use the process followed by a local administration to manage requests for a generic certificate submitted by an ordinary citizen to the municipality. It is worth noticing that a real municipality runs many different processes, and each of them must be modelled following the same methodology to guarantee the correctness of the approach. However, a single process is enough to demonstrate and validate the approach.
Such a process involves two main actors: the citizen and the municipality. The process starts with the citizen, who compiles the certificate request using the online portal provided by the public administration.
After the request has been compiled, the software automatically sends an email to the municipality. Once the certificate request has been acquired, the municipality processes it through a stateful service. When the request has been validated, the system sends a notification to the citizen via email. Upon receipt of the notification, the citizen downloads the certificate from the web portal, completing the process.
Figure 3 describes the process through a basic BPMN diagram, which is our starting model.

Model refinement
According to the proposed approach, we ask the modeller to correctly refine the process in order to produce a threat model and perform the Risk Analysis. It is worth noticing that, as we assumed initially, the modeller does not take into account security issues (which are outside their expertise), but focuses on a detailed description of the processes through a guided procedure that asks for details about each task in the process and each type of data involved.
Figure 4 shows the refined BPMN process. In this BPMN, all the data processed by the tasks have been added and annotated. For example, the Compile certification request task takes as input a data object representing the profile of the citizen, annotated as: personal data, relatively lightweight (10 kB), with a single copy (load dependence = 1).
The structure of the process has been enriched by creating run-time tasks for each item of personal data. For example, the Store Profile and Delete Profile tasks, carried out by the Municipality, have been added for the Profile data. Since these tasks are carried out by the same actor that manages the Profile data, it was not necessary in this case to create an external Pool.
The refinement phase has defined a data store in which the municipality stores the profile information.The data is kept for 180 days by an offline service and then automatically deleted.
The other personal data in the BPMN is the certificate, which is generated by the Municipality and is automatically stored by an external provider for a validity period of 6 months.
It is worth noting that in this case, since the Municipality does not manage the data, a Pool responsible for storing and deleting the data has been created. Using our tool, the modeller can easily insert all the requested annotations and all the input/output data of the tasks, as well as the data store information and the time events for deletion. The tool automatically parses the BPMN (managed by BPMN libraries), modifies the graph and produces a new XML file with the enriched BPMN.
The enriched model enables the automated threat modelling, but has additional uses: it is possible, in fact, to automatically generate the data treatment register (i.e. a document containing information about the data, who processes them, how long they are stored, etc.). In this work, we described the algorithm that automatically produces the treatment register at the end of the modelling Refinement phase; the result is compliant with the most widespread templates for the treatment register and can be easily customised and enriched. Each line of the treatment register contains the data processed, who processes it, how it is processed and the ultimate terms of cancellation. The algorithm takes into account all the data objects of the BPMN, sub-selecting only the personal data. It extracts the tasks that process them and adds them to the treatment register, including the terms of cancellation (implemented through a BPMN Time Event). The information about the type of data is requested from the modeller through the selectDataType function. An excerpt of the treatment register generated by the tool is presented in Table 5.
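The register extraction just described can be sketched as follows. The DataObject class and the task-to-data mapping are hypothetical simplifications of the enriched BPMN model; in the real tool, the retention term is read from the associated BPMN Time Event.

```python
# A sketch of the treatment-register extraction: personal data objects are
# selected from the BPMN and, for each one, the tasks that process it and
# its cancellation term are recorded. The model classes are hypothetical
# simplifications of the enriched BPMN model.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DataObject:
    name: str
    personal: bool
    retention_days: Optional[int]  # from the associated BPMN Time Event

def build_treatment_register(data_objects, task_io):
    """task_io maps each task name to the set of data object names it uses."""
    register = []
    for data in data_objects:
        if not data.personal:
            continue  # only personal data enters the register
        processors = sorted(task for task, used in task_io.items()
                            if data.name in used)
        register.append({"data": data.name,
                         "processed_by": processors,
                         "cancellation_days": data.retention_days})
    return register
```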

Threat modelling
As described in paragraph 3.2, the annotations enable us to automatically extract the list of threats to which each task is subject. Below is a summary list of all the identified threats, defined according to the ENISA threat catalogue. Threats: Fraud; Sabotage; Coercion, extortion or corruption; Erroneous use or administration of devices and systems; Loss of information in the cloud; Loss of (integrity of) sensitive information; Destruction of records; Failure of devices or systems; Loss of resources; Absence of personnel; Internet outage; Malicious code/software/activity; Information leakage; Failure or disruption of communication links; Failure or disruption of main supply; Failure or disruption of service providers; Interception of information; Man in the middle/Session hijacking; Identity theft (Identity Fraud/Account); Replay of messages; Receipt of unsolicited E-mail; Denial of service; Social Engineering; Damage caused by a third party; Damages resulting from penetration testing. For example, since the certificate is stored at an external provider, the Loss of information in the cloud threat can be applied by a malicious user in order to delete it. This is applicable to the Download Certificate task, as the loss of data on the server could deny the availability of the certificate. Similarly, the Compile certificate request task can be compromised by the loss of data in the cloud, as the profile may be incorrectly stored or unavailable.
The Social Engineering threat, instead, is applicable to the Acquire Certificate Request task, but also to the Receive notification task: it is possible that the data received via e-mail is the result of a social engineering process, through which the malicious user performs a phishing attack. It is worth noticing that threat identification is commonly a human-oriented task, error-prone and hard to validate. Tools that support the DPIA, like the French PIA tool, 8 leave the definition and insertion of threats and countermeasures to the human analyst. To the best of the authors' knowledge, our proposal is the only one that fully automates the process, leaving only the final verification to a human check.

Risk rating
Finally, we report the results of the Risk Analysis process applied to our simple case study. Firstly, we applied the technique described in Granata and Rak (2021) and Ficco et al. (2021) in order to calculate the threat agent OWASP parameters. In our case study, we assume that the modeller considers only the hostile threat agents. This choice reflects the exposure to threats that can damage the reputation of the municipality and steal its data. We also assumed that the security administrator has the complete trust of the municipality and service provider employees. Moreover, having marked copy, deny and take as the most dangerous actions for damaging the reputation of the municipality and stealing municipality data, we obtained the threat agent categories shown in Table 6. We hypothesised the low-medium-high scores associated with each category considering the potential danger to the system.
Considering the resulting threat agent categories, the related OWASP scores are the following: Skill Level 5, Motive 7, Opportunity 3, Size 3.
The vulnerability factors, i.e. those that depend on the threat, have been considered with a default value of 5 so that they do not affect the overall score.
The major contribution was the application of the algorithms shown in Algorithms 2 and 3 to the case study. To show how the Technical Impact Parameter algorithm works, we consider only the Store certificate task as an asset, to keep the example simple. 9 A threat that compromises this task is Loss of information in the cloud. This threat only compromises the Availability security requirement; therefore, according to Algorithm 2, LoC and LoI are 1. To calculate the LossOfAvailability, instead, the GetLossOfAvailabilityFromTIM function is called. Applying the algorithm, we can state that the threat can compromise most of the personal data. The LossOfAccountability parameter is instead 1 for each task: there is a detailed event logging system, and the threat agents' actions are easily traceable. Given these considerations, the technical impact parameters related to Loss of information in the cloud are: LossOfConfidentiality 1, LossOfIntegrity 1, LossOfAvailability 9, LossOfAccountability 1.
Business impact factors, as described in paragraph 3.3, are obtained from the default value provided by the threat catalogue for each ENISA threat.
For the Loss of information in the cloud threat, for example, the default business impact scores are: Financial Damage 4, Reputation Damage 2, Privacy Violation 7, Non-compliance 7. The application of that specific threat, therefore, has an impact on the Municipality that is not so much financial or reputational as related to privacy violation and non-compliance. Calculating the risk as the average of the 16 OWASP parameters, the threat considered has a MEDIUM risk. By calculating the risk of each threat applicable to the Store Certificate asset, it emerges that the threats with the greatest risk are Erroneous use of device and Coercion, extortion or corruption, with a risk of about 6. This allows the system owner to plan the appropriate countermeasures, giving priority to those with greater risk.
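The MEDIUM rating for Loss of information in the cloud can be checked by averaging the parameter values reported above. This quick check assumes a simple 1-3 / 4-6 / 7-10 banding of the 16-factor average; the vulnerability factors are the defaults of 5 mentioned earlier.

```python
# Quick check of the MEDIUM rating for Loss of information in the cloud,
# averaging the 16 OWASP parameters reported in the text. The 1-3/4-6/7-10
# banding of the average is an assumption of this sketch.
likelihood = [5, 7, 3, 3] + [5] * 4   # threat agent + default vulnerability factors
impact = [1, 1, 9, 1] + [4, 2, 7, 7]  # technical + business factors
avg = sum(likelihood + impact) / 16   # -> 4.375
band = "LOW" if avg <= 3 else "MEDIUM" if avg <= 6 else "HIGH"
```

The average of 4.375 falls in the 4-6 band, consistent with the MEDIUM risk reported above.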
It is worth noticing that the threat agent algorithm used above not only enables the acquisition of crucial information for conducting the Risk Analysis process but also facilitates the enumeration of all potential threat agents, whether they originate from within or outside the system.The obtained results offer a valuable advantage by providing a comprehensive overview of the threats that impact the system, as well as the identities of those who have the capability to apply them.
To summarise, the process just applied allows the modeller to obtain a corrected and enriched BPMN, used as input to a threat modelling process (based on the ENISA threat landscape) that lists all the high-level threats that can compromise the system. Applied to our simple case study, our technique was able to identify 108 different threats across the few tasks of the BPMN. Each threat has been associated with a risk value using the OWASP technique described above, and 20 different security controls (described by the CIS control framework) have been suggested in order to reduce this risk.

Conclusions and future work
The analysis of the state of the art outlined that (i) business experts rarely have security expertise and (ii) security should be addressed from the early design stage.In this paper, we proposed a new technique that enables a non-security expert business modeller to automatically derive the security threats and rate the associated risks.In terms of BPMN, the technique relies on simple annotations and does not require BPMN-specific extensions.
The proposed solution adopts a threat catalogue based on the ENISA Threat Landscape to automate the threat identification and the risk rating process.The annotated catalogue represents on its own a useful open-source contribution.Moreover, we implemented a proof-of-concept tool that supports the business modeller during the design phase.
The catalogue, the method and the software tool have been used in a collection of real case studies within the SSeCeGov Project, an initiative encompassing procedures from both the Regione Campania and the University of Campania Luigi Vanvitelli. However, we could not delve into these intricate examples in the present work. Additionally, the case study has been employed by legal experts and procedure specialists.
This paper is a first step towards a new way of interpreting the relationship between BPMN and security, by which security is automatically derived from the process model and, eventually, managed with specific security tasks. We firmly believe that BPMN should not be changed or extended to express complex security concepts; instead, it should be used as it is, modelling security concepts and best practices through standard models.
In future work, we aim to develop a technique to optimise the configuration of the risk thresholds (e.g. the percentage of compromised data). Moreover, we are working on a technique to select countermeasures, in terms of standard security controls from well-known control frameworks (like the CIS Controls or the NIST SP 800-53 Control Framework), based on the risk rating results. The technique can also be integrated with our BPMN-based security control verification technique (Rak et al., 2022) in order to verify the presence of countermeasures in the system based on the suggested security controls.

Figure 1 .
Figure 1. The flow of the technique.

Figure 2 .
Figure 2. Mapping between DPIA and the methodology.

Table 1 .
Approaches adopted in literature to consider security concepts in BPMN.

Table 2 .
Approaches adopted in literature to extend BPMN with security concepts.

Table 4 .
The Technical Impact Matrix (TIM), in C/I/A format.

Table 5 .
Part of the treatment register. The information about the type of data is requested from the modeller through the selectDataType function, which takes two parameters: the DataType string, i.e. the type of data, and a data schema containing the representation of the requested data (for example, in XSD format).

Table 6 .
Results of threat agent selection phase in the example scenario.