Using computer assisted qualitative data analysis software (CAQDAS; NVivo) to assist in the complex process of realist theory generation, refinement and testing

Abstract
There have been several calls for more transparency in realist methods, particularly in the complex process of programme theory development and refinement. This paper describes how Computer Assisted Qualitative Data Analysis Software, specifically NUD*IST Vivo (NVivo), was used to build and refine programme theories (using literature and interview data) in a realist evaluation. This article presents the evolving and complex process of coding several data sources to nodes and child nodes, whilst writing 'attached memos' to highlight the process of theory generation. In this project, NVivo helped create an explicitly documented and evidenced audit trail of the process of programme theory refinement, answering calls for further transparency in realist analysis. RAMESES I and II have provided a platform to improve transparency in reporting realist research, by developing consensus and evidence-based reporting guidelines. We propose that the use of NVivo in realist approaches can help structure the iterative and by nature 'messy' process of generating, refining and testing complex programme theories when drawing on multiple data sources simultaneously. This effectively creates a structured track record of the analytical process, which increases its rigour and transparency.


Background
Computer-assisted qualitative data analysis software (CAQDAS) has been used as an aid to data analysis in qualitative research in several methodological fields, including grounded theory (Bringer et al., 2004), interpretative phenomenological analysis (Clare et al., 2008) and realist meta-theory (Bergin, 2011). NVivo, a form of CAQDAS, 'supports code-based inquiry, searching, and theorizing combined with ability to annotate and edit documents' (Richards, 1999, p. 412). Realist researchers have found using the programme challenging but valuable in advancing the robustness of qualitative research (Bergin, 2011).
Realist evaluation is used to understand and evaluate complex social programmes (Pawson & Tilley, 1997). It focuses on 'what works, for who, why and in which circumstances' using Context, Mechanism and Outcome Configurations (CMOC) as opposed to asking only whether or not an intervention 'works'. To operationalise this, explanatory statements are developed and tested, resulting in a refined programme theory. A key analytical tool in realist evaluation is the CMOC, conveying that intervention resources are introduced into contexts in a way that enhances a change in reasoning; this alters the behaviour of participants, which leads to measurable or observable outcomes (Dalkin et al., 2015a; Pawson & Tilley, 1997).
Realist evaluation and realist programme theory building is an iterative process, which often demands engagement with numerous data sources. This can make the often convoluted and iterative process of developing, testing and refining complex programme theory difficult. There have also been calls for greater transparency in realist methods (Welch & Tricco, 2016), in regard to how programme theories have been developed and refined, sometimes using various data sources. Literature also suggests that researchers find realist methods difficult to operationalise (Dalkin et al., 2015a; Feather, 2018; Shaw et al., 2018). Techniques to maximise the transparency of the realist analytical process have included systematically dated recordings of decision-making for a whole project in an MS Word document (Lhussier et al., 2015), the use of distinct MS Word documents for each individual programme theory (Dalkin et al., 2018b), and the use of Google Docs (Turner et al., 2018). Whilst these permitted a systematic recording of the analytical process undertaken in developing and refining theories, they presented key challenges with regard to working across different datasets and integrating data into the analytical trail. This meant that although there was a clear effort to increase transparency of an inherently iterative process, the extent to which this record could be used beyond the team, and to which the various analytical decisions could be rationalised, was limited. The RAMESES II reporting guidelines for realist evaluations are one way in which the processes surrounding realist research have been illuminated. The guidelines ensure realist evaluations are reported in sufficient detail, in the context of existing evidence, and with a rating of strength of evidence for main findings that will greatly assist users of the evaluations (Welch & Tricco, 2016; Wong et al., 2016).
While these standards have been invaluable in ensuring methodological clarity and comprehensiveness in the reporting of realist projects, less material is currently available that gives an insight into the processes which lie behind orderly, published accounts of realist evaluation. It can be difficult to evidence the analytical micro processes which lead to a clearly formulated programme theory in realist research, especially given the nature of the complex intervention under study. Welsh (2002, p. 1) states that 'Computer assisted qualitative data analysis software (CAQDAS) has been seen as aiding the researcher in his or her search for an accurate and transparent picture of the data whilst also providing an audit of the data analysis process as a whole - something which has often been missing in accounts of qualitative research.' Therefore, we propose that CAQDAS (such as NVivo) could be a tool in the realist evaluator's toolbox, one which aids the inherently complex process of theory generation and, in doing so, may also enhance transparency.
The paper adds to a scant evidence base on the use of NVivo in realist evaluation (Dalkin & Forster, 2015b; Douglas et al., 2010; Gilmore et al., 2019; Maluka et al., 2011; Marchal et al., 2010). Bergin (2011) has carried out a meta-realist theory analysis; this approach and analysis process was somewhat different to what we describe below, drawing on realist meta-theory (Bhaskar, 1989) as opposed to realist evaluation (Pawson & Tilley, 1997). This meant that a thematic analysis was utilised as opposed to a realist logic of analysis driven by programme theory. In this paper, we therefore aim to demonstrate the use of CAQDAS, specifically NVivo, in the organisation and analysis of a realist evaluation.

Method
This article will provide a case study of how CAQDAS, specifically NVivo, was used to aid in the complex and messy process of theory generation, refinement and testing, using an example of a recent realist evaluation exploring the health impact of welfare advice. The full details of the study are provided elsewhere (Dalkin et al., 2018; Forster et al., 2016). In brief, a realist evaluation of an intensive advice service (provided by Citizens Advice) in the North East of England explored the impact advice had on health, using a stress and wellbeing lens. Quantitative findings indicated that stress was significantly decreased and wellbeing increased after interaction with the service. This was explained through qualitative data, highlighting that advice worked through increasing individual capabilities, fostering trust, and acting as a buffer between state organisations and the client.
The following section will focus on the process of using NVivo as opposed to presenting the findings of the Citizens Advice study. The aim of the article is not to explore the depths of NVivo and all of its functions, but to provide a case study of how it can be operationalised in a realist analytical process.

Findings
As highlighted in the RAMESES II guidelines (Wong et al., 2016), every realist evaluation presents itself differently and the focus here is therefore not on standardisation of NVivo use. As with the method itself, use of NVivo requires flexibility and should be tailored according to the specific programme and focus of the research.

Development of initial programme theories as nodes
The research process began with 'hunches' about how the Citizens Advice projects might have a health impact for clients. Hunches can be defined as the evaluators' 'informed guesswork' about how the programme works (RAMESES II Project, 2017a); these initial hunches were formed from the evaluators' informal knowledge of the programme. They constituted rough, unformatted and unedited ideas about how the programme worked and sometimes took the form of 'if-then' statements. For each hunch we made a node. Nodes are central to working with NVivo; they function to gather related material in one place so that emerging patterns and ideas can be identified. Nodes are usually created as 'themes' or 'cases' such as people or organizations. In the project described here, nodes were initially used as 'hunches' or ideas around how the programme worked. Each node was given a title, such as 'Shaming the unhealthy'. We then created a 'linked memo' for each node which allowed us to provide a more detailed description of the thinking behind our initial hunch. At this point, it became clear that coding by C, M and O would lead to disjointed themes and therefore a decision was made to code using a programme theory lens, which is outlined here. Thus, each node was developed from an 'initial hunch' into an initial programme theory at this point, through theory development sessions conducted as a full research team. This was based on our understanding of the advice service, from a general literature scope carried out for the project's funding bid and protocol.
Following this, we conducted realist interviews with Citizens Advice (CA) staff. Using Manzano's (2016) three-stage realist interview process, this constituted the theory-gleaning phase. The focus of these interviews was to understand generally what works for clients receiving advice, specifically for whom, in which circumstances and why. These interviews aimed to develop our initial hunches into well formulated Initial Programme Theories (IPTs), which could be formally tested through further empirical data. The interviews were transcribed and then imported into NVivo. Interview data could then be coded to the IPT nodes and where information was new (not covered by an existing IPT node) a theory/node could be created. For example, the IPTs did not detail that CA can provide brief health interventions; this was shared during interviews with Citizens Advice project leads and therefore was developed as an additional IPT. This process helped us to develop and refine our IPTs, exploring the different context, mechanism and outcome configurations associated with the CA projects.
We then revisited the literature in more detail to find supporting and disconfirming evidence for our IPTs; realist evaluation and realist programme theory building is an iterative process (Pawson, 2006). In order to keep various primary and secondary data sources coded under the same nodes but stored in distinct folders, so as to facilitate data retrieval, we used the NVivo function of child nodes. Child nodes allow you to create 'sub themes'. Therefore each node (for example, 'Basic Needs') now had two child nodes: 'Literature' and 'Interview' (Figure 1). For each overall node, we recoded our interview data from CA staff into the child node, '1st interview with CA staff'. We then selected the 'aggregate coding from child nodes' function which meant that the main node now stored information from both interviews and literature. This gave us the option of examining data from all or only select sources, for each theory node.
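The arrangement of theory nodes, source-specific child nodes and aggregation described above can be modelled in a few lines of code. The following Python sketch is purely illustrative: it is not NVivo's interface or file format, and the class name Node, its methods and the placeholder extracts are hypothetical names introduced here. It assumes that aggregation simply makes material coded at child nodes visible at the parent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a theory node; not an NVivo API object."""
    name: str
    aggregate: bool = False                        # mimics 'aggregate coding from child nodes'
    extracts: list = field(default_factory=list)   # extracts coded directly at this node
    children: dict = field(default_factory=dict)   # child nodes keyed by data source

    def child(self, source):
        """Return the child node for a data source, creating it if needed."""
        if source not in self.children:
            self.children[source] = Node(source)
        return self.children[source]

    def code(self, source, extract):
        """Code an extract to the child node for its data source."""
        self.child(source).extracts.append(extract)

    def coded(self, sources=None):
        """Extracts visible at this node, optionally limited to some sources."""
        found = list(self.extracts)
        if self.aggregate:
            for source, node in self.children.items():
                if sources is None or source in sources:
                    found.extend(node.coded())
        return found


# Placeholder extracts only; no study data are reproduced here.
basic_needs = Node("Basic Needs", aggregate=True)
basic_needs.code("Literature", "placeholder extract from a paper")
basic_needs.code("Interview", "placeholder extract from a staff transcript")

all_sources = basic_needs.coded()                             # everything under the theory
literature_only = basic_needs.coded(sources={"Literature"})   # one source only
```

Keeping each data source in its own child node while aggregating at the parent is what allows a theory to be examined against all of the evidence or against one source at a time.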

Initial programme theory refinement with CA staff
We then interviewed the CA staff a second time; this constituted the theory refinement stage (Manzano, 2016). The initial programme theories were shared in interviews with staff, who were given the opportunity to comment upon and suggest additions to these theories. This was done in the form of general questions, developed from the IPTs, as opposed to presenting the theories in CMO form (Manzano, 2016). These interviews were transcribed and imported into NVivo, before being analysed and coded to IPT nodes where appropriate. The theory was then refined, based on the data from this second set of interviews with staff. The theory refinement process was conducted as a team, with the discussions and rationale for adjustments to theories recorded and dated in linked memos associated with each node (Figure 2). This meant that the full team's thinking was captured and reasons for changing the programme theory were explicitly noted. Where the IPT changed, additions were inserted using coloured font and deletions using a strikethrough. This ensured it was explicit to the whole team how the programme theory had evolved throughout the project.
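The dated, cumulative record kept in each linked memo can be thought of as an append-only log. The Python sketch below is one hedged way of expressing that idea and is not how NVivo stores memos: the names MemoEntry, LinkedMemo and mark_changes are hypothetical, the dates and wording are placeholders, and the bracket markers {+ +} and [- -] merely stand in for the coloured font and strikethrough used in the project.

```python
import difflib
from dataclasses import dataclass, field
from datetime import date


def mark_changes(old, new):
    """Word-level comparison: {+added+} and [-removed-] mark the edits."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


@dataclass
class MemoEntry:
    """One dated record of a team decision about a programme theory."""
    when: date
    theory_text: str   # the theory as restated after the discussion
    rationale: str     # why the team changed (or kept) the wording


@dataclass
class LinkedMemo:
    """Hypothetical model of a memo linked to one theory node."""
    node_name: str
    entries: list = field(default_factory=list)

    def record(self, when, theory_text, rationale):
        """Append a new entry; earlier entries are never overwritten."""
        self.entries.append(MemoEntry(when, theory_text, rationale))

    def latest_change(self):
        """Latest wording, marked up against the previous wording if any."""
        if not self.entries:
            return None
        if len(self.entries) == 1:
            return self.entries[0].theory_text
        return mark_changes(self.entries[-2].theory_text,
                            self.entries[-1].theory_text)


# Placeholder dates and wording, for illustration only.
memo = LinkedMemo("Example theory")
memo.record(date(2000, 1, 1), "advice reduces stress", "initial wording")
memo.record(date(2000, 2, 1), "advice reduces client stress", "placeholder rationale")
print(memo.latest_change())
```

Because entries are only ever appended, every earlier formulation and its rationale remain available to the whole team.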

Programme theory 'testing'
The analysis of the interviews with Citizens Advice staff led to 17 IPTs (Figure 3). Interviews with 22 clients were conducted to test the initial programme theories. The interviews were transcribed and imported into NVivo, in the same way as for staff interviews. These were then coded to the appropriate node, under the child node of 'client interviews' (Figure 4).
Often, it was felt that analysis led to coding under a programme theory that didn't quite 'fit'. At this point, a team member would call a full team meeting for programme theory refinement. Thus, whilst coding of the transcripts was done by independent team members, programme theory refinement was carried out by the team as a whole using the main node which encapsulated data from CA staff interviews, client interviews and the literature.
The full team (5 people) met bimonthly (or more regularly if necessary) to discuss analysis and refine the programme theories. After 4 interviews had been carried out with clients, the team felt that the programme theories required refinement. As noted above, this was highlighted due to issues in coding to the nodes we currently had; team members were finding their coding did not 'fit' with the current nodes (programme theory), suggesting refinement was required. As individual team members had analysed interviews, they came to this meeting with evidence-based and theory-driven ideas as to how the programme theory should be refined. A process of debate then ensued, anchored by reading data extracts together as a team, in order to refine current or create new programme theories which capture and explain all data. The process was therefore twofold: individual team members coded single interviews to pre-existing nodes; the whole team then reviewed the nodes and refined their formulation in view of the data, utilising retroduction. Retroduction refers to the identification of hidden causal forces that lie behind identified patterns, recognising the insufficiency of both inductive and deductive logic (Jagosh, 2020).
Programme theories (PTs) were often 'voided' when unsubstantiated by data. However, in order to ensure they were not forgotten, they were not deleted and remained within the NVivo file. Should relevant data later emerge they could then be 'unvoided'. The authors acknowledge that the term 'void' does not represent the realist premise of theory refinement, where no theories are 'thrown out' of the analysis (Pawson & Tilley, 1997). A better term for these theories could be 'unsubstantiated at that time', as these theories were never discounted, and were sometimes merged with other theories, but regardless the data was never lost. In the spirit of transparency and as we show the inner workings of our NVivo file, the term 'voided' has been used throughout.
[Figure: Opening the scrapbook. Screen shot of node linked memo, with restated programme theory, highlighting how it was edited using additional coloured writing and the strikethrough font function.]
Often, in the process of refining programme theories, the names of the theories themselves would evolve. For example, 'Basic Needs' changed to 'stop gap' to more efficiently capture the essence of the programme theory. The final list of nine programme theories is shown in NVivo in Figure 5.
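The practice of 'voiding' rather than deleting theories, and of renaming them as they are refined, amounts to changing a status flag while leaving the coded data in place. A minimal Python sketch of that idea follows. It is an assumption-laden illustration rather than a description of the NVivo file: ProgrammeTheory, its fields and the 'Example theory B' entry are hypothetical, and only the 'Basic Needs' to 'stop gap' renaming is taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class ProgrammeTheory:
    """Hypothetical record for one programme theory node."""
    name: str
    status: str = "active"                             # "active" or "voided"
    former_names: list = field(default_factory=list)   # trail of earlier titles
    extracts: list = field(default_factory=list)       # coded data, never discarded

    def void(self):
        """Mark as unsubstantiated at this time; coded data stay in place."""
        self.status = "voided"

    def unvoid(self):
        """Reinstate the theory if relevant data later emerge."""
        self.status = "active"

    def rename(self, new_name):
        """Change the title while keeping a record of the old one."""
        self.former_names.append(self.name)
        self.name = new_name


def active_names(theories):
    """Names of theories currently contributing to the analysis."""
    return [t.name for t in theories if t.status == "active"]


theories = [
    ProgrammeTheory("Basic Needs", extracts=["placeholder extract"]),
    ProgrammeTheory("Example theory B", extracts=["placeholder extract"]),  # hypothetical
]

theories[0].rename("stop gap")   # renaming reported in the text above
theories[1].void()               # hypothetical: unsubstantiated at this time
print(active_names(theories))
```

Because voiding only changes a flag, the full set of theories considered during the analysis can still be listed and counted at any point.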

Overall explanatory framework
An overall explanatory framework to understand how advice impacted on CA clients' stress and wellbeing was developed from the programme theories, informed by substantive theory. Figure 6 displays the final list of programme theories, and whether they were 'voided' or contributed to the overall explanatory endeavour.
Specific substantive theories were identified through both structured searches and the project team's own theoretical knowledge base.
[Figure: Final list of 'tested' programme theories, also displaying those that were 'voided' throughout the analysis.]

Discussion
Realist evaluation and realist programme theory building is an iterative process and often demands engagement with numerous data sources. This paper provides consideration of how we conduct theory-driven realist research, how theories start as hunches, which are then refined using evidence. It also highlights how these theories are the focus of discussion and disputation amongst scholars, where the theories are refined, judged, sifted, winnowed and tentatively unsubstantiated. Use of NVivo allowed us to capture these theory generation discussions, whilst thinking out loud and being immersed in the data in a shared way. This allowed us to better share and synthesise perspectives from the data as a group, rather than in isolation. It also meant that no reflection was lost, ambiguous or unable to be challenged and refined in the future.
The paper illuminates this important and poorly understood aspect of realist analysis. The use of NVivo could aid in the pragmatics of engaging in the 'messy' and iterative process of realist sense making from multiple data sources, thereby enhancing rigour as an audit trail of the analytical process is documented, and transparency as no step in the process of analysis was lost to this documenting endeavour. Whilst neither we, nor realist researchers in general, aim or want to find a method to audit qualitative research, we propose that NVivo can aid in the complex process of programme theory development, refinement and testing, whilst increasing transparency; even if this transparency is of use only to the internal evaluation team. Use of NVivo allows 'tracking' of initial through to tested programme theories, with the use of linked memos and different data sources (e.g., literature, client interviews, staff interviews) utilising child nodes. The number of programme theories can be tracked, and no programme theories are forgotten in the multifaceted and iterative analysis due to the process of 'voiding'. This not only provides clarity whilst carrying out complex realist analysis, but also when writing for publication or presenting interim findings.
Figure 6. List of programme theories; those that were 'voided' and those that contributed to final (middle range) explanatory framework.

Overcoming issues of transparency
As a project team, we found that use of NVivo allowed for essential retroduction and group production of refined programme theories, drawing on all of the various expertise in the team. This was more time consuming than progressing analysis in isolation, but it carried more explanatory potential because it drew on the knowledge of all team members. This issue is not specific to NVivo; it applies to all group projects using realist analysis. However, we feel that NVivo aided in the group process by tracking all aspects of programme theory refinement using linked memos, thus enhancing transparency. It also provided a safeguard against unforeseen events, for example, researcher illness.
Importantly, the technology did not decrease the amount of time needed to read, conceptualize, and analyse data (Bringer et al., 2004). Data analysis in realist research, involving the identification of underlying generative causal outcome patterns, is iterative and time consuming (Punton et al., 2016; Robert et al., 2017). Using NVivo did not necessarily reduce analysis time, but did make writing up findings easier, due to clarity in justification of findings. It provided an anchor for team 'brainstorming' around the development and testing of programme theories, in a way that was very pragmatic and grounded in the data. It meant that the whole team could engage in data analysis, whether they had physically collected some, none or all the data. It thus provided a space for team members to challenge each other's interpretation in a productive disputatious space, where everything was recorded systematically. We do not wish to encourage an instructive or 'one size fits all' approach to the activities of theory generation; as with realist approaches in general, the theory generation, refinement and testing documentation processes should be tailored to the individual project. This should be thought through, and decisions about technology thoroughly considered, alongside other creative means of theory generation. The process of using NVivo meant that there was a thorough sense-checking procedure in place, adhering to the systematic and thorough application of the principles of qualitative research, which added rigour to the analysis (Barbour, 2001). The approach provides quality assurance that is more complex than checklist 'technical fixes', as described by Barbour (2001).

Engaging with multiple data sources
A further benefit of using NVivo was the ability to upload both primary and secondary data which can be used for coding. This allows literature to be considered as data, which is consistent with a realist approach. It therefore allows prior theoretical ideas, concepts, models or propositions to be used in relation to theoretical sampling and theory generation (Layder, 1998). Furthermore, as the blending of evaluation and synthesis continues (e.g., Cooper et al., 2017; Maidment et al., 2017), we feel there is much scope for NVivo to be useful within this approach, which integrates both literature and empirical data.

Challenges and future research
The process does have inherent challenges; although the software is fairly user-friendly, becoming familiar with NVivo and its functions can be time consuming. Furthermore, system issues can present further problems. For example, due to institutional system restrictions at the time, in our project only one researcher could work on the file from a shared drive at once. This meant that the master file, which was saved on a password protected institutional shared drive, had to be downloaded on individual computers while working on it, and re-uploaded to ensure data protection on a secure drive. However, these issues are not unique to realist approaches (Bergin, 2011).
This constituted the research team's first attempt at the use of NVivo in a realist evaluation. There are undoubtedly other ways in which NVivo can be employed in a realist project. For example, nodes could be used for Contexts, Mechanisms and Outcomes instead of programme theories. The use of NVivo will (and should) differ depending on the individual project; all realist projects require tailored data collection and the analysis should also be project dependent. More complex functions could also be employed in NVivo, for example, using matrices, and we highlight this as an avenue for future research. Realist evaluations are carried out from different disciplinary perspectives and use a plurality of methods that are fit for purpose due to the method neutrality of the approach. NVivo currently allows input of audio, text based, and visual material, and more innovative approaches to data collection could be incorporated into realist analysis using NVivo. For example, stimulated recall (Calderhead, 1981), which utilises video technology, could be used where appropriate.
Finally, in this specific project, we feel we could have further integrated the substantive theories considered and thus added an extra layer of transparency at this level of abstraction; we will look to action this in future projects utilising NVivo and realist approaches.

Conclusion
RAMESES I and II have provided a platform to improve the understanding and reporting of findings in realist research, by developing consensus and evidence-based reporting guidelines (Wong et al., 2016). We have shown how the use of NVivo in realist methods has the potential to aid realist researchers in the complex process of theory development, refinement and testing. It may also add transparency to the approach, by using several NVivo functions in innovative ways. Having illustrated how we used the different functions offered by NVivo in one realist evaluation project, we invite other researchers to take our work further, and to explore and advance the use of NVivo in realist methods.

Disclosure statement
Ethical approval for the study was granted by Northumbria University's Ethical Approval system on 01/06/2015; all participants from the study provided informed consent to participate and for publication. The data collected in the study is not readily available due to ethical constraints. Materials used throughout the study are available upon request. The authors declare that they have no competing interests. This study was funded by the National  (Welsh Assembly Government) and the Wellcome Trust, under the auspices of the UKCRC, is gratefully acknowledged. SD drafted the original manuscript; NF, PH, ML and SMC aided in revisions and study execution. The authors would like to acknowledge our practice partners Citizens Advice Gateshead and thank all the participants who took part in the initial study. We would also like to thank Fuse (The Centre for Translational Research in Public Health) for their support throughout the project.