Should we share qualitative data? Epistemological and practical insights from conversation analysis

ABSTRACT Over the last 30 years, there has been substantial debate about the practical, ethical and epistemological issues uniquely associated with qualitative data sharing. In this paper, we contribute to these debates by examining established data sharing practices in Conversation Analysis (CA). CA is an approach to the analysis of social interaction that relies on audio/video recordings of naturally occurring human interactions and moreover works at a level of detail that presents challenges for assumptions about participant anonymity. Nonetheless, data sharing occupies a central position in both the methodology and the wider academic culture of CA as a discipline and a community. Despite this, CA has largely been ignored in qualitative data sharing debates and discussions. We argue that the methodological traditions of CA present a strong case for the value of qualitative data sharing and offer open data sharing practices that might be usefully adopted in other qualitative approaches.


Qualitative data sharing: nature and context
Interest in the potential reuse or secondary analysis of qualitative data has grown since the 1990s (Heaton, 2008; Hughes et al., 2020). Arguments for sharing and reusing qualitative data include checking of findings, fostering public trust in science, and enhancing research training (DuBois et al., 2018). In addition, existing data can be analysed to produce new findings, which is time- and cost-effective for researchers and avoids unnecessary burden on participants (Kuula, 2011). The first qualitative data repository, Qualidata, was established over 25 years ago (and is now part of the UK Data Service 1). Since then, technological advances have increased the capacity to store and facilitate access to large datasets in repositories (Chauvette et al., 2019, p. 2; Corti et al., 2016). 2 Increasingly, research funders and publishers encourage and even mandate qualitative data sharing (QDS) (Antonio et al., 2019; Chauvette et al., 2019) through policies of open access, shaped by a commitment to principles of transparency and scrutiny, and to maximising the social value of publicly funded research (UK Research and Innovation (UKRI), 2021). In the UK, important milestones for QDS were the adoption by all research councils in 2011 of the Common Principles on Data Policy and in 2012 of the Policy on Access to Research Outputs, which required research outputs to make explicit how the research data would be made available (Bishop & Kuula-Luumi, 2017). With 55% of research in UK higher education funded by research councils, these policies 'strongly influence research practices' (Bishop & Kuula-Luumi, 2017, p. 2). UKRI conducted a review of open access policies and the data sharing landscape in the UK (published February 2022), which reiterated its concordat on Open Research Data (UK Research and Innovation (UKRI), 2021), emphasising that publicly funded research should be openly available with as few restrictions as possible.
Nonetheless, concern about QDS persists. Resistance is often expressed on ethical and epistemological grounds (Chauvette et al., 2019; Mozersky et al., 2020a). Two large-scale surveys of scientific researchers identified a number of recurring concerns: fears relating to participant anonymity, misinterpretation in secondary analyses, invalid conclusions, data errors, being scooped, and researcher burden. Overarching and unifying guidelines, policies, and mandates are also still lacking internationally. Moreover, where they exist, data sharing approaches, policies, and repositories are mostly established with quantitative research in mind (Antonio et al., 2019; Tsai et al., 2016). For example, pre-registration forms required by some funders are often inadequate adaptations of forms designed for quantitative work (Humă & Joyce, frth). Recent empirical research on QDS preparedness in the US found that even repository specialists lacked experience and knowledge relevant to QDS (Mozersky et al., 2020a). Many felt unprepared to advise qualitative researchers, particularly on decisions about sensitive data. Similar limitations in knowledge and preparedness were found amongst qualitative researchers and institutional ethics committee members. This US study found little experiential knowledge of QDS; each group (researchers, ethics committees and repository staff) felt that the primary responsibilities and decisions about QDS lay elsewhere. Even if those groups were to become more experienced, there remains a lack of agreement and guidance on best practice, exacerbated by different requirements between institutions and repositories, and different countries' laws relating to cross-border data sharing. Arguably, resolving these infrastructural and practical concerns depends on also addressing the debates about QDS, to which we now turn.

Debating qualitative data sharing
Literature reflecting on the viability of QDS is dominated by discussion of ethical and epistemological challenges posed by the various forms of qualitative data, each with different affordances, including (inter alia) observational data (e.g. fieldnotes and audio/video recordings), participant-produced data (e.g. diaries) and researcher-elicited data (e.g. interviews). In this section, we review the literature outlining these challenges and the impact they have on qualitative researchers' commitment to data sharing and data reuse. We discuss ethical and epistemological challenges that exist across the range of qualitative methods as well as those relating to specific methods.

Ethical debates
Some qualitative researchers argue that it is impossible to ensure that research participants know what they are consenting to when it comes to data sharing, that is, how data might be used in future projects and by other researchers (Parry & Mauthner, 2004; see also Chauvette et al., 2019). Consent for data sharing can only ever be secured in a general manner for and about the process itself, rather than for specific research (Irwin, 2013, p. 297). The issue, then, is whether it can be considered ethical to share data when fully informed consent for the myriad ways data might be used can never be achieved.
The counterargument, however, is that it is impossible to be fully 'informed' about all aspects of research (even research questions, for example, may not be formed prior to data collection in some qualitative approaches; Bishop, 2009). Hence, this is not a reason to dismiss QDS. Additionally, qualitative research participants, who invest both time and emotion in the research, generally seem to support data sharing and to assume it occurs to a greater extent than it does (Kuula, 2011). Indeed, participants in sensitive qualitative studies, interviewed by Mozersky et al. (2020b), reported broad support for QDS where data are anonymised, although, when pressed, they expressed concern about confidentiality and potential misuse or misunderstanding in future research. This research also suggests that qualitative research participants trusted research institutions and their researchers to be sufficiently transparent with data collection and sharing plans. These findings echo similar work on the trust and value of research, such as Parry et al. (2016), who surveyed research participants and reported that most regard qualitative video-based research as acceptable, and Williams et al. (2010), who reported that an overwhelming majority of participants believed that recording was worthwhile.
There are, however, other potential ethical issues. Qualitative data can be highly sensitive, confessional and intimate. Ensuring confidentiality and anonymity and protecting participants from unintended identification are vital, but rigorous anonymisation is challenging and both time- and labour-intensive. In some forms of qualitative research, even if all possible measures of anonymisation are taken, there is still the possibility of identification, as total anonymisation is impossible (Hopkins, 1993). For example, in longitudinal data collection or in other forms of data collection that link different sets of data together, the accumulation of information and associations potentially presents a greater disclosive risk (Law, 2005). Similarly, qualitative research that takes place in small communities or on phenomena that are rare may be particularly hard to fully anonymise (Chauvette et al., 2019; Hardy et al., 2016).
This poses a challenge to QDS on two fronts: the integrity and quality of certain forms of data are compromised when they are digitally altered to ensure participant confidentiality (such as video data; Bishop, 2009, p. 262), and the reuse of original (unedited) data runs the risk that researchers outside of the original project may not know what should be anonymised and how it should be anonymised. These are challenges whose answers may have profound consequences for the scientific analysis (Corti et al., 2000).
It is true that there are no straightforward or 'one size fits all' answers to these issues (see Humă & Joyce, frth). However, it has also been argued 3 that 'too often the critics of reusing qualitative data have narrowly construed the debate to focus solely on participants - to the exclusion of other agents - and rights - to exclusions of duties. Such arguments do not do justice to the depth of moral debate required' (Bishop, 2009, p. 258). In this sense, advocates of QDS argue for a broader range of ethical considerations. From this perspective, the benefits to knowledge, policy and society from data sharing (weighed against, in the majority of cases, minimal risks to participants) need to be given greater emphasis (Bishop, 2009). For example, data sharing can mean avoiding the unnecessary intrusion and burden on participants that result from collecting data that already exist.

Epistemological debates: the problem of context
The central epistemological debate in QDS literature concerns the contextual nature of qualitative data. One sense of this contextuality relates to the relationship between the researcher and the overall research process. In this sense, qualitative data is argued to be contextual in that it reflects the original researcher's positionality, beliefs, judgments, disciplinary assumptions and boundaries, as well as their theoretical and methodological inclinations and intentions within those disciplinary boundaries (Irwin, 2013). These aspects are embedded in the data, uniquely shaping its constitution and analysis. Reflexivity is thus a central practice of the qualitative paradigm, requiring the researcher to explicitly examine how these underpinning beliefs and practices have shaped the data and its analysis. However, for many qualitative approaches, this practice is not available in secondary data analysis (Mauthner et al., 1998). As a consequence, many qualitative researchers maintain that only they (and their team) can analyse their data in a contextually fitted and adequately reflexive manner.
A second aspect of the contextuality of qualitative data focuses more narrowly on different conceptualisations of context, particularly with respect to how it relates to talk/text. Berg (2008, pp. 186-188) describes three ways of conceptualising context: as (broad) extra-discursive template, where the relation between text and context is predefined (e.g. Critical Discourse Analysis); as (narrow) intra-discursive product, where context is only relevant when demonstrably made relevant in a participant's talk (e.g. Conversation Analysis); and as (intermediate) conditions of discursive production, where the necessary contextual information depends on the focus of the research and the data being used (e.g. ethnography of communication). The broad and intermediate conceptions of context view qualitative data as generated in a specific time and setting of which the primary researcher necessarily has first-hand, intimate experience. In this sense, ethnographic fieldnotes, for example, may be difficult or impossible for a researcher who did not participate in the original research to interpret meaningfully (Chauvette et al., 2019). As Hammersley (2010, p. 3) notes, 'in the process of data collection researchers generate not only what are written down as data but also implicit understandings and memories of what they have seen, heard, and felt, during the data collection process'. According to this perspective, the extent of contextual understanding in secondary analysis will necessarily be more limited and interpretive (Hammersley, 2010). This understanding of context, however, is not in harmony with CA's perspective, which we will consider later.
Other contributors to the view of context as an intra-discursive product posit that data are constructed and not independent of the research process; in other words, 'context' is not ontologically separate from data (Mauthner & Parry, 2009). Ethnomethodologists (Garfinkel et al., 1981; Lynch, 1982) have questioned how 'data' is even first granted that status by researchers, and how a discipline's technical language and concepts must be deployed to alert others to the presence of data (Maynard & Clayman, 1991). From this perspective, researchers do not 're-use' data, because data are constituted for the first time in a particular research project (Moore, 2007; see also Bishop, 2007).
The past two sub-sections have explored the existing literature concerned with data sharing in qualitative research. This has shown gaps in researchers' agreement on the value of, and commitment to the practice of, QDS; significant interpretive and practical difficulties associated with this; and a series of ethical and philosophical questions regarding the sharing and reuse of data. The paper now turns to understandings and approaches to data sharing in CA and considers their potential to address these gaps, difficulties and questions.

Centrality of data reuse in conversation analysis
Before detailing CA's contributions to data sharing debates, it is necessary to provide a brief overview of CA's history and its arguably unique relationship with data sharing. 4 CA focuses on practices and social actions rather than people or experiences, which means it is commonplace to ask very different questions of reused data. In the early days of CA, data collections tended to be limited to audio only. Initially, Harvey Sacks, while a researcher at The Suicide Prevention Centre, analysed recorded phone calls for his PhD thesis in 1966. 5 Since its inception, data has been central to CA's concerns, with much of the early development of CA by Sacks drawing on two collections of audio recordings: calls to the suicide hotline and group therapy sessions (Sacks, 1992). Over the following decades, a number of phone call corpora were created, notably recordings taken from around Santa Barbara in California, which the CA community refers to as the Newport Beach corpus or, more commonly, 'Classic data', and Elizabeth Holt's 'Holt corpus' of phone calls recorded by a British family over 3 years. These are data which are widely available and widely reused and on which much of the groundbreaking work in CA is based. 6 The (re)use of this data speaks to a core ideal in CA: that data used in research should be made available to check findings (often in the form of transcripts, but also with visual representations (see Walker, 2017) or in the sharing of audio/video data). Sacks illustrates this point:
'It was not from any large interest in language or from some theoretical formulation of what should be studied that I started with tape-recorded conversations, but simply because I could get my hands on it and I could study it again and again, and also, consequentially, because others could look at what I had studied and make of it what they could, if, for example, they wanted to be able to disagree with me.' (Sacks, 1984, p. 26, emphasis added)
The technology of the time influenced the practices of the discipline: recordings could be replayed, scrutinised by others and made available for future studies by other researchers. Sacks goes on to explain how he chooses the data he works with:
'People often ask me why I choose the particular data I choose. [...] And I am insistent that I just happened to have it, it became fascinating, and I spent some time at it.' (Sacks, 1984, p. 27)
Data that CA researchers just happen to have has been used and reused in a number of studies addressing a wide range of interactional phenomena. CA, with its grounding in ethnomethodology, examines the observable practical common-sense reasoning as revealed in the data itself to make sense of how the social world is constituted in local environments. This means that the distinction between primary and secondary analyses disappears because the source of evidence is always constituted for the first time. This fits with Moore's (2007) understanding that in data reuse, analysis is always primary but of a different order of data. Hughes et al. (2020) extend this position by articulating 'the range of approaches and practices involved in producing different orders of data' (p. 568). One example is Gibson's (2019) primary (rhetorical) analysis of Milgram's classic obedience experiment data, which offers a reinterpretation of the core insights around obedience and persuasion. Similarly, Hughes et al. (2020) show how interview data might be reused to examine features of relational dynamics. In this way, QDS can open up novel avenues of research and lead to further scrutiny of prior findings. 7 Opening up novel avenues of research is one benefit of QDS; other benefits are explored by Jepson and colleagues, who reflect on reasons for creating their primary care consultation archive, a corpus of recordings of GP consultations, linked survey responses and patient records named the 'One in a Million' corpus:
'Data sharing provides considerable added value in terms of minimising data collection costs, reduced environmental impact, and patient and practice burden. This will support low-cost studies including doctoral-level research, thus building research capacity in primary care.' (Jepson et al., 2017, p. 350)
Their argument expands Sacks' point: that a chief reason for sharing data is so that other researchers (particularly those who are at an early stage of their careers) can just happen to have that data. This is a point the paper will return to later, but first it is necessary to describe the kinds of data that CA usually works with and the essential characteristics of that data.
CA draws on recordings of naturally occurring 8 social interaction. This can include any site where interaction occurs between participants, including (but not limited to) online chat logs (e.g. Meredith & Stokoe, 2014), institutional encounters (e.g. Drew & Heritage, 1992), phone calls (Holt, 1996), AI (e.g. Mair et al., 2020; Suchman, 2007) and video recordings (e.g. Mondada, 2018). In this vein, research interviews can be viewed as a site of interaction (Potter & Hepburn, 2012). CA research can be broadly placed within one of two camps: 'pure CA' and 'applied CA'. Antaki (2011) explains that 'pure CA' focuses on interactional practices and procedures detached from any type of context (e.g. Jefferson, 1988) and takes an endogenous orientation to the conversation itself rather than drawing on analytic insights about the institutional context. By contrast, 'applied CA' focuses on interactional practices within a certain setting (e.g. Drew & Heritage, 1992; ten Have, 2007) and provides an evidence base for interventions (e.g. Stokoe, 2014; Wilkinson, 2015). A discussion of the debate regarding the two terms can be found in Antaki (2011).
The procedures for both camps are largely the same: recordings of social interaction are gathered and analysis proceeds with 'unmotivated looking'; that is, as Psathas (1990) explains, the researcher discovers what is happening in the recordings rather than searching for predetermined phenomena. Data collection of interaction recordings does not normatively involve the researcher, which enhances the usefulness of the data for reuse and reanalysis. To outsiders, the unmotivated looking and efforts to remain exogenous to the data collection process may seem unstructured and haphazard, but the methodological technology imposes a high degree of rigour to account for and evidence unmotivated discoveries (see Liddicoat, 2007, p. 9; Schegloff, 1996a, pp. 172-173 on accounting for phenomena). In short, data analysed in this way aim to avoid mediation by the subjective perspective of the researcher.
Despite the growing range of data that CA researchers draw upon, core data sharing principles have remained unchanged since its foundation: that others ought to have access to the data, including, ideally, the original video/audio recordings, so as to scrutinise the researcher's analysis, and that data is usually made available for pre-publication data sharing sessions (referred to as 'data sessions') as an integral part of the method. Recordings which are particularly sensitive may be subject to greater sharing restrictions, which may be mitigated by heavily anonymising the recordings (e.g. voice altering and video manipulation) or by asking data session participants to sign a non-disclosure agreement and return all materials after a meeting. These more extreme measures are often the result of ethics committee requirements, and not of the science itself, with many sensitive anonymised data sets shared without such restrictions in place.
To summarise, over the course of CA's history the core principle that data should be shared and reused has established formal and informal practices for handling data. The remainder of the paper describes CA's understanding of and approach to ethics and epistemology, and explores CA's established procedures and practices for sharing data with the intent of widening ongoing debates and allaying some of the persistent concerns in qualitative research about data sharing.

Ethics
Conversation Analysts deal with many of the same ethical dilemmas experienced by other forms of qualitative research. For CA studies, which collect recordings of social interactions, ensuring anonymity for participants in shared data can be technically complex. It minimally requires the deletion of names, dates and locations. However, other components that make participants identifiable require more complex anonymisation decisions, for instance, anonymising voices and faces, or whether to remove specific details such as references to a participant's medical condition. Moreover, a participant may, in the course of a recording, indicate that some part should be anonymised through either explicit mention (e.g. Speer & Hutchby, 2003) or by blocking the recording equipment (e.g. Mondada, 2014). There is a profound understanding in CA that simply removing names, dates and locations might not always be sufficient for anonymisation.
It is not possible to predict every ethical dilemma which may arise in the course of research; hence, ethical solutions cannot be prescribed a priori. These ethical questions will persist as (hopefully all) researchers endeavour to protect their participants from harm. However, when the possibility of data sharing is built into research procedures, ethical safeguards become even more central to the research design (see Albert & Hofstetter, frth for a discussion). This includes providing information to participants about data reuse (and its associated risks) along with consent forms which allow participants to decide whether, and in what forms/contexts, their data may be shared for future research. 9 This has, for example, been the approach taken in the National Institute for Health Research (NIHR) funded CA project known as 'Real Complaints' (Real Complaints, 2021).

Context
CA can uniquely contribute to debates about the problem of context in QDS. Precisely what is meant by 'context' is arguably 'fuzzy' (Van Dijk, 2007, p. 285) across the social sciences, as it can be a shorthand to denote a specific situation, or the historical/geographical/cultural environment, of the object of investigation. However, CA's particular way of dealing with 'context' does not entertain contextual explanations of phenomena. Handling context in CA has been debated at length 10 (see ten Have, 2007, pp. 58-59; Wooffitt, 2005, pp. 168-179 for reviews). In short, CA does not assume that aspects of context such as social categories (race, gender, power, class, etc.) are relevant a priori. Rather, context is dealt with analytically if, and only if, it is procedurally relevant and demonstrably attended to by the interlocutors themselves 11 (see Schegloff, 1992). Hence, Irwin's (2013) concerns about the contextual qualities of qualitative data are not normally relevant in CA.
This should not be read as necessarily advocating for this way of handling context in qualitative research generally. The point being made is that when sharing data, it cannot be foreseen how it may be (re)used. The endogenous understanding of context espoused by CA means that it does not carry any 'burden' of externally imposed context to delimit what it can be used to demonstrate. It cannot be expected that the data which researchers share will only be used by those within the same discipline or even those who share similar interests; rather, researchers ought to anticipate that the shared data may be used beyond the scope of the original research (see the previous discussion on the practices of producing different orders of data; Hughes et al., 2020).
Returning to the discussion of the extent to which data may be meaningfully interpreted by researchers outside of the original research, new and fruitful avenues of investigation can be found in the reuse of data collected for alternate purposes (e.g. Gibson, 2019; Hughes et al., 2020). For certain qualitative approaches, the arguments by Irwin (2013) and Chauvette et al. (2019) are salient, but different approaches with different epistemologies may make use of data in ways unforeseen by the original researchers. Data collected for one particular purpose may be meaningfully reinterpreted with CA because of its focus on phenomena demonstrably enacted and treated as relevant by participants in the discourse.

Data sharing in conversation analysis: practical aspects
As a fundamentally collaborative discipline, CA has fostered a culture 12 and tradition of data sharing out of which has emerged a community of practice: 'CA is a community, although with various degrees of intensity. As it has become established as a quite solidly and specifically defined approach in the human sciences, you can, by working in the CA tradition, become "a member" of that community.' (ten Have, 2007, p. 11) Typically, research which makes its data available does so following completion of the project, whether defined as the publication of a (final) article or the overall conclusion of the funding period. Data collections may be described on websites such as the Open Science Framework during the research process but are not commonly available until after project completion. We refer to this widespread form of QDS as 'corpus sharing', distinguished from the practices and solutions employed by the CA community to add levels of transparency and rigour to the analysis, which we refer to as 'data sharing as a research practice'. This section returns to the fears outlined previously and discusses the practical aspects of the CA approach to data sharing both as corpus sharing and as established research practice.

Data sharing as a research practice
CA is a community of practice with a particularly democratic impulse: both the analysis and the research process build from the ground up, with students, practitioners, and experienced CA researchers able to contribute insights through data sharing sessions. Data sharing is thus 'baked into' the research process. During the research process, there are options for qualitative researchers to share data and findings in progress at conferences, seminars and research meetings, but CA is distinctive in that focused data sharing meetings (referred to as 'data sessions') are an integral part of the scientific process and disciplinary culture, and researchers regularly share their data at them.
Data sessions are structured research meetings where direct access to the recordings is made available to other researchers to scrutinise. 13 Although data is often analysed by multiple researchers during the research process across the gamut of qualitative (and quantitative) research approaches, data sessions are distinctive in the sense that data is subject to scrutiny and analysis by others outside of immediate research teams and institutions. Findings can be independently checked, and ideas collaboratively explored (ten Have, 2007).
The procedures of a data sharing session can vary amongst research groups, but usually the data presenter shares recordings and transcripts with the group. The transcripts will contain more detail than is perhaps necessary for the data owner's interest 'because even if, say, pauses or overlaps are not germane to the current analysis, some other researcher might want to use the same materials for checking findings or for novel analytic purposes' (Jordan & Henderson, 1995, p. 48). Participants see/hear the recording several times, as transcripts are recognised to be a static and partial representation of interaction. After a moment or two of quiet 'thinking' time, the group will propose observations. There is usually some rule that group members are initially limited to a single observation to encourage a collaborative, democratic ethos with equal access to contribute.
Crucially, sharing data allows for a more transparent analytic process for the data owner and for learners of the method. The practices and procedures of the data session might be usefully adopted by other qualitative approaches to fit with the Open Science movement (see Humă & Joyce, frth) by not only corroborating findings, but also making explicit the discussions and analysis of data often done behind closed doors. We argue that collaborative analysis of data adds another level of rigour to the analytic process, in which banal observations may be retold as composed and refined analytic points, and flawed analysis or invalid conclusions recognised and corrected. The value of the data session cannot be overstated and highlights possibilities for data sharing beyond making data sets available in a repository post-project. Worries about being scooped by sharing one's data prior to publication are greatly outweighed by the benefits of the data session, and indeed, new projects may be launched, and collaborations proposed, following such sessions. Sharing is thus baked into the research process through the tradition of the data session, through which both new and seasoned researchers and, where relevant, participants involved in the data, are invited to witness and scrutinise data on their own terms.

Corpus sharing
'Data sharing' typically refers to post-project data sharing in a repository, where a final corpus of data is made available to other researchers. This paper treats a corpus of data as data (in whatever form) associated with a single project, whereas a repository of data is a resource where multiple corpora are stored and made accessible to others. The gold standard of data sharing is typically regarded as unrestricted access to data, which is shared with accessibility in mind, and is fully described and indexed so that other researchers can easily search and understand the data (e.g. Mass Observation Archive and British Library Sound Archive), and importantly, check analysis. CA is not unlike other qualitative disciplines in that corpora are held in various places (some, such as TalkBank, are specific to communication data), and although the ideal is unrestricted sharing, in practice there may be gatekeepers or restrictions on accessing data. Moreover, data is often shared through informal networks rather than through a formal repository.
It is impossible to predict how data may be reused, which carries with it both benefits and drawbacks. Within CA, the units of analysis are discursively realised practices and social actions, rather than people or experiences. The approach allows for the study of phenomena whose context is endogenously constituted within the talk itself. In this way, subsequent researchers can ask very different questions about the data, focusing on what is being done in the interaction and not the original purpose of the data collection (which for CA researchers is a context that is not relevant to the analysis). For example, data from Heritage et al.'s (2007) study investigating how doctors encourage patients to voice concerns in consultations was used by Heritage (2012) in a study focusing on the relationship between epistemic status and stance. Similarly, interview data from a study by Hepburn and Brown (2001) asking how secondary school teachers use 'stress' to manage their accountability and make sense of their institutional role was used by Potter and Hepburn (2005) to critique the (over)use of interviewing in qualitative psychology. This illustrates that all data, irrespective of method of collection, might be repurposed to generate novel findings, potentially in novel ways.
The generation of findings is, however, only one argument for post-project data sharing. Conversation Analysts have long advocated for direct access to the data 14 presented in empirical articles. Providing corroborative evidence, which is prepared effectively (see Walker, 2017), for research claims allows others to independently check those claims. To repeat Sacks's observation, 'others could look at what I had studied and make of it what they could' (Sacks, 1984, p. 26). Much of the foundational CA work which reused recordings (e.g. from the Newport Beach corpus or the Holt corpus) had such an impact in the community because fellow researchers were familiar with the data and able to independently check findings.

Conclusion
This paper contributes to discussions around the viability and usefulness of QDS, adding insights from the established traditions of CA to widen those discussions and to advocate for a more open and flexible approach to QDS. Ongoing debates emphasise the ethical and epistemological barriers to QDS and are framed by a distinction between primary and secondary data. We argue that CA's conception of data and context makes this distinction redundant. Currently, the demands of funders, Open Science and legal restrictions influence decisions about what gets shared and how, but inexperience and lack of consensus on best practice for QDS persist. Our aim has been to reflect on the long history of sharing data in CA, the impetus for sharing within the CA community, and how these procedures might be drawn on by other qualitative approaches.
We discuss two types of data sharing that are baked into the design of CA studies: sharing as research practice, and corpus sharing. We show how the 'data session', while not unique to CA (see also grounded theory), enables the research process to build from the ground up and that this collaborative analysis adds rigour to the analytic process. Corpus sharing is a more traditional understanding of data sharing, and while matters of context and ethics present as barriers to reusing data, for conversation analysts, having access to the original recordings of analysed encounters is considered the gold standard. The expectations, tools and procedures of CA facilitate more transparent QDS within the community but, as with most other approaches, they rely on the researcher(s) having sufficient means to engage in QDS.
Beyond the practical barriers to engaging in sharing as a research practice or establishing and sharing a data corpus, many authors point to significant ethical and epistemological barriers for QDS. For ethics, data reuse presents challenges for informed consent, and the high level of anonymisation potentially required might make the data difficult to work with. While we accept that attempting to solve all ethical issues relating to participant consent or prescribing ethical solutions a priori is a fool's errand, building the possibility of data sharing into the research design, as CA does, foregrounds ethical safeguards which can alleviate potential dilemmas. For epistemology, many qualitative approaches consider reflexivity a central practice, making secondary analyses impossible; indeed, different qualitative approaches conceive of 'context' very differently, which again makes the data difficult to work with. CA, as an illustration, does not face these concerns. The distinct way that CA conceives of 'context' dissolves the distinction between 'primary' and 'secondary' data, meaning that data is always constituted for the first time. Unlike a number of qualitative approaches, CA is not 'burdened' by externally imposed context to delimit what it can be used to demonstrate. To be clear, we are not advocating that all qualitative approaches follow CA's conception of context but instead are arguing that future use of any data can never be predicted and that all data, irrespective of method of collection, might be repurposed to generate novel findings in potentially novel ways.
We have demonstrated that sharing insights from CA can allay fears and barriers to QDS and that the long-established and refined tools of CA make QDS much more achievable. CA is, however, not a panacea for all QDS challenges. For many qualitative researchers, particularly early-career researchers or those in marginalised areas, barriers to QDS (whether ethical, epistemological, or economic) can prove difficult to overcome without sufficient support and funding, and thus they may be reluctant to share their data (Pownall et al., 2021), which can adversely impact career outcomes (Siegel & LaMarre, 2019). This is a crucially important topic which we have not discussed at length, and so we encourage further scholarship on this issue.
We conclude by reiterating Sacks's (1984, p. 26) explanation of how he came to study the data that he did: 'simply because I could get my hands on it and I could study it again and again, and also, consequentially, because others could look at what I had studied and make of it what they could'. CA was built on the ideal that data should be shared for the benefit of the primary investigator and the research community. The overall intention of this paper was to spark further discussion of what QDS could look like across the range of qualitative approaches.

Notes
1. A resource that provides guidance on data management and includes a large archive of data and the details of other collections.
2. Data repositories were initially developed to increase the transparency and sharing of data from clinical trials (Antonio et al., 2019).
3. This is influenced by deontological ethics. See Bishop (2009, pp. 257-260) for an overview.
4. The cumulative relationship between CA and QDS is unique but the specific practices and procedures are not unique to the approach.
5. For a fuller picture of the founding of CA see: Psathas (1994), ten Have (2007), Sidnell (2011), and Silverman (1998).
6. Most modern CA research no longer draws on classic data with that collection being normally reserved for teaching.
7. See Humă and Joyce (frth) for a discussion on the relationship between the culture of data sharing and the culture of continuous refinement and replication in CA.
8. 'Naturally occurring' is a slogan in the CA enterprise and usually contrasts with researcher elicited data or scripted talk, but see the debate in Discourse Studies which problematises the 'natural' and 'non-natural' data distinction (Lynch, 2002; Potter, 2002; Speer, 2002a, 2002b; ten Have, 2002).
a particular interest in the syntax semantics interface, there is an emphasis in her research on the interaction between the interactional and the linguistic properties of language in use. Her current focus is the Real Complaints project: an NIHR funded project researching the language of complaints handling in the NHS with colleagues at the Universities of Stirling, Loughborough and Queen Margaret.

Ruth Parry recently retired as Professor of Human Communication and Interaction at Loughborough University UK. She uses audio-visual recordings of real life interactions and an approach known as conversation analysis to capture and understand how we attempt and accomplish things with one another through our interpersonal interactions. She has largely worked on recordings of healthcare interactions. She has interests in difficult communication tasks such as telling someone else what is wrong with their motor performance, and talking about issues such as illness progression and dying. Ruth's other key area of interest is in using analysis of recordings to generate insights into what somewhat nebulous concepts (dignity, patient-centred care) look like in practice. In recent years, with her team, Ruth developed, disseminated, and evaluated a set of communication training resources called 'RealTalk' which incorporate both clips from real life recordings and learning points based upon insights and findings of conversation analytic research. The resources are designed for use by communication trainers within their work in the NHS, universities, and hospices, and aim to increase the evidence-base and authenticity of health and social care communication training. Ruth has also pioneered the adaptation and application of systematic review methods for conversation analytic studies.

Adrian Kerrison is a Postdoctoral Researcher/Postdoktor at Linköping University with the Non-Lexical Vocalizations project. His work uses Ethnomethodology and Conversation Analysis to examine how crowds operate as social actors within large-scale settings such as sporting events, artistic performances, and protests. Currently he is focused on the use of individual non-lexicals (yelps, grunts, etc.) to perform attention, understanding, and assessment of play in sporting contexts.

Richard Simmons is a Professor of Public and Social Policy, and Co-Director of the Mutuality Research Programme at the University of Stirling. Over the last decade he has led an extensive programme of research on the use of voice in public services. This includes four studies funded by the Economic and Social Research Council, a Single Regeneration Budget-funded study, and work for the NHS, Scottish Executive, National Consumer Council, Carnegie Trust, World Bank, Co-operatives UK, NESTA and the Care Inspectorate. He also writes widely on these issues for academic, policy and practitioner audiences. His book, 'The Consumer in Public Services', is published by the Policy Press. As well as a series of journal articles in high-quality international journals such as Social Policy and Administration, Policy and Politics, Annals of Public and Co-operative Economics, and Public Policy and Administration, Richard has written a number of policy-oriented publications and professional journal articles for a practitioner audience. His research interests are broadly in the field of user voice, the governance and delivery of public services and the role of mutuality and co-operation in public policy. The Mutuality Research Programme has acquired an international reputation as a centre of excellence for research, knowledge exchange and consultancy on these issues.