End-of-Life Decision Making Between Doctors and Parents in NICU: The Development and Assessment of a Conversation Analysis Coding Framework

ABSTRACT We report the development and assessment of a novel coding framework in the context of research into neonatal end-of-life decision making conversations. Data comprised 27 formal conversations between doctors and parents of critically ill babies, recorded in two neonatal intensive care units. The coding framework was developed from a qualitative analysis of the recordings using the method of conversation analysis (CA). Codes underpinned by our qualitative analysis had in the main moderate to strong agreement (inter-rater reliability) between coders; three codes had lower agreement reflecting the use of euphemisms for death and disability. Coding these interactions confirmed the significance of the doctors’ talk in terms of parental involvement in decision-making, whilst highlighting areas warranting further qualitative analysis. This quantifiable representation provides a novel outcome based on evidence that is internal to the conversation rather than influenced by other factors related to the baby’s care or outcome.

In neonatal intensive care, doctors are regularly faced with treatment decisions for babies who are born very premature or with life threatening or compromising conditions. These decisions include the possibility of redirecting care from full intensive care to palliative care. Such decisions require the participation and agreement of parents, often through a series of conversations. We recorded and analyzed conversations between doctors and parents in order to identify patterns that might be associated with parental participation, and with alignment between parents and medical teams regarding their baby's future treatment. Our preliminary analyses using conversation analysis (CA) suggested that certain linguistic formats (strategies) used by doctors in explaining a baby's current state of health, giving a tentative prognosis, or introducing possible care options in the immediate future, were more likely to result in alignment between doctors and parents, and thereby facilitate decision making (on the importance of CA's methodology for understanding medical decision making, see Gulbrandsen, 2020). In contrast, it appeared that parents were less likely to align with doctors when other techniques or formats for delivering prognostic news and recommending courses of action were employed; these other formats appeared to result in resistance by parents to certain medically recommended courses of action (care and treatment). However, we are aware that qualitative analysis alone may not be a sufficient basis to inform or advise medical practice and that the ability to quantify these conversations in a structured way could support our contentions and facilitate further research.
In this paper we report the development of an inductive coding frame that could appropriately be applied to doctors' and parents' communicative contributions in these interactions, one that would reflect and retain the qualitative observations emerging from our CA research whilst at the same time enable us to quantify our results so that they might contribute usefully to future medical practice.
There were two principal and closely interrelated difficulties in developing an appropriate coding frame. The first was the difficulty in implementing the kind of coding frame that has been applied successfully in CA research into medical interactions/communication. The second was the implicit, indirect, and even allusive nature of the language that is characteristic of these conversations, especially on the part of doctors. Whilst CA is predominantly a qualitative methodological approach, CA studies of medical interactions have adopted the quantification of coded interactions for precisely the reason that we have done: in order that medical professionals could be assured that our qualitative results are sufficiently systematic, reliable, and robust to be adopted in medical practice and training. Part of the success of CA-informed coding in medical interactions derives from codes that are based on formal 'surface' lexico-syntactic features of talk/turn design. Other approaches to coding in medicine, such as Roter's influential Interaction Analysis System, adopt a 'top-down' approach requiring (as do most coding schemes) some interpretive connection between actual verbal conduct (e.g. the words used) and a particular, often rather broadly expressed, code; that is, an interpretation or judgment based on the content of what is said. By contrast, CA-informed coding schemes have focused on the social actions being performed through particular turns or sequences of talk, and on the design of those actions, utterances or turns. By focusing on the relationship between turns of talk, CA-informed coding schemes enable us to identify misunderstanding and interactional problems (see Alsubaie et al., 2021). A good example of a CA-informed coding scheme is the work of Stivers et al. on the formats with which general primary care practitioners recommend certain treatments to patients. These 'treatment recommendations' are formatted as pronouncements ('I'll start you on treatment with iron tab-supplements'), suggestions ('I would get some twelve hour Sudafed'), proposals ('Why don't we put you on the plain Allegra'), offers ('if you'd like I can give you uh-sample of uh nasal steroid spray'), or assertions about a treatment's benefit ('Muscle relaxants are a very good choice in this type of pain'). These formats for making recommendations are easily coded on the basis of surface linguistic features of doctors' speech. Another example of a similarly 'bottom-up' approach is the coding framework of Chappell et al. (2018) for decision making in neurology outpatient clinics, based on a detailed analysis of different approaches made by doctors to invite patients to take part in decisions.
We have adopted aspects of these CA-based coding schemes; for instance, we have coded for whether the doctor or parent initiated the moment of decision making, and as far as possible coded on the basis of the linguistic design of talk rather than an interpretation about content. However, 'as far as possible' is a necessary caveat. There is something especially challenging in conversations between neonatal doctors and parents of very sick babies when parents are being asked to participate in decisions to limit life-sustaining treatments (LST) for their baby, i.e. moving to palliative care. Whilst it is essential to maintain alignment between doctors and parents in reaching such decisions, the consequences of these decisions can so easily result in conflict between parents and medical teams. In these circumstances, especially when there may be considerable prognostic uncertainty (Han et al., 2021), doctors' language is often indirect, even euphemistic. For instance, as we explain below, one of our codes was whether doctors refer to a baby's possible death. However, explicit references to 'death' are rare; more commonly doctors use idiomatically euphemistic expressions such as 'we're coming to the end of the road'. There was not therefore a range of equivalent alternative lexico-syntactic formats that could be arrayed on a continuum, as in the study by Stivers et al. Whilst adhering as far as possible to the principles of CA-based, bottom-up inductive coding according to specific alternative formats, the coding frame we developed required pragmatic judgments about whether a form of words constituted a given action or fitted a given code.
This required a rather novel approach to building a coding frame, one that would enable us to code the extraordinary nuance and indirectness of talk in neonatal intensive care conversation into categories for which there were no clear-cut action types (i.e., comparable to pronouncements, suggestions, proposals, offers, and assertions, for treatment recommendations). It should be noted that although our approach was novel, there are parallels with that adopted in Chappell et al. (2018) and Reuber et al. (2015, ch.2).
The challenge, therefore, was to design a coding frame that would incorporate our qualitative (CA) observations about forms of language that were often much less transparent, more indirect and allusive than the language, for instance, of primary care. We could not count solely on formal lexicosyntactic features of (the surface of) turns at talk; we had instead to code for underlying and implicit communicative messages that might be considered too painful for parents to be stated explicitly.
One other aim of developing a coding frame suitable for quantifying our qualitative observations was to be able to validate the reliability of our emerging results. These suggested that when faced with the decision of whether to limit life support for their critically ill baby, parents fared better when doctors presented options rather than making recommendations. Also, we found evidence that parents were more involved in their decision when the doctor discussed such options with them and that this approach reduced conflict between parents and doctors, results that should inform neonatal practice and training.
Therefore, we designed a coding frame that was underpinned by our qualitative observations and analysis that would enable us to examine distributional differences emerging in these interactional patterns; at the same time our coding frame was designed to enable us to capture the circumlocutions and indirect formulations by doctors, and the frequently quite implicit, indirect responses by parents. This paper describes the development of the frame and its validation.

Materials and methods
Observational data were collected from two Level 3 neonatal intensive care units in England as described previously. Families were recruited whose baby was critically ill and where it was considered likely that the parents would be engaged in critical care decision making. Recordings were made on audio (Site 1, 2013-2014) or video (Site 2, 2015-2016) media. From 51 families recruited to the study, we identified 21 (27 conversations) in which the possibility of limiting LST was presented to the parents. There were 9 conversations with one parent and 18 with both parents present. The conversations involved 14 different consultants (11 neonatologists, 3 cardiologists); 9 were male and 5 female. Decision-making sequences were identified and transcribed according to CA conventions, capturing not only what was said but prosodic and temporal aspects of speech. Our key findings from previous conversation analytic studies were used to develop the coding framework. Data were coded from transcripts, in conjunction with the audio/video recordings, by two independent coders (CS and KC).
The study received approval from London-City & East Research Ethics Committee, NHS Health Research Authority (12/LO/1949), as well as Research and Development approval from participating NHS Trusts. All parents and doctors who took part in the study provided written informed consent. Parents were introduced to the study by their consultant and subsequently recruited by either their consultant or a member of the research team, after providing further information and obtaining informed consent.

Results
The results are organized into two sections: the development of the coding tool, and the validation process and results.

Development of the coding tool
The components are addressed chronologically. The coding tool addresses the first decision point (e.g. who initiated, how initiated), aspects of any subsequent decision point(s) and finally how parents responded. Focusing on each of these enabled us to explore the relationship between how the decisions were introduced and how parents engaged in the conversation. The codes all describe features of the conversation, and more specifically the design and participants' understanding of action. For each code, the relevant segment of the conversation is coded according to specific categories. Whilst it was possible in some cases to code instances straightforwardly as one category or another, we encountered several systematic ambiguities, as detailed below.

Initiation of the first decision point
We developed codes for the way in which decisions were introduced and thereby initiated. This included 'best interest' and 'partnership', as well as mentioning 'disability', 'death', 'who initiated the decision', and 'when the decision is actually made'. Subsequent decision points within the same conversation were not coded for the same features, as these first decision points foreground subsequent decision points, making categorization too complex. We will consider this further in the discussion.
Decision-making sequences were initiated by either the medical team or parents. This code was binary according to who first initiated talk about future courses of action. Doctors initiated the decision (the decision point) by raising the matter of future treatment, as highlighted in the following example:
Dr: And obviously up to now we've been trying everything we can with Jamie. We continue to do so but think if he's not showing the ability to come off the ventilator, we do need to consider what he's going through now and what he'd have to go through in the future. In terms of treatment. (S2F7R1)
Parents initiated the decision in one of two ways, either by enquiring about future courses of action (e.g. 'what's next?') or by asserting a preference for a particular course of action, prior to the decision being raised by the doctor:
Dr: . . . .I mean I think it's very likely that he'd have cerebral palsy. Uh probably be a severe type so.
F: From what we-we've talked about it wi' two of us to some extent. AN:DHH. (.snff) to try a' put in a nutshell 'cause I-obviously not being very if there's anything can be done, even if it was a lifetime in a wheelchair . . . It's better than losing 'im. (S2F10R2)
Because both death and severe disability are key determinants for considering limiting LST in neonatal care (Larcher et al., 2015), we explored whether the use of these categories in decision initiations had any meaningful relationship with parental participation in the conversation. We therefore coded whether death and disability were each referred to: 1) explicitly, 2) implicitly, or 3) not at all.
In line with Ekberg et al. (2019), explicit references to death included the words 'death' and 'dying', as well as euphemistic expressions, which clearly indicated end-of-life. For example, the doctor in the next extract used a euphemistic expression 'end of the road' to convey the baby's likely pessimistic prognosis:
Dr: And unless we can do that, we can't do anything else useful. And that's why I'm worried that we're coming to the end of the road . . . (S2F7R1)
Other references were made to deteriorating health status or possible poor outcomes:
Dr: One of the things that I think we need to sort of think about together is what we will do and what will be in her best interests if she were to deteriorate. (S2F1R1)
The difference between explicit and implicit references to death was not always clear-cut, however, as in the following example:
Dr: I showed you a paper, you got the paper, haven't you. Uhm to say about the problems in the chromosome. And uh which has not got a very good ou-outcome unfortunately. (S1F1R4)
Ambiguous cases such as this, where death is neither explicitly mentioned (as seen in the first two extracts in this section) nor clearly implied, rely on interpretation; the term 'outcome' could refer to end-of-life (explicit), or alternatively the baby's quality of life (implicit).
Explicit references to disability included words describing or labeling a disability the baby was at risk of developing, such as 'handicaps' or 'cerebral palsy':
Dr: And I think it's very likely that he would have some very important long-term consequences and handicaps and that if he were able to come through this. And by that, I mean I think it's very likely that he would have cerebral palsy, er, probably be severe type. (S2F10R2)
Implicit references to disability were made through descriptions that were vague and ambiguous, creating an uncertainty about coding them one way over another, as in the following example:
Dr: I think the chances of him having a good quality of life have significantly changed from when he was poorly before. (S2F6R3)
The words 'good quality of life' (S2F6R3) rely on interpretation and the negative outcome for the baby is ambiguous.
Doctors referred to a decision either as one to be made by the medical team, or as a plan that should be made collaboratively with the parents. The former was associated with the 'recommendations' format and was accompanied by less parental involvement compared to the latter. We therefore coded for whether there was: i) explicit orientation to a joint decision/joint plan, ii) reference to a medical team decision, or iii) no clear orientation to either. When a joint decision or plan was explicit, the doctor tended to specify that it was a team decision specifically including the parents, as in the following example:
Dr: What we all need to really decide as a team, you know all of us; the nursing staff, the junior doctors, myself and you, is knowing everything we know about her start in life and the fact that we have really major concerns about her future but should she deteriorate, would you want us to bring the machine back out. (S2F1R1)
When the decision was presented as a medical team decision, the doctor referred only to the medical team, without reference to parental involvement:
Dr: . . . I think our feeling as a team is that we should offer to palliate him and to discontinue intensive care. I think that would be in his best interest and ultimately he would he would die. I'm sorry. (S2F6R4)
In the third category, 'no clear reference to either', the doctor referred to 'we' without clearly specifying who is included in the decision making:
Dr: Well we don't need to make any urgent decisions but, one option in this situation is to consider whether it's actually right to continue with the intensive care support that she's, she's having . . . (S2F13R1)
On other occasions, there was ambiguity in the extent to which the decision process included the parents. In the following extract, the parents were recognized as having a role in the decision-making in so far as they may object to a course of action, rather than being involved fully in making the final decision:
Dr: If it happens, we need to decide, whether we should put him back to the breathing machine or whether, we need to just to make him happy, comfortable, give comfort care . . . So I'm not putting any pressure, if you feel you are uncomfortable, we will do everything what he needs. (S1F1R1)
Doctors may refer to the baby's 'best interest' or use a similar phrase, thereby evoking moral certainty, e.g. 'the right thing', 'the kindest thing'. The 'recommendations' format has been found to be coupled with such references, whereas option listing sequences tended either not to use such references, or to use them in a way which frames 'best interest' as yet to be determined. We therefore coded the use of 'best interest' in terms of whether it was mentioned, and how it was used. This included:
i) the use of 'best interest' (or similar term) to underscore certainty:
Dr: We as a group of doctors, and nurses, feel, in his best interest, we should, change his active intensive care into palliative care (S2F13R1)
ii) the use of 'best interest' (or similar term) to frame the decision-making process:
Dr: One of the things that I think we need to sort of think about together is what we will do and what will be in her best interests if she were to deteriorate. . . (S2F1R1)
iii) the use of 'best interest' (or similar term) in a way that was not clearly one or the other.
We coded the initial decision according to the type of limitation to life sustaining treatment being introduced. This distinction is outlined in professional guidelines by the Royal College of Pediatrics and Child Health (RCPCH) and includes the following three categories (Royal College of Pediatrics and Child Health [RCPCH], 2004):
i) Withdrawing LST: ending current life supporting treatment such as ventilatory support.
Dr: . . .One option, in this situation is to consider whether it's actually right to continue with the intensive care support that she's, she's having. . .would it actually be kinder to refocus on taking her off the breathing machine. . . (S2F13R1)
ii) Withholding LST: withholding treatment that has not been started, which could mean surgery or other invasive procedures for example.
Dr: Now when-our plan is to when we go out, if he's still well, take the tube out. . .when we do that, there's fifty fifty chance, one in two chance, he may breathe one in two chance, because of either the heart problem, or because of, chromosome abnormality, he may develop apnea. . .if it happens, we need to decide, whether we should put him back to the breathing machine, or whether, we need to just to make him happy, comfortable give comfort care. (S1F1R1)
iii) Do not Resuscitate (DNR) Order: This refers specifically to withholding procedures to restart the heart and breathing in the case of cardio-respiratory arrest, including the use of both adrenaline and cardiac compressions.
Dr: . . .should she deteriorate in the vent-in the ventilator, I don't think it's appropriate, uh doing chest compression, or doing, uh ma-giving medication to kick-start the heart . . .. (S2F19R1)
We also coded the moment when a decision point was reached, including whether:
i) the decision was deferred by the doctor:
Dr: I'm not asking you to make a decision right now . . . I don't think there's any immediate need for you to make that-you know for us to make that decision. (S2F1R1)
ii) the decision was deferred by the parents:
P: Can I think about what I would like to do? (S2F9R1)
iii) a decision was made in the conversation:
P: Um the current thinking is that if Aran requires intubation again, that we are fine to do that and then make a decision as to whether that's continued, once he's intubated . . . (S1F15R1)
There was on occasion some ambiguity concerning who was deferring the decision. In the following example, the father implicitly asked for the decision to be put on hold whilst they pursued a particular course of treatment. However, it was the doctor who explicitly referred to not making a decision now (to limit LST):
F: Can we see if that-infusion line can help . . . ((the nurse then confirms with the doctor that the line has just been put in))
Dr: So certainly we can we can wait for that. An' see if it makes a difference. We don't have to do anything now . . . (S2F6R3)

Overview of decision-making process
We then developed two codes to summarize the invitation to make a decision, and the decision-making format. We previously observed that deferring the decision avoided the demand for or urgency of an answer and provided parents with an opportunity to ask questions. A code was therefore included for invitations to parent(s) to make a decision, in three categories:
i) a decision deferred by the doctor:
Dr: The difficult question and the difficult thing that you will now need to go away and think about is what do we do if after we take the tube away he doesn't cope. (S1F4R1)
ii) parents were invited to decide at that moment:
Dr: We as a group of doctors and nurses, feel in his best interest we should change his active intensive care into palliative care. (S1F8R1)
iii) a decision was raised but without a clear expectation that the parents make a decision at that moment:
Dr: But one of the things that I need to make sure that you're also aware of is that given the multitude of problems that Abdallah has that it would not be an unreasonable decision for us all to make, for us not to give chest compressions an' not to give drugs. If that is what collectively was felt to be in his best interests. And I just need to make sure that you both understand that. (S2F2R1)
Because doctors rarely explicitly invited parents to make a decision, this category involves some interpretation of the extent to which a decision is being invited in that moment, rather than relying on binary categories or linguistic formulations. This meant there was some ambiguity in coding examples such as the one above (S2F2R1), where the doctor might be understood to be giving information, rather than soliciting the parents' perspective on a treatment decision.
We then classified the type of decision format along the lines we had deduced from our previous CA research:
i) Recommendations, in which one course of action is presented and explicitly endorsed as the best course of action:
Dr: We as a group of doctors and nurses, feel in his best interest we should change his active intensive care into palliative care. (S1F8R1)
ii) Single-option choice format, in which the doctor presents a course of action which is dependent on parental choice, without explicitly referring to or listing other options:
Dr: But one of the things that I need to make sure that you're also aware of is that given the multitude of problems that Abdallah has that it would not be an unreasonable decision for us all to make, for us not to give chest compressions an' not to give drugs. If that is what collectively was felt to be in his best interests. And I just need to make sure that you both understand that. (S2F2R1)
iii) Options, a format in which the doctor explicitly refers to or lists multiple options:
Dr: So we need to make a plan, if that acute arrest or stop breathing kind of things happen, how far we should go like whether just mask ventilation suction mask ventilation oxygen or putting the tube in to the breathing pipe an breathing, and pumping the heart, and giving the medicine as well. (S1F15R1)

Parental responses to the decision
Having recorded the style and content of the process initiated by the doctor, we then coded the parental responses in terms of their opportunity to have a conversation about the decision and process, and how they expressed their preferences within the episode, amounting to an overall 'response score'. The response scores were ordered, with high scores indicating greater parental participation. The novelty of these codes is their focus on the alignment between the doctor and parents within the decision-making process, rather than whether the parent agrees with the doctor or not. In other words, it is about whether the parents are given the interactional space to ask questions or assert a preference which might be different to that of the doctors (or despite such differences), without having to resist the doctor's actions.
There were two codes for parental responses: 'opportunities for questions prior to making a decision' and 'expression of preference'. The first of these codes captured the extent to which parents were given opportunities to ask questions. A score of 3 indicates that the parents were either invited by the doctor to ask a question (S2F7R1), or the parents freely volunteered a question (S2F6R3). On occasions when questions were neither invited nor volunteered, or when a parent's question could be construed as challenging (Koshik, 2003), a score of 2 was given:
Dr: The team night team felt we should not put the tube back again. I work with you, he's our child, we to make informed decision.
M: Why did they say that. (S1F1R2)
The lowest score of 1 was given when the parents asked explicitly challenging questions:
M: Yeah but can't he still have the machine on while I'm holding him. (S2F9R1)
Again, these categories lie on a continuum of response types rather than being discrete, bounded categories. This meant that it was not always easy to categorize a parent's response; for example, the extent to which a parent's questions may be considered challenging was sometimes ambiguous. When coding was considered truly ambiguous for this category, a middle score of 2 was given.
The second response code, 'expression of preference', captured whether parents asserted their preference freely, reflecting the extent of interactional trouble, or the alignment between parents and doctors. Occasions when parents asserted their preference freely with minimal resistance, when the parent agreed or concurred with the doctor's proposal to defer the decision, or when the parent requested a deferral, were scored 3.
Preferences asserted with minimal resistance:
F: Well I'm unchanged as to our last decision, which was an escalation from the previous decision . . . the current thinking is that if Aran requires intubation, that we are fine to do that . . . (S1F15R1)
Occasions when the parent responded minimally, for instance by nodding or only briefly acknowledging, could be heard as passive acceptance or implicit resistance (Heritage & Sefi, 1992), and were scored 2.
Tacit/passive acceptance/implicit resistance (including nodding):
Dr2: Personally I'd feel much more comfortable if we could agree that we wouldn't do that. We treat anything that's reversible but not-. . . not do . . . treatment that wasn't benefiting him.
M: ((Gentle nod)) (S2F17R2)
Non-concurrence with the doctor's deferral was also scored 2:
Dr1: . . . And you need to have a really good think about what the right thing for your baby is and obviously we'll do what we can to help . . .
Explicit resistance, including where parent deferrals were used to resist a doctor's recommendation, was scored 1:
M: Do I have an option there? (S1F1R3)
M: You're telling me to kill my baby basically. (S1F24R2)
Ambiguous cases were again scored 2.
In this section we have shown that systematic ambiguities can arise in these conversations, with consequences for coding what is said by doctors and parents. Such ambiguities made it difficult to code interaction as straightforwardly as in the other coding schemes outlined above, and we have illustrated how the coding frame we developed incorporates them.
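To make the structure of the frame concrete, the codes and categories described in this section can be written down as a small lookup table against which each coded decision point is checked. The following is only an illustrative sketch: the code names and category labels are our paraphrases of the text, the record models a first decision point (for which all codes apply), and treating the total response score as the sum of the two ordinal response codes is an assumption rather than something stated above.

```python
# Hypothetical representation of the coding frame. Names are paraphrased
# from the text and are not taken from the authors' coding materials.
CODING_FRAME = {
    "who_initiated": {"doctor", "parent"},
    "mention_of_death": {"explicit", "implicit", "none"},
    "mention_of_disability": {"explicit", "implicit", "none"},
    "partnership": {"joint", "medical_team", "unclear"},
    "best_interest": {"certainty", "framing", "unclear", "not_mentioned"},
    "decision_type": {"withdrawing", "withholding", "dnr"},
    "when_decision_made": {"deferred_by_doctor", "deferred_by_parents", "made"},
    "when_invited": {"deferred_by_doctor", "invited_now", "no_clear_expectation"},
    "decision_format": {"recommendation", "single_option", "options"},
    "parental_questions": {1, 2, 3},        # ordinal, 3 = most participation
    "expression_of_preference": {1, 2, 3},  # ordinal, 3 = most participation
}


def validate(record):
    """Return a list of problems found in one coded decision point."""
    problems = []
    for code, allowed in CODING_FRAME.items():
        if code not in record:
            problems.append(f"missing code: {code}")
        elif record[code] not in allowed:
            problems.append(f"invalid value for {code}: {record[code]!r}")
    return problems


def total_response_score(record):
    """ASSUMPTION: total score is the sum of the two response codes (2 to 6)."""
    return record["parental_questions"] + record["expression_of_preference"]


# Hypothetical coded decision point (not data from the study).
example = {
    "who_initiated": "doctor",
    "mention_of_death": "implicit",
    "mention_of_disability": "none",
    "partnership": "joint",
    "best_interest": "framing",
    "decision_type": "withholding",
    "when_decision_made": "deferred_by_doctor",
    "when_invited": "deferred_by_doctor",
    "decision_format": "options",
    "parental_questions": 3,
    "expression_of_preference": 2,
}
```

Ambiguous cases, which the frame resolves with a middle score of 2 or an 'unclear' category, need no special handling in such a representation because they are already valid category values.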

Validation
Because there can be multiple decision points during a particular interaction, a first step required the coders to agree on which segments of the conversation counted as a decision point, for coding purposes. Two independent CA experts (CS and KC) then coded the 27 conversations. Data were entered into separate Excel spreadsheets for each coder and then added to an SPSS database. Cohen's Kappa was used to measure the inter-rater reliability of coders on a total of 12 codes. These 12 codes are represented in Table 1.
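For readers unfamiliar with the statistic, Cohen's Kappa compares the observed proportion of agreement between two coders with the agreement expected by chance from each coder's category frequencies. The sketch below shows the calculation for one nominal code; it is not the SPSS procedure used in the study, the function name is ours, the labels are hypothetical, and confidence intervals are not computed.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' nominal labels over the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement, from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical labels for ten decision points (not data from the study).
coder_a = ["doctor"] * 7 + ["parent"] * 3
coder_b = ["doctor"] * 6 + ["parent"] * 4
kappa = cohens_kappa(coder_a, coder_b)
```

In this hypothetical case the coders agree on nine of ten items (observed agreement 0.90) while chance agreement is 0.54, so kappa is 0.36/0.46, roughly 0.78.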
Internal reliability was then assessed by comparing the significant relationships between variables for each coder. The Kruskal-Wallis test and pairwise tests were used to identify any significant relationships between the decision formats used and the parents' response scores. Pearson's chi-square test was used to identify any significant relationships between categorical variables.
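As an illustration of the first of these tests, the Kruskal-Wallis statistic H ranks all response scores together and asks whether the rank sums differ between groups, here the three decision formats. The sketch below computes H with the usual correction for ties, which matters for ordinal scores with few distinct values. It is a minimal stand-in for the statistical package used in the study; the function name and the scores are hypothetical.

```python
import math


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, corrected for ties, for a list of samples."""
    if any(len(group) == 0 for group in groups):
        raise ValueError("every group needs at least one observation")
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}   # distinct value -> average rank
    tie_term = 0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        count = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += count ** 3 - count
        i = j
    if tie_term == n ** 3 - n:
        raise ValueError("all observations are identical; H is undefined")
    rank_sum_term = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical total response scores by decision format (not study data).
scores_by_format = {
    "recommendation": [2, 3, 3, 4],
    "single_option": [5, 6, 5, 6],
    "options": [4, 5, 6, 6],
}
h = kruskal_wallis_h(list(scores_by_format.values()))
# With three groups the chi-square approximation has 2 degrees of freedom,
# for which the upper-tail probability is exp(-H / 2).
p_approx = math.exp(-h / 2)
```

The chi-square approximation is only asymptotic, so with samples as small as those in the example the resulting p-value should be read as indicative.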
A total of 45 decision points across 27 conversations were coded. In 9 conversations there was one principal decision point. In the remaining 18 there were between 1 and 5 further decision points across the conversations, with most conversations having 1-2 decision points in total (interquartile range 1-2).

Inter-rater reliability
The two coders categorized all identified decision sequences across the corpus. Agreement was interpreted using the categorization of McHugh (2012), which avoids the ambiguity of the broad groups originally proposed (Cohen, 1960), namely 'substantial' agreement for scores ranging from 0.60 to 0.79 and 'almost perfect' for greater values. Individual coder responses are shown in Figure 1. Kappa values for 'decision format' and 'parental questions' were 0.80 (95% CI 0.65-0.95), and for 'who raised the decision' Kappa was 0.81 (95% CI 0.56-1.00), denoting 'strong agreement', whereas 'expression of preference' (k = 0.70, 95% CI 0.52-0.87) and 'total response score' (k = 0.62, 95% CI 0.46-0.78) showed 'moderate agreement'. These features were identified by our qualitative process and considered to be important indicators of parental involvement in the decision-making process. In addition, 'moderate agreement' was shown for 'partnership' (k = 0.71, 95% CI 0.48-0.94), 'best interest' (including whether best interest was mentioned or not) (k = 0.71, 95% CI 0.48-0.94), 'when decision made' (k = 0.76, 95% CI 0.55-0.96), and 'decision type' (k = 0.69, 95% CI 0.45-0.93). In contrast, the weakest agreement was shown for 'when invited' (k = 0.52, 95% CI 0.34-0.71), 'mention of disability' (k = 0.46, 95% CI 0.12-0.81), and 'mention of death' (k = 0.24, 95% CI −0.07 to 0.54). This level of agreement is instructive, despite the low kappa values, as it further indicates areas where parents may have had difficulty in following the conversation due to the allusive and indirect language frequently employed.
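The interpretation bands can be applied mechanically to the kappa values reported above. The sketch below encodes the bands published by McHugh (2012): up to 0.20 none, 0.21-0.39 minimal, 0.40-0.59 weak, 0.60-0.79 moderate, 0.80-0.90 strong, and above 0.90 almost perfect. The function name and the shortened code labels are ours; the kappa values are those given in the text.

```python
def mchugh_band(kappa):
    """Agreement label for a kappa value, following McHugh (2012)."""
    if kappa > 0.90:
        return "almost perfect"
    if kappa >= 0.80:
        return "strong"
    if kappa >= 0.60:
        return "moderate"
    if kappa >= 0.40:
        return "weak"
    if kappa > 0.20:
        return "minimal"
    return "none"


# Kappa values as reported in the text for the 12 codes.
reported_kappa = {
    "decision format": 0.80,
    "parental questions": 0.80,
    "who raised the decision": 0.81,
    "expression of preference": 0.70,
    "total response score": 0.62,
    "partnership": 0.71,
    "best interest": 0.71,
    "when decision made": 0.76,
    "decision type": 0.69,
    "when invited": 0.52,
    "mention of disability": 0.46,
    "mention of death": 0.24,
}

bands = {code: mchugh_band(k) for code, k in reported_kappa.items()}
```

Applied to these values, three codes fall in the 'strong' band, six in 'moderate', two in 'weak' ('when invited' and 'mention of disability'), and one in 'minimal' ('mention of death').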

Internal validity
There was statistically significant agreement between the results of both coders when assessed using our previously identified outcome groups, providing internal validity for the relationship between variables. As reported elsewhere, response scores varied significantly (preference: p = .005; total response score: p = .002) and with borderline significance (questions: p = .053) according to the decision format used when Coder A's and Coder B's scores were combined. Similar results were also found when comparing Coder A's and Coder B's results separately (Figure 1). Pairwise comparisons indicated that, after Bonferroni correction, total response scores were higher for single-option choice (conditional) formats (Coder A p = .003; Coder B p = .002) and options (Coder A p = .062; Coder B p = .061) compared to recommendations. The associations between 'best interest' and 'decision format', as well as between 'partnership' and 'decision format', previously reported for Coder A's and Coder B's combined scores, were likewise found for Coder A and Coder B separately (see Appendix 1). For both coders, when phrases about the baby's 'best interests' were used with certainty, they tended to accompany recommendations, whereas when they were used to suggest a framework for guiding the decision, they tended to accompany single-option choice formats or options (for both coders: two-tailed Fisher's exact p < .01). A significant association was also found between decision format and partnership score for both coders (two-tailed Fisher's exact p < .01), as found previously for Coder A's and Coder B's combined scores. Recommendations tended to be presented as a medical team decision, whereas single-option choice formats and options tended to be presented with either an explicit reference to a joint decision, or with no explicit reference to either.
Comparable associations were also found between the three decision formats and whether doctors or parents had initiated the decision point. When parents initiated the decision point, doctors usually either referred to or listed possible options (see Appendix 2). Doctor initiations, in contrast, tended to be followed by the recommendation or single-option choice (conditional) format. The association was significant for Coder A and Coder B (two-tailed Fisher's exact p < .05).
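The Bonferroni correction mentioned for the pairwise comparisons can be sketched as follows. This is a hedged illustration: the raw p-values are hypothetical, the function name is our own, and we assume the common form of the correction in which each raw p-value is multiplied by the number of comparisons (three, for three decision formats) and capped at 1. The text does not state which form of the correction was applied.

```python
from itertools import combinations

# The three decision formats named in the text.
FORMATS = ["recommendation", "single-option choice", "options"]


def bonferroni_adjust(raw_p):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(raw_p)
    return {pair: min(1.0, p * m) for pair, p in raw_p.items()}


# Three formats give three pairwise comparisons.
pairs = list(combinations(FORMATS, 2))

# Hypothetical raw p-values, one per pair (not taken from the study).
raw_p = dict(zip(pairs, [0.001, 0.02, 0.40]))

adjusted = bonferroni_adjust(raw_p)
for (first, second), p in adjusted.items():
    print(f"{first} vs {second}: adjusted p = {p:.3f}")
```

An equivalent approach is to leave the p-values unchanged and compare each against the significance threshold divided by the number of comparisons.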

Discussion
We have reported the development and validation of a novel coding framework for conversations between doctors and parents about decision-making around limiting life-supporting treatment in neonatal intensive care. Its novelty derives from the development of a system to quantify the interaction based on conclusions derived from conversation analysis. It identified robust measures of aspects of the interaction that have good inter-rater reliability and consistency, together with several codes that have less agreement but are nonetheless instructive, in that they are markers of the complex and allusive language used by professionals in conversations to refer to challenging issues such as death or impairment. Utilizing such a system provides a rapid and robust method of summarizing the interaction that can be used in training and in future studies. Furthermore, this outcome measure is internal to the conversation and thus not influenced by other aspects of the baby's care or final outcome, being derived from transcripts of real-time conversations. In this, it shares common features with other CA-informed coding frameworks (Chappell et al., 2018; Heritage et al., 2007).
Our coding framework was designed to reduce, as far as possible, the drawbacks involved in interpreting language for coding purposes, by using grammatical and binary categories, and analyzing talk within its sequential context. Thus, our framework is more robust than one relying entirely on interpretation and inference. However, in a context in which doctors frequently used circumlocutions and indirectness, coding decisions could not be made so easily or directly on the basis of simple grammatical and other linguistic features. The challenge was therefore to develop a coding framework that was sensitive to the ambiguities of doctors' talk, which still required interpretive decisions to be made by coders. Using this approach, our coding categories reduce as far as possible the interpretive aspect of coding. The indirectness, ambiguity and nuance of doctors' talk, characteristic of this context, rendered this difficult at times, and yet the coding framework has, to a great extent, mitigated this challenge. This is supported by our moderate to strong inter-rater agreement for codes, and by the consistency demonstrated by both coders, one of whom was new to the project and had not previously been involved in the CA study.
The strengths of our study alluded to above have to be set against the relatively small sample size from which we have derived our conclusions. The use of the coding frame will facilitate the evaluation of conversations going forward. The weak agreement found for some codes identifies areas in which doctors will require guidance, as part of their training, to be more direct in their speech.
This study reflects the iterative and mixed-methods approach we have taken to our study of communication of end-of-life decision making in neonatal care. We commenced with qualitative analysis of real-time conversations between doctors and parents, then quantified the patterns of talk that we observed in the initiation and formulation of the issues, and their consequences for parental response and participation. These findings help to identify strategies that doctors can use to approach this area of conversation, and the ability to code conversations rapidly allows us to build our dataset more efficiently. We now intend to use CA to explore issues in greater depth, using our coding framework to characterize important aspects of the talk, alongside refining our use of CA in training opportunities for established consultants and neonatal doctors in training.