Using mixed-methods, a data model and a computational ontology in film audience research

ABSTRACT This paper discusses a methodology that addresses one of the challenges of working with a range of data in mixed-methods audience research: how to sort, order and categorise different data so that they can be systematically combined and interrogated. The methodology was developed as part of the “Beyond the Multiplex: audiences for specialised films in English regions” (BtM) project. This project required a mixed-methods approach using surveys, interviews, focus groups and document analysis to explore the richness of audience experiences and trends in the context of regional film policy. The project utilised a data model approach that uses the principles of a computational ontology in order to sort, order and categorise data for systematic interrogation. The paper discusses methods, data, coding, and the use of a data model to support data analysis. We argue that this approach enables the cross-referencing of data, which provides a rich, multi-layered and relational understanding of film audiences, but requires time and attention to data management and coding. It also forms the basis of an open access data resource for future research.

To support our analysis we used a method from Information Science: a data model approach that uses the principles of a computational ontology. The purpose of this method is to help researchers sort, order and categorise data for systematic analysis (Beydoun, Henderson-Sellers, Shen, and Low, 2009). The paper discusses the overall research design, the data we collected, how we coded the data and how we developed a data model to help us prepare the data for analysis. The focus of this paper is on methods, and we draw on some indicative findings to illustrate the contribution of these methods and data in our research.
The structure of the paper is as follows. Section Two outlines the policy context of our audience research and Section Three discusses conceptualising audiences and film worlds. Section Four addresses the methodological challenges of audience research.
Section Five covers research design, methods and data to provide concrete details of the mixed methods before outlining how we prepared and managed our data (Section Six).
Section Seven describes how mixed methods worked with a data model that was informed by the principles of a computational ontology. Section Eight reflects on the advantages and disadvantages of this methodology. We conclude by arguing that this methodology requires time and attention to data management, but it provides consistency for querying data and helps to yield rich and multi-layered understandings of film audiences.

The policy context of our audience research
Central to an 'audience development ethos' within UK film policy is a focus on inequality of access to a broad range of film, including 'specialised film', at a regional level (DCMS, 2012). The UK Film Council (UKFC) established the goal of distributing "...a more diverse range of films to a broader UK audience..." (UKFC, 2003, p. 8), introducing the term 'specialised film' to designate a category of film, distinct from mainstream or commercial genres, that it would support with public funding. UKFC saw specialised films as separate from mainstream film in terms of country of origin (e.g. foreign language), genre (e.g. documentary), age (e.g. classic films), aesthetic form (e.g. artists' moving image), content (e.g. engagement with political or social issues), or representation (e.g. gender, ethnicity, sexual orientation, or dis/ability). Following the closure of UKFC, the BFI continues to use the term as a category to report annual film industry activity in its Statistical Yearbook (BFI, 2018).
Engagement with specialised film is lower in the north of England than in London and the South West (Jones, 2015). Part of this is related to differences in the types of venues audiences have access to. For example, the BFI reports that commercial multiplex chains accounted for 91% of cinema screens in North West England and 90% in the North East, whereas they only made up 69% and 63% respectively in London and the South West (BFI, 2018). Inequalities of access to diverse programming and a range of venues have shaped the concerns of public funders, such as the BFI, which in response created the Film Audience Network (FAN), a collaboration of eight regional Film Hubs funded with Lottery money to support greater audience 'choice' in regional contexts.
The use of the term 'specialised film', and the desire of those allocating public resource to address geographic imbalances in film access, raises questions about how we might enable regional audiences to participate in a more diverse film culture. Our research aims to advance a greater understanding of those processes. BtM focused on four English regions (North East, North West, Yorkshire and Humberside, and South West), examining film consumption in theatrical and venue-based exhibition, including multiplex, boutique, independent, and community cinemas, alongside film festivals. We also addressed non-theatrical forms, such as television and online/on-demand platforms, to capture the variety of audience experience.

Conceptualising audiences and film worlds
Audience reception studies have established that film-watching experience is diverse and extensive (Christie, 2012), that audiences are plural in the ways they interpret film (Staiger, 1992), that cultural context matters (Barker, Arthurs, and Harindranath, 2001), and that people's readings of films often differ from those developed through scholarly textual analysis (Livingstone, 2013). To address the diversity of experience we drew on Livingstone's (2013) conceptualisation of audiences as relational and interactive. This required a balance between (1) attention to texts, in our case film, and (2) attention to audiences and their experiences. This means asking, for example, how films, including specialised films, are located and understood as part of people's wider social and cultural practices. This approach emphasises the modes of connection, relationship and communication through which audiences form (Livingstone, 2013).
Reception studies examine the interpretive, interactive, and relational aspects of audiences, but focus less on the market aspects of cultural consumption. To address this with audience development in mind, we drew on Becker's (1982) notion of 'art worlds', which recognises the relations amongst producers, distributors, and consumers in creating cultural markets. By applying this to film, we explored the relations of what we term 'film worlds', composed of relationships between industry leaders, policy-makers, funders, producers, film-makers, distributors, censors, online platforms, broadcasters, festival organisers and programmers, marketers, film-critics, and audiences. The concept of 'film worlds' allows us to address film audiences in a relational manner, accounting for broad trends alongside specific film audience formations and experiences.

Methodological Challenges
There is a long history of contemporary and historical research about film and television audiences (Christie, 2012; Biltereyst, Lotze, Meers, 2012). While methodological and theoretical approaches have evolved over time, there is a long-standing set of tendencies. For example, contemporary audience research often involves either large-scale quantitative surveys to examine broad trends (e.g. Arts Council England, 2011) or small-scale qualitative studies that capture rich detail about audience experiences (e.g. Evans, 2011). Both provide useful knowledge about audiences; however, both have limitations. Findings from qualitative methods are not easily open to generalisation, and quantitative methods cannot fully capture the richness of audience experiences (Johanson and Glow, 2015).
To counter these limitations, mixed-methods and multimethod research is becoming widespread in contemporary audience research (Schrøder, Hasebrink, Hölig, and Barker, 2012). Mixed-methods refers to using two or more research methods and integrating them within a coherent research design (Bryman, 2006). Mixed-methods can provide rich qualitative accounts and analyses of broader trends, and thus hold potential to produce more rounded insights. Using mixed-methods approaches raises questions about how to work with different types of data. Crossley and Edwards (2016) argue that it is possible to combine quantitative and qualitative data, provided researchers are attentive to the practical and epistemic ways that each dataset frames the overall analysis. For Cresswell (2009), this means researchers should analyse data systematically, exploring each type of data and the relations between data. Schrøder et al. (2012) are concerned that mixed-methods research often lacks close attention to the details of data collection, analysis, and interpretation. Like Crossley and Edwards (2016), Schrøder et al. (2012) argue this extends to a lack of concern for how different methods (and datasets) relate to each other, and that there can be a lack of sensitivity towards underlying epistemic differences between datasets. For Schrøder et al. (2012), mixed-methods researchers often assume different datasets can be complementary, or that triangulation (combining different lenses and corroborating between methods) will enable greater validity, without a critical appreciation of how different datasets relate to one another.
For BtM, we developed a data model, which uses the principles of a computational ontology (see section 6) to systematically combine and interrogate different types of data across different datasets. A data model is an abstract description and representation of how data categories relate to one another so that they can be sorted, ordered, and categorised in data storage systems such as relational and XML databases. A computational ontology is a type of data model that describes how data categories relate to one another in accordance with a specific domain of discourse, in our case film worlds (Pidd and Rogers, 2018). This differs from approaches that have also sought to address the concerns raised by Crossley and Edwards (2016), Cresswell (2009), and Schrøder et al. (2012). For example, Barker and Mathijs (2012) combine data through a rigorous stepped process of analysing one method, then another in planned sequence, and Davis and Michelle (2011) use factor analysis as the key driver for their overall analysis while using Q-methodology. Our approach goes beyond integrating or triangulating between different datasets and seeks to achieve mixed research synthesis (Heyvaert, Maes and Onghena, 2013). We are able to analyse a large database of mixed data systematically, irrespective of the data's original source and format, because the data is structured and stored in a single consistent way which reflects the domain of discourse.

Research design, methods, and data
Our mixed-methods research design allowed us to explore how film is consumed and by whom, how people experience and interpret film, and the importance of place and venues in relation to policy and industry trends. It involved the following methods:
• Secondary analyses of Department for Culture, Media, and Sport (DCMS) and BFI survey data to develop socio-cultural profiles of film audiences.
• 200 semi-structured qualitative interviews with a wide range of film viewers to understand the nature of film viewing and audience practices.
• A three-wave longitudinal survey of regional film audience patterns through time.
• 16 film-elicitation focus groups to explore how audiences interpret specialised film.
• Quantitative and discourse analyses of 200 film policy documents to understand policy and industry trends in regional film provision.
• 27 semi-structured interviews with film policy and industry experts to explore different strategies for film distribution and exhibition.
This produced the following datasets:
• 200 x Audience interview transcripts.
• 4 x Survey datasets (one per wave, and one of all waves combined) drawing on N=5,071 respondents.
The research will also generate several open access resources for future researchers:
• 3 x NVivo Project files (including all transcripts).
• Variables from our secondary analysis of DCMS and BFI data.
• A graph database based on our data model.
• A documented version of the data model.
• A website with data visualisation tools, enabling researchers and non-expert publics to use our data and computational ontology.
Rather than producing standalone analyses for each method and then comparing findings manually, we used the data model to compile datasets into a coherent whole, and to map complex interrelationships between them.

Audience and film preferences: secondary analysis of survey data
Film is one of the most common cultural interests in the UK (Northern Alliance and Ipsos MediaCT, 2011). To understand distinctions within UK film consumption, we undertook secondary analysis of two datasets to assess film genre preference and attendance in relation to income, age, gender, education, and urban/rural residence.
To identify how film audiences cluster in relation to socio-cultural backgrounds, film preference and consumption, we conducted latent class analysis (LCA), hierarchical clustering, and regression modelling of the DCMS's 'Taking Part' survey data (2017) and BFI's 'Opening Our Eyes' survey data (Northern Alliance and Ipsos MediaCT, 2011). We identified five clusters of film genre preference within film consumption: 'arthouse and foreign language film', 'romance and romantic comedy', 'drama, comedy, action and thriller', 'fantasy and sci-fi' and 'classic and documentary'. We also identified a specific group of consumers who watch 'arthouse and foreign language' films, and found that this group is also highly likely to watch any film genre. Our analysis shows that people who prefer 'arthouse and foreign language' films are likely to earn more than £30,000 per year, reside in cities, and be more highly educated than people in other genre preference groups. Our initial findings informed later aspects of the research, including interview and longitudinal survey questions and sampling.

Exploring audience experiences: qualitative interviews
To understand people's experiences of film, we undertook 200 semi-structured interviews, 50 per region. We used a snowball sample, which covered a broad range of ages, occupational statuses, and educational levels. The interviews gathered rich data on the types of films participants liked (and did not like) to watch, where and how they watched films, and with whom. We also explored how viewing habits had changed over time, and perceptions of being part of an audience.
Our preliminary analysis identified five themes: types of audiences, practices of film watching, the value of film and cinema, venue and place, and reasons for watching. In the audience theme, we found different senses, scales and meanings of audiencehood.
These related to what people watched, where they watched and how they interacted with others through film, from watching film alone in the cinema to feeling part of a global fan culture. We found that partners, friends and relatives are influential in shaping film choice and how film experiences are shared. We found film and cinema played an important role in many participants' everyday social and cultural lives, and in some cases made a clear contribution to wellbeing.
We also determined the significance of place by examining participants' views on their access to different types of cinema, which showed us how film connected them to other places (both real and imagined). Finally, to understand the context in which participants chose to watch certain films, we identified their reasons for watching in different situations, finding nuanced ideas of escapism to be significant. Overall, the interviews provided insight into how people consume film in a regional context, what they like to watch and where, and the cultural value they place on their engagement with both less familiar and mainstream films.

Audience trends through time: longitudinal survey
To explore regional patterns of film engagement at scale and over time, we undertook a three-wave survey at two-month intervals between August 2018 and January 2019. The first wave collected responses from a regionally representative sample (n=5,071) of adults, replicating key measures from the secondary datasets alongside questions drawn from our interview analysis.
The results confirmed the clustering of film genre preferences found in our secondary analysis (4.1) and provided insights into film watching frequency, who films were watched with, how film experiences were shared, and the factors that influenced film and venue choice. Respondents described their access to cinematic film positively, with 68% finding their local film provision 'good' or 'very good'. We found that in the 12 months preceding the survey:
• 66% of respondents visited a large commercial chain cinema (e.g. Odeon, Vue or Cineworld)
• 24% visited a smaller or 'boutique' commercial cinema chain (e.g. Curzon or Everyman)
• 16% visited an independent or arthouse cinema
• 11% watched a film at a community event or film club
• 9% watched a film at a film festival
We also found that 49.6% of wave one respondents had watched some kind of 'specialised film' in the 12 months preceding the survey. It was this group that the second and third survey waves followed (n=547, n=317, respectively) by asking for the specific films that respondents had watched in the preceding two months, how, where, with whom and what their experience of the film was like. Overall, the three waves provided a detailed picture of patterns of film watching over a six-month period within our regions.

Audience interpretations of film: film-elicitation focus groups
To explore how audiences interpret and make sense of specialised film, we conducted 16 film-elicitation focus groups (four per region) in both urban and rural areas, recruiting participants through snowball sampling. The sample varied by age, gender, ethnicity, occupational status and dis/ability, and included people who self-identified as cinephiles alongside people with little or no experience of specialised film.
To develop our method, we adapted approaches to photo-elicitation (e.g. Kolb, 2018) and film-elicitation (e.g. Philippott, 1993) within our focus groups to explore how the participants interpreted some examples of specialised film. For this, we selected self-contained film sequences to explore people's interpretation of cinematic techniques and film narratives, and representations of both geographically local and more distant cultures in film. The sequences were drawn from eight foreign language and British films released between 2016 and 2018.
Discussion in each focus group explored how participants felt about each sequence, and what they found significant. Participants discussed their interpretations of different aspects of each sequence, e.g. how they related to characters, and the visual and audio aspects of the film. They also discussed how film narratives and aesthetics generated meaning.
Our analysis identified four themes. The first theme showed how viewers located themselves in relation to place, setting and landscape, whether familiar or unknown.
The second theme showed how viewers articulated their emotional identification and investment with characters and situations. Thirdly, we found viewers expressed a sensory appreciation of film style, in terms of the ways they discussed empathy and embodied reactions to film. Finally, the last theme showed how viewers experienced pleasure and labour in their interpretations, e.g., in cinematic techniques they found engaging/disengaging and (at times) in finding subtitles challenging.

Policy analysis
To understand the changing policy and industry contexts in which film-viewing takes place we undertook an assessment of industry reports, annual film release and box office statistics, policy statements, and strategy documents dating from 1997 to 2018, focussing on those published by the UKFC, BFI, and MEDIA/Creative Europe programmes.
This provided statistical data (e.g. number of specialised films released, their box office value) and a descriptive overview of language employed by each organisation to promote their goals. This allowed us to understand how conceptions of audience development were evidenced, articulated and applied, and how public money was allocated accordingly.
Our analysis focussed on how funding is channelled through production, distribution and exhibition to reach audiences in different ways. In doing so, it examined how public investment is directed towards supporting intermediary roles between producers and consumers (Smits, Higson, Mateer, Jones, and D'Ippolito, 2018). It found that during the period covered by our analysis, the UKFC was created and closed, and the BFI was given greater resources and responsibilities. Meanwhile, public investment in film distribution and exhibition decreased, and there was significant change at a regional level as the Regional Screen Agencies were established and a number subsequently closed. The BFI moved away from UKFC's focus on funding technological development (as digital projection expanded), and invested in 'audience development' programmes. This included regional investment through the creation of FAN, for which it has recently increased funding (BFI, 2017). Our analysis found these changes have led towards a greater focus on fostering collaborations amongst exhibitors and Film Hubs at the regional level.

Expert interviews
We interviewed 27 representatives from film distribution and exhibition organisations to gain an understanding of their current priorities and challenges. We selected participants according to professional role, level of industry experience, regional location and decision-making influence (Harvey, 2011). Our sample included senior-management representatives from national cinema support agencies, policy-makers, film-funders and distributors, online platform managers, film-programmers, and cinema staff (from both commercial chains and independent cinemas).
The interviews gathered detail on different organisational approaches to film, programming, marketing and audience development. We found there were significant new challenges to film exhibition and distribution across the UK, including the impact of online streaming subscription services, the role of new 'boutique' cinema chains (which show both mainstream and some independent film) and the implications of the large volume of new films being made and released. These interviews enabled us to situate different business concerns and strategies within the broader context of film access, consumption, distribution, and exhibition.

Preparing and managing data
Developing a data model requires careful attention to the methods of data collection, coding data and how different types of data are managed. In BtM we coded variables from quantitative analysis and dual coded qualitative data.
In our secondary analysis of DCMS survey data (2017), we used variables on frequency of participation, reasons for participating, barriers to participation, and attitudes towards different cultural sectors. We also used film-related categorical variables from the BFI survey (Northern Alliance and Ipsos MediaCT, 2011), e.g. its classification of 'film genres'. We used variables of respondents' demographics from both surveys, e.g. age, education, income and socioeconomic status, marital status, number of children in the household, and whether the respondent lived in an urban or rural location. These variables were the basis of our secondary analyses (4.1), and they generated a new set of variables for predicting and grouping film genres likely to be watched (consumed) and/or preferred based on respondents' demographic data.
To compare our secondary analysis with other datasets through the data model, we ingested the above DCMS and BFI variables alongside our newly generated ones into our database. This helped to refine the items within our data model.
Our longitudinal survey produced variables such as film watching frequency and type of experience (alongside raw survey data) in each wave. Following secondary analysis of DCMS and BFI data, we ingested the longitudinal survey responses (as raw data) and the variables (e.g. multiple response questions created categorical variables) into the database, using them to further refine our data model. The longitudinal survey included sense of film. This had subcodes for 'Life experiences' with further sub-subcodes for different types of life experience, e.g., 'Work as a Nurse [Mental Health]' or 'Unemployment'.
We found that dual coding generated a rich scheme for each qualitative dataset, providing a firm base for analysis. Our process started with open coding data and then moved on to a stage of focussed coding. Open coding provided a broad range of descriptive and conceptual codes. In our focussed coding, we refined the open codes, sorting and ordering them into a hierarchical coding scheme. This enabled us to generate an initial set of working concepts. Where we found a relationship between two codes, we generated a 'relationship code' to link them. For example, some participants described changes in the films they watched, and related that change to progression into different life stages. This led us to generate a relationship code called 'Film Choice (Changes with) Life Stage'.
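One way to picture a hierarchical coding scheme with relationship codes is the following minimal Python sketch. It is an illustration only and not the project's NVivo or database implementation: the class names `Code` and `RelationshipCode` and the placeholder `<parent code>` are assumptions introduced here, while the subcode names and the relationship code label are taken from the examples above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Code:
    """A node in a hierarchical coding scheme (code, subcode, sub-subcode)."""
    name: str
    children: list[Code] = field(default_factory=list)

    def add(self, name: str) -> Code:
        """Create a child code under this one and return it."""
        child = Code(name)
        self.children.append(child)
        return child

    def paths(self, prefix: tuple[str, ...] = ()) -> list[tuple[str, ...]]:
        """Return the path to this node and to every descendant."""
        here = prefix + (self.name,)
        result = [here]
        for child in self.children:
            result.extend(child.paths(here))
        return result


@dataclass(frozen=True)
class RelationshipCode:
    """Links two codes with a named relation."""
    source: str
    relation: str
    target: str

    @property
    def label(self) -> str:
        return f"{self.source} ({self.relation}) {self.target}"


# "<parent code>" is a hypothetical placeholder for the top-level code.
scheme = Code("<parent code>")
life = scheme.add("Life experiences")
life.add("Work as a Nurse [Mental Health]")
life.add("Unemployment")

rel = RelationshipCode("Film Choice", "Changes with", "Life Stage")
print(rel.label)
for path in scheme.paths():
    print(" > ".join(path))
```

Keeping relationship codes as separate records, rather than as extra branches of the hierarchy, reflects the distinction drawn above between codes that describe data and codes that link two other codes.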
All qualitative datasets were ingested into the database based on their respective coding scheme. This was initially driven by the interview coding, which informed the preliminary shape of other coding schemes, influencing the structure of the data model. This qualitative data was ingested along with the quantitative data that was based on the selected variables.

Working with mixed-methods data in our data model
Managing and integrating different datasets into a coherent analysis is a challenge for all mixed-methods research, especially when it involves interpretive coding of unstructured (micro-scale) interview transcripts with description and exploration of (macro-scale) structured survey data. Mason argues that researchers should "...view mixed methods multi-dimensionally, rather than simply in qualitative-plus-quantitative terms..." (Mason, 2006, p. 15), in order to go beyond "...mimicking and reinforcing the micro/macro distinction…" (Ibid.). She adds this should be done creatively, openly, and reflexively in order to fully explore "...what different approaches can yield in practical, epistemological and ontological terms" (2006, p. 21). To address Mason's point, we defined a data model using the principles of a computational ontology to systematically combine and interrogate data from different approaches, at differing scales, whilst remaining sensitive to the underlying methods (Crossley and Edwards, 2016).
The use of a computational ontology enabled us to integrate data coherently because of its tri-part structure. This structure, called a 'semantic triple' in information science, is composed of entities, characteristics, and relationships. Our data model incorporated concepts from the knowledge domain of film, cinema, and film audiences within all three parts. It also included the ingested quantitative variables and qualitative coding for its 'entities' and 'characteristics', and relationship codes for its 'relationships'.
To illustrate this, one interview participant (Sarah) explained that the films she chooses to watch have changed with her shift in life stage into parenthood:
"…since we had the children, we don't tend to watch really hard-hitting stuff anymore...I find it quite hard to watch things that are overly graphically violent, and particularly things that involve young children..."
The tri-part structure in this example is as follows. Sarah is an example of a person (entity, with characteristics such as gender, age, residence) who is a (relationship) parent (a PersonCategory entity). Sarah experiences (relationship) film engagement (entity, described as "challenging to watch") with violent films (a FilmCategory entity).
Sarah's person category of Parent directly influences (relationship) her film engagement. By modelling in this way, we can draw on all our data to:
1) Examine all 'challenging to watch' engagements and find out what particular film characteristics are associated with them.
2) Examine who experiences different types of film engagement to see if there are any life-stage patterns.
3) Ask questions about parenthood and film engagement in two different ways: we can examine the film engagements of parents versus non-parents, or we can examine the person characteristics relating to parenthood and see which film engagements specifically relate to parenthood.
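The Sarah example and the three kinds of question can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the project's graph database: the `Graph` class, the identifiers (`sarah`, `engagement_1`, `violent_films`) and the relationship labels (`is_a`, `experiences`, `with`, `influences`) are hypothetical names chosen for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity in the model, with free-form characteristics."""
    entity_id: str
    kind: str  # e.g. "Person", "PersonCategory", "FilmEngagement", "FilmCategory"
    characteristics: dict[str, str] = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A directed, labelled link between two entity ids."""
    source: str
    label: str
    target: str


class Graph:
    """A tiny in-memory store of entities and relationships."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.relationships: list[Relationship] = []

    def add_entity(self, entity_id: str, kind: str, **characteristics: str) -> None:
        self.entities[entity_id] = Entity(entity_id, kind, dict(characteristics))

    def relate(self, source: str, label: str, target: str) -> None:
        self.relationships.append(Relationship(source, label, target))

    def targets(self, source: str, label: str) -> set[str]:
        return {r.target for r in self.relationships
                if r.source == source and r.label == label}

    def sources(self, label: str, target: str) -> set[str]:
        return {r.source for r in self.relationships
                if r.label == label and r.target == target}


def engagements_described_as(graph: Graph, description: str) -> set[str]:
    """Ids of film engagements carrying the given description."""
    return {e.entity_id for e in graph.entities.values()
            if e.kind == "FilmEngagement"
            and e.characteristics.get("description") == description}


def film_categories_for(graph: Graph, description: str) -> set[str]:
    """Question 1: which film categories go with this kind of engagement?"""
    found: set[str] = set()
    for eng in engagements_described_as(graph, description):
        found |= graph.targets(eng, "with")
    return found


def people_experiencing(graph: Graph, description: str) -> set[str]:
    """Question 2: who experiences this kind of engagement?"""
    found: set[str] = set()
    for eng in engagements_described_as(graph, description):
        found |= graph.sources("experiences", eng)
    return found


def engagements_of(graph: Graph, person_category: str) -> set[str]:
    """Question 3: which engagements do members of a person category have?"""
    found: set[str] = set()
    for person in graph.sources("is_a", person_category):
        found |= graph.targets(person, "experiences")
    return found


# Encode the Sarah example; all ids and labels are illustrative.
g = Graph()
g.add_entity("sarah", "Person")
g.add_entity("parent", "PersonCategory")
g.add_entity("engagement_1", "FilmEngagement", description="challenging to watch")
g.add_entity("violent_films", "FilmCategory")
g.relate("sarah", "is_a", "parent")
g.relate("sarah", "experiences", "engagement_1")
g.relate("engagement_1", "with", "violent_films")
g.relate("parent", "influences", "engagement_1")

print(film_categories_for(g, "challenging to watch"))
print(people_experiencing(g, "challenging to watch"))
print(engagements_of(g, "parent"))
```

Because survey variables and interview codes would both be stored as entities, characteristics and relationships, the same three functions would apply regardless of which dataset a record came from.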
Analysed separately, each dataset provides useful insights, but by utilising a data model which uses the principles of a computational ontology we can consistently interrogate all our data, irrespective of its original format or type, and identify relationships across datasets. This enabled us to query our data for broader patterns in the way audiences form, to develop conceptualisations, while simultaneously delving into the depth, richness, and diversity of audience experiences.

Reflections on our methodology
Our approach responded to the need to sort, order and categorise different data so that it could be systematically combined and interrogated. We found the advantages of using a data model which employs the principles of a computational ontology were:
• It ensures consistency in the coding of data within and across datasets.
• It identifies relationships between data through dual coding.
• It enables broad patterns and anomalies across the data to be revealed through distant reading techniques (such as data visualisation) which can then be explored further in depth through close reading.
• It enables the cross-referencing of datasets to provide a rich, multi-layered and relational understanding of key concepts such as 'audiences' or 'genre'.
• It forms the basis of an easy to use, open access resource, enabling stakeholders and researchers to explore the data.
There are also disadvantages. Encoding a large quantity of data in line with a data model that describes an entire domain of discourse requires significant time and resource. This is because the tri-part structure imposed by a computational ontology requires data to be encoded at a fine-grained level. This is especially the case with unstructured natural language data such as interviews.
However, the value of our approach is that it enables us to develop conclusions from a broad range of data sources, conclusions which may not have been evident from separate analyses of individual data sources. The analysis is iterative, allowing us to first work with each dataset and then the data produced through the relations made visible between the datasets. For example, there are numerous ways in which we might understand the relationship between audiences and place. In the interviews, we identified specific places with distinctive and active local film cultures, each fostering a unique range of film venues, events and organisations. In the film-elicitation focus groups, we identified relationships between specific film attributes such as portrayal of landscape. Through the data model, we can draw both datasets together and compare them with audience demographics from the survey data (e.g. age, gender, education, location, films watched, and cultural attitudes). Doing all this allows us to examine how place features within film worlds and helps us develop a relational understanding of place and film audiences.
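As a rough illustration of drawing datasets together around a shared concept such as place, the Python sketch below groups records by the concept they are coded to, whatever their source. The record layout, the field names (`dataset`, `concept`, `value`) and the angle-bracket values are hypothetical placeholders, not project data or the project's schema.

```python
from collections import defaultdict

# Placeholder records: each is tagged with the data model concept it is
# coded to. Values in angle brackets stand in for real content.
records = [
    {"dataset": "interviews", "concept": "Place",
     "value": "<excerpt coded to a local film culture>"},
    {"dataset": "focus_groups", "concept": "Place",
     "value": "<discussion coded to portrayal of landscape>"},
    {"dataset": "survey", "concept": "Place",
     "value": "<respondent location variable>"},
    {"dataset": "survey", "concept": "Genre",
     "value": "<genre preference variable>"},
]


def by_concept(rows):
    """Group record values by concept, then by originating dataset."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[row["concept"]][row["dataset"]].append(row["value"])
    return {concept: dict(per_dataset) for concept, per_dataset in grouped.items()}


def datasets_covering(rows, concept):
    """List which datasets contribute evidence for a concept."""
    return sorted({row["dataset"] for row in rows if row["concept"] == concept})


print(datasets_covering(records, "Place"))
for dataset, values in by_concept(records)["Place"].items():
    print(dataset, values)
```

The point of the sketch is only that, once every record is keyed to a concept in the data model, comparing interview, focus group and survey material on place becomes a lookup rather than a manual reconciliation.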

Conclusion
In this paper, we have discussed the use of a data model using the principles of a computational ontology to help manage data from mixed-methods research. While this process requires both time and attention to data management, it allows consistency when querying a range of data. In the BtM project this helped us to develop rich, nuanced, and meaningful insights into film audiences in depth and at scale. This included how audiences accessed diverse types of film through different platforms and venues and how meaning and value are established by audiences. Adopting an approach that keeps all data in perspective allowed us to explore the relations of film worlds, including film audience experience and how audiences interpret and consume film within a specific policy and industry context.
Using this approach, we are generating a fully documented and publicly accessible data model for describing film and audiences, and a series of data visualisations and analytical tools that will be freely available for public use. We are working with FAN and the BFI to use this resource to facilitate further debates about the cultural value of a diverse film culture and the role that policy and public funding can play in enabling such diversity.