Exploring environmental concerns on digital platforms through big data: the effect of online consumers’ environmental discourse on online review ratings

Abstract By deploying big data analytical techniques to retrieve and analyze a large volume of more than 2.7 million reviews, this work sheds light on how environmental concerns expressed by tourists on digital platforms, in the guise of online reviews, influence their satisfaction with tourism and hospitality services. More specifically, we conduct a multi-platform study of Tripadvisor.com and Booking.com online reviews (ORs) pertaining to hotel services across eight leading tourism destination cities in America and Europe over the period 2017–2018. By adopting multivariate regression analyses, we show that OR ratings are positively influenced by both the presence and depth of environmental discourse on these platforms. Theoretical and managerial contributions, and implications for digital platforms, big data analytics (BDA), electronic word-of-mouth (eWOM) and environmental research within the tourism and hospitality domain are examined, with a view to capturing, empirically, the effect of environmental discourse presence and depth on customer satisfaction proxied through online ratings.


Introduction
Human consumption activities have been found to deplete natural resources and jeopardize the very existence of the planet (United Nations, 2013). Tourism is no exception. As a subset of human activities, tourism-related activities have been found to consume considerable natural resources. In 2010, it was estimated that tourism-related activities consumed (on average) 3,575 MJ of energy per trip, 6,575 liters of water per tourist, 42 m² of land per bed and 1,800 g of food per tourist/day, and generated 250 kg of CO2 emissions per trip (Gössling & Peeters, 2015). The overall situation is getting worse over time, with growth factors over the period 2010-2050 estimated to be 2.64 for energy and CO2 emissions, 1.92 for fresh water, 2.89 for land use and 2.08 for food use. Tourism activities (the tourism and hospitality infrastructure, and transit activities) have a detrimental impact on the environment (for example, in the form of CO2 emissions). An excess of tourism flow, known as over-tourism, is identified as one of the major issues associated with tourism activities and has been increasingly researched (Oklevik et al., 2019).
Digital technologies in the hospitality and tourism domains are bringing about both risks and opportunities for society. As these technologies empower consumers, they can engender further tourism over-consumption (Gössling, 2017), but they can also make business processes more environmentally friendly and help monitor consumption levels in real time (Gorgemans & Murillo-Luna, 2016). Digital technologies and user-generated content (UGC) have become "the single most important new determinant in tourism's demand and supply structures" (Gössling, 2017, p. 1025). Accordingly, and as recognized by recent studies (Mariani & Borghi, 2021a), digital technologies and digitalization have significantly affected tourism demand, changing the way consumers book and experience tourism products, and especially how consumers seek sustainable tourism products, services and experiences. However, so far, tourism and hospitality research has typically examined traditional media coverage of environment-related issues, largely disregarding the deployment of UGC-derived big data (BD) and analytics to generate insights on consumers' opinions of environmental issues. This is a major knowledge gap, originally raised in a recent study (Mariani & Borghi, 2021a), that we intend to address with the present work. Accordingly, we argue that travel-related online reviews (ORs) and UGC, and the related analytics, can be leveraged to detect, monitor and analyze travelers' and consumers' environmental perceptions and concerns, therefore moving beyond their use to capture consumers' satisfaction with hospitality and tourism services (Xiang et al., 2017). As such, ORs can proxy online consumers' environmental concerns about tourism and hospitality services and products. More specifically, we build on the established marketing concept of electronic word-of-mouth (eWOM) (Hennig-Thurau et al., 2004) and use the conceptualization of online consumers' environmental discourse, defined as "eWOM in the form of ORs directly related to online consumers' evaluations of environmental issues" (Mariani & Borghi, 2021a, p. 833), and the distinction between the presence and depth of online consumers' environmental discourse to address the following research question: "What is the effect of online consumers' environmental discourse presence and depth on online review ratings?"
To address this research question and overcome the empirical limitations of previous studies using convenience samples of customers to make sense of the social and environmental impact on customer satisfaction (Brazytė et al., 2017; D'Acunto et al., 2020; Ettinger et al., 2018; Lee et al., 2016; Yu et al., 2017), we deploy advanced (big) data analysis techniques to retrieve and analyze the entire population of Tripadvisor and Booking.com ORs covering hospitality services in eight leading cities renowned for tourism in America (Las Vegas, Miami, New York City and Orlando) and Europe (London, Paris, Rome and Barcelona) over a two-year period (2017-2018). Building on more than 2.7 million ORs pertaining to hotel services, delivered by a number of hotels that is unparalleled in environmental studies (5,572 and 5,044 hotels listed on Tripadvisor and Booking.com, respectively), this study uses multivariate regression analyses to examine the effect of online consumers' environmental discourse presence and depth on online review ratings across the two platforms: a community-based review platform (i.e. Tripadvisor) vs. a transaction-based online travel agency (i.e. Booking.com).
As such, the study contributes to a nascent research stream at the intersection of digital platforms, eWOM, big data analytics (BDA) and sustainable tourism. Here we summarize three contributions and provide the full list of seven contributions in our discussion and conclusion. First, we contribute to sustainable tourism research by addressing how the presence and depth of online consumers' environmental discourse emerge on digital platforms, thus complementing a nascent research line (Mariani & Borghi, 2021a) which has simply tracked longitudinally the presence and depth of online consumers' environmental discourse. Our study differs from previous literature, and especially the work by Mariani and Borghi (2020), in that it quantitatively measures how consumers' environmental concerns influence online consumer satisfaction with tourism and hospitality services. Second, we find that tourists seek sustainable experiences (Oklevik et al., 2019) in online settings, and voice their environmental concerns on digital platforms through ORs. Thus, ORs are not only a rich data source to gain insights into online consumers' environmental discourse, but they also allow researchers to understand, by means of BDA, how virtual customers (Nambisan & Baron, 2007; Nambisan & Nambisan, 2008) engaging with sustainability can help generate important insights about prospective customers' behaviors. Third, by focusing on online tourists' environmental concerns, this work enriches the body of eWOM literature in the sustainable tourism field that, so far, has mainly focused on broad corporate social responsibility (CSR) discourses (e.g. Ettinger et al., 2018) or green practices at the local or, at best, national level (e.g. Lee et al., 2016; Yu et al., 2017). More specifically, we enrich the research stream at the intersection between online consumers' perceptions of environmental issues through eWOM and their satisfaction with hospitality and tourism services (e.g. Brazytė et al., 2017; Lee et al., 2016; Yu et al., 2017), by finding that there is a positive relationship between environmental discourse presence/depth and online customer satisfaction.

Digital technologies, digital platforms and big data analytics in tourism
Digital technologies have been recognized as major drivers of the ongoing digital revolution (Rüßmann et al., 2015) and digital platforms as the means through which novel business models are developed (UNCTAD, 2019) and value is created in the platform economy (Kenney & Zysman, 2016; Mariani and Nambisan, 2021; Mariani et al., 2021). Big data and analytics have been recognized as one of the technological drivers of the digital revolution and transformation of business. Beyond representing a technological paradigm per se, BD has been demarcated and conceptualized as a vast amount of data, whether structured or unstructured, which is produced at high speed because of technological advancements and the growth and diffusion of automation, the internet and connected devices (Mariani et al., 2018).
Over the last decade an increasing number of scientists, scholars and practitioners have been relying on BD and analytics to uncover patterns in data which can translate into competitive business intelligence (Davenport, 2014) as well as knowledge (Erevelles et al., 2016). Scholars and practitioners working in the travel and tourism domains are also deploying BD and analytics (Li et al., 2018; Mariani et al., 2018). BD and analytics are generated from various sources encompassing transactions, devices and UGC. Transaction data include online booking data, consumer cards data and web search data; more specifically, web search data was deployed to forecast tourism demand (Pan et al., 2011; Peng et al., 2017). Device data encompass GPS, mobile roaming, Bluetooth and radio-frequency identification (RFID); more specifically, travelers' location data have generated useful insights about travel patterns and purpose (Gong et al., 2016), transportation modes (Kasahara et al., 2015) and points of tourism interest. Yet, UGC is certainly the most common source of data; for instance, ORs have been widely used to capture online customer satisfaction and consumers' evaluations of travel experiences, hospitality services (Guo et al., 2017; Xiang et al., 2017) and tourism destinations (Mariani et al., 2018). Geo-tagged online content, such as photos and social media posts, has been able to reveal the geographic distribution of travelers and residents, as well as tourism attractions with the most positive destination image (e.g. Marine-Roig & Clavé, 2015; Paldino et al., 2015; Salvatella, 2014).
User-generated content, in the form of social media content, posts and online reviews, is increasingly relevant not only to understand more about consumers' opinions and preferences about travel, tourism and hospitality products, but also to generate business intelligence to improve products, services and attractions (Fang et al., 2016; Mariani and Borghi, 2021; Mariani et al., 2016; Mariani et al., 2018). The most popular form of BD from UGC is ORs, and the importance and function of these are examined in the next section.

Electronic word-of-mouth and online reviews in tourism and hospitality
The development and growth of the internet, as well as digital platforms and social media, have led to a widespread proliferation of UGC in the form of ORs. Online reviews enable consumers to develop, articulate and share their opinions about goods, services, firms and brands in online contexts (Hennig-Thurau et al., 2004). In marketing, ORs constitute an important part of eWOM, subsequently termed online word-of-mouth (King et al., 2014).
Researchers in disciplines such as marketing, computer science and information management have analyzed both drivers and outcomes of eWOM (Hennig-Thurau et al., 2004; Rosario et al., 2016), and eWOM is much more dominant than simple WOM due to its rapidity of diffusion, potential anonymity, one-to-many and many-to-many reach, convenience, lack of face-to-face interaction, and communication effectiveness (Sun et al., 2006).
In the travel and tourism domains, eWOM is endlessly engendered through ORs produced by consumers on digital platforms, including online travel agencies (OTAs) such as Booking.com and Expedia.com, and online travel review websites such as Tripadvisor and Ctrip. Online reviews have been found to influence consumers' reservation intentions, sales (Ye et al., 2009) and firms' performance (Mariani and Borghi, 2020; Yang et al., 2018). Tourists and travelers deploy eWOM in the guise of ORs before purchase. For instance, it has been found that Tripadvisor ORs are commonly adopted by consumers to inform their accommodation decisions (Gretzel & Yoo, 2008) and, more generally, that some of the features of ORs (such as ratings and volume) have a positive impact on commercial and financial performance (Yang et al., 2018).

Environmental concerns in travel, tourism and hospitality
Environmental awareness and perceptions in tourists' and travelers' electronic word-of-mouth
Over the last four decades an increasing number of firms, NGOs and national and international governments have adopted environmental schemes and practices (Pedersen & Neergaard, 2006) in response to the consolidation of an ecological position within the mainstream academic community recognizing that human consumption activity is a primary cause of natural resources depletion (United Nations, 2013) that might jeopardize not only firm growth and sustainability (Camilleri, 2018; Ek Styvén and Mariani, 2020; Salimath & Chandna, 2021), but also the very existence of the planet.
Several policy makers, industry and professional associations, and other stakeholders (Clarkson, 1995) active in the tourism domain realize that the consumption of natural resources is out of control, with negative repercussions on the planet (Di Pietro et al., 2013; Hardy et al., 2002). Accordingly, at the macro (national) level (Intergovernmental Panel on Climate Change, 2013), at the meso (organizational) level (Ettinger et al., 2018) and at the micro (individual) level (Liu et al., 2013), organizations and individuals are environmentally concerned. More specifically, policy makers are trying to put pressure on businesses to embrace business practices with a low environmental impact (Gössling et al., 2016).
Firms are not only embracing more sustainable business processes and practices (Bonilla Priego et al., 2011) but increasingly showcasing what they do for the environment through CSR reporting (Guix et al., 2018; Thongplew et al., 2017; Watts, 2015) and by offering green products (Kemper et al., 2019). Indeed, it seems that industry discourse on environmental issues has, at least formally, been shifting towards a more serious commitment (Jones et al., 2014) than a decade ago, when companies tended to downplay their environmental footprint (Gössling & Peeters, 2007). This is becoming increasingly relevant for both large hotel chains and SMEs in hospitality (Dief & Font, 2010). For instance, in addition to identifying four strategic priorities for its Planet 21 program (encompassing employees, customers, partners and local communities), the Accor hotel group has developed a strategic framework, including sustainable food and sustainable buildings. Marriott developed its "Make a Green Choice" initiative, allowing guests to give up housekeeping services (e.g. the use of cleaning products) and exchange them for loyalty points or food and beverage vouchers. Independent hotels are also becoming more active from an environmental standpoint. For example, independent hotels affiliated with the Greater Miami and the Beaches Hotel Association have launched sustainable practices aimed at reducing water consumption and water-wasting products/processes (Raub & Martin-Rios, 2019).
Consumers are heterogeneous in the way they perceive and deal with environmental issues at a global level. Some are not entirely aware of environmental ecological issues or the relevance of environmental and sustainability practices and certifications (Di Pietro et al., 2013; Guizzardi et al., 2017); others are particularly sensitive to environmentally friendly tourism products and services (Dolnicar et al., 2008). Several studies in the tourism and hospitality domain (Di Pietro et al., 2013; Gustin & Weaver, 1996; Kyung et al., 2012; Masau & Prideaux, 2003) indicate that environmental concerns influence consumers' purchase intentions. Some tourists are willing to pay a premium for green products and to meet sustainability requirements (Masau & Prideaux, 2003), whereas others think that the cost of sustainability initiatives should be incurred entirely by product/service providers (Gustin & Weaver, 1996). For instance, in a survey of U.S. hotel customers, Gustin and Weaver (1996) detected a positive relationship between customers' attitudes towards U.S. hotels' pro-environmental practices and purchase intentions. By deploying the New Ecological Paradigm (NEP) scale (Dunlap et al., 2000) on a sample of 9,000 tourists (the final respondents retained being 455) in Arizona, Florida and Texas, Kyung et al. (2012) find that the degree of environmental concern positively influences the disposition to pay a premium for a hotel's green initiatives, consistent with previous findings (Dutta et al., 2008; Gustin & Weaver, 1996). In their study of restaurant customers in the southeastern United States, Di Pietro et al. (2013) find that customers perceived that they were sufficiently knowledgeable about green practices, but they wanted to gain more knowledge about them. The respondents mentioned that they were more prone to choose restaurants that are environmentally friendly and use environmentally safe products. Environmentally aware customers were characterized as having higher levels of education and adopting green practices at home. Interestingly, even among consumers sensitive to environmentally friendly tourism products and services, only a tiny fraction are willing to pay for environmentally friendly products and services (Kyung et al., 2012).
Overall, consumer research in tourism and hospitality has yielded mixed findings; several studies have emphasized that consumers are not aware of, or knowledgeable about, environmental issues and practices (e.g. Guizzardi et al., 2017; Prayag et al., 2022), whereas other investigations have pointed out that tourists and travelers are sensitive to environmentally friendly tourism products and services (e.g. Dolnicar et al., 2008; Kyung et al., 2012). This research puzzle seems to be the result of a confluence of several factors: (1) most of the studies carried out so far have focused on stated intentions and stated behaviors, that is, on consumers' perceptions rather than actual behaviors (Pedersen & Neergaard, 2006); (2) the majority of the research produced has examined perceptions before the service consumption (e.g. Di Pietro et al., 2013); (3) when ORs have been used as a source to document hotel guests' comments on service providers' CSR and environmental practices, convenience samples of customers have been used, adopting single case studies focusing on a few companies or destination cities within a specific continent (Brazytė et al., 2017; D'Acunto et al., 2020; Ettinger et al., 2018; Lee et al., 2016; Yu et al., 2017). For instance, Lee et al. (2016) analyze the top 10 green hotels in the U.S.A. on Tripadvisor and find that the majority of hotel guests respond in a positive way to green practices (such as saving energy and water) when they recognize them. Nonetheless, guests can express negative opinions when they are not made aware of green practices, and this happens because not all of the 10 hotels effectively inform prospective customers of green practices on Tripadvisor.

Online environmental discourse of tourists and travelers, and online review ratings
Research on UGC and, more specifically, ORs has examined the relationship between several characteristics and features of online reviews (and online reviewers) and online review scores as a proxy for online customer satisfaction with, and evaluation of, travel, tourism and hospitality services (e.g. Guo et al., 2017; Li et al., 2013; Radojevic et al., 2015; Xiang et al., 2017). It has subsequently looked at the capability of these scores to influence tourism and hospitality firm performance more than traditional (offline) customer satisfaction (Kim & Park, 2017).
So far, OR ratings have been mostly used to capture, assess and understand online consumer satisfaction with a specific travel, tourism and hospitality service or its constituent attributes (e.g. Li et al., 2013; Radojevic et al., 2015). Very few studies have tried to relate the wider CSR construct (including both the social and environmental dimensions) to online ratings (Brazytė et al., 2017; D'Acunto et al., 2020; Peiró-Signes et al., 2014). Leveraging OR ratings related to Spanish hotels, Peiró-Signes et al. (2014) find that four-star hotels endowed with an ISO 14001 environmental certification display higher ratings than those not endowed with such a certification. The difference is not significant in three- and five-star hotels, thus leading the authors to conclude that only four-star hotels are able to secure a distinctive competitive advantage from environmental certification.
By analyzing a small sample of 727 green ORs (i.e. ORs mentioning green experiences) covering the top 10 green hotels on Tripadvisor, Yu et al. (2017) mainly conduct descriptive and limited regression analyses and find that consumers document both positive and negative experiences at green hotels, and different types of green practices affect online customer satisfaction differently. More specifically, "energy" and "education and innovation" positively affect customer satisfaction, while "guest training," "energy," "water" and "purchasing" negatively influence customer satisfaction.
Building on a relatively small sample of fewer than 2,500 ORs of 30 Costa Rican hotels endowed with a sustainable tourism certification, Brazytė et al. (2017) observe that fewer than a third implicitly mention sustainability-related indicators. They also find that ORs explicitly mentioning sustainability indicators display higher ratings (average rating = 4.75) than reviews where sustainability indicators are not mentioned or only implicitly mentioned (average rating = 4.39). The results seem to suggest that customers who are aware of the sustainability actions taken by hotels are comparatively more satisfied than those whose reviews did not mention, or only implicitly mentioned, sustainability issues.
Based on content analysis of about 1,380 ORs related to 47 Austrian hotels emphasizing CSR aspects, Ettinger et al. (2018) find that the bulk of ORs (more than 90%) display a neutral/positive characterization. Yet, the authors do not examine what a neutral/positive characterization generates in terms of actual ratings.
Overall, research on the relationship between environmental eWOM and OR ratings is rather fragmented and limited. Indeed, it has been studied either in a specific country (e.g. Yu et al., 2017) or in a limited set of cities within the same continent, using a limited number of reviews compared to those available. Furthermore, as far as analytical techniques are concerned, the analyses conducted are extremely rudimentary (mostly relying on simple correlations, such as D'Acunto et al., 2020; Peiró-Signes et al., 2014) and when a model was fitted (Yu et al., 2017), it included a very small sample of observations.

Data and sample
Online review data was collected from two distinctively different platforms: Booking.com and Tripadvisor. They are good exemplars of a transaction-based and a community-based digital platform, respectively (Gligorijevic, 2016). The former was chosen because it hosts the highest share of certified hotel reviews worldwide (Revinate, 2018), while the latter was selected because it is the major online travel review website in the world. Data was collected through two scrapers developed in the Python programming language (by leveraging the libraries Selenium and Beautiful Soup) at the beginning of 2019. First, we sampled the top 10 city tourism destinations in terms of international tourist arrivals in both America and Europe (Geerts, 2018), and later focused on four city destinations in each: Barcelona, London, Paris and Rome (Europe); Las Vegas, Miami, New York City and Orlando (North America). We used the crawlers to retrieve the full list of reviewed hotels across our reference platforms.
Second, we collected the entire population of ORs covering the hotels located in these destinations across both platforms (i.e. Booking.com and Tripadvisor), over two years (2017-2018). Third, in line with other studies adopting text analytics (e.g. Xiang et al., 2017; Zhao et al., 2019), we kept only ORs written in English in our final database. We performed this task adopting the language detection package (langdetect) available in Python, which allows the detection of the language of a review through a lexicon-based analysis (Xiang et al., 2017). Overall, 2,702,227 ORs were retrieved: 1,144,461 from Tripadvisor and 1,557,766 from Booking.com. Therefore, the data builds on a very large number of reviews extracted from different types of OR platforms (community-based vs. transaction-based OR platform) and pertaining to hospitality services across multiple countries and continents.
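The English-only filtering step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name keep_english and the injectable detect argument are hypothetical, and the default path assumes the third-party langdetect package is installed and uses its detect function.

```python
from typing import Callable, Iterable, List, Optional


def keep_english(reviews: Iterable[str],
                 detect: Optional[Callable[[str], str]] = None) -> List[str]:
    """Return only the reviews whose detected language code is 'en'.

    `detect` is any callable mapping a text to a language code. When it is
    omitted, langdetect.detect is used (assumption: langdetect is installed).
    """
    if detect is None:
        from langdetect import detect as langdetect_detect
        detect = langdetect_detect
    kept = []
    for text in reviews:
        try:
            if detect(text) == "en":
                kept.append(text)
        except Exception:
            # Detection can fail on empty or non-linguistic text; such
            # reviews are skipped rather than guessed.
            continue
    return kept
```

Passing the detector as an argument keeps the filter easy to check on a handful of reviews with a stand-in detector before it is run on the full corpus.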

Techniques adopted
To address our research question, we deployed model specifications with variables that are illustrated in the following section and, more specifically, Tobit regression analysis (Tobin, 1958) for the Booking.com ORs (as the dependent variable of rating is close to continuous but is left and right censored) and ordinal logistic regression analysis (Greene, 1999) for the Tripadvisor ORs (as the dependent variable is ordinal and can assume only five categorical values).

Variables
The key variables used in this study are illustrated and described in Table 1. The two focal variables, Environmental Presence and Environmental Depth, are operationalized in line with extant literature (Mariani & Borghi, 2021a). Following the lead of Gao et al. (2018), the Observed Average Rating is defined as the average hotel's OR rating, as observed by the reviewer at the time when they posted their review. Reviewer Experience is a proxy of the experience in online reviewing in the focal OR platform and it is measured as the overall number of ORs written by the reviewer in the focal platform (i.e. either Tripadvisor or Booking.com). Reviewer Image is a dummy variable that equals 1 if the reviewer used a personalized image for their profile on the platform (Forman et al., 2008). Country of Origin Disclosure is a dummy variable that is equal to 1 if the reviewer did not disclose their country of origin (see Filieri et al., 2019).
In terms of text analytics, we focused on Review Length and Review Polarity. The former consists of the number of words in the OR (e.g. Chevalier & Mayzlin, 2006; Zhang et al., 2016). The latter, sometimes termed sentiment score, is a continuous variable ranging from −1 to +1 and was computed using the Valence Aware Dictionary for sEntiment Reasoning (VADER), based on a dictionary and heuristics (Hutto & Gilbert, 2014). The measure was preferred over alternative measures as recent research has found that it outperforms other measures in the tourism domain (Alaei et al., 2019).
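To make the Observed Average Rating concrete, the sketch below computes, for each review, the mean of the ratings posted earlier for the same hotel. It assumes that the value shown to the reviewer can be approximated from the review history. The text does not say how the platforms compute the displayed average, so the function names, the (hotel, timestamp, rating) tuple layout and the round-half-up rule for half-star rounding are all illustrative assumptions.

```python
import math
from collections import defaultdict
from typing import List, Optional, Sequence, Tuple


def round_to_half(value: float) -> float:
    """Round to the closest half point (ties go up), e.g. 4.3 -> 4.5."""
    return math.floor(value * 2 + 0.5) / 2


def observed_average_ratings(
        reviews: Sequence[Tuple[str, float, float]]) -> List[Optional[float]]:
    """For each (hotel_id, timestamp, rating) review, return the average of
    the ratings posted earlier for the same hotel, in the input order.
    The first review of a hotel has nothing to observe, hence None."""
    order = sorted(range(len(reviews)), key=lambda k: reviews[k][1])
    totals = defaultdict(float)
    counts = defaultdict(int)
    observed: List[Optional[float]] = [None] * len(reviews)
    for k in order:
        hotel, _, rating = reviews[k]
        if counts[hotel]:
            observed[k] = totals[hotel] / counts[hotel]
        totals[hotel] += rating
        counts[hotel] += 1
    return observed
```

For a hotel with earlier ratings of 4 and 5, the next reviewer would observe an average of 4.5 under this approximation.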
Several control variables were used, including Type of Trip, Type of Group, Submission Device, Destination City, Year, Chain and Star Rating. Type of Trip is a categorical variable that describes the trip purpose: leisure or business. Type of Group is a categorical variable that indicates if the travelers were a couple, solo, family or group. The Submission Device is a dummy variable that describes if the OR was written using a mobile device or a different device connected to the internet (e.g. desktop computer). Destination City is a categorical variable that indicates the city where the reviewed hotel is located. Year represents the year when the review was written. Chain is a dummy variable that indicates whether the hotel belongs to a hotel chain or not. Star Rating is a categorical variable that describes the hotel class category adopted to classify hotels according to their quality (from one to five stars). Overall, the controls have been used in extant tourism and hospitality literature trying to identify and explain the determinants of online review ratings (e.g. Gao et al., 2018; Mariani et al., 2019a, 2019b). Online Review Valence has been used as a dependent variable to address our research question and it represents the OR rating posted by an online reviewer to summarize with a number their satisfaction with the hospitality service.

Table 1. Key variables and their descriptions.

Environmental Depth
It is a ratio equal to the number of environment-related words (words present in the Pencle & Mălăescu (2016) environmental dictionary) over the total number of words in the review, multiplied by 100. In other words, it captures the percentage of environmental elements in the online review (see Mariani and Borghi, 2020).

Observed Average Rating (Observed Avg Rating)
Hotels' review average rating as observed by the reviewing guest at the time when they posted their review (see Gao et al., 2018).

Reviewer Experience
It is the overall number of online reviews written by the reviewer in the platform.

Reviewer Image
It is a dummy variable that is equal to 1 if the reviewer used a personalized image for their social profile in the platform, and zero otherwise (Forman et al., 2008).

Country of Origin Disclosure (Country Disclosure)
It is a dummy variable that is equal to 1 if the reviewers did not disclose their country of origin, and zero otherwise (see Filieri et al., 2019).

Review Length
It represents the number of words included in each online review (Chevalier & Mayzlin, 2006; Zhang et al., 2016).

Review Polarity
The polarity, also known as sentiment score, was operationalized using a continuous variable ranging from −1 to +1, respectively equating to extremely negative and extremely positive content and emotions. To create this measure, we used the Valence Aware Dictionary for sEntiment Reasoning (VADER), which exploits a set of heuristics along with a specific lexicon dictionary for this particular task (Hutto & Gilbert, 2014).

Star Rating
It is a categorical variable that describes the hotel class category adopted to classify hotels according to their quality (from 1 to 5 stars). It has been retrieved directly from the page of each hotel as displayed in the online review platform (Booking.com or TripAdvisor).

Review Valence [a]
Online review rating posted by online reviewers to summarize with a number their satisfaction with the hospitality service.

[a] Since TripAdvisor uses a customized rating system including half-star rating, in order to obtain the final value we rounded the retrieved value to the closest half-star rating (Gao et al., 2018).
Finally, environmental depth, reviewer experience and review length were log transformed, given the skewness of the variables' distribution (Filieri et al., 2019). Tables 2.a and 2.b illustrate the descriptive statistics of the variables under consideration for the period 2017-2018.
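As an illustration of how the focal text-based variables could be derived from a single review, the sketch below computes a presence dummy, the depth percentage and the review length, followed by a log transform. Several elements are assumptions rather than the study's implementation: the word list is a hypothetical toy set (not the Pencle & Mălăescu (2016) dictionary), presence is taken to mean "at least one dictionary word occurs", the tokenizer is a simple regular expression, and ln(1 + x) is used so that zero values remain defined.

```python
import math
import re
from typing import Set, Tuple

# Hypothetical toy word list for illustration only; it is NOT the
# Pencle & Malaescu (2016) environmental dictionary used in the study.
TOY_ENV_TERMS: Set[str] = {
    "recycling", "eco", "sustainable", "environment", "green", "energy",
}


def tokenize(text: str) -> list:
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def environmental_measures(text: str,
                           terms: Set[str] = TOY_ENV_TERMS
                           ) -> Tuple[int, float, int]:
    """Return (presence, depth, review_length) for one review.

    presence: 1 if at least one dictionary word occurs (assumed definition).
    depth: environment-related words over total words, multiplied by 100.
    review_length: total number of words.
    """
    words = tokenize(text)
    hits = sum(1 for word in words if word in terms)
    presence = 1 if hits > 0 else 0
    depth = 100.0 * hits / len(words) if words else 0.0
    return presence, depth, len(words)


def log_transform(value: float) -> float:
    """Assumed transform ln(1 + x), which keeps zero values defined."""
    return math.log1p(value)
```

With the toy list, a ten-word review containing three listed words has a depth of 30, meaning 30% of its words are environment-related.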

Model specification
When choosing the econometric model to deploy, we considered the nature of the dependent variable analyzed in the study, namely the overall online rating.Thus, on the one hand, when analyzing Booking.comORs, we adopted a Tobit multivariate regression as the online rating is continuous but both left and right censored, with the minimum and maximum variables being 2.5 and 10.0, respectively (Mariani & Borghi, 2018).On the other hand, for Tripadvisor ORs we used an ordered logit regression model as the online rating in the platform is a categorical variable assuming only five values (see Yu et al., 2017;Zhang et al., 2016).Therefore, based on the samples indicated in the research design, we developed two model specifications.More formally, denoting with Valence Ã ij the underlying latent variable representing the latent overall rating provided by reviewer i for hotel j, we estimated the following econometric specifications: where e ij refers to the error term at the single review level and, b i and h i indicate the regression coefficients and vector of coefficients, respectively.From the latent variable Valence Ã ij , in the case of the ordered logit regression model, the observed review rating Valence ij , is obtained through the following rule: where r relates to the number of alternatives embedded in the model, in our case R ¼ 5 since Tripadvisor allows to rate a hotel on a five-point Likert scale.Besides, we assume k 0 ¼ À1 and k R ¼ 1, whereas k 1 to k RÀ1 designate the set of cut-off thresholds, estimated by the model, that are used to determine the correct interval of materialization of the predicted latent variable (Cameron & Trivedi, 2005).
Regarding the Tobit model, the observed Booking.com rating for an online review is specified as: $\mathit{Valence}_{ij} = y_L$ if $\mathit{Valence}^{*}_{ij} \le y_L$; $\mathit{Valence}_{ij} = \mathit{Valence}^{*}_{ij}$ if $y_L < \mathit{Valence}^{*}_{ij} < y_U$; and $\mathit{Valence}_{ij} = y_U$ if $\mathit{Valence}^{*}_{ij} \ge y_U$, where $y_L$ and $y_U$ represent the lower (2.5) and upper (10) limits of the distribution of the observed dependent variable, respectively. As is clear from both models, the dependent variable is online review valence (namely the OR rating), which was regressed against the focal independent variables (Environmental Discourse Presence in model 1 and Environmental Discourse Depth in model 2) and a series of other explanatory variables, including observed average rating, reviewer experience, reviewer image, country disclosure, text analytics (namely Review Length and Review Polarity) and control variables such as type of trip, type of group, submission device, destination city, year, chain and star rating.
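The two-limit censoring described for the Tobit model can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: the limits 2.5 and 10 come from the text, whereas the function names, the normally distributed error term (the usual Tobit assumption) and the example values are ours.

```python
import math

LOWER, UPPER = 2.5, 10.0  # censoring limits of Booking.com ratings (from the text)


def observed_booking_rating(latent, lower=LOWER, upper=UPPER):
    """Censor the latent valence at the lower and upper limits."""
    return min(max(latent, lower), upper)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def tobit_loglik(y, linear_index, sigma, lower=LOWER, upper=UPPER):
    """Log-likelihood contribution of one review under a two-limit Tobit model."""
    if y <= lower:   # left-censored: P(latent <= lower)
        return math.log(normal_cdf((lower - linear_index) / sigma))
    if y >= upper:   # right-censored: P(latent >= upper)
        return math.log(normal_cdf((linear_index - upper) / sigma))
    # uncensored: density of the latent variable at y
    return math.log(normal_pdf((y - linear_index) / sigma) / sigma)


print(observed_booking_rating(11.3))  # latent value above the upper limit
print(tobit_loglik(8.0, linear_index=8.0, sigma=1.0))
```

Reviews observed at either limit contribute a probability mass rather than a density, which is why ordinary least squares would be inappropriate for such a rating scale.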

Key findings
The results of the regression analyses capturing the influence of environmental discourse presence and environmental discourse depth on OR ratings are illustrated in Table 3. Online consumers' environmental discourse presence appears to influence OR ratings positively, both on Booking.com (p < 0.001) and on Tripadvisor (p < 0.001) (see, respectively, models 1.B and 1.T in Table 3). This finding extends extant small-sample research that found that environmental certifications or wider CSR practices positively influence OR ratings (e.g. Brazytė et al., 2017; Peiró-Signes et al., 2014).
Interestingly, online consumers' environmental discourse depth also appears to influence OR ratings positively, both on Booking.com (p < 0.001) and on Tripadvisor (p < 0.001) (see models 2.B and 2.T in Table 3). This finding is rather novel when benchmarked against extant literature in the tourism and hospitality field, and suggests that the comprehensiveness of the environment-related discourse within ORs can positively and significantly influence online review ratings.
Notes (Table 3): * p < 0.05; ** p < 0.01; *** p < 0.001.
The analysis of the focal independent variables (Environmental Presence and Depth) suggests that both the presence (proxying environmental awareness) and the depth (proxying in-depth digging about environmental aspects) of environmental-related aspects in ORs make a difference for online ratings and, ultimately, for customers' online satisfaction.The effects measured are robust regardless of the digital platform considered (community vs. transaction-based platform), geographic setting (continent or country where the destination is located), type of trip (leisure or business), type of traveling group (couple, solo, family and group), submission device (mobile vs. desktop), time (specific year), or firm level characteristics (such as belonging to a chain or to a specific category).
In relation to the reviewer-level control variables, reviewers' experience in online reviewing has a negative and significant effect (p < 0.001) in all four models, consistent with previous research (e.g. Mariani et al., 2019b). Given that experience in online reviewing is, to a certain extent, connected to travel experience (Gao et al., 2018), this result is consistent with extant research showing that experience has a negative impact on reviewers' online ratings.
A reviewer's disclosure of their image positively influences online ratings (p < 0.001) in all four models, in line with extant research conducted in tourism and hospitality settings (Gao et al., 2018). Disclosure of the reviewer's country of origin also exerts a positive impact on online ratings (p < 0.001) in each of the four models, consistent with the existing body of tourism management and marketing literature (Filieri et al., 2019).
As far as the text analytics are concerned, Review Length negatively influences (p < 0.001) OR ratings in all four models. This is in line with extant literature (Chevalier & Mayzlin, 2006), as customers tend to put more effort into writing, and write longer reviews, when they are dissatisfied with a product or service. Review Polarity positively affects (p < 0.001) OR ratings in all four models, again in line with extant literature (Geetha et al., 2017), as customers evaluate their consumption experience more positively when they are in a positive emotional state (Isen, 1987).

Discussion and conclusion
Building on more than 2.7 million ORs collected from Booking.com and Tripadvisor.com (Li et al., 2018; Mariani et al., 2018), and examined with big data analytical techniques, this work has described how consumers perceive and deal with sustainability issues, and whether consumers' environmental discourse affects their OR ratings. By adopting multivariate regression analyses, we have shown that OR ratings are positively influenced by both environmental discourse presence and depth on both of the platforms analyzed. Overall, our findings shed light on the presence of consumers' environmental concerns embedded in tourists' UGC, and on their impact on customer satisfaction, across several leading tourism destinations based in two different continents.

Theoretical contributions
This study contributes to digital platforms, BDA, eWOM and environmental research within the hospitality and tourism domain in multiple ways. First, to the best of our knowledge, this is one of the first studies in sustainable tourism research addressing how the presence and depth of online consumers' environmental discourse emerge on digital platforms, thus complementing a nascent research line (Mariani & Borghi, 2021a) which simply tracked longitudinally the presence and depth of online consumers' environmental discourse. Unlike previous studies that have used social media (predominantly Twitter) to track online customers' opinions (e.g. Chisholm & O'Sullivan, 2017; Reyes-Menendez et al., 2018), this is a cross-platform study, which supports the robustness and generalizability of the findings. Moreover, while previous studies (e.g. Mariani & Borghi, 2021a) measured the presence and depth of the environmental discourse, the present work measures quantitatively the influence of the presence and depth of the environmental discourse on online review ratings, which constitute a proxy of customer satisfaction. Second, our findings seem to broadly suggest that tourists are interested in (and engage with) environmental issues and have environmental concerns, voicing them, either explicitly or implicitly, through their ORs; this finding resonates with recent literature emphasizing that travelers seek sustainable experiences (Oklevik et al., 2019). This extends recent studies (Mariani & Borghi, 2021a) that have looked at how digital technologies and digitalization can help monitor online consumers' environmental discourse, by suggesting that digital technologies (and especially BDA as a pillar of the digital transformation) can help consumers and tourism researchers make sense of online reviewers' evaluation of sustainable tourism products, services and experiences. Third, we indicate that in today's digital world, consumers' concerns can be captured effectively by relying on digital platforms and ecosystems (Nambisan, 2017), which offer a rich data source in the guise of consumer reviews. This seems to support earlier conceptualizations of virtual customers (Nambisan & Baron, 2007; Nambisan & Nambisan, 2008) that have played a critical role in the development of digital entrepreneurship literature (Nambisan, 2017). Fourth, by focusing on online tourists' environmental concerns, this work enriches the body of eWOM literature in the sustainable tourism field that, so far, has mainly focused on broad CSR discourses (e.g. Ettinger et al., 2018) or green practices at the local or, at best, national level (e.g. Lee et al., 2016; Yu et al., 2017). Building on the received distinction between the presence and the depth of online consumers' environmental discourse (Mariani & Borghi, 2021a), we test a model and offer empirical substantiation of the constructs/variables developed in the recent sustainable tourism literature. Fifth, we contribute to consumer behavior literature (e.g. Straughan & Roberts, 1999; Thøgersen et al., 2010) by suggesting that environmental issues do influence consumer perceptions of tourism and hospitality services. Sixth, we contribute to literature on big data and analytics in tourism (Li et al., 2018; Mariani et al., 2018), by methodologically developing and deploying text analytics from UGC sourced across different digital platforms. Seventh, we enrich the emerging debate on consumers' perception of CSR initiatives in the tourism and hospitality industry (Ettinger et al., 2018), especially the research stream at the intersection between online consumers' perceptions of environmental issues through eWOM and their satisfaction with hospitality and tourism services (e.g. Brazytė et al., 2017; Lee et al., 2016; Yu et al., 2017), by shedding new light on the relationship between environmental discourse presence and online customer satisfaction, proxied by means of OR ratings. In addition to augmenting the power and generalizability of previous findings on the relationship between green practices and OR hotel ratings (by adopting the largest sample of ORs considered so far in the research stream), we go further and, based on well-specified multivariate regression models, also evaluate the relationship between the depth of the environmental discourse embedded in ORs and online consumer review ratings. Text analytics suggest that both the presence and depth of online consumers' environmental discourse positively and consistently affect OR ratings.

Practical implications
Several practical implications stem from this work, including implications for tourism and hospitality practitioners, and for digital platform managers and developers.
As far as tourism and hospitality practitioners are concerned, they should recognize that travelers give better evaluations to services when environmental aspects are mentioned, as is clear from the positive and significant effect of environmental discourse presence and depth on online review ratings across both of the platforms analyzed. This should push hotel managers to: 1) invest in and develop green initiatives, programs and practices, with the awareness that only a tiny portion of consumers will pay for eco-friendly products and services (Kyung et al., 2012); 2) look for environmental certification, as this has been found to generate benefits in specific conditions (Peiró-Signes et al., 2014); 3) develop a clear marketing communication strategy and ad hoc communication tactics that encompass clear messages aimed at clarifying and detailing prospective customers' tangible benefits pertaining to the green practices adopted (Dolnicar et al., 2017), beyond explaining the societal benefits of those green practices; 4) possibly include education in the repertory of their green practices, as environmental education has been found to play a crucial role in affecting online ratings (Yu et al., 2017); 5) emphasize their environmental commitment when responding to (positive) ORs that have mentioned environmental aspects, as this might eventually translate into higher purchase intentions of prospective customers interested in environmentally friendly experiences; 6) record preferences towards specific green practices (e.g. towel reuse) in the company database to facilitate instantaneous knowledge about the customer and their pro-environmental preferences.
Platform developers and managers that deal with OTAs and online community travel review platforms would benefit from the findings of this study, as they host ORs which relate to environmental issues. First, as environmental-related ORs display different features than non-environmental-related ORs (as is clear from previous research), platform managers might develop and/or use environmental dictionaries, allowing them to segment both ORs and the online reviewers that wrote them, as this might affect online review ratings. Second, given that the presence of environmental-related ORs seems to have a positive impact on online ratings, platform developers can develop an attribute named "green friendly" for the hospitality service, as other research conducted in offline contexts has shown that the presence of a "green" attribute enhances consumers' involvement in the purchase process (Straughan & Roberts, 1999; Thøgersen et al., 2012). This might assist both hotel managers and customers in designing and assessing green processes and practices, respectively. Eventually, this might increase consumer bookings at hotels that explicitly cater to consumers with ecological sensitivities.
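A dictionary-based segmentation of the kind suggested for platform managers could be sketched as follows. This is a minimal illustration under explicit assumptions: the term list is hypothetical (it is not the dictionary used in this study), presence is coded as a dummy for at least one match, and depth is assumed to be the count of matching terms in the review.

```python
import re

# Hypothetical environmental dictionary; a real one would be far larger.
ENV_TERMS = {
    "environment", "environmental", "eco-friendly",
    "recycling", "sustainable", "sustainability",
}

# Lower-case words, allowing internal hyphens (e.g. "eco-friendly").
TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def environmental_features(review_text):
    """Return presence (0/1) and depth (count of dictionary terms) for a review."""
    tokens = TOKEN.findall(review_text.lower())
    depth = sum(1 for token in tokens if token in ENV_TERMS)
    return {"presence": int(depth > 0), "depth": depth}


reviews = [
    "Eco-friendly hotel with great recycling; truly sustainable.",
    "Nice pool and breakfast.",
]
for text in reviews:
    print(environmental_features(text), text)
```

Reviews with presence equal to 1 could then be grouped into an environmental segment, while depth allows a finer ranking within that segment.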

Conclusion and limitations
This work contributes insights into online consumers' perceptions of environmental concerns and sheds light on the extent to which consumers' environmental sensitivity, as expressed on digital platforms, can influence customer satisfaction. In particular, we measure empirically the effect of online consumers' environmental discourse on online review ratings as a proxy of customer satisfaction. The objective is pursued by leveraging large volumes (big data) of ORs sourced in relation to hospitality services across different firms, destinations, continents and countries. Accordingly, this paper contributes to the area at the intersection between digital platforms, BDA, eWOM and environmental research within the hospitality and tourism domain, by examining the relationship between the environmental features of ORs and OR ratings (Peiró-Signes et al., 2014; Yu et al., 2017).
Leveraging more than 2.7 million ORs collected and examined by means of BDA techniques, we detected that both the presence and depth of online consumers' environmental discourse have a positive impact on OR ratings. Interestingly, OR ratings are positively influenced by both the presence and the depth of environmental discourse, regardless of the type of platform analyzed.
This study is not without limitations. First, while we have used several controls, further variables (such as demographics) could be included in the analysis. However, on both Booking.com and Tripadvisor most of the reviewers' profiles lack demographic information and, when present, the demographic information is not necessarily reliable. This is the reason why demographics are not used in other popular and recent studies using online review analytics (e.g. Xiang et al., 2017; Zhao et al., 2019). Second, while this work represents the first attempt to measure the impact of the presence and depth of online consumers' environmental discourse on online review ratings across different destinations and continents, other destinations in the Asia-Pacific could also be considered. Third, a common limitation when using dictionaries is that researchers assume the words used are related to the specific context in which the word is produced, or to which the word most probably relates; it is possible that words in the environmental dictionary might also relate to the destination. However, as they are included in ORs of hotels, it is realistic to assume, as other researchers have done (e.g. D'Acunto et al., 2020), that the words mostly relate to the hotel.

Table 1. Variables and descriptions.

Table 3. Online review ratings and environmental discourse depth and presence in Tripadvisor and Booking.com eWOM, 2017-2018.