Exploring the attributes of hotel service quality in Florianópolis-SC, Brazil: An analysis of TripAdvisor reviews

Abstract This study explores user-generated content (UGC) of the hotel sector in the city of Florianópolis-SC, Brazil, to identify the quality attributes of services and determine the polarity of the expressed feelings in the reviews of each attribute. The analysis is based on the latent topics and the polarity of feelings expressed in the reviews. UGC was collected using a crawler, resulting in a text corpus comprising 68,558 reviews. The polarity of feelings, positive or negative, was identified using sentiment analysis techniques and Latent Dirichlet Allocation (LDA) was used to identify latent topics in the corpus associated with the attributes of hotel service quality. This study found that “room,” “location,” “ambience,” “staff,” “breakfast,” “parking,” “reservation,” and “cost-benefit” were the attributes most frequently assessed by consumers in their reviews. The attributes that generate the most negative reviews were “room,” “parking,” and “reservation.” The attributes “location,” “ambience,” “staff,” “breakfast,” and “cost-benefit” were the attributes that generated most of the positive reviews. When comparing the results of this study to those of previous studies, two attributes demonstrated greater prominence: the attribute “room” that attracted a high number of negative comments and the attribute “parking” that had not presented itself with the same level of relevance in other studies.


PUBLIC INTEREST STATEMENT
The growing volume of user-generated content (UGC) on online platforms in the hospitality sector, driven by the advancement of information technologies, has proved to be a new source of useful data for interpreting customer needs. However, given the environmental and cultural differences in the contexts in which this content is generated, its interpretation is a challenge for managers and researchers interested in the subject. This study therefore explores UGC using natural language processing (NLP) techniques to identify the quality attributes addressed by customers and to measure the polarity of the feelings expressed in customer evaluations, in order to provide consistent results and contribute to better decision-making in hotel management.

Introduction
Tourism is one of the most important sectors in the world economy, accounting for 10.4% of gross domestic product (GDP) and responsible for 9.9% of all jobs, according to the most recent data (WTTC, 2018b). Despite the importance of international tourism because of export revenues, domestic tourism investment represents 73% of the sector's total revenue (WTTC, 2018a).
In Brazil, tourism represents 8.5% of GDP. Brazil is the country most dependent on domestic tourism, which accounts for 94% of the sector's total revenue (WTTC, 2018a). In 2017, Brazil had 60 million domestic tourists (Brasil, 2018).
Investing in domestic tourism has been an important strategy adopted by several countries to eliminate local poverty, generate jobs and economic growth, modernize infrastructure, relieve the pressure of overcrowding, and regulate the seasonality of regional visitors (WTTC, 2018a).
Moreover, the tourism industry and other economic sectors have been undergoing a transformation driven by advances in information technologies. Recommendation systems, online reservations, dynamic pricing, and interactive service review platforms have transformed the way individuals consume tourism products (Sanchez-Franco et al., 2019).
Today, customers are connected, critical in their evaluation process, have new demands, and expect to be surprised by each new experience. Hotels therefore need to stay focused on customers and keep monitoring their needs in order to navigate these changes and survive in the market.
Previous studies conducted in different contexts in the hospitality field, such as on hotels in New York (Lee et al., 2020), Malaysia (Padma & Ahn, 2020), and in different countries (Guo et al., 2017), and on luxury hotels in Spain have shown significant aspects of the quality of services.
These studies are based on the user-generated content (UGC) on digital platforms and on websites focused on online reviews, which have allowed for a fast growth in the volume of data on the hospitality sector. In the case of hotels, customer reviews through spontaneous feedback on hotels' goods and services and UGC can be freely accessed and collected for analysis (Guo et al., 2017).
UGC enables the identification of significant structures between various aspects and attributes related to hospitality and tourism products, and is a promising area of research in social media analysis in the field of hospitality and tourism (Xiang et al., 2017). However, previous studies have not established a consensus regarding the relevance level of hotel quality attributes. Studies have identified distinct sets of attributes that vary in relation to the level of relevance given by customers, and according to the environment and profile of the traveler (Sann & Lai, 2020a, 2020b; Sun et al., 2017; Ying et al., 2020).
Most of the previous studies also seek to determine the polarity of the feelings expressed in the reviews using the score scales suggested by the platforms (Lee et al., 2020; Sann & Lai, 2020a; Ying et al., 2020), where low scores are identified as negative reviews and high scores are identified as positive reviews. Nevertheless, a reviewer may express positive and negative feelings regarding different quality attributes in the same sentence (B. Kim et al., 2016), which calls for a more in-depth analysis of sentiments in UGC.
Thus, this study explores UGC in the hotel sector in the city of Florianópolis-SC, Brazil, to identify the quality attributes of services and determine the polarity of the reviews, positive or negative, regarding each attribute. Additionally, this study seeks to expand the knowledge base about UGC in Brazilian Portuguese, since the literature shows a predominance of studies that analyze reviews written in English compared to other languages (Ma et al., 2018).
This study is structured as follows: after this introduction, section 2 presents the concepts and results of the UGC approach in previous studies. Section 3 describes the methodology for collecting, cleaning, and analyzing the data. Section 4 presents the results. Section 5 presents the discussion and explores the polarity of feelings around the identified quality attributes. Section 6 presents the study's conclusions and limitations.

Literature review
In the hospitality and tourism industry, customer satisfaction is seen as a general emotional response to any intangible service. When the service meets or exceeds expectations, customers are generally satisfied (Li et al., 2013). Moreover, customer satisfaction is a factor that positively influences customer loyalty (Malik et al., 2020).
Just as people have different levels of experience and knowledge, they have different expectations and perceptions about the same product. It is up to organizations to identify and equalize the offer and delivery of products to their customers. To achieve this, it is necessary to identify the attributes of product quality that are valued by the customer and determine the level at which they are being satisfied. This is a difficult task that requires the formulation of models, which we discuss below.

Service quality models
There is a consensus in the development of service quality models that the abstract dimensions of perception, expectation, and satisfaction are defined by the customer and not by the service provider (Martin, 1995).
Thus, the correct identification of the main attributes of service quality is crucial to increase customer satisfaction. Several studies have been conducted to characterize service quality based on the subjective perceptions of clients and to identify the main factors that determine what is considered good service (Ju et al., 2019).
Instruments such as SERVQUAL and SERVPERF have been used for these purposes. SERVQUAL evaluates customer satisfaction and characterizes the gap between expectation and performance through a series of pre-established dimensions (Parasuraman et al., 1985). SERVPERF, created as an alternative to SERVQUAL, is based only on the perception of service performance (Cronin & Taylor, 1992).
Although these instruments are extensively used, they have known limitations, such as the lack of temporal resolution, meaning that they do not capture the frequency of visits to an establishment, and the dependence of the interviewees' perceptions on remembering past events (He et al., 2018).
Moreover, the development of empirical measurement scales relies on the prior identification of customer satisfaction dimensions based on the researchers' knowledge. The complexity of this type of research requires researchers to make a trade-off between the cost of collecting samples and the performance of the estimate (Guo et al., 2017).
In the hotel sector, which is the focus of this study, there are other difficulties in implementing this type of methodology, such as the lack of co-operation from hotel managers for reasons related to the security of sensitive data about customers and about the business, and the high costs of performing the research (Malik et al., 2020).

Hotel service quality
Technological advances linked to the internet, and particularly social media platforms such as Facebook, Instagram, and Twitter, have dramatically increased UGC during recent years (Haghighi et al., 2018). Other platforms, such as TripAdvisor, Booking.com, Expedia, and Travelocity, were developed specifically for the tourism sector and promote reviews of consumer experiences in this sector (Xiang et al., 2017).
UGC is spontaneous and insightful feedback that is widely available, free or inexpensive, and can easily be accessed anywhere, anytime (Guo et al., 2017). It is an important source of information for researchers and professionals to understand customer preferences and demands in varied domains such as restaurants (Ha & Lee, 2018), accommodation (Ma et al., 2018), public transport (Nisar & Prabhakar, 2018), airports (Martin-Domingo et al., 2019), and traffic (Haghighi et al., 2018).
The greater capacity of data processing, given the advancement of big data technologies in recent years, has allowed for considerable advances in the field of natural language processing (NLP). Unsupervised learning algorithms can be used to identify the attributes of quality through topic modeling and grouping. These algorithms enable the identification of aspects that users have focused on in reviews without the need for prior knowledge (Tran et al., 2019).
In the hospitality field, several studies have been conducted using UGC to understand the guests' reviews regarding the quality attributes of hotels and the factors that generate positive and negative comments.
To obtain a representative overview of previous studies in this specific field, data were collected on scientific articles published over the last five years (2016 to 2020), as summarized in Table 1.
The distinct results found in these studies have shown that the relevance given by guests to the quality attributes of the hotel services may vary depending on the environmental and cultural context. For example, a study conducted in five-star hotels in Spain using TripAdvisor's UGC identified several hotel quality attributes and classified them into "staff," "services," "room," and "location" (Rios-Martin et al., 2020).
In Malaysia, a study analyzed TripAdvisor content for luxury hotels. The study revealed that the main topics related to the quality of luxury hotels' services were related to hotel attributes (restaurant and breakfast), "room," "staff," "trip" (walking, taxi, and business), and possible consequences of the "experience" (recommendation) (Padma & Ahn, 2020).
The results of these studies highlight the need to identify the quality attributes of the services in each context to inform the correct decision-making by the managers. Another important aspect, evidenced in previous studies, is the behavior variation regarding the quality attributes according to the traveler profile.
Research that used UGC of hotels in China revealed that Chinese guests expect personalized service, while North American guests prefer standardized service (Ying et al., 2020); the behavior of guests of different nationalities toward quality attributes of hotels in the United Kingdom is influenced by cultural context (Sann & Lai, 2020b); in hotels in the United Kingdom, Asian guests encounter more service failures with respect to the engineering aspect of operations (room equipment issues), while non-Asian guests encounter more service failures in housekeeping (toilets, public areas, cleaning, and bedding) (Sann & Lai, 2020a).
However, the identification of attributes alone is not enough to support decision-making in the hotel industry. It is necessary to identify the sentiment of the reviewer when expressing their opinion about a certain attribute. The analysis of users' feelings can be used, for example, to classify the polarity of sentiments as positive, neutral, or negative, and allow the analysis of many documents as a corpus (Martin-Domingo et al., 2019).
Several studies, as identified in Table 1, use the Likert scale (ranging from 1: terrible to 5: excellent) defined on the platforms to determine a negative (dissatisfaction) or positive (satisfaction) review from the guest regarding the quality attributes of the service (Lee et al., 2020; Sann & Lai, 2020a; Ying et al., 2020).
However, a more detailed analysis of the sentiments expressed in reviews may present more striking results (Alaei et al., 2019). This is explained by the fact that some reviewers who rated their overall experience as excellent also made negative comments (B. Kim et al., 2016).
Associating the identification of topics related to quality attributes with sentiment analysis techniques allows one to transform qualitative data into quantitative data, measure consumer attitudes toward products, and compare a hotel's position with that of its competitors (Ma et al., 2018).

Methodology and development of the study
The methodology comprises data collection and pre-processing (including data cleaning and transformation), the modeling of topics to discover the attributes of hotel service quality, and sentiment analysis to detect the polarity of the reviews.

Data set
The municipality of Florianópolis-SC was selected for this study because it is one of the most visited tourist destinations in Brazil. It is a destination recognized worldwide for its natural beauty (it has more than 100 beaches) and for the quality of life it provides. Florianópolis-SC is also the Brazilian state capital with the highest score on the Human Development Index (HDI). Its economy is heavily based on information technology, tourism, and services.
Reviews from foreign tourists were not considered in this study. Brazil receives only 0.47% of foreign tourism, which represents about 6% of national tourism. Foreign tourists are mainly from Argentina, United States, Chile, Paraguay, Uruguay, France, Germany, Italy, United Kingdom, Spain, and Portugal (Brasil, 2020). Therefore, the low volume of UGC and its distribution in different languages made it impossible to apply any of the topic analysis tools.
The data for analysis were downloaded from the TripAdvisor website, which is the largest aggregator of reviews of tourist products in the world and is visited monthly by 460 million users. The TripAdvisor website contains 830 million reviews and opinions on 8.6 million accommodations, restaurants, attractions, tours, airlines, and cruises. TripAdvisor operates in 49 markets in 28 languages (Tripadvisor, 2020).
TripAdvisor was chosen as the data source for this study because it has a few advantages over other platforms. This platform has been widely used in previous scientific studies (Taecharungroj & Mathayomchan, 2019; Xiang et al., 2017), which makes it easier to compare results. TripAdvisor also shows data reliability, since the platform has maintained its reputation by policing the system to avoid false reviews (Taecharungroj & Mathayomchan, 2019). The use of TripAdvisor is widespread in Brazil, where platforms such as Yelp and Ctrip are rarely used. Last, an earlier study found that TripAdvisor has the best overall data quality when compared to platforms such as Expedia and Yelp (Xiang et al., 2017).
The data were collected using a crawler developed by the authors in the Python programming language. As a result, a data set of 68,558 reviews provided between 2005 and 2019 was obtained, and included reviews of 922 hotels and inns listed on the platform. As this study focused on domestic tourists, only reviews written in Brazilian Portuguese were collected.
The dates of the evaluations and the ratings of each property (defined using metrics from the TripAdvisor platform and shown as several circles) were also collected. The circle scores were determined by user ratings on a scale of 1 to 5, where 3 was considered average and 5 excellent. Some hotels had not yet been evaluated and were thus represented by zero. The distribution of data by year and by classification in circles is shown in Figure 1.
The first online reviews were recorded in 2005. Since 2011, however, there has been consistent growth in the number of reviews. This growth may be associated with the increasing use of online platforms by travelers and with the 2014 FIFA World Cup tournament and the 2016 Olympic Games (both events were hosted in Brazil).

Pre-processing
The text was pre-processed to organize, clean, and standardize the textual data to be subsequently used by the natural language processing and machine learning algorithms. The purpose of this procedure was to remove unnecessary content, thus reducing processing time and improving the performance of the models (Sarkar, 2019).
The techniques used for pre-processing the data set included removing special characters, converting capital letters to lower case, removing words without semantic value (stop words), and removing other unnecessary terms. Other techniques included applying stemming, which refers to the removal of affixes (prefixes and suffixes) from the words, and applying tokenization, which refers to the division of the text into words, sentences, or paragraphs, as necessary. These tasks were performed using the Python language and the Natural Language Toolkit (NLTK) library.
After conducting the initial tests, we decided to perform tokenization at a sentence level, as users often express conflicting feelings within the same review when addressing different aspects of their travel experience. Thus, judgment at a sentence level tends to increase the accuracy of the classifier. This process yielded a data set of 202,790 sentences for further analysis.
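The pre-processing steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the study used NLTK, so the stop-word list and the suffix stripper below are small hypothetical stand-ins for NLTK's Portuguese resources, and the function names `split_sentences`, `stem`, and `preprocess` are assumptions made for this example.

```python
import re
import unicodedata

# Tiny illustrative stop-word list (assumption); the study relied on NLTK.
STOP_WORDS = {"o", "a", "os", "as", "e", "de", "do", "da",
              "que", "um", "uma", "mas", "muito", "é"}

# Crude suffix list standing in for a real Portuguese stemmer (assumption).
SUFFIXES = ("mente", "ções", "ção", "es", "s")


def split_sentences(review: str) -> list[str]:
    """Split a review into sentences on '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def stem(token: str) -> str:
    """Remove the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(sentence: str) -> list[str]:
    """Lowercase, drop special characters, remove stop words, and stem."""
    text = unicodedata.normalize("NFC", sentence.lower())
    # Keep letters (including accented ones) and whitespace only.
    text = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in text)
    return [stem(t) for t in text.split() if t not in STOP_WORDS]


review = "O café da manhã é ótimo! Mas o estacionamento é muito pequeno."
sentences = split_sentences(review)
tokens_per_sentence = [preprocess(s) for s in sentences]
```

Splitting first and cleaning afterwards keeps the sentence boundaries intact, so each sentence can later receive its own topic and polarity label.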

Topic modeling
The topics were modeled to discover the quality attributes of hotel services that influence customer satisfaction. For this, the Latent Dirichlet Allocation (LDA) method was used. The LDA is an algorithm that belongs to the class of unsupervised methods for the automatic detection of topics in texts. It is based on the idea that every document includes multiple topics and it seeks to discover latent (hidden) patterns to understand relationships between documents and words. Thus, words present in related documents are grouped into topics (Blei et al., 2003). The LDA method was implemented using the Machine Learning for Language Toolkit (MALLET) package.
The data were then transformed into vectors for later inclusion in the algorithms using the Bag of Words (BoW) N-grams attribute engineering method. The BoW model represents each text document as a numerical vector in which each dimension is a specific word in the corpus and the vector value is the number of times that word occurs in the document. Adding N-grams extends this representation by treating each sequence of N consecutive words as a single term (Sarkar, 2019).
In this study, the value two was used for the parameter N of the BoW N-grams, creating a bigram. Other values were tested but a loss of precision in the model was observed as the N value increased. The model was also adjusted to remove terms with fewer than 20 occurrences in all documents and terms occurring in more than 60% of documents, consequently reducing the total number of terms from 25,704 to 4,295 terms.
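As a rough sketch of this representation, the snippet below builds a bigram BoW vocabulary and prunes it by frequency. It is a standard-library illustration rather than the authors' implementation. Two assumptions are made: unigrams are kept alongside bigrams, and the thresholds are passed as the hypothetical parameters `min_count` and `max_doc_ratio` (the toy call uses a minimum of 2 instead of the study's 20 so that the small example is not empty).

```python
from collections import Counter


def ngrams(tokens: list[str], n_max: int = 2) -> list[str]:
    """Return all 1-grams up to n_max-grams as space-joined strings."""
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams


def build_vocabulary(docs, min_count=20, max_doc_ratio=0.6):
    """Keep terms with at least min_count occurrences overall that appear
    in at most max_doc_ratio of the documents."""
    total = Counter()     # occurrences across all documents
    doc_freq = Counter()  # number of documents containing the term
    for tokens in docs:
        grams = ngrams(tokens)
        total.update(grams)
        doc_freq.update(set(grams))
    limit = max_doc_ratio * len(docs)
    return sorted(t for t in total
                  if total[t] >= min_count and doc_freq[t] <= limit)


def vectorize(tokens, vocabulary):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(ngrams(tokens))
    return [counts[t] for t in vocabulary]


docs = [["quarto", "limpo"], ["quarto", "pequeno"],
        ["quarto", "limpo"], ["café", "bom"]]
vocabulary = build_vocabulary(docs, min_count=2, max_doc_ratio=0.6)
vector = vectorize(["quarto", "limpo"], vocabulary)
```

In the toy corpus, "quarto" is dropped because it appears in three of four documents, while the rare terms are dropped by the minimum-count rule.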
This practice increases the accuracy of the model and reduces the computational effort required to run it, because unique or rare terms make a low semantic contribution to the model and because highly recurrent terms interfere with the interpretation of the context (Sarkar, 2019).
The optimal number of topics was defined based on the coherence score. A coherence score is a measure of the degree of semantic similarity among high scoring words in each topic. To this end, models were generated with the number of topics ranging from 2 to 50. Figure 2 shows the variation in coherence scores according to the number of topics.
As shown in Figure 2, the greatest coherence was observed with 14 topics. Therefore, this was the value used in this study. All sentences in the data set were then classified into one of the 14 topics revealed by the LDA, each sentence being assigned to the topic to which it had the highest probability of belonging.
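The two selection rules in this step (pick the number of topics with the highest coherence, then assign each sentence to its most probable topic) can be sketched as below. The coherence values are placeholders invented for the example, not the scores behind Figure 2, and the helper names are hypothetical.

```python
def best_num_topics(coherence_by_k: dict[int, float]) -> int:
    """Return the number of topics with the highest coherence score."""
    return max(coherence_by_k, key=coherence_by_k.get)


def assign_topic(topic_probs: list[float]) -> int:
    """Return the 1-based index of the most probable topic for a sentence."""
    best = max(range(len(topic_probs)), key=topic_probs.__getitem__)
    return best + 1


# Placeholder coherence scores (assumption), keyed by number of topics.
example_scores = {2: 0.31, 8: 0.42, 14: 0.47, 20: 0.44, 50: 0.38}
chosen_k = best_num_topics(example_scores)

# Topic distribution of one sentence over three topics (made-up numbers).
sentence_topic = assign_topic([0.1, 0.7, 0.2])
```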

Sentiment analysis
The polarity of the users' opinions expressed in the reviews was identified through sentiment analysis. This technique allowed us to determine whether the opinion expressed in each text carried positive sentiments (e.g., "Awesome bed, complete, and delicious breakfast," "Attentive and friendly service," and "The service is excellent, and the location is great"); or negative ones (e.g., "Unbelievably bad breakfast," "It is a bit tight and charges a little more than the competitors," and "There was never anyone at the reception").
In this study, feelings were classified using a supervised learning algorithm. The authors manually tagged a random sample of 5,000 sentences from the data set, resulting in 66.19% positive reviews, 23.72% negative reviews, and 10.09% neutral reviews. In the training stage of the model, the data set was divided, with 70% used for training and 30% for testing. Thus, algorithm learning took place in 70% of the data, and the level of error was measured in the other 30%.
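A 70/30 split of the labeled sample could look like the sketch below. The shuffling seed and the function name `train_test_split` are assumptions for illustration; the study does not state how the split was drawn, and the sentences here are dummy placeholders.

```python
import random


def train_test_split(items, test_ratio=0.3, seed=42):
    """Shuffle item indices reproducibly and hold out test_ratio of them."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(items) * test_ratio)
    test = [items[i] for i in indices[:n_test]]
    train = [items[i] for i in indices[n_test:]]
    return train, test


# Dummy stand-ins for the 5,000 manually tagged sentences.
labeled = [(f"sentence {i}", "positive") for i in range(5000)]
train, test = train_test_split(labeled)
```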
The data were transformed into vectors for later inclusion in the algorithms using the following formats: BoW and the Term Frequency-Inverse Document Frequency (TF-IDF). The TF-IDF model assumes that if a word is important for a document, it must have many occurrences in the document and few in the rest of the data set (Sarkar, 2019).
Term frequency (TF) means the same as BoW, that is, the number of occurrences of a word in a given document, while the inverse document frequency (IDF) is a weight that decreases as the number of documents in the set that contain the word increases (D. Kim et al., 2019).
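To make the weighting concrete, the sketch below uses one common formulation, tf-idf(t, d) = tf(t, d) * log(N / df(t)), where N is the number of documents and df(t) the number of documents containing term t. This variant is an assumption: the study does not state which formula or smoothing was applied, and library implementations often differ in detail.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dictionary per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weights


docs = [["quarto", "limpo"], ["quarto", "pequeno"], ["café", "bom", "bom"]]
weights = tf_idf(docs)
```

In this toy corpus "quarto" occurs in two of three documents, so it receives a lower weight than "limpo", which occurs in only one.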
The criterion used for choosing the classification model was accuracy. For that criterion, we evaluated the Multinomial Naive Bayes, Logistic Regression, Support Vector Machines (SVM), Random Forest, and Gradient Boosting Machine models. The Scikit-Learn library available for machine learning in Python was used in this step.
The accuracy of the models was evaluated through cross-validation (CV) and application to a battery of tests. While performing CV, the data set of each classification model was divided into 10 subsets, and consecutive runs were performed by alternating the test subset in each run, to obtain the mean accuracy value (CV score).
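The 10-fold procedure can be sketched as follows. The `evaluate` callback is hypothetical: it stands for any routine that trains a classifier on the training indices and returns its accuracy on the test indices. The sketch assumes the data were shuffled beforehand, since the folds are contiguous blocks.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_idx, test_idx) pairs; each fold is the test set once."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_items))
        yield train_idx, test_idx
        start += size


def cv_score(n_items, evaluate, k=10):
    """Average the accuracy returned by evaluate over the k folds."""
    scores = [evaluate(train_idx, test_idx)
              for train_idx, test_idx in k_fold_indices(n_items, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(25, k=10))
```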
The models were applied to the test set to evaluate the behavior of the model in a data set not yet analyzed by it, that is, the 30% of the data set previously reserved for testing the model. The results of the application of the models to the test set (Test Score) and CV (CV Score) are shown in Table 2.
After comparing the accuracy of the models, the classifier we selected for this study was the Naive Bayes with the BoW attribute engineering method, which showed an accuracy of 80.36% in the CV Score, and 80% in the Test Score, both superior to the other models.
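For readers unfamiliar with the selected classifier, the sketch below shows a minimal multinomial Naive Bayes over token counts with add-one smoothing. It is a didactic stand-in written with the standard library, not the Scikit-Learn model used in the study, and the four training sentences are made-up examples.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for tokens in docs for t in tokens}
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
        self.n_docs = len(docs)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.token_counts[c].values())
            score = math.log(self.class_counts[c] / self.n_docs)  # log prior
            for t in tokens:
                if t in self.vocab:  # ignore tokens never seen in training
                    count = self.token_counts[c][t]
                    score += math.log((count + 1) / (total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = c, score
        return best_label


train_docs = [["café", "ótimo"], ["atendimento", "ótimo"],
              ["quarto", "sujo"], ["estacionamento", "péssimo"]]
train_labels = ["positive", "positive", "negative", "negative"]
model = MultinomialNB().fit(train_docs, train_labels)
prediction = model.predict(["quarto", "sujo"])
```

Working in log space avoids numerical underflow when many token probabilities are multiplied together.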
This classifier was applied to the rest of the data set, resulting in the classification of 77.89% of sentences as positive and 21.28% of sentences as negative. The results of the sentiment analysis are discussed below, together with the results of the analysis of topics performed with the LDA.

Results
In this study, 14 topics were extracted from the corpus using the LDA method. Table 3 shows the main terms of each topic. As each sentence was previously assigned to a topic by the LDA, the importance of the topics can be determined by the percentage of sentences assigned to each topic, as shown in the second column of Table 3.
The topics were manually labeled based on a combination of human judgment of each topic's main terms, as shown in Table 3, and the literature that described the quality attributes of hospitality services. Topic 5, which corresponded to 6.23% of total reviews and grouped general comments about the stay, and topic 11, which corresponded to 7.32% of total reviews and grouped recommendations for future guests, did not have the characteristics of quality attributes and were therefore disregarded.
The results were validated by analyzing the words with the highest weightings assigned to each topic. In this analysis process, topics 1 and 9 were related to the "location" attribute, topics 2, 12 and 13 were related to the "room" attribute, and topics 6 and 8 were related to the "ambience" attribute. This procedure resulted in eight attributes, which are shown in Table 4.
After the attributes were defined, they were combined with the results of the sentiment analysis. The combined information allowed us to determine the polarity of feelings expressed by customers concerning the attributes of service quality. Thus, besides identifying the attributes, the proportion of positivity and negativity ascribed to each attribute could be measured, based on the polarity attributed to each sentence (see Figure 3).
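The combination of topic labels and sentence polarity can be sketched as a simple aggregation. The mapping below includes only the topic-to-attribute assignments stated in the text (the remaining ones are given in Table 4 and are not reproduced here), and the input sentences are made-up examples; the function name `polarity_by_attribute` is an assumption.

```python
from collections import Counter, defaultdict

# Only the assignments named in the text; other topics are listed in Table 4.
TOPIC_TO_ATTRIBUTE = {1: "location", 9: "location",
                      2: "room", 12: "room", 13: "room",
                      6: "ambience", 8: "ambience"}
DISCARDED_TOPICS = {5, 11}  # topics without quality-attribute characteristics


def polarity_by_attribute(sentences):
    """sentences: iterable of (topic_id, polarity) pairs.
    Returns {attribute: {polarity: percentage of that attribute's sentences}}."""
    counts = defaultdict(Counter)
    for topic, polarity in sentences:
        if topic in DISCARDED_TOPICS or topic not in TOPIC_TO_ATTRIBUTE:
            continue
        counts[TOPIC_TO_ATTRIBUTE[topic]][polarity] += 1
    return {attr: {pol: 100 * n / sum(c.values()) for pol, n in c.items()}
            for attr, c in counts.items()}


example = [(1, "positive"), (9, "positive"), (2, "negative"), (12, "positive"),
           (13, "negative"), (13, "negative"), (5, "positive"), (6, "positive")]
shares = polarity_by_attribute(example)
```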
Based on the polarity of feelings expressed in the reviews, the quality attributes of hotel services were divided into two groups: a set of attributes with a predominance of negative reviews and another set with a predominance of positive reviews.

Quality attributes with a predominance of negative reviews
The attributes "room," "parking," and "reservation" had predominantly negative reviews, with 32.51% of the reviews being negative. Negative reviews about these attributes accounted for 16.28% of total reviews. The words used most frequently in the reviews and classified as negative in the sentiment analysis are shown in Table 5.
The most frequently reviewed attribute was "room," which accounted for 20.16% of total reviews and 41.76% of negative reviews. As shown in Table 5, the reviews referred to aspects such as: lack of space in the room, the poor condition of the furniture, the slow speed and intermittence of internet service, linen that was stained or in poor condition, poorly functioning doors, general uncleanliness, an uncomfortable bed, problems with the temperature and quality of the bath water, the lack of in-room furnishings such as a desk to support professional activities, and problems with the functioning of air conditioning, the television, and the minibar.
The attribute "parking," which appeared in 6.31% of the reviews, received negative reviews in 66.89% of its occurrences, and received the highest proportion of negative reviews among the attributes. For this attribute, as shown in Table 4, customers reported problems related to space, vehicle incidents, distance between the parking area and the hotel, time restrictions, exorbitant prices or prices undisclosed at the time of booking, and the lack of parking at certain establishments.
The third attribute with a predominance of negative reviews was "reservation," which received 6.05% of total reviews, 69.7% of which were negative. As shown in Table 5, in the "reservation" attribute, customers reported problems such as scheduling errors, incompatibility of room specifications with those displayed on online platforms, wait time, inflexibility with schedules, and lack of receptivity from frontline staff.

Quality attributes with a predominance of positive reviews
The attributes "location," "ambience," "staff," "breakfast," and "cost-benefit" received a predominance of positive reviews. These attributes accounted for 47.97% of the reviews, and of this total, 45.05% were positive and only 2.92% were negative. The words used most frequently in the reviews and classified as negative in the sentiment analysis are shown in Table 6.
In this group, the most frequently reviewed attribute was the "location" of the hotel relative to the places of interest to the traveler, such as restaurants, bars, beaches, the city's shopping district, the airport, tourist attractions, and event venues, among others, as shown in Table 6. This attribute was frequently mentioned in the reviews of hotels in Florianópolis-SC, with 16.8% of the reviews including this attribute, of which 97.03% were positive.
Next, the attribute "ambience" was mentioned in 13.48% of the reviews. It was associated with subjective aspects, such as tranquility, hospitality, receptivity, rest, and the views from the property. So, a good ambience makes a space more welcoming and conducive to socializing, as shown in Table 6. This attribute received 93.54% positive reviews.
Another attribute with a predominance of positive reviews was "staff," which was mentioned in 9.15% of the reviews. It was associated with aspects such as service, attention, reception, promptness, friendliness, education, helpfulness, receptivity, and kindness, as shown in Table 6. This attribute received 94.74% positive reviews.
The attribute "breakfast" was also associated with positive evaluations by travelers. It was mainly associated with the variety, availability, and flavor of products such as coffee, fruits, and cakes, and the quality of customer service, as shown in Table 6. This attribute was reviewed in 8.54% of total reviews and received 87.5% positive reviews.
Table 6 (fragment recovered from the text). Cost benefit: satisfaction (4198), cost benefit (865), quality (593), price (551), worth (222), prices (139).
Finally, the "cost-benefit" attribute accounted for 5.96% of the reviews. This attribute was related to customers' perception of the ratio between money spent and service received, as shown in Table 6; 95.3% of the reviews for this attribute were positive.

Discussion
Priority actions should be taken on attribute sets with a predominance of negative reviews, because negative content on travel platforms directly influences the purchase decision, and directly affects hotels' revenue. Around 35% of travelers change their hotel booking decisions after browsing social media, 53% say they do not book a hotel that has no reviews to view, and 87% say that reviews make them feel more confident when deciding to purchase accommodation (Nicoli & Papadopoulou, 2017).
The attribute related to the internal environment of the "room" is the one to which managers should give the most attention: besides being the most frequently reviewed by travelers, it is the one with the highest percentage of negative reviews. As shown in Figure 4, hotels with 4.5 circles received the highest proportion of negative reviews for this attribute. A longitudinal plot of the data also shows that there are no significant changes in the proportion of the categories over the period evaluated.
The attribute "room" was present in most of the studies analyzed in the literature review, as shown in Table 1. Many of these studies identified this attribute as a positive comment generator (Alrawadieh & Law, 2019; Lee et al., 2020; Sann & Lai, 2020a), and linked it to customer satisfaction (Herjanto et al., 2017; Liu et al., 2017; Padma & Ahn, 2020; Wu et al., 2017). On the other hand, there were also studies that corroborated the results of this research, by observing the attribute "room" as a negative comment generator (Hu et al., 2019; Köseoglu et al., 2019).
Based on the most frequent words of each topic, shown in Table 5, actions to address this attribute should include routine protocols for checking the functioning of equipment and furniture, so that customers are not confronted by malfunctions. There should be a protocol for cleaning and replacing items that are in poor condition. Criteria should be established to evaluate the adequacy of the furniture and the layout of the guest room to ensure free movement in the environment. Additionally, the room's layout should support work-related activities, a rising trend in current times.
The "parking" attribute attracted similar reviews, as shown in Figure 4, with no changes in the order of categories over the period. This attribute appeared to be more important for guests in the area covered by this study, as it was not frequently found in related studies. When observed in other studies, this attribute was identified as a negative review generator (Fernandes & Fernandes, 2017; Xu & Li, 2016). The great number of reviews related to "parking" was another remarkable finding of the present study.
Actions related to this attribute should include offering adequate parking proportional to the demand and adapting parking spaces, while considering the technical criteria of architecture. The distance traveled by a guest is a critical factor and should be reduced whenever possible. Service personnel should be trained to serve the customer and quickly resolve potential problems.
The attribute "reservation" received the same proportion of reviews throughout the period, as shown in Figure 4. Longitudinally, this attribute was found in a higher proportion in hotels that received 4.5 circles, followed by those that received 4 and 5 circles. The attribute "reservation" also received the highest proportion of negative comments: 69.7% of the reviews regarding this attribute were negative. These findings confirm the results of previous studies (Hu et al., 2019; Sann & Lai, 2020b).
Actions on this attribute should include training service staff to avoid errors and improve customer treatment. Hotels should invest in technology to integrate internal software with external scheduling and billing platforms. Hotels should also examine whether their marketing policies are sufficiently transparent to the customer, to avoid creating expectations that are disproportionate to the actual service delivered. They should also adopt a clear policy regarding early check-ins and late check-outs, with flexible hours and no additional charges whenever possible.
Figure 5 shows the distribution of the attributes with a predominance of positive reviews according to the classification using circles. Organizations with a low performance in these attributes could use these results to improve the quality of their relationship with customers by improving valued attributes and thus adding value to their product. Organizations with a good performance record can enhance positive features to generate customer loyalty.
Positive ratings for the "ambience" attribute predominate in hotels rated with 4 circles, followed by hotels rated with 5 and 4.5 circles. Other studies have identified this attribute as related to positive comments on ecotourism experiences (Brochado & Brochado, 2019) and to negative hotel reviews (Fernandes & Fernandes, 2017).
Organizations that perform poorly in this attribute can consider reviewing the infrastructure and improving aspects such as views from the hotel, interior decoration, comfort, and contact with nature. Actions to improve this attribute may include a discussion between users and workers to adapt architectural features. Moreover, actions can be taken to improve cleaning services, food, and recreation, and to make the environment more familiar and suitable for children.
Reviews on the attributes "breakfast" and "staff" were more frequent among hotels that received 4.5 circles, followed by those that received 4 and 5 circles. Regarding the "location" attribute, hotels that received 4 and 4.5 circles predominated and had similar values, followed by hotels that received 5 circles. An important observation in this set of positive attributes is the growth in the number of reviews of hotels that received 3 and 3.5 circles, indicating that these categories of hotels may be improving significantly in these attributes.
Actions to improve the attribute "breakfast" may include standardizing the menu, adapting it to dietary restrictions, and using opinion polls to discover what products suit guests' tastes. Quality programs, extended to suppliers, can help ensure the availability of products throughout the service period, adequate service, a standard flavor, and punctual delivery.
Regarding the attribute "staff," it is important to select employees according to the hotel's target audience and to maintain a training program that aids in adapting to new technologies and valuing employees. Customers consider and value features such as friendliness, education, kindness, warmth, efficiency, agility, and courtesy. Another critical factor is the standardization of processes, which allows defining metrics to evaluate the results.
The attribute "cost-benefit," linked to an overview of the customers' perception of the ratio of costs and services received, has not been directly pointed out in previous studies. Despite being related to a high percentage of positive comments, organizations with negative ratings on the "cost-benefit" attribute should consider revising their pricing policy. Actions on this attribute should consider hotel occupancy rates, which are outside the scope of this study.
Finally, the "location" of the hotel is related to the proximity of places of interest for travelers to visit. Most of the previous studies confirmed the results of this research when associating the attribute "location" with a higher volume of positive comments or customer satisfaction (Alrawadieh & Law, 2019; Hu et al., 2019; B. Kim et al., 2016; Sann & Lai, 2020a; Wu et al., 2017).
Hotels with poor performance in this attribute could develop collaboration programs with other establishments and tourism agents. Such programs would allow services to be offered in the form of packages, with the possibility of monitoring the quality of the services offered to the traveler throughout the travel experience and correcting identified deficiencies.

Conclusion
This study explored the UGC of the hotel sector in the city of Florianópolis-SC, Brazil, to identify the quality attributes of services and determine the polarity of the feelings expressed in the reviews. For this purpose, a combination of topic analysis and sentiment analysis was used.
Through the topic analysis, it was possible to establish that "room," "location," "ambience," "staff," "breakfast," "parking," "reservation," and "cost-benefit" were the main attributes of the service quality of the hotels evaluated by customers.
The sentiment analysis made it possible to identify that "room," "parking," and "reservation" were the attributes that most frequently generated negative reviews. On the other hand, the findings also indicated that "location," "ambience," "staff," "breakfast," and "cost-benefit" were the attributes that most frequently generated positive reviews from customers.
The results support the initial assumption about the existence of a variation in the level of relevance of attributes given by customers, based on the environmental and cultural context in question. In this study the attribute "room" stands out, which contradicts the findings of previous studies by being associated with a high number of negative comments. The attribute "parking" also stands out by not presenting the same level of relevance in other studies.
The results of the study support previous claims that UGC is a useful source of content for identifying the attributes of quality. The methodological procedure for allocating reviews to topics identified by the LDA and classifying guest feedback as positive and negative made it possible to transform the poorly structured data of customer reviews into a quantitative picture of quality attributes, allowing for replication in other studies. Finally, the results contribute to the field of quality evaluation of hospitality services in the Portuguese language, where UGC studies are still scarce.
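The last step of that procedure, turning labeled reviews into a quantitative picture, can be illustrated with a minimal sketch. It assumes each review has already been assigned one attribute (for example, by LDA topic allocation) and one polarity (by a sentiment classifier); the toy data, the `summarize` function, and the one-attribute-per-review simplification are hypothetical and are not the study's implementation.

```python
from collections import Counter

# Hypothetical toy input: one (attribute, polarity) pair per review, as if
# produced upstream by topic allocation and a sentiment classifier.
labeled_reviews = [
    ("room", "negative"),
    ("room", "positive"),
    ("room", "negative"),
    ("breakfast", "positive"),
    ("breakfast", "positive"),
    ("parking", "negative"),
    ("location", "positive"),
    ("location", "positive"),
]


def summarize(pairs):
    """Return {attribute: (share_of_reviews_pct, positive_pct)}."""
    totals = Counter(attr for attr, _ in pairs)
    positives = Counter(attr for attr, pol in pairs if pol == "positive")
    n = len(pairs)
    return {
        attr: (100 * count / n, 100 * positives[attr] / count)
        for attr, count in totals.items()
    }


summary = summarize(labeled_reviews)
# Print attributes from most to least frequently reviewed.
for attr, (share, pos) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
    print(f"{attr:10s} share={share:5.1f}%  positive={pos:5.1f}%")
```

Each attribute ends up with two numbers, its share of all reviews and its share of positive reviews, which is the form in which the attribute results are reported in this study.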

Limitations
This study has some limitations. For instance, it only addressed attributes considered by domestic tourists, in Portuguese, and for only one Brazilian tourist destination. Further research could expand this scope by addressing the content generated by foreign tourists and considering tourist destinations in other Brazilian regions.
The data were obtained from a platform where the reviews refer to attributes of hotels. However, aspects outside the control of hotels can also influence the decision to visit or return to a tourist destination. Thus, future research should address this issue using the available content and other platforms related to tourism products, and include reviews on social networks such as Twitter, Instagram, and Facebook.
Lastly, this study does not relate the polarity of the feelings expressed in the reviews regarding the quality attributes of the services to general customer satisfaction. Further research could introduce this approach, including a correlation with the traveler profile.