Evaluating PPGIS Usability in a Multi-National Field Study Combining Qualitative Surveys and Eye-Tracking

ABSTRACT For designing qualitative interfaces for Public Participatory Geographic Information Systems (PPGIS), the user and use case should be clearly defined. However, PPGIS users may differ significantly, e.g. regarding their cultural background, IT-literacy, or interests. Studies examining varying user types and their impact on PPGIS usability are, however, lacking. In this paper, we analyse the user spectrum through conducting a usability study with 73 participants located in Colombia, Uganda and Austria. We combined a qualitative survey (conducted in all three countries) with an eye-tracking based survey (conducted only in Austria). Most of the usability issues arose due to inexperience in using interactive maps or applications other than social media. Based on the findings, we explored which user context information had an impact on which usability problem. With this, we designed an adaptation gradient that can be used for future research on developing adaptive PPGIS interfaces.


Introduction
Summarizing the ISO definition, an ideal application design should respond to (the requirements of) specified users and leverage context characteristics. For example, providing a single interface for both elderly people and young adults would potentially diminish usability since the requirements of both user types might differ significantly. We argue that the above-depicted scenario is common for the use of PPGIS. As an example, tools like Maptionnaire, Social Pinpoint, or OpenStreetMap are used in multiple countries with a large variance of users. These users might differ in their demographic characteristics, their roles (e.g. citizens, urban planners, city or municipality staff, scientists), their education, their interests, and so on. Prior usability studies in GIScience and related fields (such as Haklay and Tobón 2003, Meng and Malczewski 2009a, Bugs et al. 2010, Newman et al. 2010, Bugs 2012, Poplin 2015, Atzmanstorfer et al. 2016, Gottwald et al. 2016) have not investigated how to deal with multiple user types while maintaining a high level of usability. Some studies include facilitators who can intervene and help users who may need assistance in using a PPGIS through capacity building or one-on-one support (Aditya, 2010; Eitzinger et al., 2019). Although this is one potential way of introducing PPGIS to diverse groups of citizens, it is also time and capital intensive in the long run and does not solve the underlying problem (Eitzinger et al., 2019).
Consequently, a gap exists between theory and practice when it comes to PPGIS usability: even though we have observed PPGIS being used by varying user types, it is still not clear how to merge the preferences and requirements of these different user types into the PPGIS design (Bugs, 2012) without accepting usability deficiencies. A logical consequence would be to design multiple PPGIS interfaces that respond to the requirements of each user type and thus comply with the ISO usability norm. One approach is to include adaptive functionalities in the interface (Dey and Abowd, 1999), i.e. to provide each user type with a suitable interface. Among others, adaptive map interfaces have been discussed in studies like Ballatore and Bertolotto (2015) and Kiefer et al. (2017). To explore new opportunities for the inclusion of adaptive interfaces in the PPGIS design, this paper sets distinct objectives and non-objectives (Table 1).
In support of these objectives, we conducted a usability study by carrying out a qualitative survey with a PPGIS used in three countries: Colombia, Uganda and Austria. We selected these countries expecting a difference in user characteristics based on their differing development level, and sub-divided the participants into groups according to their level of digital literacy and age. In Austria, we further combined the survey methodology with eye-tracking to expand our understanding of potential usability problems (Çöltekin et al., 2009).
Our study aims to understand the differences in usability and user preferences of a broad range of users in three different study areas. For this, we raise the following research questions:
- Which are the main usability problems of a PPGIS for novice users?
- To what degree does the usability vary between participants and is there a connection to the interface preferences?
- What additional insights does the combined approach, using a qualitative survey and eye-tracking statistics, offer?
- What conclusions can we draw from these insights for creating an adaptation gradient?

Methodology
The study at hand analyses how intuitive the GeoCitizen application is for novice users. We selected this PPGIS since the tool is used in multiple countries, serving as an example for a globally used PPGIS. The GeoCitizen application (Figure 1) asks for citizens' ideas and preferences, based on their unique local knowledge, to be included in urban or spatial planning processes. The aim of the PPGIS is to enable dynamic communications and discussions among citizens of cities, towns, or municipalities (Atzmanstorfer et al., 2014). For this, citizens can respond to surveys and upload or discuss georeferenced proposals.
In 2015, Atzmanstorfer et al. (2016) conducted a usability study with GeoCitizen with participants from marginalized communities from Cali, Colombia. Since then, GeoCitizen developed into a tool that is used in multiple countries with a large variance of user groups. In this paper, we want to take this development into account by testing the usability of the tool not with one user group but with multiple. In correspondence with the ISO 9241-11 usability standard, we measured the effectiveness, efficiency and satisfaction of using GeoCitizen.

Table 1. Objectives and non-objectives of this paper.

Objectives:
- Analyse the usability of a globally used PPGIS that is not designed for a very specific use case.
- Understand how user characteristics are related to usability issues.
- Based on this, examine an adaptation gradient.

Non-objectives:
- Analyse the usability of a locally used PPGIS that is designed for a very specific use case.
- Understand the usability issues of all PPGIS. This paper explores the usability of one PPGIS that might or might not represent other PPGIS.
- Compare user performances and user quality, especially between study areas.

Survey design
The study consisted of two parts (Table 2). In the first part (based on Atzmanstorfer et al. (2016)), we asked the participants to carry out six tasks with the application. We measured three usability metrics (Nielsen, 2001): first, the effectiveness was assessed through the error rate in the completion of these tasks: completion without errors, with light non-critical errors, with moderate non-critical errors and with critical errors. Light non-critical errors refer to slight usability problems, such as short confusions that were easily and quickly overcome. Moderate non-critical errors, on the other hand, are confusions that took a longer time to overcome. Critical errors refer to confusions that were not overcome. Second, the efficiency of the application was measured by analysing the time that the participants needed to complete an activity. Third, through a questionnaire, we asked how satisfied the participants were with the usability.
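The effectiveness and efficiency measures described above can be illustrated with a minimal sketch. The names `ErrorLevel`, `error_distribution` and `mean_duration` are assumptions made for illustration only, and the observations are hypothetical values, not study data. How an observer decides between a light and a moderate error is a judgment call in the study and is not modelled here.

```python
from collections import Counter
from enum import IntEnum


class ErrorLevel(IntEnum):
    """Four error categories used to assess effectiveness (hypothetical encoding)."""
    NONE = 0      # task completed without errors
    LIGHT = 1     # short confusion that was quickly overcome
    MODERATE = 2  # confusion that took longer to overcome
    CRITICAL = 3  # confusion that was not overcome


def error_distribution(outcomes):
    """Return the share of task outcomes per error level."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {level: counts.get(level, 0) / total for level in ErrorLevel}


def mean_duration(durations_s):
    """Mean task completion time in seconds, as a simple efficiency measure."""
    return sum(durations_s) / len(durations_s)


# Hypothetical observations for one user group (not taken from the study).
outcomes = [ErrorLevel.NONE, ErrorLevel.NONE, ErrorLevel.LIGHT, ErrorLevel.CRITICAL]
shares = error_distribution(outcomes)
average_time = mean_duration([8.0, 12.0])
```

Keeping the error level as an ordered enumeration makes it straightforward to compare user groups by the share of moderate and critical errors.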
In the second part of the survey, we provided three interface sketches, serving as alternative interface designs to the interfaces the participants were using in the first part of the survey (Figure 1). We provided (1) a simple interface with a text-based wizard, which guides the user through the functionalities of the application, (2) a simple interface without a wizard, presenting the content as text elements and (3) a simple map with a highly reduced range of functionalities (Figure 2).
In accordance with Atzmanstorfer et al. (2016), we asked the participants to express their thoughts, feelings and opinions throughout the survey so that we could record them. This gave additional insights for contextualizing the results of the survey parts.

Eye-tracking analysis
In the first part of the study, participants carried out different tasks with the application and provided us with information about their cognitive process in a think-aloud fashion. As Çöltekin et al. (2009) state, this activity might be error-prone since participants may not be fully able to verbally express their cognitive process in a precise way. To overcome this, we used eye-tracking technology to measure the cognitive process of the participant more precisely and objectively (Poole and Ball, 2005). Eye-tracking data may also indicate which areas on the interface are complex and therefore give feedback about the usability of the interface (Poole and Ball, 2005; Çöltekin et al., 2009). With eye-tracking data, it is possible to understand when, how long, and how many times the participant looked at specific areas on the interface. These areas are commonly named Areas of Interest (AOI).

Part 1: Application interaction tasks
Evaluation of error rate, task completion duration and participant's satisfaction with the following tasks:
(1) Enter the application and navigate to the map.
(3) Find a POI and review its description.
(4) Leave a comment on a POI.

For this study, we opted to include an eye-tracking device for the first part of the study (completing different tasks) with the participants in Austria. The following metrics were used:
- Time to first glance;
- Number of fixations; and
- Duration of fixations.
The comparison between the time to first glance on the target AOI (the AOI that had to be used for completing a task) and the time spent on each task indicates whether the participant understood how to carry out the task. The faster the participant looked at the target AOI, the better its visibility (Poole and Ball, 2005; Çöltekin et al., 2009). The number of fixations on the target AOI compared to the overall number of fixations on the screen indicates how much the participant had to search in other areas to understand how to carry out a task. The higher the overall number of fixations, the less efficient the search for the right functionality to carry out a task may be (Poole and Ball, 2005; Çöltekin et al., 2009). Last, the duration of fixations on the target AOI can have different meanings: longer durations can indicate that the AOI is engaging, or they can imply that the participant did not understand the meaning of the representation at the AOI (Poole and Ball, 2005). For these reasons, additional context is required to interpret fixation duration (see Section 4). As the eye-tracking device, we used the Dikablis professional eye-tracker with an accuracy of the glance direction of up to 0.1°−0.3°. We used the software D-Lab 3 for processing the eye-tracking data.
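As a minimal sketch of how the three metrics above could be derived from a fixation log, consider the following. The `Fixation` record, the `aoi_metrics` function and the AOI labels are assumptions for illustration; they do not reflect the export format of D-Lab 3 or the processing actually used in the study, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_s: float      # onset relative to task start, in seconds
    duration_ms: float  # fixation duration in milliseconds
    aoi: str            # label of the AOI the fixation landed on


def aoi_metrics(fixations, target_aoi):
    """Compute time to first glance, fixation counts and mean duration for one task."""
    on_target = [f for f in fixations if f.aoi == target_aoi]
    # Time to first glance: onset of the earliest fixation on the target AOI.
    first_glance = min((f.start_s for f in on_target), default=None)
    mean_duration = (
        sum(f.duration_ms for f in on_target) / len(on_target) if on_target else None
    )
    return {
        "time_to_first_glance_s": first_glance,
        "fixations_on_target": len(on_target),
        "fixations_total": len(fixations),
        "mean_fixation_duration_ms": mean_duration,
    }


# Hypothetical fixation log for a single task (not study data).
fixations = [
    Fixation(0.4, 300.0, "map"),
    Fixation(1.1, 450.0, "menu"),
    Fixation(2.9, 400.0, "comment_button"),
    Fixation(3.6, 500.0, "comment_button"),
]
metrics = aoi_metrics(fixations, "comment_button")
```

Returning `None` when the target AOI was never fixated keeps a missed target distinguishable from a target that was found immediately.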

Participants
We carried out the survey with 73 participants: 28 from Colombia, 20 from Uganda and 20 from Austria. The participants from Colombia were students from the Camacho University in Cali (6), and fieldworkers (7) and scientific staff (15) from the International Center for Tropical Agriculture (CIAT) in Palmira. In Uganda, participants were coffee farmers from the Luweero district (8), students from Makerere University in Kampala (7), and administrative (1) and scientific staff (4) from the International Institute of Tropical Agriculture (IITA) in Kampala. In Austria, students from the University of Salzburg (2), administrative or technical (8) and scientific (10) employees of the University of Salzburg participated. The participants were invited by email and assigned a scheduled time slot.
To understand the characteristics of our participants, we asked for different profile information as a basis for analysing and defining user groups (Table 3). The factors age and digital literacy were most relevant to this study. Therefore, we based the user grouping mainly on these categories. Firstly, for coherence, the same age ranges were selected as in the previous usability study by Atzmanstorfer et al. (2016). Secondly, digital literacy was defined based on a combination of these factors:
- Survey questions about the educational level, the self-rated computer skills and the self-rated smartphone skills.
- Observations by the leader of this study regarding the smartphone use experience of each participant, defined based on the confidence and velocity of using the smartphone.
Since underrating or overrating digital literacy in self-reports is common (Warner, 1965), we combined the self-rated variables with observations of the participants by the study leader. Subsequently, the participants were divided into two digital literacy levels: low-medium and medium-high. A low-medium digital literacy level refers to participants that were not as confident and experienced using smartphone applications as other participants of the respective study area. Table 4 shows the number of participants for each user group. In the Results section, we base the visualization of the study results on these user groups.
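The resulting grouping into two age ranges and two digital literacy levels can be sketched as follows. This is an illustration only: the function names are hypothetical, the literacy level is taken as an already assigned label because the text does not state a numeric rule for combining self-ratings with observations, and the age bounds are those of the user groups reported in this paper.

```python
LITERACY_LEVELS = ("low-medium", "medium-high")


def age_group(age):
    """Map an age in years to one of the two age ranges used for grouping."""
    if 18 <= age <= 29:
        return "18-29"
    if 30 <= age <= 65:
        return "30-65"
    raise ValueError(f"age {age} is outside the studied range")


def user_group(age, literacy):
    """Combine age range and an already assigned literacy label into a user group."""
    if literacy not in LITERACY_LEVELS:
        raise ValueError(f"unknown literacy level: {literacy!r}")
    return f"{age_group(age)} years, {literacy} IT skills"


# Hypothetical participant (not study data).
group = user_group(25, "medium-high")
```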

Test procedure
We conducted the survey in a quiet atmosphere at a desk. The smartphone (Samsung Galaxy A3 (2017) with a 4.7-inch display, Android version 7) was lying on the desk without being moved or held in the hand by the participant. In Austria, for the first survey part, the participants were attached to the eye-tracking device, which was installed on a desk.
We welcomed each participant and explained the procedure of the survey. We gave oral instructions to the participants about each task and survey part and assured the participants that they themselves were not being evaluated by their performance. During the survey (in between tasks), we allowed the participants to ask questions in case clarification was needed. Each survey took between 20 and 30 minutes.

Application interaction tasks
For understanding how usable the application is for the four user groups of the three study areas, we measured the effectiveness (error rate), the efficiency (time spent on each task), and satisfaction (evaluation by the participant). Figure 3 shows the effectiveness of the application by the user groups and study areas. The four plots correspond to each user group and present the error metrics (no error, light non-critical error, moderate non-critical error, critical error). The coloured columns represent the study areas.
Most of the participants of the user group 18-29 years and medium-high IT skills of all study areas completed the tasks without errors or with light non-critical errors. The same trend is evident for the Austrian participants of the user group 30-65 years and medium-high IT skills and 18-29 years and low-medium IT skills. Colombian and Austrian participants, however, showed a higher number of moderate non-critical or critical errors. The trend towards moderate non-critical errors is also noticeable for Austrian participants. Last, most of the participants of the user group 30-65 years and low-medium IT skills from Colombia and Uganda completed the tasks with critical errors.
To summarize, the error rates of all user groups and all study areas show a similar trend. The higher the IT-literacy, the fewer moderate non-critical or critical errors were committed when completing the tasks. For the user groups of low-medium IT skills, we detect a difference between the study areas. Austrian participants committed fewer critical errors than participants from Uganda and Colombia.
For understanding the efficiency and satisfaction of the application, we further measured the duration of task completion and asked the participants to evaluate the application after each task (Appendix 1 and 2). These measurements show trends similar to those in Figure 3.

Interface design preferences
In a second step, we asked each participant about their preferences regarding the interface (Figure 4). The four plots represent the user groups, each column showing the results of each study area. The colours of the stacked columns indicate the interface choice: three alternative interfaces (guided interface, simple interface and simple map interface) and the current interface version. The darker the colour, the more advanced the interface.
For the user groups of medium-high IT skills, the study areas Austria and Colombia show a similar trend, where all participants opted for an interface with a map, and most of the participants preferred the current interface version (60% and 75%, respectively). 60% of Ugandan participants of the user group 18-29 years with medium-high IT skills also opted for the current interface. In contrast to the other study areas, 20% of the participants of this user group opted for a guided interface. For the older age group and same IT level of Ugandan participants, only 20% opted for the current interface and 80% for a simple interface.
Further, there is a striking difference between the study areas of the user groups of low-medium IT skills. Austrian participants of the user group 18-29 years and low-medium IT skills mostly opted for the current interface (80%); 20% of these participants voted for a simple map interface. Colombian participants of the same user group mostly voted for a simple map (50%), and 37.5% of the participants opted for a simple interface without a map. Only 12.5% opted for the current interface. The majority of the Ugandan participants of this user group opted for a simple interface (40%) or guided interface (40%). For the user group 30-65 years and low-medium IT skills we observe a mixed response, with a trend to a simpler interface. Most of the Austrian participants (60%) opted for an interface with a map (40% voted for the current interface), and 40% preferred a simple or guided interface. A vast majority of Colombian participants (86%) preferred a simple interface, while only 14% opted for the current interface. In contrast, 62.5% of the Ugandan participants preferred a simple interface. 50% of the participants opted for a guided interface and 37.5% voted for a simple map interface.
To summarize, most of the Austrian participants of all user groups opted for a map interface, mostly for the current version. The only participants that opted for a guided interface were users of the group 30-65 years with low-medium IT skills (40%). For the Colombian and Ugandan participants, we detected a decreasing tendency in choosing the current interface (reading the plots from left to right), meaning from higher to lower IT-literacy. Comparing the interface preferences with the error rates presented in the previous section, the error rates mostly correspond to the interface choice of the participants. Generally, the more participants that committed moderate non-critical or critical errors, the more participants voted for a simpler interface version.

User remarks
User remarks were noted throughout the survey. Most comments were shared regardless of the study area. In Table 5, we list the most stated aspects and some participants' quotes. The last-listed remark of the table, contextualized with the findings from the previous section, particularly caught our attention. While 88% of all participants stated a map is a useful way to visualize information, only 53% of the participants (39) actually opted for an interface with a map element (see the previous section). Subsetting this, 79% of Ugandan and Colombian participants of low-medium digital literacy (22 out of 28 participants) commented on the usefulness of a map element, while only 46% of these same participants (13 out of 28 participants) preferred an interface with a map as displayed in the previous section. This is a contradiction that we think can be explained by the fact that these user groups indicated having general problems with the map element but are motivated to learn how to use it since they saw the usefulness of a map.

Eye-tracking statistics
Figure 5 shows the median and variance of the time to first glance on the target AOI (green) and the duration of task completion (red) by user groups (only Austrian participants). The user group with the lowest median time to first glance is the group 18-29 years with low-medium IT skills (2.9 s), whereas the user group with the lowest median duration of task completion is the group 18-29 years with medium-high IT skills (8.5 s). The user group 30-65 years with low-medium IT skills had the highest median duration of task completion (12.6 s) and the biggest variance of results. The user groups of low-medium IT skills had a higher median duration of task completion than the user groups of medium-high IT skills.
Table 5. Category of remark and participants' quotes.

Remark: 78% of all participants (57 out of 73) stated that the button meanings were not intuitive enough.
Participants' quotes:
- 'I don't like how the buttons are distributed; I can't find them right away.'
- 'I don't understand the meanings of the buttons. I need more time to go and use each button to understand how to use them.'
- 'The button icons don't represent what the button actually does.'

Remark: 59% of all participants (43 out of 73) criticized the design or some aspects of it.
Participants' quotes:
- 'I am having trouble finding the right information. I need a bit more time to locate where to go next.'
- 'I don't like how the map is designed. It's difficult to zoom or move on the map.'
- 'There is a lot of information to process.'

Remark: 32% of all participants (23 out of 73) stated overall problems with the application; 30% of all participants (22 out of 73) commented on needing more guidance or time to learn how to use the application.
Participants' quotes:
- 'I don't understand how to use the app right now, but it would be nice to learn how to use it.'
- 'I don't know at all how to use the app and need somebody to help me.'
- 'It would be good if you could provide more information on how to use the app.'
- 'I need a little bit more guidance to understand the functionalities and what to do with them.'

Remark: 38% of all participants (28 out of 73) said that they find the application and its information useful.
Participants' quotes:
- 'I think a map element is very useful and I can see right away where other users uploaded their observations.'
- 'I like maps because the information is easy to read.'

To carry out each task, the time to first glance and the duration of task completion would ideally be short. A short time to first glance would indicate that the participant detected the target AOI early, meaning the AOI draws sufficient attention for carrying out a task (better search efficiency). A short duration of task completion would indicate that the target AOI was not only visible enough but also clear enough for the participants to understand and complete the task. Ideally, task completion time and time to first glance should correspond. The younger user groups of both IT skill levels spent almost 30% of the task completion time looking for the target AOI. They used the remaining 70% of the time for understanding how to carry out the tasks. In contrast, the older user groups differed in their behaviour. Participants of the user group 30-65 years with medium-high IT skills spent 37% of the task duration on searching for the target AOI, while the same age group with low-medium IT skills spent 27% on the search. In general, it took the participants between 2.5 and 4.3 s to have a first glance on the target AOI. However, the longer task completion duration indicates that participants were not sure of the meaning of the target AOI for completing the task.
Figure 6 shows the median and the variance of the number of fixations on the target AOI and of the total number of fixations on the whole screen. Participants of the user groups 18-29 years of both IT skill levels showed the lowest median number of fixations on the target AOI (4 fixations). In contrast, participants of the user group 30-65 years with low-medium IT skills had the highest median number of fixations, both in total and on the target AOI (43 and 7 fixations, respectively). As we have seen before when comparing the age groups, it is noticeable that the older age groups had a higher number of fixations (in total and on the target AOI). Also, the older age groups exhibited a greater variance of the total number of fixations than the younger age groups, regardless of the IT skill level.
Figure 7 shows the overall mean fixation duration and the mean fixation duration on the target AOI. Here, the longer the fixation, the more difficulties the participant had in extracting relevant information. In general, the participants had longer overall fixation durations compared to the fixation duration on the target AOI, except for participants of the user group 30-65 years with medium-high IT skills. The same user group also had the highest median of the fixation duration on the target AOI (575 ms). Participants of the user group 30-65 years with low-medium IT skills had the lowest median of the mean fixation duration on the target AOI (399 ms). For the overall mean fixation duration, participants of the user group 18-29 years with low-medium IT skills had the lowest median (437 ms).
Comparing the user groups by their IT-literacy, the user groups of low-medium IT skills had a lower median value for the mean fixation duration overall and on the target AOI than user groups of medium-high IT skills. Furthermore, user groups of low-medium IT skills had a median below the average median of all user groups, whereas the user groups of medium-high IT skills had a median situated above or equal to the average median. We further observe a higher variance of the user groups 30-65 years with medium-high IT skills and 18-29 years with low-medium IT skills compared to the remaining user groups.
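The share of task time spent searching for the target AOI, as reported in the eye-tracking statistics above, is the ratio of time to first glance to task completion duration. A minimal sketch, with a hypothetical function name and hypothetical input values rather than study data:

```python
def search_share(time_to_first_glance_s, task_duration_s):
    """Fraction of the task duration that passed before the first glance at the target AOI."""
    if task_duration_s <= 0:
        raise ValueError("task duration must be positive")
    return time_to_first_glance_s / task_duration_s


# Hypothetical task: first glance after 3.0 s, task completed after 10.0 s.
share = search_share(3.0, 10.0)
remaining = 1.0 - share  # time left for understanding and completing the task
```

A high share suggests that the target AOI was hard to find, whereas a low share combined with a long task duration points towards the AOI being found but not understood.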

Results
For the usability analysis, we divided the participants into two age groups and two IT-literacy levels, resulting in four user groups of three study areas. The map interaction functionalities, which we used in this study, required either a higher cognitive load or previous experience or knowledge for using an interactive map and its functionalities. This was particularly challenging for Ugandan participants. These participants were, in majority, not used to working with interactive maps. We observed striking usability problems by these participants, especially of low IT-literacy. Therefore, most of the Ugandan participants preferred a guided or simple interface. The study detected the same trend in usability and design preferences for low IT-literate participants from Colombia. Through observations, these Colombian participants generally seemed to be more experienced in using an interactive map. The problem for these participants in this instance was their lack of experience in using a mobile application. In contrast, high IT-literate participants from Colombia had the same trend in usability and in their interface design choice as Austrian participants. This indicates that high IT-literate Colombian participants have a similar set of knowledge and experience as Austrians and opted for advanced interface functionalities. Austrian participants were generally confident and experienced in using an interface with advanced functionalities. This is also reflected by the analysis of the usability aspects.
The survey results showed that the user group 18-29 years with medium-high IT skills faced the fewest usability problems of all user groups. We see these results supported by the eye-tracking statistics (only Austrian participants). The short task completion duration, the short time to first glance on the target AOIs, and the low number of fixations indicate, compared to the other user groups, a better on-screen orientation (search efficiency). The eye-tracking statistics also reflect the usability problems that were faced by the user group 30-65 years with low-medium IT skills. The duration for task completion and the number of fixations were much higher than those of the other user groups and indicate less efficiency for completing the tasks. The task duration, the time to first glance, and the number of fixations further depict a difference between the user groups. The younger user groups seem to have better task completion efficiency than the older user groups. Based on these insights, we assume that a longer mean fixation duration indicates that these participants better understood the AOIs. Hence, the higher the mean fixation duration, the more engaging the AOI and the better the task completion efficiency. However, apart from observing the user groups of low-medium IT skills having a shorter mean duration of fixation, we are not able to detect the same trend between the age groups as we have seen with the previously mentioned eye-tracking metrics. This might be explained by the number of participants in the study. Increasing the number of participants may help in understanding trends better and in determining outliers. Therefore, it was necessary to contextualize the findings of the eye-tracking statistics with the qualitative survey in order to interpret them.
Interestingly, even though a high number of low-medium IT-literate participants of all study areas opted for an interface without a map, several participants of these user groups commented on the usefulness of a map, which facilitates locating POIs. These participants stated that they would need to use a simple or guided interface since they were, at the time of the survey, not able to confidently use a map interface. However, they further stated they would be motivated to learn how to use advanced functionalities. This is interesting since the lack of experience on how to use interactive maps, and the interest and motivation to learn how to use these, should be seen independently. The statement not only depicts a general curiosity and motivation of the participants, it also indicates the usefulness for implementing mechanisms to adjust the interface to the users' learning curve and the users' needs.
Based on our research, we see a strong variance in usability between user groups and, hence, see a divide within a multi-user audience. As Gottwald et al. (2016) state, this can even be seen as social exclusion (e.g. certain citizens not being able to participate, even though they would like to). This also confirms previous research results from Eitzinger et al. (2019), where some PPGIS users needed the help of trained facilitators to be able to use the application.
In this study, we want to point out the challenge of designing a PPGIS that corresponds to the needs of its user audience. An important question to raise is how to deal with a large spectrum of user types. We suggest overcoming this problem by implementing adaptive interfaces into the PPGIS design. Adaptive interfaces detect the context of the users and adapt to it. Context is seen as information that can be used to characterize the environment and situation of a user and includes relevant information about users' location, time, activities, people and objects or resources that surround the user (e.g. noise level, network connectivity) (Schilit et al., 1994; Dey and Abowd, 1999). Creating interfaces that address the users' context (e.g. preferences, IT-literacy, age) may improve the usability of the application (Reichenbacher, 2001; Wang et al., 2001; Reichenbacher, 2005) by reducing the cognitive load of the application design and information through tailoring to the context of the user (Wang et al., 2001; Gartner et al., 2007).
With the findings of this study, we were able to design a prototype of an adaptation gradient that may be used for the implementation of adaptive interfaces into a PPGIS (Figure 8). We see the gradient on three axes: the amount of GIS features, degree of user interactivity, and guidance. The context of the user determines the degree of the gradient of each axis. Based on the study's findings, the way to visualize geospatial information depends on the map reading and interaction experience of each user. For example, map applications simply did not play a role in the daily life of Ugandan participants. Hence, we deduce that the non-use of interactive maps in Uganda is a phenomenon of that particular study area. Therefore, the degree and the way to visualize GIS features (e.g. on a map, as text) should correspond to the map reading/interacting skills of the user (e.g. the less the map-use skills of the participant, the less a map will play a role in the interface; and vice versa).
Second, the digital literacy of the user should define the degree of interactivity with the application. For example, the less confident the participants were, the more difficult they found it to use the application. Last, the degree of guidance can be defined by different factors, such as digital and map-using literacy and the individual's motivation to learn to use the application. The more confident and experienced the participant is in using the application, the less the need for guidance; and vice versa.
Even though the factors defining this adaptation gradient may not be exclusive, this gradient lays a foundation for translating usability issues and user preferences into adaptive PPGIS interfaces. It may thus be possible to overcome the usability problems that some participants faced in this study: the application adapts the interface design to the users and provides elements that are suited to their current IT-literacy, ranging from simple to advanced elements. In turn, it is possible to enhance long-term participation since the content, functionalities and interface elements can be provided in a way that is comprehensible to the users.
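The three axes described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the names `UserContext`, `GradientSetting` and `adaptation_gradient`, the normalisation of all scores to the range 0 to 1, and the linear mappings are assumptions made purely for illustration. The individual's motivation to learn, which is also named as a factor for guidance, is left out because the direction of its effect is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserContext:
    """Hypothetical user context; both scores are assumed to lie in [0, 1]."""
    map_literacy: float      # map reading and interaction experience
    digital_literacy: float  # general confidence in using applications


@dataclass(frozen=True)
class GradientSetting:
    """Position on each of the three gradient axes, each in [0, 1]."""
    gis_features: float   # how prominent map-based GIS features are
    interactivity: float  # how much interaction the interface demands
    guidance: float       # how much help and onboarding is offered


def _clamp(value: float) -> float:
    """Restrict a score to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def adaptation_gradient(ctx: UserContext) -> GradientSetting:
    """Map a user context onto the three axes (assumed linear mapping)."""
    map_lit = _clamp(ctx.map_literacy)
    dig_lit = _clamp(ctx.digital_literacy)
    return GradientSetting(
        # Lower map-use skills: the map plays a smaller role in the interface.
        gis_features=map_lit,
        # Digital literacy defines the degree of interactivity.
        interactivity=dig_lit,
        # More confident and experienced users need less guidance.
        guidance=1.0 - (map_lit + dig_lit) / 2.0,
    )
```

Under these assumptions, a user with little map experience but high digital literacy, e.g. `UserContext(map_literacy=0.2, digital_literacy=0.9)`, would receive an interface with few map-based features, a high degree of interactivity and a moderate amount of guidance. Any real mapping would have to be calibrated against usability data.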

Methodology
We observed that participants from Uganda and Colombia were hesitant when asked for their opinion or criticism regarding the application design. The majority of Austrian participants were much more critical. Hence, the results are subjective and depend on the study area and the culture of criticism. We further observed that low IT-literate participants in particular felt as if they were being judged when they were interacting with the application (the same phenomenon that Gottwald et al. (2016) describe). To mitigate these insecurities, we assured the participants that we were only assessing the application and not the participant performance. Nevertheless, these insecurities might have increased their stress level and potentially biased the survey outcome. A higher stress level might also have been triggered by the test setting, the unfamiliarity with the situation and the time constraints of the participants. We further detected disturbances generated by the periodically slow performance of the smartphone or internet connection and by the participants' unfamiliarity with using another person's smartphone, which may have differed from their personal ones. Based on our observations, another factor to be considered is the level of frustration and stress, which may differ between participants. Participants who were easily stressed might have behaved differently from more relaxed participants. Likewise, participants who became frustrated more easily may have had more difficulties in completing the tasks of the task scenario section, or may have criticized the application more than other participants.
A major problem for most of the participants of all study areas was the last task of the task scenario, where the participants had to change the profile configurations. For carrying out the task, the participants had to tap on the account name in the menu to access the profile configurations. However, since we provided an account for the participants, it did not reflect the participant's name. We hypothesize that the high number of critical errors was due to this problem. In other applications, the user profile is typically located in the same place and participants should be familiar with how to access it. Therefore, we assume that participants were distracted by the profile name since they did not relate to it.
In total, we included 73 participants in the study. We divided these participants into four user groups, with 5 to 8 participants per study area and group. The creation of the user groups was based on the self-rated IT skills of the participants and on observed factors (confidence and experience of the participants). The type of user groups and the separation of the participants based on this may have biased the outcome of the study results. Other user groups could have been selected, such as user groups based on gender, educational level, or other user characteristics that were not considered at all. Also, the age groups and IT skill levels used for the user groups could have been selected in another way. For example, the age groups could have been separated into different ranges; the same applies to the IT skill level. Further, an increased sample size could have helped quantitatively to reveal alternative trends or to underpin the trends detected in the results.
Combining the results of the eye-tracking statistics with the survey results, we were able to detect patterns in the eye-tracking statistics that supported the results from the survey. However, the eye-tracking device was sensitive to eyewear, eyelids that may hide the pupil, and body movements (Poole and Ball, 2005). Because of these factors, the eye-tracking data had to be re-calibrated to eliminate the resulting offset. This potentially biased the results and reduced the accuracy of the data. Defining the AOIs was only possible after recording the eye-tracking data. The AOI boundaries were defined at the beginning of the recording; however, as the participants moved throughout the recording, the AOI boundaries also moved. Through post-calibration, we corrected the boundaries. Yet, this may also have impacted the accuracy of the statistics. We further detected that some participants had a more global view of the interface. Some participants did not look directly at the AOIs (peripheral gaze), even though they paid attention to these areas. We assume this problem was intensified by the small screen of the mobile device, where most of the elements are close together. Nevertheless, we observed a high precision of the eye-tracking data. The generated statistics provided interesting details of the cognitive process of the participants. However, the results had to be contextualized with the qualitative survey since the number of participants was not high enough to make quantitative statements about patterns.
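To illustrate why an uncorrected offset affects AOI statistics, the following minimal sketch shifts recorded fixations by a constant offset and sums the fixation durations per rectangular AOI. It does not reproduce the post-calibration procedure or the software used in this study: the constant translation, the rectangular AOIs, the pixel coordinates and all names (`AOI`, `correct_offset`, `dwell_time_per_aoi`) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """Hypothetical rectangular area of interest in screen pixels."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def correct_offset(fixations, dx, dy):
    """Subtract an assumed constant offset (dx, dy) from every fixation.

    Each fixation is a tuple (x, y, duration_ms).
    """
    return [(x - dx, y - dy, dur) for x, y, dur in fixations]


def dwell_time_per_aoi(fixations, aois):
    """Sum fixation durations per AOI; fixations outside all AOIs are ignored."""
    totals = {aoi.name: 0.0 for aoi in aois}
    for x, y, dur in fixations:
        for aoi in aois:
            if aoi.contains(x, y):
                totals[aoi.name] += dur
                break  # count each fixation for at most one AOI
    return totals
```

With a single AOI `AOI("menu", 0, 0, 100, 50)` and the fixations `[(110, 60, 200.0), (50, 20, 100.0)]`, only the second fixation falls inside the AOI before correction; after subtracting an offset of 20 pixels on both axes, both do. On a small mobile screen with closely spaced elements, even such a small offset can assign fixations to the wrong AOI.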
In all study areas, the participants carried out the surveys while sitting at a desk, either in a quiet environment outside or in an office. In Austria, participants were attached to the eye-tracking device. Therefore, participants were not in a position they would usually be in when using a smartphone, which may have additionally affected their stress level. This may have impacted the outcome of this study since 'real-life conditions' were excluded by the survey environment.

Conclusion
With this study, we were able to explore the spectrum of user types in three study areas: Colombia, Uganda and Austria. Through a task scenario, questions regarding the interface design preferences, and general remarks of the participants, we were able to identify usability problems and user preferences of the participants and to contextualize these accordingly.
The study results indicated which user context information can be used, and how, to create an adaptation gradient on which further efforts to create adaptive interfaces could be based. We assume that adapting the degree of complexity of the interface design to the IT-literacy of the users helps to mitigate usability problems. Adaptive interfaces would then detect the user's context (Dey and Abowd, 1999) and respond to it by offering adequate functionalities and elements. For future work, we are planning to prototype a PPGIS that adapts its interface to the user's context. A comparison of the usability of this kind of PPGIS with the results of the study at hand would give crucial insights into the usefulness of this kind of interface. Further, we want to emphasize that research on cross-culturally and globally used PPGIS is relevant to advancing the knowledge about regional conventions and user-type differences that impact the use of applications. Additionally, we would like to highlight that the insights of this study may not be exclusive to PPGIS research but could potentially be used for usability research in map applications in general.
The eye-tracking statistics of the Austrian participants supported the trend detected in the usability aspects of the task scenario. Although useful, the eye-tracking data posed several challenges in post-calibration and interpretation. For further work, it would be interesting to use an eye-tracking device that does not have to be attached to a computer while recording the application interaction of the participants (e.g. as in Kiefer et al. (2014)). This way, we would be able to avoid the 'lab environment' setting and could create a more natural atmosphere for the participants.

Notes on contributor
Mona Bartling received an MSc in Applied Geoinformatics at the University of Salzburg, Austria and has been a PhD candidate since October 2017. Since April 2015 she has been a visiting researcher at the International Center for Tropical Agriculture (CIAT) in Colombia, where she is part of the developer team of the participatory tool GeoCitizen. Her PhD focuses on improving the user experience of participatory tools, and she has a particular interest in understanding user context for the design of map applications.