Towards a better understanding of the health impacts of one’s movement in space and time

ABSTRACT To better understand the interactions between physical built environment conditions and one’s well-being, we created a passive data collector for travellers and took a first step towards an explanatory model based on psychophysiological relations. By measuring biometric information from selected trial participants, we showed how different controlled factors affect the participants’ heart rate. A regression model with impact factors such as speed, location, time and activity (accelerometer data) reveals how the factors relate to each other and how they correlate with the individuals’ recorded heart rates throughout the observed period. For example, the results show that an increase in movement speed is not linearly correlated with heart rate: heart rate increased significantly when the individual reached brisk walking and running speeds, but not before or after. Early morning and early evening were the time slots in which the observed individuals had the highest heart rates, which may correlate with their commute activities. Heart rates at the office were lower than at home, which might correlate with more physical activity in the household.


Introduction
When building a sustainable society, it is not enough to make sure that the environment is climate-friendly and that the inhabitants are physically healthy. The World Health Organisation defines health as a state of complete physical, mental and social well-being, not merely the absence of disease (World Health Organisation, 1946). Choices in the planning of infrastructure can often address the possibilities of limiting the spread of diseases; to achieve mental well-being, however, one might have to look for other solutions. The surrounding environment influences a person's mental well-being, so by changing the environment, one would also change its psychological impact on the inhabitants.
An environment could be considered well designed when people can function comfortably and effectively in it over an extended period. This was stated by Kaplan (1987) when describing how environments can affect and alter the mental state and attention of people in an area. Kaplan invited researchers to start looking at how the environment alters overall productivity as well as well-being among people in an area. Many researchers have taken up this invitation to find links between area design and effects on the inhabitants. The research ranges from personal well-being, such as the studies performed by Hartig, Mang, and Evans (1991) and by Grahn and Stigsdotter (2003), to crime and aggression, such as the studies from Sullivan (2001a, 2001b), to driver behaviour, such as the studies performed by Antonson et al. (2009), and the list goes on. However, although these studies all address the same topic, they were performed with different methods and for different purposes. Velarde, Fry, and Tveit (2007) have done extensive work on collecting and compiling the methods of previous studies in which researchers tried to find differences in how various landscapes affect the mental well-being of people located in those areas. They concluded that the main health effects found across such studies were positive effects related to stress reduction, attention capacity, illness recovery, mood improvement and general well-being, to name a few. However, they also stress that these studies have been performed in numerous ways, with methods which by current standards might not be considered modern or, at points, even valid. The exposure to the stimuli ranged from physically being at the location of study, to viewing it from a window, to videos and pictures. These stimuli have many inherent issues, since the perception of the environment changes drastically between them.
The way the responses were recorded also differed, ranging from self-reports to the collection of brain activity data, showing the need for a more objective, standardised method for data collection.
One example of these methods is to aggregate data over areas and compare it with different factors in each specific area, from which one might draw conclusions about the health effects of a specific location. Moore (1981) used this type of methodology as an indirect way of measuring how stress-inducing certain locations are. By measuring the number of sick-calls that came from a certain area, Moore believed that it was possible to distinguish the relationship between the impact of the area and the number of sick people in the same location. This type of comparison might make it possible to find similarities and differences between locations and to uncover which types of environment have a positive or negative effect, but it provides no knowledge of how the environments affected the people or of whether the people had the same health status regardless of the location. Moreover, many other factors can have an impact on well-being, making this type of method uncertain to use.
The method explained in the previous paragraph also lacked many factors in its data collection, as it only registered information through the number of sick-calls. As Kanhere (2011) describes, by using smartphones it is possible to gather such information directly from more precise locations, providing a map filled with knowledge regarding citizens' perception of urban areas. Further, given the smartphone's combination of sensors, it is possible to report many health conditions with it, as Majumder and Deen (2019) describe. These sensors, in combination with the methodology of collecting crowdsourced data, make it possible to gather knowledge regarding the health status of different areas, as the work by Kamel Boulos et al. (2011, 2011b) suggests. Another type of method is to isolate a specific part of a stimulus and collect data about it. Although it might not always hold much external validity, it has been a great test of methods for data collection. The work of Nakamura and Fujii (1991) shows how focused exposure changes sensory stress when comparing the effects of hedges and block fences, narrowing down the collected alpha and beta waves of the participant's brain into a function which describes whether the stimuli are increasing or softening the sensory stress at the moment. Similarly, the work of Ulrich (1981) aimed to show the same results, using more detail in the stimuli but cruder data collected from several devices, such as brain activity and heart rate. This shows that if the detail of the stimuli is increased, the detail of the data collected does not need to be as high. But what would happen if such a data collection method were combined with less detailed stimuli and a much larger data set?
As these examples show, there are solutions for working on both an individual and an aggregated level, with both direct and indirect data collection. To create one common solution for stimuli and data collection, we have set out to create a tool for collecting the effect of environments on people's health that would be applicable in many of the cases discussed above. With the rise of new technologies for collecting biometric information, revealing people's well-being, together with widely available geolocation technologies, it should be possible to design and test a prototype of such a solution. This paper covers the early testing of, and data analysis from, such a tool.
This paper aims to uncover the theoretical possibilities of creating a tool for collecting data on people's health and linking it to certain environments and activities performed at specific locations. The introduction above presents an overview of the field, while Section 2 covers the details of the research gaps as well as the theoretical possibilities given the current state of the art of widely available technology. The background in Section 2 is used to devise a technical solution that might solve the issue stated in the introduction. The method in Section 3 describes how a prototype of the technical solution has been used as a proof-of-concept tool to show the current possibilities. The results in Section 4 and the discussion in Section 5 revolve around analysing what can be derived from the data collected, which then leads to the conclusion in Section 6 stating what the possibilities are for creating such a tool with the technology currently at our disposal.

Background
This section aims to provide a theoretical background based on current technological advancements to solve the problem stated in the introduction. The chosen references are interdisciplinary and range from medical sciences to computer science and civil engineering. Each subsection covers a certain area of the solution, focusing on previous work showing proof of concept in related as well as distant fields of research, in order to uncover research gaps in these fields. The methodology in this paper stems from the various automated data collection methods described by Velarde, Fry, and Tveit (2007), focusing on a combination of sensory data collection, such as the collections performed by Nakamura and Fujii (1991) and Ulrich (1981), and aggregated data analysis, resembling the one performed by Moore (1981). Therefore, this section aims to cover the state-of-the-art possibilities of collecting, fusing, and analysing the data needed to create a method for uncovering the health impact of one's movement in space and time.

Biometric data collection
In recent years, systems that utilise biometric information have become so widely available that they can now be fitted into virtually any mobile communications device. Companies such as Apple, Huawei and Samsung utilise biometric information as a means of controlling access to their mobile devices (Apple Inc, 2020a, 2020b; Huawei Technologies Co., 2020a; Samsung, 2020a, 2020b, 2020c). But biometric information is not limited to unique identifiers that can be used to identify and distinguish people from one another; it can also hold valuable information regarding the current condition of a person. Ranging from simple measures such as heart rate, which is defined as beats per minute, to more complex measures like brain activity, which is measured through a combination of different wave patterns, biometric information can be collected and analysed in many ways.

Heart rate
It is generally quite simple to collect heart rate manually: all one needs to do is find the pulse, count the number of beats over the course of 15 seconds and multiply that by four (Corliss, 2016). By monitoring resting heart rate, it is possible to make prognoses of different medical conditions (Diaz et al., 2005; Jensen et al., 2013; Yang, Kim, and Jeon, 2016). With new biometric information systems, it has become easier to collect such information over time. Technologies such as Electrocardiography (ECG) (American Heart Association, 2015) and Photoplethysmogram (PPG) (Healthy Doc, 2018) can be implemented in everyday items, such as smart watches (Apple Inc, 2019; Garmin Ltd., 2019; Huawei Technologies Co., 2019; Samsung, 2019). With these technologies, it is possible to digitally monitor and store data regarding the heart rate of a person if the device containing the system is attached properly. Numerous software applications are available that analyse such data for the purpose of creating a more active lifestyle (Apple Inc, 2020c; Google LLC, 2020), showing how simple it is, by today's standards, to gain indicative knowledge regarding someone's health situation with a next to non-intrusive tool.
However, the accuracy of these devices in absolute numbers is often not perfect, and several studies show that the use of PPG in a wrist-worn device might not be optimal. Many of these studies also stress that it is when the devices are put under more extreme circumstances that they might start to show inaccurate readings. As the work performed by Wang et al. (2017) shows, the accuracy of resting heart rate can still be quite tolerable using wrist-worn heart rate monitors embedded in a variety of smart device applications, all of which were using PPG. By comparing readings from widely available smart watches and bracelets meant for activity tracking during exercise with the readings of an ECG sensor, they could reveal how well the PPG sensors of commercial products correlated with the ground truth. The results even show that some of the more general-use devices were quite close to the real readings. But when the same devices are used during exercise, the readings start to drift away from the ground truth. The study performed by Wang et al. was held in a clinical environment, which might give only indications of real-world implications, but in the work of Gorny et al. (2017) wrist-worn PPG-sensors were put to the test in what they call 'free-living situations'. By comparing a Fitbit Charge HR wrist-worn PPG-sensor (Fitbit Inc, 2015) with a Polar H6 chest-mounted ECG-sensor (Polar Electro Oy, 2016), they found a systematic error where the absolute values of the PPG-sensor were lower than those of the ECG-sensor. Further research by Benedetto et al. (2018) showed that other PPG-sensors had a similar issue on the individual level. However, on an aggregated level, these inaccuracies were not as noticeable, implying that for aggregated data collection, the wrist-worn PPG-sensor might still be a good solution for non-intrusive data collection.
Therefore, there is a gap for collecting this type of data and controlling for physical movement that might affect the readings of such a device to gather more accurate test results.

Psychophysiological data collection
As mentioned in the previous subsection, biometric information can reveal information regarding medical conditions while being something that can be collected relatively easily with the current standard of technology. However, as already mentioned when describing the work of Nakamura and Fujii (1991) and Ulrich (1981) in the introduction, it is also possible to deduce psychological information from physiological (or biometric) information collected non-intrusively. As both Schell et al. (2001) and Lykken (2002) describe, psychophysiology is essentially about understanding what a person perceives and feels mentally by collecting and comparing motoric and electric reflexes and actions in the body, mainly through external means. There are many ways of collecting this type of psychophysiological data; for an extensive list of how and why to measure different forms of psychophysiological data, one can consult the work of Hugdahl (1995).
A good example of how the psychological condition of a person can actively affect the physical condition of the same person can be found in the work performed by Vrijkotte, van Doornen, and de Geus (2000), who compare different levels of work-related stress with fluctuations in heart rate and blood pressure. Since the type of work stress in their study is defined as purely psychological, it is a good example of how large the impact of psychological conditions is on the physiology of a person. However, this study merely showed that increased heart rate could be an indicator of psychologically induced stress, not how that biometric information could be used for predicting or explaining the psychological condition.
To find out more about the predictions and possible psychological causes for the physical reactions, more research on combinations of sensors has been performed. With the advances of technology regarding biometric data collection, as described in Section 2.1, more studies are being performed where data is cross-referenced in order to gain more accurate knowledge about the impacts of a given situation on the person perceiving it. The work of Jerčić et al. (2018) shows rather clearly how it is possible to combine the data collected from eye tracking and heart rate to analyse the cognitive load of a person. Such work not only makes data collection more precise, but also makes it possible to better understand which data streams have contributed to the final analysis. Jerčić et al. compare how cross-referencing Heart Rate Variability (HRV) and Pupil Dilation (PD) indicates the cognitive load a person is experiencing. The results show that, by using the HRV as a measurement of mental arousal, the PD can be used to determine whether that arousal is based on something cognitively demanding, thus not only giving more information about precise conditions but also providing information about possible implications of heart rate fluctuations.
The equipment used for psychophysiological data collection does not need to be very advanced to yield usable results. As Göbel, Springer, and Scherff (1998) showed already in the 1990s, psychophysiological data in the form of heart rate and eye tracking added more depth to the understanding of the psychological impact on drivers and the interaction design of their vehicles. Göbel et al. also claim that it was possible to deduce the impacts of certain tasks on the person by controlling all factors related to physical movement and only collecting heart rate information. Of course, a lot has happened since 1998, but the work of Jerčić, Sennersten, and Lindley (2018) seems to be along the same line 20 years later. As this is still to be tested in combination with automated travel diaries, this study has used the heart rate as the dependent factor to measure the impact of space and time on the test subjects.

Location data collection
There can be many purposes for understanding people's movement through space and time when designing and developing built environments. By seeing the flow of these movements throughout a city, it is possible to see how changes can be made to optimise the area. As Axhausen et al. (2002) describe in their work, travel diaries can be used to collect data for the purpose of analysing trends in movement around a city, with rather promising results. For the purpose of this study, we are interested in correlating biometric effects in different areas, which is similar to analysing trends of movement in an area. However, to collect information regarding effects in certain areas, a combined data set is needed which describes the independent and dependent variables at a location at the same point in time. This is not possible using traditional travel diaries, where one tracks one's movement manually using either surveys or interviews; a more automated method needs to be used. Prelipcean, Susilo, and Gidófalvi (2018c, 2018b) as well as Wang, He, and Leung (2018), among others, have produced in-depth inventories of various methods of collecting data in the form of travel diaries, and among those methods there are solutions which are both digital and automatic. Further, Prelipcean, Gidófalvi, and Susilo (2018a) have shown the possibilities of collecting reliable data with personally worn devices, in this case smartphones, for understanding trends in travel and more. Based upon this, Palmberg et al. (2019, 2019a, 2019b) have worked on combining the methods of automated travel diaries with the collection of biometric information to further develop the possibilities of analysing the effects of travel and mobilisation of people, which is further explored in this study using the same hardware and software.

Methodology
This section aims to explain the proof-of-concept tool used for collecting biometric information together with position data for analysis of health and psychological impact of areas. It is also meant to explain the analysis method used to derive the implications of the surrounding environment on the participants.

Data collection
To be able to collect information about the participants over a longer period of time without disturbing their daily routines, a device had to be chosen which would be less intrusive than the conventional heart rate monitors used in previous studies of the same character. A smart watch was chosen together with the software MERGEN developed by the authors Gidofalvi, 2019a, 2019b). The device chosen was a Huawei Watch 2 4G, which can collect biometric information as well as position. It also has gyroscope and accelerometer sensors, letting it detect movement from the user. A detailed list of the specifications of the smart watch can be found in Table 1. The software was set to collect one sample every minute, even if the person was sitting still, to get information regarding how the effects might change over time in the same location.

Heart rate
The device collects biometric information through a PPG sensor. While it would have been possible to extract the raw PPG data, that would have led to a lot of work in optimising and filtering the data. Since this is a work in progress focusing on proving the usefulness of smart watches in data collection regarding the impact of the surrounding environment on the participant, we chose to use the built-in function, which extracts the data, interpolates and filters it, and provides a measurement of heart rate in the form of beats per minute. This also makes it possible to include other devices based on the same operating system in the future, making this work easy to build upon and scale up. This measurement, with the accelerometer data used for control, is used as the dependent factor in determining the effects of the surroundings in the data analysis. The utilisation of data provided through the hardware's own operating system might cause errors, since it is not disclosed how this data has been handled after collection. However, since all participants used the same model from the same brand, this should hopefully only lead to systematic errors. Since the study revolves around the fluctuations of heart rate within the range of each participant, such systematic errors that might have affected the absolute values should be dealt with automatically.

Position
For position collection, the location was collected in as fine detail as possible. The software was programmed to provide longitude and latitude with the best precision available at the current location. In some cases, where global positioning coverage was lacking, the positioning had to be made coarser to maintain a steady flow of information, as the device lacked assisted GPS (A-GPS), meaning that it was not possible to rely on any cellular network for position data. Just as with the heart rate, the data provided through the operating system was used, and for the same purpose: the possibility of utilising other hardware running the same operating system in future studies. Data such as elevation was not collected as raw data, but instead extracted from a height map using interpolation of the coordinates collected. The height map was provided by Becker et al. (2009) and had a resolution of 30 metres. The interpolation was carried out by finding the elevation data that was closest to the coordinates for a specific data point and using that elevation as the elevation for that data point. This means that elevation related to buildings is not collected in this experiment.
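The nearest-value elevation lookup described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB implementation: it assumes the height map is a regular latitude/longitude grid, and the function name, grid layout and sample values are all hypothetical.

```python
import numpy as np

def nearest_elevation(lat, lon, grid_lats, grid_lons, grid_elev):
    """Return the elevation of the grid cell closest to (lat, lon).

    grid_lats and grid_lons are 1-D arrays of cell-centre coordinates;
    grid_elev is a 2-D array indexed as [lat_index, lon_index].
    On a regular grid, picking the nearest row and nearest column
    independently selects the nearest cell in coordinate space.
    """
    i = int(np.argmin(np.abs(np.asarray(grid_lats) - lat)))
    j = int(np.argmin(np.abs(np.asarray(grid_lons) - lon)))
    return float(grid_elev[i, j])

# Hypothetical 3 x 3 height map (values in metres), for illustration only.
grid_lats = np.array([59.30, 59.35, 59.40])
grid_lons = np.array([18.00, 18.05, 18.10])
grid_elev = np.array([[10.0, 12.0, 15.0],
                      [20.0, 22.0, 25.0],
                      [30.0, 32.0, 35.0]])

sample = nearest_elevation(59.34, 18.06, grid_lats, grid_lons, grid_elev)
```

Because the lookup copies the value of the closest cell rather than blending neighbouring cells, the derived elevation changes in steps as a participant crosses cell boundaries.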

Auxiliary information
To distinguish heart rate fluctuations based on physical exertion from psychological effects, movement data was collected as a control. By using the accelerometer data, it should be possible to make this distinction, as exercise will be revealed by a lot of motion being detected by the device at the wrist. This data is collected as the gravitational pull when moving the wrist in different directions, calibrated so that the pull is 0 when the arm is in a resting position, even though there is always a gravitational pull on the body.
The time of day might also have an impact on the person; therefore, the data was divided into two-hour intervals ranging from 08:00 to 20:00, as can be seen in Table 2. The reasoning behind this is that there might be certain patterns connected to times of the day which can be interesting to monitor, like time spent commuting or sitting still at the desk. The time from 20:00 to 08:00 was not included, as the participants were not instructed to wear the device during night-time.
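The two-hour time categories can be derived from each sample's timestamp. The sketch below is one possible way to do this; the label format and the function name are assumptions made for illustration and need not match the category names used in Table 2.

```python
from datetime import datetime

def time_slot(ts, start_hour=8, end_hour=20, width=2):
    """Return a label such as '08:00-10:00', or None outside the window."""
    if not (start_hour <= ts.hour < end_hour):
        return None  # night-time samples are excluded from the analysis
    lower = start_hour + ((ts.hour - start_hour) // width) * width
    return f"{lower:02d}:00-{lower + width:02d}:00"

# Hypothetical timestamps, for illustration only.
slot_morning = time_slot(datetime(2019, 5, 6, 9, 30))
slot_night = time_slot(datetime(2019, 5, 6, 21, 15))
```

Each label can then be turned into a 0/1 indicator variable, with one slot left out as the reference category, before entering a regression.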

Data analysis
The data analysis was performed using machine learning methods. A multilinear regression explanatory model was created for analysis of the effect of independent variables on the dependent variable 'heart rate'. By creating such a model, the method of data collection and analysis of this project is virtually infinitely scalable, given that the hardware running the analysis can keep up with the amount of data collected.

Variable derivation and data refinement
As described in the data collection, several factors were collected. However, certain factors had to be derived from the collected data. The device did not collect data such as speed or elevation; these were derived using simple physics formulas, such as distance travelled divided by time, and by cross-referencing locations with elevation data from a separate data set. To calculate the physical distance between two data points, the haversine formula was used. The haversine formula calculates the distance between two geographical coordinates based on the spherical characteristics of the earth (Robusto, 1957). The calculations performed for the data analysis were rewritten for computation with the help of Veness (2002) and can be found in Equations 1 to 3 below. The earth is approximated as a spheroid with an averaged radius of 6371 km based on WGS84 (Macomber, 1984). Assuming that $\varphi_1, \varphi_2$ are the latitudes and $\lambda_1, \lambda_2$ the longitudes of the two data points in radians, $\Delta\varphi = \varphi_2 - \varphi_1$, $\Delta\lambda = \lambda_2 - \lambda_1$, $R$ is the earth's radius and $d$ is the distance between the points:

$$a = \sin^2\left(\frac{\Delta\varphi}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\Delta\lambda}{2}\right) \tag{1}$$

$$c = 2 \cdot \operatorname{atan2}\left(\sqrt{a}, \sqrt{1 - a}\right) \tag{2}$$

$$d = R \cdot c \tag{3}$$

The sample rate of the accelerometer and gyroscope is too low to determine elevation changes, so the results of this experiment rely on the elevation data described in Section 3.1.2. This means that all positions are assumed to lie flat on the spheroid surface of the earth, with no regard to indoor elevation changes caused by walking up and down staircases or riding the elevator. The accelerometer data is instead used to understand whether the participant is stationary or moving their upper body.
The points of interest were selected using a manual clustering method where all data points were divided into 'home', 'office' or 'other'. This was done to make it possible to compare the data collected at home and at the office, since those are two locations where the participants would spend a lot of time during the data collection.
All dependent variables were normalised to reduce the influence of the number of participants in the study, so that there should be no bias related to the absolute heart rate values. For the purposes of this study, it is mainly important to compare which locations might cause an increase or decrease in heart rate rather than to know the absolute heart rate in any location.
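The haversine distance and the speed derivation described above can be sketched in a few lines. This is an illustrative version in Python rather than the MATLAB code used in the study; the function names are hypothetical, and the radius of 6371 is taken to be in kilometres.

```python
import math

EARTH_RADIUS_KM = 6371.0  # averaged earth radius stated in the text

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c

def speed_kmh(lat1, lon1, lat2, lon2, seconds):
    """Average speed between two consecutive samples (distance / time)."""
    if seconds <= 0:
        raise ValueError("samples must be separated by a positive time")
    return haversine_km(lat1, lon1, lat2, lon2) / (seconds / 3600.0)

# One degree of latitude along a meridian, covered in one hour.
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
hourly_speed = speed_kmh(0.0, 0.0, 1.0, 0.0, 3600)
```

With one sample per minute, `seconds` would normally be close to 60, so a single misplaced GPS fix translates directly into a very large derived speed.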
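The text does not state which normalisation was applied. As a sketch of one common choice, the example below converts each participant's heart rate to a z-score using that participant's own mean and standard deviation; this choice, the function name and the sample values are assumptions, not the authors' documented procedure.

```python
from collections import defaultdict
from statistics import mean, pstdev

def normalise_per_participant(samples):
    """Z-score heart rates within each participant.

    samples is a list of (participant_id, heart_rate_bpm) tuples;
    the result is a list of z-scores in the same order.
    """
    by_id = defaultdict(list)
    for pid, hr in samples:
        by_id[pid].append(hr)
    stats = {pid: (mean(v), pstdev(v)) for pid, v in by_id.items()}
    result = []
    for pid, hr in samples:
        mu, sigma = stats[pid]
        # A participant with constant heart rate gets 0.0 for every sample.
        result.append(0.0 if sigma == 0 else (hr - mu) / sigma)
    return result

# Hypothetical readings: participant B runs 20 bpm higher than A throughout.
samples = [("A", 60), ("A", 70), ("A", 80), ("B", 80), ("B", 90), ("B", 100)]
z = normalise_per_participant(samples)
```

After this step the two hypothetical participants have identical normalised series, which is the intended effect: a constant offset between individuals no longer influences the comparison between locations.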

Personal biometric model
To explain the impact on the person, an explanatory model had to be developed. To deal with bias, the heart rate of each participant was normalised to make it possible to include all data in one larger data set. Using the normalised heart rate as the dependent variable, the independent variables and their relationships had to be defined. A regular linear regression model has the form shown in Equation 4, where a describes the relation between the dependent variable Y and the independent variable X given the constant b.

$$Y = aX + b \tag{4}$$

In a multi-linear regression, one can add more describing variables to explain their combined relation with the dependent variable given the constant; this is shown in the Personal Biometric Model.
To make it possible to explain how the different factors affect the dependent variable, one must account for control variables, and in this study those are speed and accelerometer data, as the dependent variable would inherently increase with an increase in those factors. However, as there was no distinction in speed between using a vehicle and moving under one's own power, some speed categories had to be decided upon to see how different speeds affected the dependent variable. The speed was divided as shown in Table 3. A high speed combined with a lot of accelerometer activity could be interpreted as exercise, while high speed with low accelerometer activity would be considered as using a vehicle to transport oneself, which in turn should have less impact on the dependent variable, given that the participant was not negatively affected by the vehicle. The time was divided into categories of two hours each from 08:00 to 20:00 to pick up on times which required more activity, such as commuting. The final, updated personal biometric model is shown in Equation 5, with an explanation of the variables found in Table 4.
The factors affect the heart rate in different ways, which are described in Figure 1. As Ulrich (1981) highlights in his research, it is quite possible to detect psychological effects with simple heart rate monitors measuring the overall heart rate arousal. This means that effects of space and time should be derivable given that one controls for physical factors that might affect the heart rate, such as full-body movements and arm movements (as the accelerometer data is collected only from the wrist).
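To illustrate how a multi-linear regression of this kind is fitted, the sketch below estimates coefficients by ordinary least squares on synthetic data. It is not the Personal Biometric Model itself: the three predictors (an office indicator, a commute-slot indicator and an accelerometer magnitude), the coefficient values and the function name are all invented for the example, and the study itself used SPSS.

```python
import numpy as np

def fit_multilinear(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, intercept, R), where R is the multiple
    correlation coefficient between fitted and observed values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # add constant term
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    ss_res = float(np.sum((y - fitted) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r = float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))
    return beta[1:], float(beta[0]), r

# Synthetic data with hypothetical predictors:
# [is_office (0/1), is_commute_slot (0/1), accel_magnitude].
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([
    rng.integers(0, 2, n),
    rng.integers(0, 2, n),
    rng.random(n),
])
true_coef = np.array([-0.3, 0.5, 1.2])
y = 0.1 + X @ true_coef + rng.normal(0.0, 0.05, n)

coef, intercept, r = fit_multilinear(X, y)
```

Categorical factors such as location, speed class and time slot enter the design matrix as 0/1 indicator columns, so each coefficient is read as the shift in normalised heart rate relative to the omitted reference category.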

Results
In this section, the results from the data collection as well as the analysis will be covered. Based on the method and background, these results will then be discussed further in Section 5.

Data collected
Over the course of three months, three participants wore the device to collect the necessary data. The three participants were all male, in the age group of 40-50 years, and all worked at desks (no physical labour). After data cleaning and refining, 13,327 data points remained and were used for the data analysis. The data was cleaned and refined using MATLAB R2019a with the 'Statistics and Machine Learning Toolbox' and 'Mapping Toolbox' and was then put into SPSS 26 for further data analysis. The data is visualised in Figure 2, where each coloured dot represents where a data point has been collected in space and the three different colours represent the three participants.

Data analysis
The data was inspected to find any anomalies, which were then adjusted in MATLAB by either interpolating the faulty data point or removing it completely, depending on the type of error. The most common issue in the data used for the analysis were so-called 'speed peaks', which occurs when the GPS data is lagging and the next data point becomes very far away from the previous data point, this was typically solved with interpolation. The data was then analysed using multi-linear regression in SPSS with the Personal Biometric Model. As can be seen in Table 5, the model showed an Rvalue of 0.716 and all variables except for the speed-related variables had a significance at the 99% level, which can be seen in Table 6. The variable 'ParticipantID' was added to uncover possible differences between the participants heart rate patterns, but no further information regarding the participants were either collected or added. Data regarding external changing parameters such as current health status, diet and caffeine intake was not collected because the data collection aimed at being completely implicit, excluding any active interaction between the user and the data collection tool.
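A simple way to handle the speed peaks described above is to flag samples whose derived speed exceeds a plausibility threshold and replace them by linear interpolation between the nearest plausible neighbours. The sketch below illustrates that idea only; the threshold value, the function name and the choice to interpolate the speed (rather than the underlying position) are assumptions, since the text does not give these details.

```python
def interpolate_speed_peaks(speeds, max_speed):
    """Replace values above max_speed by linear interpolation.

    Peaks are interpolated between the nearest valid neighbours;
    peaks at either end of the series take the nearest valid value.
    """
    valid = [i for i, s in enumerate(speeds) if s <= max_speed]
    if not valid:
        raise ValueError("no valid samples to interpolate from")
    cleaned = list(speeds)
    for i, s in enumerate(speeds):
        if s <= max_speed:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:
            cleaned[i] = speeds[right]
        elif right is None:
            cleaned[i] = speeds[left]
        else:
            weight = (i - left) / (right - left)
            cleaned[i] = speeds[left] + weight * (speeds[right] - speeds[left])
    return cleaned

# Hypothetical speeds in km/h with one lag-induced peak.
raw = [4.0, 5.0, 180.0, 7.0, 6.0]
cleaned = interpolate_speed_peaks(raw, max_speed=150.0)
```

A fixed threshold is the crudest possible detector; a rule based on the jump relative to neighbouring samples would be an alternative if genuine high-speed travel occurs in the data.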
As can be seen in Table 6, the variables related to movement do not have as large an effect as the variables related to space and time. To test whether the addition of accelerometer data affected the results in any crucial way, the same model was run without the variables AX, AY and AZ. The results can be found in Tables 7 and 8. As can be seen in Table 7, the model had a lower R-value of 0.680 without the accelerometer data, meaning that the model fits the dataset less well. The significance of some of the coefficients also changed, as can be seen in Table 8, mainly for the speed at which the participant was travelling, but also for elevation. Notable changes are, for example, in Marathon, where the significance value dropped from 0.714 to 0.114, meaning that it is more accurate without the accelerometer data, while the opposite is true for Walking, where it increased from 0.335 to 0.515. The significance value for Running also increased just enough to fall outside the 90% confidence level. Another factor that changed is elevation, where the significance went from the 99% level down to the 90% level. This means that this factor was also affected by the addition of the accelerometer data. That is less obvious than the changes to the speed factors, since elevation is a location-based factor that should affect the participant regardless of motion and should probably affect the heart rate only through psychological effects. This assumption is made because the elevation data only captures the surface elevation (calculated as described in Section 3.1.2) and not the elevation within buildings, so movement up and down stairs should not affect the elevation factor in this experiment.
Given the results shown in this section, we can start to uncover which possible impacts of the built environment can be collected and derived using the tool. However, this is based only on the data collected through the device used in this study, meaning that it is not currently possible to compare this model to any ground truth data. As Gorny et al. (2017) and Benedetto et al. (2018) suggest, the aggregated data could nevertheless be deemed usable to show indications of the coefficients. Looking at how well the model fits the dataset in Tables 5 and 7, the exclusion of accelerometer data shows that the accelerometer data does have an impact on the validity of the model. Whether this is an actual effect or specific to this data set has yet to be examined through further data collection. Section 5 discusses what these coefficients might indicate.
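The comparison between the model with and without AX, AY and AZ amounts to comparing the multiple correlation R of two nested least-squares models. The sketch below shows that comparison in plain Python. It is not the SPSS procedure used in the study, the helper names are hypothetical, and the small data set is synthetic and made up purely to make the example runnable.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def multiple_r(rows, y):
    """Fit ordinary least squares with an intercept and return R = sqrt(R^2)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))


# Synthetic example only: columns are (speed, one accelerometer axis)
SYNTHETIC_ROWS = [(0.0, 0.1), (1.0, 0.4), (2.0, 0.2), (3.0, 0.9), (4.0, 0.3), (5.0, 0.7)]
SYNTHETIC_HR = [60.0 + 2.0 * s + 5.0 * a for s, a in SYNTHETIC_ROWS]

if __name__ == "__main__":
    r_full = multiple_r(SYNTHETIC_ROWS, SYNTHETIC_HR)
    r_reduced = multiple_r([(s,) for s, _ in SYNTHETIC_ROWS], SYNTHETIC_HR)
    print(f"R with accelerometer column:    {r_full:.3f}")
    print(f"R without accelerometer column: {r_reduced:.3f}")
```

For nested least-squares models fitted to the same data, removing predictors can never raise R, so a drop such as the one from 0.716 to 0.680 shows that the removed variables explained some variance, but not by itself whether that contribution is meaningful.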

Discussion
Given the information presented in Section 4, Sections 5.1 and 5.2 will now describe what these coefficients might imply and which possible errors may have affected the outcome, respectively.

Implications
By looking at Table 6, it is possible to see how much each variable affects the heart rate of the person. Variables related to movement, such as speed and physical exertion, seem to have less of an impact on the heart rate, while variables related to location and time seem to have more. As the heart rate fluctuates based on these controlled external factors, one can presume that they might be related to psychophysiological indicators of what a person is experiencing in a certain situation. As we can see, there is an increase in heart rate while being at work as opposed to travelling, suggesting that there are features in the work environment that induce stress and therefore heart arousal. This can be argued to be caused mainly by psychological factors, as all the participants had desk-oriented jobs. Looking further at how homes affect the participants, we again see an increase in the heart rate, which might be caused by other psychological factors, as all participants were adults who have to deal with chores and other everyday activities that might not always be mentally relaxing. This is similar to the effects found by Vrijkotte, van Doornen, and de Geus (2000), who found that leisure time would increase the heart rate even more than work time.
Even presuming that the participants' overall condition at home is positive, we cannot assume that an increase in heart rate automatically means something negative or positive, as an increase in heart arousal could be caused by excitement or stress, to name a few reasons. In the following subsections, the implications of physical movement, speed, location, time, and activity will be discussed with regard to heart rate and possible psychophysiological signals.

Physical movement effects
By comparing Table 5 (with accelerometer data) and Table 7 (without accelerometer data), we see that the model is less reliable without the accelerometer data since the R-value is lower. This means that the accelerometer data, and therefore the physical movement of the subject's arms, is related to the heart rate. By controlling for this, we get more reliable results, which tell us more about how space and time affect the subject. However, the exclusion of accelerometer data does reveal a possible reason why the effect of speed is nonlinear and inconsistent. One could assume that the heart rate would increase as the speed increases, as this would mean that the subject is moving faster, either by exerting more physical strength or by manoeuvring a vehicle. Even though one can draw the conclusion that the lower heart rates at higher speeds must be caused by vehicles, it would be safe to say that by controlling for arm movements, one should be able to distinguish whether the vehicle is controlled by the subjects themselves or driven by someone else. Given that the effect of the two fastest speed categories is more accurate when accelerometer data is removed (Table 8) than when it is controlled for (Table 6), one could argue that the subject was using a mode of transport that did not require any physical input from them, for example public transport or carpooling. The next subsection discusses these speed effects further. Another factor that is affected by the inclusion or exclusion of accelerometer data is the elevation. This factor shows a higher impact as well as less accuracy when the accelerometer data is removed. One could argue that this is because of the physical exertion required to climb the elevations, but that effect seems to be less accurate without the accelerometer data, which seems reasonable given the knowledge that physical exertion raises the heart rate.
However, there is currently no factor that controls for the change in elevation, given that the speed is only calculated flat along the surface of the earth. This means that the elevation data, which consists only of the absolute elevation at the location where each data point was collected, should only be able to capture indirect physical effects, as the change in elevation over time would be needed to see its direct physical effects.
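The missing control variable mentioned above, the change in elevation over time, could be derived from the absolute elevations already collected. The following Python sketch shows one way this might be done. It is not part of the study: the function name is hypothetical, and timestamps in seconds and elevations in metres are assumed.

```python
def elevation_rates(times_s, elevations_m):
    """Return the vertical rate (m/s) of each point relative to the previous one.

    The first point has no predecessor and gets None, as does any point
    whose timestamp does not advance.
    """
    rates = [None]
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt > 0:
            rates.append((elevations_m[i] - elevations_m[i - 1]) / dt)
        else:
            rates.append(None)
    return rates
```

Positive values would indicate climbing and negative values descending. Since the elevation in this study refers to the ground surface only, such a variable would still not capture movement on stairs inside buildings.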

Speed effects
As can be seen in Table 6, speed does have an impact on the heart rate of the participants, which is to be expected. Speeds related to slightly heavier physical exertion, such as brisk walking and running, increased the heart rate the most among the speed-related categories. However, faster speeds were associated with a decrease in heart rate, which could be considered abnormal if the data set could prove that the participants were achieving these speeds by themselves. Since there is no control for the use of vehicles, this could be caused by, for example, using public transport. However, if we look at Table 8, we see that the reliability of low-speed movements was lowered when accelerometer data was not used, while the reliability of high-speed movements increased. This might be because, without the accelerometer in the calculation, fewer variables affected by physical exertion are controlled for, which further explains why the effects of lower speeds are higher than the effects of higher speeds, as mentioned in the previous subsection. That walking showed a slight increase in heart rate over being stationary was to be expected, but the fact that the higher speeds, which might have been achieved in vehicles, decrease the heart rate compared to standing or sitting still is noteworthy. One could argue that this could be treated as psychophysiological data, showing that the participants, with all other collected factors controlled, enter a psychological state that decreases the heart rate when travelling by vehicle, but further research is needed to establish that. On the other hand, as both Wang et al. (2017) and Gorny et al. (2017) have shown, PPG sensors tend to lose accuracy under extreme physical exertion, which running at such speeds would require. That could also explain why running shows a smaller increase in heart rate than walking and why running speed differs so much in significance from the other variables.
This opens up a discussion regarding how to handle data collected during more extreme physical exertion, as it might provide useful information but might not be as reliable. Further research could focus on finding a cut-off point at which the data becomes too unreliable and should be disregarded.

Location effects
The results of this study imply that location has an effect on the heart rate when controlling for speed and physical exertion. Compared with the work of Vrijkotte, van Doornen, and de Geus (2000), this study's results are similar to the measurements they collected from people working in low-imbalance conditions. They also found that although work might increase the heart rate of their participants, leisure time would increase it even more, just as can be seen in Table 6 of this study. However, they measured absolute values, which creates a possible bias if the participants in the group had heart rate ranges that differed from one another, and would then give an inaccurate result. Due to the similarity in trends, this could imply that it is now possible to collect psychophysiological data with means as simple as smart devices collecting biometric information in a non-intrusive way. If the claims of Benedetto et al. (2018) and Gorny et al. (2017) regarding the external validity of aggregated data collected through wrist-worn PPG devices are true, it would mean that further studies such as those of Jerčić, Sennersten, and Lindley (2018) and Göbel, Springer, and Scherff (1998) could be performed with methods that do not affect the participants' perception of their situation, and thus avoid the influence errors that might arise from constantly reminding the participants that their data is being collected.

Time and activity effects
Time also seems to influence the participants' health in terms of heart rate fluctuation. The heart rate was generally higher during 8-10, 12-14 and 18-20 as opposed to 10-12 and 14-18. Given the work situations of the participants, the high heart rate times could indicate the effect of mobilising oneself during the commute to work, going out to get lunch and commuting back from work, while the low heart rate times could correspond to time spent at the desk, in meetings or lecturing. One way to make this data more specific would have been a manual log in which the participants stated their current activity, much like what Axhausen et al. (2002) did when analysing trends of movement. That would, however, defeat the purpose of a non-intrusive method for collecting health effects of the surrounding environment, as the participants would start to reflect on their actions. A method more similar to the automated travel diaries described by Prelipcean et al. (2018) could be a solution for better understanding which actions are performed, in order to control for said factors. This could possibly be done by refining the accelerometer data and using that information to infer what type of activity is being performed, much like how Prelipcean et al. predict travel modes. As described in Section 5.1.1, a comparison of Tables 5 and 7 shows that controlling for accelerometer data does affect how well the model fits the data set. Including the accelerometer data among the controlled variables seems to give a better model, however slight that improvement might be.
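The time slots named above (8-10, 10-12, 12-14, 14-18 and 18-20) can be turned into a categorical variable and summarised per slot. The Python sketch below illustrates this. The half-open interval boundaries, the 'other' label for hours outside the listed slots and the function names are assumptions, not details taken from the study.

```python
from collections import defaultdict

# Time slots as named in the text; each is treated as start <= hour < end (assumption)
SLOTS = [(8, 10), (10, 12), (12, 14), (14, 18), (18, 20)]


def slot_label(hour):
    """Map an hour of the day (0-23) to its time-slot label."""
    for start, end in SLOTS:
        if start <= hour < end:
            return f"{start}-{end}"
    return "other"  # hypothetical catch-all for hours outside the listed slots


def mean_hr_by_slot(samples):
    """Average heart rate per time slot.

    samples -- iterable of (hour_of_day, heart_rate) pairs
    """
    totals = defaultdict(lambda: [0.0, 0])
    for hour, hr in samples:
        acc = totals[slot_label(hour)]
        acc[0] += hr
        acc[1] += 1
    return {label: total / count for label, (total, count) in totals.items()}
```

Such raw slot means are only descriptive. The coefficients discussed in the text come from the regression, where the time variables are estimated while the other factors are controlled for.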

Possible errors
This study is not an attempt to map any specific locations to any health impact at this point in time, but is rather meant as a step towards a non-intrusive method for collecting and analysing how people are affected by the built environment that surrounds them. Because of this, the study had a few limitations that could affect the outcome of the work. The most obvious limitation is that the whole data analysis relies on the premise that the data collected from the device, the formula for analysis and the validation can be considered reliable. There was no calibration against any clinical devices, but the assumption was that the aggregated data from several participants over a longer time would still give a good indication of the ground truth, as Benedetto et al. (2018) claim. No survey was analysed alongside the data to learn more about the effects of the participants' behaviour and actions, meaning that the results rely solely on the data collected and deduced from the device itself. The clustering used in this study relied on knowledge of the participants' work and living situations. For future studies, an automatic clustering method would be preferable, as it would not limit the number of locations that can be used when analysing the effects, making it possible to scale up the data analysis almost without limit.
The accelerometer data was collected with a sensor for which no information on accuracy or native sample rate is available. However, this sensor is only meant to indicate whether there is any physical activity during the collection of a data point. In future studies, the three separately analysed accelerometer axes could be replaced by a single variable whose value is based on all the motion recorded in the accelerometer or gyroscope data, given that the sample rate is so low anyway.
The limited data set leaves much to be desired when it comes to drawing externally valid conclusions from the results. What has been shown is that it is possible to collect biometric information with a commercially available product, and that this information can be controlled and analysed to deduce psychophysiological and health effects on the participants involved in the study.
Another assumption that might complicate the analysis is that the subjective perception of height and speed is similar among all participants. In such a small data set it is hard to neglect such factors, as one participant might be afraid of heights or high speed while the others might find the same factors exciting or neutral. This is something that one could control for either with a questionnaire exploring subjective perception, or by having a data set large enough that this factor can be assumed to be random and negligible.
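One simple candidate for the single motion variable suggested above is the Euclidean magnitude of the three accelerometer axes. The Python sketch below is an assumption about how such a variable could be formed, not something used in the study. It does not remove the gravity component or use gyroscope data, and the units depend on the sensor.

```python
import math


def motion_magnitude(ax, ay, az):
    """Combine the three accelerometer axes into one non-negative value.

    This is the Euclidean norm of the acceleration vector, so it is
    independent of how the device is oriented on the wrist.
    """
    return math.sqrt(ax * ax + ay * ay + az * az)
```

Replacing AX, AY and AZ with one such value would reduce three regression coefficients to one, at the cost of losing any information carried by the direction of the movement.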

Conclusion
The aim of this paper is to show that biometric data can be used, while controlling for movement and time, as a step towards a non-intrusive method for collecting psychological and health effects of the built environment on people through psychophysiological data analysis. By using smart watches equipped with hardware and software for collecting biometric information as well as positional data and auxiliary information, it has been shown what we can expect from state-of-the-art commercially available equipment. Within the limited data set collected from three participants, we can see trends and indications of how different locations, times and activities might affect the heart rate differently. By controlling for factors that affect the heart rate, an attempt to analyse psychophysiological data has been made, showing promising results when compared to select studies in the field. Further research should include a larger population and a refined prediction model for different activities, to better separate the effects caused by the surrounding environmental factors from those caused by the actions of the participants. Further investigation is also needed regarding the cut-off points for reliability in data collected through PPG sensors.