Development of Dutch occupancy and heating profiles for building simulation

ABSTRACT Building simulations are often used to predict energy demand and to determine the financial feasibility of low-carbon projects. However, recent research has documented large differences between actual and predicted energy consumption. In retrofit projects, this difference creates uncertainty about the payback periods and, as a consequence, owners are reluctant to invest in energy-efficient technologies. The differences between the actual and the expected energy consumption are caused by inexact input data on the thermal properties of the building envelope and by the use of standard occupancy data. Integrating the diversity of occupancy patterns and the variability in behaviour into building simulation can potentially foresee and account for the impact of behaviour on building performance. The presented research develops and applies occupancy and heating profiles for building simulation tools in order to create more accurate predictions of energy demand and energy performance. Statistical analyses were used to define the relationship between the seven most common household types and occupancy patterns in the Netherlands. The developed household profiles aim at providing energy modellers with reliable, detailed and ready-to-use occupancy data for building simulation. This household-specific occupancy information can be used in projects that are highly sensitive to the uncertainty related to return on investment.


Introduction
The building stock in the Netherlands consists of 7.5 million dwellings (CBS, 2014). Dwellings of the post-war period account for approximately one-third of the residential stock (Itard & Meijer, 2008); a large number of these properties are in need of renovation. Housing associations are important stakeholders in this context. There are approximately 400 housing associations in the Netherlands that manage 2.4 million residential properties, constituting 34% of the total housing stock (Aedes, 2013). Dutch housing associations have the ambition of achieving an energy rating of C for 80% of their properties and an average rating of B by 2020 (Aedes, 2013), while currently the average rating for post-war buildings according to AgentschapNL (2011) is D-E (approximately 350-400 kWh/m²/year primary energy), resulting in an expected energy consumption of approximately 20 000 kWh/dwelling/year. Therefore, the energy retrofit of post-war buildings offers great potential for carbon reductions. However, there is a lack of fast, affordable and robust processes for large-scale building renovation. This problem is magnified in multi-family rented buildings in which the incentives for saving energy and increasing indoor comfort are split between owners and tenants, therefore increasing the risk of a large gap between the predicted and actual energy consumption.
This study focuses on a retrofit approach that is currently under development by a consortium of academic and industry partners in the Netherlands. It addresses the challenges of retrofitting the existing building stock and is sponsored by the European Union Climate-KIC's flagship Building Technology Accelerator (BTA) project and the Dutch TKI/Energy programme. To support the transformation of the built environment, the BTA aims to stimulate the large-scale dissemination and acceleration of new low-carbon technologies into the market. This paper focuses on the challenge related to the effect of building operation and occupants' behaviour on the energy expectations of renovation projects, thus tackling the so-called prebound effect (Sunikka-Blank & Galvin, 2012). The prebound effect refers to a gap between the expected and the actual energy consumption caused by households using less energy than expected before the renovation due to the lack of consideration of actual behaviour of buildings' occupants. This effect has implications for the economic viability of energy retrofit programmes (Sunikka-Blank & Galvin, 2012). For example, the payback periods for low-carbon technologies would be longer than calculated. The goal of this research is the development of occupancy and heating profiles that can be applied to building simulation tools to predict more accurately and to fine-tune the energy performance of the building.
The objective of this study is to define more accurate occupancy profiles per household type that can lead to more accurate predictions of energy demand. More certainty on the occupancy behaviour before a retrofit could potentially help to reduce the financial risks associated with the prebound effect. The rebound effect is not tackled in this phase of the project, since measures to reduce it should be implemented in the post-renovation phase of the process. The rebound effect is thus outside the scope of this paper.

Influence of occupant behaviour in building simulation
Energy simulation tools can be used during the design phase to predict energy demand and help designers choose and size different fabrics (for the external envelope) and mechanical systems (Azar & Menassa, 2012). However, recent research has widely documented the differences between the actual and the predicted energy consumption (Virote & Neves-Silva, 2012), which are thought to be caused by faults in the building envelope or in the commissioning of the systems, by occupants' behaviour differing from what was assumed, and by the interaction between occupants and building technology. According to Yu, Fung, Haghighat, Yoshino, & Morofsky (2011), energy consumption is determined by climate, building characteristics, occupants' behaviour, socio-economic factors and indoor environmental quality. While the impact of climate, building characteristics and indoor environmental quality requirements can be readily investigated and tested in current building simulation software, the impact of user-related characteristics and occupant behaviour is still not fully incorporated into simulation tools.
It is important to understand both the existing behaviour and the drivers causing the behaviour (Wei, Jones, & de Wilde, 2014). Researchers have found significant relationships between occupancy characteristics and socio-economic factors (Guerra-Santin & Itard, 2010). Employment, house ownership, income and educational level have been found to have an effect on energy consumption. However, some factors depend greatly on the country of study. For example, McLoughlin, Duffy, and Conlon (2012) used household social class as an indicator of income and found that higher professionals (high and intermediate managers and professionals) consume more electricity per household per year than middle and lower social classes (supervisory positions, skilled, semi-skilled and unskilled workers, the unemployed) in the UK, while Guerra Santin, Itard, and Visscher (2009) found no relationship between income and energy consumption in the Netherlands.
Therefore, occupancy profiles and occupant behaviour not only differ per household type but also can vary between regions. Regional responsive data can help to achieve better predictions (Al-Mumin, Khattab, & Sridhar, 2003). According to Kane, Firth, & Lomas (2015), understanding heating patterns in British homes is crucial for energy policy formulation, the design of new controls and heating systems, and for accurate stock modelling. Therefore, the development of occupancy profiles for the specific region of study is necessary.
Integrating the diversity of occupancy patterns and the variability in behaviour into building simulation can potentially foresee and overcome the impact of behaviour on building performance (Stokes et al. cited in Richardson, Thomson, & Infield, 2008; Lee & Malkawi, 2014). Occupancy is considered to have a great influence on occupants' heating and ventilation behaviour, as well as on electricity consumption patterns (D'Oca & Hong, 2015). Therefore, the determination of occupancy profiles and of heating and ventilation patterns that more accurately reproduce building operation is considered crucial in the area of building simulation (Johansson, Bagge, & Lindstrii, 2011; Virote & Neves-Silva, 2012).
In this context, occupancy behaviour refers to how the building would be operated (heating, air-conditioning, ventilation systems), what would be the occupancy level (number of people present at a determined time), and what would be the internal heat gains related to the presence and use of lighting and appliances (Hopfe & Hensen, 2011; Ryan and Sanquist, 2014).
Several models have integrated the influence of occupants' behaviour into building simulation programmes, but they focus only on a limited set of parameters, for example a simplified and schematic representation of the operation of heating controls or windows (Azar & Menassa, 2012; Lee & Malkawi, 2014; Wei et al., 2014; Yu et al., 2011). In addition, current simulation tools, for either energy performance certification or design, lack an approach to evaluate the impact of occupants' characteristics (Martinaitis, Zavadskas, & Motuziene, 2015).
A number of building simulation studies have focused on understanding the effect of occupants' behaviour on specific designs or low-carbon technologies. For example, occupancy profiles can be defined with a specific purpose such as improving the design of buildings (Flores Larsen, Filippin, Beascochea, & Lesino, 2008), improving the efficiency of ventilation systems (Johansson et al., 2011), or determining the influence of specific internal or external building conditions (Ampatzi & Knight, 2012).
However, there is no standard method to assign the heating set-point for building simulation. Occupancy patterns are defined from standards or estimates (Wei et al., 2014). For example, the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE) Standard 90.1-2004 provides standardized occupancy factors for different building types, which can be used in design when occupancy schedules are unknown.
In current simulation tools, occupancy level and intensity of use of appliances and lighting are considered for the calculation of internal heat gains, while building operation is included as a 'standard' or 'average' fixed schedule for the thermostat setting and window operation (Lee & Malkawi, 2014; Wei et al., 2014). Heating and ventilation patterns that reflect the diversity of occupancy profiles followed by different households could produce more realistic and reliable predictions (Kane et al., 2015).
Several studies have focused on the development and use of occupancy profiles. These determine occupancy input based on surveys or datasets. For example, Santos Silva and Ghisi (2014) analysed uncertainties in building simulation through a probabilistic approach. Uncertainties of the user behaviour and physical parameters were obtained through a literature review and field survey. Martinaitis et al. (2015) performed an analysis on the effect of domestic occupancy profiles on the performance of energy-efficient houses and assessed the applicability of default simulation software occupancy profiles. The daily occupancy patterns were created according to the Harmonized European Time Use Survey.
Conventional statistical analysis has been used and reported extensively in this area of research. The main purpose of these studies has been to identify relationships between different factors affecting energy consumption. For example, regression analysis has been used to identify factors influencing energy use and their relative importance. For a complete review on these studies, see Guerra Santin et al. (2009).
Engineering models use information such as appliance power ratings and end-use characteristics to build a bottom-up description of electricity consumption patterns (McLoughlin et al., 2012). In engineering models, appliance, lighting and electricity load profiles are generated using either metered data or a combination of time-use data, appliance ownership and power information about the appliances. McLoughlin et al. (2012) give some examples in their review (see also Capasso et al., 1994; Wilden & Wackelgard, 2010; Yao and Steemers, 2005).
Machine-learning algorithms have been more recently used in the area of building simulation to develop occupancy profiles. Occupancy profiles can be classified into deterministic models and stochastic models. In deterministic schedules, a standard day profile is usually the same for all weekdays and both weekend days. For these models, data-mining can be used to obtain information on user-building interaction. Depending on the available data, this method assumes no change in occupancy schedules throughout the year (Duarte et al., 2013). Other studies have focused on the development of stochastic occupancy profiles with data from monitoring campaigns. Diversity profiles, generated by these models, represent typical probability profiles and are derived from long-term monitored data. The probabilistic models generate random non-repeating daily profiles of occupancy for a long-term (annual) building performance simulation (Mahdavi & Tahmasebi, 2015). These models can be used to randomly generate multiple building occupancy patterns to evaluate the uncertainties related to occupant behaviour. For these models, diverse machine-learning algorithms are used, such as Markov chains or artificial neural networks (Davis & Nutter, 2010; Jovanovic et al., 2015; Virote & Neves-Silva, 2012). Prediction models aim to generate artificial occupancy patterns that are similar to the actual (measured) patterns. Thus, the limitation of studies using monitoring data is that the mined or predicted occupancy profiles are circumstantial to the given dataset (D'Oca & Hong, 2015; Virote & Neves-Silva, 2012).
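The distinction between deterministic and stochastic occupancy profiles can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the fixed day profile, the hourly transition probabilities and the function names are hypothetical, and a two-state Markov chain is used only as the simplest example of a stochastic generator.

```python
import random


def deterministic_profile(day_profile, n_days):
    """Repeat one fixed 24-hour presence profile for every simulated day."""
    return [list(day_profile) for _ in range(n_days)]


def stochastic_profile(p_arrive, p_leave, n_days, seed=0):
    """Generate daily presence profiles with a two-state Markov chain.

    p_arrive: assumed hourly probability of going from absent (0) to present (1)
    p_leave:  assumed hourly probability of going from present (1) to absent (0)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    state = 1  # assumption: occupants are at home at midnight on day one
    days = []
    for _ in range(n_days):
        day = []
        for _ in range(24):
            day.append(state)
            if state == 1:
                state = 0 if rng.random() < p_leave else 1
            else:
                state = 1 if rng.random() < p_arrive else 0
        days.append(day)
    return days


# Hypothetical fixed day: at home except between 08:00 and 17:00
fixed_day = [1] * 8 + [0] * 9 + [1] * 7
det = deterministic_profile(fixed_day, n_days=7)
sto = stochastic_profile(p_arrive=0.2, p_leave=0.1, n_days=7)
```

The deterministic generator returns the same day seven times, whereas the stochastic generator yields days that generally differ from one another, which is the property exploited when uncertainty related to occupant behaviour is evaluated.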
In building-simulation studies, the method used to define occupancy depends on the purpose of the study and the availability of the data. The following sections describe the approach taken in this study.

Methods and data
As there are large differences in energy consumption between households, it is very important to get clear insights into the relationship between type of occupancy and energy use. For example, in the context of this research, these insights will help to assess the feasibility of the 'zero-on-the-meter' (in Dutch: nul-op-de-meter) target in the retrofit of apartment complexes, to evaluate the effectiveness of technological measures, and to reduce the risks of unexpected energy bills. In the Netherlands, zero-on-the-meter is defined as a building (usually renovated social residential buildings) in which the yearly building-related and user-related energy consumption in MJ equals the generated energy in the building and surrounding area (RVO, 2015).
This study consists of the definition of household types and corresponding occupancy patterns. The process to define household types and occupancy patterns can be seen in Figure 1. As a first step, household types are defined as the most representative household typologies in a national sample in terms of demographics. As a second step, occupancy patterns are defined with exploratory factor analysis. Occupancy profiles are defined, in the context of this study, as a set of building operation patterns, for example heating patterns, ventilation patterns and presence at home. As a third step, analysis of variance (ANOVA) tests are used to determine household profiles as the specific occupancy patterns followed by a determined household type. Household profiles are determined based on the relationship between household types and occupancy patterns. The main goal of the household profiles is to characterize the intensity of the use of the building, installations and appliances.
The occupancy patterns linked to the household types will be used to calculate the expected energy consumption through building simulation. Different combinations of household profiles can be used to determine worst- and best-case occupancy scenarios, as well as average scenarios. The occupancy scenarios can be examined to determine whether the energy targets are reached in all instances. The results can be compared with the results from a common approach to calculate energy consumption (i.e. using an 'average' household). These results would indicate how realistic the energy targets of a given project are. These steps are, however, outside the scope of this paper. The definitions of household types, occupancy patterns and household profiles are presented in the following sections.
This investigation into the effect of the diversity of occupancy and behaviour profiles is intended to be integrated into renovation processes. Therefore, the method to determine the expected building performance accounting for household variation should be predefined, readily usable and representative for the region of study (the Netherlands).
It is anticipated that housing associations would be able to determine in advance the household's typology in the building to be renovated based on their client portfolio and, thus, deterministic occupancy profiles per household are preferred to stochastic and predictive models. Deterministic models would be also more easily and readily applied to building simulation tools. The use of survey self-reported data is in this case preferable to monitored data because of the complexities of collecting data in buildings to be renovated; however, the authors acknowledge the fact that self-reporting data are not exempt from errors. Furthermore, the use of survey data with a large number of cases is preferable to monitoring data based on a limited number of dwellings, as the aim is to investigate the impact of different households representing the variability within the country.
Since simulation tools only focus on building-related energy demand, the profiles discussed here are only related to space heating and ventilation. Occupancy (the presence of people at home), lighting and appliances use are defined only to calculate internal heat gains.
To develop country-representative occupancy and heating patterns, a nationwide dataset is used. Statistical analyses were used to determine the most common types of households in the Netherlands. The Woononderzoek Nederland (WoON) dataset 2012 (see www.rijksoverheid.nl) was used to carry out this analysis. It is based on a nationwide survey carried out by the Dutch Ministry of the Interior and Kingdom Relations (BZK). The WoON dataset 2012 is the third survey carried out; the first and second surveys were carried out in 2006 and 2009 respectively. (The WoON dataset 2015 is not yet available.) The goal of the survey is to determine how Dutch people live and want to live. The dataset includes information regarding household composition, housing needs, energy consumption and building operation. The advantage of using this dataset is that it has been previously used for behavioural research (Guerra Santin, 2010; Jeeninga, Uyterlimde, & Uitzinger, 2001) since the dataset is openly available to researchers. In addition, the survey was carried out several times, and so the type of data collection and data coding has improved at every iteration.
The dataset consists of the compilation of 4800 dwelling audits and over 69 000 household questionnaires, which are also linked to external data (Tigchelaar and Leidelmeijer, 2013). The building audits aim to gather data on building characteristics, while the household questionnaire collects data regarding occupants' behaviour and household characteristics, among others. The WoON dataset (version 2006) has previously proved useful in the study of occupant behaviour in residential buildings (Guerra-Santin & Itard, 2010).

Results
This section presents the results of the statistical analyses to define the household types, occupancy patterns and household profiles.

Household types
The WoON dataset was used to determine household types in relation to their size, composition, age, and the absence or presence of seniors and children, which are important variables for energy consumption (Guerra-Santin & Itard, 2010). Eleven types of households were identified in the sample. Four groups were too small in the sample and therefore were not further studied. Table 1 shows the descriptive statistics of the groups.
ANOVA tests were conducted to investigate the relation of these types of households with electricity, gas and water consumption, the latter as an indicator of domestic hot water use (see Table 2 for descriptive statistics). The results showed that gas consumption (F(6,16 080) = 659.1, p < 0.001 Welch statistic), electricity consumption (F(6,16 059) = 3054.8, p < 0.001 Welch statistic) and water consumption (F(6,15 546) = 73059.5, p < 0.001 Welch statistic) are statistically significantly different for the seven types of households. Post-hoc Tukey comparisons were used to ascertain differences between specific households in energy and water use.
For gas consumption, post-hoc comparisons showed that there are statistically significant differences between all groups except between 'one senior' and 'two seniors', 'two seniors' and 'nuclear family', and 'two seniors' and 'three adults'. For electricity consumption, post-hoc comparisons showed that there are statistically significant differences between all groups. For water consumption, post-hoc comparisons showed that there are statistically significant differences between all groups except between the groups 'two adults' and 'two seniors'.  Figure 2 shows that one-person households use the least amount of gas followed by single-parent households, while larger households and those with two seniors use more gas. Figure 3 shows that for electricity the important factors are household size and the presence of children. For water consumption, the main determining factor is household size.
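As an illustration of how a Welch statistic of the kind reported above is obtained, the following sketch implements the standard Welch one-way ANOVA formula using only the Python standard library. It is not the authors' code (the software used for the analysis is not stated), and the three groups of gas consumption values are hypothetical. Each group is assumed to contain at least two values with non-zero variance.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's one-way ANOVA for groups with unequal variances.

    Returns (F, df1, df2). A p-value additionally requires the F
    distribution, which is not available in the standard library.
    """
    k = len(groups)
    n = [len(g) for g in groups]
    m = [mean(g) for g in groups]
    # Weight each group by n_i / s_i^2 (sample variance)
    w = [n_i / variance(g) for n_i, g in zip(n, groups)]
    w_sum = sum(w)
    weighted_mean = sum(w_i * m_i for w_i, m_i in zip(w, m)) / w_sum
    between = sum(w_i * (m_i - weighted_mean) ** 2
                  for w_i, m_i in zip(w, m)) / (k - 1)
    lam = sum((1 - w_i / w_sum) ** 2 / (n_i - 1) for w_i, n_i in zip(w, n))
    f_stat = between / (1 + 2 * (k - 2) * lam / (k ** 2 - 1))
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


# Hypothetical yearly gas consumption (m3) for three household types
one_adult = [900, 1100, 1000, 950, 1200]
two_seniors = [1500, 1700, 1600, 1900, 1400]
nuclear_family = [1600, 1800, 2100, 1700, 1500]

f_stat, df1, df2 = welch_anova([one_adult, two_seniors, nuclear_family])
```

With seven household types the first degree of freedom is 7 − 1 = 6, which agrees with the F(6, …) values reported in the text.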

Occupancy patterns
This section defines the occupancy patterns that Dutch households are more likely to follow.
Occupancy patterns are defined as the use of the heating system, opening windows, preferences for temperature settings and presence at home. To define the occupancy patterns for heating, it is assumed that households with similar composition will show the same occupant behaviour regardless of other socio-economic variables. This assumption allows an investigation of regional household profiles. In addition, research has shown larger effects of socio-economic variables on electricity use than on energy for space heating.
Exploratory factor analysis is a technique used to reduce the number of variables, and it can help to determine related behaviours. The variables used refer to self-reported heating-related behaviour at home, namely: presence at home, thermostat setting, use of radiators and ventilation while heating (Table 3).
Factor analysis describes the variability among variables in terms of factors. The behaviour factors resulting from the analysis (groups of related variables) were further analysed in relation to the intensity of behaviour they represent and their relation to the previously determined household types. According to Field (2005), a factor can be described in terms of the variables measured and the relative importance of these variables to that factor.
Eighteen variables were used in the analysis. They were first examined to determine whether factor analysis was a suitable method, by examining the correlations between them. All variables correlated at least .3 with other variables, thus suggesting reasonable factorability. The initial eigenvalues showed that the first factor explained 20.1% of the variance, the second 17.3%, the third 10.1%, the fourth 7.7%, the fifth 6.6% and the sixth 6.2%. Factors 7-18 could each explain less than 5%. After examining the eigenvalues of each of the resulting factors, and analysing the scree plot, the solution that included six factors and explained 68% of the variance was preferred. The factor loading matrix (contribution of each variable to the solution) and communalities (common variance shared with other variables) are shown in Table 4. Scores were created for each factor based on the mean of the variables that have their primary loadings on each factor. The composite scores were named after the variables contributing to each factor. The factors represent the following occupancy behaviours: Presence at home, Day temperature, Setback temperature, Radiators in bedrooms, Ventilation while heating and Radiators in service rooms (Table 5).
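The reported percentages can be checked with a short calculation. The sketch below sums the variance explained by the six retained factors and converts each percentage into the eigenvalue it implies. The conversion assumes that the analysis was run on a correlation matrix, so that the total variance equals the number of variables (18); this is an assumption, as is the comparison with the common rule of thumb of retaining eigenvalues above 1, which the text does not state was used.

```python
n_variables = 18
explained_pct = [20.1, 17.3, 10.1, 7.7, 6.6, 6.2]  # values reported in the text

# Running total of explained variance for factors 1 to 6
cumulative = []
total = 0.0
for pct in explained_pct:
    total += pct
    cumulative.append(round(total, 1))

# Eigenvalue implied by each percentage (assumes a correlation matrix,
# so that the eigenvalues sum to the number of variables)
eigenvalues = [round(pct / 100 * n_variables, 2) for pct in explained_pct]

# Share of variance that corresponds to an eigenvalue of exactly 1
threshold_pct = round(100 / n_variables, 2)

print(cumulative[-1], eigenvalues, threshold_pct)
```

Under these assumptions the six retained factors together account for the 68% stated in the text, and each of them lies above the share of variance that corresponds to an eigenvalue of 1, whereas factors explaining less than 5% lie below it.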
In order to maintain a large number of cases for further analysis, missing values were replaced with the mean (Table 3). However, since this method could suppress the true value of the standard deviation (SD), pairwise analysis was also executed to make sure that replacing the missing values with the mean did not affect the results. The results of both analyses were very similar and, thus, the results of the first analysis are used.
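The reason why mean replacement can suppress the standard deviation is shown in the following minimal sketch. The thermostat settings are hypothetical values, not data from the WoON survey: filling the missing cases with the mean leaves the mean unchanged but adds cases with zero deviation, which lowers the SD.

```python
from statistics import mean, stdev

# Hypothetical reported thermostat settings (degrees C); None marks a missing answer
reported = [18.0, 19.0, 20.0, 21.0, 22.0, None, None, None]

# Only the cases that were actually observed
observed = [v for v in reported if v is not None]

# Mean replacement: every missing value is filled with the observed mean
fill_value = mean(observed)
imputed = [fill_value if v is None else v for v in reported]

sd_observed = stdev(observed)
sd_imputed = stdev(imputed)

print(mean(observed), mean(imputed), sd_observed, sd_imputed)
```

Comparing the imputed analysis with one based only on available cases, as the pairwise check does, is therefore a reasonable safeguard.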

Household profiles
The household profiles are the specific occupancy patterns followed by a determined household type. To determine the household profiles, analysis of variance (ANOVA) tests were carried out between the factor scores (occupancy patterns) and the household types (for statistics, see Table 6). All behavioural factors were statistically significantly different between household groups, except for factor 5, Ventilation while heating (Table 6, column 1). Previous studies have also failed to find a statistical correlation between ventilation habits and household types (Guerra-Santin & Itard, 2010), suggesting too little variability in ventilation patterns between Dutch households. The second, third and fourth columns of Table 6 show the household types scoring lower, average or higher on each factor, representing the intensity of the behaviour per household type. This clustering was made in accordance with the ANOVA post-hoc Tukey tests. This information was used to identify the intensity of the use of the building and building systems (e.g. thermostat setting, use of radiators, ventilation, presence). Figure 4 summarizes graphically the results from the ANOVA tests, showing the factor scores (columns) for each of the household types (colours). It shows that seniors (singles and couples) and nuclear families tend to spend more time at home, while adults (especially single adults) spend less time at home. The thermostat setting in senior households seems to be the highest, while adults tend to set their thermostat lower. Single adults seem to have the lowest thermostat setback, while nuclear families and single seniors have the highest thermostat setbacks. Households with children seem to heat the bedrooms more frequently, while households with two seniors, three adults and nuclear families tend to heat service rooms such as the kitchen and bathroom more frequently. Ventilation preferences seem to be similar in all household types; only the single-parent households seem to differ from other households, ventilating more frequently while the heating is on.

Definition of occupancy patterns for building simulation
To develop the occupancy patterns, this study is based on the dynamic building simulation programme Bink DYWAG, which has been developed according to NEN-EN-ISO 15255, 15256, 13792 (see Binksoftware.nl). The household profiles have been defined in accordance with the required input values in this software. In the software, the authors can define specific heating patterns per day, week, month or year, as well as the presence of people, heat gains and artificial lighting and appliances use in each room.
Note on the factor analysis: The Kaiser-Meyer-Olkin measure of sampling adequacy was .742, above the recommended value of .6. The diagonals of the anti-image correlation matrix were all above .5, supporting the inclusion of each item in the factor analysis. Finally, the communalities were all above .3, further confirming that each item shared some common variance with other items. Given these overall indicators, factor analysis was conducted with all 18 variables.
In the Netherlands, individual rooms are usually heated by radiators fitted with thermostatic radiator valves (TRVs); the valves modulate the flow to the radiator in response to the locally sensed temperature, enabling different rooms to achieve different temperatures (Kane et al., 2015). From previous studies, it is known that in Dutch houses the radiators are usually left closed or half open in the least-used rooms (Guerra-Santin & Itard, 2010). In addition, authors have found that large amounts of energy are wasted due to unoccupied space. In order to take into account the influence of the thermostatic valves in the simulation, more than one thermostat is defined per household, reflecting the state of the radiator in a room as open, semi-open or closed. A similar approach has been followed by Monetti, Fabrizio, and Filippi (2015). For each household profile, up to three thermostat programmes are defined; each thermostat can be linked to different rooms depending on the household type and building layout. For example, a first thermostat set to 22°C can be linked to the living room (or the room with the thermostat) where the radiators are kept completely open; a second thermostat set to 16°C can be linked to the kitchen, bathroom and other rooms where radiators are left closed; and a third thermostat set to 19°C can be linked to the bedrooms where the radiators are kept half open.
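The three-thermostat example above can be written down as a small data structure. This is a sketch only: the class, field and room names are hypothetical and do not correspond to the input format of the simulation software. The set-points and radiator states are those given in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thermostat:
    """One virtual thermostat representing a radiator state in a group of rooms."""
    radiator_state: str   # "open", "semi-open" or "closed"
    set_point_c: float    # set-point temperature in degrees C
    rooms: tuple          # rooms linked to this thermostat (hypothetical layout)


# Example household profile following the values quoted in the text
profile = [
    Thermostat("open", 22.0, ("living room",)),
    Thermostat("closed", 16.0, ("kitchen", "bathroom")),
    Thermostat("semi-open", 19.0, ("bedroom 1", "bedroom 2")),
]


def set_point_for(room, thermostats):
    """Return the set-point of the thermostat that a room is linked to."""
    for thermostat in thermostats:
        if room in thermostat.rooms:
            return thermostat.set_point_c
    raise KeyError(f"room not linked to any thermostat: {room}")


living_room_set_point = set_point_for("living room", profile)
```

Linking rooms to thermostats in this way allows the same household profile to be reused for different building layouts by changing only the room lists.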
The Bink simulation programme does not allow the specification of natural ventilation patterns per hour; natural ventilation can only be defined based on outdoor and indoor temperature. Therefore, the windows are simulated as closed during the winter.
As previously stated, each household profile was defined based on household type and their relationship with the occupancy patterns (defined with factor analysis). For each household, the intensity of the behaviour (e.g. thermostat setting, presence at home) was determined based on the results of the ANOVA tests carried out between household type and the occupancy patterns (shown in Table 6). For example, a household type scoring higher in temperature setting would have a higher intensity behaviour for thermostat setting (i.e. temperature setting is higher) than a household with a low score. The household profiles are defined in terms of the presence in the dwelling, thermostat setting, thermostat setback, use of radiators and natural ventilation frequency (when the heating is on). The use of appliances and artificial lighting is based on the presence of occupants in the dwelling. Table 7 shows the resulting household profiles, which consist of a relative measure for intensity of behaviour (e.g. seniors use higher set-points than singles). The actual input values for the simulation are obtained from descriptive statistics from the same dataset (Table 8). The input values are defined in the following section and summarized in Table 9.

Presence
The household profiles consist of the schedule for the presence for a whole week. The presence of the occupants is based on the mean number of days that the occupants reported to be at home. It was assumed that all households were more often at home at the beginning of the week and on weekdays than on weekends since previous research has shown that households have an irregular schedule at weekends. This assumption has, however, no implications for the results of the simulation, but it simplifies the input into the software. To determine the number of people present in a room, the rooms of the building were categorized as (1) living area (living room and kitchen), (2) sleeping area (bedrooms), and (3) short-presence spaces (corridors, bathrooms). The short-presence areas were considered to be always empty, while the living area was considered to be occupied during day hours, and sleeping areas during night hours. In the case of singles and couples living in a two- or three-bedroom dwelling, the rest of the bedrooms were considered to be unoccupied, while for households with more than two adults, the bedrooms were considered occupied during day and night. Table 10 shows the occupancy patterns for each household type for common areas (living room) and bedrooms (0 = absence, 1 = presence).
Table 9. Definition of specific occupancy profiles for building simulation.
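The presence rules described in this section can be sketched as a weekly schedule of hourly flags. The hours that count as night and the hours of absence on days away from home are assumptions made for illustration; the actual values are those of Table 10, which is not reproduced here. The function names are hypothetical.

```python
NIGHT_HOURS = set(range(0, 7)) | {23}   # assumed sleeping hours (23:00-07:00)
AWAY_HOURS = set(range(8, 18))          # assumed absence on days away (08:00-18:00)


def day_schedule(area, at_home):
    """Return 24 hourly flags (0 = absence, 1 = presence) for one area."""
    if area == "short-presence":
        return [0] * 24  # corridors and bathrooms are treated as always empty
    flags = []
    for hour in range(24):
        if area == "sleeping":
            flags.append(1 if hour in NIGHT_HOURS else 0)
        else:
            # Living area: occupied during day hours unless the household is away
            away = (not at_home) and hour in AWAY_HOURS
            flags.append(0 if hour in NIGHT_HOURS or away else 1)
    return flags


def week_schedule(area, days_at_home):
    """Place the days spent at home at the beginning of the week."""
    return [day_schedule(area, day < days_at_home) for day in range(7)]


# Hypothetical household that reports being at home three days per week
living_week = week_schedule("living", days_at_home=3)
sleeping_week = week_schedule("sleeping", days_at_home=3)
```

Placing the days at home at the start of the week mirrors the simplification described in the text, which is said not to affect the simulation results.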

Internal heat gains
For internal heat gains, the use of lighting and appliances was defined based on the presence of people.
In instances in which people are present in the room, the appliances and lighting are considered to be in use. Two appliance and lighting use patterns per household type were generated: a 'best-case design' in which the use of natural light is maximized and thus the artificial lighting demand is determined by the time of the day and presence (artificial light is not used in the absence of people or during daytime); and a 'poor natural light design' in which artificial light is determined only by the presence of people (except in the night-time). The selection of the scenario to be employed would depend on the renovation requirements of the project. Table 10 (background colours) shows the appliances and lighting profiles for each household type in the 'best-case design' pattern.
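The two lighting scenarios described above can be sketched as a simple rule on presence and hour of the day. This is a minimal illustration, not the implementation used in the study: the function name, the scenario labels and, in particular, the daytime and night-time hour ranges are assumptions chosen for the example, since the text does not specify them.

```python
# Hypothetical hour ranges; the paper does not state the exact hours used.
DAYTIME_HOURS = set(range(8, 18))          # assumed hours with usable daylight
NIGHT_HOURS = set(range(0, 7)) | {23}      # assumed sleeping hours


def lighting_in_use(present: bool, hour: int, design: str) -> int:
    """Return 1 if artificial lighting is assumed on, 0 otherwise.

    design: "best_case" (natural light maximized) or
            "poor_natural_light" (lighting follows presence only).
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not present:
        # No artificial light in the absence of people, in either scenario.
        return 0
    if design == "best_case":
        # Light is not used during daytime.
        return 0 if hour in DAYTIME_HOURS else 1
    if design == "poor_natural_light":
        # Light follows presence, except in the night-time.
        return 0 if hour in NIGHT_HOURS else 1
    raise ValueError(f"unknown design: {design}")
```

Applied hour by hour to a presence schedule, a rule of this kind yields a 0/1 lighting profile of the same shape as the presence profile.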

Heating (thermostat setting and radiators use)
Two different target temperatures can be defined in a thermostat: the set-point (or comfort) temperature and the setback temperature. In smart thermostats, the setback can be low enough to allow switching off systems and so save energy but high enough so that the house can be heated again in a reasonable amount of time (Kleiminger, Mattern, & Santini, 2014). However, the setback temperature in houses with manual or programmable thermostats depends on the preferences of occupants.
Three thermostats were defined for the simulation: the living room thermostat (where the actual thermostat would be located), the radiators-in-bedrooms thermostat, and the radiators-in-other-rooms thermostat. These three thermostat settings aim at reflecting the use of radiators in different rooms of the dwelling.
To determine the input value in the simulation programme, descriptive statistics per household were used (Table 8). The results of the ANOVA post-hoc analysis determined the descriptive statistic to use as an input. For the households with middle factor scores (> −0.1 and < 0.1), the thermostat setting was defined as the mean reported thermostat setting; for households with higher factor scores (> 0.1), the thermostat setting was defined as the mean + 1 SD; and for the households with lower factor scores (< −0.1), the thermostat setting was defined as the mean − 1 SD.
The input value for the thermostat setting in the living room thus consists of the statistic defined by the ANOVA post-hoc test between factor 2 (thermostat setting) and household type. For example, for the nuclear family, the thermostat setting for Monday at 10:00 hours is the mean value of all households defined as 'nuclear family' in the dataset, for the time slot 09:00-12:00 hours.
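The selection rule for the thermostat input value can be written compactly. The sketch below is illustrative only: the function name and arguments are hypothetical, and scores exactly equal to ±0.1 (which the rule as stated leaves undefined) are treated here as middle scores by assumption.

```python
def thermostat_input(factor_score: float, mean: float, sd: float,
                     threshold: float = 0.1) -> float:
    """Pick the thermostat input value from the household's factor score.

    factor_score: mean factor score of the household type on the
                  thermostat-setting factor.
    mean, sd:     mean and standard deviation of the reported thermostat
                  setting for that household type and time slot.
    """
    if factor_score > threshold:
        return mean + sd      # higher-intensity behaviour: mean + 1 SD
    if factor_score < -threshold:
        return mean - sd      # lower-intensity behaviour: mean - 1 SD
    return mean               # middle scores (boundaries included by assumption)
```

For instance, with an illustrative mean of 20.0 °C and SD of 1.5 °C (not values taken from Table 8), a household type with a factor score of 0.3 would be assigned 21.5 °C, and one with a score of −0.3 would be assigned 18.5 °C.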
thermostat settings in the living room (T1) for each household type.
To define the temperature settings in the bedrooms and in the other rooms, the results from the ANOVA analysis were used to define households likely to turn on the radiators in bedrooms and service rooms (factors 4-6). The temperature for radiators open was considered as equal to the main thermostat settings; the temperature for radiators closed was equal to the setback setting or (in case of households with no thermostat setback) the lowest temperature in the main thermostat schedule. The temperature for radiators half open was defined as equal to the average between the highest and the lowest temperature setting per household type. The heating profiles for bedrooms and service rooms are shown in Table 11 (T2 and T3 respectively).
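The three radiator states can be mapped to set-point schedules as sketched below. This is a hedged illustration rather than the procedure used in the study: the function and argument names are hypothetical, and the 'half open' value is computed from the highest and lowest values of the main thermostat schedule, which is one possible reading of 'highest and lowest temperature setting per household type'.

```python
from typing import List, Optional


def radiator_schedule(state: str, main_schedule: List[float],
                      setback: Optional[float] = None) -> List[float]:
    """Derive a room set-point schedule from the main thermostat schedule.

    state:         "open", "half_open" or "closed".
    main_schedule: main (living room) thermostat settings per time slot.
    setback:       setback temperature, or None if the household
                   reports no thermostat setback.
    """
    if not main_schedule:
        raise ValueError("main_schedule must not be empty")
    lowest, highest = min(main_schedule), max(main_schedule)
    if state == "open":
        # Radiators open: follow the main thermostat settings.
        return list(main_schedule)
    if state == "closed":
        # Radiators closed: setback, or lowest main setting if no setback.
        value = setback if setback is not None else lowest
        return [value] * len(main_schedule)
    if state == "half_open":
        # Radiators half open: average of highest and lowest setting
        # (assumed here to be taken from the main schedule).
        return [(highest + lowest) / 2] * len(main_schedule)
    raise ValueError(f"unknown radiator state: {state}")
```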

Ventilation profile
Differences between household types in ventilation behaviour while heating were not found to be statistically significant. Table 12 shows the descriptive statistics for the natural ventilation frequency during the winter period in the dataset. Nearly 50% of the respondents for each household type reported always using natural ventilation during the winter. The percentage of households in each frequency category was very similar. Thus, for the occupancy profiles developed, it is assumed that all household profiles have the same ventilation behaviour at all times. Figure 5 shows the complete profile for a 'single senior' household. It comprises profiles for presence, artificial lighting use and thermostat setting for the living room (or the place where the thermostat is located), bedrooms and other rooms. The profiles show the thermostat settings in degrees Celsius, and the presence (1) and absence (0) of people and artificial light per hour and day of the week.

Validation of household profiles
The profiles developed in this study aim at discerning the differences in behaviour between household types on a national sample. Although the household profiles are not completely related to energy consumption due to the effect of building characteristics, a certain level of correlation is expected between the profiles and gas consumption. Therefore, in a first attempt to validate the profiles created, Pearson correlation tests between the factors (occupancy patterns) and gas consumption were carried out (Table 13). The results show small but statistically significant correlations between gas consumption and all factors except Radiators in bedrooms. The lack of correlation between Radiators in bedrooms and gas consumption seems to stem from the low variance of this behaviour within the sample.
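For reference, the Pearson coefficient between a factor score and gas consumption can be computed as in the following minimal sketch. It covers only the coefficient itself, not the significance test, and the function name is hypothetical; any statistical package would serve equally well.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

The guard for constant samples reflects the point made above: a behaviour with little variance within the sample cannot show a meaningful correlation with gas consumption.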
More important than the absolute energy consumption per household is the relative difference in the intensity of behaviours between household types, thus looking at behaviour rather than at the influence of building characteristics (such as dwelling size). Figure 6 shows the relationship between gas consumption per household type and the household profiles developed in this study. It shows that households with more intensive heating behaviours (i.e. bars towards the right), namely one senior, two seniors and a nuclear family, show higher gas consumption than their counterparts of the same household size (i.e. one adult, two adults, a single parent). The higher gas consumption of larger households (three adults and households with children) will be evident in the results of building simulations, when the number of spaces heated is considered.
Given that the household profiles in this investigation are generated using statistical analysis of self-reported data (i.e. the respondents reported on their own behaviour), it would be necessary to validate the results with data from building monitoring campaigns in terms of measured behaviours per household type. A companion paper will deal with the development of household profiles based on monitoring data, and their comparison with the profiles developed in this study.
It is important to add that the development of occupancy and heating profiles in this paper aimed at determining household-specific profiles, not at predicting occupancy patterns or energy consumption (i.e. stochastic models). The approach followed is deterministic and descriptive in nature, and thus the use of statistical data allows generalizations to be made to the population of study.
Figure 6. Relationship between household profiles and gas consumption per household type.

Discussion
Seven household profiles were developed based on statistical analysis with the aim of providing nationwide occupancy input data for building simulation. The use of national statistical data allows the results to be generalized. The profiles developed are made up of information known to have an effect on energy consumption, and of information needed for input in the building simulation program Bink, although similar information is required in most simulation programs. The household profiles developed aim to reflect the lifestyle and preferences of seven representative household types in the Netherlands, with the objective being to determine the effect of different household characteristics during the design phase of buildings. It is important to add, however, that these profiles could slightly change if a specific sector of the population is under consideration, for example in projects directed to social rental properties, where households with lower incomes are the target group. Future research should aim at defining these differences.
The advantage of the household profiles developed in this study is related to the practicality of using deterministic occupancy data as input in building simulation programs. The relative simplicity of the method would allow its use in practice, especially in the design phases of construction or renovation processes, when fast iterations of calculations are required. Software libraries can be easily implemented to be employed in different projects and by different energy modellers.
The main disadvantage of this method is its reliance on self-reported questionnaire survey data. Previous research has found that self-reported behavioural data are not always accurate. However, the large sample sizes provided by these methods (which would be prohibitive in other methods) make it possible to create generalizations for the Dutch population. Further phases of this study aim to use monitoring datasets to validate the profiles. Therefore, the limitation of this study is related to the validation of the developed profiles with actual occupancy data, which could only be obtained through numerous and extensive monitoring campaigns. However, given that the profiles were determined based on a large dataset and with a random sampling in the population, they provide a much improved alternative to 'standard' occupancy profiles based on rules of thumb.
The results shown in this investigation are in line with trends found in other studies. For example, Kane et al. (2015) found that heating patterns vary depending on the age of households and employment status. Households over 60 years old or unable to work turn the heating on earlier in the year, heat longer each day and heat to higher temperatures in comparison with younger households and those in employment. Yohanis, Mondol, Wright, & Norton (2008) found that households over 65 years old are usually at home during daytime hours; young householders (less than 40 years) tend to have active evenings but low daytime consumption; and middle-age households (50-65 years) usually with children at home have higher electricity consumption in the evenings. This paper goes further by offering complete heating patterns per household type, integrating presence and heating-related behaviour.
The approach presented in this paper is intended for implementation (with some adaptations) in other countries in which datasets such as the one employed in this analysis might not be available. Therefore, to determine the patterns in a country without statistical information, or to validate the statistical patterns, building monitoring campaigns could be used. In addition, more information is needed regarding ventilation patterns. In the WoON dataset, around 50% of the households reported making use of natural ventilation during the winter; however, it is unclear whether the users completely open the windows or only use vents (the latter is a common ventilation practice in the Netherlands). Monitoring data could provide more information about these patterns.

Conclusions
Energy refurbishment approaches are attractive, not only from a CO 2 mitigation perspective but also from a financial point of view. For the acceptance by the end user and the feasibility of the business cases of these refurbishment approaches it is important that uncertainty about the actual energy consumption is minimized. Will the energy use be zero in practice? Today the differences in energy use between the households are huge. It is unhelpful to speak of an average household in this perspective. Therefore, it is important to understand the relation between occupancy and energy consumption.
In this research, occupancy patterns for energy consumption in the Netherlands were defined. Seven statistically defined household types were linked to occupancy patterns (building operation). Factor analysis and ANOVAs were used to define the relationship between the household types and the occupancy patterns.
The results showed that households with seniors and nuclear families have more energy-intensive heating practices than households with single adults or single-parent households. Households with two adults could be considered to be close to an average household. The differences in heating behavioural patterns seem to be caused by differences in lifestyle between households (e.g. hours present at home), by comfort preferences (e.g. senior households keep higher indoor temperatures) and household composition (e.g. presence of children). However, the less energy-intensive heating practices of the single-parent household might indicate that other household conditions could also be affecting the occupancy patterns for heating.
The use of statistics to determine the occupancy patterns proved useful to define the occupancy of a building when real information about the occupants is not available because of the building renovation schedule, a sensitive process, or because the building is unoccupied. This method can be applied to any type of building renovation project in the Netherlands, or even in new housing projects. The approach could also be used in other countries provided that datasets containing information about household demographics, building characteristics and occupant behaviour are available.
The household profiles developed in this study aim at providing energy modellers with reliable, detailed and ready-to-use occupancy data for building simulation input. Household type-specific occupancy information can be used in projects that are highly sensitive to the uncertainty related to payback periods and return on investment. By calculating the energy requirements per household type, the designers can make sound data-based decisions leading to energy targets that are true for all users, and not only for an average household.
The calculation of energy requirements taking into account the effect of household typology aims at reducing the gap between the expected and actual energy performance of buildings and at tackling and minimizing the consequences of the prebound effect in renovation projects.