Reference dependence in the UK housing market

Abstract The study of reference dependence in housing markets is of practical importance due to the unusual characteristics of property transactions, such as high information asymmetry caused by many individuals' lack of experience in housing markets. The overall low transaction frequency and general illiquidity of housing markets can exacerbate and reinforce behavioural anomalies such as reference dependence. The knowledge gained through an empirical investigation in the UK housing market can assist in the understanding of these behavioural biases. By conducting an online experiment on a UK online panel data platform, we identify the presence of reference dependence in the UK housing market, and the extent to which it is driven by both historical and recent prices. The influence of expectations and social norms is also investigated in this novel context. The findings of this study pave the way for reliable economic modelling of such anomalies and a better understanding of behaviours in the housing market.


Introduction
Traditional economic theory predicts that people will maximize their expected utility under uncertainty and seek to make a rational decision by considering all relevant information. However, observations and experimental evidence across various disciplines have highlighted the existence of behavioural biases in the decision-making process. Many researchers have attempted to integrate psychological understandings into classical economic models to provide an explanation for such observed irregularities.
Reference dependence, in the context of prospect theory (Kahneman & Tversky, 1979), is the most popular of these explanations. The concept presumes that individuals exhibit behavioural biases because they evaluate choice outcomes as gains or losses around a reference point. As reference points are vital to the categorisation of a gain and loss domain, and therefore to overall choice outcomes, it is beneficial to obtain a good understanding of how they are formed and how they can be adapted. It is especially pertinent to study reference dependence in housing markets due to their unusual characteristics, such as high information asymmetry caused by many individuals' lack of housing market experience. Indeed, this could be exacerbated in the UK market, which recently saw homeownership fall to its lowest level in over 30 years (Evans, 2017). Therefore, since Arkes et al. (2008) find that cultural factors bear some weight in reference point adaptations, this study of the UK housing market fills an important cultural gap in the literature. Moreover, the overall low transaction frequency and general illiquidity of housing markets can exacerbate and reinforce the observed anomalies. However, the knowledge gained through an empirical investigation in the UK housing market can assist in the attenuation of these distortions.
Based on the analytical framework in Paraschiv & Chenavaz (2011), where reference dependence was studied in the French housing market, we conduct controlled experiments to investigate the presence of multiple reference points in the UK housing market. Our study adds to the behavioural housing studies literature in the following ways.
Firstly, we segregate respondents into sellers, buyers who are homeowners, and buyers who are renters. This is to recognize the potential effect of homeownership on housing decisions. We design separate questionnaires for buyers and sellers, with each respondent taking only one role (i.e. a seller, an owner-buyer, or a renter-buyer). This avoids participants confusing their market roles, thus strengthening the robustness of the findings.
Secondly, in an important work on reference point formation and adaptation by Baucells et al. (2011), intermediate prices are the least important determinants of reference point formation and adaptation. However, Paraschiv & Chenavaz (2011) found that this is not true, at least in the French housing market. As the current literature is somewhat unclear on the influence of intermediate prices, we introduced two questions specifically testing the effect of historical peak and trough prices alongside the test of other intermediate prices.
Thirdly, we also considered two factors that are specifically relevant and important in housing markets: expected profits as reference points and the effect of social norms on the formation of reference points. At the time of writing, these factors have not been incorporated in reference point dependence studies in the housing market. However, given the fact that properties are seen as both consumption and investment goods, and are often used as a signal for social status, it is important to study the role of sellers' target profit level and the role of social comparison.
Finally, we used online panel data (OPD) in this study. This approach is a significant improvement over the traditional online survey method used in Paraschiv & Chenavaz (2011) in terms of sample representativeness and efficiency. An online panel is an electronic database of registrants who have indicated a willingness to participate in future web-based research studies. Examples of online panel data platforms include Amazon Mechanical Turk and Prolific. OPD and online panel participants have been gaining attention from researchers in social science studies. Given the cost and complexity of conducting behavioural studies in housing markets, it is worthwhile to explore the applicability of OPD in housing research. This is particularly true when the COVID-19 pandemic has already made virtual or distance activities the new norm for many aspects of our personal and professional lives. Our findings will be helpful for researchers to explore new data collection venues in response to these changes.
The rest of the paper is organized as follows. Section 2 provides a critical review of the literature. Section 3 details the analytical framework and the five hypotheses this study aims to test. Section 4 explains the experimental design and the justification for the data collection method. Section 5 presents the analysis of the data and reports the empirical results. The final section summarises the paper and offers areas for future research.

Behavioural housing studies
Although both Marsh & Gibb (2011) and Smith (2011) pointed out the potential of applying behavioural economics in housing studies, the challenges of finding suitable theories and empirical strategies are not to be underestimated. As Watkins & McMaster (2011, p. 286) said in the same special issue of Housing, Theory and Society, 'An urgent next step is to further elaborate on what an inter-disciplinary agenda might look like and how the insights generated might be reconciled to establish a coherent alternative (or addition) to the insights and stylized facts revealed by mainstream economists'.
To illustrate the many aspects of housing choice decisions that have been analysed with behavioural tools, we constructed Figure 1 based on the housing choice process diagram in Maclennan (1982). In this simplified version of a homebuyer's move-stay decision process, we listed important steps from the trigger factors of a moving decision to the final decision to move or stay. For each of the steps listed, we identified representative behavioural studies published between 2010 and 2020.
The housing choice process starts with some trigger factors, such as job changes and downsizing. Behavioural housing studies focus on psychological factors that trigger move decisions. For example, Andrew & Larceneux (2019) investigated how the emotions attached to owning a new-build apartment influence the decision to purchase off-plan apartments in France; Clark & Lisowski (2017) studied the role of endowment effects in decisions of whether to migrate or not; and Morrison & Clark (2016) discussed the relationship between loss aversion and the duration of residence. These psychological factors, along with other conventional triggers such as job changes, will lead an individual to collect information about how to finance the move, where to relocate, and what kind of property to buy. Studies show that personality and cultural factors significantly affect the choice of mortgage products (Ben-Shahar & Golan, 2014; Mori et al., 2010). The advantages of using behavioural theories to explain irrational housing search decisions have been explored in Cardoso et al. (2019) and Dunning (2017). Evidence has also been found that expectations play a significant role in the decisions and outcomes of mega-event- or regeneration-led relocations in China (Wang et al., 2015; Yan & Bao, 2018).
The most important stage of the housing choice process is the 'Price offer' step, where the homebuyer places bids on identified properties. A wide range of behavioural factors have been investigated when studying how price offers are determined. For instance, Arbel et al. (2016) found that past price information serves as a reference point in public housing tenants' decisions; Hung & So (2012) showed how much extra a loss-averse home buyer will pay for a house; and Levy et al. (2020) investigated the role of framing effects on homebuyers' estimation of future house prices. All of these behavioural factors can affect the likelihood that a price offer is accepted by the seller, and eventually whether the move is successful.
The papers included in Figure 1 are not a comprehensive list of behavioural housing studies in the last decade. They are chosen to demonstrate the scope of the literature, instead of its depth. To comment on the depth of behavioural housing studies literature, we need to focus on prospect theory, because it is the only behavioural economic theory that has been widely employed and rigorously tested in the housing studies literature.

Prospect theory and reference dependence
The most influential theory used to emphasize the importance of psychological aspects when modelling choice behaviour is prospect theory, developed by Kahneman & Tversky (1979). Prospect theory illustrates how people's observed preferences systematically deviate from the predictions of standard economic theory (SET hereafter). Three fundamental behavioural aspects for modelling individual decision-making processes are presented by prospect theory: reference dependence, loss aversion, and diminishing marginal sensitivity. According to prospect theory, the utility derived from consuming good/service X can be estimated by the value function below.
$$
v(X) =
\begin{cases}
(X - r)^{a}, & X \geq r \\
-k\,(r - X)^{b}, & X < r
\end{cases}
\tag{1}
$$

where $v(X)$ is the value function based on the consumption of $X$, and $r$ is the reference point. The reference point divides the value function into two parts: one for a gain domain where $X$ is greater than $r$ and another one for a loss domain where $X$ is below $r$. $k > 1$ reflects that individuals' value function in the loss domain is steeper than that in the gain domain (i.e. loss aversion). $0 < a < 1$ and $0 < b < 1$ capture the effect of diminishing marginal sensitivity.
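As a rough numerical illustration of loss aversion and diminishing sensitivity, the sketch below evaluates a power-form value function with the parameters $a$, $b$, $k$ and $r$ defined above. The power form itself, the function name `value`, the default parameter values (the estimates reported by Tversky & Kahneman in 1992) and the prices are assumptions made for illustration; none of them is estimated in this study.

```python
def value(x, r, a=0.88, b=0.88, k=2.25):
    """Prospect-theory value of outcome x relative to reference point r.

    Assumed power form: (x - r)**a for gains and -k * (r - x)**b for losses.
    The defaults are illustrative, not values estimated in this study.
    """
    if x >= r:
        return (x - r) ** a
    return -k * (r - x) ** b


# Hypothetical example in £1,000s: the reference point is a purchase price of 400.
reference = 400
gain = value(450, reference)  # outcome 50 above the reference point
loss = value(350, reference)  # outcome 50 below the reference point
print(gain, loss)
```

With $k > 1$, the loss of 50 weighs more heavily than the gain of 50, and with $a < 1$ a second gain of 50 adds less value than the first.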
Equation (1) shows that the role of reference point is instrumental for the applications of prospect theory. Reference dependence assumes individuals evaluate the outcome of a choice based on positive or negative changes relative to a reference point. This is contrary to SET which assumes individuals judge the outcomes of decisions on final states of wealth or assets. The observed inconsistencies between actual behaviour and the predictions of SET have been the subject of much academic interest in the real estate domain. For example, studies show that the behaviour of housing valuers and appraisers can be subject to behavioural biases. Appraisers tend to seek evidence in support of initial beliefs they form during the valuation process (Clayton et al., 2001).
Although there is ample empirical evidence on reference dependence, the literature comes to little agreement on the nature of the reference point. Indeed, in their formalization of prospect theory, Kahneman & Tversky (1979) do not provide guidance or rules for how reference points are formed. This makes the identification of reference points an empirical issue, which has attracted much research interest in the past decades.

The search for reference points in housing markets
We now focus on studies of reference dependence in the residential property market. A total of 13 publications are identified from searches in the Web of Science database. A few papers, such as Genesove & Mayer (2001), study loss aversion instead of reference point formation. However, the crucial first step of loss aversion analysis is to define loss and gain domains correctly. This inevitably involves the identification of reference points. Consequently, important loss aversion studies such as Genesove & Mayer (2001) are included in our literature review because they have significant implications for reference dependence research. We summarize the research design and findings of these papers in Table 1.
Several observations can be drawn from Table 1. Firstly, housing decisions involve multiple reference points. Although most of the reviewed papers employed one single reference point, the choice of the reference point varies among these studies. For example, initial transaction information is the predominant reference point identified among these studies. Ten out of the 13 studies use initial transaction information, such as purchase price (Anenberg, 2011; Einio et al., 2008; Genesove & Mayer, 2001; Leung & Tsang, 2013; Paraschiv & Chenavaz, 2011; Sun & Ong, 2014) and rent paid in previous cities (Simonsohn & Loewenstein, 2006), as a reference point. Decision makers also refer to the most recent transaction information when determining property prices or rents (Arbel et al., 2016; Hirota et al., 2020; Paraschiv & Chenavaz, 2011; Sun & Ong, 2014). Other significant historical information, such as all-time high or low prices, is a valid reference point as well (Paraschiv & Chenavaz, 2011; Seiler et al., 2008).
Secondly, the reference dependence literature tends to focus on either seller or buyer behaviour, with nine out of the 13 papers studying reference points used by sellers in their decisions. Paraschiv & Chenavaz (2011) add an important dimension to the current research on property market behavioural biases by identifying that the reference point is influenced by the buyer or seller role. However, whilst their findings are illuminating, it can be argued they are specific to the French housing market and therefore may lack general applicability. More empirical evidence is needed to verify their findings.
Thirdly, there is a geographic shift in terms of empirical investigations over the last two decades. Earlier behavioural housing studies draw conclusions primarily based on data from the US housing market (Anenberg, 2011; Bucchianeri & Minson, 2013; Genesove & Mayer, 2001; Simonsohn & Loewenstein, 2006) and EU countries (Einio et al., 2008; Paraschiv & Chenavaz, 2011). Recent investigations of reference dependence are mainly from Asian countries, such as China (Leung & Tsang, 2013), Israel (Arbel et al., 2014; Arbel et al., 2016), Singapore (Sun & Ong, 2014), and Japan (Hirota et al., 2020). Despite the UK's importance in the world economy in general and the global financial market in particular, empirical evidence from the UK is lacking. The only study using UK data is Scott & Lizieri (2012). However, the reference point investigated in the paper is a 'false' one: respondents' telephone numbers. Specifically, the paper shows that first-time homebuyers' pricing decisions are affected by irrelevant reference points (i.e. anchors). Although this is a useful exercise to verify findings from psychology studies in housing markets, the effects of valid reference points (e.g. purchase price and all-time high/low price) have not yet been verified in the UK property market.
To conclude, the identification of reference points in housing markets is not straightforward. There is more than one reference point at work, and reference point dependence may differ between buyers and sellers, as well as vary among geographic regions. There is also evidence that the effect of reference points is moderated by gender (Seiler et al., 2008), age and income (Arbel et al., 2014), and market conditions (Paraschiv & Chenavaz, 2011). In the next section, we propose an analytical framework that incorporates all of these critical factors in the study of reference dependence in property price determinations. The framework is subsequently tested by using experimental data from the UK.

Analytical framework
Assuming a buyer/seller is trying to determine the market price of a property, the following equation can be constructed based on SET.
$$
P = f(S, N, I) + e \tag{2}
$$

where $P$ is the price that the buyer/seller is willing to pay/accept for the property, $S$ is a collection of structural features of the property such as size and the number of bedrooms, $N$ is a matrix of neighbourhood characteristics such as property crime rate and the distance to the nearest primary school, $I$ denotes personal traits of the decision maker such as age and income, and $e$ is the error term. Prospect theory suggests that one or several reference points should be included in the general function $f(\cdot)$ in Equation (2). This gives us the following equation.

$$
P = f(S, N, I, R) + e \tag{3}
$$

where $R$ is a matrix of reference points. In the literature, Equation (2) has been routinely estimated with linear regression and transaction data (i.e. hedonic price modelling). Using the same technique and data to estimate Equation (3) can be problematic for three reasons. Firstly, $R$ is usually highly correlated with $S$, $N$, and $I$, and determined by $S$, $N$, and $I$ as well. This gives rise to multicollinearity and endogeneity problems, both of which will bias the estimates. Secondly, the multiple reference points included in $R$ are usually highly correlated among themselves too. It will be difficult to separate the net effect of each of the reference points considered. Finally, if any important attributes in $S$, $N$, and $I$ are missing, omitted variable bias could lead to unreliable estimation of reference dependence.
To address these issues, we adopt controlled experiments in our analysis. This method allows explicit control of the participants' decision-making environment, making it possible to isolate the impact of important variables and identify any causal relationship between them (Rabin, 2002). This extra influence and ability to isolate variables is what makes controlled experiments one of the most widely used research methods in behavioural economics. For example, most of the studies of judgement bias in real estate valuation have been generated using experimental settings (Klamer et al., 2017).
In controlled experiments, respondents are randomly allocated to a control group and a treatment group. The two groups are identical (on average) except for the treatment, such as the availability of historical transaction information. As a result, any difference in the average responses from the two groups can be attributed to the treatment, and a causal relationship can be established between the response and treatment variables. If we let $P_C$ and $P_T$ be the average prices of the control and the treatment group respectively, the difference between $P_C$ and $P_T$ will be entirely due to the effect of $R$. This is because the average values of $S$, $N$, and $I$ are identical between the two groups. As a result, $P_C - P_T = f(R)$. A rejection of the null hypothesis of $P_C - P_T = 0$, or $P_C = P_T$, means that $R$ has a significant impact on prices. In other words, reference dependence is present when $P_C \neq P_T$ or $f(R) \neq 0$.
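The comparison of the two group averages can be sketched as a simple two-sample test. The example below is only a minimal illustration: the function name `mean_difference_test`, the made-up prices and the use of a large-sample normal approximation with unequal-variance standard errors are all assumptions, not the procedure reported in this paper.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_test(control, treatment):
    """Return (P_C - P_T, z statistic, two-sided p-value).

    Uses unequal-variance standard errors and a normal approximation,
    an assumption that is only reasonable for large samples.
    """
    diff = mean(control) - mean(treatment)
    se = sqrt(variance(control) / len(control) + variance(treatment) / len(treatment))
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value


# Made-up prices in £1,000s, for illustration only.
control = [300, 310, 295, 305, 290, 300]
treatment = [330, 345, 320, 335, 325, 340]
diff, z, p = mean_difference_test(control, treatment)
print(f"P_C - P_T = {diff:.1f}, z = {z:.2f}, p = {p:.4f}")
```

A small p-value here would lead to rejecting the null hypothesis that the two average prices are equal, which is the condition under which the text infers reference dependence.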

Testable hypotheses
Our research is developed from the experimental design in Paraschiv & Chenavaz (2011), which is the only paper in the literature that considers multiple reference points in one setting, recognizes the role of buyers and sellers, and allows the effect of moderators such as market conditions to be investigated. We extend the work in Paraschiv & Chenavaz (2011) by considering the effect of expectations and social norms in reference point formation. A total of five hypotheses are derived as outlined below.
Hypothesis 1: Initial purchase price is used as a reference point
Often the most salient reference point is the price the property was initially purchased for, as it acts as a natural indicator for whether money has been gained or lost on a transaction. Although many reference points are context-dependent, initial purchase price is one of the few that has been proven significant across different contexts. For example, Leung & Tsang (2013) found support for initial purchase price as the reference point using empirical evidence from the Hong Kong property market. It would seem, therefore, that initial purchase price has been well-studied and proven to be used as a reference point in a variety of contexts.

Hypothesis 2: Intermediate prices are used as reference points
An intermediate price, for this study, is a price observed after the initial purchase price and before the current price of the property. This includes both linear increases/decreases and historical peaks/troughs in price. Paraschiv & Chenavaz (2011) determine that non-extreme intermediate prices have less of an impact than extreme ones, such as a historical peak, and show that individuals actually shift their reference points towards the historical peak. Yet, there are important empirical results that contradict this view. For example, Baucells et al. (2011) adapted the experimental design of Arkes et al. (2008), enabling them to directly observe the reference point formation process. The purchase price and current price of the time-series seem to be the main determinants of the participants' reference price, while intermediate prices are relatively unimportant. The implications of the current findings on intermediate prices are therefore somewhat contradictory. As such, this investigation of intermediate prices as reference points in a UK property market context is a useful and necessary extension of the current literature that can hopefully bring some clarity to the current findings.

Hypothesis 3: Recent prices are used as reference points
Recent prices take on a slightly different meaning for a buyer and a seller in this study. For a seller, a recent transaction price of a similar property may act as a reference point for their valuation of their own property, whereas for a buyer, knowledge of an alternative offer received for a similar property may act as their reference point. Consequently, for a buyer, information about an alternative offer for the property is introduced in the hypothetical scenarios, while information about a recent transaction price of a similar property is introduced for sellers. Paraschiv & Chenavaz (2011) find that the introduction of an alternative offer price shifts buyers' reference points further than when initial purchase price is introduced. This result suggests recent information is weighted more highly than past information in the formation of reference points, supporting the findings of Baucells et al. (2011).
We also investigate the effect on reference points of factors that have not previously been applied to a property market context. Specifically, this study will look at the role of aspirations in terms of an expected profit, and the moderating effect of social norms on reference dependence. Two more hypotheses are set up accordingly as follows.

Hypothesis 4: Expectations serve as reference points
The expected profit is defined as the level of profit that the seller would like to make between when they bought the property and when they eventually sell it. There is ample evidence from the labour market to suggest that workers, specifically those that are free to choose the hours they work in a day, tend to aim for a target level of income instead of maximizing their daily income (see, e.g. Camerer et al., 1997; Fehr & Goette, 2007). This seemingly illogical result can be explained by an expected daily income being used as a reference point. Home sellers will likely be affected by an expected profit because houses are both durable consumption goods and financial investments. In fact, this dual role of houses has significant implications for individuals' decisions regarding housing and non-housing consumption (Černý et al., 2010; Hung & So, 2012; Yang et al., 2018). For example, properties can enjoy unearned land value uplifts resulting from wider societal and infrastructural change in an area, which may occur despite no aspect of the property changing. Such market characteristics may result in agents expecting to sell their properties for more than they bought them. Therefore, it is important to verify whether expected profit affects reference point formation among home sellers in the UK.

Hypothesis 5: Social norms influence reference point formation
The effect of social norms, or the tendency for people to compare their lifestyles to the lifestyles of their peers, will be investigated as well. Decisions are rarely made in the socially isolated conditions adopted by most of the literature in this area, and hence the social context should be considered in the field of decision-making under risk (Linde & Sonnemans, 2012). One conceptualization of the reference price is that it represents a normative price, that is, the price the buyer believes to be 'fair' for the seller to charge (Bolton et al., 2003;Campbell, 1999). Viglia & Abrate (2014) build on earlier work that looked at price sequences (Ariely & Zauberman, 2000;Baucells et al., 2011), but control for the presence of social influence. They find that individuals appear to be averse to paying much more than their peers as they would categorize this as a 'loss' when using a social reference point. The nature of housing markets may enhance the likelihood of individuals making social comparisons. Goethals (1986) notes that where there is no physical reality to compare to, people tend to make social comparisons to satisfy their need to evaluate their opinions and abilities. Therefore, the infrequent trading and lack of transparency in property transactions may invite social comparison among housing market participants. Consequently, this investigation into whether reference point formation is influenced by social norms is beneficial for gaining an improved understanding of what drives the observed behavioural irregularities in the housing market.

Experimental design
To test the hypotheses outlined in Section 3, we set up a series of one-factor controlled experiments based on the framework in Paraschiv & Chenavaz (2011), which is the only online experiment among the 13 papers included in Table 1. Conducting behavioural experiments online has the benefit of reaching respondents other than university students, i.e. real decision makers. This is particularly important for housing decision studies, because university students are unlikely to be representative of the target population. We adjust their experimental design by segregating sellers and buyers (the latter being further divided into homeowner and renter subgroups). Specifically, Paraschiv & Chenavaz (2011) asked each respondent to answer questions for sellers first, and then questions for buyers. In our experiment, respondents take one and only one role: seller or buyer. This approach keeps the number of questions and the length of the experiment shorter, which is critical to obtaining reliable answers in online experiments; it also avoids the potential spill-over effect when respondents cannot quickly switch their role between seller and buyer. Admittedly, this approach substantially increases the cost of the experiment, but it is necessary to ensure the reliability of the study.
Although buyers and sellers are segregated, both are put through the same ten hypothetical scenarios with the wording changed slightly to reflect their market role. The first four questions are posed for a declining market and then repeated for a growing market (see Table 2). The seller questionnaire asked for the minimum price the respondent would be willing to sell the property for; the buyer questionnaire asked for the maximum price the respondent was willing to pay for the property. From this point onwards, the buyers' maximum buying price will be referred to as their 'willingness to pay' (WTP) and the sellers' minimum selling price will be their 'willingness to accept' (WTA).

Table 2

| Question | Information type | Market condition | Information provided |
|---|---|---|---|
| 1 | Benchmark Price | Declining | … |
| 2 | Initial Purchase Price | Declining | … |
| 3 | Intermediate Price | Declining | Two years ago, the property was worth £350,000 |
| 4 | Alternative Offer Price | Declining | A similar property has just sold for £250,000 |
| 5 | Benchmark Price | Growing | Similar properties trade between £450,000-£550,000 |
| 6 | Initial Purchase Price | Growing | Four years ago, the property was bought for £400,000 |
| 7 | Intermediate Price | Growing | Two years ago, the property was worth £450,000 |
| 8 | Alternative Offer Price | Growing | A similar property has just sold for £550,000 |
| 9 | Historical Peak | - | Four years ago, price of the property was £300,000. You wanted to sell (buy) but deal fell through. Two years ago, the property's price had risen to £400,000. Recent valuation of £350,000 by real estate expert. |
| 10 | Historical Trough | - | Four years ago, price of the property was £300,000. You wanted to sell (buy) but deal fell through. Two years ago, the property's price had fallen to £200,000. Recent valuation of £250,000 by real estate expert. |
| 11 | Expected Profits - Low | - | You bought a property for £500,000 several years ago. Set a target of earning at least £25,000 in profit when you sell. You now have to sell. Property currently valued between £525,000-£575,000. |
| 12 | Expected Profits - Mid | - | You bought a property for £500,000 several years ago. Set a target of earning at least £75,000 in profit when you sell. You now have to sell. Property currently valued between £525,000-£575,000. |
| 13 | Expected Profits - High | - | You bought a property for £500,000 several years ago. Set a target of earning at least £100,000 in profit when you sell. You now have to sell. Property currently valued between £525,000-£575,000. |

Note: All questions asked for the minimum price they would sell the property for. The buyer questionnaire asked the same questions 1-10 (questions 11-13 only used for sellers) as the seller questionnaire, with wording changed slightly to represent a buyer decision. The buyer questionnaire asked for the maximum price they would pay for the property. Where important, the wording of the buyer questionnaire has been included in parentheses in Table 2 to ensure clarity.
To determine the presence of a behavioural bias, the first step is to establish a benchmark price according to SET. Buyers and sellers would behave rationally and refer to this market price to form their WTP/WTA. This piece of information is provided to all respondents in Questions 1 (in a declining market) and 5 (in an increasing market) as a price range of a property under consideration. Due to our natural tendency to maximize utility from the transaction, we expect that the average response to Question 1 may be higher than that to Question 5. However, this potential discrepancy between sellers' and buyers' benchmark prices will not affect our analysis, because we focus on changes relative to these benchmarks.
Questions 1-4 and Questions 5-8 are used to test Hypotheses 1-3 in a decreasing and an increasing market condition, respectively. First, initial purchase price is introduced in Questions 2 and 6, followed by an intermediate price in Questions 3 and 7, and finally an alternative offer price (buyer) or a recent transaction price (seller) in Questions 4 and 8. The first four questions form three pairs, which are effectively three one-factor controlled experiments. For example, Questions 1 and 2 are the control and treatment groups to test the effect of initial purchase price. Let $WTA_{1i}$ and $WTA_{2i}$ be the answers from respondent (seller) $i$ to Questions 1 and 2, respectively. $WTA_{2i} - WTA_{1i}$ will be calculated for all respondents and used to test Hypothesis 1. Note that even if a respondent's WTA is affected by the housing prices in her own city, the size of this anchoring effect remains the same in $WTA_{1i}$ and $WTA_{2i}$ because they are reported by the same individual from the same city. Because the only difference between these two questions is the introduction of an initial purchase price, we can be confident that any change in reported WTA is due to the initial purchase price. The same logic applies to Questions 2 to 4, as well as Questions 5 through 8 for the increasing market condition. This incremental introduction of the information makes it possible to isolate the effect of specific information, as no other aspects of the scenarios are changed. Table 3 summarises which questions are used to test each hypothesis.
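The within-respondent comparison can be sketched as a paired test on the differences between each seller's answers to Questions 2 and 1. The example is a minimal illustration under stated assumptions: the function name `paired_t_statistic` and the answers are hypothetical, and the paper does not state which test statistic it uses for these differences.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Mean and t statistic of the within-respondent differences after - before."""
    if len(before) != len(after):
        raise ValueError("each respondent needs an answer to both questions")
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs), mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Made-up seller answers in £1,000s; element i belongs to respondent i.
wta_q1 = [300, 280, 320, 310, 295]  # benchmark price only
wta_q2 = [340, 300, 350, 330, 335]  # initial purchase price added
mean_shift, t_stat = paired_t_statistic(wta_q1, wta_q2)
print(f"mean shift = {mean_shift:.1f}, t = {t_stat:.2f} (df = {len(wta_q1) - 1})")
```

Because each difference is taken within one respondent, any respondent-specific influence that is constant across the two questions, such as local house prices, cancels out before the test is computed.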
Questions 9 and 10 are used to test Hypothesis 2 with respect to historical peak and trough prices. The two questions have the same initial purchase price of £300,000. However, a historical high/low price that is £50,000 above/below the current market valuation is introduced in these two questions. If the average responses to these two questions are significantly above/below the current market valuation, there is evidence of historical high/low prices being used as reference points. The seller questionnaire then has three additional questions (Questions 11–13) that introduce the concept of an expected profit that the respondent wants to earn when they eventually sell the property. Respondents are then given the range of market prices of similar properties and asked to state their WTA. Hypothesis 4 is tested by comparing the average responses to these three questions. Specifically, if the reported WTA increases as the target profit level changes from £25,000 to £75,000, expected profits are used as reference points.
Whilst the first four hypotheses verify the role of different reference points in house price determination, Hypothesis 5 checks the moderating effect of social norms on these relationships. To test Hypothesis 5, we use the differences between reported WTA/WTP and SET predictions (i.e. behavioural biases) as the dependent variable, and social norm measurements as the independent variables in regression models. The coefficient estimates of the social norm variables, if statistically significant, can be used to gauge the direction and strength of the moderating effect. We design three questions to determine the respondent's propensity for social comparisons. First, respondents were presented with a list of peers, for example 'a close friend' or 'a colleague', and were asked to select those to whom they had compared their lifestyle. They were then asked how often they made these lifestyle comparisons and, on a scale of 0 to 10, how important it was that they were better off than the person(s) they had previously selected. The definitions of these variables, along with other control variables in the regression models, are given in Table 4.

Online panel data
We used the online panel data (OPD) platform Prolific to conduct the experiment online. The number of published papers in social sciences that are using OPD has surged in recent years (Porter et al., 2019). Yet some researchers remain sceptical about the use of OPD platforms. Firstly, critics argue that participants are non-naïve to the studies, citing the emergence of 'professional survey takers' who participate in a high number of experiments and subsequently become aware of researchers' motives. However, studies have shown evidence that crosstalk between participants, as well as respondents intentionally attempting to participate in one study repeatedly, is nearly non-existent (Chandler et al., 2014).
Secondly, there are fears that the quality of online sample data is inadequate due to respondents' inattentiveness or lack of effort. However, substantial evidence suggests that attention levels of online participants either meet or exceed those from traditional sources (Behrend et al., 2011; Buhrmester et al., 2011; Crone & Williams, 2017; Goodman & Paolacci, 2017; Ramsey et al., 2016).
Finally, the representativeness of online participants has been questioned, but the evidence compellingly suggests that online samples are more representative of typical working adults than the traditional university student samples used in many behavioural experiments (Crone & Williams, 2017; Goodman & Paolacci, 2017; Peer et al., 2017). Indeed, Casler et al. (2013) compared the results of online crowd-sourced respondents with in-lab respondents who both completed a behavioural task. They determined the responses of the two groups to be equivalent and found the crowd-sourced participants were significantly more representative.
The most widely used OPD platform is Amazon Mechanical Turk (MTurk for short). We choose Prolific over MTurk because of its abundance of UK respondents. MTurk's respondents consist predominantly of US (75%) and Indian (16%) citizens (Difallah et al., 2018). Prolific's members, on the other hand, are primarily from the UK. As our objective is to collect empirical evidence from the UK housing market, Prolific provides us with access to the target population. Moreover, the extent to which we could pre-screen our sample was one of the main benefits of using Prolific. The platform allows us to pre-select respondents, classify them into homeowner and renter sub-samples, and conduct the experiment separately for each (see Note 1). This filter facilitates the test of differences between the effects on first-time homebuyers and experienced homebuyers.

Empirical findings and discussions
In the last two columns of Table 4, we summarize the profiles of the seller, the owner-buyer and the renter-buyer subsamples respectively. Respondents in the first two groups are very similar in all personal traits considered. Respondents in the renter subsample are relatively younger and less educated, with lower monthly income and housing expenditure. For example, none of the 195 renters paid more than £1,500 per month for housing. These patterns are consistent with the conventional profiles of homeowners and renters. It is worth noting that the proportion of females in all three subsamples is consistently high, i.e. between 69% and 75%. This is certainly not representative of the whole population in the UK. However, it is an indication of the profile of online panel members at Prolific, and perhaps at UK panels more generally. Although there is no evidence of a significant gender imbalance in other OPD platforms, the UK OPD platforms might be dominated by female members. Our empirical findings should be interpreted with this caveat in mind (see Note 2).
Table 5 summarises responses from the three subsamples respectively. The 'SET Predictions' column contains the prediction of SET based on the information given in Questions 1 and 5 (i.e. the average market prices). The 'Manipulated Information' column gives the new information provided in each corresponding question (further information can be found in the last column of Table 2). The 'Average Reported Prices' column presents the average responses to each corresponding question. The 'Deviation from SET Predictions' column contains the difference between figures in the 'Average Reported Prices' column and the 'SET Predictions'. Finally, the 'Price Distribution' panel shows how reported prices in each question are distributed around the SET predictions, providing further information for the 'Deviation from SET Predictions' column. Specifically, the 'Lower', 'SET Predictions', and 'Higher' columns contain the proportions of responses that are below, equal to, and above the SET predictions respectively. Figures in the last four columns of Table 5 indicate clear deviations from the SET predictions in all questions across the three subsamples.
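The summary columns described above can be computed along the following lines. This is a minimal sketch: the function name `summarise_against_set`, the dictionary keys, the choice to report the deviation both in pounds and as a percentage of the SET prediction, and the sample responses are all assumptions made for illustration.

```python
def summarise_against_set(responses, set_prediction):
    """Summarise reported prices for one question relative to the SET prediction."""
    n = len(responses)
    mean_price = sum(responses) / n
    return {
        "average_reported_price": mean_price,
        "deviation": mean_price - set_prediction,
        "deviation_pct": 100 * (mean_price - set_prediction) / set_prediction,
        # Shares of responses below, equal to, and above the SET prediction
        "lower": sum(r < set_prediction for r in responses) / n,
        "equal": sum(r == set_prediction for r in responses) / n,
        "higher": sum(r > set_prediction for r in responses) / n,
    }


# Hypothetical responses (in GBP) to one question; illustrative only.
responses = [280_000, 300_000, 300_000, 320_000, 330_000]
summary = summarise_against_set(responses, set_prediction=300_000)
print(summary)
```

The three shares always sum to one, so a large 'lower' or 'higher' share shows at a glance in which direction the average deviation is driven.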
To verify the statistical significance of the biases identified in Table 5, we perform paired two-sample t-tests on each of the question pairs given in Table 2, and one-sample t-tests on the difference between average responses to Questions 9–10 and their corresponding SET predictions. This evidence is used to test Hypotheses 1 through 4, and the results are reported in Table 6. Hypothesis 5 is tested by using linear regressions, with differences in responses between question pairs in Table 3 (i.e. for Questions 1–8 and 11–13) or differences between average responses and SET predictions (i.e. for Questions 9–10) as the dependent variable, and social norm measurements as the key independent variables. The general model specification is given below.
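A specification consistent with the description in this section can be sketched as follows. The coefficient symbols and the vector names $\mathbf{SocialNorm}_i$ and $\mathbf{Controls}_i$ are assumptions introduced here for illustration; only the transformed dependent variable and the two groups of regressors are stated in the text.

```latex
\ln\bigl(\mathrm{abs}(Bias_i) + 1\bigr)
  = \alpha
  + \boldsymbol{\beta}'\,\mathbf{SocialNorm}_i
  + \boldsymbol{\gamma}'\,\mathbf{Controls}_i
  + \varepsilon_i
  \tag{4}
```

Here $\mathbf{SocialNorm}_i$ would collect the social comparison variables and $\mathbf{Controls}_i$ the demographic dummy variables listed in Table 4.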
The definition of each independent variable can be found in the first column of Table 4. $Bias_i$ is defined either as the difference between answers to question pairs or the difference between answers and the corresponding SET predictions. The absolute value of $Bias_i$ (i.e. the $\mathrm{abs}(\cdot)$ function in the above equation) is used because the direction of biases does not matter. We use the natural log transformation of all dependent variables to facilitate the comparison of coefficient estimates among models. Because the minimum value of $\mathrm{abs}(Bias_i)$ is zero (i.e. the reported WTP/WTA equals the SET prediction), we add one to the results so that the natural log transformation can be performed. In total there are 11 models estimated for sellers, and 8 models for buyers (owners or renters). These models are estimated by OLS, and the results are reported in Tables 7–9. The test of each hypothesis is discussed below.
Note to Table 6: The first line in each cell shows the questions that are used in the test. The second line in each cell is the t-test statistic and its statistical significance, which is indicated by asterisks (*** denotes a p-value < 0.01).
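The transformation of the dependent variable can be written in a few lines. This is a sketch under the definitions given above; the function name `log_abs_bias` and the example values are hypothetical.

```python
import math


def log_abs_bias(bias):
    """Return ln(abs(bias) + 1), which is defined even when the bias is zero."""
    return math.log(abs(bias) + 1)


# Hypothetical biases (in GBP): an upward bias, a downward bias, and no bias.
for bias in (40_000, -40_000, 0):
    print(bias, round(log_abs_bias(bias), 4))
```

Taking the absolute value maps upward and downward biases of the same size to the same number, and adding one maps a zero bias to zero rather than to an undefined logarithm.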
Hypothesis 1: initial purchase price is used as a reference point
The benchmark market price referred to in Table 5 is established in Question 1 and Question 5 in a declining and a growing market respectively. The reported responses in both questions are close to the SET predictions (i.e. an average market price of £300,000 for Question 1 and £500,000 for Question 5). This means that both sellers and buyers are able to predict the market price rationally in the absence of information manipulation. Question 2 and Question 6 then introduced the initial purchase price. All three groups of respondents changed their WTA/WTP accordingly. For example, sellers increased their WTA by more than 25% in a declining market to be closer to the initial purchase price, while renters' adjustment is more modest (i.e. 12.25%) in a similar setting. The initial purchase price has a stronger reference point effect in a declining market than in a growing market, for both sellers and buyers, and affects sellers' reference point formation more than buyers', particularly in a declining market. Note that the responses from buyers are very similar between the owner and renter subgroups, which means homeownership did not change the effect of the initial purchase price. Despite the differences in the responses among sellers and buyers, as well as in different market conditions, the effect of the initial purchase price on reference point formation is statistically significant across the board, as can be seen in Table 6.
The results support Hypothesis 1, that initial purchase price is a reference point, as sellers'/buyers' WTA/WTP deviates from SET predictions. As such, there is a clear behavioural bias and home sellers/buyers do not rationally refer to the average market price as SET would predict.

Hypothesis 2: intermediate prices are used as reference points
We then introduced a non-peak intermediate price in Question 3 and Question 7 for a declining and a growing market respectively. Sellers and buyers altered their WTA or WTP towards this price accordingly. For example, in Table 5 the average WTA reported in Question 3 was 17.67% higher than the SET prediction, a downward adjustment of 9% from the average WTA reported in Question 2. This highlights that when the intermediate price is closer to the current market price or SET prediction, sellers adjusted their WTA towards it. The price distribution also shows that a higher proportion of respondents reported a WTA closer to the SET prediction, suggesting the intermediate price is being used as a reference point. The use of an intermediate price as a reference point seems to persist, regardless of market conditions.
The relevance of a historical peak/trough in price was also investigated. Question 9 introduced a scenario whereby the seller's property had increased in value to a peak in price several years ago, but the market had subsequently fallen to the low current price. Question 10 followed the same structure but in the opposite market scenario, to elicit the effect of a historical trough in price. The findings indicate that information about a historical peak in price does lead to distortions in WTA/WTP. Table 6 shows that the differences between the reported WTA/WTP and the average market price in both scenarios were highly statistically significant in the seller and buyer (owner) groups. The effect is weaker in the renter subsample, with a significant effect identified for the historical low price only.
The findings are consistent with the conclusions in Paraschiv & Chenavaz (2011), but seemingly run counter to the results of Baucells et al. (2011) who found relatively weak to no influence of historical peak and trough prices. However, their study uses stock prices rather than property prices, suggesting the disparity in results could be a consequence of the asset market studied. Price information is more valuable in the property market due to infrequent trading and lack of transparency in valuation compared to the stock market. This interpretation is supported by Seiler et al. (2008) who noted that, in the real estate market, sellers experienced higher regret if they failed to sell at an all-time peak in price. Therefore, we conclude the results support Hypothesis 2; intermediate prices are used as reference points, and this remains the case when the intermediate price is a historical peak or trough.

Hypothesis 3: recent prices are used as reference points
Additionally, the results support the idea that recent prices were used as reference points. For sellers, Question 4 introduces information about a similar property that recently transacted below the market price. Following the introduction of this information, sellers' average WTA drifted to 3.81% below the SET prediction. Indeed, just over 82% of sellers do not refer to the average market price when they are aware of a recent transaction price. Of this deviation, 57.36% is accounted for by sellers reporting a WTA below the market price, clearly showing that their behavioural bias has been influenced by the recent price. Moreover, in a growing market the recent price also produces a statistically significant deviation in WTA, but in the opposite direction. The same patterns are observed for the two groups of buyers as well. All observed deviations from SET predictions are statistically significant at the 1% level, as reported in Table 6. As such, the results confirm the notion that recent prices influence respondents' WTP/WTA by acting as a reference point.

Hypothesis 4: expectations serve as reference points
Questions 11, 12 and 13 were used to test Hypothesis 4. Each question instructed the seller to imagine that when they bought their property, they had wanted to earn a certain level of profit on it when they eventually sold it. The desired level of profit was increased in each subsequent question. Because the information in these three questions is identical except for the target profit, we use the difference between questions to test the influence of expected profits. Specifically, if the average WTA in a question is higher than that from its preceding question, there is evidence that expectations serve as reference points.
Indeed, the responses do seem to be influenced by the introduction of an expected profit, as reported in the last three rows of Table 5. The results show that sellers adapt their WTA towards the expected profit, regardless of its size, suggesting that expectations influence the behavioural biases of sellers in the property market. By performing matched two-sample t-tests on the responses, it is possible to determine whether the reported WTAs differ significantly from SET predictions. Table 6 illustrates that for all levels of expected profits, the observed differences are statistically significant at the 1% significance level.

Hypothesis 5: social norms influence reference point formation
Equation (4) is estimated by using OLS for each of the behavioural biases calculated based on Questions 1–13. The dependent variable for each model was the difference in WTP/WTA caused by the introduction of each new piece of information. Social norm variables are the key independent variables in this analysis. Participants were also asked a series of basic demographic questions, which were used to control for these characteristics in the regressions. Since the answers to these questions consisted primarily of categorical data, we translate them into a series of dummy variables (see Table 4 for variable definitions and descriptive statistics). Regression results are reported in Tables 7–9 for sellers, buyers (owners) and buyers (renters) respectively. The discussions in this section are based on coefficient estimates that are statistically significant at the 10% level or lower, as highlighted with asterisks in the tables. Three observations can be drawn from these models.
First, the behavioural bias is primarily psychological and can only be marginally explained by observable demographic factors. This is evident from the low R-squared values reported for all models. Specifically, these factors can explain between 8.88% and 18.85% of the variation in the observed behavioural bias. As the data were generated in a controlled environment that allows little room for other confounding factors, we conclude that the observed bias is indeed primarily due to the psychological responses to each of the reference dependence determinants introduced in the experiments.
Second, the size of the effect of demographic factors is not negligible. For example, males and older people are more rational in housing decisions in general; higher educational attainment seems to help reduce the size of the bias for both sellers and buyers. The absolute values of these coefficient estimates are all greater than one, which translates to 100% changes from the base category due to the natural log transformation of the dependent variables. The moderating effect of these demographic factors on behavioural bias is economically significant. Note that the proportion of female respondents is high in all three subsamples (see Table 4). Therefore, it is possible that some of the behavioural biases identified in our sample, such as the effect of initial purchase price on sellers in a decreasing market, are over-estimated.
Finally, social norms can influence the size of behavioural bias, but the size of the social norm effect is relatively small among homeowners. We examine the influence of social norms by identifying possible social groups to compare with (i.e. variables SOCIAL1–SOCIAL5, representing a close friend, a colleague, a neighbour, a family member, or oneself in the past respectively) and the level of social comparison (i.e. variables FRE1 and FRE2 to measure the frequency of comparison, and IMP to measure the importance of social comparison to each individual). We find that comparing to a neighbour actually leads to the smallest behavioural biases among all social groups considered. This pattern holds true for all homeowners (sellers or buyers). Using other social groups tends to increase behavioural bias in housing decisions, but the pattern is weaker. On the other hand, comparison frequency does not affect the size of the bias, but the self-reported importance of social comparison does. This is particularly true for buyers (owners), because the coefficient estimate of IMP is positive and significant in five of the eight models estimated. The size of these coefficient estimates, albeit statistically significant at the 10% level, is generally smaller than that of the demographic factors. For example, the coefficient estimate of IMP in the first model in Table 7 is 0.24. Because the standard deviation of IMP is 2.70 for sellers, a one standard deviation change in IMP will lead to a 64.8% change in the bias caused by initial purchase prices. This is small when compared with the coefficient estimates of most of the demographic factors.
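The arithmetic behind the 64.8% figure can be checked directly. The sketch below multiplies the coefficient by the standard deviation and, as an additional step not taken in the text, converts the result into an exact proportional change on the ln(abs(Bias) + 1) scale. The variable names are hypothetical; the two input values are the ones quoted above.

```python
import math

coef_imp = 0.24  # coefficient estimate of IMP, first model in Table 7
sd_imp = 2.70    # standard deviation of IMP for sellers

# Change in the log-transformed dependent variable for a one-SD change in IMP
log_point_change = coef_imp * sd_imp

# Exact proportional change in abs(Bias) + 1 implied by that log change
exact_change = math.exp(log_point_change) - 1

print(f"log-point change: {log_point_change:.3f}")
print(f"exact proportional change: {exact_change:.1%}")
```

Reading a log-scale change directly as a percentage is an approximation that is close only for small changes, so the exact figure is larger than the log-point figure here. The comparison with the demographic coefficients is unaffected, since those are read on the same scale.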
The effect of social norms on behavioural biases is different for renters in both direction and magnitude. First, comparison frequency has a negative relationship with behavioural biases, which means individuals who make more frequent social comparisons are less reference dependent. Second, the effect of the comparison frequency variables is larger than that of the control variables. We do not identify a clear pattern among the regression models reported in Table 9. One of many possible reasons is that on average renters are younger and less experienced. This makes them more susceptible to psychological biases. It is also an indication that renters' response to social norms may be different from that of homeowners, although we do not observe any differences between homeowners and renters in other aspects of our analysis.
In summary, we found some weak support for Hypothesis 5 and conclude that social norms may influence reference point formation. A qualification to this conclusion is that social norms seem to be more influential for certain reference points, specifically those that are formed from new and contemporary information, such as a recent transaction price or an alternative offer price. However, this conclusion does not hold for the renter group, where more empirical investigation is needed for further verification.

Conclusions
This paper presents a controlled experiment to gain insight into reference dependence in the UK housing market. We use an online panel platform to conduct experiments with potential home buyers and sellers in the UK. Both homeowners and renters are included in our sample. The results of the experiment clearly indicate that buyers and sellers in the UK housing market use the initial purchase price, intermediate prices, recent prices, and expected profits as reference points, leading to seemingly irrational deviations in WTP/WTA. We also find some weak evidence that an individual's propensity to compare their lifestyle to others can influence the formation of their reference points.
We hope our study will encourage more empirical investigations of reference point dependence in housing markets. Although reference dependence is by far the most extensively explored behavioural topic in housing economics, the number of publications is still too small to rule out publication bias, i.e. the result of any tendency to publish only statistically significant findings and replications. In order to verify whether the effect of a given behavioural factor is really significant, a good number of replications is necessary. This has not yet been achieved in the study of reference dependence in housing decisions. To illustrate, whilst Kahneman & Tversky (1979) is the most cited paper in the publication history of Econometrica, with over 22,000 citations in the Web of Science database as of February 2021, we found only 13 papers that use prospect theory to test reference dependence in the housing market.
To move forward, the next step is to investigate the relationship between identified reference points. The objective of our study is to confirm the presence of multiple reference points in housing markets. Our experiment design does not allow the assessment of the relative importance (i.e. the weights) of identified reference points. Baucells et al. (2011) is a good framework for such analysis. However, it is difficult to implement in online experiments because of the length of the questionnaires. For example, respondents in Baucells et al. (2011) had to answer 60 questions, each of which is based on a series of initial, intermediate, and current prices of a stock. Although the design facilitates the estimation of the weights of multiple reference points, it is very demanding to implement outside of a laboratory where respondents can be supervised to fully engage in the experiment. There is a delicate balance to strike between internal and external validity in behavioural housing economics, and this is a good example of such a challenge.
Finally, behavioural housing studies are not a replacement for, or even a challenge to, established knowledge of housing market economics, which is largely based on SET. On the contrary, they help us to better understand housing decision processes by making sense of seemingly irrational behaviours in housing markets. Researchers noted asymmetric patterns among housing market agents at different stages of housing cycles decades ago (e.g. Evans, 2004; Meen, 2002; Muellbauer & Murphy, 1997). What is missing is a convincing explanation of these anomalies, which are still the topics of ongoing debates (see, for example, Alqaralleh, 2019; Alqaralleh & Canepa, 2020; Andre et al., 2019; Hu & Lee, 2020; You, 2020). Prospect theory is a potential solution, because its assumption of different responses in the loss and gain domains (e.g. loss aversion) fits the behaviours in housing markets nicely. However, to test the presence and effects of loss aversion in the housing decision-making process, the first step is the reliable identification of reference points. Building on this paper's application of novel reference points to the UK property market, research into newly considered reference points can be integrated into existing housing models to enhance their predictive power. Therefore, our study is part of the effort to take us from what we already know (e.g. the importance of expectations in housing decisions) to what we may learn based on new theories (e.g. a behavioural explanation of asymmetric patterns in housing prices).

Notes
1. The filters are not visible to potential participants and work on existing information in user profiles. Hence workers (i.e. members who answer questionnaires at Prolific) cannot change their profile in order to meet the selection criteria of a specific experiment.
2. We divided the sample into male and female subsamples and explored whether there is a significant gender divide. We found that the two subsamples are similar in most demographic, social, and economic aspects, and empirical results by gender are consistent with those based on the whole sample. Results of these gender analyses are available from the authors upon request.