Automatic thermal model identification and distributed optimisation for load shifting in city quarters

ABSTRACT Buildings with floor heating or thermally activated building structures offer significant potential for shifting the thermal load and thus reduce peak demand for heating or cooling. This potential can be realised with the help of model predictive control (MPC) methods, provided that sufficiently descriptive mathematical models of the thermal characteristics of the individual thermal zones exist. Creating these by hand is infeasible for larger numbers of zones; instead, they must be identified automatically based on measurement data. In this paper an approach is presented that allows automatically identifying thermal models usable in MPC. The results show that the identified zone models are sufficiently accurate for the use in an MPC, with a mean average error below $1.5{\rm \; K}$1.5K for the prediction of the zone temperatures. The identified zone models are then used in a distributed optimisation scheme that coordinates the individual zones and buildings of a city quarter to best support an energy hub by flattening the overall load profile. In a preliminary simulation study carried out for buildings with floor heating, the operating costs for heating in a winter month were reduced by approximately 9%. Therefore, it can be concluded that the proposed approach has a clear economic benefit.


Introduction
Efficiently providing energy (heating, cooling, and electrical power) to modern city quarters is a challenge, especially when considering the increased share of fluctuating renewable energy sources.In (Moser et al. 2020) a modular optimisation-based energy management system (EMS) was presented which coordinates the producers with renewable sources, thermal energy storage, and consumers.By specifying the components available to the energy hub, their nominal operating conditions and how they are connected with each other, an optimisation problem is automatically formulated and solved to provide the optimal operating strategy for supplying the city quarter with heat and cooling.By repeatedly performing these calculations using the newest measurement data, the EMS was able to react to unexpected behaviour (model predictive control methodology, MPC).However, the heating and cooling load profiles were assumed to be fixed, and forecast algorithms (Nigitz and Gölles 2019) were used to predict them.The only flexibility available to the EMS was provided by a thermal energy storage and a battery.
In reality, however, buildings have a considerable thermal capacity and can act as thermal storage if operated intelligently, see, e.g.(Kensby, Trüschel, and Dalenbäck 2015).This flexibility allows to shift thermal demand to reduce the production costs by flattening the overall load profile, or to pre-heat or pre-cool the building to use, e.g. a heat pump at times of high solar yield from photovoltaics.To do so, mathematical models of the thermal dynamics of individual thermal zones need to be identified so that they can be incorporated in the supervisory EMS.These models describe how heating and cooling, solar radiation and outside temperatures influence the zone temperature.They can then be used to ensure that shifting the demand (also called demand side management, DSM) does not lead to violations of temperature comfort levels by predicting how changes in the operation will influence the individual zone temperatures.
The problem with this approach is two-fold: First, the type of mathematical model chosen determines the possible precision of predictions, the possibility to integrate external knowledge such as weather forecasts or handle unknown disturbances such as internal gains, and influences the complexity of the resulting optimisation problem.Simpler models, e.g.linear models, are advantageous since they can be easily integrated into optimisation, but lack the possibility to correctly handle non-linear aspects such as, e.g. the inherently non-linear effects of temperature variations together with mass flow variations.The model type must therefore be chosen with care.Second, the mathematical models need to be parametrised to closely match the dynamic behaviour of the real zones.This parametrization should require as little user input as possible to make the approach feasible for larger buildings or even city quarters with hundreds of zones and corresponding models.
Finally, once the mathematical models are available, they need to be integrated into the energy management system.This could be by combining them all in a single optimisation problem; however, this does not scale well to hundreds of zones in city quarters.Formulating the optimisation problem as a distributed optimisation problem, on the other hand, provides the possibility to split the global optimisation problem into smaller sub-problems that can be handled by individual, less powerful computers, e.g. on a building level.However, it requires additional communication overhead and comes at the cost of an interactive algorithm.
In this article both problems inherent to performing load shifting, or demand side management, on the larger scale of big buildings and city quarters were tackled: The selection of mathematical models to describe the temperature dynamics in individual zones and their parametrization with as little user input as possible; and combining those models in a distributed optimisation that scales well to many zones.The following sub-sections give an overview of existing literature on both subproblems and how our approach is different, followed by a concise statement of our contributions and an overview of the structure of the article.

Building model identification
Much research has already been carried out in the sector of thermal building model identification.In (Sourbron, Verhelst, and Helsen 2013) a second-order and a fourth-order resistive capacitive (RC) equivalent model, are introduced; the lumped RC parameters of the model are estimated using grey-box identification methods.The problem with such linear grey-box models is that they may provide a good fit for the training data, but result in parameters that are physically meaningless.(Sourbron, Verhelst, and Helsen 2013) state that using even the most ideal data set for estimation led to unwanted results such as, e.g.near zero heat transfer coefficients and/or overestimated influence of solar irradiation.While relying on a similar model type (third-order RC equivalent model), the method described in this paper relies on a different learning technique focusing on the usage in MPC, model predictive control-relevant identification (MRI), see, e.g.(Žáčeková, Prívara, and Váňa 2011).The general idea of MRI is to estimate models not based on the deviation of the model's predictions to real measurements for the next time step only, but for a whole simulation period, resulting in models that perform better with regards to their control performance.However, this makes the method inherently non-linear, even if the models themselves are linear, requiring capable solvers for the parameter estimation.
In (Prívara et al. 2013), a detailed overview of different building model identification approaches and a new methodology using a co-simulation approach are presented.They argue that temperature measurements used for model identification suffer from a co-linearity.This means that the temperatures in a controlled building are very similar in time and make the estimation problem ill-conditioned.On the other hand, performing suitable identification experiments, i.e. varying the inputs to the building drastically to excite its dynamics, can be quite time consuming and, if the building is already operational, requires acceptance by the inhabitants (Široký et al. 2011).Therefore, the authors in (Prívara et al. 2013) present an approach where such experiments are carried out using a detailed building simulation tool such as, e.g.EnergyPlus.This, on the other hand, assumes that such a simulation model exists.Even though building information models (BIM) are becoming more widespread in city quarter design processes, whichusing third party add-onscould be imported in EnergyPlus, investigations made by the authors of this work have shown that the resulting simulation models both typically lack some detail and are not precise enough to act as surrogates to the real buildingin general, no simulation model that is not validated using measurement should be used for real operation.Additionally, BIM models are currently not standardised to such a high extent that all information which is needed for a working simulation model is present and hence, manual user input, e.g.detailed properties of the used construction materials, are needed.This necessity for manual input is not always acceptable in real-world applications and should, if possible, be avoided to save costs.
Another approach based on detailed building models is presented in (Sturzenegger et al. 2016), where a RC-equivalent thermal building model is auto-generated from an EnergyPlus simulation model using the BRCM toolbox (Sturzenegger et al. 2014).Since all zones and the heat exchange between them are considered in the resulting model, its order is very high (up to 35 states for an individual zone) and needs to be reduced to be used in optimisation.Again, the result only approximates reality and requires additional tuning to be usable in a real-world control context.
In (Coninck 2015), a tool chain for automated deployment of MPC in buildings based on datadriven, grey-box building models is presented.The model identification part of the presented tool chain was made freely available in the FastBuildings library (Coninck 2017).However, the models generated by the FastBuildings library may be non-linear and time varying, making them less suitable for large-scale optimisation problems and linear MPC.
In (Zagoni 2020a), a promising identification approach inspired by genetic algorithms and machine learning is presented.Unfortunately, the investigated RC models do not consider the influence of solar irradiation and internal gains.While the developed method is provided via an opensource Python package called DarkGreyBox (Zagoni 2020b), and could be extended to do so, the question remains whether the resulting model can avoid overfitting and correctly predict the effects of different operating strategies.
In (Boodi et al. 2020), a model identification approach using a stochastic particle swarm optimisation (PSO) is proposed.The identified RC-equivalent model yields very accurate results, however, the search space of the PSO algorithm was limited using knowledge from building material properties, hence requiring specific knowledge about the specific building.The approach presented in this paper only poses minimal restrictions on the model parameters (positiveness, for example) and requires no such knowledge.

Distributed optimisation for demand side management
The topic of thermal demand side management deals with active load shifting.If the main goal is to reduce peak loads, this is also called peak load shaving, but the concept is more general, e.g.shifting the demand to times where heating and cooling can be provided at lower costs.In literature this is often used in the context of district heating (DH) networks, where each consumer throughout the network is controlled, see, e.g.(Guelpa and Verda 2021).The same concepts may also be applied to larger city quarters since they essentially are small DH networks.The problem is always that the requirements of many zones or substations need to be considered at the same time.If optimisation-based concepts are used, the overall control problem might become too big to solve in real-time.Hence, a popular approach in literature is the use of a two-stage method, where in the first stage each individual thermal load is optimised, e.g. with the goal of guaranteeing comfort in each zone.In the second stage all the individual load contributions are aggregated and an optimisation-based EMS is used to compute an optimal schedule for the production, see, e.g.(Capone and Guelpa 2023).However, incorporating more complex goals, like shifting the overall load towards renewable generation is a challenge.This is only really possible if the demand side and production side are considered at the same time.
Although the resulting optimisation problem can be very large, it has a special structure to it.If the thermal distribution is neglected, the individual thermal demands (of each thermal zone) are simply aggregated and form the total demand.In an optimisation-based approach this would mean that there are essentially many individual optimisation problems for each thermal demand and one for the production, which all are coupled by very few constraints.This special structure can be exploited via distributed algorithms.In (Parisio and Gutierrez 2018), an active-set method is applied to the thermal DSM for buildings.However, the overall control problem is considered to be linear with continuous optimisation variables, which only approximates the real behaviour of, e.g.heat pumps and boilers that are switched on and off.Taking this into account requires binary variables, resulting in a mixed-integer linear problem.Another popular distributed optimisation technique is the alternating directions of multipliers method (ADMM), see, e.g.(Boyd et al. 2010).ADMM allows separating problems with complicating equality constraints into smaller, easier-to-solve subproblems, which may be solved locally on embedded devices in real-world applications.The idea here is to relax the coupling constraint via Lagrange multipliers and iteratively enforce a consensus between all subproblems.A major benefit of ADMM is that it makes no assumptions on the class of the subproblems as long as they are convex, so some kinds of non-linear problems can be considered as well.While there are no convergence guarantees, results in (Kaisermayer et al. 2020) suggest that mixed-integer linear problems can also be handled via some algorithmic changes.This paper thus investigates the suitability of applying ADMM to the specific case of demand side management encompassing multiple buildings with multiple zones.

Contribution and structure of the paper
The main contributions of this paper are (a) the introduction of a purely data-driven building model identification approach that does not require any user inputs and considers all relevant influence factors, i.e. solar (façade) irradiances, ambient temperature, feed temperature, (constant) internal influences as well as envelope and heating dynamics, including its validation using a detailed, state of the art simulation model, and (b) the evaluation of ADMM as an algorithm for scalable demand side management based on the identified models.
The rest of this work is structured as follows: the methodology followed to derive the results presented in this paper, together with the methods used for building model identification and the distributed optimisation, are discussed in Section 2. To validate the presented methods, a simulation study was performed that is presented in Section 3. The results of this study are presented in Section 4. Finally, a summary of the presented work, conclusions derived from the results and an outlook to further research are given in Section 5.

Methodology and methods
The methods presented in this paper are the result of the following methodology: Different types of building models and model identification algorithms, using measurement data from individual zones of a detailed simulation model, were investigated.The best combination of model and identification method is presented in Section 2.1.The models were then incorporated in an EMS using a distributed optimisation technique based on ADMM as described in Section 2.2.A simulation study on a larger scale, i.e. on a detailed simulation model of an entire city quarter, was performed to validate the overall results of modelling and optimisation by investigating the economic benefits while ensuring thermal comfort.The simulation model used and the economic evaluation method are outlined in Section 3; the results are explained in Section 4.

Building model identification
Each building typically consists of multiple thermal zones whose temperature can ideally be measured and controlled individually.While these zones influence each other, we assumed them to be independent of each other to simplify the model structure and thus the resulting optimisation problem.A physically motivated model with a priori unknown parameters (grey-box model) is used to describe the thermal dynamics of each zone.This thermal zone model and the method used to identify the parameters are described in the subsequent Sections 2.1.1 and 2.1.2,respectively.All symbols used for the building model identification methodology are summarised in Table 1.

Thermal zone model
The proposed (non)-linear thermal zone model is based on the widely used T i T e T h model, see, e.g.(Bacher and Madsen 2011) and (Zagoni, 2020a): where T i is the inner (or zone) temperature, T e is the envelope (or wall) temperature, T h is the heating (or floor) temperature, T f is the feed (or inlet) temperature, T a is the ambient temperature, I g,o with o e{north, east, south, west} are the solar façade irradiances and u h is the relative heating control action.The model parameters to be identified are l ei , l hi , l ie , l ae , l ih , l h , b and a o .Even though it is not directly visible, model ( 1) is basically an RC-equivalent model, where the l-parameters are lumped RC expressions.The differences between model (1) and the typically used T i T e T h model are: (a) the consideration of (constant) unknown external or internal disturbances b, (b) the splitting of the solar irradiance into four façade irradiances, and (c) the consideration of the water inlet temperature.For using model (1) in a distributed optimisation, see Section 2.2, the third state equation is linearised to reduce the computational effort when many zones need to be optimised at the same time: where T h is the optimal linearisation point of the floor temperature T h , which can be treated as an additional model parameter, and T f is assumed to be constant.

Model parameter estimation
The parameter estimation is done by posing an optimisation problem whose optimisation variables are the desired model parameters.The objective of the optimisation is to find the best model fitin the least squares sensecomparing the measurements of the zone temperature and the model state T i .To get a model especially suited for the use within an MPC, the model predictive control relevant identification concept, see, e.g.(Žáčeková, Prívara, and Váňa 2011), is used.In MRIin contrast to the often-used prediction error method (PEM), see, e.g.(Prívara et al. 2013), where only the onestep-ahead prediction is optimiseda multi-step-ahead prediction is considered in the objective function, which satisfies the needs of an MPC for having a good prediction over the entire horizon.
Performing such a multi-step-ahead prediction for every time step results in high computational effort.As a trade-off between model quality and computational effort, the data is split into l overlapping data slices of the size of the prediction horizon, N p , and the model fit is evaluated for each slice individually.A drawback of this procedure is that start values x 0,h of the state vector x := T i T e T h need to be estimated for every data slice h = 1, . . ., l.One possible way to determine all n • l start values, where n = 3 is the number of states of model (1), is to add additional optimisation variables.This, however, increases the computation effort of the optimisation problem proportional to the number of data slices l.Moreover, this enables the optimiser to 'abuse' the start values in each slice to improve the model fit in an unphysical way.
Another way is to integrate a filter, that estimates the initial state vector for each slice, into the objective function evaluation.As such a filter, a Kalman Filter (KF) is a widely used.It is an algorithm that uses a copy of the system model and measurement data to estimate the current state of the system, which typically is not measurable by itself (e.g.here only the zone temperature T i can be measured, while the KF needs to estimate the entire state x).Since the KF already requires a system model, it needs to be initialised with a best guess based on, e.g. an ordinary identification scheme such as least squares / PEM, or random initial values within physically meaningful bounds.Then, when iterating over different model parameters while trying to minimise the objective function (the sum of squares of the model error for each MRI slice), the KF model is updated with the currently newest model parameters, i.e. the KF really is an integral part of the parameter identification process.The initial model state x 0 of the KF is set by making a reasonable guess, i.e. by setting T i to the measured zone temperature, and by setting T e in between T i and T a and T h in between T i and T f .Then a small part at the beginning of the measurement data is required to guarantee the filter's convergence to meaningful start values for the first MRI slice x 0,1 .This is especially important for T e and T h , which were just set to the reasonable guess defined above.The trust region of the initial filter state can be set by choosing the diagonal elements of the initial covariance matrix of the KF; in this case, they were set to a trust region of 5°C, meaning that our initial guess is trusted +5 • C, i.e. they need only be tuned once as part of the method design; adapting the matrices to individual applications could further improve the model quality, but would require manual tuning and thus go against the principle of full automation.
Figure 1 illustrates the final data slicing and start value extraction from the KF during parameter estimation.The actual MRI hyperparameters (length of the prediction horizon, sampling time, etc.) used within this research are outlined in Section 4.1.
Within the objective function, the linearised zone model is implemented using time-continuous ordinary differently equations (ODE) modelling (Rackauckas and Nie 2017) in the Julia programming language (Bezanson et al. 2017); this general approach allows the easy investigation of different, also non-linear, models.The resulting optimisation problem needs to be solved using a gradient-free solver due to the KF calculationswhich include matrix inversions within the objective function.For such gradient-free parameter optimisation problems, stochastic algorithms are widely and successfully used.Here, the stochastic NOMAD algorithm (Montoison, Pascal, and Salomon 2020) was used, which was faster than other tested stochastic solvers such as, e.g.differential evolution (DE) because NOMAD can 'remember' unsuccessful regions and will step out of them if (random) steps lead there (again).Using the presented approach makes it possible to identify any kind of thermal zone as long as the zone temperature T i , the weather conditions T a and I g,o , and the heating actuation u h are available as measurements or can be easily calculated from other measurements (e.g. the façade irradiances I g,o are typically calculated from the global horizontal irradiance (GHI) using solar irradiance models).

Distributed optimisation for demand side management
Once the thermal models of each zone are identified, they can be used to predict the room temperatures based on different possible heating and cooling scenarios.When optimising each thermal zone to use its thermal inertia to flatten the overall load profile, the optimisation problem becomes very large if each zone is considered separately by a centralised EMS.Depending on the class of the thermal model (linear, mixed-integer linear, or non-linear) and the number of zones, the overall optimisation problem might not be solvable in realtime.If the individual zones are assumed to be independent of each other (the neighbouring zones are assumed to be well-controlled at the same or at a similar setpoint temperature), the only coupling is due to the shared heat or cooling supply from the energy hub.If the thermal model equations from Section 2.1.1 are discretized in time using forward Euler discretisation with sampling time t S , the overall control problem for the building EMS can be stated as Figure 1.Data slicing for the model relevant identification (MRI) approach using a Kalman Filter (KF) for the generation of the required start values x 0,h for each data slice h = 1 . . .l.In order to obtain meaningful start values for the first MRI slice x 0,1 , a small part at the beginning is required to ensure the filter's convergence. follows: where T i,j,k , T ref,j,k , T i,j,lb , T i,j,ub and u j,k are the zone temperature (the first entry in x j,k ), reference temperature, lower and upper comfort band and the control signal of the j-th zone, respectively.D k are the uncontrollable inputs (ambient temperature, solar irradiance, …) for the k-th time step, and Q j are the parameters for the j-th zone.N z is the number of considered zones, and N p the prediction horizon.An overview of all symbols new to this section is given in Table 2.
The cost function J in (3) consists of two parts.The first part ensures that each zone minimises reference tracking costs as well as costs for leaving the comfort band.The second part is the central coordinator that minimises the distance between the overall maximum and minimum heat flow, which essentially flattens the total demand Qtotal,k .The heat flow to each zone is calculated using the relative heating control action u j,k times the maximum heat flow for each zone, Q j,max .
Depending on the number of thermal zones N z , problem (3) can have many optimisation variables and hence be hard to solve in real-time.However, it exhibits a special structure: It essentially combines N z separate optimisation problems plus an optimisation problem for the central coordinator.These optimisation problems are only linked through a single coupling constraint.
The alternating directions of multipliers method (ADMM), see, e.g.(Boyd et al. 2010), can be applied to problems of the following form: minimize where there are only two blocks of variables z 1 and z 2 .Each block is subject to local constraints h i and g i and both blocks are coupled together by the coupling constraint H.The ADMM algorithm for problem (4) is: where n are the so-called Lagrange multipliers of the coupling constraints in H, and m is the iteration counter of the iterative algorithm.The sets X i represent the subspaces defined by the local constraints h i and g i .
In order to apply ADMM to problem (3), it has to be transformed.First, problem (3) can be stated more generically as: where the first N z sub-problems represent the thermal zones and the last one the overall peak-load shaving: The local optimisation variables j j , j = 1, . . ., N z , represent the decision variables x j,k , a j,k , u j,k of the original problem, and j N z +1 contains the terms Qmin , Qmax and Qtotal,k .All problems are connected via coupling constraint h.Note that in general not only peak-load shaving can be handled via this formulation, but also more complex production side models with intermittent renewable generation, where defining a goal for the total demand would require also considering heat or cooling production costs and constraints on the production.
Transforming problem ( 6) into (4) can be achieved by introducing copies of the variables j i , ĵi : Now the problem has two blocks of variables with z T and T , which are only linked via the last equality constraint H, and hence the problem is amenable for ADMM.New iterates for the first block of variables z 1 can be obtained in parallel.These sub-problems could be solved locally in the thermal zones, and only the new iterates z 1 need to be shared with the central coordinator that then solves for the second block of variables, z 2 , and thus ensures energy balance.

Simulation study
In order to generate measurement data for the building model identification, and then validate the results and perform a co-simulation together with the proposed distributed optimisation scheme, a detailed building simulation model is used.To show the benefit of scalemore buildings mean more potential for flattening the overall load profilea simulation model considering an entire city quarter (Q1 of Reininghausgründe in Graz, Austria) was chosen in this work.It is described in the following subsection 3.1.The optimisation was performed with the goal of lowering the peak loads; the economic evaluation of the resulting strategy is explained in subsection 3.2.

Simulation model
The simulation software IDA Indoor Climate and Energy (IDA ICE) from EQUA was used for the detailed thermal modelling of the city quarter using a dynamic multi-zonal simulation approach.
The physical models used in IDA ICE reflect the latest research and the computed results compare well with measured data.Furthermore, it is possible to couple it with the proposed EMS via an API interface.
The simulation model is based on the model described in (Nageler et al. 2018), which was developed in previous research projects carried out by the authors.It uses geographic information system (GIS) data, see, e.g.(Nageler et al. 2019), making it easier and faster to set-up in comparison to other building simulation tools such as, e.g.TRNSYS or EnergyPlus, especially when dealing with large simulation models with many thermal zones.The city quarter with a gross floor area of almost 65,000 m² was modelled in different levels of detail concerning the thermal zoning.Some buildings were modelled as two (big) single zones (one for the office and commercial demand and one for the residential demand), whereas in other buildings every room of an apartment is modelled individually, resulting in a total of 36 zones.Regarding the use of zones, only the main uses, residential and office and commercial, were distinguished.The climate conditions for the simulation model stems from temperature measurements from Graz, Austria, for the year 2018, provided by the Zentralanstalt für Meteorologie und Geodynamik (ZAMG, engl.: Central Institute for Meteorology and Geodynamics).An overview of the basic data and use of the buildings for quarter Q1 is given in Table 3.The columns Heating and Cooling in Table 3 represent the nominal heating and cooling load of the individual buildings.These were estimated from design parameters: For apartment zones the design power is 50 kW/m² for heating and 25 kW/m² for cooling.For commercial zones the design power is 60 kW/m² for heating and 80 kW/m² for cooling.The nominal temperature spread of the floor heating and cooling system is 5 K.
In the simulation the individual zones of the buildings are either controlled by (a) conventional zone controllers, i.e. without the proposed EMS (woEMS), or (b) with the proposed EMS (wEMS), i.e. the zones are operated using the presented distributed optimisation approach.The conventional zone controllers are simple thermostats which either fully open the heating valve u h if the zone temperature drops below the desired setpoint by more than 1 K, or fully closes the heating valve if the zone temperature is higher than 1 K of the desired setpoint, i.e. u h [ {0, 1}.In case of the The energy hub in the simulation model is idealised, i.e. it is able to provide the required heating and cooling at all times.The connections between the buildings are also assumed to be ideal, ignoring thermal losses and transport delays.

Economic evaluation
The economic evaluation aims to analyse the operating costs for providing the required (heating) energy when using the conventional zone controllers (woEMS), or when using the proposed EMS (wEMS).The evaluation is based on an a posteriori analysis of the required heating demand of each scenario.Heating demands up to 1 MW are satisfied by a ground source heat pump (HP), higher demands are satisfied by a district heating (DH) network.Then the total required electricity for the HP and the heat from DH are multiplied with their respective purchasing prices; heat from DH was set to cost 62 €/MWh, and the electricity price was set to 100 €/MWh.The coefficient of performance (COP) of the HP is assumed to be 4.

Results
In this section the results of the simulation study are presented.The results of the building model identification for one exemplary thermal zone are discussed in Section 4.1.Subsequently, the overall results of the distributed optimisation scheme are presented in Section 0, where the previously identified thermal zone models are used in the EMS and simulated together with the simulation model.The economic evaluation of the overall results is presented in Section 4.3.The model verification phase is set to 7 days to enable a better visual evaluation of the identified model.The identification of one linearised model takes about five minutes on a conventional desktop computer with a dual-core Intel i5 CPU @ 2.8 GHz and 16 GB of RAM.

Building model identification
The first subplot of Figure 2 illustrates the comparison between the model state T i and the measured (or simulated) zone air temperature depending on the heating actuation.The mean absolute error (MAE) within the verification phase is 0.28 °C.The second subplot depicts the course of the identified envelope temperature T e , which cannot be meaningfully compared to any measurement data because it represents one (virtual) wall combining the dynamics of all internal and external walls.The third subplot compares the estimated floor temperature with the floor temperature simulated in IDA ICE; the MAE within the verification phase is 0.45 °C.Note that the measured floor temperature is not used within the objective function because it usually is not available in a real-world system.The last subplot shows the weather conditions used in the simulation model.The results indicate that the identified model is sufficiently accurate (MAE of T i ≤ 0.5 • C) for the use in an MPC.
Figure 3 shows the results obtained for other zones, where a MAE of 1.5 K was achieved in the worst case.The results were obtained using the simulation data of two weeks and predicting the zone temperature for each time step for the next 48 h in a rolling horizon fashion.The most accurate results were achieved for zones where a lot of heating excitation is present and the influence of climate conditions (e.g.solar irradiation) is high; on the other hand, the worst results were obtained for zones without windows and less or no heating actuation.

Distributed optimisation for demand side management
The distributed EMS is evaluated in a simulation study using the simulation model described in Section 3.1 and weather data from February 2018.The thermal zone models were identified with the method described in Section 2.1 for each of the 36 considered zones using simulation data from December 2017.The total heat demand Qtotal consists of the demand from each zone and the heat demand from commercial zones Qcom , which cannot be influenced but must also be considered by the EMS.The commercial demand is forecast using a multi-linear regression model (Nigitz and Gölles 2019).
The EMS was implemented in the programming language Julia and the optimisation problems were solved using the commercial solver Gurobi, see (Gurobi Optimisation 2019).The prediction horizon was chosen to be 24 h, with a sampling interval of 15 min.The simulation was conducted on a server with a 12 core Intel Xeon Gold processor @ 3 GHz and 32 GB of RAM.While the subproblems could be solved in parallel on separate machines this was not tested.The results from the simulation with and without EMS are depicted in Figure 4.It shows that many thermal demand peaks are significantly reduced by the EMS, while the mean zone temperature shows a comfort gain with less overheating.Overheating of the zones due to high solar irradiance during the second half of the simulation time can partly be avoided by the EMS.The sharp peaks in heat demand stem from the commercial zones of the city quarter and cannot be influenced by the controller.This can  also be observed in Figure 5, where the duration lines (i.e. the cumulated duration of operation at given power levels) of the thermal load for the apartment and commercial zones are compared.It can be clearly seen that the peak load of the apartments can be significantly reduced when using the proposed EMS, thus reducing the peaks of the overall demand profile as long as no uncontrollable demand peaks from commercial zones occur.

Economic evaluation
Figure 6 shows the results of the economic evaluation of the results presented in Section 0. The left plot illustrates the total amount of (heating) energy provided by either DH or the HP, with and without EMS operation.The total heat demand is about 2% higher with the proposed EMS than without.This stems from a slightly higher average zone temperature when using the EMS compared to conventional control.The right plot illustrates the operating costs for the simulated month.The overall costs with EMS are approximately 9% lower than without EMS since the flattened demand reduces the reliance on (more expensive) DH for covering demand higher than 1 MW.

Conclusions and outlook
In this work a purely data-driven building model identification method which does not require any user inputs and considers all relevant influence factors, i.e. solar (façade) irradiances, ambient temperature, feed temperature, (constant) internal gains as well as envelope and heating dynamics, was presented.Results showed that the identified building models are sufficiently accurate for the use in an MPC, where even in the worst case a mean average error (MAE) of 1.5 K for the prediction of the zone temperature was achieved.In the simulation study carried out for buildings with floor heating, the proposed distributed optimisation technique based on ADMM was able to use these models to flatten the demand profile, thus reducing the operating costs for heating in an exemplary winter month by approximately 9%, assuming a fixed COP of the HP and constant prices for electricity and imports from DH.
In real-world applications the reduction of peak demand can have even more positive economic effects: (a) the maximum rated thermal power needed from DH may be less, hence, lower purchasing tariffs may apply and smaller pipes and DH installations may be used, and (b) flattening the profile may favour the installation of another (or a bigger) HP over the import from DH, due to longer run-times at their maximum rated power output, resulting in a shorter return of invest of the HP, or may eliminate the necessity of DH integration altogether.
To obtain more reliable results about the benefit of using the proposed EMS, more extensive simulation studies considering entire years and varying weather scenarios will be carried out.

Figure 2 Figure 2 .
Figure2illustrates the result of the identification of the linearised zone model for one of the simulated zones.The data used for the building model identification were generated beforehand using the simulation model presented in Section 3.1.All subplots are divided into three phases: the KF convergence phase, the parameter optimisation phase, and the model verification phase, which compares the identified model with measurement data.During all phases, the zone model is provided with the inputs (heating actuation u h , weather influences I g,o and T a , and feed temperature T f ) measured in the simulation model.The model states T i , T e , and T h of the individual MRI slices are not present in the subplots' legends to keep the illustration uncluttered, but are visualised by varying line colours.The MRI slices 1 (blue) and 3 (green) are highlighted exemplarily in the second subplot in order to visually highlight the length and overlapping of the individual MRI slices in comparison to the entire dataset.The initial states from the KF are shown as dots.The feed temperature T f is constant at 37 °C throughout the entire simulation and is not shown in the illustration.The sampling time t s is 15 min, the KF convergence phase is set to 3 days, the length of the individual MRI slices is equal to the MPC prediction horizon, N p = 24 h, and the number of MRI slices is l = 90.

Figure 3 .
Figure 3. Evaluation of the mean absolute error (MAE) over the look ahead time for each zone (blue).The mean over all zones is indicated in orange.

Figure 4 .
Figure 4. Comparison of the duration lines of the total thermal load of the apartment and commercial zones without EMS (woEMS) and with EMS (wEMS).

Figure 5 .
Figure 5. Co-simulation results without EMS (woEMS) and with EMS (wEMS).The upper plot shows the total heat flow from the energy hub.The lower plot shows the mean zone temperature of the apartment zones, weighted by the respective floor area, and the chosen comfort band of 22 + 1 °C.

Figure 6 .
Figure 6.Results of the economic evaluation.The overall costs are about 9% lower when using the distributed EMS.

Table 1 .
Symbols used for the building model identification methodology.
h relative heating control action x state vector (consisting of T i , T e and T f )

Table 2 .
Symbols used for the distributed optimisation methodology.
Note: The index j denotes the zone.

Table 3 .
Overview of the basic data and use of the buildings of quarter Q1 of Reininghausgründe in Graz, Austria. the heating valves of all zones are allowed to be optimised to any continuous value between fully closed and fully open, i.e. u h[ [0, 1].