Approximating model predictive control strategies for heat pump systems applied to the building optimization testing framework (BOPTEST)

Model predictive control (MPC) is promising for optimizing building operation, but high hardware, software and know-how requirements impede its commercialization. Therefore, rule-based controllers (RBC) are state-of-the-art. Approximate MPC (AMPC) can help bridge this gap by replacing the optimization with an explicit functional relation called an approximator. The literature lacks reproducible use cases and benchmarks as well as a comparison of sophisticated and traditional approximators. This study aims to close this gap by applying AMPC to BOPTEST's two-zone heat pump testcase. The BOPTEST testcase includes predefined RBCs and KPIs promoting repeatability. Then, a comparison was made between artificial neural networks (ANNs), random forest (RF), linear, and logistic regression. The results show that feature selection significantly affects the performance. After adapting the features, the ANNs and RF outperform the RBC with cost savings of up to 33% and discomfort reductions of 70%, while requiring 15% of the MPC's computation time. The traditional approximators fail to outperform the RBC.


Introduction
The energy consumption and, thus, CO2 emissions must be reduced to mitigate global warming. The building sector has a high energy reduction potential. While the global pandemic caused a slowdown in emission increase, the CO2 emissions from building operation exceeded their all-time peak of 2019 by 2% in 2021. Overall, the building sector accounts for nearly 40% of the energy-related emissions (United Nations Environment Programme 2018). In addition to expensive refurbishment strategies, optimizing the operation of building energy systems is a promising and inexpensive measure to reduce CO2 emissions. Furthermore, existing fossil-based technologies like gas and oil boilers, which are most common for supplying heat to buildings, need to be replaced by technologies that can be operated solely based on renewable energy sources.
In this context, heat pumps are a key technology due to their efficient use of ambient energy (Vering et al. 2021). However, heat pumps are more complex in operation as their coefficient of performance (COP) strongly depends on partial load level, external and supply temperatures. In addition, their grid connection is both an opportunity and a challenge. The opportunity arises from an increased sector coupling potential through increasing electrification, while the challenge lies in the related increased control complexity. Advanced control strategies can exploit the existing potential of heat pump systems while handling the increased complexity (Shafai 2002). In this regard, model predictive control (MPC) is an acknowledged advanced control method whose performance has been demonstrated by various studies (Drgoňa et al. 2020; Kim et al. 2022; Taheri, Hosseini, and Razban 2022). Even though MPC is a well-researched method in the scientific community, its transfer to practice is difficult and only progressing slowly. In practical applications, rule-based controllers (RBC) are most common (Schild et al. 2019).
CONTACT Laura Maier laura.maier@eonerc.rwth-aachen.de
The reasons for the slow progress are the missing know-how and experience with MPC applications as well as high requirements for the hardware and software and complicated data infrastructures (Cigler et al. 2013; Serale et al. 2018; Zong et al. 2019). To realize the existing potential of MPC applications while still keeping the hardware and software requirements as low as possible, approximate MPC (AMPC) is a promising approach. AMPC is also referred to as imitation learning (Dinh and Kim 2022; Drgoňa, Helsen, and Vrabie 2020) or rule extraction (Bursill, O'Brien, and Beausoleil-Morrison 2020; May-Ostendorp et al. 2013, 2011; Yu and Pavlak 2021) and aims at using mathematical models such as machine learning models to learn the relation between the MPC's inputs and outputs. The mathematical models are the so-called approximators that try to imitate the optimization. To apply AMPC, training data, i.e. the MPC's input and output pairs, is generated using the MPC, which is also called teacher MPC. The approximators can be any type of explicit function describing the relation between the MPC's inputs and outputs. The training is conducted before actual deployment. For the deployment, the MPC's inherent optimization, which determines the manipulated variables for the forecast horizon, is replaced by the trained approximator(s). Consequently, there is no need for an ongoing optimization during system operation. Machine learning has proven to be suitable to automatically learn this functional relation. Several studies have proven the concept of AMPC for the building sector and have shown that it outperforms common RBC while retaining most of the MPC performance (Bursill, O'Brien, and Beausoleil-Morrison 2020; Coffey 2013; Domahidi et al. 2014; Drgoňa, Helsen, and Vrabie 2020; Drgoňa et al. 2018; Klaučo et al. 2014; Le, Bourdais, and Guéguen 2014; Löhr et al. 2019; May-Ostendorp et al. 2013, 2011; Piscitelli et al. 2019; Yang et al. 2021; Žáčeková et al. 2015).
The studies relevant for the present work are listed and categorized in Table 1. In addition to the studies listed in the table, a research field related to AMPC has emerged over the last years dealing with using explainable AI (XAI) to extract control rules (Cho and Park 2022; Domahidi et al. 2014; Yu and Pavlak 2021). XAI aims at making methods of the field of AI more comprehensible and, hence, more trustworthy. This is often realized by using decision trees as machine learning method due to their intuitive 'if-condition-then-action' structure. In Cho and Park (2022), Domahidi et al. (2014) and Yu and Pavlak (2021), for example, decision trees are used in different forms to compute control rules (Cho and Park 2022; Domahidi et al. 2014) or set value trajectories (Yu and Pavlak 2021). These rules can be learned from an MPC as in Domahidi et al. (2014) and Yu and Pavlak (2021) or from a reinforcement learner as demonstrated in Cho and Park (2022). Even though comprehensibility of approximators is beneficial for a successful deployment of AMPC in real-life systems, it is not the focus of this study.
The studies in Table 1 are distinguished regarding their control or manipulated variables, the approximation scheme (regression or classification), the used approximation method, which is further differentiated into tree- and non-tree-based algorithms, the subsystem focus, and the building type. The authors of Drgoňa et al. (2018), Klaučo et al. (2014), Žáčeková et al. (2015) and Robillart, Schalbart, and Peuportier (2018) use regression models to mimic the optimized heat flows to different thermal zones, while Yang et al. (2021) and Coffey (2013) imitate the optimization outputs for an air-handling unit's mass flow rate, external shading, and charging duration for radiant slab pre-cooling. Furthermore, Tam et al. (2019) investigate the optimal ice storage charge and discharge power as well as the monthly target demand-related costs. In addition to the regression-based approximators, some studies present approximations of discrete control variables. For example, May-Ostendorp et al. (2013, 2011) mimic the optimization of window opening, while Le, Bourdais, and Guéguen (2014) change the blind position and Piscitelli et al. (2019) predict the window transmission for a smart glazing application. All in all, only Löhr et al. (2019) use regression and classification simultaneously due to continuous and discrete control variables. Since real-life systems usually involve a mix of discrete and continuous control decisions, this is a topic that should be investigated in more depth.
In addition, most of the studies have in common that their focus lies either on the consumer or the generation side of the building energy systems. To the authors' best knowledge, only Löhr et al. (2019) and Yu and Pavlak (2021) investigate control strategies focussing on the generation and distribution subsystems simultaneously. However, Löhr et al. (2019) assume a pre-simulated building and domestic hot water load profile as consumers instead of an actual physical thermal zone model. Hence, the generation system's performance does not have an effect on the load, and comfort violations cannot be investigated. Moreover, Yu and Pavlak (2021) use clustering to select a representative control trajectory for the cooling set temperature without any closed-loop interaction with the system states. Consequently, they investigate a feedforward controller instead of a feedback controller. Apart from that, despite their great role in the energy transition, to the authors' best knowledge, only Löhr et al. (2019) investigate heat pump systems in the context of AMPC.
Hence, we observe a first research gap in a missing detailed assessment of heat pump systems with a coupled generation and consumer side and simultaneous assessment of continuous and discrete control variables. In addition, the focus has so far mostly been on non-residential buildings, i.e. offices (Bursill, O'Brien, and Beausoleil-Morrison 2020; Coffey 2013; Domahidi et al. 2014; Klaučo et al. 2014; May-Ostendorp et al. 2013, 2011; Piscitelli et al. 2019; Yang et al. 2021; Yu and Pavlak 2021; Žáčeková et al. 2015). Yet, especially in residential buildings, limited hardware and software requirements are a necessity. This is further motivated by the usually missing building automation system, which is fairly common in newer office buildings but not in residential ones. To the authors' best knowledge, only the studies (Drgoňa et al. 2018; Le, Bourdais, and Guéguen 2014; Löhr et al. 2019; Robillart, Schalbart, and Peuportier 2018; Tam et al. 2019) focus on residential buildings. However, as discussed above, none of the studies investigates a coupled generation and consumer system. Consequently, we refine the research gap regarding the missing application of AMPC to coupled residential building energy systems with heat pumps as heat supply system.
Furthermore, Table 1 reveals that the studies mostly focus on one approximator only. Among the few studies that benchmark several approximators, Drgoňa et al. (2018) compare regression trees with time-delay neural networks regarding control and computational performance. They find that the time-delay neural networks outperform the regression trees regarding both objectives. However, they do not compare their sophisticated machine learning approaches with more traditional statistical methods like, e.g. linear and logistic regression. In contrast, May-Ostendorp et al. (2013) apply the CART algorithm, adaptive boosting as well as generalized linear models. In this regard, we categorize the CART algorithm and adaptive boosting as sophisticated learning methods, while the generalized linear models are considered a more traditional approach. They find that all models perform well open-loop, but the generalized linear models experience a major performance decline in the closed-loop simulation. Furthermore, Domahidi et al. (2014) compare support vector machines and adaptive boosting based on decision stumps to imitate binary control decisions. However, they solely focus on one approximator model at a time, meaning that all control variables are predicted by the same model type. In this context, it would be interesting to compare different approximators for each control variable independently, resulting in a hybrid AMPC approach. Also, they do not benchmark their approach with traditional statistical methods.
To summarize, we detect a second research gap in a missing study investigating the concurrent control of discrete and continuous control signals while comparing sophisticated learners and traditional statistical methods and, therefore, applying a hybrid approximator concept.
Finally, usually, all of the MPC's input data serve as features for the approximator. In the context of AMPC, only Drgoňa et al. (2018), Domahidi et al. (2014), Le, Bourdais, and Guéguen (2014), and Yang et al. (2021) discuss the effect of feature dimension reduction. For example, in Drgoňa et al. (2018), the authors select the most significant features by a mix of manual selection, principal component analysis, and the building model's dynamic analysis. By doing this, they reduce the input space from 1518 features to 330 for the time-delay NN and 27 for the regression tree. Here, most of the features are discarded due to model disturbance dynamics analysis. This impressive work showed the feasibility of feature selection and engineering processes in the context of AMPC. Nonetheless, we are further interested in features that are easily (i) measurable and (ii) accessible. The latter concerns the availability of forecast data for, e.g. disturbance predictions. Even though these criteria apply to most features in Drgoňa et al. (2018), they do not apply to all. In addition, a comparison of the open- and closed-loop performance is missing, which might lead to a different feature selection result according to May-Ostendorp et al. (2013). Furthermore, they do not benchmark their controllers with AMPC approaches that use the full feature space instead of the reduced one. Apart from that, Domahidi et al. (2014) put high effort into analysing the feature influence on the adaptive-boosting-based approximator and benchmark it with a controller that uses the full feature space. Both studies (Domahidi et al. 2014; Drgoňa et al. 2018) have in common that they select different features tailored to each approximator. This is reasonable due to the different approximator nature and, thus, their sensitivity to the feature space. Nonetheless, we are interested in comparing the controllers based on the same feature set to prevent the overlap of different effects. Furthermore, they look into either regression or classification and not a mix of approximation models. Finally, Le, Bourdais, and Guéguen (2014) gradually discard features, reaching a feature space of only 2 features. The resulting approximator's closed-loop performance is still satisfactory, but a lower controller benchmark is missing and the target system is rather simple. Moreover, we aim at investigating the influence of the horizon length for disturbance forecasts to realize accurate controller performance. This is motivated by uncertainty in forecasts and the implied negative effect on the closed-loop performance as well as dimensionality reduction in general. This aspect has not yet been extensively dealt with by the scientific community.
Hence, we define a third research gap regarding the feature selection assessment for AMPC applications that comprise an open- and closed-loop comparison solely based on easily measurable and accessible features as well as regarding the influence of the prediction horizon.
Another aspect is the missing comparability among the studies. The common approach is to use the teacher MPC as upper benchmark and some RBC as lower benchmark. However, especially the lower benchmark controllers are not comparable among the studies since their application is often expert-based and use-case-specific. Furthermore, the regarded use cases themselves are difficult to compare due to their large diversity. Hence, we observe a fourth research gap regarding the transferability of the use case and the benchmark controller. Here, the open-source Building Optimization Testing Framework (BOPTEST) comes into play (Blum et al. 2021).
To summarize, the present study combines the following novelties in the field of AMPC:
(1) We investigate a residential building with a heat pump system whose generation and distribution system actively interact with the thermal zone model and whose control variables are of continuous and discrete nature.
(2) We carry out a detailed comparison of sophisticated tree- and non-tree-based machine-learning models with traditional statistical methods and develop and test a hybrid AMPC approach.
(3) We conduct an in-depth assessment of the influence of different feature sets on the open- and closed-loop performance, only considering easily measurable and accessible features to enhance the applicability in practice, as well as an investigation regarding the forecast horizon's influence on the control performance.
(4) We choose a transferable use case and benchmark controller based on BOPTEST's two-zone apartment with a hydronic heat pump system (Blum et al. 2021).

Use case and control task: two-zone apartment with hydronic heat pump system
The use case is a two-zone apartment with a hydronic heat pump system and floor heating (see Figure 1). The use case is part of BOPTEST and was first presented by Zanetti, Kim et al. (2022). The apartment's location is Milan in Italy and only heating is considered. The apartment consists of a living room and a bedroom whose room temperatures are controlled via two thermostats. The openings of the two thermostatic valves are two out of three manipulated variables. On the energy generation side, the heat pump's supply water temperature can be controlled, too. A pump circulates water based on the thermostat setpoints. When the pump is turned off, there is zero flow. Each open valve is assigned a set mass flow rate. Consequently, the pump either does not circulate water at all when both valves are closed, or it circulates 50% or 100% of the system's design mass flow rate. The apartment has a total area of 44.5 m² with the bedroom accounting for 22.5 m² and the living room for 22 m² (see Table 2). The heat pump is the Altherma model from Daikin with a nominal power of 5 kW. The apartment is modelled in the modelling language Modelica using the Buildings library (Wetter et al. 2014a). The two rooms are modelled as two thermally connected thermal zones. The rooms are further connected via a door.
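The relation between the two valve states and the circulated mass flow can be illustrated with a few lines of Python. This is a minimal sketch, assuming that each open valve contributes an equal half of the design mass flow rate, as the 0%, 50% and 100% levels above suggest; the function name `pump_flow_fraction` is hypothetical and not part of BOPTEST.

```python
def pump_flow_fraction(valve_day_open: bool, valve_night_open: bool) -> float:
    """Return the circulated share of the design mass flow rate (0.0, 0.5 or 1.0).

    Assumption: both floor heating circuits have the same set mass flow rate,
    so each open valve adds 50 % of the design flow.
    """
    return 0.5 * (int(valve_day_open) + int(valve_night_open))


# Both valves closed: the pump does not circulate water at all.
flow_closed = pump_flow_fraction(False, False)
# One valve open: half of the design mass flow rate.
flow_single = pump_flow_fraction(True, False)
# Both valves open: full design mass flow rate.
flow_both = pump_flow_fraction(True, True)
```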

Previous study: model predictive control concept
The baseline MPC corresponds to MPC4 from Zanetti, Kim et al. (2022). A grey-box model based on the resistance-capacitance (RC) analogy was used for the MPC. A three-capacity, seven-resistance (3C7R) circuit was adopted for each thermal zone. The three capacities, which are also the dynamic states of the MPC, are related to the room air temperature $T_{room}$, wall temperature $T_{wall}$ and floor temperature $T_{floor}$. Resistances connect the capacity nodes to each other and, furthermore, two resistances connect $C_{room}$ and $C_{wall}$ to the external temperature $T_{amb}$. The wall has a resistance that also connects it with the sky temperature $T_{sky}$. The disturbances are the hemispherical global radiation hitting the external wall and window (index s) and the internal gains (index int), divided between sensible and radiative components. Finally, the heat flow rate to the floor heating system, injected in the floor capacity $C_{floor}$, is calculated as

$$\dot{H}_i = u_i \, \dot{m}_{f,nom} \, c_p \, (T_{supply} - T_{ret}),$$

where $T_{supply}$ is the supply water temperature, $T_{ret}$ is the return water temperature and $c_p$ is the specific heat capacity of water. $\dot{m}_{f,nom}$ is the nominal value of the mass flow and $u_i$ is the valve position for zone $i$. The return/outlet temperature $T_{ret}$ was modelled with the following linear equation to correlate $T_{ret}$ with $T_{supply}$ and the floor temperature $T_{floor}$:

$$T_{ret} = w_{floor} \, T_{supply} + (1 - w_{floor}) \, T_{floor},$$

where $w_{floor}$ is the weighting factor for the identification process. $T_{supply}$ and $u_i$ are the control variables of the MPC, whose product leads to a nonlinear formulation.
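The heat flow calculation can be made concrete with a short numerical sketch. It assumes that the heat flow follows the usual enthalpy balance, i.e. valve position times nominal mass flow times specific heat capacity times the supply-return temperature difference, combined with the linear return temperature model stated in the text. The function names and all numerical values (mass flow, temperatures, weighting factor) are hypothetical and chosen for illustration only.

```python
CP_WATER = 4186.0  # specific heat capacity of water in J/(kg K)


def return_temperature(t_supply: float, t_floor: float, w_floor: float) -> float:
    """Linear return temperature model: T_ret = w * T_supply + (1 - w) * T_floor."""
    return w_floor * t_supply + (1.0 - w_floor) * t_floor


def floor_heat_flow(u: float, m_flow_nom: float, t_supply: float,
                    t_floor: float, w_floor: float) -> float:
    """Heat flow in W injected into the floor capacity of one zone.

    Assumed form: u * m_flow_nom * c_p * (T_supply - T_ret).
    The product of the two control variables u and T_supply is what
    makes the optimal control problem nonlinear.
    """
    t_ret = return_temperature(t_supply, t_floor, w_floor)
    return u * m_flow_nom * CP_WATER * (t_supply - t_ret)


# Hypothetical operating point: 40 degC supply, 30 degC floor, w_floor = 0.5.
t_ret_example = return_temperature(40.0, 30.0, 0.5)
q_open = floor_heat_flow(1.0, 0.1, 40.0, 30.0, 0.5)    # valve fully open
q_closed = floor_heat_flow(0.0, 0.1, 40.0, 30.0, 0.5)  # valve closed
```

Substituting the return temperature model shows that the heat flow is proportional to $(1 - w_{floor})(T_{supply} - T_{floor})$, so a weighting factor close to 1 would imply almost no heat transfer to the floor.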
A summary of the optimal control problem is reported in Table 3 in terms of optimization options, control variables, constraints and objectives. The control horizon of 12 h and the time step of 15 min are reasonable considering available weather forecasts and the slow dynamics of floor heating. A direct collocation method was implemented using the Pyomo (Bynum et al. 2021) framework to convert the continuous optimal control problem into a discrete programming problem, which leads to a nonlinear programming (NLP) formulation. Among the control variables, $\delta_{D/N}$ is an auxiliary variable representing a temperature deviation from a setpoint and is coupled with the constraint $C_{comf}$ and the objective $j_{comf}$. According to the constraint $C_{comf}$, the value of $\delta$ will be higher than zero if the room temperature $T_{room}$ is lower than the setpoint temperature $T_{set}$. In this case, it will be penalized by including $\delta^2$ in the objective $j_{comf}$. This will push the MPC to keep $T_{room}$ higher than the setpoint temperature. $T_{supply}$ is the supply temperature and can go up to the maximum temperature of 45 °C to avoid high temperatures in the floor, and down to a minimum temperature, defined as the adiabatic mixing temperature in constraint $C_{Tmin}$. The formulation of constraint $C_{Tmin}$ comes from a local energy and mass balance at the return outlet of the floor heating system under the assumption that the nominal flow rate is the same for all circuits. $T_{f,D/N}$ is the floor temperature and $u_{D/N}$ is the floor heating circuit valve control. $u_R$ represents a continuous relaxation of the valve control, so that the valve can continuously modulate. The objective functions section of Table 3 shows the objective function formulation for the optimization problem. The complete objective function $J_{tot}$ to be minimized is the weighted sum of the different objective components with weights $k_i$, where $i$ corresponds to a specific objective component. The weighting parameters $k_i$ need to be tuned to balance the impact of each objective on the total objective function $J_{tot}$. The objective $j_{en}$ is the energy cost and is calculated as the energy price $p_{el}$ multiplied by the total heat flow rate provided by the heat pump, $\dot{H}_D(t) + \dot{H}_N(t)$, divided by the heat pump COP. The COP is a function of the external and supply temperatures, $COP(T_{amb}, T_{supply})$. Here, the objective of the optimization problem becomes nonlinear because a control variable is present in the denominator of a fraction. $j_{comf}$ is the temperature mismatch between room temperature $T_{room}$ and setpoint $T_{set}$ and works as a comfort proxy. The switching frequency objective $j_{switch}$ is the sum of the squared valve control derivatives. This objective serves the purpose of penalizing undesirable sudden changes in the control variables. Finally, the binary constraint, $j_{bin}$, forces $u_D$ and $u_N$ to be close to either 0 or 1 to avoid having an objective greater than 0. The reasoning behind this constraint is to approximate a mixed-integer nonlinear problem (MINLP) as an NLP. To solve the discretized nonlinear programming problem, the interior point optimizer solver IPOPT (COIN-OR Foundation 2006) was coupled with Pyomo.
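The structure of the weighted objective can be sketched for a discretized trajectory. This is an illustrative evaluation on given trajectories, not the Pyomo model of the original study. Several details are assumptions: the comfort slack is evaluated as max(0, T_set - T_room), the valve derivative is replaced by a finite difference between time steps, and the binary term is written as u(1 - u), which is one common penalty that vanishes only at 0 and 1. The exact formulations are given in Table 3 of the source and may differ.

```python
import numpy as np


def objective_components(price, heat_flow, cop, t_room, t_set, u, dt_h):
    """Evaluate the four objective components on discretized trajectories."""
    price, heat_flow, cop, t_room, t_set, u = (
        np.asarray(a, dtype=float) for a in (price, heat_flow, cop, t_room, t_set, u)
    )
    # Energy cost: price * thermal power / COP, summed over time steps of length dt_h.
    j_en = float(np.sum(price * heat_flow / cop) * dt_h)
    # Comfort: squared slack, positive only when the room is colder than the setpoint.
    delta = np.maximum(t_set - t_room, 0.0)
    j_comf = float(np.sum(delta ** 2))
    # Switching: squared finite differences of the valve signal (assumed discretization).
    j_switch = float(np.sum(np.diff(u) ** 2))
    # Binary penalty: ASSUMED form u * (1 - u), zero only for u in {0, 1}.
    j_bin = float(np.sum(u * (1.0 - u)))
    return {"en": j_en, "comf": j_comf, "switch": j_switch, "bin": j_bin}


def total_objective(components, weights):
    """Weighted sum J_tot = sum_i k_i * j_i."""
    return sum(weights[name] * value for name, value in components.items())


# Hypothetical four-step example with 15 min steps (dt_h = 0.25 h).
components = objective_components(
    price=[0.2] * 4, heat_flow=[2.0] * 4, cop=[4.0] * 4,
    t_room=[20.0, 22.0, 21.0, 21.0], t_set=[21.0] * 4,
    u=[0.0, 1.0, 1.0, 0.0], dt_h=0.25,
)
j_tot = total_objective(components, {"en": 1.0, "comf": 1.0, "switch": 1.0, "bin": 1.0})
```

Because the COP depends on the supply temperature, the energy term is a ratio involving a control variable, which is the second source of nonlinearity mentioned above.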

Adapted boundary conditions
The study presented by Zanetti, Kim et al. (2022) assumes equal boundary conditions for both zones. Consequently, the zones can be controlled simultaneously since their setpoint temperature profile is identical. The original internal gains and setpoint temperature profiles are shown in Figure 2. Occupancy is assumed during the night with an absence during the day and full absence during the weekend.
In the present study, we want to adapt these boundary conditions aiming at a control task with two separate [...] (Seidinger and Menard 2006). In addition to that, we assume the door to be fully closed rather than open as presented in the original study (Zanetti, Kim et al. 2022) to further enable individual temperature control. Furthermore, we slightly adapt the dynamic electricity price profile as shown in Table 4. This adaptation aims at a better comparison to the constant price scenario implemented in BOPTEST since the dynamic price scenario's average equals the constant tariff's value in the adapted version.

Mathematical problem formulation
The previously described MPC functions as teacher controller for the AMPC. Closed-loop simulations are taken as a basis to generate training data for the approximator. The approximator's aim is to replace the implicit control laws generated by solving the optimal control problem by an explicit expression. That is, we try to find a classification or regression function based on a training data set with $n$ samples $\{(E^{(1)}, u^{(1)}), \ldots, (E^{(n)}, u^{(n)})\}$ that predicts the values of $u$ based on a feature set $E$ (Drgoňa et al. 2018). Consequently, the approximator computes the control law $u = f_\theta(E)$. The feature set $E$ comprises past and current measurements $y$, past and current disturbances $d$, predictions of disturbances $d$, and past and current system states $x$. The presented use case has three control variables: the heat pump's supply temperature and the openings of the two thermostatic valves. While we could find one approximator computing all of the three control variables simultaneously, this study develops a parallel tool chain instead. Therefore, we want to determine three approximators as illustrated in Figure 4.
The two approximators computing the valve openings are of discrete nature; hence, we deal with a classification problem, while the computation of the heat pump's supply temperature is a regression problem. In the case of continuous manipulated variables $u$ deriving from a MISO problem, like in the case of the supply temperature, the target function is a multivariate regression function $f_\theta: \mathbb{R}^{n_E} \rightarrow \mathbb{R}$. The function is obtained with the help of a training set of $E^{(i)} \in \mathbb{R}^{n_E}$ and $u^{(i)} \in \mathbb{R}$ based on the following objective (Drgoňa et al. 2018):

$$\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} \left( f_\theta(E^{(i)}) - u^{(i)} \right)^2 .$$

For the classification problem, we use the cross entropy function that evaluates the difference between the predicted probability distribution $f_\theta(E^{(i)})_j$ and the observed distribution $u_j^{(i)}$:

$$\min_{\theta} \; -\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1}^{n_u} u_j^{(i)} \log f_\theta(E^{(i)})_j ,$$

where $n_u = 2$.
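The parallel tool chain with one regressor and two classifiers can be sketched with scikit-learn. The following is a minimal illustration on synthetic data, not the AddMo-based workflow of this study: the feature matrix, the synthetic teacher outputs and the dictionary keys are hypothetical, and random forests stand in for any of the investigated approximator types. The mean squared error and the cross entropy (log loss) are evaluated in-sample purely to show which loss belongs to which control variable.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import log_loss, mean_squared_error

rng = np.random.default_rng(0)
n_samples, n_features = 300, 5
E = rng.normal(size=(n_samples, n_features))  # synthetic feature matrix

# Synthetic stand-ins for the teacher MPC outputs (hypothetical relations).
targets = {
    "T_supply": 35.0 + 3.0 * E[:, 0] - 2.0 * E[:, 1],      # continuous -> regression
    "u_D": (E[:, 2] > 0.0).astype(int),                     # binary -> classification
    "u_N": (E[:, 3] + 0.5 * E[:, 4] > 0.0).astype(int),     # binary -> classification
}

# One independent approximator per control variable.
approximators = {
    "T_supply": RandomForestRegressor(n_estimators=30, random_state=0),
    "u_D": RandomForestClassifier(n_estimators=30, random_state=0),
    "u_N": RandomForestClassifier(n_estimators=30, random_state=0),
}
for name, model in approximators.items():
    model.fit(E, targets[name])


def control_law(features):
    """Evaluate u = f_theta(E) for one feature vector with all three approximators."""
    row = np.asarray(features, dtype=float).reshape(1, -1)
    return {name: model.predict(row)[0] for name, model in approximators.items()}


# In-sample losses, for illustration only.
mse_supply = mean_squared_error(targets["T_supply"], approximators["T_supply"].predict(E))
ce_day = log_loss(targets["u_D"], approximators["u_D"].predict_proba(E))
u_now = control_law(E[0])
```

Keeping the three models separate is what allows different model types to be combined per control variable, for example an ANN for the supply temperature and a random forest for the valves.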

Training process
We refer to all regression or classification functions as approximators. These approximators are determined based on [...] supervised learning problem. To find the optimal combinations of features and hyperparameters, we use the Python-based tool AddMo (Rätz et al. 2019). [...] For the present study, we apply the RobustScaler from scikit-learn. The period selection is carried out expert-assisted, covering all relevant seasons. The individual feature sets, including their engineering, are discussed for each feature set separately in Subsection 2.3. For the adapted feature set, we apply the wrapper method using Random [...] In the present study, we utilize manual model selection since the algorithm choice is predetermined. In addition, Bayesian optimization is used since it is more efficient than grid search while still obtaining good results (Rätz et al. 2019). The training process is carried out using k-fold cross-validation. The whole training process is also referred to as open-loop training since the controller does not interact with the actual target system. After the model tuning phase, we replace the optimal control problem with the approximator as illustrated in Figure 5. The depicted closed-loop control scheme consists of the respective controller (either OCP or approximator) and a simulation model. The simulation model is the substitute for a real-life system. The model is provided by BOPTEST and modelled using Modelica. Here, models from the IDEAS (Jorissen et al. 2018), Buildings (Wetter et al. 2014b), and the IBPSA library (Wetter, Blum, and Hu 2019) are utilized. The plant model is exported as a functional mock-up unit (FMU). The interaction between the controller and the model is done via pyFMI (Andersson, Åkesson, and Führer 2016). For each control step, the OCP or the approximator receives perfect forecast information on future weather data, electricity prices and internal gains and computes the discussed control variables. The plant model is simulated for a time step of 15 min. The closed-loop simulation results serve as training data for the approximators. While it is common for, e.g. data-driven MPC approaches to apply system identification measures like system excitation (Stoffel et al. 2023), we use simulation results of a whole heating period as training data. The heating period is assumed to start in October and last until the end of April. Of the simulation results, 6 months (October to January and March to April) serve as training and validation data, while the data from February is used as test data.
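The data split and scaling step can be sketched as follows. This is a minimal illustration, assuming 15 min samples over one heating period; the calendar year, the synthetic feature matrix, the number of folds (k = 5) and the choice to fit the scaler on the training months only are assumptions that the text does not specify.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.preprocessing import RobustScaler

# 15 min time stamps from October to the end of April (year chosen arbitrarily).
timestamps = np.arange(
    np.datetime64("2021-10-01T00:00"),
    np.datetime64("2022-05-01T00:00"),
    np.timedelta64(15, "m"),
)
# Calendar month (1..12) of every sample.
months = timestamps.astype("datetime64[M]").astype(int) % 12 + 1

# February is held out as test data; all other months are training/validation data.
test_mask = months == 2
train_mask = ~test_mask

rng = np.random.default_rng(0)
X = rng.normal(loc=20.0, scale=3.0, size=(len(timestamps), 4))  # synthetic features

# Fit the scaler on the training months only to avoid leaking test statistics.
scaler = RobustScaler().fit(X[train_mask])
X_train = scaler.transform(X[train_mask])
X_test = scaler.transform(X[test_mask])

# k-fold cross-validation on the training months (k = 5 is an assumption).
fold_sizes = [
    (len(train_idx), len(val_idx))
    for train_idx, val_idx in KFold(n_splits=5).split(X_train)
]
```

The RobustScaler centres each feature on its median and scales by the interquartile range, which makes it less sensitive to outliers than standardization by mean and variance.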

Feature sets
In the present study, we investigate two different feature sets to better understand the optimal feature selection for AMPC applications. Table 5 lists all features for both sets and the time steps that are considered. Both sets cover current states and current and future values of the disturbances. The first feature set follows the idea that the AMPC uses almost all inputs that its teacher MPC utilizes as well. These inputs are the result of expert knowledge and physical significance in accordance with the constraints of the MPC formulation. The set contains the process variables, i.e. the room air temperatures of both zones, as well as important physical quantities of the hydraulic circuit. Among these are the floor heating system's average water temperature and the return temperature. In addition to these state variables, all disturbances affecting the system are used as inputs for the AMPC. The adapted feature set, however, uses fewer features to simplify the approximator data base. In addition, two engineered features, namely the deviations from the set room temperatures, are taken as a basis. Furthermore, the adapted feature set excludes the return temperature as a feature to prevent autoregressive behaviour. Regarding the disturbance predictions, we do not use the full 12 h prediction horizon with a time resolution of 15 min as features in both feature sets. Instead, we assume a resolution of 60 min and a horizon of 6 h to reduce the number of features and, hence, accelerate the training process. The reduction in the disturbance prediction horizon is why we noted that the basic feature set only utilizes almost all instead of all MPC inputs.
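The reduction of the disturbance forecasts and the engineered deviation feature can be illustrated with a short sketch. It assumes that the coarser forecast is obtained by picking every fourth 15 min value (rather than averaging) and that the deviation is defined as setpoint minus room temperature; both conventions, as well as the function names, are hypothetical, since the text does not specify them.

```python
import numpy as np


def forecast_features(forecast_15min, horizon_h=6, step_min=60, base_step_min=15):
    """Reduce a 15 min forecast to a coarser resolution and a shorter horizon.

    Assumption: values are sub-sampled (every step_min / base_step_min-th entry),
    starting with the current value at index 0.
    """
    stride = step_min // base_step_min            # 4 samples per hour
    n_steps = horizon_h * 60 // base_step_min     # 24 samples within 6 h
    return np.asarray(forecast_15min)[:n_steps:stride]


def setpoint_deviation(t_set, t_room):
    """Engineered feature: deviation of the room temperature from its setpoint.

    Assumed sign convention: positive when the room is colder than the setpoint.
    """
    return t_set - t_room


# A 12 h forecast at 15 min resolution has 48 values; the reduced one has 6.
full_forecast = np.arange(48)
reduced_forecast = forecast_features(full_forecast)
deviation_example = setpoint_deviation(21.0, 20.5)
```

For each disturbance, this reduces the forecast contribution from 48 to 6 features, which is the dimensionality reduction that the shorter horizon and coarser resolution are meant to achieve.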

Approximator configurations
Another research question that this study addresses is the correct choice of approximators. As discussed in Section 1, we observe a research gap in a detailed comparison between tree-based models and ANNs as well as between sophisticated machine learning models and more traditional statistical methods in general. To close this gap, we investigate different approximator configurations as shown in Table 6. As approximators, we focus on multilayer perceptrons as ANN and the RF. RF is a representative of tree-based ensemble methods and has proven to outperform traditional decision trees (Idowu et al. 2016). The authors of Callens et al. (2020) even demonstrate its ability to outperform ANNs. Since RF is also based on decision trees, it is suitable for the field of explainable AI. Yet, we would like to highlight that making the learned RF more comprehensible and interpretable is subject to future work and not part of this study. As representatives of the traditional statistical methods, linear and logistic regression are used. The five configurations cover purely ANN-based (AMPC 1) and purely RF-based (AMPC 2) configurations as well as mixes of ANN and RF models (AMPC 3 and 4) for the respective control variables. In contrast, AMPC 5 only utilizes the traditional learners.

Benchmark controllers
The idea behind AMPC is to mimic the teacher MPC as closely as possible while clearly outperforming traditional RBCs. Consequently, the MPC is the upper benchmark while the RBC serves as lower benchmark. The teacher MPC has already been discussed in Section 2.2. As rule-based benchmark controller, the built-in controller in BOPTEST is used. The controller assumes a linear heating curve to obtain the heat pump's supply temperature. The valve openings are controlled using a comfort band which is defined as follows:

Results
The following section presents the simulation results. First, the teacher MPC and rule-based benchmark controller are discussed (see Section 3.1). Following this, the AMPC is presented based on two different feature sets. We first investigate the open-loop and closed-loop training results for the basic feature set (see Section 3.2). Based on the results derived from the basic feature set, we evaluate the adapted feature set in Section 3.3. The results are discussed for the month of February, which is a month with colder and warmer periods and hence best reflects the overall performance of the controllers.

Teacher model predictive controller and benchmark rule-based controller
First, we compare the teacher MPC and the RBC for the adapted boundary conditions discussed in Section 2.3. Figure 6 shows the room air temperature courses for both thermal zones for an exemplary week in February. Compared to the preliminary study by Zanetti, Kim et al. (2022), the two zones' temperatures deviate due to the adapted boundary conditions. In general, the MPC clearly outperforms the RBC regarding both the electricity costs and the thermal comfort. The MPC yields specific electricity costs of 0.62 €/m² and a thermal discomfort of 25 Kh/zone in February. In contrast, the RBC results in electricity costs of 0.92 €/m² and a thermal discomfort of 100 Kh/zone. Consequently, the MPC results in a cost decrease of roughly 30% while reducing the thermal discomfort by 75%. This behaviour is also reflected by the temperature courses in Figure 6. We observe that the MPC violates some of the temperature constraints. This behaviour is explained by the discrete nature of the valve control as well as the cost function weight determination. Nonetheless, it is an essential improvement over the RBC.
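The thermal discomfort in Kh/zone can be read as the time integral of the temperature deviation outside the comfort band. The following minimal Python sketch illustrates this under stated assumptions: a rectangular integration rule, a 15 min sampling interval, and a hypothetical upper bound of 24 °C. The function name `thermal_discomfort_kh` is illustrative and not part of BOPTEST.

```python
def thermal_discomfort_kh(temps_c, lower_c, upper_c, step_h=0.25):
    """Integrate comfort-band violations over time and return kelvin-hours (Kh).

    temps_c, lower_c, upper_c: equally long sequences of zone temperature,
    lower comfort bound, and upper comfort bound in degrees Celsius.
    step_h: sampling interval in hours (assumed 15 min here).
    """
    total = 0.0
    for temp, low, high in zip(temps_c, lower_c, upper_c):
        # Only the part of the temperature outside the band counts.
        violation = max(low - temp, 0.0) + max(temp - high, 0.0)
        total += violation * step_h
    return total


# Four 15 min samples: 1 K too cold for two steps, then within the band.
temps = [20.0, 20.0, 21.5, 22.0]
lower = [21.0] * 4
upper = [24.0] * 4  # hypothetical upper bound
print(thermal_discomfort_kh(temps, lower, upper))  # 0.5
```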

Open-loop performance
Following the results of the upper and lower benchmark, we first investigate the AMPCs with the basic feature set. The open-loop results for the heat pump supply temperature are depicted in Figure 7 and the ones for the valve openings in Figure 8. We evaluate the open-loop performance based on the R² value and the mean absolute error (MAE). Both the ANN and the RF result in an R² value of 0.96, while linear regression achieves a value of 0.90. Regarding the MAE, the RF yields the lowest error of 0.36 °C, while the ANN results in an MAE of 0.40 °C and the linear regression in one of 0.79 °C. All in all, the regression accuracy for the basic feature set is very high. For the valve openings, the accuracy and the F1 score serve as KPIs for the open-loop performance. The F1 score is the harmonic mean of precision and recall and thus balances both metrics. In the following sections, the living room is referred to as the day zone (D) and the bedroom as the night zone (N). Here, F1 score(0) denotes the F1 score for the closed valve and F1 score(1) the one for the open valve. The results show that the ANN has the highest accuracy of 0.99 for both valves. The RF and the logistic regression result in lower accuracy values for u_D of 0.91 and 0.90, respectively, and for u_N both yield an accuracy of 0.97, which is similar to that of the ANN. Considering the F1 score(0) and F1 score(1), we see that the ANNs for u_D and u_N show almost identically high approximation accuracies of 0.98 for both valve positions. For the RF and the logistic regression, the scores for u_N are also similar with a value of 0.97, while for u_D, the values of F1 score(0) and F1 score(1) differ by 0.1 for the RF and by 0.24 for the logistic regression. The difference can be explained by the fact that the living room valve u_D is open 80% of the time, resulting in unevenly distributed data. This imbalanced distribution leads to the lower prediction quality of the RF and the logistic regression.
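The open-loop KPIs used above can be written down in a few lines. The following Python sketch uses only the standard library; the function names and the toy valve data are illustrative assumptions and do not stem from the study. The example shows how an imbalanced valve signal (open 80% of the time) can produce a high accuracy while the F1 score of the rare class is considerably lower.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination R^2."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error (MAE)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    """Share of correctly predicted labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 score of the class `positive` (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy valve signal: open (1) in 8 of 10 samples, one closed sample is missed.
valve_true = [1] * 8 + [0, 0]
valve_pred = [1] * 8 + [1, 0]
print(accuracy(valve_true, valve_pred))      # 0.9
print(f1_score(valve_true, valve_pred, 1))   # 16/17, about 0.94
print(f1_score(valve_true, valve_pred, 0))   # 2/3, about 0.67
```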

Closed-loop performance
Following the open-loop analysis, we apply the AMPC configurations to the simulation system for the month of February and analyse the closed-loop results. Figure 9 illustrates the resulting electricity costs and thermal discomfort (top) as well as the room air temperature for the living room for an exemplary week in February.
The operating costs of all five AMPCs are lower than those of the MPC. However, the control by the AMPCs results in significantly higher thermal discomfort. While the MPC has a discomfort of 25 Kh/zone, the thermal discomfort of the AMPCs ranges from 300 to 2800 Kh/zone. It is also considerably higher than the RBC's discomfort of 100 Kh/zone, although the latter is supposed to serve as the lower benchmark. It is noticeable that the AMPCs with pure ANN (AMPC 1) and the ANN and RF combination (AMPC 3) show similar KPI values, and we observe the same for RF (AMPC 2) and RF and ANN (AMPC 4). The approximator that is only based on linear and logistic regression yields the lowest electricity costs but also the highest thermal discomfort of 2800 Kh/zone. Because of the similarity of AMPC 1 and 3 as well as AMPC 2 and 4, we only depict the room air temperature curves for AMPC 1 and AMPC 2 as representatives for AMPC 3 and AMPC 4, respectively. When analysing the room air temperature courses, we observe that the RF-based and the traditional regression-based approaches clearly violate the set temperature constraints. While the traditional regression-based approach shows a continuous decline in temperature between Monday and Friday with a steep rise on Friday, the RF-based approximator shows a decline over the whole week. Following Tuesday night, the RF-based approach does not succeed in keeping the temperature constraints. In contrast, the ANN-based approach violates the constraints significantly on Monday and Sunday but roughly follows the MPC's course during the remaining days. This also explains its comparatively good performance.
To better understand the behaviour of the RF-based approach, we compare its output with the MPC's optimized set supply temperature. The corresponding set temperature course is illustrated in Figure 10. Here, we clearly see why the room air temperature slowly drops starting from Tuesday. After keeping the set supply temperature at the maximum level of 32.5 °C, the predicted temperature decreases to values below 26 °C and does not exceed this threshold afterwards. This behaviour is explained by the feature selection. Once the supply temperature decreases due to the intended night setback, the prediction stays at that level due to its dependency on the return temperature. The return temperature is inherently a highly correlated feature for the supply temperature. However, the high correlation results in a high dependency and, hence, leads to the autoregressive behaviour that we observe in Figure 10.
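The lock-in effect can be reproduced with a deliberately simplified toy loop. The sketch below is not the study's model: it assumes an approximator that has learned "supply temperature = return temperature + spread" and a plant whose return temperature trails the supply temperature by the same spread. The spread of 4 K, the function names, and the externally imposed setback are all hypothetical.

```python
SPREAD_K = 4.0  # hypothetical supply/return temperature spread


def approximator(return_temp_c):
    """Toy approximator that relies only on the return temperature."""
    return return_temp_c + SPREAD_K


def plant(supply_temp_c):
    """Toy plant: the return temperature trails the supply temperature."""
    return supply_temp_c - SPREAD_K


def simulate(initial_supply_c, setback_supply_c, setback_step, n_steps):
    """Run the closed loop and return the supply temperature trajectory."""
    supply = initial_supply_c
    trajectory = []
    for step in range(n_steps):
        if step == setback_step:
            # Night setback, imposed externally for illustration.
            supply = setback_supply_c
        else:
            # The prediction depends on a feature that it determines itself.
            supply = approximator(plant(supply))
        trajectory.append(supply)
    return trajectory


print(simulate(32.5, 26.0, setback_step=3, n_steps=8))
# [32.5, 32.5, 32.5, 26.0, 26.0, 26.0, 26.0, 26.0]
```

Because the only input is a quantity driven by the previous output, the loop has no mechanism to leave the setback level, which mirrors the behaviour described for Figure 10.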
Overall, we see that despite the good open-loop performance as shown in Figure 8, the closed-loop performance deviates clearly. A similar effect has also been reported by May-Ostendorp et al. (2013). Even though the ANN-based approximators clearly outperform the RF- and traditional regression-based ones, none of the AMPC approaches results in a sufficient closed-loop operation. Therefore, we adapt the feature set to mitigate the observed autoregressive behaviour.

Adapted feature set
The poor closed-loop performance exemplifies that the feature selection should not only target a good open-loop performance but also consider the resulting controller behaviour. To account for this, we adapt the feature set as listed in Table 5. The main idea is to exclude features that trigger autoregressive behaviour like the return temperature, exclude potentially less-correlating features, and include more informative features that prevent the operation in low temperature regimes. Furthermore, we want to concentrate on features whose forecasts are easily accessible in practice. Hence, we introduce the deviation from the set temperatures in the bedroom T_set,N and the living room T_set,D as additional features. Apart from that, we exclude the sky temperature, the dew point temperature, and the COP since their effect is already captured by the ambient temperature, which leads to high cross-correlations, and since their forecasts are difficult to access. In addition, we exclude the internal gains since predicting them is not realistic in practice. Finally, the electricity price is excluded from the feature set because its correlation is rather low. The electricity tariff is a time-of-use tariff. Therefore, it is not a continuously changing signal, whose correlation with the supply temperature would presumably be higher. All in all, the new feature set consists of only 43 features compared to 75 features in the basic set.
The feature importance rating for both feature sets is given in Figure 11. The feature importance is rated based on the RF and the prediction of the heat pump's supply temperature. This inherent feature selection is categorized as an embedded feature selection process and evaluates to what extent the features reduce the trees' impurity (Rätz et al. 2019). That is, the more a feature lowers the impurity, the higher its importance. The comparison of both feature importance ratings reveals that the importance is balanced more evenly among the features for the adapted set. When analysing the basic feature set, we find that the heat pump's return temperature has a feature importance of 88%, which is significantly higher than that of the remaining features. In contrast, for the adapted feature set, the feature with the highest importance is the bedroom's air temperature with 28%, followed by the ambient temperature with 18% and the bedroom's valve opening with 15%. We rate this as a more intuitive feature importance since it better reflects the control task of controlling the room air temperature based on comfort boundaries.
For the adapted feature set, the already discussed toolchain is repeated.

Open-loop performance of the adapted feature set
Figure 12 depicts the open-loop results for the supply temperature prediction, while Figure 13 illustrates the ones for the valve openings. Compared to the open-loop accuracy presented in the previous section, the R² values for all approximators are lower. The ANN's R² value drops from 0.96 to 0.89, the RF's from 0.96 to 0.86, and the linear regression's from 0.90 to 0.66. The same effect is observed for the MAE of all algorithms. For the ANN and the RF, the MAE roughly doubles to 0.75 K and 0.71 K, respectively, and for the linear regression the MAE increases from 0.79 K to 1.28 K.
Figure 13 presents the results of the open-loop test for the valve positions. The figure shows the overall accuracy and the F1 score for the respective classes. Since the same data was used for training as presented in the previous section, the data for the living room is still unevenly distributed. The comparison of the algorithms reveals that the ANN predicts the valve position u_D for the living room best. The ANN results in the highest accuracy for u_D of 0.95, the highest F1 score(0) of 0.84, and the highest F1 score(1) of 0.97. The RF has a similar approximation quality for u_D. The logistic regression yields a significantly lower F1 score(0) of 0.73 for u_D than the other models and a similar F1 score(1).
For the predictions of the positions u_N for the bedroom, the accuracy of all three algorithms is 0.90, which is lower than the accuracy for u_D. In addition, the F1 score(0) and F1 score(1) of the respective algorithms are almost identical to the accuracy. In summary, the approximation accuracies of the ANN and RF are similar, and the values for the logistic regression are lower.

Closed-loop performance of the adapted feature set
Analogous to the procedure in Section 3.2.2, the different approximators are applied to the simulation model. In the upper part, Figure 14 shows the operating costs plotted against the thermal discomfort of the MPC, the RBC and the AMPCs in February. The four AMPCs with the ML algorithms result in both lower operating costs and lower thermal discomfort compared to the RBC. The MPC shows a thermal discomfort of 25 Kh/zone and operating costs of 0.62 €/m². The RBC yields a thermal discomfort of 101 Kh/zone and operating costs of 0.92 €/m².
Figure 14. Closed-loop operation results for the controllers for the adapted feature set for the month of February. The top plot illustrates the operating costs and thermal discomfort for the month of February, while the bottom plot shows the room temperatures for an exemplary week in February.
The AMPC with the simple regression models (AMPC 5) shows a discomfort of 327 Kh/zone with operating costs of 0.58 €/m². Thus, at a similar operating cost level, AMPC 5 results in a much higher discomfort than the other controllers. The AMPCs with ANN only (AMPC 1) and the mix of ANN and RF (AMPC 3) can best mimic the operating behaviour of the MPC. They achieve the same operating costs and a slightly higher discomfort of 32 Kh/zone and 30 Kh/zone, respectively. The AMPCs with RF only (AMPC 2) and RF and ANN (AMPC 4) generate a higher thermal discomfort of 55 Kh/zone and lower operating costs of 0.56 €/m².
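The reported February KPIs can be put in relation to the RBC with a short calculation. The following Python sketch uses only the rounded values quoted in the text; the dictionary layout and the helper `relative_reduction` are illustrative assumptions, and the assignment of 32 and 30 Kh/zone to AMPC 1 and AMPC 3 follows the order in which they are mentioned.

```python
# Rounded February KPIs as quoted in the text: (cost in EUR/m2, discomfort in Kh/zone).
RBC = (0.92, 101.0)
CONTROLLERS = {
    "MPC": (0.62, 25.0),
    "AMPC 1": (0.62, 32.0),
    "AMPC 3": (0.62, 30.0),
    "AMPC 2/4": (0.56, 55.0),
    "AMPC 5": (0.58, 327.0),
}


def relative_reduction(value, reference):
    """Reduction of `value` relative to `reference` in percent (negative = increase)."""
    return 100.0 * (reference - value) / reference


for name, (cost, discomfort) in CONTROLLERS.items():
    cost_saving = relative_reduction(cost, RBC[0])
    discomfort_cut = relative_reduction(discomfort, RBC[1])
    print(f"{name}: cost {cost_saving:+.0f}%, discomfort {discomfort_cut:+.0f}%")
```

With these rounded inputs, AMPC 3 reaches cost savings of about 33% and a discomfort reduction of about 70% relative to the RBC, whereas AMPC 5 yields a negative discomfort reduction, i.e. an increase.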
In the lower part of Figure 14, the operating behaviour of the different schemes is shown for one week in February. Since AMPC 1 (ANN only) and AMPC 3 (ANN and RF) behave similarly, only the response of AMPC 1 is shown. Likewise, only the response of AMPC 2 (RF only) is shown as a representative for AMPC 4 (RF and ANN). The four AMPCs with ML algorithms can reproduce the behaviour of the MPC well and show only slight deviations in the room temperature control. Focusing on the room temperature response of the AMPC with the simple regression models (AMPC 5), it can be seen why the thermal discomfort is so high. At the beginning of the week, AMPC 5 follows the course of the MPC, but it fails to provide the set temperature in the middle of the week. In addition, AMPC 5 sometimes shows large upward deviations in room temperature and controls the living room air temperature to 25 °C.

Influence of forecast horizon
As a last feature set assessment, we evaluate the influence of different prediction horizons as features for the AMPC. In the previous sections, the disturbance predictions always assumed a prediction horizon of 6 h, which is 50% of the MPC's prediction horizon. In this section, we compare the closed-loop performance of the RF-based approximators (AMPC 2) for a forecast horizon ranging between 0 and 6 h. The corresponding features are marked with future time steps in Table 5. Figure 15 illustrates the results for February. Here, we only discuss the results for AMPC 2 since the other approximators result in a comparable behaviour. Except for the AMPC with no predictions, i.e. an offline AMPC, all AMPC configurations clearly outperform the RBC. The offline AMPC results in operating costs of 0.58 €/m² and a thermal discomfort of 130 Kh/zone. Consequently, the offline AMPC's operating costs fall well below the RBC's costs of 0.92 €/m², but its discomfort is significantly higher than the RBC's discomfort of 102 Kh/zone. When adding more forecast information, we observe that the AMPCs' performance gradually improves until it saturates at a forecast horizon of 3 h. Here, the operating costs stagnate at around 0.57 €/m² and the thermal discomfort at 54 Kh/zone. This information is valuable for the feature selection process. In addition, by reducing the number of required future predictions, we reduce the induced uncertainty and, hence, the difference between perfect foresight and real operation.
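Varying the forecast horizon amounts to changing how many future disturbance values enter the feature vector. The Python sketch below illustrates this idea under stated assumptions: hourly forecast steps, a dictionary-based feature layout, and hypothetical feature names such as `T_amb_t+1`. It does not reproduce the feature set of Table 5.

```python
def build_features(current_state, forecasts, horizon_h, step_h=1.0):
    """Combine current measurements with forecast values up to `horizon_h` hours.

    current_state: dict of measured features at the current time step.
    forecasts: dict mapping a disturbance name to its future values,
               ordered from the next step onwards.
    horizon_h: forecast horizon in hours (0 means no predictions at all).
    step_h: spacing of the forecast values in hours (assumed hourly here).
    """
    n_steps = int(round(horizon_h / step_h))
    features = dict(current_state)
    for name, values in forecasts.items():
        if len(values) < n_steps:
            raise ValueError(f"forecast for {name} is shorter than the horizon")
        for k in range(1, n_steps + 1):
            features[f"{name}_t+{k}"] = values[k - 1]
    return features


state = {"T_amb": 2.0, "T_room_N": 19.2}  # hypothetical measurements
forecast = {"T_amb": [1.5, 1.0, 0.5, 0.0, -0.5, -1.0]}

offline = build_features(state, forecast, horizon_h=0)
three_hours = build_features(state, forecast, horizon_h=3)
print(len(offline), len(three_hours))  # 2 5
```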

Transferability assessment for a different weather scenario
To test the potential of AMPCs for a full heating season from October to April, the controls are applied to the simulation model with a second weather data set for Milan, Italy. Consequently, the location is the same but a different weather scenario is analysed to investigate the controllers' transferability to unseen inputs. The weather data records a lower minimum temperature of −8.7 °C (compared to −7.4 °C) and a higher mean temperature of 11.95 °C (compared to 11.71 °C) than the test and training data sets of the previous sections.
Figure 16 illustrates the operating costs plotted against the thermal discomfort of the tested controllers.
The previously trained AMPCs are applied to the simulation model with a new weather scenario but without any additional changes and without retraining. Only the AMPCs with the machine learning models are used for validation, since the AMPC with the simple regression models did not manage to sufficiently mimic the operating behaviour of the MPC for the test month. The MPC and RBC are also tested with this weather data and serve as benchmarks again. We highlight at this point that the results for the subsequent analysis are calculated for the full heating period (October to April). In the previous assessments, the costs and thermal discomfort only covered the month of February. For this scenario, the four ML-based AMPCs also show significantly better KPIs than the RBC. The AMPCs using the same algorithm for the supply temperature also result in similar KPIs in each case.
This time, the two AMPCs with the RF for the supply temperature can better mimic the MPC. The AMPCs with the RF (AMPC 2) and the mix of RF and ANN (AMPC 4) result in slightly lower operating costs than the MPC with 1.90 €/m², but a thermal discomfort almost twice as high. It is noteworthy that the KPIs now cover the full heating period and not only the month of February. Consequently, the costs and thermal discomfort are higher overall. The MPC yields a discomfort of 43 Kh/zone and the two AMPCs one of 71 Kh/zone. The AMPCs with the ANN for the supply temperature have a higher thermal discomfort of 82 Kh/zone for the mix of ANN and RF (AMPC 3) and of 100 Kh/zone for the ANN only (AMPC 1). In return, both AMPCs achieve lower operating costs of 1.70 €/m². Among the AMPCs with the same algorithm for the supply temperature, the AMPC with the RF for the valve positions shows the better KPIs.

Computational effort of controllers
Lower hardware and software requirements are one motivation to deploy AMPC and facilitate the application of advanced control strategies in practice. The AMPC can be deployed based on open-source software only and without the need for an expensive solver. However, beyond these rather qualitative aspects, we also want to quantify its computational effort compared to the benchmark controllers in this study. This is why we determine the average computation time for each control step. The results are depicted in Figure 17. We observe a clear trend that the computation times of the AMPC approaches fall between those of the MPC and the RBC. All approximators perform similarly, achieving average computation times of 0.18 s to 0.21 s. Among the approximators, the purely ANN-based approach is the fastest, but the deviation from the remaining approximators is negligible. Overall, the computation time of the approximators falls between 15% and 18% of the MPC's. However, with an average computation time of 1.17 s, the MPC is clearly real-time capable considering a time step of 15 min. In comparison to the RBC, the approximators result in a 3.5 to 4 times higher computation time. We would like to highlight at this point that the presented computation times should only be compared relative to each other and not as absolute quantities. This is explained by the fact that the computation times are determined within the BOPTEST framework, which involves an expensive data infrastructure. In addition, none of the controllers were especially tailored to short computation times. For example, within the AMPC configurations, we rely on expensive Pandas evaluations. Nonetheless, since this applies to all controllers, their comparison is relatively fair.
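An average computation time per control step can be obtained by timing each controller call and averaging. The following Python sketch shows one possible way to do this with the standard library; `average_step_time` and `dummy_controller` are hypothetical names, and the measurement in the study was taken inside the BOPTEST framework rather than with this code. The last lines reproduce the reported 15% to 18% range from the quoted averages.

```python
import time


def average_step_time(controller, inputs, n_steps):
    """Call `controller(inputs)` n_steps times and return the mean duration in seconds."""
    durations = []
    for _ in range(n_steps):
        start = time.perf_counter()
        controller(inputs)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def dummy_controller(inputs):
    """Stand-in for a controller evaluation (hypothetical)."""
    return sum(inputs) / len(inputs)


mean_s = average_step_time(dummy_controller, [20.0, 21.0, 19.5], n_steps=100)
print(f"mean step time: {mean_s:.2e} s")

# Relative effort of the approximators compared to the MPC, using the reported averages.
MPC_TIME_S = 1.17
for ampc_time_s in (0.18, 0.21):
    print(f"{100 * ampc_time_s / MPC_TIME_S:.0f}% of the MPC's computation time")
```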

Discussion
Despite the results indicating this trend, we do not want to draw the general conclusion that traditional statistical methods are unable to outperform sophisticated machine learning models or RBCs. We put the focus on linear and logistic regression due to their simplicity, but there exist more sophisticated but still traditional parametric learning methods. Nonlinear regression or generalized linear models are examples. In addition, we investigate RF as a representative of tree-based algorithms due to its high potential shown in literature. Yet, a comparison with simple decision trees would be an interesting enhancement of this study. Regarding the algorithm choice, we experienced that for our framework, training the RF was much faster and more straightforward than finding the optimal hyperparameters for the ANN. We acknowledge that this highly depends on the implementation, but training ANNs is generally a challenge since it involves a larger set of hyperparameters.
A comparison of the results reveals that the supply temperature prediction has a more significant influence on the closed-loop performance than the valve opening prediction. For example, if we compare ANN and ANN/RF (AMPC 1 and 3) or RF and RF/ANN (AMPC 2 and 4), respectively, we see only minor differences in the closed-loop KPIs. In contrast, when analysing ANN/RF vs. RF/RF (AMPC 3 and 2), the closed-loop performances clearly deviate. Therefore, it would be interesting to see if a mix of traditional statistical methods to predict the valve opening and a sophisticated machine learning model to predict the supply temperature would result in a performance similar to the presented ones.
Apart from that, we show that the adapted feature set outperforms the basic feature set despite a worse open-loop accuracy. We assume that an overlap of effects causes this behaviour. Thus, it is difficult to distinguish the features' influence on the closed-loop performance. That is, we cannot separate the feature effects and, hence, cannot tell whether the addition of the two engineered features T_set,D and T_set,N, the exclusion of the selected disturbance predictions, or the exclusion of the heat pump's return temperature is the driving factor for the good closed-loop performance. Nonetheless, we can deduce general recommendations on how to design the feature set and indicate which features should be excluded despite their high correlation with the signal. In this context, it was a challenge during research to estimate the closed-loop behaviour based on the open-loop results. We applied automated simulation-based testing to facilitate closed-loop evaluation. Nonetheless, it was a procedure requiring expertise. It would be interesting for future work to fully automate the process.
Moreover, the comparison of the computation times per control step shows that all approximators clearly outperform the MPC. Yet, the underlying code structure was not specifically optimized for short computation times. For example, the approximators rely on expensive Pandas operations. Here, NumPy arrays could further reduce the computational effort. However, the comparison is relatively fair because the MPC was also not adapted to be computationally highly efficient (Zanetti, Kim et al. 2022). The illustrated results for the computation times serve as a proof of concept and underline the motivation to implement AMPC without extensive tuning.

Conclusions
In this study, we successfully implemented an AMPC for a hydronic heat pump system based on the BOPTEST framework (Blum et al. 2021). The focus lies on comparing sophisticated learning-based and more traditional statistical approximators as well as on comparing tree-based models with ANNs. In addition, we investigate two different feature sets to better understand a good feature selection process for AMPC applications.
The study confirms existing findings in literature that AMPC is suitable to retain most of the teacher MPC's performance and outperform conventional RBCs (e.g. Bursill, O'Brien, and Beausoleil-Morrison 2020; Coffey 2013; Domahidi et al. 2014; Drgoňa, Helsen, and Vrabie 2020; Drgoňa et al. 2018; Klaučo et al. 2014; Le, Bourdais, and Guéguen 2014; Löhr et al. 2019; May-Ostendorp et al. 2013, 2011; Piscitelli et al. 2019; Yang et al. 2021; Žáčeková et al. 2015). However, we observe significant performance deviations when comparing sophisticated learning and traditional statistical methods. In all cases, the machine-learning-based approximators clearly outperform the traditional statistical methods (see Figure 14). A comparison among the approximators shows that the ANN and RF result in a similar behaviour. While the closed-loop performance assessment of the adapted feature set for the initial weather scenario reveals that ANNs are more suitable for predicting the supply temperature and the RF is more suitable for the valve opening prediction, the same trend cannot be observed for the alternative weather scenario (see Figure 16). In the alternative weather scenario, there is no clearly superior controller. Consequently, we cannot generally conclude that ANNs are always favourable for continuous signal prediction and the RF, i.e. a tree-based algorithm, is always favourable for discrete signal prediction. Nonetheless, we observe a tendency towards that trend, which would also support the algorithm choices in current literature (see Table 1).
In addition to these findings, the comparison of the two feature sets reveals that a good open-loop performance is a prerequisite but not a guarantee for a good closed-loop performance (compare the open-loop performances shown in Figures 7 and 12 with the closed-loop performances illustrated in Figures 9 and 14). This finding is also supported by May-Ostendorp et al. (2013), who see a closed-loop performance decline for an approximator based on generalized linear models whose open-loop performance is high. For the presented use case, the first, basic feature set represents almost all input data that the MPC received, while the second, adapted one omits features triggering autoregressive behaviour like the heat pump's return temperature, applies dimensionality reduction by excluding correlating features, and includes more informative features like the set temperature deviation (see Table 5). Even though the basic feature set yields significantly better open-loop results, it results in an unacceptable closed-loop performance. This shows that feature selection for AMPC applications should be supported by extensive closed-loop testing and should not focus only on the open-loop prediction quality.
Apart from that, we investigate the influence of different forecast horizons on the closed-loop performance of the approximators, taking the purely RF-based AMPC (AMPC 2) as an example. The results show that adding the disturbance prediction of the subsequent hour already boosts the AMPC performance (see Figure 15). These findings can also support the development of predictive RBCs. We further demonstrate the transferability of the approximators based on an unseen weather scenario (see Figure 16). Here, in contrast to the initial weather scenario, the approximators using the RF to predict the supply temperature outperform the ANN-based ones. Again, the closed-loop performance is similar to that of the MPC and the RBC is clearly outperformed, which serves as a proof of concept for transferability.
In addition, we have shown that the AMPC's computational effort is well below that of the MPC, which facilitates its application in practice (see Figure 17). Regarding the average computation time per control step, all approximators performed similarly. They are clearly faster than the MPC but slightly slower than the RBC.
To summarize, we derived the following main findings in this study:
• Machine-learning-based models like ANNs and RF both retain most of the MPC's performance and outperform traditional statistical methods like linear and logistic regression.
• Good open-loop performance is a prerequisite but not a guarantee for good closed-loop performance.
• Feature selection is critical. Engineers are encouraged to avoid the inclusion of features triggering autoregressive behaviour and to ensure that the feature importance is diversified to avoid overfitting.
• The minimum required prediction horizon should be selected carefully. The shorter the horizon, the less uncertainty lies in the predictions, which supports real-life deployment.

Outlook
This study presents a fully automatable procedure in which the controllers are replaced by machine learning and traditional statistical methods, the so-called approximators. However, the presented toolchain could be adapted in future work to refine existing RBCs rather than fully replacing them. This process is better known as rule extraction. For the presented use case, the RBC could, e.g., be extended by knowledge extracted from the optimizations, as done for heating curves in Zanetti, Alesci et al. (2022). The result would be a predictive RBC. In this way, the controllers' interpretability is expected to increase at the expense of a slightly worse performance. This process could be supported by using conventional decision trees since their structure allows for high interpretability. Another option would be to investigate how the RF can be made more comprehensible since it is already based on decision trees. The RF's comprehensibility has not been the focus of this study. Here, the field of explainable AI comes into play. In addition, information criteria like the Akaike and the Bayesian information criterion could be used to realize a trade-off between approximator accuracy and comprehensibility.
In addition, as representatives of traditional parametric statistical methods, we use linear and logistic regression. However, there are many other traditional learners that could result in a better closed-loop performance than the demonstrated one. Nonlinear regression or generalized linear models are some examples for this future assessment. Furthermore, we recommend demonstrating the controllers' transferability not only based on a new weather scenario and, hence, location, but also based on a changing use case. In this context, the process of transfer learning is of interest. Finally, this use case only focussed on perfect foresight. Yet, it might be interesting to see if the controllers react differently to faulty forecasts. Here, the insights gained by the assessment of the prediction horizon's influence on the closed-loop performance can be taken as a basis.

Figure 1. Schematics of the two-zone apartment with a living room (day zone) and a bedroom (night zone). The energy system comprises a heat pump and two valves and was first introduced by Zanetti, Kim et al. (2022).

Figure 2. Original internal gains and setpoint temperature assumptions of the original use case presented by Zanetti, Kim et al. (2022).

Figure 3. Adapted internal gains and set temperature profiles following Swiss standard SIA2024 (Seidinger and Menard 2006).

Figure 4. Parallel prediction of control signals.

Forest
combined with manual selection based on expert knowledge. The output from the data tuning stage is the tuned training and test data that serve as input for the subsequent model tuning phase. During model tuning, the following steps are carried out:
• Model selection: automated or manual model selection
• Hyperparameter tuning: Bayesian optimization or grid search
• Training, testing, and evaluation

Figure 5 .
Figure 5. Closed-loop scheme using the MPC or approximated MPC as controller for the BOPTEST simulation model.

Figure 6 .
Figure 6.Exemplary operation in February of the MPC and the RBC.The top chart shows temperature for the living room, while the bottom chart for the bedroom.The black solid lines are the setpoint temperatures, the solid coloured lines the MPC results and the dashed coloured lines the RBC solutions.

Figure 7 .
Figure7.Open-loop accuracy for the supply temperature prediction using the basic feature set.The left plot shows the R 2 value while the right one illustrates the mean absolute error to predict the supply temperature.

Figure 8 .
Figure8.Open-loop accuracy for the valve openings using the basic feature set.The left chart shows the accuracy and F1 scores for the valve in the day zone (living room), while the right chart illustrates these KPIs for the night zone (bedroom).

Figure 9 .
Figure 9. Closed-loop KPIs for the different controllers for February (top) and the room temperature for the living room (day zone) in an exemplary week in February (bottom) for the basic feature set.

Figure 10 .
Figure 10.Comparison of the predicted and the true supply temperature for the RF for the basic feature set.

Figure 11. Feature importance regarding the supply temperature prediction for the two feature sets based on the RF impurity decline. The upper plot shows the feature importance distribution of the adapted feature set, while the lower plot illustrates the distribution of the basic feature set.

Figure 12. Open-loop accuracy for the supply temperature prediction using the adapted feature set. The left plot shows the R² value, while the right one shows the mean absolute error of the supply temperature prediction.

Figure 13. Open-loop accuracy for the valve openings using the adapted feature set. The left chart shows the accuracy and F1 scores for the valve in the day zone (living room), while the right chart illustrates these KPIs for the night zone (bedroom).

Figure 15. Closed-loop performance based on different forecast horizons ranging between 0 and 6 h. The plot shows the operating costs and thermal discomfort for the adapted feature set for the configuration AMPC 2 with different forecast horizons.

Figure 16. Closed-loop performance of the approximators trained based on the adapted feature set applied to a heating period (October to April) with an unseen weather scenario.

Figure 17.Average computation time for each control step.

Table 1. Relevant studies in the context of approximate MPC in building energy systems.

Table 2. Use case specifics for the BOPTEST emulator.

Table 4. Adaptation of the dynamic electricity price compared to the profile defined in BOPTEST.
thermal zones. Figure 3 illustrates the adapted boundary conditions. In contrast to equal boundary conditions, we define a day and a night zone. The day zone is the living room, which is occupied during the day by one person. The night zone refers to the bedroom, with an occupancy of two people during the night. The living room has a set temperature of 21 °C, while the bedroom has a set temperature of 19 °C. For both zones, we assume a night setback to 17 °C. We assume the same boundary conditions for each weekday. These assumptions are inspired by the Swiss standard SIA2024

Table 5. Feature overview for the basic and adapted feature set.