Application of the ESHO-QA guidelines for determining the performance of the LCA superficial hyperthermia heating system

Abstract
Purpose: This study aimed to assess the quality of the lucite cone applicator (LCA), the standard applicator for superficial hyperthermia at the Erasmus MC Cancer Institute, using the most recent quality assurance (QA) guidelines, thereby also verifying the feasibility of these guidelines.
Materials and methods: The assessment was conducted on each of the six LCAs available for clinical treatments. The temperature distribution was evaluated using an infrared camera across different layers of a fat-muscle mimicking phantom. The maximum temperature increase, thermal effective penetration depth (TEPD), and thermal effective field size (TEFS) were used as quality metrics. The experimental results were validated through comparison with simulated results, using a canonical phantom model and a realistic phantom model segmented from CT imaging.
Results: A maximum temperature increase above 6 °C at 2 cm depth in the fat-muscle phantom was found in all experiments. Simulated temperatures were on average 1.3 °C lower than the experimental data when using the canonical phantom model. This mean difference decreased to 0.4 °C when using the realistic model. Simulated and measured TEPD showed good agreement for both in silico scenarios, while discrepancies were present for TEFS.
Conclusions: The LCAs passed all QA guideline requirements for superficial hyperthermia delivery when used individually or in an array configuration. A further characterization of parameters such as antenna efficiency and heat transfer coefficients would be beneficial for translating experimental results to simulated values. Implementing the QA guidelines was time-consuming and demanding, requiring careful preparation and correct setup of antenna elements.


Introduction
Numerous randomized clinical trials have shown that hyperthermia is a potent biological sensitizer when added to radiotherapy and/or chemotherapy [1][2][3][4][5][6][7][8]. In systematic reviews and meta-analyses, Datta et al. demonstrated therapeutic benefits for head and neck [9], breast [10] and cervical [11] cancers. This benefit was shown as an improvement in local tumor control, progression-free survival, or overall survival after adding hyperthermia to radiotherapy. Moreover, there is consistent evidence over time of correlations between treatment outcome and temperature, on the one hand, and between treatment outcome and thermal dose delivered to the target, on the other. This applies to multiple tumor pathologies [12][13][14][15][16][17][18]. Therefore, evaluating hyperthermia devices on their ability to apply controlled and conformal heating through quality assurance (QA) guidelines is essential. Hyperthermia QA guidelines ensure that the effectiveness of clinical treatments is independent of the device used for heating. In other words, QA aims to guarantee that hyperthermia heating devices can apply controlled, reproducible, and uniform high-quality treatments [19].
Quality assurance guidelines were introduced for hyperthermia at its inception [20]. More recently, the technical committee of the European Society for Hyperthermic Oncology (ESHO) has issued updated guidelines, specifically for superficial [19] and interstitial [21] hyperthermia. The new ESHO QA guidelines emphasize the ability to achieve effective tissue temperature under clinical conditions rather than energy-related parameters. Furthermore, the design of the new QA protocols better addresses the wide variation of technologies available to apply hyperthermia treatments, for example, external infrared sources, microwave antennas, radiofrequency electrodes, capacitive electrodes, or ultrasound transducer heating systems. In these new QA guidelines, a minimum level of quality performance has been defined independent of the clinically used superficial hyperthermia system. The performance of the device is evaluated in terms of the temperature increase achieved in 6 min at different depths in a layered fat-muscle mimicking phantom. In particular, the device must produce a temperature increase of at least 6 °C above the starting temperature in 6 min at 1 cm depth in a muscle-tissue equivalent phantom (which is 2 cm depth in the fat-muscle phantom) [19]. Other quality parameters were also defined. Among these are the thermal effective penetration depth (TEPD), which corresponds to the depth at which the maximum temperature increase is 50% of the maximum temperature increase at that reference depth, and the thermal effective field size (TEFS), defined as the area within the 50% of maximum temperature increase contour. It should be noted that for these two parameters no minimal requirements or values were specified for the devices to be considered adequate [19].
At the Erasmus MC Cancer Institute, the Lucite Cone Applicator (LCA) has been used since 1998 as the standard device for superficial hyperthermia for the treatment of breast cancer recurrences [22][23][24][25]. The LCA is a 434 MHz water-filled horn applicator with a larger effective field size compared to conventional waveguide applicators [26,27]. Its square aperture of 10 × 10 cm² allows an easy and flexible use of multiple antennas in an array configuration with independent temperature control for each individual antenna. When arranged in a 2 × 3 array, superficial tumor areas up to 20 × 30 cm² can be heated with a specific absorption rate (SAR) that is within the 25%-line contour of the maximum achieved SAR (25% iso-SAR contour). The clinical and technical performance of this applicator has been extensively investigated and reported in comparison with conventional waveguides [28] and in SAR simulations [29,30].
However, to date, the LCA has not been evaluated against the most recent ESHO-QA guidelines [19] for superficial hyperthermia. Therefore, this study reports, for the first time, on the verification and temperature-based evaluation of the superficial LCAs used at the Erasmus MC Cancer Institute. An extensive quantitative evaluation of the heating performance of the LCA in single and array applicator configurations was performed and compared to numerical modeling.

Phantom configuration and material characterization
A layered fat-muscle phantom (Figure 1) was produced according to the ESHO-QA guidelines [19] and used for the evaluation of the LCAs. The recipe for the fat phantom can be found in [31], while the superstuff-agar muscle phantom was produced following the recipe given in [19].
The configuration of the phantom consisted of a 1 cm thick fat-mimicking layer overlying a muscle-mimicking phantom with an overall thickness of 9 cm, subdivided into five different layers according to the indications in the guidelines [19]. The distribution of the layers is illustrated in Figure 1.
Each layer was cast with the help of custom-designed PVC frames, giving a surface area of 50 × 40 cm². Each of the muscle layers was covered with a thin mylar plastic film (thickness of 0.06 mm) [32] to prevent mold formation and water evaporation. For this study, the use of a vertical split phantom was not considered, since the horizontal plane at 2 cm depth in the fat-muscle phantom must always be measured (Figure 1). Moreover, vertical profiles can still be obtained using a horizontally split phantom by reconstructing the vertical temperature distribution from the thermal camera views of multiple horizontal planes.
Prior to the experiments, the dielectric and thermal properties of the phantoms were verified according to La Gioia et al. [33]. These properties were assessed on phantom samples with a thickness of approximately 5 cm and a diameter of 6 cm. All measurements were conducted at room temperature. Conductivity and permittivity were measured using an open-ended coaxial probe, DAK 12 (Schmid & Partner Engineering AG, Zurich, Switzerland), with a frequency range of 4 MHz to 3 GHz, connected to a two-port Rohde & Schwarz ZNC3 VNA. The probe was calibrated with an open-short-load routine, using distilled water at room temperature as the load material. The calibration accuracy was verified using a saline solution with known properties (Supplementary Materials). The maximum deviation from the mean value was below 1% at the frequency of interest, 434 MHz. Three measurements were conducted for each sample, and the final values for conductivity and permittivity were calculated by averaging them. The dielectric properties were measured between 4 MHz and 1 GHz. The thermal conductivity (k) and volumetric heat capacity (c) of the phantoms were assessed by means of a commercial thermal analyzer (TEMPOS, Meter Group, Inc., Pullman, WA, USA, accuracy: 10%) with the dual-needle sensor (SH-3). The needles were inserted entirely into the phantom samples. This measurement was based on the hot wire technique: heat was transferred from one needle to the surrounding material for 30 s. The resulting temperature increase was then recorded by the second needle for 90 s, and the thermal properties were derived by the device, as described by Silva et al. [34]. An average of five consecutive measurements performed on each sample was calculated. The density of the phantom samples was determined by dividing the mass of cubic phantom samples, weighed with a precision scale, by their measured volume. The volume of each sample was estimated by measuring all edges of the cube (height, length, and width) with a ruler.

LCA applicators and signal generation
The LCAs are the devices currently employed at Erasmus MC Cancer Institute for hyperthermia treatments of superficial tumors (Figure 2). These devices derive from conventional rectangular horn antennas in which the two diverging brass side walls parallel to the electric field are replaced with lucite walls of 0.28 mm thickness [26]. The antenna cone is filled with circulating temperature-controlled deionized water. The antennas can be used as single elements or in combination. A single antenna has an aperture of 10 × 10 cm². Up to six applicators are available and can be combined in an array to treat a 600 cm² area with independent energy control per applicator.
All six available LCAs were tested individually. Afterwards, combinations of two (2 × 1 array) and four (2 × 2 array) applicators were also evaluated. The four single LCAs that had the maximum net power to the antenna were selected for the combinations of two and four applicators.

Single applicator setup
For the measurements involving a single LCA, a generator with a maximum power of 200 W (pinkRF, Nijmegen, The Netherlands) at 434 MHz was connected through a bidirectional coaxial coupler (3022, Narda-MITEQ, Hauppauge, USA) to the LCA applicator, as shown in Figure 3. The forward and reflected power were measured using power sensors (E4412A, Keysight, Santa Rosa, USA) and a digital power meter (EMP-442A, Hewlett Packard, Palo Alto). The values for each configuration are summarized in Table 1.
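As an illustration of how forward and reflected readings relate to the net antenna power reported in Table 1, the following is a minimal sketch. It assumes that net power is simply forward minus reflected power at the antenna port and ignores coupler and cable losses; the function names and the example readings are hypothetical and are not taken from the study.

```python
def net_power_w(forward_w: float, reflected_w: float) -> float:
    """Net power to the antenna in watts (assumed: forward minus reflected)."""
    if forward_w < 0 or reflected_w < 0:
        raise ValueError("power readings must be non-negative")
    if reflected_w > forward_w:
        raise ValueError("reflected power cannot exceed forward power")
    return forward_w - reflected_w


def reflected_fraction(forward_w: float, reflected_w: float) -> float:
    """Fraction of the forward power that is reflected (0 to 1)."""
    if forward_w <= 0:
        raise ValueError("forward power must be positive")
    return reflected_w / forward_w


# Hypothetical readings, for illustration only.
example_net = net_power_w(forward_w=100.0, reflected_w=5.0)
example_fraction = reflected_fraction(forward_w=100.0, reflected_w=5.0)
```

A check of this kind makes it easy to rank applicators by net power, which is the criterion the text describes for selecting the antennas used in the array configurations.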
The LCA location was defined prior to the positioning of the device to guarantee at least a 2 cm distance from the phantom edges. Marks were drawn on the phantom surface as guidance for the antenna placement. Following this reference, the LCA was positioned on top of a 2 cm thick temperature-controlled deionized water bolus (fabricated in-house), with a 20 × 20 cm² surface area, placed on the phantom described previously. The LCA was secured by a fixation arm, and special attention was given to positioning the LCA aperture surface parallel to the phantom surface. The temperature measurements were conducted using multi-sensor fiberoptic temperature probes (FISO FOT-NS-577E, Fiso, Quebec, Canada). Additionally, an infrared (IR) thermal camera (T1020, FLIR, Wilsonville, USA), mounted perpendicularly over the phantom on a supporting structure, was used to measure the 2-dimensional (2D) temperature distribution at each exposed layer.

Array configuration setup
For the assessment of the 2 × 1 and 2 × 2 LCA arrays, a setup similar to the one used for the single applicator measurements was adopted. For the array configurations, each LCA applicator was individually fed by an independently operating (incoherent) generator with a maximum output power of 250 W (MED-LOGIX SRL, Rome, Italy). Additionally, to maintain the required 2 cm distance between the LCA edge and the water bolus edge, larger water boluses, i.e. 20 × 30 cm² for the 2 × 1 array and 45 × 35 cm² for the 2 × 2 array, were used to adequately cover the aperture areas of the multiple antennas. The forward and reflected power to each LCA were measured using digital power meters (EMP-442A, Hewlett Packard, Palo Alto).

Temperature measurements and assessment
As previously mentioned, the temperature assessment was performed by means of both fiberoptic temperature probes, with an accuracy of 0.2 °C, and an IR camera with an absolute error of ±1 °C. The fiberoptic probes were each equipped with six sensors, spaced 2 cm apart. Specifically, a single fiberoptic temperature probe was employed below each LCA, placed at 2 cm depth in the fat-muscle phantom (Figure 1). The sensor located at the tip of the probe was placed at the center of the antenna aperture. This procedure aimed to verify that the temperature increase at the central point under the applicator aperture was at least 6 °C in 6 min.
All single LCA measurements were performed on different days or, if on the same day, during early morning and late afternoon, ensuring sufficient time for the phantom to return to room temperature (22 °C). This condition was verified using the fiberoptic temperature probe before each experiment, which reported a temperature of 22 °C (±1 °C). Moreover, when two single LCA measurements were performed on the same day, different areas of the phantom were used to ensure that the same initial conditions were met for all measurements. The 2 × 1 and 2 × 2 LCA array measurements were performed on different days. The time span between the first and last measurement was two weeks, short enough to prevent any significant change or deterioration of the phantom materials [19].
The power was turned on for 6 min, during which the temperature at 2 cm depth in the fat-muscle phantom was assessed with the multi-sensor fiberoptic temperature probe. This ensured that the minimum criterion of 6 °C was fulfilled. Directly after turning off the power at 6 min, the temperature distribution at each layer was assessed with the infrared camera in a sequential approach, starting at 0 cm depth and continuing until the surface at 5.5 cm depth was reached. Namely, the LCAs and the water bolus were removed from the phantom and a thermal image of the first layer was taken. Then, the first layer was removed and a thermal image of the second layer was taken. This process was rapidly repeated until the last layer was reached. The maximum time elapsed between the acquisition of the first and the last thermal image was approximately 1 min.

Electromagnetic modeling
To perform the electromagnetic and thermal modeling reproducing the heating experiments, the software package Sim4Life (v5.2, Zurich MedTech AG, Zurich, Switzerland) was used. The electromagnetic propagation in the 3D phantom model was predicted using a finite-difference time-domain solver. Two different phantom models were used (Figure 4). In model 1, a geometrically perfect, i.e. canonical, phantom model composed of homogeneous fat and muscle material layers was used. In model 2, a computed tomography (CT) scan of the phantom was taken, which was then automatically segmented in MIM (version 7.1.6, MIM Software Inc., Cleveland, OH, USA) into different tissues, namely fat- and muscle-equivalent tissues and air. The threshold used for the segmentation as well as summary statistics for each tissue are summarized in Table 2. The volume was calculated excluding 2 cm from the edge, where the segmentation is less accurate and where neither measurements nor simulations were performed. The realistic phantom model was imported into Sim4Life, and a full 3D phantom model was generated.
Figure 3. Schematic of the measurement setup. The generator was connected through a bidirectional coupler to the LCA antenna. The forward and reflected power were measured by a power meter through power sensors. The temperature increase at 2 cm depth in the fat-muscle phantom (Figure 1) was monitored by a fiberoptic probe to ensure an increase of at least 6 °C in 6 min. Generator and temperature measurement system were controlled by the user through suitable software installed on an external PC.
The electromagnetic tissue parameters at 434 MHz are given in Table 3. Different total numbers of grid cells were tried to obtain a grid-independent simulation solution. We found that a nonuniform grid (14 million grid cells), in which the maximum and minimum grid steps were 5 mm and 0.5 mm, respectively, yielded grid-independent simulation results (Supplementary Materials, Figure S1). The simulations took between 30 min and 2 h, and the resulting electromagnetic energy loss, for each of the phantom setups, was used as input for thermal modeling. To equalize the input power in the modeling with that of each measurement, the thermal simulations were scaled according to the net power to each antenna during the performed measurements (Table 1). The 3D temperature distribution was calculated using the heat equation:
$$\rho c \frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T) + \rho S$$
where ρ is the mass density (kg/m³), c (J/(kg·K)) is the specific heat capacity, T (K) is the temperature, t (min) is the time, k (W/(m·K)) is the thermal conductivity, and S (W/kg) is the SAR, which served as a source for the thermal simulations. Energy losses were modeled using a mix of Dirichlet and Neumann boundary conditions. These mixed boundary conditions were applied using the following heat transfer coefficients (h) and outside temperatures (T): phantom-background (h = 4 W/m²/K; T = 22 °C) and phantom-water bolus (h = 152, 107, and 91 W/m²/K; T = 25 °C). The heat transfer coefficient values for the water bolus correspond to the different configurations (single, 2 × 1 and 2 × 2 LCAs) and were taken from van der Gaag [35] for each of the different setups. The water bolus temperature was kept at 25 °C for all measurements, and this value was used in all simulations.
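For reference, a boundary specified by a heat transfer coefficient and an outside temperature is commonly written in convective form. The expression below is a sketch that assumes this standard form was used; the symbols n (outward surface normal) and T_ext (outside temperature) are introduced here for illustration and do not appear in the original text.

$$-k \frac{\partial T}{\partial n} = h \left( T - T_{\mathrm{ext}} \right)$$

Under this assumption, the phantom-background boundary would use h = 4 W/m²/K with T_ext = 22 °C, and the phantom-water bolus boundary would use the configuration-specific h value with T_ext = 25 °C.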

Data analysis
To characterize the heating properties and obtain the temperature increase of the different experimental LCA configurations, both the IR thermal images and the thermal model data were analyzed and compared. The maximum temperature increase at each depth was obtained from the experimentally measured data. The data were fitted with an exponential decay function in order to obtain the depth at which the maximum temperature increase is 50% of the maximum temperature increase at 2 cm depth in the fat-muscle phantom, that is, the TEPD. The exact same fitting procedure was applied to the simulated data (the two phantom models) to obtain the simulated TEPD. The curves obtained directly from the solver can be found in the Supplementary Materials (Figure S2).
The fitting function was applied only to the points that were measured in the muscle layers, because different tissues are expected to behave differently and would therefore need different fitting functions. Since a fitting function is only needed for the TEPD calculation, i.e. beyond 1 cm of depth in the fat-muscle phantom, this restriction is acceptable. The impact of the phantom model was also assessed by comparing the differences between the maximum temperature rise at each depth for the canonical phantom model and the realistic phantom model. To evaluate the temperature pattern, specifically at 2 cm depth in the fat-muscle phantom, the TEFS, that is, the area within the 50% of maximum temperature increase contour in the muscle, was calculated. This was computed for both experimental and simulated data.
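The TEFS definition above can be illustrated with a short sketch that counts the pixels of a 2D temperature-rise map at or above half of the map maximum. This is a simplified illustration, not the authors' processing pipeline: the function name, the pixel size, and the synthetic Gaussian map are assumptions, and the sketch counts all pixels above the threshold rather than tracing a single connected contour.

```python
import numpy as np


def tefs_cm2(delta_t_map: np.ndarray, pixel_area_cm2: float) -> float:
    """Area (cm^2) where the temperature rise is >= 50% of the map maximum.

    delta_t_map: 2D array of temperature increase (degC) in one plane.
    pixel_area_cm2: physical area covered by one pixel (assumed known
    from the camera calibration).
    """
    threshold = 0.5 * np.nanmax(delta_t_map)
    n_pixels = np.count_nonzero(delta_t_map >= threshold)
    return float(n_pixels * pixel_area_cm2)


# Synthetic example (hypothetical data): a Gaussian hot spot on a
# 20 cm x 20 cm plane sampled every 0.1 cm.
x = np.arange(-10.0, 10.0, 0.1)
xx, yy = np.meshgrid(x, x)
synthetic_map = 8.0 * np.exp(-(xx**2 + yy**2) / (2 * 3.0**2))
synthetic_tefs = tefs_cm2(synthetic_map, pixel_area_cm2=0.1 * 0.1)
```

For array configurations, where several separate hot spots may exceed the threshold, a connected-region analysis could give a different area than this simple pixel count.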

Phantom characterization
The measured electrical and thermal properties of the phantom are shown in Table 4. The dielectric properties were within the range reported in the ESHO quality assurance guidelines, with a small variation (around 7%) for the muscle phantom conductivity, while a 10% variation from the reference values was found for the thermal properties. For the sake of comparison, Table 4 also shows the dielectric and thermal properties of muscle and fat tissue given by the IT'IS database [36].

Temperature measurements
The maximum temperature increase measured from the IR image data for all configurations, i.e. single, 2 × 1 array and 2 × 2 array, is summarized in Table 5. The median and range are presented for the single LCA configuration, since six different antennas were available for the experiment. The temperature rise appears similar for the multi-element arrays, i.e. for both the 2 × 1 and the 2 × 2 configurations. The highest temperature increase was achieved for the 2 × 1 array. For single antennas, the maximum temperature increase is achieved at the center of the antenna aperture. When an antenna array is used, the maximum temperature increase occurs under the aperture of the most efficient of the antennas used. The corresponding temperature depth profiles for each antenna setup are shown in Figure 5. The experimental data were fitted with an exponential decay function of the form ΔT = a·exp(−b·x), where ΔT is the maximum temperature increase in °C, x is the depth in cm, and a and b are fitting parameters. Note that the first data point, corresponding to 0 cm depth, was not included in the fitting, since this point was measured in a different tissue (fat, compared to the remaining data points measured in muscle). An initial build-up region can be observed until 1 cm depth, where the maximum temperature is achieved, followed by an exponential decay. As expected, all the fit curves follow this tendency, with particularly good agreement between the curves obtained for the antennas used in an array configuration.
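The exponential fit and the resulting TEPD can be sketched as follows. This is a minimal illustration under stated assumptions: the fit is done by linear regression on the logarithm of the temperature rise (the fitting routine actually used is not specified in the text), the TEPD is taken as the depth where the fitted curve falls to half of its value at the 2 cm reference depth, and the depth and temperature values are hypothetical. Whether the reported TEPD is expressed relative to the phantom surface or to another plane is also not stated here, so the returned depth simply uses the same coordinate as the input depths.

```python
import numpy as np


def fit_exponential_decay(depth_cm, delta_t):
    """Fit delta_t = a * exp(-b * depth) via a straight-line fit to log(delta_t).

    Returns the fitting parameters (a, b). All delta_t values must be positive.
    """
    depth = np.asarray(depth_cm, dtype=float)
    log_dt = np.log(np.asarray(delta_t, dtype=float))
    slope, intercept = np.polyfit(depth, log_dt, 1)
    return float(np.exp(intercept)), float(-slope)


def tepd_cm(b: float, reference_depth_cm: float = 2.0) -> float:
    """Depth at which the fitted rise is 50% of its value at the reference depth.

    Solving a*exp(-b*x) = 0.5*a*exp(-b*x_ref) gives x = x_ref + ln(2)/b.
    """
    return reference_depth_cm + float(np.log(2.0)) / b


# Hypothetical muscle-layer data (the 0 cm fat-layer point is excluded).
depths = [1.0, 2.0, 3.0, 4.0, 5.5]
rises = [10.0 * np.exp(-0.5 * d) for d in depths]
a_fit, b_fit = fit_exponential_decay(depths, rises)
tepd = tepd_cm(b_fit)
```

A log-linear fit weights small temperature rises more heavily than a direct nonlinear least-squares fit would, so the two approaches can give slightly different parameters on noisy data.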

Comparison between simulation and measurement data
To guarantee a fair comparison between simulated and measured data, the impact of the time interval between power off and the capture of the last thermal image was assessed. For this, the temperature decay was simulated for a duration of 1 min after the power was turned off. Table 6 presents the absolute difference in °C between the temperature at power off and the temperature 10, 20, 30, 40, 50 and 60 s after power off (ΔT_10, ΔT_20, ΔT_30, ΔT_40, ΔT_50, ΔT_60). All these results were obtained for the canonical phantom model. For all configurations, the highest temperature difference was obtained at 1 cm depth in the fat-muscle phantom. Since this was the second phantom layer to be captured with the thermal camera, no more than 20 s elapsed after power off. Therefore, differences up to 0.3 °C could have occurred. However, the uncertainty of the thermal camera is 1 °C, and all the values obtained, even after 60 s, are below this uncertainty. Therefore, no back extrapolation to compensate for temperature decay was applied.

Thermal effective penetration depth (TEPD)
The depth temperature profiles of the experimental data were compared against those obtained from modeling (both canonical and realistic phantoms). As with the experimental data, the simulated data were fitted with the same exponential decay function described above. These comparisons are depicted in Figure 6 for the single LCA, the array of 2 × 1 LCAs and the array of 2 × 2 LCAs. The fitted curves suggest that the temperature increase has not decayed to 0 °C at depths close to 5 cm.
Table 4. Phantom properties measured and used in the electromagnetic and thermal modeling at 434 MHz, with their uncertainty. The range of values for the dielectric properties indicated in the ESHO QA guidelines for the phantoms are reported in a separate row. Specific values are given for the thermal properties. Thermal and dielectric properties of muscle and fat (I: infiltrated, N.I.: not infiltrated) tissue taken from the IT'IS database are also reported, together with their uncertainty.
Table 5. Maximum temperature increase after 6 min of heating obtained in each layer of the phantom for all three measured configurations: single LCA, 2 × 1 LCAs and 2 × 2 LCAs. The median and range are presented for the single LCA measurements, since six different antennas were used. All measurements were performed with the power settings presented in Table 1.
The experimental data points were fitted with an exponential function, excluding the temperature point at 0 cm depth, that is, the fat layer. The first and fourth data points for the 2 × 1 and 2 × 2 arrays are overlapping. All measurements were performed with the power settings presented in Table 1.
With the fitted curves, the experimental and simulated TEPD for the single LCA, 2 × 1 and 2 × 2 array configurations were computed. These data are summarized in Table 7. Interestingly, the impact of the segmented phantom model, i.e. the realistic phantom, does not seem to be equal for all array configurations. The more accurate TEPD is obtained with the realistic phantom for the 2 × 1 and 2 × 2 LCA arrays, while for the single configuration, the canonical phantom model gives the closest prediction. Nevertheless, all the predictions from the canonical phantom model differ from the experimental values by less than 12%. The difference between simulated and experimental data was also evaluated for each depth point. The results are shown in Figure 7. Higher temperatures were obtained experimentally, with a total mean difference between simulated and experimental data of −1.27 °C for the canonical model and −0.37 °C for the realistic model. Data obtained from the realistic model showed differences within ±1 °C for all depths, except for the 5.5 cm interface. This indicates good agreement between the simulated realistic phantom data and the experimental data. Conversely, data obtained from the canonical model showed discrepancies at the first data point, that is, the bolus-to-fat interface (0 cm depth), the second data point (1 cm depth) and the last data point (5.5 cm depth).
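The sign convention behind these comparisons can be made explicit with a small sketch: differences are taken as simulated minus experimental, so a negative mean indicates that the simulation underestimates the measured temperature rise. The function names and the example values below are hypothetical and serve only to illustrate the arithmetic; they are not the study's data.

```python
import numpy as np


def mean_difference(simulated, measured) -> float:
    """Mean of (simulated - measured) over all depth points, in degC."""
    sim = np.asarray(simulated, dtype=float)
    meas = np.asarray(measured, dtype=float)
    if sim.shape != meas.shape:
        raise ValueError("simulated and measured must have the same shape")
    return float(np.mean(sim - meas))


def percent_difference(simulated: float, measured: float) -> float:
    """Relative difference of a simulated value from the measured one, in %."""
    return 100.0 * (simulated - measured) / measured


# Hypothetical values, for illustration only.
example_mean = mean_difference([5.0, 4.0], [6.0, 4.5])
example_percent = percent_difference(simulated=2.7, measured=3.0)
```

Reporting both quantities is useful because the mean difference summarizes the per-depth temperature offset, whereas the percentage difference suits single-valued metrics such as the TEPD or TEFS.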

Thermal effective field size (TEFS)
Regarding the TEFS, the isoline of the half-maximum temperature increase was obtained at 2 cm depth in the fat-muscle phantom (Figure 1). The 2D temperature distributions for one of the single LCA setups, the 2 × 1 LCA array and the 2 × 2 LCA array are shown in Figure 8. For the single LCA setup, the shape of the TEFS given by the simulated data from the realistic phantom model seems to be the most similar to the experimental one, although the hotspot in the experimental data seems to be less intense. For the 2 × 1 array, all the shapes seem qualitatively similar, although the realistic phantom model appears to have a more similar shape. Finally, for the 2 × 2 array, there is a clear distinction between the shapes of the simulated and experimental data, the latter being more spread out and thus having a greater TEFS. For this array configuration, the canonical and realistic models give a very similar TEFS in terms of shape.
The calculated TEFS from both the experimental and the simulated data are shown in Table 8. For single antenna configurations, the median value and the corresponding range are reported. No significant variability was observed when the canonical phantom was used. Larger TEFS values are obtained for the experimental data, and both numerical models show a smaller TEFS. The modeling values that show the best prediction, i.e. closest to the experimental data, were found for the 2 × 1 array configuration. For the single LCA and the 2 × 2 LCA array, the two phantom models seem to give a similar TEFS value, with only small differences between them.

Discussion
This study is the first to report the results of an evaluation based on the ESHO-QA guidelines for a superficial hyperthermia system, specifically the LCA antennas used at the Erasmus MC Cancer Institute. This evaluation is novel, since most studies have evaluated electromagnetic parameters, such as the SAR. Here, we report the results of temperature-based QA measurements for single LCA and multi-array LCA configurations, specifically evaluating two parameters, the TEPD and the TEFS [19]. In addition, the experimental temperature data were compared with simulated data, investigating two in silico phantom models that differ in complexity.

Phantom characterization
When evaluating hyperthermia heating systems, the initial step was to build phantoms representing human tissues. To ensure the reliability of the results and a proper numerical modeling of the experimental setup, it was important to assess the properties of the tissue-mimicking phantoms. These properties must be comparable with the reference values, within an acceptable range of variability of ±10%. The measured properties of the fat and muscle phantoms produced in this study were then compared with the properties given in the IT'IS database [36]; both are reported in Table 4. The deviation of the phantom properties from those of the tissues was within the ±10% interval for the muscle phantom. The dielectric and thermal properties of the fat phantom were representative of the average fatty tissue properties. The fat-muscle layered phantom provided a basic shape for the validation of a hyperthermia applicator model, which was the primary aim of this study.
The general use of preserving agents is, due to their toxicity, not allowed in modern phantom recipes. Therefore, we did not use any in this study, which leads to rapid deterioration and molding. The development of new recipes, ideally for so-called dry phantoms, is encouraged to allow longer-lasting phantoms that can be used repeatedly. Concerning the fat phantom, the recipes proposed in the guidelines have complicated preparation procedures or have a short life due to evaporation and molding. For these reasons, we adopted the recipe of De Lazzari et al. [31], which overcomes these two critical points. This fat phantom recipe does not require the addition of water, preventing bacterial proliferation and therefore guaranteeing a long-lasting material. When using long-lasting phantoms, however, both dielectric and thermal properties should always be measured before each experiment, to detect any change over time.
Table 7. Thermal effective penetration depth (TEPD) computed for both experimental and simulated data. The median and range are presented for the single LCA measurements, since six different antennas were used. The percentage differences between simulated and experimental TEPD are also reported in separate lines.

The importance of (accurate) modeling
The temperature distribution within the layered fat-muscle phantom was evaluated on six different horizontal surfaces. At 2 cm depth in the fat-muscle phantom (1 cm depth in the muscle phantom), the maximum temperature increase was found to be more than 6 °C after heating for 6 min. This meets the prescribed minimum requirement for superficial hyperthermia systems, and the requirement was met in all three experimental configurations. At the remaining depths, the maximum temperature increase was studied to assess the temperature behavior at different depths. As expected, after the initial build-up, a temperature decay was found with increasing depth. At the last measured depth point (5.5 cm), the temperature was higher than expected, since with a total muscle phantom thickness of 9 cm the energy should have decayed sufficiently at such a depth. The simulated values showed a temperature lower than that obtained from the experiments for all the configurations tested (single LCA, 2 × 1 and 2 × 2 LCA arrays). Possible explanations include the inhomogeneity of the muscle phantom in certain areas and the presence of air pockets. The realistic phantom model attempted to account for these, but the segmentation from the CT scan might not have been detailed enough, possibly resulting in inaccuracies. Moreover, a precise match between the simulated and the experimental antenna position was also difficult to achieve. The exact tilting of the antenna and the thickness of the water bolus under the antenna are difficult to control and to mimic in the simulation. These factors made it harder to obtain the exact same result for experimental and simulated data.
Concerning the TEPD, a good agreement between experimental and simulated data was found. The fitting functions were a good approximation to characterize the temperature decay, leading to similar TEPD values for simulated and experimental data in all configurations. All the obtained TEPD values ranged between 2.8 cm and 3.2 cm, which indicates that optimal heating with these antennas occurs up to 3 cm in depth. For the TEPD values, the differences between the canonical and realistic phantom models were not significant, namely 1%. When assessing the differences between the experimental and simulated temperature increase at each depth, an improved match was apparent for the data obtained with the realistic phantom model. These findings suggest that the realistic phantom was a better approximation to the performed experiments, especially for the more complex configurations (2 × 1 and 2 × 2 arrays). Worth mentioning is the difference between the two simulations for the first layer, that is, the fat phantom layer (bolus-fat interface). With the realistic phantom model, the standard deviation of the difference from the experimental temperature was approximately 0.5 °C, while for the canonical model it was 1.4 °C. This might be explained by the irregularities of the fat layer, which are not modeled by a geometrically perfect, flat canonical surface.
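The exponential fit of the depth profile and the resulting TEPD can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from this study: the TEPD is taken as the depth at which the fitted temperature rise drops to 50% of its value at a reference depth of 1 cm, the fit is a simple log-linear regression (the fitting routine actually used is not specified here), and the data are synthetic. The function names `fit_exponential_decay` and `tepd_from_fit` are hypothetical. As in the measured profiles, the point at 0 cm depth (fat layer) is excluded from the fit.

```python
import numpy as np

def fit_exponential_decay(depth_cm, delta_t):
    """Fit delta_t = a * exp(-depth / d) by linear regression on log(delta_t).

    Returns the amplitude a (degC) and the decay length d (cm).
    """
    depth_cm = np.asarray(depth_cm, dtype=float)
    delta_t = np.asarray(delta_t, dtype=float)
    slope, intercept = np.polyfit(depth_cm, np.log(delta_t), 1)
    return np.exp(intercept), -1.0 / slope

def tepd_from_fit(decay_length_cm, ref_depth_cm=1.0, fraction=0.5):
    """Depth at which the fitted rise equals `fraction` of the rise at ref_depth_cm.

    ASSUMPTION: 50% of the rise at 1 cm depth is used as the TEPD criterion.
    """
    return ref_depth_cm - decay_length_cm * np.log(fraction)

# Synthetic depth profile (NOT measured data): exact exponential below the fat layer.
depth_cm = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.5])
delta_t = 10.0 * np.exp(-depth_cm / 2.0)
delta_t[0] = 3.0  # arbitrary value for the fat layer, excluded from the fit

mask = depth_cm > 0.0  # exclude the 0 cm point, as done for the fitted profiles
amplitude, decay_length = fit_exponential_decay(depth_cm[mask], delta_t[mask])
tepd = tepd_from_fit(decay_length)

print(f"decay length: {decay_length:.2f} cm, TEPD: {tepd:.2f} cm")
```

With real data, a nonlinear least-squares fit may be preferable to the log-linear regression, since taking the logarithm gives more weight to the small temperature rises at larger depths.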
When evaluating the TEFS values, the match is not as good as for the TEPD values. For both the single and 2 × 1 LCA configurations, the realistic model TEFS is more accurate than that of the canonical model. For the 2 × 2 LCA configuration, the TEFS difference between the measured value and the simulated value for each phantom model is around 30%. However, the simulated values for both phantom models were consistent. This could be explained by different factors. Firstly, the more antennas are used, the more complex the experimental setup is and hence the more challenging it becomes to mimic the setup. For instance, the thickness of the water bolus under each antenna and at the extremities might not have been homogeneous. Another example is the position of the antenna, which might not always have been exactly parallel to the phantom because of slight tilting. All these factors could have contributed to an uncertainty that increases in multiple-antenna configurations. Therefore, the model with the realistic phantom, although it mimics the real phantom better, does not take into account the above-mentioned uncertainties, i.e. it is not sufficient to mimic the applicator setup and position. Moreover, the efficiency of the applicators is expected to be different, and this parameter has not been assessed. This should be kept in mind in future studies. Applicator efficiency could be calculated through SAR measurements with a calibrated E-field probe, in a liquid of known dielectric properties and at an input power level where thermal effects can be neglected [29]. Furthermore, parameters such as the real radiation characteristics were not modeled, which also limits the simulations. The reliability and resolution of the thermal images should also be considered as a source of possible errors. It is important to emphasize that the current version of the guidelines for superficial HT was solely based on simulation results. However, the findings presented in this study highlight the significance of experimental verification as a means of comparison to the simulation outcomes. There might be noticeable disparities between the simulated values and the actual experimental data, particularly when assessing the TEFS. As mentioned earlier, several factors can significantly impact the result, such as applicator misplacements, bolus thickness, heat dispersion, and data recording uncertainty. These cannot be adequately accounted for in the simulation environment. Therefore, relying exclusively on simulations is inherently limiting and calls for appropriate experimental verification.
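The TEFS comparison discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the TEFS is taken as the area in the measurement plane where the temperature rise is at least 50% of the maximum rise in that plane, the percentage difference is expressed relative to the experimental value, and the temperature maps and pixel size are synthetic. The function names `tefs_cm2` and `percent_difference` are hypothetical.

```python
import numpy as np

def tefs_cm2(delta_t_map, pixel_size_cm, fraction=0.5):
    """Area (cm^2) of all pixels with a rise >= fraction * maximum rise in the map.

    ASSUMPTION: the 50% iso-temperature-rise area is used as the TEFS criterion.
    """
    delta_t_map = np.asarray(delta_t_map, dtype=float)
    threshold = fraction * delta_t_map.max()
    n_pixels = np.count_nonzero(delta_t_map >= threshold)
    return n_pixels * pixel_size_cm ** 2

def percent_difference(simulated, experimental):
    """Signed difference in percent, relative to the experimental value (assumed)."""
    return 100.0 * (simulated - experimental) / experimental

# Synthetic temperature-rise maps (NOT measured data), 0.5 cm pixels.
pixel_size_cm = 0.5

measured_map = np.zeros((10, 10))
measured_map[2:5, 3:7] = 8.0   # heated region, 12 pixels
measured_map[8, 8] = 3.0       # warm spot below the 50% threshold, not counted

simulated_map = np.zeros((10, 10))
simulated_map[2:5, 3:6] = 7.0  # smaller heated region, 9 pixels

tefs_measured = tefs_cm2(measured_map, pixel_size_cm)
tefs_simulated = tefs_cm2(simulated_map, pixel_size_cm)
tefs_diff = percent_difference(tefs_simulated, tefs_measured)

print(f"TEFS measured: {tefs_measured:.2f} cm^2, simulated: {tefs_simulated:.2f} cm^2")
print(f"difference: {tefs_diff:.1f} %")
```

Because the threshold is relative to each map's own maximum, a lower simulated peak temperature does not by itself change the TEFS; only the shape of the distribution does. Pixel counting also includes disconnected regions above the threshold, which a contour-based area would treat separately.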
An interesting approach to improve modeling accuracy and its comparison with experimental data and thermal images is presented by Drizdal et al. [37]. Their photogrammetry reconstruction technique improves the reconstruction of the measurement setup and thus the agreement between measured and simulated absolute SAR. Despite its time demands, this technique could be a promising approach to improve the matching of the 2D temperature distributions.

Experience on the application of ESHO-QA guidelines
In the framework of this experimental study, the implementation of the QA guidelines proved to be challenging to some extent, especially when one aims to implement them in clinical routine. Correctly prepared phantoms should not contain air bubbles and the layers should be even and uniform, which makes the preparation a demanding procedure.
In addition, the measurement of the thermal and electrical parameters of the phantom material requires specific high-end equipment, which may not be available in every hyperthermia institution. This limits the multi-institution reproducibility of the procedure because the phantom properties might vary. Moreover, as stressed above, the sole measurement of temperature by means of a thermal camera is, in certain cases, insufficient to account for the numerous factors at play, including the temperature of the bolus and the exact positioning of the applicator. Another noteworthy aspect is the time required to perform the procedure comprehensively. In the present case, eight different measurements were performed. Some repetitions were necessary due to possible execution errors, such as mispositioning of the applicator, leading to excessive reflected power. Changing the water in the bolus was also necessary due to high conductivity. Given the limited availability of the equipment, as expected in a clinical context, and the need to wait for about 12 h between one measurement and the next, a total time of two weeks was devoted to the characterization of the superficial applicators. To this, the five days needed for the phantom preparation must be added. Within this time window, repeated measurements for each specific antenna and array configuration were not performed, which may also, to some extent, impact the presented results. Reproducibility, for instance, could not be assessed since no repetitions were performed. Nevertheless, it is rather clear that in clinical practice there will be larger variations in positioning than in phantoms, and thus the impact of mispositioning, movement and breathing of patients on the quality of the heating treatment will be larger than that measured in phantoms. In other words, the patient is continuously moving, and breathing (movement of the chest wall) will cause a constant variation in the contact between skin and applicator surface, which will cause larger uncertainties than those from experimental measurements. However, it is highly advisable to perform (single) phantom measurements since these teach users the impact of reproducible positioning on the quality of the induced SAR and temperature distributions.
Additional measurements, for instance to assess efficiency or the heat transfer coefficient, and the application of thermographic sensors should be considered, since these are expected to improve the comparison between the simulation and the experimental data. This could, however, increase the experimental time, which is sometimes not feasible in a clinical setting, where continuation of clinical treatment has higher priority. All of this represents a downside of the current guidelines, which could be addressed by optimizing the procedure, namely by using more advanced phantoms, for instance of solid material, or by excluding multi-array sets of measurements.
It is worth noting that the Erasmus MC Cancer Institute is an exceptional case, as it uses multiple in-house built applicator arrays as part of its standard clinical practice. Most currently commercially available superficial hyperthermia devices use single, compact applicators, which means that the evaluation process could potentially be completed more quickly. However, this does not diminish the importance of addressing the other constraints mentioned earlier, such as the need for thorough preparation and multiple measurements.

Conclusions
This study showed that it is feasible to apply the QA guidelines for the characterization of superficial hyperthermia heating systems. The results obtained for the LCAs were in accordance with what is expected in terms of performance of superficial systems and are satisfactory based on the procedures performed. However, the practical implementation of the QA guidelines turned out to be time-consuming, as well as demanding. The phantom production was challenging, and a perfect phantom was not obtained, due to the presence of air bubbles and inhomogeneities. Other practical aspects of the experiments were also challenging, for example ensuring a correct and reproducible positioning of the antennas.
The experimental results were compared with simulated results, obtained with a canonical phantom model and a realistic phantom model segmented from CT imaging. The second model included the air and inhomogeneities of the produced phantom. Even for this realistic phantom model, differences in the TEFS were observed. These indicate that a more accurate model might be needed to better mimic the whole experimental setup, including for example accurate positioning of the antenna. Therefore, the translation of the simulated values to experimental results requires a more extensive parameter characterization and accurate modeling of all details of the experimental setup.

Figure 1. Schematic illustration of the fat-muscle layered phantom. The first layer (light blue) represents the fat phantom material and the following layers represent muscle phantom material. Thermometry probes were positioned at 2 cm depth in the fat-muscle phantom (1 cm depth in the muscle phantom).

Figure 4. Schematics of the phantom used in the modeling. In (a) the geometrically perfect, canonical phantom model is illustrated and in (b) the realistic model based on the CT-segmented phantom.

Figure 5. Measured temperature increase depth profile for the three different antenna setups: single, 2 × 1 and 2 × 2 LCAs in blue, red, and yellow, respectively. The experimental data points were fitted with an exponential function, excluding the temperature point at 0 cm depth, that is, the fat layer. The first and fourth data points for the 2 × 1 and 2 × 2 configurations overlap. All measurements were performed with the power settings presented in Table 1.

Figure 6. Temperature depth profile and fitting curves for experimental and both simulation data, for the three different configurations: (a) single LCA, (b) 2 × 1 LCA array and (c) 2 × 2 LCA array. At 0 cm depth, the experimental data points (orange circles) are hidden behind the realistic phantom data points (dark blue diamonds).

Figure 7. Difference between simulated data and measured data. Both the canonical phantom (green) and the realistic phantom (purple) data are shown.

Figure 8. Experimental (left column) and simulated (center column: canonical model; right column: realistic model) 2D visualization of the temperature increase after 6 min heating on the horizontal plane located at 2 cm depth, for an increasing number of LCA antennas. Top row: single LCA; middle row: 2 × 1 array; bottom row: 2 × 2 array. The footprints of the antennas are also shown as white dashed lines.

Table 1. Measured forward and reflected powers, and calculated net power for each of the antenna configurations. For single applicators, the median and range values are given, since six different antennas were used.

Table 2. Parameters and statistical values for the tissues segmented to generate the realistic phantom model (HU: Hounsfield units).

Table 3. Electromagnetic and thermal properties used for the modeling of the LCA antenna at 434 MHz.

Table 6. Temperature decay at 10, 20, 30, 40, 50 and 60 s after power off at different depths for all the antenna configurations.

Table 8. Thermal effective field size (TEFS) computed for both experimental and simulated data. The median and range are presented for the single LCA since six different antennas were used. The percentage differences between experimental and simulated TEFS are also reported in separate lines.