Contrast-enhanced ultrasound is a reliable and reproducible assessment of necrotic ablated volume after radiofrequency ablation for benign thyroid nodules: a retrospective study

Abstract Purpose To investigate the intra- and inter-observer reliability and agreement of contrast-enhanced ultrasound (CEUS) in measuring ablated volume (Va) after radiofrequency ablation (RFA) for benign thyroid nodules. Materials This retrospective study evaluated 65 patients with 74 benign thyroid nodules who underwent RFA. Patients were followed up at 1, 3, 6, and 12 months and every 12 months thereafter. Two independent observers measured Va using CEUS during the same follow-up visit. Intra- and inter-observer reliability was assessed using the intraclass correlation coefficient (ICC) with 95% confidence intervals. Bland–Altman analysis was used to evaluate inter-observer agreement, which was expressed as the mean difference with 95% limits of agreement (LOA). Results Over a mean follow-up time of 41.17 ± 16.80 months, no significant difference was found between the Va measurements of the two observers (all p > 0.05). Intra- and inter-observer reliability were both excellent (ICC > 0.90) at each follow-up period. The 95% LOA became wider over the follow-up period. The narrowest 95% LOA was found at 1 month (0.8117 to 1.122), and the widest at 36 months (0.5694 to 1.343). Conclusions CEUS could provide a reliable and reproducible assessment of Va after RFA for benign thyroid nodules. In clinical post-ablation follow-up, the irregular morphology of the ablated area and the variation between observers did not affect the assessment of Va by CEUS.

As a minimally invasive treatment, ablation was primarily intended to resolve cosmetic problems and nodule-related symptoms rather than to achieve complete treatment [19]. After ablation, the total volume (Vt) of the nodule was divided into ablated volume (Va) and vital volume (Vv) [20,21]. Recently, some novel parameters that use Va in their calculation have emerged to evaluate the efficacy of ablation. Vv increase, which was defined as an increase of more than 50% compared to the previously reported smallest Vv, was found to be not only an early sign of nodule regrowth [20,22] but also an indication for additional ablation [23]. Moreover, a quantitative index, the initial ablation ratio (IAR), calculated as the ratio of Va to Vt at the first follow-up period, was developed to predict therapeutic success after RFA [24,25]. All these parameters required Va for calculation, which was measured on conventional ultrasound (US) based on a decreased hypoechoic zone without vascularity in the treated nodule [20,24,25]. However, the boundary between the ablated and vital zones was not easily differentiated on conventional US, making Va measurement potentially inaccurate [26,27].
Contrast-enhanced ultrasound (CEUS) was a contrast harmonic imaging technique that allowed the detection and characterization of focal lesions by assessing micro-vascularization with a second-generation ultrasound contrast agent (UCA) [28-31]. Compared with conventional US, CEUS was a superior method for detecting and defining the ablated necrotic zone induced by thermal ablation [2,32]. CEUS also had higher reproducibility and agreement than conventional US in the measurement of Va [33,34]. Recently, the clinical application of CEUS for ablated nodules during follow-up has been recommended in two guidelines [2,5]. However, it was unknown whether the irregular morphology of the ablated area could prevent accurate evaluation of Va. To our knowledge, no study has investigated the measurement variability of Va using CEUS during the follow-up period of RFA for benign thyroid nodules.
Therefore, the purpose of this study was to investigate the intra- and inter-observer reliability and agreement of Va measurement by CEUS during the follow-up period of RFA.

Materials and methods
This study was approved by the Institutional Review Board of Chinese PLA General Hospital (approval number: S2019-211-01). Written informed consent was obtained from all the patients prior to RFA and CEUS.

Patients
All the enrolled patients fulfilled these inclusion criteria: (1) confirmation of benign nodule status on two separate fine-needle aspiration (FNA) or core-needle biopsy (CNB) procedures; (2) no suspicious malignant features on US examination; (3) solid (≤10% fluid component) or predominantly solid (11-50% fluid component) nodules [35]; (4) reported cosmetic and/or symptomatic problems, or concern about rapid nodule growth or malignant transformation; (5) serum thyroid hormone and thyrotropin levels within normal ranges; (6) follow-up time ≥ 24 months; (7) underwent CEUS at each follow-up period. Exclusion criteria were: (1) malignant findings or follicular neoplasm on FNA or CNB; (2) nodules with a benign result on FNA or CNB but features suspicious for malignancy on US; (3) follow-up time < 24 months; (4) refusal of CEUS during follow-up.
From August 2014 to March 2018, 435 patients with benign solid/predominantly solid thyroid nodules underwent RFA in this institution. Among them, patients who refused CEUS during the follow-up (N = 219) or whose follow-up time was less than 24 months (N = 151) were excluded. Finally, 65 patients with 74 benign thyroid nodules were enrolled in this study. The flowchart of patient enrollment is shown in Figure 1.

Pre-ablation assessment
Conventional US and CEUS before and after RFA, as well as during follow-up, were performed using a Siemens Acuson Sequoia 512 Ultrasound System (Siemens, Mountain View, CA, USA) with a 15L8W linear array transducer, a Philips iU22 Ultrasound System (Philips Healthcare, Bothell, WA) with an L12-5 linear array transducer, or a Mindray M9 Ultrasound System (Mindray, Shenzhen, China) with an L12-4 linear array transducer. RFA was always performed using a Siemens Acuson Sequoia 512 Ultrasound System with a 6L3 linear array transducer.
Sulfur hexafluoride (SonoVue®, Bracco International, Milan, Italy) was used as the ultrasound contrast agent. CEUS was performed after bolus injection of SonoVue (2.4 ml), followed by a 5 ml normal saline flush.
Before treatment, the volume of thyroid nodules was calculated by the ellipsoid formula: V = πabc/6 (V is the volume, a is the largest diameter, and b and c are the other two perpendicular diameters). The symptom score was self-measured by patients using a 10-cm visual analogue scale (grade 0-10) [1]. The cosmetic score was assessed by a physician (1, no palpable mass; 2, no cosmetic problem but palpable mass; 3, a cosmetic problem on swallowing only; and 4, a readily detected cosmetic problem) [1].
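The ellipsoid formula above can be illustrated with a minimal sketch. The function name and the example diameters are hypothetical, and the sketch assumes diameters given in centimetres so that the result is in millilitres; the source does not state the measurement units.

```python
import math


def ellipsoid_volume(a_cm, b_cm, c_cm):
    """Return V = pi * a * b * c / 6 for three perpendicular diameters.

    Assumption: diameters are in cm, so the volume is in ml (1 cm^3 = 1 ml).
    """
    for diameter in (a_cm, b_cm, c_cm):
        if diameter <= 0:
            raise ValueError("diameters must be positive")
    return math.pi * a_cm * b_cm * c_cm / 6.0


# Hypothetical nodule measuring 3.0 x 2.0 x 2.0 cm.
volume_ml = ellipsoid_volume(3.0, 2.0, 2.0)
print(f"{volume_ml:.2f} ml")  # prints "6.28 ml"
```

Because the three diameters are multiplied together, a small relative error in any one of them carries over proportionally to the volume.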

Ablation procedure
All RFA procedures were performed by an experienced US physician with more than 20 years of experience in thyroid US and interventional US (Y.K.L). A bipolar RFA generator (CelonLabPOWER, Olympus Surgical Technologies Europe, Hamburg, Germany) and an 18-gauge bipolar RF electrode with a 0.9 cm active tip (CelonProSurge micro 100-T09, Olympus Surgical Technologies Europe, Hamburg, Germany) were used in this study.
Patients lay on an operating table in the supine position with the neck extended. Local anesthesia with 1% lidocaine was administered. RFA was performed using the trans-isthmic approach, hydrodissection technique and moving-shot technique. CEUS was performed immediately after the RFA procedure to evaluate the ablation area. If any enhancement existed, a complementary ablation could be performed. Each patient was observed for 1-2 h in the hospital, and any adverse events, including complications and side effects, occurring during and immediately after ablation were carefully evaluated according to clinical signs and symptoms [35].

Post-ablation measurement of Va
Two physicians (Observer A, Y.L, with more than 10 years of experience in thyroid US and CEUS; Observer B, X.J, with 3 years of experience in thyroid US and CEUS) performed all the measurements. Before this study, the two observers standardized the measurement method. Va was presented as a non-enhancement zone within the treated nodule during both the arterial phase and the venous phase on CEUS [33]. The anteroposterior and transverse diameters of Va were measured on the transverse CEUS image with the largest dimensions, and the longitudinal diameter was measured on the longitudinal CEUS image with the largest dimensions. Va was measured with the calipers placed outside of the halo [36]. The measurement methods of Va by CEUS are shown in Figure 2.
Patients were scanned consecutively by the observers during the same visit. Only one observer was present in the ultrasound room at any time. For each patient, each observer performed a complete new set of CEUS scans without knowledge of the other observer's results. During the examination, the treated nodule was identified first, and then the system was switched to CEUS mode. The real-time microbubble perfusion within the treated nodule and surrounding tissues was observed for a minimum of 2 min and recorded digitally for further analysis. After the CEUS images were reviewed, the three diameters of the non-enhancement zone during both phases were measured twice to calculate the mean for each observer. Thus, a total of 6 measurements of Va were obtained for each nodule at each follow-up period.
After RFA, patients were followed up at 1, 3, 6, 12 months and every 12 months thereafter. The Vt, Va, VRR, cosmetic and symptom scores were evaluated during the follow-up period. The volume reduction (VRR) was calculated as follows: VRR = ([initial volume - final volume] × 100%)/initial volume. Therapeutic success was defined as a > 50% volume reduction at the last follow-up point [35].
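As a minimal sketch of the VRR calculation and the therapeutic success criterion described above, the following Python functions apply the formula directly. The function names and the example volumes are hypothetical and are not taken from the study data.

```python
def volume_reduction(initial_ml, final_ml):
    """Return VRR in percent: (initial - final) * 100 / initial."""
    if initial_ml <= 0:
        raise ValueError("initial volume must be positive")
    if final_ml < 0:
        raise ValueError("final volume cannot be negative")
    return (initial_ml - final_ml) * 100.0 / initial_ml


def is_therapeutic_success(initial_ml, final_ml):
    """Therapeutic success: volume reduction strictly greater than 50%."""
    return volume_reduction(initial_ml, final_ml) > 50.0


# Hypothetical nodule shrinking from 8.0 ml to 2.0 ml.
print(volume_reduction(8.0, 2.0))        # prints 75.0
print(is_therapeutic_success(8.0, 2.0))  # prints True
```

A negative VRR indicates that the nodule is larger than it was before treatment.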

Statistical analysis
Statistical analysis was performed using the SPSS statistical software (version 25.0) and GraphPad Prism software (version 8.0.0). Continuous data are expressed as mean ± SD (range). Paired-sample t-tests were used for pairwise comparisons.
Reliability was defined as the extent to which measurements can be replicated, which reflects not only the degree of correlation but also the agreement between measurements [37]. The intra- and inter-observer reliability of Va was assessed using the intraclass correlation coefficient (ICC) with 95% confidence intervals (CIs) based on absolute agreement and a two-way random effects model. Reliability was classified as follows: excellent (ICC > 0.90), good (ICC = 0.75-0.90), moderate (ICC = 0.5-0.74), and poor (ICC < 0.50) [37].
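The following is a minimal sketch of how an ICC under a two-way random effects model with absolute agreement can be computed from ANOVA mean squares, together with the reliability classification given above. It assumes the single-measurement form, often written ICC(2,1); the source does not state whether single or average measures were used, and it does not reproduce the SPSS output or the confidence intervals. The function names and example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    `ratings` is a list of rows, one row per nodule, one column per observer.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects (rows), observers (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def classify_reliability(icc):
    """Map an ICC value to the categories used in the text [37]."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Hypothetical Va values (ml) for five nodules measured by two observers.
va = [[2.1, 2.0], [0.8, 0.9], [4.5, 4.4], [1.2, 1.2], [3.3, 3.5]]
print(classify_reliability(icc_2_1(va)))
```

The absolute-agreement form penalizes a systematic offset between observers through the column mean square, which a consistency-only ICC would ignore.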
The inter-observer agreement of Va was assessed using Bland-Altman analysis. Agreement was expressed as the mean difference with 95% limits of agreement (LOA). The mean difference, also called bias, was the tendency for one modality to underestimate or overestimate the measurement relative to the other [38]. The LOA was the range within which 95% of the differences between measurements by the two observers would lie [39], and expressed the absolute magnitude of the agreement between the two observers. The width of the LOA varied with the precision of measurements: the LOA was wider when measurements were imprecise and narrower when they were precise [40]. Before Bland-Altman analysis, the Kolmogorov-Smirnov test was used to assess the normality of the distribution. The analysis was performed on the ratio of the Va measurements of the two observers. The conclusion on agreement should be made based on the width of the LOA in comparison to a priori defined clinical criteria [40,41]. The clinical criteria of thyroid nodule volume using the ellipsoid formula were reported to be between ±13.1% and ±48.96% [42-44]. Therefore, the acceptable clinical criterion for volume in this study was an LOA within the range from 0.5 to 1.5. A difference with p < 0.05 was considered statistically significant.
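A minimal sketch of a ratio-based Bland-Altman calculation is given below. It assumes that the 95% LOA is computed as the mean ± 1.96 SD of the raw ratio of observer A to observer B; the source does not state whether a log transformation was applied, and this sketch does not reproduce the GraphPad Prism output. The function names and the example data are hypothetical; only the 0.5 to 1.5 criterion comes from the text.

```python
import statistics


def ratio_limits_of_agreement(va_observer_a, va_observer_b):
    """Return (mean ratio, lower LOA, upper LOA) for paired Va measurements.

    Assumption: LOA = mean +/- 1.96 * SD of the raw ratio A / B.
    """
    if len(va_observer_a) != len(va_observer_b) or len(va_observer_a) < 2:
        raise ValueError("need at least two paired measurements")
    if any(b <= 0 for b in va_observer_b):
        raise ValueError("observer B volumes must be positive")

    ratios = [a / b for a, b in zip(va_observer_a, va_observer_b)]
    mean_ratio = statistics.mean(ratios)
    sd_ratio = statistics.stdev(ratios)  # sample SD (n - 1 denominator)
    return mean_ratio, mean_ratio - 1.96 * sd_ratio, mean_ratio + 1.96 * sd_ratio


def within_clinical_criteria(lower, upper, low=0.5, high=1.5):
    """Check whether the whole LOA lies inside the a priori range."""
    return lower >= low and upper <= high


# Hypothetical paired Va values (ml) from the two observers.
observer_a = [2.1, 0.8, 4.5, 1.2, 3.3]
observer_b = [2.0, 0.9, 4.4, 1.2, 3.5]
mean_ratio, lower, upper = ratio_limits_of_agreement(observer_a, observer_b)
print(within_clinical_criteria(lower, upper))
```

Working with a ratio rather than a difference keeps the limits comparable across follow-up visits even as the absolute size of Va shrinks.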

Results
The clinical characteristics of patients are presented in Table 1.

Safety
All patients tolerated the RFA procedure. Side effects such as pain occurred in five patients (7.69%) and resolved spontaneously within 7 days. No complications occurred during or after RFA. No patients had side effects or complications related to CEUS.

Intra- and inter-observer reliability
The measurements of Va by the two observers are presented in Table 3. No significant differences in Va measurement were found at any follow-up (all p > 0.05). A representative case is shown in Figure 3. The ICCs for intra- and inter-observer reliability of Va were both excellent (all ICC > 0.90) during the follow-up period (Table 4).

Agreement
The inter-observer agreement of Va is summarized in Table 5. Bland-Altman analysis showed that the mean difference of Va was around 1 at every follow-up period. The 95% LOA of Va became wider over the follow-up period (Figure 4). The narrowest 95% LOA of Va was found at 1 month, from 0.8117 to 1.1220, which meant that for about 95% of cases, the Va measured by observer A was between 0.8117 and 1.1220 times the Va measured by observer B. The widest 95% LOA of Va was from 0.5694 to 1.3430 at 36 months, which meant that for about 95% of cases, the Va measured by observer A was between 0.5694 and 1.3430 times the Va measured by observer B.

Discussion
This study investigated the intra- and inter-observer reliability and agreement of Va measured by CEUS after RFA for benign thyroid nodules. The results showed no significant differences in Va measurements between the two observers. The intra- and inter-observer reliability were both excellent during the follow-up. The inter-observer agreement decreased over the follow-up time, but the widest 95% LOA was still within the clinical criteria. This suggested that the intra- and inter-observer reliability and agreement of Va measured by CEUS were satisfactory overall.
US, as an easily accessible, noninvasive and cost-effective measurement modality, was the most common method to evaluate nodule volume [2,45]. However, its main drawback was observer dependence, which might result in variability [42,45]. The reported inter-observer variation of nodule volume measurement ranged from ±13.1% to ±48.96% [42-44]. This demonstrated that volume changes of at least 49% could be interpreted as nodule shrinkage or growth or as therapy effects, which was recommended by the 2015 American Thyroid Association Guidelines as the cutoff value of volume change [46]. A recent study also investigated the inter-observer reliability of Vt measured by conventional US during the follow-up after RFA for benign thyroid nodules [47]. The results showed that the 95% LOA of Vt became wider over the follow-up period. The widest 95% LOA was between 0.8471 and 1.1733 at 36 months after RFA, which was within the clinical criteria. This suggested that the inter-observer reliability of Vt measurement by conventional US during the follow-up was also acceptable.
RFA and other thermal ablation techniques have been recommended as safe and effective alternatives for benign thyroid nodules [1-6]. These treatments could selectively destroy the targeted nodule, and Vt was divided into two parts after RFA, namely Va and Vv [20,21].
With extensive research in recent years, some novel parameters have emerged to evaluate the efficacy of ablation, all of which were calculated from Va. Vv increase, which was defined as an increase of more than 50% compared to the previously reported smallest Vv, occurred earlier than nodule regrowth and might be used as an early sign [20,22]. In a comparison of volume reduction and improvement in symptom and cosmetic scores among patients treated with additional RFA for different indications, Vv increase was also a more appropriate indicator for additional RFA than clinical evaluation findings (i.e. nodule regrowth, incompletely relieved symptoms, or VRR < 50%) [23]. Moreover, a quantitative index, IAR, was determined as the ratio of Va to Vt to predict the therapeutic success of ablation [24]. The results showed that if IAR was larger than 70%, therapeutic success of ablation could be expected. However, this finding was contrary to a recent study with a 5-year follow-up period, which showed that the cutoff value of IAR for therapeutic success was 49% [25]. Several reasons could explain the discrepant results of those two studies, including their retrospective nature, different clinical characteristics, and heterogeneous follow-up length. The most important reason was that Va in those studies was measured on conventional US rather than CEUS. Our recent study found that Va measured by conventional US was significantly larger than that measured by CEUS [33]. The intra- and inter-observer reliability and agreement between conventional US and CEUS in measuring Va decreased over the follow-up period not only in nodules > 10 ml but also in nodules < 10 ml. The intra- and inter-observer reliability for nodules of all sizes was excellent at 1 month, good at 3-6 months, and moderate at 12-24 months.
The best agreement was found at 1 month, with an LOA of 0.574 to 2.246 in nodules > 10 ml and an LOA of 0.413 to 3.294 in nodules < 10 ml, both of which were wider than the clinical criteria. Similarly, Schiaffino et al. [34] reported that CEUS had higher reproducibility and intra- and inter-observer agreement than conventional US in the measurement of Va after RFA for benign thyroid nodules. This indicated that conventional US was neither reliable nor equivalent to CEUS in the measurement of Va in both small and large nodules.
Accurate detection and measurement of the true Va were essential to evaluate treatment efficacy and guide follow-up management after ablation. When Va was needed for further evaluation of efficacy, CEUS should be considered instead of US [33]. Because CEUS with UCAs was a superior technique for precisely defining the size and margins of the ablated nodules induced by thermal ablation [2,28,30,32,48], it has been recommended in recent guidelines [2,5]. However, the intra- and inter-observer reliability and agreement of Va measurement by CEUS during the follow-up of ablation was unclear. This study found that Va shrank during the follow-up period, which was consistent with the results of a previous study [22]. At each follow-up period, the intra- and inter-observer reliability of Va measured by CEUS were both excellent, and the 95% LOA of Va were all within the clinical criteria. This indicated that Va measured by CEUS after RFA was reliable and reproducible, and was not affected by the irregular morphology or by different observers. This was very important because, in clinical follow-up after RFA, it was almost impossible for the same observer to measure the nodule at every follow-up. Moreover, as Va decreased over the follow-up, the 95% LOA became wider, suggesting that the inter-observer variation was increasing. Similar results were also reported by Choi et al. [43], who found that inter-observer variation of volume measurement was greater in smaller nodules. The explanation might be associated with the ellipsoid formula for volume. As the ablated area decreased and was re-absorbed, any difference in each diameter of the Va might increase the variation. Several factors were also associated with the variation in measurements, including imaging plane acquisition, transducer location, angulation and pressure, as well as differences in the manipulation of the calipers [42]. Therefore, the measurement of Va still needed to be carefully performed.
The definition and measurement methods of Va by CEUS should be standardized and clarified, and the measurements could be repeated twice to obtain the mean value.
In general, CEUS was extremely safe with a low incidence of side effects and complications [28]. UCAs were administered safely in various applications with minimal risk to patients; they were not excreted through the kidney, and there was no evidence of any effect on thyroid function [31]. The incidence of serious anaphylactoid reactions was 0.006% and that of life-threatening anaphylactoid reactions was 0.001% [48]. The most frequent adverse events were headache (2.1%), nausea (0.9%), chest pain (0.8%) and chest discomfort (0.5%) [31]. In this study, all patients tolerated CEUS during the follow-up period and no complications or adverse events occurred.
There were some limitations in this study. First, it was a single-center study. Second, the sample size was relatively small. The nodules in this study were not divided into subgroups based on initial volume, which was recommended by the recent reporting criteria of thyroid ablation [35]. Third, several clinical issues remained about CEUS, such as its high cost, long examination time, and the invasiveness of the procedure. However, the accurate measurement of Va was an important element in evaluating treatment efficacy, regrowth, and the need for additional ablation. When Va was used to calculate Vv increase or IAR, CEUS should be applied [33]. Fourth, the follow-up time was relatively short. Follow-up of these patients will be continued to obtain further conclusions. Fifth, this study did not compare the inter-observer reliability and agreement between CEUS and microvascular imaging techniques, such as superb microvascular imaging (SMI), in measuring Va. Although no significant differences in Va measurement and detection of incomplete ablation rates were reported between CEUS and SMI after laser ablation for benign thyroid nodules [49], the intra- and inter-observer reliability and agreement of different modalities for Va measurement in the follow-up period after ablation need to be investigated.
In conclusion, CEUS could provide a reliable and reproducible assessment of Va after RFA for benign thyroid nodules. In clinical post-ablation follow-up, the irregular morphology of the ablated area and the variation between observers did not affect the assessment of Va by CEUS.

Disclosure statement
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.