Triple negative breast cancer in Bulgaria: epidemiological data and treatment patterns based on real world evidence and patient registries

Abstract Breast cancer is the most common oncologic disease among women worldwide. Survival rates vary significantly and depend on early diagnosis practice and access to treatment. Triple negative breast cancer (TNBC) accounts for 12–15% of all breast cancers. We performed a one-year real-life retrospective study on the patho-histological status and treatment of a representative cohort of patients with TNBC in Bulgaria. We collected anonymised data for TNBC patients from the electronic artificial intelligence platform Sqilline - Danny Platform. Demographic characteristics, data on biomarkers, TNM (Tumor, Node, Metastasis) stage, therapeutic regime, line and changes in treatment were chronologically analysed for all patients. The results were processed through descriptive statistics. For the observed period, Jan 2019–Dec 2019, 6880 breast cancer patients from eight major oncology hospitals were included in the database. The average age of the women was 60 y; 234 (3.4%) of them were diagnosed with TNBC; 10% had unknown TNM stage. The majority of the patients were assigned to chemotherapy (84%), of which 35% were on adjuvant therapy. Most changes in the therapy were observed in the neo-adjuvant group. The results from this study provide evidence that the treatment patterns of TNBC and changes in therapy are in compliance with international guidelines. We identified fewer patients with TNBC than the frequencies reported in international epidemiological studies. This might be attributed to lack of funding of necessary tests or insufficient data in patients' records. The study confirms that dynamic patient registers are important for performing real-world studies of treatment patterns.


Introduction
Breast cancer (BC) is the most common oncologic disease among women, with one in every ten being diagnosed worldwide. The disease is the fifth leading cause of mortality. Survival rates vary among countries and mostly depend on the availability of early diagnosis practice and access to treatment [1]. A recent study of the American Cancer Society shows that diagnosis at early stages increases the 5-year relative survival rate: 99% for localized tumours and 86% for regional ones, whereas advanced stages are associated with a 5-year relative survival rate under 30% [2].
Triple negative breast cancer (TNBC) is considered to be the most aggressive subtype and accounts for 12-15% of all breast cancers [3]. It is related to poor prognosis because it is characterized by a lack of expression of the oestrogen and progesterone receptors and of human epidermal growth factor receptor 2 (HER2). TNBC has a high probability of metastasis and lacks specific targets, and is thus challenging to treat [4]. Studies have shown that there are evident unmet medical needs in TNBC patients related to no treatment or poor treatment outcomes [5].
Significant progress has been made in the innovation and access to more precise treatment of breast cancer based on the development of novel assays and therapies targeting molecular and immune mechanisms. Recent trends aim to provide an individualized approach to each patient based on accurate staging, gene mutation identification, molecular subtype determination, treatment regimen choice and assessment of the risk of recurrence or progression, thus increasing the chance of more promising treatment, especially for challenging types of cancer such as TNBC. The effectiveness of novel therapies is difficult to prove without structured access to the data for all patients assigned to therapy and to the needed diagnostic tests [6][7][8][9].
In the era of innovations in digital technologies and value-based patient care, a substantial part of medical information is still unstructured. Extraction and systematization of information from medical records are of great significance for the improvement of diagnosis, treatment, survival prediction, resource allocation and decision making [10]. Hence, in recent years there has been increased interest in the term 'big data', its interpretation and its potential use in health economics and outcomes research (HEOR) [11].
In this respect, patient records and population-based registries may contribute not only to predicting treatment outcomes and survival rates of cancer patients but also to resource and cost allocation. They could provide evidence from real-life treatment practice for the effectiveness of novel targeted biotech therapies. In recent years, apart from their original role of providing epidemiological data, some cancer registries have started to collect population-based information on the overall efficacy of health care by summarizing data on treatment patterns, progression, survival status and cause of death, thus providing an overall mapping of the patient pathway. This information could be very useful in the evaluation and improvement of good treatment practices, access to medicines and decision making [12]. It could also help in selecting, for innovative therapies, suitable patients who could not benefit from the available options.
The data from the Bulgarian National Cancer Registry for 2015 rate breast cancer as a leading cause of morbidity and mortality among women. Morbidity accounts for 26.8% of all oncology cases and the mortality rate is about 17.4%. The relative 5-year survival is 71.7% but is still less than the average for Europe (81.8%). Approximately 71.2% of the women are diagnosed in early stages and 26% in stage III and IV [13]. A few local studies have evaluated the risk associated with BRCA (BReast CAncer gene) mutations, showing that a BRCA1 mutation was found in approximately 10% of the diagnosed women and BRCA2 in approximately 10%, but there are no official data on the percentage of TNBC patients [14].
This prompted us to examine, using the capabilities of artificial intelligence (AI) programming, the current state of epidemiological and clinical data and the treatment patterns of TNBC in Bulgaria, and their correspondence with global epidemiology and with therapeutic guidelines and recommendations.

Ethics statement
Due to the retrospective observational nature of the research, ethical permission was not obtained. Patients recruited in the database were previously enrolled with the permission of the ethics committees of the corresponding hospitals.

Study design
A one-year real-life retrospective study on the patho-histological status and treatment of a representative cohort of patients with triple negative breast cancer (TNBC) was performed.
For the data collection and analysis, we used the electronic artificial intelligence platform Sqilline - Danny Platform (www.sqilline.com). The platform collects massive amounts of real-world data and analyses unstructured information from patients' records from all university hospitals and major oncology centres in Bulgaria, thus covering almost all available cancer cases in the country. Within the system, complex deep-learning Natural Language Processing (DLNLP) has been developed to prepare the data for analysis and to make analyses of patient treatment, drug efficacy and patient recruitment for clinical trials more efficient and valuable. The application provides quantitative predictive insights based on real-world data and utilizes a suite of statistical and machine learning algorithms to provide real-time analyses and feedback, including important clinically relevant endpoints.
Anonymised data were extracted from the platform for the period Jan 2019-Dec 2019. All patients with clinically proven TNBC treated during the period of data collection were selected. The criteria for inclusion were HER2 negative test, lack of expression of oestrogen and progesterone receptor. We included patients assigned to therapy. Criteria for exclusion were HER2 positive test, or presence of hormonal receptors expression.
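The inclusion and exclusion criteria above can be illustrated with a minimal Python sketch. It is not the platform's implementation: the record layout (`PatientRecord`), the function names and the rule that a record with a missing marker is left out of the cohort are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatientRecord:
    # Hypothetical record layout; the platform's real schema is not described here.
    patient_id: str
    her2_positive: Optional[bool]   # None = test result missing
    er_positive: Optional[bool]     # oestrogen receptor expression
    pr_positive: Optional[bool]     # progesterone receptor expression
    assigned_to_therapy: bool


def is_eligible_tnbc(rec: PatientRecord) -> bool:
    """Apply the inclusion and exclusion criteria stated in the text."""
    markers = (rec.her2_positive, rec.er_positive, rec.pr_positive)
    # Assumption: a record with any missing marker cannot be classified as TNBC.
    if any(m is None for m in markers):
        return False
    # Exclusion: HER2-positive test or any hormonal receptor expression.
    if any(markers):
        return False
    # Inclusion: only patients assigned to therapy.
    return rec.assigned_to_therapy


def select_cohort(records: List[PatientRecord]) -> List[PatientRecord]:
    """Keep only the records that satisfy all criteria."""
    return [r for r in records if is_eligible_tnbc(r)]


# Made-up example records, one per rule.
example_records = [
    PatientRecord("p1", False, False, False, True),   # eligible
    PatientRecord("p2", True, False, False, True),    # HER2 positive
    PatientRecord("p3", False, True, False, True),    # ER positive
    PatientRecord("p4", False, False, None, True),    # PR result missing
    PatientRecord("p5", False, False, False, False),  # not assigned to therapy
]
cohort_ids = [r.patient_id for r in select_cohort(example_records)]
print(cohort_ids)
```

Treating a missing marker as "not classifiable" is only one possible choice; it mirrors the observation that incomplete records make it difficult to assign a patient to a subtype.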
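The data for the selected eligible patients were chronologically systematized and the following characteristics were analysed: demographic characteristics, data on biomarkers, TNM stage, therapeutic regime, line of therapy and changes in treatment.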

Patients' characteristics
For the observed one-year period (Jan 2019-Dec 2019), 6880 breast cancer (BC) patients from eight major oncology hospitals in Bulgaria were included in the dynamic database. The average age of all women with BC included in the database was 60 years. Of them, 234 (3.4%) were diagnosed with TNBC and out of these 10.9% (n = 24) had unknown TNM (Tumor, Node, Metastasis) stage (Table 1).
In the cohort, 57% of the patients were diagnosed at stages I and II and approximately 30% at stages III and IV of the disease. This result could mean that, despite the lack of a national screening programme for breast cancer in Bulgaria, there is a trend for diagnosis at earlier stages of the disease.
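As a small illustration of the descriptive statistics used here, the share of TNBC among all breast cancer patients in the database can be recomputed from the two counts given above. The helper name `share` is an assumption introduced for this sketch.

```python
def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


total_bc_patients = 6880   # all breast cancer patients in the database
tnbc_patients = 234        # patients diagnosed with TNBC

tnbc_share = share(tnbc_patients, total_bc_patients)
print(f"TNBC share: {tnbc_share}%")
```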

Therapy characteristics
During the observed period, all 234 patients were assigned to therapy. Most of the patients were assigned only to chemotherapy (n = 197); 35% of the patients (n = 99) were assigned to adjuvant therapy, with the majority of them only to a chemotherapeutic regime (Table 2).
Eighty-three of the patients assigned to adjuvant therapy received chemotherapy. Only 34 of the patients needed tumor-size reducing chemotherapy before surgical treatment and were assigned to neo-adjuvant therapy (Figure 1 and Table 3).
Only 7% of the patients in the observed cohort reached the 3rd, 4th and 5th line of therapy, which may be attributed either to the fact that they were not eligible for 1st or 2nd line therapy (due to contraindications or lack of therapeutic effect) or to mortality. Further, more detailed subgroup analysis focused on the chemotherapy regime assigned to each patient depending on the type and line (Table 4).
The majority of the chemotherapeutic regimes were platinum-based. The preferred products were taxoids (docetaxel and paclitaxel) and capecitabine, which were used as monotherapy in 63 patients (16 on adjuvant, 5 on neo-adjuvant, 18 on 1st line and 13 on 2nd to 5th line therapy) or in combination, including with alkylating agents, platinum products and osteomodulators (n = 103).
Hormonal products like goserelin, letrozole, tamoxifen, exemestane and anastrozole were prescribed to 11 patients on adjuvant therapy, and in combination with chemotherapy as neo-adjuvant therapy and with chemo- or targeted therapy as 1st line treatment (n = 1 and n = 3, respectively). Targeted therapy with the biotechnological products trastuzumab and pertuzumab was prescribed relatively rarely, as monotherapy for 6 patients or in combination with other chemotherapeutic agents (n = 11). One patient was assigned to the novel PARPi olaparib and one to a CDK 4/6 inhibitor (palbociclib). The biotechnological products denosumab and pertuzumab, in addition to trastuzumab alone or in combination, were employed mostly in the adjuvant therapy. Bevacizumab was mainly prescribed as a 1st line therapy in combination with chemotherapy.

Therapy change
A change in the therapy was observed in 25 of the TNBC patients (Figure 2). Most of the changes were observed in the group of patients who were assigned to neo-adjuvant therapy (n = 9, 36%). This could be attributed to the fact that neo-adjuvant therapy is assigned to patients who are not eligible for surgical and adjuvant treatment at the time of diagnosis due to inoperable tumour sizes. In 7 patients from this group, the therapy remained neo-adjuvant but with a change in the pharmacotherapy, which may be due to unsatisfactory outcomes. In two patients, the neo-adjuvant therapy led to a change to adjuvant therapy (n = 1) and to first line therapy (n = 1).
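The breakdown of changes in the neo-adjuvant group can be tabulated with a short Python sketch. The counts are taken from the paragraph above; the pair representation (setting before, setting after) and the labels used by `classify_change` are assumptions made for illustration and do not describe how the platform stores therapy changes.

```python
from collections import Counter

TOTAL_CHANGES = 25  # therapy changes observed in the whole TNBC cohort

# (setting before change, setting after change) for the neo-adjuvant group
neo_adjuvant_changes = (
    [("neo-adjuvant", "neo-adjuvant")] * 7   # same setting, new pharmacotherapy
    + [("neo-adjuvant", "adjuvant")]         # moved to adjuvant therapy
    + [("neo-adjuvant", "1st line")]         # moved to first line therapy
)


def classify_change(before: str, after: str) -> str:
    """Label a change by whether the therapy setting stayed the same."""
    if before == after:
        return "regimen change within setting"
    return "switch of setting"


transitions = Counter(neo_adjuvant_changes)
summary = Counter(classify_change(b, a) for b, a in neo_adjuvant_changes)
group_share = round(100 * len(neo_adjuvant_changes) / TOTAL_CHANGES, 1)

print(transitions)
print(summary)
print(f"Neo-adjuvant group: {group_share}% of all changes")
```

The same tabulation could be applied to the adjuvant and 1st line groups once their individual transitions are listed.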
In the adjuvant therapy group, 4 patients also stayed on an adjuvant regime but with a change in the prescribed medicines. In the 1st line therapy group, all changes were related to a switch either to another cytostatic medicine or addition of another therapeutic group to the previously assigned treatment (Table 5).
Comparison of the therapy before and after the change shows that for 5 patients the therapy became more complicated, with new and additional agents being added, while the rest started to receive less demanding therapy.

Discussion
To our knowledge, this is the first study on the role of patient registries and real-world data in the breast cancer domain in Bulgaria that uses AI methodology. This methodology could be used for further analyses in this field and for the evaluation of other types of breast cancer. The collected epidemiological data provide additional value to the increasing role of real-world data and artificial intelligence tools in health care and their importance for both clinical practice and decision making. These data could also have predictive value in terms of guidelines compliance and changes in therapy, which is important for a more personalized treatment approach.
The study shows that a lot of progress has been made in recent years in the attempt to handle the amount of unstructured medical data [15,16]. The emerging software landscape now allows the implementation of predefined operations on huge volumes and different types of data. Machine learning algorithms can now extract different patterns of data and provide valuable insights [17,18]. A few medical projects based on Big Data techniques are now available in the domain of breast cancer that help in prognosis and decision making. Machine learning algorithms are able to extract and structure a wide range of information including medical history, tumour size, node involvement, biomarker identification, risk factors and changes in therapy [19,20].
Danny platform is an innovative way to analyse real-world treatment outcomes, survival rates, statistical patient information and therapy changes. The solution uses complex deep-learning Natural Language Processing to prepare the data for analysis and to make analyses of patient treatment, drug efficacy and patient recruitment for clinical trials more efficient and valuable. Sqilline, Danny Platform data sets can enhance precision medicine for patients and physicians based on real-world evidence collection. The artificial intelligence (AI) platform can collect, combine, analyse and share data relevant to all factors which influence a person's health. Moreover, in recent years, patient registries have had an increasing role in the process of providing real-world evidence on the effectiveness, quality and safety of the assigned therapy and other health care resources. Data from patient registries improve population-based studies, which are of great importance for scientific, clinical and policy purposes [21].

Table fragment (therapy type per row; column headings were not preserved):
chemotherapy only        83  30  46  20   7   3   8
hormonal therapy only    11   0   0   0   0   0   0
targeted therapy only     2   1   3   0   0   0   0
chemo + targeted          3   1   1   2   0   0   0
hormonal + targeted       0   0   3
In this study, we evaluated data from a patient registry based on a deep-learning engine that extracts data directly from the patients' hospital records. We obtained real-world evidence on a particular cohort of patients using predefined criteria for type of breast cancer tumor, TNM status, therapy type and line. The results from this study provide valuable information on the compliance of the assigned treatment regimens with national and international guidelines, including those of the European Society for Medical Oncology (ESMO) and the National Comprehensive Cancer Network (NCCN), as well as on changes in the therapy and other important details such as tumor staging and therapy lines [22,23]. These results also show the importance of establishing dynamic patient registries based on the collection of real-world data, as these could further describe the epidemiological and treatment patterns. The Bulgarian Cancer Register provides a large amount of data on the prevalence, incidence and mortality of cancer diseases by gender, age and localization, but it is more static and lacks dynamic data such as assigned therapy type and line of treatment, changes in therapy and progression of the disease. To fill this void, dynamic registries, such as the Sqilline Danny Platform, can provide more focused analysis of the epidemiological and clinical data provided by the Bulgarian Cancer Register.
The results from our study confirm that women diagnosed with TNBC are predominantly above 55 years of age, but the proportion of TNBC is below the average global share among all invasive breast cancer types: 3% vs. 10-20% [24,25]. This could be attributed to the lack of a nationally organized screening programme and to elderly women attending prophylactic examinations less regularly than younger ones, which leads to postponed diagnosis of breast cancer in Bulgaria. Still, the Danny Platform relies on the data entered in the patient records and, if information is missing, it could be difficult for the system to classify the patient. However, in our cohort, most of the patients who were diagnosed in earlier stages were younger, and this may be related to the increasing role of oncology patients' organizations and medical specialists in providing educational programmes on breast cancer and the importance of prophylaxis for early diagnosis. In recent years, we have also observed an improvement in diagnostic methods and therapy, as well as improved patients' access to novel therapies. This real-life study also showed that the assigned therapeutic regimes followed the pharmaco-therapeutic guidelines for oncology and were compliant with the recommendations of NCCN and ESMO, and confirmed the results of a previously conducted study comparing the treatment practices and access to health care services of breast cancer patients in Central and Eastern Europe [26]. The main type of therapy in TNBC is cytostatic chemotherapy and, as there is still no unified standard therapeutic scheme, it is recommended for most of the schemes to be platinum-based with the addition of paclitaxel. Targeted therapy is rarely applied and is mostly trastuzumab and pertuzumab-based.
The recent advances in the development of mutation-targeting cancer therapies such as PARP inhibitors (olaparib, talazoparib) are expected to provide a more personalized approach to each patient, thus increasing the chance of more promising treatment for challenging types of cancer like TNBC. However, these medicines were only recently authorized for breast cancer by the European Medicines Agency and currently only one, olaparib, is available in Bulgaria through the reimbursement system. Another reason for the limited number of patients on targeted therapy could be that genetic tests (BRCA, PD-L1 expression, growth factors, etc.) are still not covered by public funds in Bulgaria, so patients have to pay for them themselves. Nevertheless, the National Health Insurance Fund requires such tests before targeted therapy can be started. In recent years, progress has been made towards a more patient-centred treatment approach with the development of targeted therapy, but there is a need for further biomarker (molecular and immune checkpoint) evaluation in order to assign the most appropriate therapy. Therefore, we can recommend inclusion of TNBC tests in the public financing schemes.
The results of the study also show some important gaps that still need to be covered. We found missing information regarding the age, TNM staging and changes in the therapy. As Sqilline's analytics platform extracts the information directly from the patients' records, it is possible that some data are missing from the records themselves. There are many challenges related to processing data with missing parameters. Therefore, it is extremely important for physicians to enter the patient data correctly and fully. The accuracy of the analysis and of the extracted data depends solely on the correctness of the patients' information. To address this challenge, Sqilline has embedded various deep-learning Natural Language Processing (NLP) and Machine Learning (ML) algorithms in order to make data suitable for analysis.
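The effect of incomplete records can be quantified with a simple completeness check. The sketch below is purely illustrative: the field names and the sample records are made up, and the function `completeness` is an assumption introduced here, not part of the Danny Platform.

```python
from typing import Dict, List


def completeness(records: List[dict], fields: List[str]) -> Dict[str, float]:
    """Percentage of records in which each field is filled in (not None or empty)."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    report = {}
    for field in fields:
        present = sum(1 for rec in records if rec.get(field) not in (None, ""))
        report[field] = round(100 * present / total, 1)
    return report


# Made-up records with the kinds of gaps mentioned in the text.
sample_records = [
    {"age": 61, "tnm_stage": "II", "therapy_change": "none"},
    {"age": None, "tnm_stage": "III", "therapy_change": ""},
    {"age": 54, "tnm_stage": None, "therapy_change": "none"},
    {"age": 70, "tnm_stage": "I", "therapy_change": "none"},
]
report = completeness(sample_records, ["age", "tnm_stage", "therapy_change"])
print(report)
```

A per-field report of this kind would indicate which parts of the patient record most often prevent a patient from being classified.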
Our study has some limitations, mainly because it evaluates only a certain subtype of breast cancer and includes a relatively small number of patients in the retrospective analysis. The approach could be further used to evaluate recent advances in TNBC therapy added to the therapeutic armamentarium. The study also does not show the reasons for the change in the therapy, but we could suppose that it was due to the development of adverse drug reactions, lack of the desired therapeutic effect or progression of the disease. This, however, could be further evaluated, as this type of information is not missing in the database but needs further structuring and analysis.

Conclusions
The results from this study provided evidence that the treatment patterns of TNBC and changes in therapy are in compliance with international guidelines. We identified fewer patients with TNBC than the frequencies reported in international epidemiological studies. This might be attributed to lack of funding of necessary tests or insufficient data in patients' records. The study confirmed that dynamic patient registers are of great importance in performing real-world studies of treatment patterns.