A Study on Cancer Patients in the Region of Stockholm by Linking Data from Multiple Sources

Methods: This study is based on a research database containing more than 78 million records with person-linked diagnoses, drug treatment, and socioeconomic characteristics from eight national and regional registries, for patients with a recorded cancer diagnosis or treated with cancer drugs during 2001-2011. For this cross-sectional registry study, 7,378 patients diagnosed with prostate, breast, or skin cancer during 2009-2010 were selected to assess patient characteristics, comorbidities and drug treatment.


Introduction
The surveillance of drugs newly introduced in clinical practice with regard to efficacy, safety and cost-effectiveness is of utmost importance for patients, health professionals, pharmaceutical companies, regulators, and payers [1][2][3]. Therefore, systems for monitoring disease patterns, drug utilization and treatment outcomes are required [4].
At marketing approval, medicines have demonstrated a positive benefit-risk ratio based on pre-clinical and clinical studies. However, there is still concern about the effectiveness and safety in broader patient populations [4]. These challenges are especially important in the cancer field, where the number of patients studied in clinical trials is often limited [5]. Efforts to facilitate approval of novel treatments, including conditional approval, have been made in recent years [5][6][7]. This increases the need for post-marketing surveillance to assess safety and effectiveness [8][9][10]. Pharmacoepidemiological studies on large numbers of patients may be useful to provide post-approval evidence at a relatively low cost [2]. However, these studies face several challenges, including difficulties in obtaining complete and valid data [11,12].
Population-based registries in the Nordic countries provide good opportunities for pharmacoepidemiological research due to unique identification numbers for all citizens [13,14]. In a large number of studies in the Nordic countries, prescription registries have been used to assess utilization patterns as well as safety or effectiveness of the therapy [13]. Only a limited number of these studies, and of studies from other countries with access to similar registers, have included oncology drugs [13][15][16][17][18][19][20][21][22][23][24], despite the growing expenditure and the large number of new drugs to be introduced in the coming years [25,26]. This could be explained by the fact that many oncology drugs are administered in hospital settings and are, therefore, included only to a limited extent in ambulatory care prescription registries. Furthermore, to the best of our knowledge, none of the previously published population-based studies on cancer drugs has included data on socioeconomics, diagnoses recorded by healthcare providers outside specialist settings, drugs prepared for parenteral administration, or dispensed prescription drugs for conditions other than cancer [15][16][17][18][19][20][21][22][23][24].
As part of a project to improve the introduction and surveillance of new drugs in Stockholm, Sweden [26], a model of record linkage to assess utilization patterns of cancer drugs was initiated. The aim of the present study was to assess the feasibility of using our real-world research database for follow-up of cancer patients. For this purpose, comorbidities, drug treatment, and socioeconomic status were analyzed for patients with a new diagnosis of one of the three most common cancer diseases recorded during 2009-2010.

The research database - Sources and periods
Data from patients with a recorded malignant cancer diagnosis or dispensed cancer drugs during 2001-2011 were selected. Information about comorbidity, other drug treatments, mortality and socioeconomics was added for these patients using the personal identity numbers [14]. For this purpose, data were requested, depending among other things on availability, from the following organizations: a) The National Board of Health and Welfare (NBHW): the Swedish Cancer Register (incident cancer cases 2001-2010) [27][28][29], the National Patient Register (all hospitalizations and outpatient consultations in specialist care with diagnoses 2001-2011) [30,31], the Swedish Prescribed Drug Register (July 2005-2011) [32], and the Cause of Death Register (2001-2011) [33]; b) Statistics Sweden: demographic and socioeconomic data (education, income, civil status and country of birth) (2009-2011) [34]; c) Apoteket AB (the provider of pharmacy services at the time of the study): cancer drugs prepared for parenteral administration from the Sjukhusapotekens läkemedelstillverkning (SALT)/the hospital pharmaceutical manufacturing database (June 2008-2011) [35]; d) Stockholm County Council: the regional administrative database on all healthcare consumption, including primary care consultations with diagnoses (2005-2011) [36][37][38]; e) The Regional Cancer Centre in Stockholm: the National Quality Registry (INCA) for New Cancer Drugs (2010-2011) [39] (Figure 1).

The research database - Codes for selection
The following classification systems were used: the current Swedish version [40] of the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) [41] for diagnoses; and the International Classification of Diseases for Oncology, second edition (ICD-O-2) [42], for the topography (site) and morphology (for selection of malignant tumours) of diagnoses from the Swedish Cancer Register. The Anatomical Therapeutic Chemical (ATC) classification [43] was used for drugs.
Six of the registries were used to select patients residing in Stockholm and Gotland (S-G region) with cancer diagnoses C00-C97 (malignant tumours), D00-D09 (cancer in situ), or D37-D48 (tumours of uncertain or unknown nature), or patients with oncology drugs dispensed as prescriptions or for parenteral use (ATC codes L01, cytostatic and cytotoxic agents, and L02, endocrine therapy) (Figure 1).
Data from the different registries on other diagnoses (recorded in specialist in- and out-patient care and in primary care), deaths, and demographic and socioeconomic characteristics were linked to the selected patients. Data on parenteral drugs (ATC codes L03, immunostimulants, and L04, immunosuppressants) prepared for infusion or injection, and on all other prescription drugs dispensed in ambulatory care from the Swedish Prescribed Drug Register, were also added.
Compilation of these data sources was conducted by NBHW and Statistics Sweden through record linkage using the Swedish personal identity number applied in all the registers [14]. The resulting research database consisted of 29 data files containing more than 78 million individual-level records for different periods during 2001-2011 (Figure 1).
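The selection rule above can be sketched as a simple code filter. This is a minimal illustration, not the authors' implementation (the study's data management was done in SAS); the function and constant names are hypothetical, and only the ICD-10 ranges and ATC groups quoted in the text are taken from the source.

```python
import re

# ICD-10 ranges and ATC groups quoted in the text; all names here are illustrative.
ICD10_RANGES = [("C", 0, 97), ("D", 0, 9), ("D", 37, 48)]
ATC_PREFIXES = ("L01", "L02")


def is_selection_diagnosis(icd10_code: str) -> bool:
    """True if the 3-character ICD-10 category lies in one of the selection ranges."""
    match = re.match(r"^([A-Z])(\d{2})", icd10_code.strip().upper())
    if match is None:
        return False
    letter, number = match.group(1), int(match.group(2))
    return any(
        letter == range_letter and low <= number <= high
        for range_letter, low, high in ICD10_RANGES
    )


def is_selection_drug(atc_code: str) -> bool:
    """True if the ATC code belongs to group L01 or L02."""
    return atc_code.strip().upper().startswith(ATC_PREFIXES)
```

A patient would enter the database if any recorded diagnosis or any dispensed drug satisfies one of the two predicates.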

Patient data linkage
The patient data was anonymized by NBHW and Statistics Sweden (sociodemographics) before delivery by replacing the personal identity numbers with unique serial numbers. A key file containing both personal identity numbers and the corresponding serial numbers was created by Statistics Sweden and was stored for 3 years. The authorities cooperated and shared the key file for the requested data. Applications were approved and confidentiality agreements signed with the data owners before data delivery. Data from the different registries was received as datasets separated according to source.
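The replacement of identity numbers by serial numbers, followed by linkage of the separately delivered datasets, can be sketched as follows. This is a minimal sketch under stated assumptions: the field names (`pin`, `serial`), the function names, and the example records are all hypothetical, and serial numbers are assigned in order of first appearance for simplicity, whereas a real key file would normally use an assignment that cannot be reconstructed from the data.

```python
from collections import defaultdict


def build_key_file(identity_numbers):
    """Map each distinct identity number to a serial number (1, 2, 3, ...)."""
    key_file = {}
    for pin in identity_numbers:
        if pin not in key_file:
            key_file[pin] = len(key_file) + 1
    return key_file


def pseudonymize(records, key_file, id_field="pin"):
    """Return copies of the records with the identity number replaced by its serial number."""
    output = []
    for record in records:
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["serial"] = key_file[record[id_field]]
        output.append(clean)
    return output


def link_by_serial(**sources):
    """Group the records of several pseudonymized sources under each serial number."""
    linked = defaultdict(lambda: {name: [] for name in sources})
    for name, records in sources.items():
        for record in records:
            linked[record["serial"]][name].append(record)
    return dict(linked)


# Fictitious example records; the identifiers are placeholders, not real identity numbers.
cancer_register = [{"pin": "ID-A", "icd10": "C61"}, {"pin": "ID-B", "icd10": "C50"}]
drug_register = [{"pin": "ID-A", "atc": "L02BB03"}]

key_file = build_key_file(r["pin"] for r in cancer_register + drug_register)
linked = link_by_serial(
    cancer=pseudonymize(cancer_register, key_file),
    drugs=pseudonymize(drug_register, key_file),
)
```

Keeping the key file apart from the delivered datasets is what allows the sources to be joined on the serial number without the researchers ever handling the identity numbers.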

Study design and setting
This cross-sectional registry study was based on 7,378 patients with prostate, breast or skin cancer recorded in the Swedish Cancer Register for the Stockholm healthcare region during 2009-2010. This S-G region comprises approximately 23% of the Swedish population (9.7 million inhabitants in 2014). This area of Sweden includes cities, large rural areas and a sparsely populated archipelago. The Stockholm County Council and Gotland are both responsible for financing their primary and secondary healthcare, mainly through taxes.

Selection of the study groups
The Swedish Cancer Register, the only source of incident cancer diagnoses, was used to select our study groups. Patients with prostate (ICD code: C61), breast (C50) or skin (C44) cancer (basal cell carcinoma not included) recorded as the only diagnosis during 2009-2010 were selected (Figure 2). Malignant melanoma (C43) was not included in the skin cancer group. Patients with more than one cancer site recorded in the Swedish Cancer Register or in specialist care (in- and out-patient care) registries during (2009-2010) or before (2001-2008) the study period were excluded. Information about other diagnoses (comorbidity) recorded during the study period (all ICD-10 codes: Chapters A-Q at 3-digit level), drug therapy (all ATC codes except V, various) as well as demographic and socioeconomic data (country of birth, level of education, civil status, and income) was linked to this patient group. The categorization into two income groups was based on the median annual income in the general population in the S-G region on 31 December 2009 [44]. Both the National Patient Register and the regional healthcare administrative databases (including primary care) were used for investigating comorbidities. Patients who died during the study period (127 prostate cancer, 106 breast cancer, and 60 skin cancer) were included in the analyses.
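The inclusion and exclusion rules can be expressed as a small function. This sketch rests on one reading of the criteria, namely that a patient is excluded when more than one distinct cancer site (3-character ICD-10 category) appears across 2001-2010; the function and variable names are hypothetical, and the morphology-based exclusion of basal cell carcinoma is not modelled.

```python
# Study sites named in the text; the mapping and function names are illustrative.
STUDY_SITES = {"C61": "prostate", "C50": "breast", "C44": "skin"}


def assign_study_group(sites_study_period, sites_before):
    """Return the study group name, or None if the patient is excluded.

    sites_study_period: cancer site codes recorded during 2009-2010
    sites_before:       cancer site codes recorded during 2001-2008
    """
    study = {code.strip().upper()[:3] for code in sites_study_period}
    before = {code.strip().upper()[:3] for code in sites_before}
    # Exactly one site during the study period, and no other site at any time.
    if len(study) != 1 or len(study | before) > 1:
        return None
    (site,) = study
    return STUDY_SITES.get(site)  # None for sites outside the three groups, e.g. C43
```

For example, `assign_study_group(["C44"], ["C18"])` returns `None` because a second site was recorded before the study period.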

Statistics
Descriptive statistics are presented as frequencies, proportions and standard deviations. Data management and descriptive analyses were performed with SAS Enterprise Guide 6.1 (SAS Institute Inc., Cary, NC).

Description of the patient groups
During the study period (2009-2010), 8,641 patients were diagnosed with prostate, breast, or skin cancer. According to the selection criteria, 1,263 patients were excluded (Figure 2). Of the 7,378 selected patients, 3,581 had prostate cancer, 2,760 breast cancer, and 1,037 skin cancer. The majority of patients, 96.8%, were registered in Stockholm County, and the remainder in Gotland.
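The patient flow reported above is internally consistent, which a short check makes explicit. Only the counts come from the text; the variable names are illustrative, and the percentage shares are derived here rather than reported by the authors.

```python
# Counts reported in the text.
diagnosed = 8641
excluded = 1263
groups = {"prostate": 3581, "breast": 2760, "skin": 1037}

# Selected patients: diagnosed minus excluded, which should equal the sum of the groups.
selected = diagnosed - excluded
group_total = sum(groups.values())

# Share of each cancer group among the selected patients, in percent (derived).
shares = {name: round(100 * n / selected, 1) for name, n in groups.items()}

print(selected, group_total)
print(shares)
```

Both totals equal 7,378, and prostate cancer accounts for just under half of the selected patients.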
Most of the patients in all three groups were born in Sweden. The education level had similar distribution across the three groups. The patients in all three groups had an income above the median in the general population (Table 1).

Comorbidity
For all the three cancer groups studied, cardiovascular disease was one of the two most common comorbidities (

Treatment with cancer drugs
Parenteral preparations or prescribed oncological agents were dispensed to 32.4% of prostate cancer patients, to 85.9% of breast cancer patients, and to 4.1% of patients with skin cancer (Table 3). The most common treatments were: bicalutamide and leuprorelin in prostate cancer; tamoxifen, cyclophosphamide, and epirubicin in breast cancer; bicalutamide, leuprorelin, and fluorouracil in skin cancer patients. The 13 patients with skin cancer who received bicalutamide or leuprorelin had a prostate cancer diagnosis recorded only in the primary care registry. Patients may have received more than one of the medications during the study period, either in combination or sequentially (not analyzed).
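Proportions of this kind count each patient once, however many dispensings they had. A minimal sketch of that calculation follows, assuming dispensing records are available as (patient, ATC code) pairs; the function name, the record layout, and the toy data are hypothetical and do not reproduce the study's figures.

```python
def percent_dispensed(cohort_ids, dispensings, atc_prefixes):
    """Percentage of cohort members with at least one dispensing matching the ATC prefixes."""
    cohort = set(cohort_ids)
    if not cohort:
        raise ValueError("cohort is empty")
    prefixes = tuple(atc_prefixes)
    # A set keeps each treated patient once, regardless of the number of dispensings.
    treated = {
        patient
        for patient, atc in dispensings
        if patient in cohort and atc.upper().startswith(prefixes)
    }
    return round(100 * len(treated) / len(cohort), 1)


# Toy data: patient 1 has two cancer drug dispensings, patient 9 is outside the cohort.
toy_cohort = [1, 2, 3, 4]
toy_dispensings = [
    (1, "L02BB03"),
    (1, "L02AE02"),
    (2, "N02BE01"),
    (3, "L01AA01"),
    (9, "L01AA01"),
]
toy_result = percent_dispensed(toy_cohort, toy_dispensings, ["L01", "L02"])
```

Because combinations and sequences were not analyzed, a patient-level set like this is sufficient; studying treatment order would additionally require dispensing dates.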

Other drug treatments
More than 97% of the patients in all three groups had obtained non-cancer drugs. During the two-year period the most common non-cancer treatments were anti-infectives in patients with prostate cancer, drugs for nervous system disorders in patients with breast cancer, and cardiovascular drugs in patients with skin cancer (Table 4). Less than 3% of the patients had no other prescription drugs dispensed in ambulatory care during the period. Among drugs used for nervous system disorders, various analgesics were dispensed to 41.8% of all patients with prostate cancer, 39.8% of patients with breast cancer, and 28.3% of patients with skin cancer. Psychotropic drugs were also dispensed: sedatives to 11.3%, 18.7%, and 21.0%, respectively; antidepressants to 4.2%, 6.8%, and 5.3%, respectively; and tranquilizers to 3.8%, 6.3%, and 5.0%, respectively. Neuroleptics were dispensed to 1.1% or less of the patients in all three groups. About one fourth (27%) of all men with prostate cancer received agents for erectile dysfunction. Methotrexate was dispensed to less than 1% of the patients in all three groups. Antiemetics and leucocyte-stimulating drugs (mainly pegfilgrastim) were most frequently received by breast cancer patients, 5.8% and 23.8%, respectively (data not shown).

Discussion
This study was carried out in line with the increasing interest in real-world healthcare data for follow-up of patients. After collecting data on individual cancer patients from different sources into a research database, we created the three cancer groups for the present study. Data from the Swedish Cancer Register was chosen to select the cancer patients since this is the only registry (except for the quality registry) with verified tumour diagnoses [27]. In order to study oncological drug treatment, patients with more than one cancer recorded between 2001 and 2010 in the Swedish Cancer Register or in specialist care registries were excluded. Cancer diagnoses made in primary care were not considered in the selection of the study groups, on the assumption that they had not been confirmed by a specialist.
Results from patients with prostate, breast and skin cancer diagnosed during 2009-2010 are presented as an illustration of the potential of record linkage to monitor health, drug utilisation, effectiveness and safety in cancer patients. Similar conclusions on the potential of record linkage for studies in cancer patients were drawn in a Dutch overview study [45]. We also showed the additional benefit of the Nordic population-based registries with individual-level data on comorbidity, drug treatment, and socioeconomic characteristics of the patients. Socioeconomic status has been shown to be associated with cancer incidence and survival [46,47]. It was not clear how the different socioeconomic factors affected this association, and it was suggested that residence area may play a role [46,47]. This needs to be addressed, and the socioeconomic components in our research database facilitate such studies. Moreover, the data can be used to investigate inequities in access to medicines, and for safety and effectiveness studies.
The apparent differences in income and civil status found among prostate cancer patients, in comparison to the other two groups, call for a deeper analysis that takes into account other potential factors explaining these findings.
The prevalence of hypertension in our study was apparently higher in all three study groups compared to that reported in a study on the general population in Stockholm during 2007-2011 [36]. In that study the prevalence of hypertension was 43.1% (men 65-74 years), 31.9% (women 45-74 years), and 51.5% (both genders 65 years and above). On the other hand, the prevalence of diabetes mellitus in the prostate cancer group seems to be lower than that found in the same study, where diabetes mellitus prevalence in men was 25.5% (65-74 years). Data from that study was recalculated for comparison. Some of the comorbidities might be related to cancer while others represent common conditions in the general population. Further analysis is required to clarify this issue.
Most of the oncological drugs used by the patients in the study groups were mature drugs available on the Swedish market before the turn of the millennium [48]. The exceptions were polyestradiolphosphate (approved 2007) for prostate cancer and exemestane (2000), capecitabine (2001), and bevacizumab (2005) for breast cancer. The drugs were used mainly according to their approved indications for prostate cancer or breast cancer. Some exceptions were tamoxifen in prostate cancer patients and carboplatin and cisplatin in breast cancer patients. However, tamoxifen has been used in the preventive treatment of gynecomastia and breast pain in patients with prostate cancer receiving antigonadal treatment [49]. Carboplatin in combination with paclitaxel and trastuzumab has been proposed as an advantageous alternative treatment in patients with breast cancer [50]. Etoposide, unsuccessful as single treatment in patients with breast cancer, has shown better effect when used in combination with cisplatin [51]. Some patients with skin cancer received cisplatin, whose indications include squamous-cell carcinoma. The skin cancer patients who received anti-androgen treatment (bicalutamide or leuprorelin) had a prostate cancer diagnosis recorded only in primary care. Anti-androgens may also be used against hirsutism in certain skin cancer types [52].
A Dutch study showed a minimal use of cyclophosphamide, methotrexate, and 5-fluorouracil during 2005 to 2008 in early-stage breast cancer. They also reported a decrease in the use of anthracyclines to 68% of the patients, while the use of trastuzumab- and taxane-containing treatments increased to 24% and 34%, respectively, during this period [53]. In contrast, we found that these oncological drugs were all used to a lower extent in our study, while the antiestrogen tamoxifen was the most frequent treatment. These differences may be explained by cross-country differences in the determinants behind the introduction of new medicines in healthcare, including guidelines, regulations, support structures, participation in clinical trials and pharmaceutical company marketing [54]. In recent years, different countries have also presented various models to optimize the introduction of new medicines, including horizon scanning, forecasting, risk-sharing arrangements and health technology assessment post-launch [55].
* The following ICD-10 codes were used: Cardiovascular: I00-I99; Urogenital: N00-N99; Musculoskeletal: M00-M99; Endocrine: E00-E90; Respiratory: J00-J99; Skin: L00-L99; Eye: H00-H59
Although cancer diagnoses made in primary care were not used in the selection of the study groups, this data proved important for gathering all the information about diagnoses and healthcare contacts, and for explaining the indication for observed treatment. It was found that all three groups had other cancer diagnoses recorded only in primary care. Some of these diagnoses may have been reported to the national Cancer Register before 2001, or were not confirmed in specialist care.

Strengths and limitations
The use of personal identity numbers to link different data sources is an advantage of this study. Data from the different registries on other diagnoses, recorded in specialist or primary care, deaths, and socioeconomics was included. The comprehensive coverage of the healthcare administrative database in Stockholm, including hospitalizations, outpatient specialist care and primary care, is an important strength of this study. With the exception of very few private clinics that operate without subsidies, all consultations and diagnoses in Stockholm are recorded in this database. An additional strength is the combination of data on drug treatment from different sources, enabling an overview of the cancer treatment. Different methods have been applied in other epidemiological studies to obtain information on drug use in cancer patients. However, to our knowledge, this is the first Swedish study undertaken with individual-level data on cancer medication administered in the hospital for three malignant diseases. The registry of drug preparations has previously been used in a Swedish study on castration-resistant prostate cancer [35]. Still, it is important to acknowledge that further information on drugs administered in hospital care may be found in the electronic medical records [56]. These data will be incorporated in future studies.
A common limitation in registry studies is the validity and completeness of diagnoses [12]. However, the validity of recorded hospital diagnoses in Sweden in general is well documented, as are the completeness and validity of the Swedish Cancer Register [27,30]. We used diagnoses from both the National Patient Register and the administrative database in order to capture patients' care in other regions. Comorbidity data for the patients registered in Stockholm and Gotland was gathered from the National Patient Register. Currently, healthcare providers are obliged to report hospitalizations and out-patient specialist care to this register. The regional administrative database includes only the patients from Stockholm. Therefore, primary care data from Gotland, representing 3.2% of the selected population, is missing. Another limitation of this study may be the inclusion of patients who died during the two-year period, which may have led to underestimation of treatment. However, this would probably have a minor effect on the results since at most 5.8% of the patients in any group died during the two years.
The main purpose of this study was to describe some of the possibilities for surveillance of cancer patients and their treatment in clinical practice using record linkage of existing databases. The results show that the current research database fulfils the criteria for obtaining information about cancer diagnosis, other diagnoses, drug therapy (including temporal associations between diagnosis and treatment), diagnosis-related procedures, demographics, and socioeconomic status. However, for studies on effectiveness and safety, additional registries and medical records with data on additional drugs administered in the hospital, clinical assessments, and laboratory data need to be linked.
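The 5.8% figure can be traced back to the death counts and group sizes reported earlier in the paper. The short calculation below uses only those reported counts; the variable names are illustrative, and the per-group and overall percentages other than the maximum are derived here rather than stated by the authors.

```python
# Group sizes and deaths during 2009-2010, as reported in the text.
group_sizes = {"prostate": 3581, "breast": 2760, "skin": 1037}
deaths = {"prostate": 127, "breast": 106, "skin": 60}

# Percentage of each group that died during the study period.
death_pct = {
    name: round(100 * deaths[name] / group_sizes[name], 1) for name in group_sizes
}

# Overall percentage across the three groups combined.
overall_pct = round(100 * sum(deaths.values()) / sum(group_sizes.values()), 1)

# The group with the highest proportion of deaths.
highest_group = max(death_pct, key=death_pct.get)

print(death_pct, overall_pct, highest_group)
```

The maximum, 5.8%, occurs in the skin cancer group, while the proportion across all 7,378 patients is about 4%.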