Predictive model algorithms identifying early and advanced stage ER+/HER2− breast cancer in claims data
Published in: | Pharmacoepidemiology and drug safety Vol. 28; no. 2; pp. 171 - 178 |
---|---|
Main Authors: | , , , , , |
Format: | Journal Article |
Language: | English |
Published: | England: Wiley Subscription Services, Inc, 01-02-2019 |
Summary: | Purpose
Claims databases offer large populations for research but lack clinical detail. We aimed to develop predictive models to identify estrogen receptor positive (ER+) and human epidermal growth factor receptor 2 negative (HER2−) early stage breast cancer (ESBC) and advanced stage breast cancer (ASBC) in a claims database.
Methods
Female breast cancer cases in Anthem's Cancer Care Quality Program served as the gold standard validation sample. Predictive models were developed from clinical knowledge and empirically from claims data using logistic and lasso regression. Model performance was assessed by classification rates and c‐statistics. Models were applied to the HealthCore Integrated Research Database (claims) to identify cohorts of women with ER+/HER2− ESBC and ASBC.
Results
The validation sample included 3184 women with ER+/HER2− ESBC and 1436 with ER+/HER2− ASBC. Predictive models for ER+/HER2− ESBC and ASBC included 25 and 20 factors, respectively. The models discriminated cases well (c‐statistic = 0.92 for ESBC and 0.95 for ASBC). Compared with a traditional a priori algorithm developed from clinical insight alone, the ER+/HER2− ASBC predictive model had a higher positive predictive value (PPV) (0.91; 95% CI, 0.90‐0.93 vs 0.69; 95% CI, 0.66‐0.73) and higher sensitivity (0.54 vs 0.35). Applied to the claims database, the models identified cohorts of 33 001 and 3198 women with ER+/HER2− ESBC and ASBC, respectively.
Conclusion
We conducted a validation study and developed predictive models to identify cohorts of women with ER+/HER2− ESBC and ASBC in a claims database. The models identified large cohorts in the claims data that can be used to characterize indications in the evaluation of targeted therapies. |
---|---|
ISSN: | 1053-8569 1099-1557 |
DOI: | 10.1002/pds.4681 |
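
The summary reports model performance as classification rates (PPV, sensitivity) and c‐statistics. As a rough illustration of how such measures can be computed against a gold standard, the sketch below uses plain Python on a small invented data set. The function names, the 0.5 probability cutoff, and the toy labels and scores are all assumptions made for illustration; they are not the authors' code, data, or thresholds.

```python
from typing import Dict, Sequence


def classification_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return PPV and sensitivity for binary gold-standard labels and model calls."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # PPV: share of model-flagged patients who are true cases.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    # Sensitivity: share of true cases that the model flags.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return {"ppv": ppv, "sensitivity": sensitivity}


def c_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Return the c-statistic: the probability that a randomly chosen case
    receives a higher score than a randomly chosen non-case (ties count 1/2)."""
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    non_cases = [s for t, s in zip(y_true, scores) if t == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Invented toy data: 1 = case confirmed by the gold standard, 0 = non-case.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical predicted probabilities from some fitted model.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
# Assumed cutoff of 0.5 for calling a patient a case.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

rates = classification_rates(y_true, y_pred)
auc = c_statistic(y_true, scores)
print(rates, auc)
```

The pairwise loop is quadratic in sample size, so for samples as large as the validation set described in the summary a rank-based implementation from a statistics library would be the more practical choice.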