Propensity score methods for merging observational and experimental datasets

Bibliographic Details
Published in: Statistics in Medicine, Vol. 41, no. 1, pp. 65-86
Main Authors: Rosenman, Evan T. R.; Owen, Art B.; Baiocchi, Mike; Banack, Hailey R.
Format: Journal Article
Language: English
Published: England: Wiley Subscription Services, Inc, 15-01-2022
Description
Summary: We consider how to merge a limited amount of data from a randomized controlled trial (RCT) into a much larger set of data from an observational data base (ODB), to estimate an average causal treatment effect. Our methods are based on stratification. The strata are defined in terms of effect moderators as well as propensity scores estimated in the ODB. Data from the RCT are placed into the strata they would have occupied, had they been in the ODB instead. We assume that treatment differences are comparable in the two data sources. Our first “spiked‐in” method simply inserts the RCT data into their corresponding ODB strata. We also consider a data‐driven convex combination of the ODB and RCT treatment effect estimates within each stratum. Using the delta method and simulations, we identify a bias problem with the spiked‐in estimator that is ameliorated by the convex combination estimator. We apply our methods to data from the Women's Health Initiative, a study of thousands of postmenopausal women which has both observational and experimental data on hormone therapy (HT). Using half of the RCT to define a gold standard, we find that a version of the spiked‐in estimator yields lower‐MSE estimates of the causal impact of HT on coronary heart disease than would be achieved using either a small RCT or the observational component on its own.
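The two estimators named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the stratum labels have already been computed (for example, from propensity scores estimated in the ODB), that every stratum contains both treated and control units, and that the convex-combination weights `lam` are supplied by the caller rather than chosen by the paper's data-driven rule. All function and argument names are hypothetical.

```python
import numpy as np

def stratum_effects(stratum, treated, outcome, n_strata):
    """Treated-minus-control difference in mean outcome within each stratum."""
    stratum = np.asarray(stratum)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    effects = np.full(n_strata, np.nan)  # NaN marks a stratum missing an arm
    for k in range(n_strata):
        in_k = stratum == k
        t = outcome[in_k & treated]
        c = outcome[in_k & ~treated]
        if len(t) > 0 and len(c) > 0:
            effects[k] = t.mean() - c.mean()
    return effects

def odb_stratum_weights(odb_stratum, n_strata):
    """Share of ODB units falling in each stratum (assumed weighting)."""
    counts = np.bincount(np.asarray(odb_stratum), minlength=n_strata)
    return counts / counts.sum()

def spiked_in_estimate(odb, rct, n_strata):
    """Insert RCT units into their ODB strata, then average the pooled
    stratum effects using the ODB stratum shares as weights.

    odb and rct are (stratum, treated, outcome) tuples of equal-length arrays.
    """
    pooled = [np.concatenate([np.asarray(a), np.asarray(b)])
              for a, b in zip(odb, rct)]
    effects = stratum_effects(pooled[0], pooled[1], pooled[2], n_strata)
    return float(np.dot(odb_stratum_weights(odb[0], n_strata), effects))

def convex_combination_estimate(odb, rct, n_strata, lam):
    """Blend ODB and RCT stratum effects as lam*ODB + (1 - lam)*RCT.

    lam is a scalar or per-stratum array in [0, 1]; here it is an input,
    standing in for the data-driven choice described in the summary.
    """
    lam = np.asarray(lam, dtype=float)
    blended = (lam * stratum_effects(*odb, n_strata)
               + (1.0 - lam) * stratum_effects(*rct, n_strata))
    return float(np.dot(odb_stratum_weights(odb[0], n_strata), blended))
```

Setting `lam` to 1 recovers an ODB-only stratified estimate and setting it to 0 recovers an RCT-only one, which makes the convex combination easy to check against the two data sources separately.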
Bibliography: Funding information
American Society for Engineering Education, NDSEG Fellowship; Division of Mathematical Sciences, 1407397; 1521145; Google: Graduate Support, National Science Foundation, IIS‐1837931
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.9223