Challenge: Many UCSF researchers are interested in questions about human health and the delivery of health care services that could be studied using large administrative datasets, such as those generated by the Centers for Medicare and Medicaid Services (CMS), the California Office of Statewide Health Planning and Development (OSHPD), and agencies that collect vital statistics data (e.g., birth certificates, death certificates). While some UCSF researchers have conducted important research with these datasets, expanding the pool of researchers who work with them is challenging for several reasons. Organizations that produce these datasets often have difficulty responding to requests promptly due to limited resources and competing priorities. In addition, extensive programming is often required to transform the raw data into usable information. For some research questions, researchers also need to link vital statistics records with administrative data on the delivery of health care services. Because many of the publicly available versions of these datasets are de-identified, linking them typically relies on probabilistic matching algorithms. The complexity and error-prone nature of probabilistic matching represent a barrier to the full use of administrative and vital statistics data by researchers who are not experts in these techniques.
Solution - Data Concierge Service: Building upon resources for analysis of large, public datasets that are already available to UCSF researchers through the Comparative Effectiveness Large Dataset Analysis Core (CELDAC), this project would establish a concierge service to assist UCSF researchers in accessing large administrative datasets. A ‘special access’ data manager, employed simultaneously by UCSF and the agencies that collect administrative data (OSHPD, CA Department of Public Health, CMS), would link records across datasets. The data manager would have expertise in the use of deterministic and probabilistic matching algorithms to merge datasets using unique identifiers (where available) and other variables such as date, age, gender, and zip code. The data manager would generate customized datasets tailored to researchers’ specifications, create de-identified versions of them, and deliver them to requestors (probably through MyResearch). The data manager and CELDAC’s principal investigator would work with investigators to ensure they secure the approvals needed to analyze and report upon the data, and would serve as liaisons with data-providing agencies. This service could be funded in a manner similar to CTSI’s existing consultation services (CTSI subsidy for the initial hour of consultation, recharge for subsequent hours of service). It could also be made available to researchers at other CTSAs to broaden the potential user base.
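To illustrate the record-linkage step described in this solution, the following is a minimal Python sketch, not the method the data manager would necessarily use. It links on a unique identifier where one is available (deterministic matching) and otherwise falls back to a simple weighted-agreement score on date, age, gender, and zip code. That score is a simplified stand-in for a full probabilistic model such as Fellegi-Sunter. The `Record` fields, the `WEIGHTS` values, and `THRESHOLD` are all hypothetical and would need to be calibrated against real data.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Record:
    record_id: str
    unique_id: Optional[str]  # hypothetical shared identifier; None if unavailable
    event_date: str           # ISO date string, e.g. "2010-03-14"
    age: int
    gender: str
    zip_code: str


# Illustrative agreement weights and cutoff (assumptions, not calibrated values).
WEIGHTS = {"event_date": 4.0, "age": 2.0, "gender": 1.0, "zip_code": 3.0}
THRESHOLD = 7.0


def match_score(a: Record, b: Record) -> float:
    """Sum the weights of the fields on which the two records agree exactly."""
    return sum(w for field, w in WEIGHTS.items()
               if getattr(a, field) == getattr(b, field))


def link(a_records: List[Record], b_records: List[Record]) -> List[Tuple[str, str, str]]:
    """Return (a_id, b_id, method) triples for each linked pair."""
    by_id = {b.unique_id: b for b in b_records if b.unique_id is not None}
    links = []
    for a in a_records:
        # Deterministic step: exact match on the unique identifier.
        if a.unique_id is not None and a.unique_id in by_id:
            links.append((a.record_id, by_id[a.unique_id].record_id, "deterministic"))
            continue
        # Fallback step: best-scoring candidate, accepted only above the cutoff.
        best = max(b_records, key=lambda b: match_score(a, b), default=None)
        if best is not None and match_score(a, best) >= THRESHOLD:
            links.append((a.record_id, best.record_id, "probabilistic"))
    return links
```

In practice, pairs scoring near the cutoff would likely need manual review, which is one reason the proposal places this work with a dedicated data manager rather than with individual investigators.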
Potential Partners: Members of Stanford’s CTSA have expressed interest in collaborating with UCSF CTSI to enhance capacity to conduct research using secondary datasets. CTSAs at other UC campuses may be interested, as may faculty and trainees in the School of Public Health at UC Berkeley. The UC Research Exchange would be a valuable partner in this effort due to its experience in bringing UC campuses together to improve access to administrative data for health research. In addition, CELDAC’s principal investigator has good contacts with staff of OSHPD’s Healthcare Information Division who are interested in enhancing their ability to serve researchers and other customers.
Innovation: This proposal builds upon UCSF CTSI’s existing resources for conducting research with large administrative datasets by creating a data concierge service that would help UCSF researchers more quickly obtain secondary datasets tailored to their specific research interests. If the project is successful, CELDAC could be transformed from a conduit of information about datasets that other organizations generate into a concierge that works proactively with these organizations to help researchers at UCSF, and potentially other CTSAs, obtain the data they need for their research. Making the requisite public data more accessible is expected to significantly expand the use of secondary data to address important hypotheses in public health and comparative effectiveness research. This innovation may be particularly valuable during the current contraction in national research funding.
Projected Impact: This project could strengthen UCSF researchers’ ability to conduct timely research with large administrative datasets that advances our understanding of factors that affect human health. If successful, the project could serve as a model for other CTSAs.