Doing Big Data All By Yourself: Interactive Data Driven Decision Making by Non-Programmers

A session at Strata Rx

Tuesday 16th October, 2012

5:15pm to 5:55pm (PST)

As healthcare organizations gain access to more and more data, they rely increasingly on complex systems and predictive tools to make sense of it all. This has led to a divide between those who perform data analysis (programmers, statisticians, database administrators) and those who make decisions (physicians, insurance administrators, policy makers). In addition, the needs of those users are fairly heterogeneous, even on a single data set.

In this presentation, we will show a working system that bridges the gap between data analysis and decision making using a carefully composed set of big-data technologies mated with an interactive, high-level interface. By leveraging a powerful backend infrastructure and an intuitive set of analytical tools, health professionals ranging from physicians to insurance analysts can perform interactive, real-time analysis on all data within their enterprise.

To build the system for this real-time, interactive demonstration, we integrated ten years of Medicare claims, including information from 700,000 physicians, 30 million beneficiaries, 100 million claims, and 1 billion medical procedures. We also integrated 20 million PubMed journal articles and information on one million physicians and 3,000 hospitals using the National Provider Identifier Database and Medicare Hospital Compare.

The first analysis is from the perspective of a policy analyst who wants to better characterize Medicare’s “high spenders” (beneficiaries who rank among the top 0.01% in terms of health spending) to drive policy decisions. We drill down on 30 million beneficiaries in seconds to identify high spenders and look at their providers, diagnoses, and procedures in more detail.

The second analysis is from the perspective of a single physician who wants guidance on a patient consult. In this analysis, the physician is able to determine the estimated total cost of the patient’s disease, the clusters of treatment strategies, and information detailing how much a set of treatments will cost and how painful they will be for the patient. The physician can also use the platform to zoom out for a summary of all of her own patients and all patients at her hospital.

Our presentation underscores the importance of data-driven decision making, an endeavor that becomes more challenging as more data become available to health organizations. Most importantly, using the right composition of technologies and proper data integration, it’s possible to build a system that is flexible and extensible, letting the different types of users who are interested in health care data quickly answer nuanced questions that are rigorously backed by data.

About the speaker

Ari Gesher

Senior Software Engineer and blogger at Palantir Technologies (@PalantirTech). I used to be named Ari Gordon-Schlosberg; an explanation: http://bit.ly/c40tHM (bio from Twitter)
