Munzoor Shaikh and Greg Layok featured in Becker's Health IT & CIO Review
A recent CIO article estimated the volume of healthcare data at the end of 2013 at just over 150 exabytes and projected that it would climb north of 2,300 exabytes by 2020, roughly a fifteenfold increase in just seven years.

In response, both healthcare payers and providers are increasing investments in technology and infrastructure to establish competitive advantages by making sense of the growing pool of data. But key actionable insights, such as how to improve the quality of patient care, increase operational efficiency, or refine revenue cycle management, are difficult to find. The core steps of data analytics (capturing, cleaning, analyzing, and reporting) are complex and daunting from both a technical and a subject matter perspective.

It's no surprise, then, that many healthcare organizations struggle to make sense of this data. While the advent of big data technologies such as Hadoop provides the tools to collect and store this data, those tools aren't a magic bullet for translating heaps of information into actionable business insights. To do so, organizations must carefully plan the infrastructure, software, and human capital needed to support analysis on this scale, which can quickly prove prohibitively expensive and time-consuming.

But by starting small in the new era of big data, healthcare organizations are able to create an agile and responsive environment to analyze data without assuming any unnecessary risk. To do so, however, they must be able to answer three questions:

1. What narrowly tailored problem with a short-term business case can we solve?
2. How can we reduce the complexity of the analysis without sacrificing results?
3. Do we truly understand the data? And if not, what can we learn from the results?

Consider two examples to illustrate the effectiveness of starting small – that of a healthcare services provider looking to prevent unnecessary hospital visits, and that of a large healthcare provider looking to improve revenue cycle operations universally after a three-practice merger.

The first example concerns a healthcare services provider that specializes in care coordination. This organization consumes a sizeable volume of claims, often more than five million per month. To supplement core operations (e.g. patient scheduling and post-visit follow-ups), it sought to answer a question that could carry significant value for both payers and providers: how can we reduce the number of unnecessary hospital visits? Digging further, it heard a more refined question clearly from its payer and provider clients: can we identify patients who are at high risk of a return visit to the ER? Last but not least, the organization asked the key question that big data projects often fail to ask: is there a short-term business case for solving this problem?

To answer the question, the organization considered the available data. Although the entire patient population would provide a significant sample size, it could be skewed by factors such as income and payer mix. So the organization decided to narrow the search to a few geographically grouped facilities and use this sample as a proof of concept. This would not only reduce the volume of data analyzed (not likely a problem with only a few million records per month) but also reduce the complexity of the analysis, because more advanced concepts such as control groups and population segmentation would not be required. It would also allow subject matter experts from the individual facilities to weigh in, if necessary, and provide guidance on the analysis.

The results of the analysis were simple and actionable. The service provider found that particular discharge diagnoses have comparatively high rates of return visits to the ER, often because patients do not closely follow discharge instructions. With this information, the payers and providers were able to improve the clarity of discharge instructions and drive post-discharge follow-ups, decreasing the total number of unnecessary readmissions. The cost of unnecessary admissions was significant enough to give the small data project further momentum, allowing it to expand to other regions.
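The kind of analysis described above can be sketched in a few lines. This is only an illustration, not the provider's actual method: the records, the field layout, and the 30-day return window are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical discharge records, invented for illustration:
# (patient_id, discharge_diagnosis, returned_to_er_within_30_days)
# The 30-day window is an assumption; the article does not specify one.
records = [
    ("p1", "chest pain", True),
    ("p2", "chest pain", False),
    ("p3", "asthma", True),
    ("p4", "asthma", True),
    ("p5", "asthma", False),
    ("p6", "sprain", False),
    ("p7", "sprain", False),
    ("p8", "chest pain", False),
]


def return_rates(rows):
    """Map each diagnosis to the share of its discharges followed by an ER return."""
    discharges = defaultdict(int)
    returns = defaultdict(int)
    for _patient, diagnosis, returned in rows:
        discharges[diagnosis] += 1
        if returned:
            returns[diagnosis] += 1
    return {dx: returns[dx] / discharges[dx] for dx in discharges}


def rank_by_return_rate(rows):
    """List (diagnosis, rate) pairs from highest to lowest return rate."""
    rates = return_rates(rows)
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_return_rate(records)
for diagnosis, rate in ranked:
    print(f"{diagnosis}: {rate:.0%}")
```

Diagnoses at the top of such a ranking are the natural candidates for clearer discharge instructions and earlier follow-up calls. A real analysis would also need enough discharges per diagnosis for the rates to be meaningful.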

In the second example, a large regional healthcare services provider posed a similarly tailored question: how can we improve revenue cycle efficiency by reducing penalties related to patient overpayments? At first glance, this would seem to be a relatively small insight for traditional revenue cycle analyses. Potentially more impactful questions (Who owes me money now? Which payer pays the best rates for procedure "xyz"?) could provide a larger payoff, but would inevitably complicate the task of standardizing and streamlining both data and definitions for all three practice groups.

However, the analysis would provide a jumping-off point for understanding the data at a granular level. This provider was able to create reports that identify delayed payments and to prioritize the accounts to work by the "age" of the delayed payment. It was also able to better understand the underlying causes of such delays and to adjust the billing process to ensure timely payments, improving the accounts receivable inventory. Once again, timely payments significantly eased the organization's working capital requirements, proving a significant short-term business case. As a result, the small data project was expanded to include more complex revenue cycle management problems related to underpayment and specialty practice claims.
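Prioritizing accounts by the age of a delayed payment is a simple sort once each account's age is known. The sketch below is a hypothetical illustration: the account data, the as-of date, and the 30/60/90-day bucket boundaries are assumptions, not details from the provider's reports.

```python
from datetime import date

# Hypothetical open accounts, invented for illustration:
# (account_id, payment_due_date, outstanding_balance)
accounts = [
    ("A-100", date(2014, 1, 15), 250.0),
    ("A-101", date(2014, 3, 1), 1200.0),
    ("A-102", date(2014, 2, 10), 75.0),
    ("A-103", date(2014, 3, 20), 400.0),
]


def age_in_days(due, as_of):
    """Days a payment is past due as of the given date (never negative)."""
    return max((as_of - due).days, 0)


def aging_bucket(days):
    """Assign an age to a bucket; the boundaries here are an assumed convention."""
    if days <= 30:
        return "0-30"
    if days <= 60:
        return "31-60"
    if days <= 90:
        return "61-90"
    return "90+"


def work_queue(rows, as_of):
    """Return (account_id, age, bucket, balance) tuples, oldest delay first."""
    aged = []
    for account_id, due, balance in rows:
        days = age_in_days(due, as_of)
        aged.append((account_id, days, aging_bucket(days), balance))
    return sorted(aged, key=lambda row: row[1], reverse=True)


queue = work_queue(accounts, as_of=date(2014, 4, 1))
for account_id, days, bucket, balance in queue:
    print(f"{account_id}: {days} days past due ({bucket}), balance {balance:.2f}")
```

Sorting purely by age mirrors the prioritization described above. Weighting by balance as well would be a reasonable variation, but that goes beyond what the article describes.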

In both examples, these organizations deliberately started small – both in terms of the amount of data, and the complexity of their approach. And by showing restraint and limiting the scope of their analyses, they were able to define a clear business case, derive actionable insights, and gain further momentum to tackle larger challenges faced by the organization.

This article originally appeared on Becker's Health IT & CIO Review.