The terms ‘Big Data’ and ‘Hadoop’ have come to be almost synonymous in today’s world of business intelligence and analytics. The promise of easy access to large volumes of heterogeneous data, at low cost compared to traditional data warehousing platforms, has led many organizations to dip their toe in the water of a Hadoop data lake. The Life Sciences industry is no exception.
With the extremely large amounts of clinical and exogenous data being generated by the healthcare industry, a data lake is an attractive proposition for companies looking to mine data for new indications, optimize or accelerate trials, or gain new insights into patient and prescriber behavior. Often, the results do not live up to their expectations. It’s one thing to gather all kinds of data together, but quite another to make sense of it. Hadoop was originally designed for relatively small numbers of very large data sets.
Author: Neil Stokes