Several years ago, the educational technology company Blackboard selected Apache Hadoop to run a new data analytics application designed to turn data exhaust into actionable insight. Months later, the failed project was cancelled, and Blackboard implemented a hosted relational data warehousing product instead.
The reasons behind Blackboard‘s initial selection of Hadoop for this project will sound familiar: a desire to maximize data exhaust, a need to bring large amounts of data together for analysis, and a curiosity to work with emerging technology. But the factors leading to the Hadoop failure will also ring a bell to those experienced with Hadoop projects: difficulty integrating opens source pieces, complex architectures and data flows, and an inability to read data from Hadoop in a useful and timely fashion.
Author: Alex Woodie