Hidden from view in the “I want to be data-driven” conversation are the nitty-gritty details of how to actually become a data-driven organization. The grand hope is that artificial intelligence, in the guise of machine learning, will power our data-driven aspirations, and it’s clear that big data is the raw commodity that makes it all possible. But the tough question is how to effectively turn that raw data into something the algorithms can work with.
The truth is that many organizations are struggling to turn that raw data into high-powered AI fuel. Capturing, storing, and processing big data still poses significant challenges for organizations that can’t afford to hire an army of data engineers to hand-build custom systems from available open source components. For those of us who want to use big data systems in the real world, as opposed to designing, building, and maintaining them, it’s worthwhile to know that the open source patterns for building distributed systems that store and process big data are in flux at the moment.
Author: Alex Woodie