You’ve heard of Apache Spark, but can you really explain it? What problems does it solve? How does it solve them? If you want to know the answers to these questions, then this post is for you.
The Problems
A tool is only useful if it solves a set of problems, right? So let’s talk about the problems Spark solves.
We need answers (and quickly)
In batch processing, waiting on long-running jobs is expected, but today’s enterprises need answers quickly (in “near real time”). Meanwhile, the attributes of Big Data (velocity, volume, and variety) continue to make it tougher to get answers to business questions