Getting jobs to run on Hadoop is one thing, but getting them to run well is something else entirely. With a nod to the pain that parallelism and big data diversity brings, LinkedIn unveiled a new release of Dr. Elephant that aims to simplify the process of writing tight code for Hadoop. Pepperdata also introduced new software that takes Dr. Elephant the next step into DevOps.
The Hadoop infrastructure at LinkedIn is as big and complex as you likely imagine it to be. Distributed clusters run backend metrics, power experiments, and drive production data products that are used by millions of people. Thousands of internal users interact with Hadoop via dozens of stacks each day, while hundreds of thousands of data flows move data to where it needs to be.
Author: Alex Woodie