Imagine someone has thrown the pieces of five different jigsaw puzzles into a single pile. Before you can start solving the puzzles, you must separate all the pieces. Then you discover that key pieces are missing. You know they are somewhere in the house, but you must go hunting for them.
Data scientists working in traditional IT environments face this kind of challenge every day. Before they can even begin their analytical work, they must complete four basic tasks:
• Gather data: hunting for it across multiple systems.
• Validate data: ensuring they have the right data from the right data set, and that it is accurate, complete and up to date.
Author: Nitay Joffe