Big Fish Swim in the Data Lake

I’ve been known as something of a data lake detractor, deeply suspicious of its early “definition” by James Dixon, CTO of Pentaho, in a 2010 blog post as a place where “the contents… stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.”

I have elaborated elsewhere on the many problems with this description and its direct descendants, but there is also an underlying truth in Dixon’s statement of the problems of traditional data warehousing and his vision that the Hadoop ecosystem has a significant role to play in their solution.

Author: Barry Devlin

