Articles

Big Data File Formats Demystified

So you’re filling your Hadoop cluster with reams of raw data, and your data analysts and scientists are champing at the bit to get started. Then the question hits you: How are you going to store all this data so they can actually use it?

The good news is Hadoop is one of the most cost-effective ways to store huge amounts of data. You can store all types of structured, semi-structure, and unstructured data within the Hadoop Distributed File System, and process it in a variety of ways using Hive, HBase, Spark, and many other engines. You have many choices when it comes to storing and processing data on Hadoop, which can be both a blessing and a curse.

Source: datanami.com
Author: Alex Woodie

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s