Pandora’s Journey from Hadoop to MemSQL

Pandora Media, the music streaming service, transitioned two years ago from heavy reliance on Hadoop for distributed processing to MemSQL as its primary database. The switch came after the company found by 2016 that it could no longer keep up with evolving requirements and new features on the advertising-based side of the streaming service.

“Answers about Monday weren’t available until Wednesday,” the company noted in a blog post detailing its transition to MemSQL. Query performance was sufficient, but the pipeline could not keep up with each “24-hour period” of data, which arrived with a roughly 20-hour delay on a large Hadoop cluster. Performance declined further as the daily volume of data grew and required more processing time. As data backed up, Pandora engineers said they considered 30 different platforms for delivering query results in real time.

Source: datanami.com
Author: George Leopold
