Apache Spark 1.4 adds R language and hardened machine-learning

With support for stats language R, along with a range of new features, the latest update to in-memory data-processing engine Apache Spark is now out.

By providing access to the popular R statistical programming language, the latest iteration of the fast-growing analytics cluster framework Spark aims to make life easier for data scientists. Spark 1.4, which is now generally available, adds support for Python 3 and lets R users work directly on large datasets through the SparkR API. “Because SparkR uses Spark’s parallel engine underneath, operations take advantage of multiple cores or multiple machines, and can scale to data sizes much larger than standalone R programs,” says Patrick Wendell.

Author: Toby Wolpe
