Data Science How-To: Using Apache Spark for Sports Analytics

Apache Spark has become a common tool in the data scientist’s toolbox. In this post we show how to use the recently released Spark 2.1 to analyze data from the National Basketball Association (NBA). All code and examples from this blog post are available on GitHub.

Analytics have become a major tool in the sports world, and in the NBA in particular they have shaped how the game is played. The league has shifted toward taking more 3-point shots because of their high efficiency, measured as points per field goal attempt. In this post we evaluate this trend using season statistics going back to 1979 along with geospatial shot chart data.
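To make the efficiency metric concrete, here is a minimal sketch in plain Python (not Spark) of points per field goal attempt. The function name `points_per_attempt` and the shot totals are hypothetical, chosen only for illustration; they are not taken from the NBA data used in this post.

```python
def points_per_attempt(made, attempted, points_per_make):
    """Average points scored per field goal attempt."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return points_per_make * made / attempted


# Hypothetical totals for illustration only (not real NBA figures).
two_pt = points_per_attempt(made=480, attempted=1000, points_per_make=2)
three_pt = points_per_attempt(made=360, attempted=1000, points_per_make=3)

print(f"2-point shots: {two_pt:.2f} points per attempt")
print(f"3-point shots: {three_pt:.2f} points per attempt")
```

With these made-up numbers, the 3-point shot yields more points per attempt even though it goes in less often, which is the kind of comparison the efficiency argument rests on.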

Author: Chris Rawles

