Python plus R equals Data Scientist?

I have read a number of articles and posts that claim that if you have taken courses in linear algebra and multivariate calculus, and know R and Python, then you are a data scientist. That sounds really good. I have taken courses in embryology, histology, and physiology, so I must be an OB-GYN. I also stayed in a Holiday Inn Express last night.

Other things I have read indicate that data scientists carry a blemish that makes them seem unneeded. No wonder! It appears that if I give $10 to Vincent ‘Vinnie’ Antonelli [Steve Martin in My Blue Heaven], he can make me a data scientist. I have talked to recruiters who have been disappointed in the actual qualifications of some who label themselves data scientists.

Are these people real? Two college courses plus R and Python? Although my view may not be popular, a monkey can learn to use tools. Tools do not make you anything other than a tool user. A deeper understanding of data structure, data analysis, data modeling, data … is what one needs to be a data scientist. Conceptual understanding, not tools. If you have the former, you can pick up the latter in no time.

However, I will go on to say that conceptual understanding is not enough. When I see the data scientist depicted in cartoons or on TV, I see a reclusive geek with no social skills and a propensity to immerse themselves in data at the expense of all else. But that is not a data scientist.

The term “scientist” refers to someone who is an expert in their science and uses the scientific method. As a profession the scientist of today is widely recognized. Scientists include theoreticians who mainly develop new models to explain existing data and predict new results, and experimentalists who mainly test models by making measurements — though in practice the division between these activities is not clear-cut, and many scientists perform both tasks.

Then what makes a data scientist? I would say that coursework leading to a degree, conceptual understanding, a sprinkle of research, a dash of new development, an occasional bath, plus “effectiveness” make a data scientist. When a client uses the product produced by the data scientist, whether it be a database, a data architecture, a data model, or a data analysis, then we have “effectiveness”, and the person who did all of this is a data scientist.

If you are a pilot of a Boeing 757 and you cannot land the aircraft (it may be a skill you never acquired), then you are not a pilot. You are just an airline employee with wings on your chest. Landing the plane is pretty important, and “landing” your solution with the big data customer is pretty important. When you talk as a data scientist, then clients should listen and take action based on what you tell them.

This is where we are missing the boat. We can ‘python-ate’ or ‘statistic-ate’ [two new verbs] all day long, but at the end of the day when the client is no longer listening, then what are we?


Authored by:
Jeffrey Strickland, Ph.D.

Jeffrey Strickland, Ph.D., is the author of “Predictive Analytics Using R” and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional. He has published nearly 200 blog posts on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:

  • Operations Research using Open-Source Tools
  • Discrete Event simulation using ExtendSim
  • Crime Analysis and Mapping
  • Missile Flight Simulation
  • Mathematical Modeling of Warfare and Combat Phenomenon
  • Predictive Modeling and Analytics
  • Using Math to Defeat the Enemy
  • Verification and Validation for Modeling and Simulation
  • Simulation Conceptual Modeling
  • System Engineering Process and Practices
  • Weird Scientist: the Creators of Quantum Physics
  • Albert Einstein: No one expected me to lay a golden eggs
  • The Men of Manhattan: the Creators of the Nuclear Era
  • Fundamentals of Combat Modeling
  • LinkedIn Memoirs
  • Quantum Phaith
  • Dear Mister President
  • Handbook of Handguns
  • Knights of the Cross: The True Story of the Knights Templar

4 replies »

  1. I’d say what makes a scientist a scientist (as opposed to an engineer, programmer, or doctor) is a curiosity to understand what is being studied and then develop a theory that can predict and be tested. The scientist will have a variety of skill sets, including domain knowledge (you can’t be a physicist without knowing something about physics), mathematical skills, coding skills, and writing and presentation skills. All of these are needed; otherwise you’re really a lab assistant. 🙂

    A data scientist will require these too, otherwise they are a data analyst, data engineer, etc.
