Predictive modeling (#PredictiveAnalytics) does not lie solely in the domain of Big Data Analytics or Data Science. I am sure that there are a few “data scientists” who think they invented predictive modeling. However, predictive modeling has existed since at least World War II. In simple terms, a predictive model is a model with some predictive power. I will elaborate on this later.
I have been building predictive models since 1990. Doing the math, 2015 – 1990 = 25 years, so I have been engaged in the predictive modeling business longer than data science has been around. My first book on the subject, “Fundamentals of Combat Modeling” (2007), predates the “Data Science” of 2009 (see below).
How old is Data Science?
It is really a trick question. The term was first used in 1997 by C. F. Jeff Wu. In his inaugural lecture for the H. C. Carver Chair in Statistics at the University of Michigan, Professor Wu (currently at the Georgia Institute of Technology) called for statistics to be renamed data science and statisticians to be renamed data scientists. That idea did not land on solid ground, but the topic reemerged in 2001 when William S. Cleveland published “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics.” It was really not until 2009 that data science gained any significant following, and that is also the year that Troy Sadkowsky created the data scientists group on LinkedIn as a companion to his website, datasceintists.com (which later became datascientists.net).
What is Predictive Modeling?
It is not a field of statistics! Yes, we do #predictive #modeling in statistics, but it is really a multidisciplinary field and is based more in mathematics than in any other field. Now, if you consult the most authoritative source of factual information available to the world, Wikipedia, you will find an incorrect view of predictive modeling (of course, I do not believe what I said about Wikipedia). That view was formed by people with too much time on their hands and too little exposure to other disciplines, such as physics and mathematics.
Predictive modeling may have begun as early as World War II in the planning of Operation Overlord, the Normandy invasion, but it was certainly used in determining air defenses and bombing raid sizes (it may have appeared as early as 1840). Now, this is not an article about the history of operations research, so suffice it to say that the modern field of operational research arose during World War II. In the World War II era, operational research was defined as “a scientific method of providing executive departments with a quantitative basis for decisions regarding the operations under their control.”
What is a Predictive Model?
The answer is easy: a model with some predictive power. I say that with caution, and use the word “some”, because more often than not, decision makers think that these models are absolute. Of course, they become very disappointed when events do not unfold as predicted. Rather than expand on my simplistic definition, I think some examples may help.
Examples of Predictive Models
The taxonomy of predictive models represented here is neither exhaustive nor exclusive. In other words, there are other ways to classify predictive models, but here is one.
1. Time Series Models/Forecasting Models. This kind of model is a statistical model based on time-series data. It uses “smoothing” techniques to account for things like seasonality in predicting or forecasting what may happen in the near future.
2. Regression Models. Time series models are technically regression models, but machine learning algorithms like artificial neural networks have been employed recently in time series analysis. Here I am referring to logistic regression models used in propensity modeling, and other regression models like linear regression models, robust regression models, etc. These models are based on data.
3. Physical models. These models are based on physical phenomena. They include 6-DoF (Degrees of Freedom) flight models, space flight models, missile models, combat attrition models (based on physical properties of munitions and equipment).
4. Machine Learning Models. These include artificial neural networks (ANN), support vector machines, classification trees, random forests, etc. These are based on data, but unlike statistical models, they “learn” from the data.
5. Weather models. These are forecasting models based on data, but the amount of data, the short prediction windows, and the physical phenomena involved make them much different than statistical forecasting models.
6. Mathematical Models. These are usually restricted to continuous-time models based on differential equations or estimated using difference equations. They are often used to model very precise processes like the dynamics of solid-fuel rockets, or to approximate physical phenomena in the absence of actual data, like the approximation of attrition coefficients or direct fire effects in combat models.
7. Statistical Models. The first two examples, Time Series and Regression models, are statistical models. However, I list this category separately because many do not realize that statistical models are mathematical models, based on mathematical statistics. Things like means and variances are statistical moments, which can be derived from moment generating functions. Every statistic in Statistics is based on a mathematical function.
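As a rough illustration of item 6, here is a minimal sketch of a combat attrition model in Python. It assumes Lanchester's square law (dx/dt = -b*y, dy/dt = -a*x), chosen here only for illustration and not necessarily the form used in any of the models mentioned in this article, and it approximates the differential equations with a simple Euler difference equation. The function name, the attrition coefficients, and the force sizes are all hypothetical.

```python
def lanchester_square(x0, y0, a, b, dt=0.01, t_max=100.0):
    """Euler difference-equation approximation of Lanchester's square law.

    Assumed continuous-time model (illustrative only):
        dx/dt = -b * y   (each y unit attrits x at rate b)
        dy/dt = -a * x   (each x unit attrits y at rate a)
    Returns a list of (time, x, y) tuples.
    """
    x, y, t = float(x0), float(y0), 0.0
    history = [(t, x, y)]
    # Step forward until one side is eliminated or time runs out.
    while x > 0 and y > 0 and t < t_max:
        x_next = x - b * y * dt
        y_next = y - a * x * dt
        # Force sizes cannot go negative, so clamp at zero.
        x, y = max(x_next, 0.0), max(y_next, 0.0)
        t += dt
        history.append((t, x, y))
    return history


if __name__ == "__main__":
    # Hypothetical inputs: 1000 vs. 800 units, equal attrition coefficients.
    result = lanchester_square(1000, 800, a=0.05, b=0.05)
    t_end, x_end, y_end = result[-1]
    print(f"t = {t_end:.2f}, x survivors = {x_end:.1f}, y survivors = {y_end:.1f}")
```

With equal attrition coefficients, the square law implies that the larger force finishes with about sqrt(1000^2 - 800^2) = 600 units, so the difference-equation approximation should land close to that value when the time step is small. In practice the coefficients a and b are exactly the kind of quantity that must be approximated when actual data are absent.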
What Predictive Models have I Built?
I have built predictive models in all example categories except weather models. Models I have built include Reliability, Availability and Maintainability (RAM) models for Unmanned Aerial Vehicle design; unspecified models involving satellites (unspecified because they are classified); unspecified missile models; combat attrition models; 6-DoF missile models; missile defense models; propensity-to-purchase, propensity-to-engage, and share-of-wallet regression models; time-series forecasting models for logistics; uplift (net-lift) marketing models; and marketing models built from classification trees, random forests, and ANNs as part of ensembles. I have also worked on descriptive and prescriptive models.
Models I have consulted on include the NASA Ares I Crew Launch Vehicle Reliability and Launch Availability model; the Extended Range Multi-Purpose (ERMP) Unmanned Aerial Vehicle RAM model; the Future Combat Systems (FCS) C4ISR family of models; the FCS Logistic Decision Support System Test-Bed model; and unspecified models (unspecified because they are classified).
- Press, G., “A Very Short History of Data Science,” Forbes, May 28, 2013. Retrieved 05-29-2015.
- Bridgman, P. W., The Logic of Modern Physics, The Macmillan Company, New York, 1927.
- Operational Research in the British Army 1939–1945, October 1947, Report C67/3/4/48, UK National Archives file WO291/1301. Quoted on the dust jacket of: Morse, Philip M., and Kimball, George E., Methods of Operations Research, 1st edition revised, MIT Press and J. Wiley, 5th printing, 1954.
Jeffrey Strickland, Ph.D.
Jeffrey Strickland, Ph.D., is the author of Predictive Analytics Using R and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional (ASEP). He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:
- Operations Research using Open-Source Tools
- Discrete Event Simulation using ExtendSim
- Crime Analysis and Mapping
- Missile Flight Simulation
- Mathematical Modeling of Warfare and Combat Phenomenon
- Predictive Modeling and Analytics
- Using Math to Defeat the Enemy
- Verification and Validation for Modeling and Simulation
- Simulation Conceptual Modeling
- System Engineering Process and Practices