I write a good bit of content about using open-source tools for analytics and operations research. However, my workhorse happens to be SAS. More specifically, I use SAS Enterprise Guide (EG) and SAS Enterprise Miner (EM).

### When to use open-source

I have argued both for and against the use of open-source tools, and they certainly have their place. If you have a limited budget, open-source is a good path to take. Also, if you teach or study at a university, open-source seems like a logical option.

### How I use SAS

I perform predictive modeling in the Financial and Insurance (FSI) industry. My method starts in SAS EG, where I retrieve variables from multiple data sets (sometimes between 10 and 20), resulting in about 2000 variables. Once these variables are merged into one data set (covering up to 11 million customers), I run an information value algorithm to determine which variables have the most predictive power for the response variable. After eliminating variables that cannot be used for marketing bank products (because of fair lending acts and similar restrictions), I import between 150 and 300 variables into SAS EM. In EM, I perform data partitioning, imputation, and transformation before running a model such as logistic regression. When I get an adequate model, I take the scoring code from EM and port it to EG, wrapped in a macro. I run the macro in EG to measure model performance. When a model performs well enough against challenger models, I develop model production code in EG, which includes the EM scoring code.
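For readers unfamiliar with information value, the idea can be sketched in a few lines of Python. This is only an illustration of the common textbook definition (the sum over bins of the difference in event and non-event shares times the log of their ratio), not the SAS code I run. The function names `information_value` and `rank_variables` and the `smoothing` parameter are assumptions made for this example.

```python
import math
from collections import defaultdict


def information_value(bins, responses, smoothing=0.5):
    """Information value of one binned predictor against a 0/1 response.

    bins:      bin label for each customer (the variable is already binned)
    responses: 0/1 outcome for each customer (1 = responder)
    smoothing: small count added to every cell so an empty cell does not
               lead to log(0); an assumption of this sketch
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for label, y in zip(bins, responses):
        if y == 1:
            events[label] += 1
        else:
            non_events[label] += 1

    labels = set(events) | set(non_events)
    total_events = sum(events.values()) + smoothing * len(labels)
    total_non_events = sum(non_events.values()) + smoothing * len(labels)

    iv = 0.0
    for label in labels:
        pct_event = (events[label] + smoothing) / total_events
        pct_non_event = (non_events[label] + smoothing) / total_non_events
        weight_of_evidence = math.log(pct_event / pct_non_event)
        iv += (pct_event - pct_non_event) * weight_of_evidence
    return iv


def rank_variables(binned_variables, responses):
    """Return (name, IV) pairs sorted from most to least predictive.

    binned_variables: dict mapping a variable name to its list of bin labels
    """
    scored = [(name, information_value(bins, responses))
              for name, bins in binned_variables.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A variable whose bins carry the same share of responders and non-responders scores zero, and the score grows as the two distributions separate, which is why sorting by this value is a reasonable first screen before the regulatory exclusions.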

### Why I use SAS

SAS EG and EM are trusted in the FSI industry I serve. I began using SAS in 1990 on a mainframe computer, and since then it has been the statistical software of choice in many industries. Through external model governance and validation, the models I have developed in SAS EG and EM have been stress-tested and proven valid for use in the FSI industry. This does not imply that models created with open-source tools are any less valid, as I will discuss next.

### Open-source and SAS

Once I build a predictive model in SAS, I attempt to build the same model in R, for example. The models in R are similar enough to show that the models in SAS are indeed valid. For instance, I constructed an uplift model in SAS using logistic regression and an uplift model for the same acquisition situation using random forest in R. The overall net lift was identical, although the distribution among deciles was slightly different.
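To make the comparison concrete, here is a minimal Python sketch of how net lift might be tabulated overall and by score decile. It assumes each scored customer is a `(score, treated, responded)` tuple and defines net lift as the treated response rate minus the control response rate. The function names and the record layout are hypothetical, chosen for this example, and are not taken from my SAS or R code.

```python
def net_lift(group):
    """Treated response rate minus control response rate for one group.

    group: list of (score, treated, responded) tuples.
    Returns None when the group lacks either treated or control customers.
    """
    treated = [responded for _, is_treated, responded in group if is_treated]
    control = [responded for _, is_treated, responded in group if not is_treated]
    if not treated or not control:
        return None
    return sum(treated) / len(treated) - sum(control) / len(control)


def net_lift_by_decile(records, n_groups=10):
    """Rank customers by model score (highest first) and compute net lift
    within each of n_groups equal-sized slices."""
    ranked = sorted(records, key=lambda record: record[0], reverse=True)
    n = len(ranked)
    lifts = []
    for g in range(n_groups):
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        lifts.append(net_lift(ranked[start:end]))
    return lifts
```

Calling `net_lift(records)` on the full population gives a figure that does not depend on the model scores at all, while `net_lift_by_decile(records)` depends on how each model ranks customers. Two models scored on the same population can therefore agree on the former and still differ on the latter.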

### Conclusion

I will not argue that either SAS or open-source tools are better than the other. Instead, I will state that they each have their use in various situations. Yes, SAS is tried and true…but I will continue to use both.

**Authored by:**

**Jeffrey Strickland, Ph.D.**

Jeffrey Strickland, Ph.D., is the author of *Predictive Analytics Using R* and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional (ASEP). He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:

- *Operations Research using Open-Source Tools*
- *Discrete Event simulation using ExtendSim*
- *Crime Analysis and Mapping*
- *Missile Flight Simulation*
- *Mathematical Modeling of Warfare and Combat Phenomenon*
- *Predictive Modeling and Analytics*
- *Using Math to Defeat the Enemy*
- *Verification and Validation for Modeling and Simulation*
- *Simulation Conceptual Modeling*
- *System Engineering Process and Practices*



I occasionally read articles like this hoping that I might be enlightened and learn a relevant use case for SAS. While there is no doubt about the adoption of this tool across many segments of many industries (it has been around since the dark ages and in many cases is all that people know, hence the “trust”), this isn’t, in my opinion, a very compelling selling point for the software, unless of course management is so intransigent that they require the use of SAS and only SAS. So the question remains: is there anything about SAS that allows you to perform your analysis in a superior manner to other tools (free or not)? And yes, I know SAS does a lot of work on disk and can thus handle “large” datasets (memory was scarce in the 1960s). I find this argument less convincing with the advent of 64-bit machines.

As for FOSS software in general, while I agree that it doesn’t make sense to pay exorbitant fees for 1980s-grade software if you are on a budget, I would add that it also doesn’t make sense to pay for such software if you are a corporation interested in profit or, regardless of who you are, if there are superior (free or not) tools available elsewhere.


Thanks for your comment.
