- Be Prepared to Wrangle – When dealing with “big data” (#bigdata), about 2/3 of your project time is spent getting access to the data, getting the right data, preprocessing it, and performing exploratory data analysis, all before any model building.
- Be a Storyteller – Presenting the results is as important as the analytics performed. If you cannot convey the results clearly and succinctly, you’re not adding value for your customer. In particular, your results must be translated into economic value or a similar metric.
- Plan for Peer Reviews – Everything benefits from peer review, including the 2/3 of the work performed before modeling (see “Be Prepared to Wrangle”). Develop a template for conducting peer reviews so that they are thorough and consistent.
- Validate, Validate, Validate – Attempt to validate everything, not just your model. This includes the requirements, any data provided (response data, for example), and any macros provided.
- Partnering with the “Business” is a Key to Success – Listen carefully to the customer and help them express their “real” problem. Often the customer knows they have a problem but either cannot express it or has not identified its root cause.
- Manage your Variables – Ensure that the variables in a model are intuitive, and limit them to the most predictive ones. Rule of thumb: no more than ten variables.
- Use a “Document-As-You-Go” Approach – Document your work as you progress. Either keep a notebook for your project or record it electronically. I use a spreadsheet with a tab for each task or sub-task. This makes things easier down the road when you write the development document or provide information for validation.
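
The “Manage your Variables” rule of thumb can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tip does not say how to measure “most predictive”, so the sketch ranks candidates by absolute Pearson correlation with the response, which is one simple screening choice among many. The function names, the `max_vars` parameter, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def top_variables(candidates, response, max_vars=10):
    """Return at most max_vars variable names, strongest |correlation| first.

    Assumption: absolute correlation stands in for "most predictive".
    """
    scored = [
        (abs(pearson(values, response)), name)
        for name, values in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:max_vars]]


# Hypothetical toy data: one strong predictor, one weak, one constant.
response = [1, 2, 3, 4, 5, 6]
candidates = {
    "tenure": [2, 4, 6, 8, 10, 12],
    "noise": [3, 1, 4, 1, 5, 9],
    "constant": [7, 7, 7, 7, 7, 7],
}

selected = top_variables(candidates, response, max_vars=2)
print(selected)
```

A screen like this only narrows the candidate list. Whether the surviving variables are intuitive to the business still has to be judged by a person.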
Jeffrey Strickland, Ph.D.
Jeffrey Strickland, Ph.D., is the author of Predictive Analytics Using R and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional (ASEP). He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:
- Operations Research using Open-Source Tools
- Discrete Event Simulation using ExtendSim
- Crime Analysis and Mapping
- Missile Flight Simulation
- Mathematical Modeling of Warfare and Combat Phenomenon
- Predictive Modeling and Analytics
- Using Math to Defeat the Enemy
- Verification and Validation for Modeling and Simulation
- Simulation Conceptual Modeling
- System Engineering Process and Practices