Distribution Fitting: R & SCILAB Compared

In this article I compare Scilab with R for generating a histogram from data and for fitting a curve to data using the least squares procedure.

Using Scilab

Consider an experiment where we’ve measured the time to failure for 50 identical electrical components.

fig1

Notice that only one variable has been measured — the components’ lifetimes. There is no notion of response and predictor variables; rather, each observation consists of just a single measurement. The objective of an analysis for data like these is not to predict the lifetime of a new component given a value of some other variable, but rather to describe the full distribution of possible lifetimes. This is distribution fitting with univariate data.

One simple way to visualize these data is to make a histogram.

fig2

Consider an experiment where we measure the concentration of a compound in blood samples taken from several subjects at various times after taking an experimental medication.

fig3

fig4

Notice that we have one response variable, blood concentration, and one predictor variable, time after ingestion. The predictor data are assumed to be measured with little or no error, while the response data are assumed to be affected by experimental error. The main objective in analyzing data like these is often to define a model that predicts the response variable. That is, we are trying to describe the trend line, or the mean response of y (blood concentration), as a function of x (time). This is curve fitting with bivariate data. The Scilab function we define for the fitted curve uses the coefficients generated by the least squares procedure.

fig5

fig6

Using R

First we duplicate the histogram of the life data in R, and see that R renders the same result.

fig7
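As a rough sketch of the histogram step in base R: the measured lifetimes are not reproduced in the text, so the vector `life` below is synthetic, drawn from an exponential distribution with a hypothetical mean of 100 hours. Only `hist()` and `plot()` from base R are used; the bin count, title, and axis labels are illustrative assumptions.

```r
# Synthetic stand-in for the 50 measured lifetimes (NOT the article's data).
set.seed(1)
life <- rexp(50, rate = 1 / 100)   # hypothetical mean lifetime of 100 hours

# Bin the observations first (no plotting), so the counts can be inspected.
h <- hist(life, breaks = 10, plot = FALSE)
print(h$counts)

# Draw the histogram from the binned object.
plot(h, main = "Time to failure", xlab = "Lifetime (hours)", ylab = "Count")
```

Because there is no predictor variable here, the histogram summarizes the whole distribution of lifetimes rather than a trend against some other quantity.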

Next we take the blood concentration data and fit a curve of the same family to the data using R.

fig8

fig9
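A least squares curve fit in R might look like the sketch below. The article's model family and measurements are not given in the text, so everything here is an assumption for illustration: the two-parameter curve `p1 * t * exp(-p2 * t)`, the "true" values 2 and 0.6, the noise level, and the starting guesses. The names `p1` and `p2` simply echo the coefficient names mentioned in the comparison; `nls()` is base R's nonlinear least squares routine.

```r
# Synthetic stand-in for the blood-concentration data (NOT the article's data).
set.seed(42)
p1_true <- 2      # hypothetical coefficient
p2_true <- 0.6    # hypothetical coefficient
dat <- data.frame(t_hr = seq(0.25, 8, by = 0.25))   # hours after ingestion
dat$conc <- p1_true * dat$t_hr * exp(-p2_true * dat$t_hr) +
  rnorm(nrow(dat), sd = 0.05)                        # additive measurement error

# Nonlinear least squares: pick p1 and p2 to minimize the sum of squared
# residuals between the observed and modeled concentration.
fit <- nls(conc ~ p1 * t_hr * exp(-p2 * t_hr),
           data = dat, start = list(p1 = 1, p2 = 1))
print(coef(fit))

# Overlay the fitted mean response on the data.
plot(dat$t_hr, dat$conc, xlab = "Time (hours)", ylab = "Concentration")
lines(dat$t_hr, predict(fit), col = "red")
```

The error is placed only on the response, matching the assumption that time is measured with little or no error while concentration carries the experimental error.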

Comparison

Looking at the fitted curves, both programs render similar fits. The Scilab coefficients x(1) and x(2) are 0.1733551 and 1.969835, respectively; the corresponding R coefficients p1 and p2 are 0.17336 and 1.96986. The two sets of estimates agree to within about 0.00003, a difference small enough to attribute to round-off error. If the number of digits is indicative of the actual decimal places used in calculations, I would take the Scilab result over R. However, the R least squares routine was much easier to implement.
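The size of the disagreement can be checked directly from the four reported values. The short R sketch below uses only those numbers; the variable names are made up for illustration. It also tests whether rounding the Scilab values to five decimal places reproduces the R values: that holds for the first coefficient but not the second, which differs by roughly 0.000025. Keep in mind that R's default printing shows about seven significant digits, so fewer printed digits do not by themselves mean less precision in the calculation.

```r
# Coefficients as reported in the comparison above.
scilab <- c(0.1733551, 1.969835)   # x(1), x(2) from Scilab
r_fit  <- c(0.17336,   1.96986)    # p1, p2 from R

# Absolute and relative differences between the two fits.
abs_diff <- abs(scilab - r_fit)
rel_diff <- abs_diff / abs(scilab)
print(signif(abs_diff, 2))
print(signif(rel_diff, 2))

# Does rounding Scilab's values to five decimals reproduce R's printed values?
# A small tolerance avoids exact floating-point equality tests.
same_after_rounding <- abs(round(scilab, 5) - r_fit) < 1e-8
print(same_after_rounding)
```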


Authored by: Jeffrey Strickland, Ph.D.

Jeffrey Strickland, Ph.D., is the author of “Predictive Analytics Using R” and a Senior Analytics Scientist with Clarity Solution Group. He has performed predictive modeling, simulation, and analysis for the Department of Defense, NASA, the Missile Defense Agency, and the financial and insurance industries for over 20 years. Jeff is a Certified Modeling and Simulation Professional (CMSP) and an Associate Systems Engineering Professional. He has published nearly 200 blogs on LinkedIn, is a frequently invited guest speaker, and is the author of 20 books, including:

  • Discrete Event simulation using ExtendSim
  • Crime Analysis and Mapping
  • Missile Flight Simulation
  • Mathematical modeling of Warfare and Combat Phenomenon
  • Predictive Modeling and Analytics
  • Using Math to Defeat the Enemy
  • Verification and Validation for Modeling and Simulation
  • Simulation Conceptual Modeling
  • System Engineering Process and Practices
  • Weird Scientist: the Creators of Quantum Physics
  • Albert Einstein: No one expected me to lay a golden eggs
  • The Men of Manhattan: the Creators of the Nuclear Era
  • Fundamentals of Combat Modeling


2 replies »

  1. This question may be out of context of this article but I was curious as to how you decided on the non-linear equation of acos(b)x+bcos(a)x for defining the given distribution. Thank you
