Scatterplot of Loan Closes per month vs Respas received 2 months prior
I decided to run a simple regression test on the data. The regression equation that will predict loan closes is:
y = 10 + .476*x
(y = number of closed loans, x = number of RESPAS received 2 months prior) However, in this case we would want to set our y intercept at zero. This is obvious because if you received zero RESPAS back two months ago you would expect to close zero loans, not 10. After this adjustment and forcing the y intercept at zero, the resulting equation is:
y = .52x Take a look at the following graph to see how accurately this test would predict the number of loan closes you could expect.
Scatterplot of Loan Closes per month vs Adjusted Respas received 2 months prior The strong linear relationship indicates that in fact, RESPAS received 2 months prior is a great indicator of how many loans you will close in that month. In fact, statistically speaking, this test has a p-value of less than .01. This means that there is less than a 1% chance that the above regression equation does not predict loan closes accurately for us. The main assumption of this test is that the data is normally distributed. The simplest way to check for normally distributed data is to use a normal probability plot. These plots are a quick graphing technique to see if a data set exhibits the properties of a normal distribution. If the data is indeed approximately normally distributed, then the normal probability plot of the residuals should lie in a straight line. The straighter the line, the better. As the figure shows, the data are in fact normally distributed.
Normal Probability Plot In addition, the plot indicates that the predictor is a useful tool for the entire range of X. At no point in the graph are there are group of residuals that stray away from the predicted straight line. Because any value for X will accurately predict Y, we know we can use this for any amount of RESPAS we receive.* Using this information I am now able to predict good and bad months. Also, if we are expecting to close a certain number of loans, but close noticeably fewer than what we expected, we can know something may be wrong. We can then investigate the problem as opposed to not knowing there is a problem.
*In our case the rage includes receive 150-300 RESPAS back, we cannot be sure if the predictor will work for X values outside of that range.