Using regression analysis to derive a demand curve; also t-stats, F-stat, R-squared, adjusted R-squared

Jeff Butcher Michael Rager Rachel Hill

Regression analysis is used to estimate quantitative functional relationships between a dependent variable and one or more independent variables from data when the relationship between the variables is statistical rather than exact. The estimated relationship is used to identify expected outcomes (dependent variable) based on inputs (independent variables). Basically, regression analysis is the process of reverse engineering a formula based on a set of data in order to estimate or forecast values.

If the relationship is assumed to be linear, it is represented by the equation Y = a + bx + e, where Y is the dependent variable, x is the independent variable, a is the y-intercept, b is the slope (the change in Y for a one-unit change in x), and e is a random error term with a mean of zero. With more than one independent variable, each variable has its own slope coefficient. For a demand curve, Y is the quantity demanded and x is the price. There are many real-world applications of regression analysis. For example, it has been used to analyze the relationship between energy use and agricultural productivity in Turkey (Karkacier, Gokalp and Cicek 2006). For managers, it is most beneficial as a forecasting technique that provides estimated projections for the future.
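As a minimal sketch of how a linear demand curve could be estimated by least squares, the example below fits Q = a + bP to a handful of price and quantity observations. The data, the variable names and the helper `fit_line` are hypothetical and chosen only for illustration; they do not come from the sources cited here.

```python
# Hypothetical price/quantity observations (illustrative only).
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [9.8, 8.1, 6.2, 3.9, 2.0]


def fit_line(x, y):
    """Return (a, b) that minimize the sum of squared errors of y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope: change in quantity per unit change in price
    a = mean_y - b * mean_x  # intercept: fitted quantity at a price of zero
    return a, b


a, b = fit_line(prices, quantities)
print(f"Estimated demand: Q = {a:.2f} + ({b:.2f}) * P")

# Use the estimated demand curve to forecast quantity at a new price.
forecast = a + b * 2.5
print(f"Forecast quantity at P = 2.5: {forecast:.2f}")
```

The negative slope is what one would expect for a demand curve: as price rises, the fitted quantity demanded falls.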

The t-statistic (t-stat) is the ratio of a parameter's estimated value to the standard error of that estimate. The larger the absolute value of the t-statistic, the more confident one can be that the true parameter is not zero. The t-statistic becomes larger in magnitude as the parameter estimate increases in magnitude and as the standard error decreases. As a rule of thumb, when the absolute value of the t-statistic is greater than 2, one can be about 95 percent confident that the true parameter is not zero. A large t-statistic therefore represents a higher level of confidence that an estimate from a randomly selected sample of data drawn from a larger population gives a relatively accurate representation of the overall population. An advantage of the //t//-test is that it can be used in many situations where the population variability is unknown.
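The ratio described above can be illustrated with a short sketch for a regression with one independent variable. The data are hypothetical, and the standard error formula used here is the usual one for the slope of a simple least squares regression, which is an assumption added for the example rather than something stated in the text.

```python
import math

# Hypothetical price (x) and quantity (y) data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.8, 8.1, 6.2, 3.9, 2.0]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
a = mean_y - b * mean_x

# Sum of squared residuals; the residual variance uses n - 2 degrees of
# freedom because two coefficients (a and b) are estimated.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt(sse / (n - 2) / sxx)

# t-statistic: parameter estimate divided by its standard error.
t_stat = b / se_b
print(f"b = {b:.3f}, se(b) = {se_b:.4f}, t = {t_stat:.1f}")
print("Slope differs from zero by the |t| > 2 rule of thumb:", abs(t_stat) > 2)
```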

The F-statistic (F-stat) measures the total variation explained by the regression relative to the total unexplained variation. The greater the F-stat, the better the overall fit of the regression line to the actual data. Managers most often use the F-stat for hypothesis testing, to judge whether the regression as a whole is statistically significant.
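A rough sketch of this ratio follows, assuming the common textbook form in which each sum of squares is divided by its degrees of freedom: k - 1 for the explained variation and n - k for the unexplained variation, where n is the number of observations and k is the number of estimated coefficients. The function name `f_statistic` and the input values are hypothetical.

```python
def f_statistic(explained_ss, unexplained_ss, n, k):
    """F = (explained variation / (k - 1)) / (unexplained variation / (n - k)).

    n is the number of observations and k the number of estimated
    coefficients, including the intercept.
    """
    return (explained_ss / (k - 1)) / (unexplained_ss / (n - k))


# Hypothetical sums of squares from a regression with 5 observations
# and 2 estimated coefficients (intercept and one slope).
f_stat = f_statistic(explained_ss=39.204, unexplained_ss=0.096, n=5, k=2)
print(f"F = {f_stat:.1f}")
```

Whether a given F-stat is large enough to reject the hypothesis that all slope coefficients are zero depends on the critical value for the chosen significance level and degrees of freedom, which this sketch does not look up.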

R-squared is the fraction of the total variation in the dependent variable that is explained by the regression. Its value ranges from 0 to 1. The closer R-squared is to 1, the closer the regression line is to the actual data. A problem with R-squared is that it does not take into account the number of estimated coefficients in the regression. As more explanatory variables are included, the value of R-squared cannot decrease; it can only stay the same or increase toward 1. This can mislead one into thinking that the overall fit of the regression line to the data is better than it actually is.
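To make the definition concrete, the sketch below computes R-squared as one minus the ratio of unexplained variation to total variation. The observed and fitted values are hypothetical, and the function name `r_squared` is introduced only for this example.

```python
def r_squared(observed, fitted):
    """Fraction of the total variation in `observed` explained by `fitted`."""
    mean_obs = sum(observed) / len(observed)
    total_ss = sum((y - mean_obs) ** 2 for y in observed)
    unexplained_ss = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1 - unexplained_ss / total_ss


# Hypothetical observed quantities and the values fitted by a regression line.
observed = [9.8, 8.1, 6.2, 3.9, 2.0]
fitted = [9.96, 7.98, 6.00, 4.02, 2.04]

r2 = r_squared(observed, fitted)
print(f"R-squared = {r2:.4f}")
```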

Adjusted R-squared corrects a deficiency in R-squared: a large number of estimated coefficients can produce a misleadingly high value of R-squared. The adjusted R-squared is given as: adjusted R^2 = 1 - (1 - R^2) * [(n - 1)/(n - k)], where n is the total number of observations and k is the number of estimated coefficients. The difference n - k is known as the residual degrees of freedom, and the equation is designed to penalize a researcher who performs a regression with only a small difference between n and k. If the adjusted R-squared is substantially lower than R-squared, the high value of R-squared is largely due to the number of estimated coefficients relative to the sample size. Adjusted R-squared and R-squared have many real-world applications in assessing how well a regression fits the data. Both have been used in settings ranging from predicting future stock prices to health care utilization.
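The penalty can be seen in a small worked sketch of the formula. The R-squared value, sample size and coefficient counts below are hypothetical, and the function name `adjusted_r_squared` is introduced only for illustration.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k).

    n is the number of observations and k the number of estimated coefficients.
    """
    return 1 - (1 - r2) * (n - 1) / (n - k)


# Same hypothetical R-squared and sample size, different coefficient counts.
few_coefficients = adjusted_r_squared(r2=0.90, n=10, k=2)
many_coefficients = adjusted_r_squared(r2=0.90, n=10, k=8)

print(f"k = 2: adjusted R-squared = {few_coefficients:.4f}")
print(f"k = 8: adjusted R-squared = {many_coefficients:.4f}")
```

With the same R-squared of 0.90, the regression that uses 8 coefficients on 10 observations receives a much lower adjusted value than the one that uses 2, which reflects the small number of degrees of freedom left over.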

Test Questions:

1. If the absolute value of the t-stat is greater than 2, approximately what confidence level would you have that the true value of the underlying parameter in the regression is not zero?
a. 50%
b. 75%
c. 85%
d. 95%
e. 100%

2. The line that minimizes the sum of squared deviations between the line and the actual data points is the
a. Linear demand
b. Isocost line
c. Least squares regression
d. None of the above

3. This form of analysis is used to estimate quantitative functional relationships between a dependent variable and one or more independent variables from data when the relationship between the variables is statistical rather than exact.
a. Regression Analysis
b. Repression Analysis
c. Quantitative Relationship Analysis
d. Economic Analysis

4. When using regression analysis, this measure is used to correct the deficiency in R-squared whereby a large number of estimated coefficients can result in a misleading value for R-squared.
a. F-Stat
b. Adjusted R-squared
c. T-Stat
d. None of the above

Answers: 1. d 2. c 3. a 4. b

References: Karkacier, O., Gokalp, G., and A. Cicek. (2006) "A regression analysis of the effect of energy use in agriculture." //Energy Policy// 34(18), 3796-3800.

Baye, Michael R. __Managerial Economics and Business Strategy.__ New York: McGraw-Hill Irwin, 2006.