# Anyone here a statistician?

Discussion in 'Off-Topic Discussions' started by Justitia, Jul 12, 2012.

1. ### Justitia Elite Member Lifetime Platinum Member

Joined:
Aug 3, 2004
Messages:
4,593
It's been so long since I studied and used statistics that my mind is drawing a blank on some items.

When one runs a linear regression with multiple independent variables, the coefficient of the independent variable indicates its contribution to the variance of the dependent variable.

Obviously the coefficients can't add up to more than 100% of the explanation of the dependent variable's variance.

What I can't remember is whether the coefficients are expressed in a manner such that they sum to less than or equal to 1, or something else?

The reason I ask is that I am in the midst of one of my articles and there is a criticism of an author's results who found an independent variable statistically significant. The size of the coefficient is 0.38

Obviously, if all coefficients have to be less than or equal to 1, then 0.38 is a pretty substantial contribution to the variance of the dependent variable. But if there is no innate limit on the size of a coefficient, 0.38 might be considered trivially small (as one critic argues -- the only issue there is that it is the only independent variable with a statistically significant coefficient. All the other explanatory variables considered relevant have coefficients that are far from statistically significant.)

So anyone here who knows about linear regression -- can you enlighten me?

#1
2. ### raptor Well-Known Member

Joined:
May 13, 2004
Messages:
200
I work with several statisticians, I'll ask them tomorrow at work.

#2
3. ### Justitia Elite Member Lifetime Platinum Member

Joined:
Aug 3, 2004
Messages:
4,593
After digging a bit online -- I think I found my answer.

The coefficients of the independent variables reflect the % change in the dependent variable. I was dealing with a coefficient another scholar found: it was statistically significant, but at 0.38 -- i.e., increases in the independent variable would be predicted to increase the value of the dependent variable by only 0.38%, which is quite small.

The fact that the coefficient was so small was one of the criticisms of the author's results. However, none of the other interesting explanatory variables had coefficients that were statistically significant.

If I have figured this out incorrectly, please chime in.

Assuming I have figured this out correctly, now I have to come up with a word that conveys to a statistically unsophisticated audience that the coefficient is statistically significant (i.e., with 95% certainty it is indeed correlated with the dependent variable), but that changes in that independent variable will likely effect changes in the dependent variable only to a small degree.

As the subject matter is the hotly debated affirmative action policies in education admissions -- which are about to come under review at the U.S. Supreme Court for the third time in history, when the current make-up of the Court is noted for its conservatism -- these stats issues become important to the debate....

#3
4. ### chicanerous Elite Member Lifetime Platinum Member

Joined:
Feb 1, 2004
Messages:
16,364
The coefficients of a multiple linear model reflect the units of change in the response variable per unit change in the predictor variables. They will not typically range between 0 and 1 or sum to 1. Typically the statistical significance of the predictors will be given as p-values (which do range between 0 and 1), where lower values indicate higher significance. With linear regression, it's important for the model assumptions to be satisfied so that the coefficients are estimated well. Some of the assumptions for ordinary least squares, for example, are that the predictors are not multicollinear and that the errors are independent with constant variance (i.e., homoscedastic).

As an example, a model using type of company, number of sales, and total assets to predict profits might look something like:

Profits = 98.825083 - 226.2494*Type + 0.1154404*Sales - 0.043795*Assets

You can see that the intercept as well as the coefficient for type are not in the 0 to 1 range and that the coefficients do not add to 1. Both Type and Sales were significant predictors of Profits (F ratio = 17.8177, p = 0.0002 and F ratio = 17.7091, p = 0.0002, respectively). Assets was not significant at the α = 0.05 level, but was significant at the 0.10 level (F ratio = 3.8079, p = 0.0611). That is of course additional information that would be generated as output by statistical software and provided separately in a research paper.

The size of the coefficients mainly reflects the scale and range of your variables.

(Don't take this as an expert explanation.)
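To make the scale point concrete, here is a small Python sketch (the numbers and variable names are made up for illustration, not the dataset behind the model above): it fits an ordinary least squares model and shows that the coefficients land wherever the units put them -- they are not confined to [0, 1] and do not sum to 1.

```python
import numpy as np

# Hypothetical data: predict profits from company type (0/1),
# sales, and assets. All figures are invented for this sketch.
rng = np.random.default_rng(0)
n = 50
ctype = rng.integers(0, 2, n).astype(float)   # binary company type
sales = rng.uniform(100, 1000, n)             # sales, in thousands
assets = rng.uniform(500, 5000, n)            # assets, in thousands
profits = 100 - 200 * ctype + 0.1 * sales - 0.05 * assets + rng.normal(0, 10, n)

# Design matrix with an intercept column, solved by least squares
X = np.column_stack([np.ones(n), ctype, sales, assets])
beta, *_ = np.linalg.lstsq(X, profits, rcond=None)

# Coefficients are in the response's units per unit of each predictor:
# they need not lie in [0, 1], and their sum means nothing in particular.
print(beta)        # intercept and three slopes, on very different scales
print(beta.sum())  # not 1, and not meaningful
```

Changing the units of a predictor (say, sales in dollars instead of thousands) would rescale its coefficient by the same factor, which is why the raw size of a coefficient by itself says little about importance.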

#4
Last edited: Jul 14, 2012
5. ### JoeSchmo Well-Known Member

Joined:
Nov 7, 2004
Messages:
4,018
Just saw this thread -- I run multiple regressions all of the time. Sounds like you are just describing the beta-weights. The p-values for individual slopes (showing the independent contribution of each predictor) are often obtained with slope t-tests. R-squared will give you the percent of variance accounted for -- so it is proportions of explained variance that are bounded by 1 (not the beta-weights).

Small effects are one reason why the mantra that larger sample sizes are always better is not necessarily true. Huge samples can reveal statistically significant effects that have very little practical significance.

One example I use is Drug X for reducing the length of a common cold. With a large enough sample size, Drug X will prove statistically significant even if it only reduces the length of a cold from 120 hours to 118 hours. But really, would anybody care?
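The Drug X example can be simulated in plain Python (all numbers are invented to match the story: colds lasting ~120 hours untreated vs. ~118 treated, with lots of person-to-person variation). With 20,000 subjects per arm, a trivially small 2-hour effect comes out clearly "significant":

```python
import math
import random

random.seed(1)

# Hypothetical trial: untreated colds ~120 h, treated ~118 h, sd 30 h.
n = 20000
control = [random.gauss(120, 30) for _ in range(n)]
treated = [random.gauss(118, 30) for _ in range(n)]

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Two-sample z test (the normal approximation is fine at this sample size)
se = math.sqrt(var(control) / n + var(treated) / n)
z = (mean(control) - mean(treated)) / se
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(mean(control) - mean(treated))  # about 2 hours: practically trivial
print(p < 0.05)                       # statistically significant anyway
```

The effect size (2 hours out of 120) never changes; only the sample size drives the p-value down, which is exactly the statistical-vs-practical-significance point.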

#5
6. ### debacle Active Member

Joined:
Apr 26, 2010
Messages:
21
I just saw this thread myself, but chicanerous is exactly correct. The coefficient only has meaning with respect to the mean and variance of the regressor it's modifying. If you want to be able to compare coefficients with each other, you will need to normalize all your independent variables (e.g., mean-sigma normalization is one option: subtract the mean of each I.V. and divide by its S.D.).

#6
7. ### guava Elite Member Lifetime Platinum Member

Joined:
Feb 15, 2004
Messages:
12,604
Exactly how stupid is your audience? I think that explanation is pretty clear.
I don't do much stats any more, but from what I remember, your 0.38 is the size of the effect (the slope for that predictor); it's the p-value that shows the statistical significance.

Statistical significance means that if there were really no relationship and you repeated that same experiment 100 times in the same circumstances, you'd expect to see an effect at least that strong only about 100p % of the time -- e.g., 5 percent of the time for p = .05.
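One way to see what a p-value measures is a permutation test, sketched here in plain Python with made-up data: under the null hypothesis of no relationship, how often does shuffled data show a correlation at least as strong as the one actually observed?

```python
import math
import random

random.seed(3)

def corr(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample with a genuine relationship built in
n = 30
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]
observed = corr(x, y)

# Shuffling y breaks any real link to x, simulating the null hypothesis
perms = 2000
hits = 0
ys = y[:]
for _ in range(perms):
    random.shuffle(ys)
    if abs(corr(x, ys)) >= abs(observed):
        hits += 1
p = hits / perms  # small p: the observed correlation is unlikely under the null

print(observed, p)
```

A small p here means chance alone rarely produces a correlation that strong; it says nothing by itself about whether the effect is large enough to matter.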

#7
Last edited: Aug 2, 2012
8. ### Justitia Elite Member Lifetime Platinum Member

Joined:
Aug 3, 2004
Messages:
4,593
Well, as JoeSchmo said -- chicanerous, you are exactly correct. I called the statistician I keep on tap, and this is precisely what he said.

#8

Joined:
Dec 16, 2004
Messages:
1,054
Guava and Justitia both back posting on a thread? It's like the good ol' days!

#9