Online Learning Platform

Excel Command:

Data -> Data Analysis -> Regression

So the final multiple regression model is

$\text{Sales} = -21.02 + 0.897 (\text{Advertising}) - 0.084 (\text{Size}) + 2.22 (\text{Income})$
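To show how the fitted equation is used for prediction, here is a minimal Python sketch that simply evaluates it. The function name `predict_sales` and the input values are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the example data set, and the units of each variable are whatever the original worksheet uses.

```python
def predict_sales(advertising: float, size: float, income: float) -> float:
    """Evaluate Sales = -21.02 + 0.897*Advertising - 0.084*Size + 2.22*Income."""
    return -21.02 + 0.897 * advertising - 0.084 * size + 2.22 * income


# Hypothetical inputs (not from the data set), used only to show the calculation.
example = predict_sales(advertising=10.0, size=50.0, income=20.0)
print(round(example, 2))
```

With these assumed inputs the terms are 8.97, -4.20 and 44.40, which together with the intercept give a predicted Sales value of about 28.15.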

The multiple correlation coefficient of 0.998 indicates a strong association between Sales and the independent variables taken together. R² = 0.996 indicates that 99.6% of the variation in Sales is explained by the independent variables.
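As a quick consistency check on these two figures, assuming (as is standard for least-squares regression with an intercept) that the multiple correlation coefficient is the positive square root of the coefficient of determination:

$R = \sqrt{R^2} = \sqrt{0.996} \approx 0.998$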

Hypothesis:

There are two main types of hypotheses to test in multiple linear regression: a test of the significance of the entire model and tests of the individual parameters.

Test of the linearity of the entire model: ANOVA tests for significance of the entire model. That is, it computes an F-statistic for testing the hypotheses

H₀: b₁ = b₂ = … = b_k = 0

H₁: at least one b_j is not 0

The null hypothesis states that no linear relationship exists between the dependent and any of the independent variables, whereas the alternative hypothesis states that the dependent variable has a linear relationship with at least one independent variable. If the null hypothesis is rejected, we cannot conclude that a relationship exists with every independent variable individually. The test statistic is

$F = \frac{SSR / k}{SSE / (n - k - 1)}$

$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}$

SSR = Regression Sum of Squares

SSE = Error Sum of Squares

R² = Coefficient of determination

k = Number of independent variables

n = Sample size
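The two forms of the F-statistic above agree because R² = SSR / (SSR + SSE), so dividing the numerator and denominator of the first form by the total sum of squares SSR + SSE gives the second. The following Python sketch illustrates this. Note the assumptions: k = 3 matches the three predictors in the model, but the sample size n = 20 and the sums of squares are hypothetical values, picked only so that SSR / (SSR + SSE) = 0.996. They are not the figures from the Excel output.

```python
def f_from_sums_of_squares(ssr: float, sse: float, k: int, n: int) -> float:
    """F = (SSR / k) / (SSE / (n - k - 1))."""
    return (ssr / k) / (sse / (n - k - 1))


def f_from_r_squared(r2: float, k: int, n: int) -> float:
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# k = 3 predictors (Advertising, Size, Income); n = 20 is a HYPOTHETICAL sample size.
k, n = 3, 20

# HYPOTHETICAL sums of squares chosen so that SSR / (SSR + SSE) = 0.996.
ssr, sse = 996.0, 4.0
r2 = ssr / (ssr + sse)

# Both forms give the same F value (up to floating-point rounding).
print(f_from_sums_of_squares(ssr, sse, k, n))
print(f_from_r_squared(r2, k, n))
```

Under these assumed values both functions return approximately 1328. An R² this close to 1 produces a very large F, which is why the corresponding p-value is so small.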

In the example, the ANOVA table shows that the p-value (reported as Significance F in Excel's output) of the F-statistic is very low, effectively zero, so the null hypothesis is rejected. In other words, at least one independent variable has a significant effect on the dependent variable Sales, i.e. the model as a whole is significant.

Test of individual coefficients: In multiple linear regression, we also test hypotheses about each of the individual regression coefficients. Specifically, for each coefficient we test the null hypothesis that

H₀: b_i = 0

H₁: b_i is not 0 (tested separately for each i = 1, …, k)

If we reject the null hypothesis that the slope associated with independent variable i is 0, then we may state that independent variable i is significant in the regression model; that is, it contributes to reducing the unexplained variation in the dependent variable and improves the model's ability to predict it. However, if we cannot reject H₀, then that independent variable is not significant and probably should not be included in the model.

In the results, the p-value for Size is greater than 0.05. This indicates that the null hypothesis for this variable cannot be rejected at the 5% significance level, i.e. there is no statistically significant effect of Size on Sales.

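The test statistic for each coefficient, where β̂_i is the estimated coefficient and β_i is its hypothesized value (0 under H₀; the coefficients are written b_i in the hypotheses above), follows a t-distribution with n − k − 1 degrees of freedom under H₀: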

$t = \frac{\hat{\beta}_i - \beta_i}{SE(\hat{\beta}_i)}$
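As a rough illustration of this formula, the Python sketch below computes the t-statistic for one coefficient. The estimate -0.084 is the Size coefficient from the fitted model, but the standard error of 0.12 is a hypothetical value used only to show the arithmetic; the real standard error should be read from the Excel regression output. The function name `t_statistic` is likewise an assumption.

```python
def t_statistic(estimate: float, std_error: float, hypothesized: float = 0.0) -> float:
    """t = (estimate - hypothesized) / SE(estimate)."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (estimate - hypothesized) / std_error


# Size coefficient from the fitted model; the standard error is HYPOTHETICAL.
size_estimate = -0.084
size_std_error = 0.12  # assumed for illustration, not from the Excel output

t_size = t_statistic(size_estimate, size_std_error)
print(round(t_size, 3))
```

Under this assumed standard error the statistic is about -0.7. A value this close to zero would be consistent with a large p-value, although the actual conclusion for Size should be based on the p-value in the regression output.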