Bias of Omitted Variables
A variable may affect the dependent variable and also be correlated with one or more independent variables. When we omit such a variable from the regression model, its effect is absorbed by the included variables it is correlated with, so their estimated coefficients become biased and misleading.
Example: Suppose we model Sales = f(Advertising).
But suppose we omit Income, which affects sales and is also correlated with advertising (richer areas may receive more advertising). Then the effect of Income gets wrongly mixed into that of Advertising, making advertising look more (or less) important than it really is. As a result, the coefficients become biased, the conclusions become incorrect, and decisions based on the model may be wrong.
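The Sales example can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the data are synthetic, the true coefficients (2.0 for advertising, 3.0 for income) and the noise levels are chosen arbitrarily, and the helper name `ols` is hypothetical. It fits the model once without Income and once with it, using ordinary least squares from NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic data (all numbers are illustrative assumptions, not real figures)
income = rng.normal(50, 10, n)
# Richer areas receive more advertising, so the two are correlated
advertising = 0.5 * income + rng.normal(0, 5, n)
# True model: sales depend on BOTH advertising and income
sales = 2.0 * advertising + 3.0 * income + rng.normal(0, 10, n)

def ols(y, *regressors):
    """Least-squares coefficients: intercept first, then one per regressor."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short = ols(sales, advertising)          # Income omitted
full = ols(sales, advertising, income)   # Income included

print("Advertising coefficient, Income omitted :", round(short[1], 2))
print("Advertising coefficient, Income included:", round(full[1], 2))
```

Under these assumptions the advertising coefficient from the short model is pulled toward 2 + 3 × Cov(Advertising, Income) / Var(Advertising) = 2 + 3 × 50/50 = 5, while the full model recovers a value close to the true 2. The direction of the bias depends on the signs of the omitted variable's effect and of its correlation with the included variable.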
So we should acquire appropriate domain knowledge when building a model and include all important variables. We should also try different model specifications.
Root Mean Squared Error (RMSE) is a popular evaluation metric for regression models, representing the square root of the average of squared differences between predicted and actual values. It acts as a measure of a model's prediction accuracy, where lower values indicate better fit and accuracy.
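As a minimal sketch of that definition, RMSE = sqrt((1/n) × sum of (actual − predicted)²) can be computed directly with NumPy. The function name `rmse` and the sample values below are illustrative assumptions.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Square root of the mean squared difference between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up values for illustration
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

score = rmse(actual, predicted)
print("RMSE:", score)
```

Here the errors are 1, 0, -2 and -1, so the mean squared error is 6/4 = 1.5 and the RMSE is its square root. Because of the square root, RMSE is expressed in the same units as the dependent variable.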
Statlearner