On Wed, 20 Mar 2019 09:43:11 +0000
akshay kulkarni <akshay...@hotmail.com> wrote:

> But doesn't removing some of the parameters reduce the precision of
> the relationship between the response variable and the
> predictors(inefficient estimates of the coefficients)?

No, it doesn't, since the formula already has more parameters than
there are identifiable relationships between the response and the
predictors.

Let me offer you an example. Suppose you have a function y(x) = a*b*x +
c. Let's try to simulate some data and then fit it:

# choose according to your taste
a <- ...
b <- ...
c <- ...

# simulate model data
abc <- data.frame(x = runif(100))
abc$y <- a*b*abc$x + c
# add some normally distributed noise
abc$y <- abc$y + rnorm(100, 0, 0.01)

Now try to fit the formula y ~ a*b*x + c using the data in the data
frame abc. Do you get any results? Do they match the values you
originally set?[*]
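For concreteness, here is a minimal sketch of that experiment with
arbitrarily chosen values a = 2, b = 3, c = 1 (any values would do).
Because the gradients with respect to a and b are linearly dependent,
nls() typically aborts with a "singular gradient" error:

```r
set.seed(42)
a <- 2; b <- 3; c <- 1

# simulate model data with some normally distributed noise
abc <- data.frame(x = runif(100))
abc$y <- a*b*abc$x + c + rnorm(100, 0, 0.01)

# the model is over-parameterised: a and b only ever appear as the
# product a*b, so the fit is degenerate and nls() should fail
fit <- try(nls(y ~ a*b*x + c, data = abc,
               start = list(a = 1, b = 1, c = 0)))
inherits(fit, "try-error")
```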

Then try a formula with the ambiguity removed: y ~ d*x + c. Do you get
a result? Does the obtained d match the a*b you originally set?
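Continuing the sketch above (again assuming the arbitrary values a = 2,
b = 3, c = 1), the reparameterised model fits without trouble and
recovers d close to a*b = 6:

```r
set.seed(42)
a <- 2; b <- 3; c <- 1

# same simulated data as before
abc <- data.frame(x = runif(100))
abc$y <- a*b*abc$x + c + rnorm(100, 0, 0.01)

# with the ambiguity removed, the problem is well-posed
fit <- nls(y ~ d*x + c, data = abc, start = list(d = 1, c = 0))
coef(fit)  # d should be close to a*b, c close to the value set above
```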

Note that for the d you obtained there are infinitely many (a,b) pairs
that satisfy the equation d = a*b, and hence the original regression
problem, equally well, unless you constrain a or b.
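To see the non-identifiability directly, take any fitted slope (d = 6
here, just for illustration) and note that every factorisation a*b = d
reproduces it exactly:

```r
d <- 6  # an illustrative fitted slope

# a few arbitrary choices of a; each one determines a b with a*b = d
candidates <- data.frame(a = c(0.5, 2, 10))
candidates$b <- d / candidates$a
candidates$product <- candidates$a * candidates$b
candidates  # the product column is d for every row
```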

-- 
Best regards,
Ivan

[*] Using R, I couldn't, but the nonlinear solver in gnuplot is
sometimes able to give *a* result for such a degenerate problem when
data is sufficiently noisy. Of course, such a result usually doesn't
match the originally set variable values and should not be trusted.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.