Re: [R] different regression coeffs with different starting point

Achim Zeileis Mon, 14 Mar 2011 12:03:45 -0700

On Mon, 14 Mar 2011, Jen wrote:

Hi Bill,
Thanks for your response and I'm sorry -- that was a misleading example of
what I was trying to show. This one should illustrate the point:


require(AER)
data_in = c(0,6,12,18,24,30,36,42,48,54,60,66,72,78)
data_in2 = data_in^2
data_in3 = data_in^3
data_out =
c(139487.00,133333.00,62500.00,58823.00,56338.00,27549.00,4244.00,600.00,112.00,95.00,48.00,31.00,15.00,14.99)
ldata_out = log(data_out)

m1 <- lm(ldata_out ~ data_in + data_in2+data_in3)
cubic <- tobit(ldata_out ~ data_in + data_in2 + data_in3, left=
log(15-0.01),init = coef(m1))

The scaling of your regressors is extremely bad. data_in3 is in thehundreds of thousands. This computationally challenging, to put it mildly.I would recommend to either scale data_in by some constant (here 10 or 100is enough) or to use an orthogonal coding of the polynomial. Both can beachieved easily using the poly() function:

  - Your model would be:    ldata_out ~ poly(data_in, 3, raw = TRUE)
  - With scaled regressors: ldata_out ~ poly(data_in/100, 3, raw = TRUE)
  - With orthogonal coding: ldata_out ~ poly(data_in/100, 3)

All models conceptually lead to the same fitted values and the samemaximal likelihood. Numerically, however, the first model only achievessuboptimal values depending on the starting values. The latter two modelsseem to yield more sensible results and be close to the actual maximum.

Note that the coefficients change of course due to the different codingsof the regressors. This needs to be taken into account if you want tocompare the slopes.


In summary, I would probably use

cubicz <- tobit(ldata_out ~ poly(data_in/100, 3, raw = TRUE),
  left = log(15-0.01))

which is very robust to the choice of starting values.

hth,
Z

print(cubic)
fitted(cubic)

n= length(data_in)
m2 <- lm(ldata_out[1:(n-1)] ~ data_in[1:(n-1)] +
data_in2[1:(n-1)]+data_in3[1:(n-1)])
cubic2 <- tobit(ldata_out ~ data_in + data_in2 + data_in3, left=
log(15-0.01),init = coef(m2))
print(cubic2)
fitted(cubic2)

The models cubic and cubic2 seem to show that the intercept value in the
model doesn't change from its starting value:

m1


Call:
lm(formula = ldata_out ~ data_in + data_in2 + data_in3)

Coefficients:
(Intercept)      data_in     data_in2     data_in3
 1.156e+01    1.004e-01   -7.755e-03    6.464e-05

cubic


Call:
tobit(formula = ldata_out ~ data_in + data_in2 + data_in3, left = log(15 -
   0.01), init = coef(m1))

Coefficients:
(Intercept)      data_in     data_in2     data_in3
11.5559540    0.0289083   -0.0040615    0.0000229

Scale: 3.759

m2


Call:
lm(formula = ldata_out[1:(n - 1)] ~ data_in[1:(n - 1)] + data_in2[1:(n -
   1)] + data_in3[1:(n - 1)])

Coefficients:
       (Intercept)   data_in[1:(n - 1)]  data_in2[1:(n - 1)]  data_in3[1:(n
- 1)]
         1.150e+01            1.151e-01           -8.381e-03
7.122e-05

cubic2


Call:
tobit(formula = ldata_out ~ data_in + data_in2 + data_in3, left = log(15 -
   0.01), init = coef(m2))

Coefficients:
(Intercept)      data_in     data_in2     data_in3
 1.150e+01    3.391e-02   -4.187e-03    2.383e-05

Scale: 3.759

Am I missing something here??

--
View this message in context: 
http://r.789695.n4.nabble.com/different-regression-coeffs-with-different-starting-point-tp3353536p3354276.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] different regression coeffs with different starting point

Reply via email to