On 19/07/2020 16:11, Dino wrote:
> On 7/19/2020 4:54 PM, duncan smith wrote:
>>
>> It depends on what you expect the result to be. There's nothing
>> inherently wrong with transforming variables before using least squares
>> fitting. Whether it gives you the "best" estimates for the coefficients
>> is a different issue.
>
> Thanks a lot for this, Duncan. I guess I have two follow-up questions at
> this point:
>
> 1) In which ways is this approach sub-optimal?
>
> 2) What's the "right" way to do it?
>
> Thank you
>
You'll have to read up a bit on ordinary least squares (e.g.
https://en.wikipedia.org/wiki/Ordinary_least_squares). It is based on
assumptions that might not hold for a given model / dataset, and depending
on which assumptions are violated the estimates can be affected in
different ways. The usual approach is to fit the model, then check the
residuals to see whether the assumptions (approximately) hold. If they
don't, that can indicate a poor model fit, or it can suggest fitting a
transformed model that estimates the same coefficients while better
satisfying the assumptions.

An example of the latter: Y = a + bX has the same coefficients as
Y/X = a * (1/X) + b, but the transformed regression (of Y/X on 1/X) might
satisfy the assumption of constant error variance when the original one
doesn't.

Regression analysis is a bit of an art, and it's a long time since I did
any. Ordinary least squares is optimal in a certain sense when the
assumptions hold; when they don't, there's no single answer to what the
best alternative is (unless it's "employ a good statistician").

Duncan
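
For concreteness, here is a minimal sketch of that transformed fit in
Python. It assumes statsmodels and uses made-up data whose error
standard deviation grows with X; the data and library choice are
illustrative only, not something from the thread. Both regressions
estimate the same a and b, but the residuals of the transformed fit have
roughly constant variance.

import numpy as np
import statsmodels.api as sm

# Made-up data, purely for illustration: true model Y = a + b*X with
# error standard deviation proportional to X (non-constant variance).
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 10.0, size=200)
a, b = 2.0, 3.0
Y = a + b * X + rng.normal(0.0, 0.5 * X)

# Plain OLS of Y on X: params come back as [intercept a, slope b].
plain = sm.OLS(Y, sm.add_constant(X)).fit()

# Transformed model Y/X = a*(1/X) + b: same coefficients, but the errors
# of this regression are roughly constant-variance.
# params come back as [intercept b, coefficient on 1/X = a].
transformed = sm.OLS(Y / X, sm.add_constant(1.0 / X)).fit()

print("plain:       a=%.2f  b=%.2f" % (plain.params[0], plain.params[1]))
print("transformed: a=%.2f  b=%.2f" % (transformed.params[1], transformed.params[0]))

# Residual checks (e.g. plotting plain.resid against X) would show the
# fanning-out pattern for the plain fit but not for the transformed one.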