The real question is: why form groups when you already have the two
continuous numerical variables that you want?  That is, what is the benefit
of grouping?  I can think of none.  I personally think this is a historical
practice that started when computers were unavailable and grouping reduced
the arithmetic to a do-able level.  Today, the statistics work without
grouping.
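
If you want to see how much the grouping changes the answer, a rough
simulation along the following lines might help. This is only a sketch
under my own assumptions, not your actual procedure: the parameter values
(true_a, true_b, the noise level, 8 bins) and the helper names are made up
for illustration, and I have assumed that densities are summed within each
bin and the log is taken afterwards. It fits the log-log model once on the
per-species data and once on the binned data.

```python
import numpy as np

def fit_loglog(log_m, log_d):
    """Least-squares fit of log_d = intercept + slope * log_m."""
    slope, intercept = np.polyfit(log_m, log_d, 1)
    return slope, intercept

def bin_and_sum(log_m, d, n_bins):
    """Sum densities in evenly spaced log-size bins.

    Returns the bin centers and log10 of the summed density per bin.
    Empty bins are dropped because their log is undefined.
    """
    edges = np.linspace(log_m.min(), log_m.max(), n_bins + 1)
    # Digitizing against the interior edges gives indices 0..n_bins-1.
    idx = np.digitize(log_m, edges[1:-1])
    sums = np.bincount(idx, weights=d, minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = sums > 0
    return centers[keep], np.log10(sums[keep])

# Hypothetical simulated data: about 100 species, lognormal body size,
# D = a * M^b with multiplicative (lognormal) error.
rng = np.random.default_rng(0)
n_species = 100
true_a, true_b = 10.0, -0.75  # assumed values, for illustration only
log_m = rng.normal(0.0, 1.0, n_species)  # log10 of body size M
log_d = np.log10(true_a) + true_b * log_m + rng.normal(0.0, 0.3, n_species)

# Fit on the ungrouped, per-species data.
slope_raw, intercept_raw = fit_loglog(log_m, log_d)

# Fit on the binned data: bin centers versus log of summed density.
centers, log_sums = bin_and_sum(log_m, 10.0 ** log_d, n_bins=8)
slope_binned, intercept_binned = fit_loglog(centers, log_sums)

print(f"true slope      : {true_b:.3f}")
print(f"ungrouped slope : {slope_raw:.3f} (n = {n_species})")
print(f"binned slope    : {slope_binned:.3f} (n = {len(centers)})")
```

Note that the summed density in a bin depends on how many species fall in
that bin, so the binned slope reflects the distribution of species over
size as well as the per-species relationship. Changing n_bins or the seed
shows how sensitive the binned estimate is.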

Jim

On Fri, Mar 26, 2010 at 09:30, Francisco de Castro <[email protected]> wrote:

> Hi all,
>
> I have a question for the list regarding grouping (binning) of the
> independent variable in a linear regression. This is routinely done
> (at least in limnology) in studies involving so-called biomass
> size-spectra. I'm aware of other (better) methods to fit non-linear
> models. However, I need to compare my results with older literature
> where this method is used widely, and I'd like to know first if the
> method has a problem or if it is outright wrong.
>
> My independent variable is mean body size of the individuals of a
> species (M) and the dependent is either biomass (B, g/m2) or
> population density (D, indiv/m2) of the species. Body size is
> lognormally distributed, and the number of species in the sample is
> ~100. The model to fit is: D= aM^b. First, data are log-transformed in
> order to apply linear least-squares regression. So the model becomes
> log(D)= log(a)+ b log(M). The appropriateness of this transformation
> and possible bias in the estimation of parameters have been discussed
> before (Zar, Smith, others) so my question is not about that. After
> log-transforming, sizes are grouped into even-spaced categories, and
> the densities/biomasses for all sizes within a size group are summed
> up. So, the independent variable becomes the center of each
> log-size-bin, and the dependent becomes the sum of all log-densities
> for each size-bin. Obviously, the number of data gets reduced from the
> original N to the number of size groups/bins used. After grouping, the
> log-log model is fitted by least-squares regression.
>
> So my questions are:
> Is this binning of a log-transformed variable statistically
> appropriate for this problem?
> Shouldn't it be better to use the size and density for each
> species directly, without any grouping?
>
> Thanks in advance for any suggestion or literature.
> Cheers
>
> Francisco de Castro
> Potsdam University
>
