Re: [R] downsampling

Philipp Pagel Mon, 27 Jul 2009 07:29:16 -0700

On Mon, Jul 27, 2009 at 02:42:33PM +0200, Jan M. Wiener wrote:
> However, both approx() and spline() seem to select the number of
> required data points from the original data (at the correct positions,
> of course) and ignore the remaining data points, as the following
> example demonstrates:
> 
> > a= c(1,0,2,1,0)
> 
> > approx(a,n=3)
> $x
> [1] 1 3 5
> 
> $y
> [1] 1 2 0
> 
> Essentially, what approx has done (spline does the same) is to simply
> select the first, third, and fifth entry (as we want to downsample a 5
> point vector into a three point vector). The second and fourth data
> point are completely ignored.


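That seems to be what Warren described as the 'degenerate case'
where approx will 'just throw away every other sample'. With n=3 the
output positions 1, 3, 5 coincide with existing data points, so there
is nothing to interpolate. If you choose a different n (e.g. n=4),
interpolation does happen.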
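
To illustrate, here is a small sketch using the same vector as in the
quoted example. The variable names res3 and res4 are only for
illustration. With n=4 the output positions fall between the original
samples, so approx() has to interpolate linearly between neighbours:

```r
a <- c(1, 0, 2, 1, 0)

# n = 3: output positions are 1, 3, 5, which are existing sample positions,
# so the original values at those positions are returned unchanged.
res3 <- approx(a, n = 3)

# n = 4: output positions are 1, 2.33, 3.67, 5, which lie between samples,
# so the values are linearly interpolated from the two nearest neighbours.
res4 <- approx(a, n = 4)

print(res3$y)
print(res4$y)
```

Note that even with n=4 each output value depends only on its two
neighbouring samples, so this is still interpolation and not averaging.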

> This can result in quite dramatic changes
> of your data, if the data points selected by approx() or spline() happen
> to be outliers and if you downsample data by a rather strong factor.

Yes, that could affect your downsampled data. For more
robustness it would probably be better to fit a proper model (if you
have one) or a lowess curve (or smooth.spline) and then sample the
fitted curve at the positions you need.
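
A minimal sketch of the smooth-then-sample idea, assuming made-up data
(a noisy sine with one outlier placed exactly on a sampling position)
and an arbitrarily chosen lowess span f = 0.2. All variable names are
hypothetical, and the span would need tuning for real data:

```r
set.seed(1)                      # made-up example data, not from the thread
x <- 1:101
y <- sin(x / 10) + rnorm(101, sd = 0.1)
y[51] <- 10                      # one outlier, sitting on a sampling position

# Plain downsampling to 11 points: positions 1, 11, ..., 101 are existing
# samples, so the outlier at x = 51 is passed through unchanged.
raw <- approx(x, y, n = 11)

# Fit a lowess curve first, then read it off at the same 11 positions.
sm <- lowess(x, y, f = 0.2)      # f = 0.2 is an assumed span
smoothed <- approx(sm$x, sm$y, xout = raw$x)

# Alternative mentioned above: smooth.spline, evaluated with predict().
fit <- smooth.spline(x, y)
spl <- predict(fit, raw$x)

print(cbind(x = raw$x, raw = raw$y, lowess = smoothed$y, spline = spl$y))
```

The default lowess settings include robustness iterations, so a single
outlier should pull the fitted curve much less than it pulls the plain
approx() result.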

cu
        Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

