Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-03 Thread Liviu Andronic
On Tue, 3 Aug 2010 10:00:08 +1000 Glen Barnett wrote:
> This might help some:
>
> RSiteSearch("winsorize")
>
Or
require(sos)
findFn("winsorize")
Liviu
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read t

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-02 Thread Glen Barnett
This might help some:
RSiteSearch("winsorize")
On Sun, Aug 1, 2010 at 11:39 AM, Cecilia Carmo wrote:
> Hi everyone!
>
> #I need a loop or a function that creates a X2 variable that is X1 without
> the extreme values (or X1 winsorized) by industry and year.
>
> #My reproducible example:
> firm<-s

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-02 Thread jim holtman
I had to look up winsorized; this should do it:
> #My reproducible example:
> firm<-sort(rep(1:1000,10),decreasing=F)
> year<-rep(1998:2007,1000)
> industry<-rep(c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10),rep(6,10),rep(7,10),rep(8,10),rep(9,10),
+ rep(10,10)),1000)
> X1<-rnorm(1)
> da

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-02 Thread Cecilia Carmo
Thank you again, but I think I need to do some homework about the split function, because I'm not understanding it very well. Besides, I think I still have a problem. I also need X2 = X1 winsorized: X2 is equal to X1 between 10%-90%, and is equal to the 10% value when < 10% and equal to the 90%
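The winsorizing rule described in this message (keep X1 inside the 10%-90% range, otherwise replace it with the nearer limit) can be sketched as follows. This is only an illustration under stated assumptions: `winsorize` is a hypothetical helper name, the small data set is made up rather than the poster's, and the limits use the default `quantile()` type.

```r
# Made-up example data (not the poster's): 20 firms, 2 years, 2 industries,
# so every industry/year group holds 10 observations.
set.seed(42)
data <- expand.grid(firm = 1:20, year = 1998:1999)
data$industry <- ifelse(data$firm <= 10, 1, 2)
data$X1 <- rnorm(nrow(data))

# Hypothetical helper: clamp x to its own 10% and 90% quantiles.
winsorize <- function(x, probs = c(0.10, 0.90)) {
  lim <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, lim[1]), lim[2])  # raise low values, lower high values
}

# ave() applies the helper within each industry/year group and returns
# a vector aligned with the original rows, so it can be stored as X2.
data$X2 <- ave(data$X1, data$industry, data$year, FUN = winsorize)

head(data)
```

Because `ave()` keeps the row order, no split/recombine step is needed; X2 equals X1 everywhere except at the values outside each group's limits.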

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-02 Thread jim holtman
This is just following up with the example data you sent. This will create a list 'result' that holds, for each group, the subset of the data between the 10th and 90th percentiles:
> #My reproducible example:
> firm<-sort(rep(1:1000,10),decreasing=F)
> year<-rep(1998:2007,1000)
> industry<-rep(c(rep(1,10),rep(2

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-08-02 Thread Cecilia Carmo
Thank you for your help, but I don't understand how I can get a dataframe with the columns: firm, year, industry, X1 and X2. Could you help me (again)?
Cecília Carmo
On Sat, 31 Jul 2010 22:10:38 -0400, jim holtman wrote:
This will split the data by industry & year and then return the va
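One way to end up with a single data frame holding firm, year, industry, X1 and X2 is to compute X2 in place rather than keeping a list of subsets. The sketch below is an assumption about what is wanted here, not the reply that was actually posted: extreme values are set to NA in X2, `trim_to_na` is a hypothetical helper name, and the data are made up.

```r
# Made-up example data (not the poster's): 10 observations per
# industry/year group.
set.seed(7)
data <- expand.grid(firm = 1:20, year = 1998:1999)
data$industry <- ifelse(data$firm <= 10, 1, 2)
data$X1 <- rnorm(nrow(data))

# Hypothetical helper: values outside the 10%-90% quantile range become NA.
trim_to_na <- function(x, probs = c(0.10, 0.90)) {
  lim <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  ifelse(x < lim[1] | x > lim[2], NA, x)
}

# Apply the helper within each industry/year group; the result lines up
# with the rows of 'data', so it becomes the X2 column directly.
data$X2 <- ave(data$X1, data$industry, data$year, FUN = trim_to_na)

# Put the columns in the requested order.
data <- data[, c("firm", "year", "industry", "X1", "X2")]
head(data)
```

Rows with `NA` in X2 can then be dropped with `data[!is.na(data$X2), ]` if the extreme observations should be removed entirely.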

Re: [R] remove extreme values or winsorize – loop - dataframe

2010-07-31 Thread jim holtman
This will split the data by industry & year and then return the values in the middle 80% (>= the 10% quantile & <= the 90% quantile)
# split the data by industry/year
d.s <- split(data, list(data$industry, data$year), drop=TRUE)
result <- lapply(d.s, function(.id){
    # get 10/90% values
    .limit <- quantile(.id$
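Since the code above is cut off in this archive, here is a self-contained sketch of the same split/lapply idea. It is not a reconstruction of the original message: the filtering step after `quantile()` is an assumption, and the small data set is made up for illustration.

```r
# Made-up example data (not the poster's): 2 industries x 2 years,
# 10 observations per group.
set.seed(1)
data <- data.frame(
  industry = rep(1:2, each = 20),
  year     = rep(rep(1998:1999, each = 10), times = 2),
  X1       = rnorm(40)
)

# split the data by industry/year
d.s <- split(data, list(data$industry, data$year), drop = TRUE)

# within each group, keep only the rows whose X1 lies between the
# group's 10% and 90% quantiles (inclusive)
result <- lapply(d.s, function(.id) {
  .limit <- quantile(.id$X1, probs = c(0.10, 0.90))
  .id[.id$X1 >= .limit[1] & .id$X1 <= .limit[2], , drop = FALSE]
})

sapply(result, nrow)  # rows kept per industry.year group
```

The list can be stacked back into one data frame with `do.call(rbind, result)`.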

[R] remove extreme values or winsorize – loop - dataframe

2010-07-31 Thread Cecilia Carmo
Hi everyone!
#I need a loop or a function that creates a X2 variable that is X1 without the extreme values (or X1 winsorized) by industry and year.
#My reproducible example:
firm<-sort(rep(1:1000,10),decreasing=F)
year<-rep(1998:2007,1000)
industry<-rep(c(rep(1,10),rep(2,10),rep(3,10),rep(4,1