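Dear Darrell,

Regarding the error, I think it is a spacing issue: without a space between "<" and "-100", R reads "<-" as the assignment operator instead of "less than -100".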

dat1[c(TRUE,(diff(dat1$value)<-100)|(diff(dat1$value)>200)),]
Error in diff(dat1$value) <- 100 : could not find function "diff<-"

res<-dat1[c(TRUE,(diff(dat1$value)< -100) | (diff(dat1$value)>200)),]
                                  ^^
#or
dat1[c(TRUE,(diff(dat1$value)<(-100)) | (diff(dat1$value)>200)),]

?`<-` #will assign a value to a name

res
         date value
1  01/01/1947  2180
2  01/02/1947  1990
9  01/09/1947  1909
10 01/10/1947  1803
11 01/11/1947  2018
12 01/12/1947  2319
13 01/13/1947  1981
17 01/17/1947  2364
18 01/18/1947  1882
26 01/26/1947  1839
27 01/27/1947  2344
28 01/28/1947  2229
30 01/30/1947  1923
32 02/01/1947  2379

----- Original Message -----
From: "Bosch, Darrell" <bo...@vt.edu>
To: arun <smartpink...@yahoo.com>
Cc:
Sent: Monday, August 5, 2013 3:26 PM
Subject: RE: [R] eliminating outliers

Thanks, Arun. When I entered the second line of code,

dat1[c(TRUE,(diff(dat1$value)<-100)|(diff(dat1$value)>200)),]

I got the following

Error in diff(dat1$value) <- 100 : could not find function "diff<-"

Do I need to name 'dat1' as a time series dataset in order to invoke the difference operator? I appreciate your help.

Darrell

Darrell Bosch
Professor
Department of Agricultural and Applied Economics
Virginia Tech
Blacksburg, VA 24061
tel. 540/231-5265
fax 540/231-7417

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun
Sent: Monday, August 05, 2013 12:07 PM
To: R help
Subject: Re: [R] eliminating outliers

HI,
Please use ?dput() to show a reproducible example.

set.seed(45)
dat1<- data.frame(date= format(seq(as.Date("01-01-1947",format="%m-%d-%Y"),as.Date("02-01-1947",format="%m-%d-%Y"),by=1),"%m/%d/%Y"),value=sample(1800:2400,32,replace=FALSE))
dat1[c(TRUE,(diff(dat1$value)< -100) | (diff(dat1$value)>200)),]
which(!c(TRUE,(diff(dat1$value)< -100) | (diff(dat1$value)>200)))
# [1]  3  4  5  6  7  8 14 15 16 19 20 21 22 23 24 25 29 31
dat1[which(!c(TRUE,(diff(dat1$value)< -100) | (diff(dat1$value)>200))),]

A.K.
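
The selection logic in the code above can also be written with a single named logical mask, which makes it explicit which subset holds the retained rows and which holds the eliminated ones. The following is only a sketch, not the code posted in the thread: the data frame `dat` and its numbers are made up for illustration, the names `is_outlier`, `kept` and `removed` are hypothetical, and it assumes the first row is never flagged because it has no preceding value to difference against.

```r
# Hypothetical toy data with the same columns as in the thread (date, value);
# the values are invented for illustration only.
dat <- data.frame(
  date  = c("01/01/1947", "04/01/1947", "07/01/1947", "10/01/1947", "01/01/1948"),
  value = c(1900, 1950, 2200, 2210, 2050),
  stringsAsFactors = FALSE
)

# First differences: one element fewer than the number of rows.
d <- diff(dat$value)

# Note the space in `d < -100`; written as `d<-100` it would be an assignment.
# Prepending FALSE aligns the mask with the rows and never flags row 1
# (an assumption, since row 1 has no difference).
is_outlier <- c(FALSE, d < -100 | d > 200)

kept    <- dat[!is_outlier, ]   # dataset with the flagged rows dropped
removed <- dat[is_outlier, ]    # dates and values that were eliminated
```

Printing `removed` lists the dates and values that were dropped. The differences here are computed once on the original series; whether they should be recomputed after rows are dropped depends on the analysis and is not addressed by this sketch.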

I am reading a data file consisting of date and GDP as follows:

gdpdata <- read.table("C:/R-working/R-data/gdp-data-1947-87.txt", header=TRUE)

which results in

        date  value
1 01/01/1947 1932.6
2 04/01/1947 1930.4
Etc.

I then first-difference the data using the command

diff(gdpdata$value)

I would like to create a transformed dataset with outliers eliminated, i.e. any row whose value of 'diff' is greater than 200 or less than -100. Further, I would like R to tell me which dates and GDP values were eliminated. Any suggestions on how to do that would be appreciated.

Thanks,
Darrell Bosch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.