Re: [R] Cleaning data

2017-09-26 Thread Jim Lemon
Hi Bayan, Your question seems to imply that the "age" column contains floating point numbers, e.g. df height weight age 170 72 21.5 ... If this is so, you will only find an integer in diff(age) if two adjacent numbers happen to have the same decimal fraction _and_ the subtraction d

Re: [R] Cleaning data

2017-09-26 Thread Eric Berger
Hi Bayan, In your code, 'a' is a vector and is.integer(a) is a logical of length 1 - most likely FALSE if even one element of a is not an integer. (Since R will coerce all the elements of a to the same type.) You need to decide whether something "close enough" to an integer is to be considered an i
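
A minimal sketch of the "close enough to an integer" idea, assuming a hypothetical data frame shaped like the one in Jim Lemon's reply. The tolerance `tol` and the choice of which row of each pair to drop are assumptions, not something the thread settles. Note also that R spells the logical constant `TRUE`; `true` is not defined.

```r
# Hypothetical data; column names follow the example in the thread.
df <- data.frame(height = c(170, 165, 180, 175),
                 weight = c(72, 60, 85, 78),
                 age    = c(21.5, 22.5, 23.1, 25.1))

a <- diff(df$age)   # differences between consecutive ages
is.integer(a)       # a single logical about the storage type, not per element

# Treat a difference as "integer" if it lies within 'tol' of a whole number.
tol <- 1e-8         # assumed tolerance
whole <- abs(a - round(a)) < tol

# a[i] is age[i + 1] - age[i]; this drops the later row of each such pair.
# Which row to drop is a guess at the intent.
cleaned <- df[!c(FALSE, whole), ]
```

The comparison is vectorised, so no `for` loop is needed, and rows are selected with a logical index rather than by passing the differences themselves to `df[-a, ]`.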

[R] Cleaning data

2017-09-26 Thread bayan sardini
Hi, I want to clean my data frame based on the age column: I want to delete the rows where the difference between consecutive elements, (i+1)-i, is an integer. I used a <- diff(df$age) for(i in a){if(is.integer(a) == true){df <- df[-a,] }} but it doesn't work. Any ideas? Thanks in advance, Bayan

Re: [R] Cleaning

2015-11-11 Thread Boris Steipe
If what you posted here is what you typed, your syntax is wrong. I strongly advise you to consult the two links here: http://adv-r.had.co.nz/Reproducibility.html http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example ... and please read the posting guide and don't po

Re: [R] Cleaning

2015-11-11 Thread Ashta
Sarah, Thank you very much. For the other variables I was trying to do the same job in different way because it is easier to list it Example test < which(dat$var1 !="BAA" | dat$var1 !="FAG" ) { dat <- dat[-test,]} and I did not get the right result. What am I missing here? On Wed

Re: [R] Cleaning

2015-11-11 Thread Sarah Goslee
On Wed, Nov 11, 2015 at 8:44 PM, Ashta wrote: > Hi Sarah, > > I used the following to clean my data, the program crashed several times. > > test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,] > > What is the difference between these two > > test <- dat[dat$Var1 %in% "YYZ" | dat$Var1 %in% "MSN" ,]

Re: [R] Cleaning

2015-11-11 Thread Ashta
Hi Sarah, I used the following to clean my data, the program crashed several times. test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,] What is the difference between these two test <- dat[dat$Var1 %in% "YYZ" | dat$Var1 %in% "MSN" ,] On Wed, Nov 11, 2015 at 6:38 PM, Sarah Goslee

Re: [R] Cleaning

2015-11-11 Thread Sarah Goslee
Please keep replies on the list so others may participate in the conversation. If you have a character vector containing the potential values, you might look at %in% for one approach to subsetting your data. Var1 %in% myvalues Sarah On Wed, Nov 11, 2015 at 7:10 PM, Ashta wrote: > Thank you Sar
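
A short sketch of the %in% approach, using a made-up data frame; `dat`, `myvalues`, `kept` and `dropped` are illustrative names. It also shows why an attempt quoted elsewhere in this thread, combining `!=` with `|`, cannot work: that condition is TRUE for every value.

```r
# Hypothetical data with a few garbage codes mixed in.
dat <- data.frame(Var1 = c("MSN", ":", "YYZ", "]", "MSN", "+"),
                  x    = 1:6,
                  stringsAsFactors = FALSE)

myvalues <- c("YYZ", "MSN")            # the values to keep
kept <- dat[dat$Var1 %in% myvalues, ]

# == propagates NA into the index; %in% returns FALSE for NA instead.
c(NA, "MSN") == "MSN"      # NA TRUE
c(NA, "MSN") %in% "MSN"    # FALSE TRUE

# Every value differs from at least one of the two codes, so this is all TRUE:
all(dat$Var1 != "YYZ" | dat$Var1 != "MSN")

# To drop the listed values, negate the %in% test instead.
dropped <- dat[!(dat$Var1 %in% myvalues), ]
```

Using a logical index also avoids the `dat[-which(...), ]` pitfall, where an empty `which()` result removes every row.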

Re: [R] Cleaning

2015-11-11 Thread Sarah Goslee
Hi, On Wed, Nov 11, 2015 at 6:51 PM, Ashta wrote: > Hi all, > > I have a data frame with huge rows and columns. > > When I looked at the data, it has several garbage values need to be > > cleaned. For a sample I am showing you the frequency distribution > of one variables > > Var1 Freq > 1

[R] Cleaning

2015-11-11 Thread Ashta
Hi all, I have a data frame with a huge number of rows and columns. When I looked at the data, I found several garbage values that need to be cleaned. As a sample, I am showing you the frequency distribution of one variable:

  Var1 Freq
1    :    3
2    ]    6
3  MSN 1040
4  YYZ  300
5   \\    4
6    +

Re: [R] Cleaning up workspace

2013-10-16 Thread Duncan Murdoch
This has been reported before on the bug list (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15481). The message is coming from the methods package, but I don't know if it's a bug or ignorable. Duncan Murdoch On 16/10/2013 11:03 AM, Prof J C Nash (U30A) wrote: In order to have a clea

[R] Cleaning up workspace

2013-10-16 Thread Prof J C Nash (U30A)
In order to have a clean workspace at the start of each chapter of a book I'm "knit"ing I've written a little script as follows: # chapclean.R # This cleans up the R workspace ilist<-c(".GlobalEnv", "package:stats", "package:graphics", "package:grDevices", "package:utils", "package:datasets",

Re: [R] Cleaning up messy Excel data

2012-03-03 Thread John Kane
Seconded John Kane Kingston ON Canada > -Original Message- > From: rolf.tur...@xtra.co.nz > Sent: Sat, 03 Mar 2012 13:46:42 +1300 > To: 538...@gmail.com > Subject: Re: [R] Cleaning up messy Excel data > > On 03/03/12 12:41, Greg Snow wrote: > > >>

Re: [R] Cleaning up messy Excel data

2012-03-03 Thread Greg Snow
Sometimes we adapt to our environment, sometimes we adapt our environment to us. I like fortune(108). I actually was suggesting that you add a tool to your toolbox, not limit it. In my experience (and I don't expect everyone else's to match) data manipulation that seems easier in Excel than R is

Re: [R] Cleaning up messy Excel data

2012-03-03 Thread John C Nash
> From: jim holtman > To: Greg Snow <538...@gmail.com> > Cc: r-help > Subject: Re: [R] Cleaning up messy Excel data > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > > Unfortunately they only know how to use Excel and Word. They are not > fo

Re: [R] Cleaning up messy Excel data

2012-03-02 Thread jim holtman
Unfortunately they only know how to use Excel and Word. They are not folks who use a computer every day. Many of them run factories or warehouses and asking them to use something like Access would not happen in my lifetime (I have retired twice already). I don't have any problems with them "mess

Re: [R] Cleaning up messy Excel data

2012-03-02 Thread Rolf Turner
On 03/03/12 12:41, Greg Snow wrote: It is possible to do the right thing in Excel, but Excel does not encourage (let alone force) you to do the right thing, but makes it easy to do the wrong thing. Fortune! cheers, Rolf Turner

Re: [R] Cleaning up messy Excel data

2012-03-02 Thread Jim Lemon
Unfortunately, a lot of people who use MS Office don't have or know how to use MS Access. Where I work now (as in the past) I have to tie someone to their chair, give them a few pokes with the cattle prod and then show them that a CSV file will load straight into Excel before I can convince the

Re: [R] Cleaning up messy Excel data

2012-03-02 Thread Greg Snow
Try sending your clients a data set (data frame, table, etc) as an MS Access data table instead. They can still view the data as a table, but will have to go to much more effort to mess up the data, more likely they will do proper edits without messing anything up (mixing characters in with number

Re: [R] Cleaning up messy Excel data

2012-03-01 Thread jim holtman
But there are some important reasons to use Excel. In my work there are a lot of people that I have to send the equivalent of a data.frame to who want to look at the data and possibly slice/dice the data differently and then send back to me updates. These folks do not know how to use R, but do ha

Re: [R] Cleaning up messy Excel data

2012-02-29 Thread Rolf Turner
On 01/03/12 04:43, John Kane wrote:
> (mydata <- as.factor(c("1","2","3", ">2", "5", ">2")))
> str(mydata)
> newdata <- as.character(mydata)
> newdata[newdata==">2"] <- 0
> newdata <- as.numeric(newdata)
> str(newdata)
>
> We really need to keep Excel (and other spreadsheets) out of peoples hands.

Amen, bro'!!!

Re: [R] Cleaning up messy Excel data

2012-02-29 Thread John Kane
d other spreadsheets) out of peoples hands. John Kane Kingston ON Canada > -Original Message- > From: noahsilver...@ucla.edu > Sent: Tue, 28 Feb 2012 13:27:13 -0800 > To: r-help@r-project.org > Subject: [R] Cleaning up messy Excel data > > Unfortunately, some d

Re: [R] Cleaning up messy Excel data

2012-02-28 Thread Stephen Sefick
Just replace that value with zero. If you provide some reproducible code I could probably give you a solution. ?dput good luck, Stephen On 02/28/2012 03:27 PM, Noah Silverman wrote: Unfortunately, some data I need to work with was delivered in a rather messy Excel file. I want to import int

Re: [R] Cleaning up messy Excel data

2012-02-28 Thread Noah Silverman
That's exactly what I need. Thank You!! -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 On Feb 28, 2012, at 1:42 PM, jim holtman wrote: > First of all when reading in the CSV file, use 'as.is = TRUE' to > prevent the changing to factors. > > N

Re: [R] Cleaning up messy Excel data

2012-02-28 Thread Robert Baer
-Original Message- From: Noah Silverman Sent: Tuesday, February 28, 2012 3:27 PM To: r-help Subject: [R] Cleaning up messy Excel data Unfortunately, some data I need to work with was delivered in a rather messy Excel file. I want to import into R and clean up some things so that I can

Re: [R] Cleaning up messy Excel data

2012-02-28 Thread jim holtman
First of all when reading in the CSV file, use 'as.is = TRUE' to prevent the changing to factors. Now that things are character in that column, you can use some pattern expressions (gsub, regex, ...) to search for and change your data. E.g., sub("<.*", "0", yourCol) should do it for you. On Tue
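
A minimal sketch of the two steps described above, assuming a small inline CSV in place of the real Excel export. The entries "<2" and "<10" are a guess at what the messy values look like, and `csv_text` and `d` are illustrative names.

```r
# Hypothetical CSV text standing in for the exported file.
csv_text <- "id,value\n1,118\n2,<2\n3,5\n4,<10\n"

# as.is = TRUE keeps the mixed column as character rather than factor.
d <- read.csv(text = csv_text, as.is = TRUE)

d$value <- sub("<.*", "0", d$value)   # anything starting at "<" becomes "0"
d$value <- as.numeric(d$value)        # now safe to convert
```

From R 4.0.0 onward, read.csv no longer converts strings to factors by default, so `as.is = TRUE` mainly matters on older versions. Replacing censored values with 0 is the choice made in the thread; whether it suits a given analysis is a separate question.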

[R] Cleaning up messy Excel data

2012-02-28 Thread Noah Silverman
Unfortunately, some data I need to work with was delivered in a rather messy Excel file. I want to import into R and clean up some things so that I can do my analysis. Pulling in a CSV from Excel is the easy part. My current challenge is dealing with some text mixed in the values. i.e. 118

Re: [R] Cleaning date columns

2011-03-10 Thread natalie.vanzuydam
Dear Bill, Thanks very much for the reply and for the code. I have amended my personal details for future posts. I was wondering if there were any good books or tutorials for writing code similar to what you have provided above? Best wishes, Natalie Van Zuydam - Natalie Van Zuydam PhD Stu

Re: [R] Cleaning date columns

2011-03-09 Thread Bill.Venables
PM To: r-help@r-project.org Subject: [R] Cleaning date columns Hi Everyone, I have the following problem: data <- structure(list(prochi = c("IND1", "IND1", "IND1", "IND2", "IND2", "IND2", "IND2", "IND3", "IND4

[R] Cleaning date columns

2011-03-09 Thread Newbie19_02
Hi Everyone, I have the following problem: data <- structure(list(prochi = c("IND1", "IND1", "IND1", "IND2", "IND2", "IND2", "IND2", "IND3", "IND4", "IND5"), date_admission = structure(c(6468, 6470, 7063, 9981, 9983, 14186, 14372, 5129, 9767, 11168), class = "Date")), .Names = c("prochi", "da

Re: [R] cleaning up a vector

2010-10-01 Thread Henrique Dallazuanna
Complementing: findInterval(x[is.finite(x)], 1:20) On Fri, Oct 1, 2010 at 2:55 PM, Henrique Dallazuanna wrote: > Try this: > > x[is.finite(x)] > > > > On Fri, Oct 1, 2010 at 2:51 PM, wrote: > >> I calculated a large vector. Unfortunately, I have some measurement error >> in my data and some o

Re: [R] cleaning up a vector

2010-10-01 Thread Marc Schwartz
On Oct 1, 2010, at 12:51 PM, mlar...@rsmas.miami.edu wrote: > I calculated a large vector. Unfortunately, I have some measurement error > in my data and some of the values in the vector are erroneous. I ended up > with some Infs and NaNs in the vector. I would like to filter out the Inf > and Na

Re: [R] cleaning up a vector

2010-10-01 Thread Peter Langfelder
On Fri, Oct 1, 2010 at 10:51 AM, wrote: > I calculated a large vector. Unfortunately, I have some measurement error > in my data and some of the values in the vector are erroneous. I ended up > with some Infs and NaNs in the vector. I would like to filter out the Inf > and NaN values and only k

Re: [R] cleaning up a vector

2010-10-01 Thread Erik Iverson
Mike, Small, reproducible examples are always useful for the rest of us. x <- c(0, NA, NaN, 1 , 10, 20, 21, Inf) x[!is.na(x) & x >=1 & x<= 20] Is that what you're looking for? mlar...@rsmas.miami.edu wrote: I calculated a large vector. Unfortunately, I have some measurement error in my d

Re: [R] cleaning up a vector

2010-10-01 Thread Henrique Dallazuanna
Try this: x[is.finite(x)] On Fri, Oct 1, 2010 at 2:51 PM, wrote: > I calculated a large vector. Unfortunately, I have some measurement error > in my data and some of the values in the vector are erroneous. I ended up > with some Infs and NaNs in the vector. I would like to filter out the Inf
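
The suggestions in this thread can be combined roughly as follows. This is a sketch on a made-up vector; the 1 to 20 range comes from the original question, and `finite_only` and `in_range` are illustrative names.

```r
x <- c(0, NA, NaN, 1, 10, 20, 21, Inf, -Inf)

# is.finite() is FALSE for NA, NaN, Inf and -Inf alike.
finite_only <- x[is.finite(x)]

# Add the range condition to keep only values between 1 and 20 inclusive.
in_range <- x[is.finite(x) & x >= 1 & x <= 20]
```

Because `FALSE & NA` is `FALSE`, the missing values never reach the index as NA, so no NA rows appear in the result.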

[R] cleaning up a vector

2010-10-01 Thread mlarkin
I calculated a large vector. Unfortunately, I have some measurement error in my data and some of the values in the vector are erroneous. I ended up with some Infs and NaNs in the vector. I would like to filter out the Inf and NaN values and only keep the values in my vector that range from 1 to 2

Re: [R] Cleaning a time series

2008-05-23 Thread Gabor Grothendieck
The zoo package has six na.* routines for carrying values forward, etc. library(zoo) ?zoo describes them. Also see the vignettes. On Fri, May 23, 2008 at 6:55 AM, <[EMAIL PROTECTED]> wrote: > Dear R Users, > > Was wondering if anyone can give me pointers to functionality in R that > can help "
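
As an illustration of two of those routines, here is a small sketch on a made-up daily series. It assumes the zoo package is installed; the dates and values are invented, and `z`, `carried` and `interp` are illustrative names.

```r
library(zoo)  # assumes zoo is installed

# Hypothetical daily series with gaps.
z <- zoo(c(1, NA, NA, 4, NA, 6), as.Date("2008-05-19") + 0:5)

carried <- na.locf(z)     # carry the last observation forward
interp  <- na.approx(z)   # linear interpolation between neighbours
```

These only fill in values that are already marked NA. Deciding which observations count as "errors" in the first place is a separate step that the reply does not cover.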

[R] Cleaning a time series

2008-05-23 Thread tolga . i . uzuner
Dear R Users, Was wondering if anyone can give me pointers to functionality in R that can help "clean" a time series ? For example, some kind of package/functionality which identifies potential "errors" and takes some action, such as replacement by some suitable value (carry-forward, average o

Re: [R] Cleaning up memory in R

2008-05-14 Thread Duncan Murdoch
On 5/14/2008 3:59 PM, Anh Tran wrote: I'm trying to work on a large dataset and after each segment of run, I need a command to flush the memory. I tried gc() and rm(list=ls()) but they don't seem to help. gc() does not do anything beside showing the memory usage. How do you know it does nothing

Re: [R] Cleaning up memory in R

2008-05-14 Thread Anh Tran
Sorry, it was a stupid mistake on my part. Please forgive that question. I have to unload the variable first. On Wed, May 14, 2008 at 1:12 PM, Duncan Murdoch <[EMAIL PROTECTED]> wrote: > On 5/14/2008 3:59 PM, Anh Tran wrote: > >> I'm trying to work on a large dataset and after each segment of r
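
The order of operations behind that fix can be sketched as below. `big` is a hypothetical stand-in for a large object and is unrelated to BSgenome.

```r
big <- numeric(1e6)               # hypothetical large object, roughly 8 MB
size_before <- object.size(big)

rm(big)                           # remove the binding first
invisible(gc())                   # the collector can now reclaim the memory

still_there <- exists("big")      # the name no longer resolves
```

gc() can only free memory that nothing refers to any more, which is why calling it while the object is still bound appears to do nothing.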

[R] Cleaning up memory in R

2008-05-14 Thread Anh Tran
I'm trying to work on a large dataset and after each segment of the run, I need a command to flush the memory. I tried gc() and rm(list=ls()) but they don't seem to help. gc() does not do anything besides showing the memory usage. I'm using the package BSgenome from BioC. Thanks a bunch -- Regards,

Re: [R] Cleaning database: grep()? apply()?

2007-11-13 Thread jim holtman
Here is how to whittle it down for the first two parts of your question. I am not exactly sure what you are after in the third part. Is it that you want specific DATEs or do you want the ratio of the DATE[max]/DATE[min]? > x <- read.table(textConnection("CODENAME

[R] Cleaning database: grep()? apply()?

2007-11-13 Thread Jonas Malmros
Dear R users, I have a huge database and I need to adjust it somewhat. Here is a very small excerpt from the database:

CODE  NAME                         DATE  DATA1
4813  ADVANCED TELECOM             1987  0.013
3845  ADVANCED THERAPEUTIC SYS LT