On 8/23/2009 9:58 AM, David Winsemius wrote:
> I still have problems with this statement. As I understand R, this
> should be impossible. I have looked at both you postings and neither of
> them clarify the issues. How can you have blanks or spaces in an R
> numeric vector?

Just because I search numeric columns doesn't mean that my regex matches
them! I posted some info on my data frame in an earlier email:

    str(final_dataf)
    'data.frame':   1127 obs. of  43 variables:
     $ block     : Factor w/ 1 level "2": 1 1 1 1 1 1 1 1 1 1 ...
     $ treatment : Factor w/ 4 levels "I","M","N","T": 1 1 1 1 1 1 ...
     $ transect  : Factor w/ 1 level "4": 1 1 1 1 1 1 1 1 1 1 ...
     $ tag       : chr  NA "121AL" "122AL" "123AL" ...
     ...
     $ h1        : num  NA NA NA NA NA NA NA NA NA NA ...
     ...

You can see that I do indeed have some numeric columns. And while I
search them for spaces, I only do so because my dataset isn't so large
as to require me to exclude them from the search. If my dataset grows
too big at some point, I will exclude numeric columns, and other columns
which cannot hold blanks or spaces.

To clarify further with an example:

    > df = data.frame(a=c(1,2,3,4,5),b=c("a","","c","d"," "))
    > df = as.data.frame(lapply(df, function(x){ is.na(x) <-
    + grep('^\\s*$',x); return(x) }), stringsAsFactors = FALSE)
    > df
      a    b
    1 1    a
    2 2 <NA>
    3 3    c
    4 4    d
    5 5 <NA>
    > str(df)
    'data.frame':   5 obs. of  2 variables:
     $ a: num  1 2 3 4 5
     $ b: Factor w/ 5 levels ""," ","a","c",..: 3 NA 4 5 NA

And one final clarification: I left out "as.data.frame" in my previous
solution. So it now becomes:

    > final_dataf = as.data.frame(lapply(final_dataf, function(x){ is.na(x)
    + <- grep('^\\s*$',x); return(x) }), stringsAsFactors = FALSE)

Hope that clarifies things, and thanks for your help.

Thanks,
Allie

On 8/23/2009 9:58 AM, David Winsemius wrote:
>
> On Aug 23, 2009, at 2:47 AM, Alexander Shenkin wrote:
>
>> On 8/21/2009 3:04 PM, David Winsemius wrote:
>>>
>>> On Aug 21, 2009, at 3:41 PM, Alexander Shenkin wrote:
>>>
>>>> Thanks everyone for their replies, both on- and off-list. I should
>>>> clarify, since I left out some important information. My original
>>>> dataframe has some numeric columns, which get changed to character by
>>>> gsub when I replace spaces with NAs.
>>>
>>> If you used is.na() <- that would not happen to a true _numeric_ vector
>>> (but, of course, a numeric vector in a data.frame could not have spaces,
>>> so you are probably not using precise terminology).
>>
>> I do have true numeric columns, but I loop through my entire dataframe
>> looking for blanks and spaces for convenience.
>
> I still have problems with this statement. As I understand R, this
> should be impossible. I have looked at both you postings and neither of
> them clarify the issues. How can you have blanks or spaces in an R
> numeric vector?
>
>
>>
>>> You would be well
>>> advised to include the actual code rather than applying loose
>>> terminology subject you your and our misinterpretation.
>>
>> I did include code in my previous email. Perhaps you were looking for
>> different parts.
>>
>>>
>>> ?is.na
>>>
>>>
>>> I am guessing that you were using read.table() on the original data, in
>>> which case you should look at the colClasses parameter.
>>>
>>
>> yep - I use read.csv, and I do use colClasses. But as I mentioned
>> earlier, gsub converts those columns to characters. Thanks for the tip
>> about is.na() <-. I'm now using the following, thus side-stepping the
>> whole "controlling as.data.frame's column conversion" issue:
>>
>> final_dataf = lapply(final_dataf, function(x){ is.na(x) <-
>> + grep('^\\s*$',x); return(x) })
>
>
> Good that you have a solution.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
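
P.S. For the "exclude numeric columns" idea mentioned in the reply, here is a minimal sketch of how that might look. The helper name blank_to_na and the toy data frame are hypothetical, not taken from the real dataset, and the data frame is built with stringsAsFactors = FALSE so that column b stays character. It also shows why searching a numeric column is harmless: grep() coerces the numbers to character, and none of them is blank.

```r
# Hypothetical helper (not from the thread): set blank or whitespace-only
# entries to NA, touching only character and factor columns.
blank_to_na <- function(dataf) {
  can_hold_blanks <- vapply(
    dataf,
    function(x) is.character(x) || is.factor(x),
    logical(1)
  )
  dataf[can_hold_blanks] <- lapply(dataf[can_hold_blanks], function(x) {
    is.na(x) <- grep("^\\s*$", x)  # indices of empty or all-whitespace entries
    x
  })
  dataf
}

# Toy data, same values as the example in the reply.
df <- data.frame(a = c(1, 2, 3, 4, 5),
                 b = c("a", "", "c", "d", " "),
                 stringsAsFactors = FALSE)

# The numeric column is coerced to "1", "2", ..., so nothing matches.
grep("^\\s*$", df$a)  # integer(0)

df <- blank_to_na(df)
```

Column a is never passed through the regex here, so it stays numeric without relying on as.data.frame's column conversion.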