Re: [R] data.frame transformation

andrija djurovic Mon, 14 Mar 2011 23:50:59 -0700

Thank you Bill for this additional solution.

Andrija


On Tue, Mar 15, 2011 at 12:16 AM, <bill.venab...@csiro.au> wrote:

> It is possible to do it with numeric comparisons, as well, but to make life
> comfortable you need to turn off the warning system temporarily.
>
> df <- data.frame(q1 = c(0,0,33.33,"check"),
>                 q2 = c(0,33.33,"check",9.156),
>                 q3 = c("check","check",25,100),
>                 q4 = c(7.123,35,100,"check"))
>
> conv <- function(x, cutoff) {
>        oldOpt <- options(warn = -1)
>        on.exit(options(oldOpt))
>        x <- as.factor(x)
>        lev <- as.numeric(levels(x))
>        levels(x)[!is.na(lev) & lev < cutoff] <- "."
>        x
> }
>
> Check:
> > (df1 <- data.frame(lapply(df, conv, cutoff = 10)))
>     q1    q2    q3    q4
> 1     .     . check     .
> 2     . 33.33 check    35
> 3 33.33 check    25   100
> 4 check     .   100 check
> >
>
> Bill Venables.
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of David Winsemius
> Sent: Tuesday, 15 March 2011 6:29 AM
> To: andrija djurovic
> Cc: r-help@r-project.org
> Subject: Re: [R] data.frame transformation
>
>
> On Mar 14, 2011, at 3:51 PM, andrija djurovic wrote:
>
> > I would like to hide cells with values less than 10%, so "." or just
> > "" makes no difference to me. Also, I used apply combined with
> > as.character:
> >
> > apply(df, 2, function(x)  ifelse(as.character(x) < 10,".",x))
> >
> > This is probably not a good solution, but it works, except that I
> > lose the row names. Because of that, I was wondering if there is some
> > other way to do this.
> >
> > Anyway, thank you both; I will try to do this before combining numbers
> > and strings.
>
> I saw your later assertion that it didn't work, which surprised me. My
> version of your data followed my advice not to use factors, and your
> effort did succeed when the columns were character rather than factor.
> I put back the row numbers by coercing the result back to a data.frame,
> since `apply` returns a matrix.
>
>  > df<-data.frame(q1=c(0,0,33.33,"check"),q2=c(0,33.33," check",9.156),
> + q3=c("check","check",25,100),q4=c(7.123,35,100,"check"),
> + stringsAsFactors=FALSE)
>  > as.data.frame(apply(df, 2, function(x)  ifelse(as.character(x) < 10,".",x)))
>      q1    q2    q3    q4
> 1     .     . check 7.123
> 2     . 33.33 check    35
> 3 33.33     .    25   100
> 4 check 9.156   100 check
>
> The danger of using character collation is that any string whose
> leading character sorts below "1", such as a <blank> or other
> punctuation, will get "dotted".
>
>  > "," < "1"
> [1] TRUE
>  > "." < "1"
> [1] TRUE
>  > "-" < "1"
> [1] TRUE
>
> And "1.check" would also get "dotted"
>
>  > "1.check" < 10
> [1] TRUE
>
> >
> > Andrija
> >
> > On Mon, Mar 14, 2011 at 8:11 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> > On Mar 14, 2011, at 2:52 PM, andrija djurovic wrote:
> >
> > Hi R users,
> >
> > I have the following data frame
> >
> > df<-data.frame(q1=c(0,0,33.33,"check"),q2=c(0,33.33,"check",9.156),
> > q3=c("check","check",25,100),q4=c(7.123,35,100,"check"))
> >
> > and I would like to replace every element that is less than 10
> > with . (dot)
> > in order to obtain this:
> >
> >    q1    q2    q3    q4
> > 1     .     . check     .
> > 2     . 33.33 check    35
> > 3 33.33 check    25   100
> > 4 check     .   100 check
> >
> > I had a lot of difficulties because each variable is a factor.
> >
> > Right, so comparisons with "<" will not work: "<" is not meaningful
> > for factors, and R returns NA with a warning.  I would
> > sidestep the factor problem with stringsAsFactors=FALSE in the
> > data.frame call. You might want to reconsider the "." as a missing
> > value. If you are coming from a SAS background, you should try to
> > get comfortable with NA or NA_character_ as a value.
> >
> >
> > df<-data.frame(q1=c(0,0,33.33,"check"),q2=c(0,33.33,"check",9.156),
> >  q3=c("check","check",25,100),q4=c(7.123,35,100,"check"),
> > stringsAsFactors=FALSE)
> >
> > is.na(df) <- t(apply(df, 1, function(x)  as.numeric(x) < 10))
> >
> > Warning messages:
> > 1: In FUN(newX[, i], ...) : NAs introduced by coercion
> > 2: In FUN(newX[, i], ...) : NAs introduced by coercion
> > 3: In FUN(newX[, i], ...) : NAs introduced by coercion
> > 4: In FUN(newX[, i], ...) : NAs introduced by coercion
> > > df
> >     q1    q2    q3    q4
> > 1  <NA>  <NA> check  <NA>
> > 2  <NA> 33.33 check    35
> > 3 33.33 check    25   100
> > 4 check  <NA>   100 check
> >
> >
> > Could someone help me with this?
> >
> > Thanks in advance for any help.
> >
> > Andrija
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> > David Winsemius, MD
> > West Hartford, CT
> >
> >
>
> David Winsemius, MD
> West Hartford, CT
>
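
Both numeric approaches in this thread deal with the coercion warnings, either by switching them off with options(warn = -1) or by letting them print. A minimal sketch of one alternative, assuming the columns are kept as character (stringsAsFactors=FALSE), is to wrap only the coercion in suppressWarnings() and assign the result back into df[], which keeps the data.frame class and its row names. The helper name dot_below is hypothetical and does not come from the thread.

```r
# Same data as in the thread, kept as character columns (not factors).
df <- data.frame(q1 = c(0, 0, 33.33, "check"),
                 q2 = c(0, 33.33, "check", 9.156),
                 q3 = c("check", "check", 25, 100),
                 q4 = c(7.123, 35, 100, "check"),
                 stringsAsFactors = FALSE)

# dot_below() is a hypothetical helper name, not from the thread.
dot_below <- function(x, cutoff) {
  # Non-numeric strings such as "check" become NA; the coercion
  # warning is suppressed for this one call only.
  num <- suppressWarnings(as.numeric(x))
  # Replace only the entries that are numeric and below the cutoff.
  x[!is.na(num) & num < cutoff] <- "."
  x
}

# Assigning into df[] keeps the data.frame class and its row names.
df[] <- lapply(df, dot_below, cutoff = 10)
print(df)
```

Because the comparison is numeric, 7.123 and 9.156 are replaced here, unlike in the character-collation version. The columns stay character rather than factor.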
>
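
The difference between the character comparison and the numeric comparison discussed in this thread can be seen side by side. The sketch below is only an illustration on a made-up vector vals (an assumption, not data from the thread beyond the individual values). String comparison in R uses the collating sequence of the current locale, so the result for strings with leading blanks or punctuation may differ between systems.

```r
vals <- c("7.123", "9.156", "33.33", " check", "check")

# Character comparison: 10 is coerced to "10" and the strings are
# collated, so "7.123" is not below "10" because "7" sorts after "1".
as_text <- vals < 10

# Numeric comparison: non-numeric strings become NA instead of being
# compared by their leading character.
as_number <- suppressWarnings(as.numeric(vals)) < 10

print(data.frame(vals, as_text, as_number))
```

The as_number column is NA for the non-numeric strings, so a test such as !is.na(as_number) & as_number selects only genuine numbers below the cutoff.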

