Thanks Thierry,

A quick test shows almost equivalent timing with the modification of relevel() suggested earlier:


relevel <-
function (x, ref, ...)
{
    lev <- levels(x)
    if (is.character(ref))
        ref <- match(ref, lev)
    if (any(is.na(ref)))
        stop("'ref' must be an existing level")
    nlev <- length(lev)
    if (any(ref < 1 | ref > nlev))
        stop(gettextf("ref = %d must be in 1:%d", ref, nlev),
            domain = NA)
    factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
}

> system.time(relevel(y, c("D", "B")))
   user  system elapsed
  5.972   0.258   6.395
>
> system.time(order.factor3(y, c("D", "B")))
   user  system elapsed
  5.962   0.274   6.459
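
As a quick sanity check on a small factor, here is a minimal sketch of the same idea. The name relevel_multi is hypothetical (chosen only to avoid masking stats::relevel), and the range check is left out for brevity:

```r
# Hypothetical name, used here only to avoid masking stats::relevel
relevel_multi <- function(x, ref) {
    lev <- levels(x)
    if (is.character(ref))
        ref <- match(ref, lev)          # level names -> positions
    if (any(is.na(ref)))
        stop("'ref' must be an existing level")
    # requested levels first, remaining levels in their original order
    factor(x, levels = lev[c(ref, seq_along(lev)[-ref])])
}

ff <- factor(c("a", "b", "c", "d"))
levels(relevel_multi(ff, c("c", "d")))  # "c" "d" "a" "b"
```

Only the level order changes; the values stored in the factor stay the same.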


It's always good to learn other options, though.

Thanks,

baptiste

On 9 Jan 2009, at 15:50, ONKELINX, Thierry wrote:

Dear Baptiste,

You can avoid the recursion altogether, and it will run about twice as fast.

order.factor <- function (x, ref)
{
    last.index <- length(ref) # convenience for matlab's end keyword
    if (last.index == 1) return(relevel(x, ref)) # end case, normal case
    my.new.list <- list(x = relevel(x, ref[last.index]), ref = ref[-last.index])
    return(do.call(order.factor, my.new.list)) # recursive call
}

order.factor2 <- function(x, ref){
    factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
}

order.factor3 <- function(x, ref){
    new.levels <- c(ref, sort(levels(x)[!levels(x) %in% ref]))
    factor(x, levels = new.levels, labels = new.levels)
}

x <- factor(sample(LETTERS[1:5], 10000000, replace = TRUE))
y <- factor(sample(LETTERS[1:20], 10000000, replace = TRUE))
system.time(order.factor(x, c("D", "B")))
  user  system elapsed
  5.69    0.38    6.09
system.time(order.factor2(x, c("D", "B")))
  user  system elapsed
  3.90    0.20    4.12
system.time(order.factor3(x, c("D", "B")))
  user  system elapsed
  3.26    0.19    3.46
system.time(order.factor(y, c("D", "B")))
  user  system elapsed
 17.43    0.39   17.84
system.time(order.factor3(y, c("D", "B")))
  user  system elapsed
  8.25    0.17    8.46
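
One point worth noting about order.factor2() above: because of sort(), the levels not listed in ref come out in alphabetical order, whereas relevel() keeps them in their existing order. The two agree when the levels are already sorted, as in the timings above. A minimal sketch of an order-preserving variant follows; the name reorder_levels is hypothetical and setdiff() is swapped in for the %in% filter:

```r
# As posted above, repeated so the example is self-contained
order.factor2 <- function(x, ref) {
    factor(x, levels = c(ref, sort(levels(x)[!levels(x) %in% ref])))
}

# Hypothetical variant: setdiff() keeps the remaining levels in their
# existing order instead of sorting them
reorder_levels <- function(x, ref) {
    factor(x, levels = c(ref, setdiff(levels(x), ref)))
}

# A factor whose levels are deliberately not in alphabetical order
ff <- factor(c("a", "b", "c", "d"), levels = c("d", "c", "b", "a"))

levels(order.factor2(ff, "b"))   # "b" "a" "c" "d"  (rest sorted)
levels(reorder_levels(ff, "b"))  # "b" "d" "c" "a"  (rest kept as is)
levels(relevel(ff, "b"))         # "b" "d" "c" "a"
```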


HTH,

Thierry


----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of baptiste auguie
Sent: Friday 9 January 2009 15:11
To: R R-help
Subject: [R] recursive relevel

Dear list,

I'm having second thoughts after solving a very trivial problem: I
want to extend the relevel() function to reorder an arbitrary number
of levels of a factor in one go. I could not find a trivial way of
reusing the code obtained by getS3method("relevel", "factor"). Instead, I
thought of solving the problem recursively (possibly after
reading Paul Graham's essays on Lisp too recently). Here is my attempt:


order.factor <- function (x, ref)
{
    last.index <- length(ref) # convenience for matlab's end keyword
    if (last.index == 1) return(relevel(x, ref)) # end case, normal case of relevel
    # creating a list with updated parameters,
    # going through the list in reverse order
    my.new.list <- list(x = relevel(x, ref[last.index]),
                        ref = ref[-last.index]) # chop the vector from its last level
    return(do.call(order.factor, my.new.list)) # recursive call
}

ff <- factor(c("a", "b", "c", "d"))
ff
relevel(ff, levels(ff)[1])
# that's the usual case: you want to put a level first
relevel(ff, levels(ff)[2])

order.factor(x=ff, ref=c("a", "b"))
order.factor(x=ff, ref=c("c"))
# that's my wish: put c and d in that order as the first two levels
order.factor(x=ff, ref=c("c", "d"))



I'm hoping this can be improved in several aspects:

- there is probably already a better function I missed or overlooked
(I'd still be curious about the following points, though)

- after reading a few threads, it appears that some recursive
functions are fragile in some sense, and I'm not sure what this means
in practice. (Should I use Recall, somehow?)

- it's probably quite slow for large data.frames

- I could not think of a good name, this one might clash with some S3
method perhaps?

- any other thoughts welcome!
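
On the Recall question, one possible reading of "fragile" is that a function calling itself by name stops working if that name is later removed or reassigned, because the recursive call is looked up by name at run time. A minimal sketch using Recall(), which calls whichever function is currently executing; the name order_factor_recall is hypothetical:

```r
# Hypothetical name; same logic as order.factor, with Recall() in place
# of a call to the function by its own name
order_factor_recall <- function(x, ref) {
    n <- length(ref)
    if (n == 1) return(relevel(x, ref))   # end case: plain relevel()
    # move the last requested level to the front, then recurse on the rest
    Recall(relevel(x, ref[n]), ref[-n])
}

ff <- factor(c("a", "b", "c", "d"))

# The recursion still works after the function is copied and the
# original name is removed
g <- order_factor_recall
rm(order_factor_recall)
levels(g(ff, c("c", "d")))  # "c" "d" "a" "b"
```

With a self-call by name, the last line would fail once the original name is gone.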


Best wishes,

Baptiste
_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.

