I've written a recursive function to extract the members of an individual cluster within a hierarchical clustering. I have something that works, but the return value has a list structure I don't like. I know how to work around this with 'unlist()', but I suspect the function itself could be fixed. Can anyone show me how?

Thanks,
Jenny Bryan

Demo of my problem --

(Note: although my question has nothing to do with hierarchical clustering per se, my example does assume knowledge of the 'merge' object.)

## faking the key aspects of an hclust object
myClust <-
  list(merge = rbind(c(-1, -2),
         c(-3, -4),
         c(2, -5),
         c(1, 3),
         c(4, -6)),
       height = 1:5,
       order = 1:6)

## plot the example / fake tree
stats:::plot.hclust(myClust, hang = -1)

## recursive function to extract members of a cluster
## 'sapply' version
clMembFun1 <- function(x) {
  if(x < 0) {
    -x
  } else {
    sapply(1:2, function(j) clMembFun1(myClust$merge[x,j]))
  }
}

Here's a transcript of using clMembFun1:

> ## trivial case of cluster = 2 singletons is OK
> clMembFun1(1)
[1] 1 2
> str(clMembFun1(1))                      # num vector
 num [1:2] 1 2

> ## case of cluster that contains a cluster --> list
> clMembFun1(3)
[[1]]
[1] 3 4

[[2]]
[1] 5

> str(clMembFun1(3))
List of 2
 $ : num [1:2] 3 4
 $ : num 5

> ## now the list also has 2D matrix structure
> clMembFun1(4)
     [,1] [,2]
[1,] 1    Numeric,2
[2,] 2    5
> str(clMembFun1(4))
List of 4
 $ : num 1
 $ : num 2
 $ : num [1:2] 3 4
 $ : num 5
 - attr(*, "dim")= int [1:2] 2 2

> ## and it just gets worse
> clMembFun1(5)
[[1]]
     [,1] [,2]
[1,] 1    Numeric,2
[2,] 2    5

[[2]]
[1] 6

> str(clMembFun1(5))
List of 2
 $ :List of 4
  ..$ : num 1
  ..$ : num 2
  ..$ : num [1:2] 3 4
  ..$ : num 5
  ..- attr(*, "dim")= int [1:2] 2 2
 $ : num 6

I know one workaround is to 'unlist' the return value:

> ## post hoc fix
> unlist(clMembFun1(3))
[1] 3 4 5
> unlist(clMembFun1(4))
[1] 1 2 3 4 5
> unlist(clMembFun1(5))
[1] 1 2 3 4 5 6

But can the function itself be fixed/improved?
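
One direction that looks plausible, offered only as a sketch: 'sapply' tries to simplify its result into a vector, matrix, or list depending on the lengths it gets back, which would explain the changing structure above. Concatenating the two recursive results with 'c()' avoids that simplification step. The name 'clMembFun2' and the 'merge' argument are assumptions made for illustration; the fake 'myClust' object is repeated so the block runs on its own.

```r
## same fake merge matrix as in the demo above
myClust <-
  list(merge = rbind(c(-1, -2),
         c(-3, -4),
         c(2, -5),
         c(1, 3),
         c(4, -6)),
       height = 1:5,
       order = 1:6)

## 'c' version (hypothetical name): joining the two recursive results
## with c() keeps the return value a flat numeric vector at every level
clMembFun2 <- function(x, merge = myClust$merge) {
  if(x < 0) {
    -x                                   # singleton: negative index = observation
  } else {
    c(clMembFun2(merge[x, 1], merge),    # members of left child
      clMembFun2(merge[x, 2], merge))    # members of right child
  }
}

clMembFun2(3)
clMembFun2(4)
clMembFun2(5)
```

Under these assumptions the three calls should give the same flat vectors as the 'unlist' workaround, without any post hoc step.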

I also tried using a 'for' loop instead of 'sapply', but that suffered from fatal problems (maybe I didn't implement it correctly?).
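
For comparison, here is a loop-based sketch in which each call appends to its own local accumulator. It is an illustration only, not a diagnosis of the earlier attempt; the name 'clMembFun3', the 'merge' argument, and the 'members' variable are all hypothetical.

```r
## same fake merge matrix as in the demo above
myMerge <- rbind(c(-1, -2),
                 c(-3, -4),
                 c(2, -5),
                 c(1, 3),
                 c(4, -6))

## 'for' version (hypothetical name): 'members' is local to each call,
## so recursive calls cannot overwrite one another's partial results
clMembFun3 <- function(x, merge) {
  if(x < 0) return(-x)                   # singleton
  members <- numeric(0)
  for(j in 1:2) {
    members <- c(members, clMembFun3(merge[x, j], merge))
  }
  members
}

clMembFun3(5, myMerge)
```

Passing the merge matrix as an argument, rather than reading a global object, is a design choice made here so the function can be applied to any hclust-like object.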

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
