On Mon, Feb 13, 2012 at 10:05:02AM -0800, zheng wei wrote:
> Dear All,
>
> Sorry for the typos earlier, let me repost the question.
>
> Suppose I want to generate sequences of length 3 from two symbols {1,2}, we
> get the following 8 sequences
> 1 1 1
> 1 1 2
> 1 2 1
> 1 2 2
> 2 1 1
> 2 1 2
> 2 2 1
> 2 2 2
>
> However, I do not want all these 8 sequences. I call two sequences
> isomorphic if one sequence could be obtained from the other by relabelling
> the symbols. For example, 111 is isomorphic to 222, 112 is isomorphic to
> 221. By eliminating all these isomorphic ones, what I want is the following
> 1 1 1
> 1 1 2
> 1 2 1
> 2 1 1

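As an aside, the isomorphism described in the question can be tested by relabelling the symbols of each sequence as 1, 2, 3, ... in order of first appearance and comparing the results. The following is only a minimal sketch; the names canon() and isomorphic() are hypothetical helpers introduced here, not functions from base R or from the original post.

```r
# canon() is a hypothetical helper name, not part of base R.
# It relabels the symbols of x as 1, 2, 3, ... in order of first appearance.
canon <- function(x) match(x, unique(x))

# Two sequences are isomorphic exactly when their relabelled forms agree.
isomorphic <- function(x, y) identical(canon(x), canon(y))

isomorphic(c(1, 1, 2), c(2, 2, 1))   # TRUE
isomorphic(c(1, 1, 2), c(1, 2, 1))   # FALSE
canon(c(2, 3, 1, 1))                 # 1 2 3 3
```

Under this relabelling, every equivalence class has exactly one member that is left unchanged by canon(), and that member can serve as the representative of its class.
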
Eliminating isomorphic sequences may be done differently, if we select
different representatives of each equivalence class. The following
also eliminates isomorphic 1,2 sequences

  1 1 1
  1 1 2
  1 2 1
  1 2 2

Is this solution OK?

> In general, I need to generate non-isomorphic sequences of length p from t
> distinct symbols. For example, when p=3, t=3 we have
> matrix(c(1,2,3,1,1,2,2,1,1,1,2,1,1,1,1),3,5)
>
> [1,]    1    1    2    1    1
> [2,]    2    1    1    2    1
> [3,]    3    2    1    1    1
>
> When p=4, t=4 we have
> matrix(c(1,2,3,4,1,1,2,3,1,2,1,3,1,2,3,1,2,1,1,3,2,3,1,1,2,1,3,1,1,1,2,2,1,2,1,2,1,2,2,1,1,1,1,2,1,1,2,1,1,2,1,1,2,1,1,1,1,1,1,1),4,15)
>
> [1,]    1    1    1    1    2    2    2    1    1     1     1     1     1     2     1
> [2,]    2    1    2    2    1    3    1    1    2     2     1     1     2     1     1
> [3,]    3    2    1    3    1    1    3    2    1     2     1     2     1     1     1
> [4,]    4    3    3    1    3    1    1    2    2     1     2     1     1     1     1
>
> In general, I need to do this for arbitrary choices of p and t.

If p and t are not too large, try the following

  # a row is kept if its symbols first appear in the order 1, 2, 3, ...
  check.row <- function(x)
  {
      y <- unique(x)
      all(y == seq_along(y))
  }

  p <- 4
  tt <- 4 # named tt so as not to mask t(), the transpose function
  # position i may only use the symbols 1, ..., min(i, tt)
  elem <- lapply(as.list(pmin(1:p, tt)), function(x) seq_len(x))
  # all candidate rows, with the last position varying fastest
  a <- as.matrix(rev(expand.grid(rev(elem))))
  ok <- apply(a, 1, check.row)
  out <- a[ok, ]
  out

        Var4 Var3 Var2 Var1
   [1,]    1    1    1    1
   [2,]    1    1    1    2
   [3,]    1    1    2    1
   [4,]    1    1    2    2
   [5,]    1    1    2    3
   [6,]    1    2    1    1
   [7,]    1    2    1    2
   [8,]    1    2    1    3
   [9,]    1    2    2    1
  [10,]    1    2    2    2
  [11,]    1    2    2    3
  [12,]    1    2    3    1
  [13,]    1    2    3    2
  [14,]    1    2    3    3
  [15,]    1    2    3    4

This solution differs from yours, for example, in the row c(1, 2, 3, 3),
which is in your solution represented by c(2, 3, 1, 1). This is a
different choice of the representatives. Is the choice important?

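The grid-and-filter approach above builds prod(pmin(1:p, tt)) candidate rows before discarding most of them. If that becomes too large, the same representatives could be built directly, one position at a time, by allowing each next entry to be at most one larger than the largest symbol used so far, capped at tt. This is only a sketch under that assumption; gen.canonical() is a hypothetical name, and it assumes p >= 1 and tt >= 1.

```r
# gen.canonical() is a hypothetical name; p = sequence length,
# tt = number of available symbols. Assumes p >= 1 and tt >= 1.
gen.canonical <- function(p, tt) {
  out <- matrix(1L, nrow = 1, ncol = 1)      # the first entry is always 1
  for (j in seq_len(p - 1)) {
    m <- apply(out, 1, max)                  # largest symbol in each prefix
    k <- pmin(m + 1L, tt)                    # number of allowed next symbols
    idx <- rep(seq_len(nrow(out)), times = k) # repeat each prefix k times
    nxt <- sequence(k)                       # 1..k[1], 1..k[2], ...
    out <- cbind(out[idx, , drop = FALSE], nxt)
  }
  unname(out)                                # drop the column name added by cbind
}

nrow(gen.canonical(4, 4))   # 15
gen.canonical(3, 2)         # rows 1 1 1, 1 1 2, 1 2 1, 1 2 2
```

No row is ever generated and then thrown away, so the work is proportional to the size of the result rather than to the size of the full grid.
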
A related thread started at

  https://stat.ethz.ch/pipermail/r-help/2012-January/301489.html

There was an additional requirement that each of t symbols has
at least one occurrence.

Hope this helps.

Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help