On Mon, Feb 13, 2012 at 10:05:02AM -0800, zheng wei wrote:
> Dear All,
>
> Sorry for the typos earlier, let me repost the question.
>
> Suppose I want to generate sequences of length 3 from two symbols {1,2}, we
> get the following 8 sequences
> 1 1 1
> 1 1 2
> 1 2 1
> 1 2 2
> 2 1 1
> 2 1 2
> 2 2 1
> 2 2 2
>
> However, I do not want all these 8 sequences. I call two sequences
> isomorphic if one sequence can be obtained from the other by relabelling
> the symbols. For example, 111 is isomorphic to 222, 112 is isomorphic to
> 221. By eliminating all these isomorphic ones, what I want is the following
> 1 1 1
> 1 1 2
> 1 2 1
> 2 1 1
Eliminating isomorphic sequences may be done differently,
if we select different representatives of each equivalence
class. The following also eliminates isomorphic 1,2 sequences
1 1 1
1 1 2
1 2 1
1 2 2
Is this solution OK?
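To make the choice of representatives concrete, here is a minimal sketch of one way to map any sequence to the representative used above, by numbering symbols in order of first appearance. The helper name canon is hypothetical, not something from the original question.

```r
# Hypothetical helper: relabel symbols in order of first appearance,
# so the first distinct symbol becomes 1, the next new one 2, and so on.
canon <- function(x) match(x, unique(x))

canon(c(2, 2, 1))   # 1 1 2
canon(c(2, 2, 2))   # 1 1 1

# Two sequences are isomorphic exactly when their canonical forms agree.
identical(canon(c(2, 2, 1)), canon(c(1, 1, 2)))   # TRUE

# Reduce all 8 sequences over {1,2} to their distinct representatives.
all8 <- as.matrix(expand.grid(1:2, 1:2, 1:2))
reps <- unique(t(apply(all8, 1, canon)))
nrow(reps)          # 4
```

Every representative produced this way starts with 1, which is why 1 2 2 appears in the list above instead of 2 1 1.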
> In general, I need to generate non-isomorphic sequences of length p from t
> distinct symbols. For example, when p=3, t=3 we have
> matrix(c(1,2,3,1,1,2,2,1,1,1,2,1,1,1,1),3,5)
>
> [1,]    1    1    2    1    1
> [2,]    2    1    1    2    1
> [3,]    3    2    1    1    1
>
> When p=4, t=4 we have
> matrix(c(1,2,3,4,1,1,2,3,1,2,1,3,1,2,3,1,2,1,1,3,2,3,1,1,2,1,3,1,1,1,2,2,1,2,1,2,1,2,2,1,1,1,1,2,1,1,2,1,1,2,1,1,2,1,1,1,1,1,1,1),4,15)
>
> [1,]    1    1    1    1    2    2    2    1    1     1     1     1     1     2     1
> [2,]    2    1    2    2    1    3    1    1    2     2     1     1     2     1     1
> [3,]    3    2    1    3    1    1    3    2    1     2     1     2     1     1     1
> [4,]    4    3    3    1    3    1    1    2    2     1     2     1     1     1     1
>
>
> In general, I need to do this for arbitrary choices of p and t.
If p and t are not too large, try the following
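# A row is a canonical representative if its symbols, in order of
# first appearance, are exactly 1, 2, 3, ...
check.row <- function(x)
{
    y <- unique(x)
    all(y == seq_along(y))
}
p <- 4
tt <- 4 # named tt so that t(), the transpose function, is not masked
# Position i may only hold symbols 1..min(i, tt)
elem <- lapply(pmin(1:p, tt), seq_len)
# Reversing twice makes the last position vary fastest (lexicographic order)
a <- as.matrix(rev(expand.grid(rev(elem))))
ok <- apply(a, 1, check.row)
out <- a[ok, , drop = FALSE]
out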
Var4 Var3 Var2 Var1
[1,] 1 1 1 1
[2,] 1 1 1 2
[3,] 1 1 2 1
[4,] 1 1 2 2
[5,] 1 1 2 3
[6,] 1 2 1 1
[7,] 1 2 1 2
[8,] 1 2 1 3
[9,] 1 2 2 1
[10,] 1 2 2 2
[11,] 1 2 2 3
[12,] 1 2 3 1
[13,] 1 2 3 2
[14,] 1 2 3 3
[15,] 1 2 3 4
This solution differs from yours, for example, in
the row c(1, 2, 3, 3), which is in your solution
represented by c(2, 3, 1, 1). This is a different choice
of the representatives. Is the choice important?
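If p and t grow, filtering the full grid may become slow, since most candidate rows are discarded. As a rough sketch of an alternative, the same representatives can be built position by position: the next symbol may be anything from 1 up to one more than the largest symbol used so far, capped at tt. The function name gen.canon is hypothetical, and this is only one possible way to do it.

```r
# Hypothetical helper: build the canonical sequences directly,
# without generating and filtering the full grid.
gen.canon <- function(p, tt) {
    out <- matrix(1L, nrow = 1, ncol = 1)      # the first symbol is always 1
    for (i in seq_len(p - 1)) {
        mx  <- apply(out, 1, max)              # largest symbol used in each row
        nxt <- pmin(mx + 1L, tt)               # allowed next symbols: 1..nxt
        idx <- rep(seq_len(nrow(out)), times = nxt)
        new <- unlist(lapply(nxt, seq_len))
        out <- cbind(out[idx, , drop = FALSE], new)
    }
    unname(out)
}

nrow(gen.canon(4, 4))   # 15
nrow(gen.canon(3, 3))   # 5
gen.canon(3, 2)         # rows 1 1 1, 1 1 2, 1 2 1, 1 2 2
```

The rows come out in the same lexicographic order as in the filtering approach, so the two results can be compared directly.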
A related thread started at
https://stat.ethz.ch/pipermail/r-help/2012-January/301489.html
There was an additional requirement that each of t symbols
has at least one occurrence.
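Assuming that requirement means exactly what it says, it is easy to add on top of the approach above: with these representatives, a row uses all tt symbols exactly when its largest entry equals tt. A minimal sketch, with p = 4 and tt = 3 chosen only for illustration:

```r
# Same filtering approach as above, repeated so the example is self-contained.
check.row <- function(x) {
    y <- unique(x)
    all(y == seq_along(y))
}
p <- 4
tt <- 3
elem <- lapply(pmin(1:p, tt), seq_len)
a <- as.matrix(rev(expand.grid(rev(elem))))
out <- a[apply(a, 1, check.row), , drop = FALSE]

# Keep only the rows in which every one of the tt symbols occurs.
full <- out[apply(out, 1, max) == tt, , drop = FALSE]
nrow(out)    # 14
nrow(full)   # 6
```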
Hope this helps.
Petr Savicky.