Try this:

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()
corr_mat <-as.matrix(read.table(textConnection("1 1   .5  0   0   0   0   0   0   0
2 .5   1  0   0   0   0   0   0   0
3 0    0  1.0   0   0   0   0   0   0
4 0    0  0   1   .5  .5  0   0   0
5 0    0  0   .5  1    .5  0   0   0
6 0    0  0   .5  .5   1 0    0   0
7 0    0  0   0    0   0  1   0  0
8 0   0   0   0    0   0   0  1  .5
9 0   0   0   0   0    0   0  .5 1"),header=FALSE))
closeAllConnections()
corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id
# split out the groups
groups <- split(as.character(myDat$id), myDat$group)
# process each subgroup
result <- lapply(groups, function(.grp){
    subgroup <- corr_mat[.grp, .grp]
    output <- NULL
    # zero the diag
    diag(subgroup) <- 0
    same <- apply(subgroup, 1, function(x) any(x != 0))
    if (any(same)){  # some match, choose one
        output <- sample(same[same], 1)
    }
    if (any(!same)){  # get all that don't correlate
        output <- c(output, same[!same])
    }
    output
})
# output as a two-column character matrix (group, id)
do.call(rbind, lapply(names(result), function(x)
    cbind(group = x, id = names(result[[x]]))))
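
Note that the code above picks a single id from all related members of a group, which suits the example because each group has at most one cluster of related ids. If a group could contain two separate related pairs, a greedy pass is one alternative. The following is only a sketch under that assumption: `pick_unrelated` is a made-up helper name, the data are rebuilt inline to match the posted example, and the result is a valid (mutually unrelated) subset but not guaranteed to be the largest possible one.

```r
# Rebuild the example data inline (same values as the posted example)
myDat <- data.frame(group = rep(1:3, each = 3),
                    id = c(101, 201, 301, 401, 501, 601, 701, 801, 901))
corr_mat <- diag(9)
dimnames(corr_mat) <- list(myDat$id, myDat$id)
# the related pairs (off-diagonal .5 entries) from the posted matrix
related <- rbind(c("101", "201"), c("401", "501"), c("401", "601"),
                 c("501", "601"), c("801", "901"))
corr_mat[related] <- 0.5
corr_mat[related[, 2:1]] <- 0.5   # keep the matrix symmetric

# hypothetical helper: walk the ids in random order and keep an id
# only if it is unrelated (correlation 0) to every id already kept
pick_unrelated <- function(ids, cmat) {
    ids <- as.character(ids)
    ids <- ids[sample.int(length(ids))]
    kept <- character(0)
    for (id in ids) {
        if (all(cmat[id, kept] == 0)) kept <- c(kept, id)
    }
    kept
}

set.seed(1)  # only so the random choice is repeatable
keep <- unlist(lapply(split(myDat$id, myDat$group),
                      pick_unrelated, cmat = corr_mat))
# subset the original data frame so the result has the group/id layout
final <- myDat[as.character(myDat$id) %in% keep, ]
final
```

Which of 101/201, 401/501/601 and 801/901 is kept depends on the random order, but 301 and 701 are always retained because they are unrelated to everything in their group.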



On Mon, Dec 7, 2009 at 7:38 PM, Juliet Hannah <juliet.han...@gmail.com> wrote:

> Hi List,
>
> Here is some example data.
>
> myDat <- read.table(textConnection("group id
> 1 101
> 1 201
> 1 301
> 2 401
> 2 501
> 2 601
> 3 701
> 3 801
> 3 901"),header=TRUE)
> closeAllConnections()
>
> corr_mat <-read.table(textConnection("1 1   .5  0   0   0   0   0   0   0
> 2 .5   1  0   0   0   0   0   0   0
> 3 0    0  1.0   0   0   0   0   0   0
> 4 0    0  0   1   .5  .5  0   0   0
> 5 0    0  0   .5  1    .5  0   0   0
> 6 0    0  0   .5  .5   1 0    0   0
> 7 0    0  0   0    0   0  1   0  0
> 8 0   0   0   0    0   0   0  1  .5
> 9 0   0   0   0   0    0   0  .5 1"),header=FALSE)
> closeAllConnections()
>
> corr_mat <- corr_mat[,-1]
> colnames(corr_mat) <- myDat$id
> rownames(corr_mat) <- myDat$id
>
> I need to subset this data such that observations within a group are not
> related, which is indicated by a 0 in corr_mat.
>
> For example, within group 1, 101 and 201 are related, so one of these
> has to be selected, say
> 101. 301 is not related to 101 or 201, so the final set for group 1
> consists of 101 and 301. There will always be at least 2 members in
> each group. I need to carry this task on all groups.
>
> One possible final data set looks like:
>
>  group  id
> 1     1 101
> 3     1 301
> 4     2 401
> 7     3 701
> 8     3 801
>
> Any suggestions? Thanks!
>
> Juliet
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?