Dear R community,

For 100 sites on human chromosomes, I ran two tests. The first treats an 
experimental measurement as a continuous variable (multiple regression); the 
second compares the top 25% of samples to the bottom 25%, ranked by the 
measured variable (categorical analysis). A total of 16 sites show 
significance. In the results below I show only five variables (site, region, 
test, chr, start). I need to add a sixth variable, "common", that labels each 
common region (there are 2 in this example) where the p value is significant 
in both tests.

In the second "common" region, chr (chromosome) is the same (chr 1) and the 
start location is also the same for all six sites (three from the categorical 
analysis and three from the continuous analysis); only the end location (not 
known) differs, so I labeled them as one common region. In the first "common" 
region, the sites are all on chromosome 1 and the start locations are not 
identical, but they differ by less than 1000 base pairs, so I treat them as 
the same chromosome region.

I started from the SAS first.location idea and then used the R cumsum 
function, which I learned from Bert. By comparing the region variable with 
the num.location variable, I can identify the second common region, although 
I have not figured out how to label it in R. I have no idea how to find the 
first "common" region.

Can you help me?

Thank you very much!!

Ding

# "common" holds the labels I assigned by hand (the desired output)
common <- c(NA, NA, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, NA, NA, NA);
site <- seq(1, 16);
region <- c(1, 1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6);
test <- c("categorical", "categorical", "continuous", "continuous",
          "continuous", "categorical", "categorical", "continuous",
          "continuous", "continuous", "categorical", "categorical",
          "categorical", "continuous", "continuous", "continuous");
chr <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2);
start <- c(3229921, 3229921, 16553549, 16553549, 16553549, 16554171,
           16554171, 32826843, 32826843, 32826843, 32826843, 32826843,
           32826843, 30669385, 30669385, 30669385);
dat <- data.frame(common, site, region, test, chr, start,
                  stringsAsFactors = FALSE);

dat$first.location <- !duplicated(dat$start);
dat$num.location <-cumsum(!duplicated(dat$start));
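
One possible way to label both kinds of common region is sketched below. It is 
only an illustration under stated assumptions, not a tested solution for the 
full 100-site data: sites are grouped when they are on the same chromosome and 
each start is less than 1000 bp from the previous start (so gaps are chained 
from site to site, not measured from the first site in the group), and a 
group counts as "common" only if both tests appear in it. The names max_gap, 
cluster and common.auto are made up for this sketch.

```r
# Same example data as above, rebuilt compactly so the sketch runs on its own.
dat <- data.frame(
  site  = 1:16,
  test  = rep(c("categorical", "continuous", "categorical",
                "continuous", "categorical", "continuous"),
              times = c(2, 3, 2, 3, 3, 3)),
  chr   = rep(c(1, 2), times = c(13, 3)),
  start = rep(c(3229921, 16553549, 16554171, 32826843, 30669385),
              times = c(2, 3, 2, 6, 3)),
  stringsAsFactors = FALSE
)

max_gap <- 1000  # assumption: starts closer than this share a region

# Sort by chromosome and start so neighbouring sites sit next to each other.
ord <- order(dat$chr, dat$start)
d   <- dat[ord, ]

# A new cluster begins at the first row, whenever the chromosome changes,
# or whenever the start jumps by max_gap or more from the previous row.
new_cluster <- c(TRUE, diff(d$chr) != 0 | diff(d$start) >= max_gap)
d$cluster   <- cumsum(new_cluster)

# A cluster is "common" only if it contains more than one kind of test.
has_both <- sapply(split(d$test, d$cluster),
                   function(x) length(unique(x)) > 1)

# Number the qualifying clusters 1, 2, ...; all other clusters get NA.
common_id <- ifelse(has_both, cumsum(has_both), NA)
d$common  <- unname(common_id[as.character(d$cluster)])

# Write the labels back in the original row order.
dat$common.auto <- NA_integer_
dat$common.auto[ord] <- d$common
```

For the 16 example sites, common.auto should match the hand-assigned "common" 
column: sites 3 to 7 form region 1 (starts 622 bp apart), sites 8 to 13 form 
region 2, and the remaining sites are NA because only one test is present.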



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
