Hello,

my data is sorted by start.ens (see below). I would like to extract all
rows with type == "Expression" (spelled "Expresssion" in the test data
below; I call these the defined rows), as in
subset(df, type == "Expression"), together with the preceding
type == "DNase HS" row for each of them. That row is not necessarily
row n-1, assuming the defined row is row n. I don't know how to add
this to my subset command.

Is that possible?
Thanks,
Hermann

> df
   start.ens fc.trans        type  end.ens peak end.grcm38 dpeak
1    9191942   0.9379 Expresssion       NA   NA         NA    NA
2    9191942   0.9741 Expresssion       NA   NA         NA    NA
3    9191942   0.9748 Expresssion       NA   NA         NA    NA
4    9195570       NA    DNase HS       NA   NA    9195792   109
5    9579854       NA    DNase HS       NA   NA    9580110   131
6   11088023       NA        p300 11088523    7         NA    NA
7   11113787       NA    DNase HS       NA   NA   11114262   279
8   11114744   0.9803 Expresssion       NA   NA         NA    NA
9   11114744   0.9904 Expresssion       NA   NA         NA    NA
10  11114850       NA    DNase HS       NA   NA   11115400   210
11  11455056       NA    DNase HS       NA   NA   11455381   175
12  11461513       NA    DNase HS       NA   NA   11462571   508
13  11462408   1.0129 Expresssion       NA   NA         NA    NA
14  11462408   1.0074 Expresssion       NA   NA         NA    NA
15  11489266   1.0019 Expresssion       NA   NA         NA    NA

My (test) data:
> dput(df)
structure(list(start.ens = c(9191942L, 9191942L, 9191942L, 9195570L,
9579854L, 11088023L, 11113787L, 11114744L, 11114744L, 11114850L,
11455056L, 11461513L, 11462408L, 11462408L, 11489266L), fc.trans = c(0.9379,
0.9741, 0.9748, NA, NA, NA, NA, 0.9803, 0.9904, NA, NA, NA, 1.0129,
1.0074, 1.0019), type = structure(c(2L, 2L, 2L, 1L, 1L, 3L, 1L,
2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("DNase HS", "Expresssion",
"p300"), class = "factor"), end.ens = c(NA, NA, NA, NA, NA, 11088523L,
NA, NA, NA, NA, NA, NA, NA, NA, NA), peak = c(NA, NA, NA, NA,
NA, 7L, NA, NA, NA, NA, NA, NA, NA, NA, NA), end.grcm38 = c(NA,
NA, NA, 9195792L, 9580110L, NA, 11114262L, NA, NA, 11115400L,
11455381L, 11462571L, NA, NA, NA), dpeak = c(NA, NA, NA, 109L,
131L, NA, 279L, NA, NA, 210L, 175L, 508L, NA, NA, NA)), .Names = c("start.ens",
"fc.trans", "type", "end.ens", "peak", "end.grcm38", "dpeak"),
row.names = c(NA, -15L), class = "data.frame")
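
To make the request concrete, here is a minimal sketch of the selection I have in mind. It is only one possible reading, not a tested solution: it assumes that "preceding" means the nearest DNase HS row above each Expression row in the sorted data, and that Expression rows with no DNase HS row above them are kept on their own. The helper names (is.expr, is.dnase, last.dnase, prev.idx, keep, result) are made up for illustration, and only the two columns needed for the selection are rebuilt.

```r
# Test data rebuilt from the dput() output above (only the columns used here).
df <- data.frame(
  start.ens = c(9191942L, 9191942L, 9191942L, 9195570L, 9579854L, 11088023L,
                11113787L, 11114744L, 11114744L, 11114850L, 11455056L,
                11461513L, 11462408L, 11462408L, 11489266L),
  type = factor(c("Expresssion", "Expresssion", "Expresssion", "DNase HS",
                  "DNase HS", "p300", "DNase HS", "Expresssion", "Expresssion",
                  "DNase HS", "DNase HS", "DNase HS", "Expresssion",
                  "Expresssion", "Expresssion"))
)

is.expr  <- df$type == "Expresssion"   # spelling as it appears in the data
is.dnase <- df$type == "DNase HS"

# For every row: index of the most recent DNase HS row at or above it
# (0 while no DNase HS row has been seen yet).
last.dnase <- cummax(ifelse(is.dnase, seq_len(nrow(df)), 0L))

# DNase HS rows that are the nearest one above at least one Expression row.
prev.idx <- unique(last.dnase[is.expr])
prev.idx <- prev.idx[prev.idx > 0]

# Expression rows plus their preceding DNase HS rows, in the original order.
keep   <- sort(c(which(is.expr), prev.idx))
result <- df[keep, ]
```

With the test data, this would keep the Expression rows and, of the DNase HS rows, only rows 7 and 12.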


