On Oct 25, 2011, at 6:42 AM, Assa Yeroslaviz wrote:

Hi everybody,

I would like to know whether it is possible to compare two tables on certain
parameters.
I have these two tables:
gene table
name     chr     start     end     str     accession     Length
gen1     4     646752     646838     +     MI0005806     86
gen12     2L     243035     243141     -     MI0005821     106
gen3     2L     159838     159928     +     MI0005813     90
gen7     2L     1831685     1831799     -     MI0011290     114
gen4     2L     2737568     2737661     +     MI0017696     93
...

localization table:
Chr     Start     End     length
4     136532     138654     2122
3     139870     141970     2100
2L     157838     158440     602
X     160834     162966     2132
4     204040     208536     4496
...

I would like to check whether a specific gene lies within a certain region. For example, I want to see if gene 3 on chromosome 2L lies within one of the regions
given in the second table.


rd.txt <- function(txt, header=TRUE, ...) {
     con <- textConnection(txt)
     on.exit(close(con))   # close only this connection, not every open one
     read.table(con, header=header, ...) }
# Data input
genetable <- rd.txt("name chr start end str accession Length
 gen1     4     646752     646838     +     MI0005806     86
 gen12     2L     243035     243141     -     MI0005821     106
 gen3     2L     159838     159928     +     MI0005813     90
 gen7     2L     1831685     1831799     -     MI0011290     114
 gen4     2L     2737568     2737661     +     MI0017696     93")
 loctable <- rd.txt("Chr     Start     End     length
 4     136532     138654     2122
 3     139870     141970     2100
 2L     157838     158440     602
 X     160834     162966     2132
 4     204040     208536     4496")

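# Helper function: is the gene in row i inside any region on the same chromosome?
# (indexing by row avoids apply(), which coerces mixed-type rows to character)
 inregion <- function(i, genes, locs) {
any( as.character(locs$Chr) == as.character(genes$chr[i]) &
     genes$start[i] > locs$Start & genes$end[i] <= locs$End ) }
# Test the function
 inregion(2, genetable, loctable)
# [1] FALSE

sapply(seq_len(nrow(genetable)), inregion, genes=genetable, locs=loctable)
#[1] FALSE FALSE FALSE FALSE FALSE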

The logical vector can be used to extract rows from genetable, but it seems pointless to offer code that produces an empty data frame.

(Wouldn't it have been more sensible to offer a test case that had a combination that satisfied your requirements?)
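
A minimal sketch of such a test case, for illustration only. The gene "genX" is hypothetical, invented here so that one row falls inside the 2L region. Matching on chromosome and treating the region bounds as inclusive are both assumptions.

```r
# Reduced copies of the two tables; genX is a hypothetical gene added for the test
genes <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "name chr start end
gen3  2L  159838 159928
genX  2L  157900 158000
gen1  4   646752 646838")
regions <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Chr Start End
4   136532 138654
2L  157838 158440")

# TRUE when gene i lies inside any region on the same chromosome
hit <- sapply(seq_len(nrow(genes)), function(i) {
  any(regions$Chr == genes$chr[i] &
      genes$start[i] >= regions$Start &
      genes$end[i]   <= regions$End)
})

# Use the logical vector to extract the matching rows
genes[hit, ]
```

With this input only the genX row is returned; gen3 starts inside the 2L region but ends beyond it.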

I'm guessing that this facility would already be implemented in one or more BioConductor functions.

--
David.

What I would like to do is:
1. check if the gene lies on a specific chromosome
1.a if no - go to the next line
1.b if yes - go to 2
2. check if the start position of the gene is bigger than the start position in the localization table AND if it is smaller than the end position (i.e. if it
lies between the start and end positions in the localization table)
2.a if no - go to the next gene
2.b if yes - give it to me.

I was having difficulties doing it without running into three nested
conditional blocks (if).
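
One way to avoid nested conditionals, sketched here under assumptions: merge() pairs each gene only with the regions on its own chromosome (step 1), and a vectorized comparison then keeps the pairs whose gene start lies between the region start and end (step 2). The gene "genX" is hypothetical, added only so that the example returns a row. The strict inequalities follow the wording of step 2.

```r
# Reduced copies of the two tables; genX is a hypothetical gene added for illustration
genetable <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "name chr start end
gen1  4   646752 646838
gen3  2L  159838 159928
genX  2L  157900 158000")
loctable <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Chr Start End
4   136532 138654
2L  157838 158440
X   160834 162966")

# Step 1: join on chromosome, so genes are only compared with regions on the same one
pairs <- merge(genetable, loctable, by.x = "chr", by.y = "Chr")

# Step 2: keep pairs where the gene start lies strictly between region Start and End
inside <- pairs[pairs$start > pairs$Start & pairs$start < pairs$End, ]
inside
```

Because R column names are case-sensitive, start/end (genes) and Start/End (regions) stay distinct after the merge. For large tables the joined data frame can grow quickly, since every gene is paired with every region on its chromosome.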

I would appreciate any help.

Thanks

Assa

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
