On 1/29/20 13:14, Jianhong Ou, Ph.D. wrote:
> Oh, I forgot that. Thank you for the reminder.
> Then how about:
>
> distance(query, narrow(subject, start=2, end=-2)) == 0
>
> ?

Yep, that's more accurate. With the following gotcha:
'narrow(subject, start=2, end=-2)' will fail if 'subject' contains
ranges that cover fewer than 2 positions. Not an unlikely situation,
e.g. if 'subject' contains TSS!

I just feel that distance() is not really appropriate to detect overlaps.

H.

>
> On 1/29/20, 12:40 PM, "Pages, Herve" <hpa...@fredhutch.org> wrote:
>
>     On 1/29/20 08:04, Jianhong Ou, Ph.D. wrote:
>     > Try
>     > dist=distance(query, subject)
>     > dist==0
>     > ?
>
>     Please be aware that dist==0 does NOT mean that 2 ranges overlap. It
>     means that they overlap OR are **adjacent**:
>
>        > distance(GRanges("chr1:1-20"), GRanges("chr1:21-25"))
>        [1] 0
>
>     H.
>
>     >
>     > On 1/29/20, 10:50 AM, "Bioc-devel on behalf of web working"
>     > <bioc-devel-boun...@r-project.org on behalf of webwork...@posteo.de> wrote:
>     >
>     >     Hello,
>     >
>     >     I have two big GRanges objects and want to search for an overlap of the
>     >     first range of query with the first range of subject. Then take the
>     >     second range of query and compare it with the second range of subject
>     >     and so on. Here is an example of my problem:
>     >
>     >     # GRanges objects
>     >     query <- GRanges(rep("chr1", 4), IRanges(c(1, 5, 9, 20), c(2, 6, 10, 22)), id=1:4)
>     >     subject <- GRanges(rep("chr1",4), IRanges(c(3, 1, 1, 15), c(4, 2, 2, 21)), id=1:4)
>     >
>     >     # The 2 overlaps at the first position should not be counted, because
>     >     # these ranges are at different rows.
>     >     countOverlaps(query, subject)
>     >
>     >     # Approach 1 (bad style. I have simplified it to make it easier to understand)
>     >     dat <- as.data.frame(findOverlaps(query, subject))
>     >     indexDat <- apply(dat, 1, function(x) x[1]==x[2])
>     >     indexBool <- dat[indexDat,1]
>     >     out <- rep(FALSE, length(query))
>     >     out[indexBool] <- TRUE
>     >     as.numeric(out)
>     >
>     >     # Approach 2 (bad style and takes too long)
>     >     out <- vector("numeric", 4)
>     >     for(i in seq_along(query)) out[i] <- (overlapsAny(query[i], subject[i]))
>     >     out
>     >
>     >     # Approach 3 (wrong results)
>     >     as.numeric(overlapsAny(query, subject))
>     >     as.numeric(overlapsAny(split(query, 1:4), split(subject, 1:4)))
>     >
>     >
>     >     Maybe someone has an idea to speed this up?
>     >
>     >
>     >     Best,
>     >
>     >     Tobias
>     >
>     >     _______________________________________________
>     >     Bioc-devel@r-project.org mailing list
>     >     https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
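
For illustration, the adjacency gotcha discussed in this thread can be reproduced on the example data from the original post. The sketch below compares `distance(query, subject) == 0` with a direct coordinate test. The helper `rowwise_overlaps()` is a hypothetical name introduced here, not a GenomicRanges function, and it assumes that both objects have the same length and that strand can be ignored.

```r
library(GenomicRanges)

# Example data from the original post
query <- GRanges(rep("chr1", 4), IRanges(c(1, 5, 9, 20), c(2, 6, 10, 22)), id=1:4)
subject <- GRanges(rep("chr1", 4), IRanges(c(3, 1, 1, 15), c(4, 2, 2, 21)), id=1:4)

# Hypothetical helper (not part of GenomicRanges): test whether the i-th
# range of 'query' overlaps the i-th range of 'subject'.
# Assumptions: equal lengths, strand ignored, no zero-width ranges.
rowwise_overlaps <- function(query, subject) {
    stopifnot(length(query) == length(subject))
    same_seq <- as.character(seqnames(query)) == as.character(seqnames(subject))
    # Two closed intervals on the same sequence share at least one position
    # if and only if each one starts no later than the other one ends.
    same_seq & start(query) <= end(subject) & start(subject) <= end(query)
}

ov <- rowwise_overlaps(query, subject)
ov
# [1] FALSE FALSE FALSE  TRUE

# distance() == 0 also flags row 1, where chr1:1-2 and chr1:3-4 are
# adjacent but do not overlap.
adjacent_or_overlap <- distance(query, subject) == 0
adjacent_or_overlap
# [1]  TRUE FALSE FALSE  TRUE
```

The coordinate test is vectorized, so it avoids the per-element loop of Approach 2 and does not require narrowing the subject ranges.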