
Sorry to intrude, but I think it's a factor issue.
Try changing the disjunction to the following (split over several lines):

new.bin <- is.na(prev.chrom) |
                df$chrom != levels(df$chrom)[prev.chrom] |
                delta.start >= 115341

It should work now.

Hope this helps,

Rui Barradas
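
A minimal sketch of the same comparison that does not depend on whether chrom is stored as a factor or as character: convert it to character first. When chrom is a factor, c(NA, df$chrom[-L]) keeps only the integer level codes, which is why the fix above indexes levels(df$chrom). The data are the example from this thread; the helper vector `chrom` is an assumption introduced for illustration.

```r
# Example data from the thread, with chrom deliberately stored as a factor
df <- data.frame(
  chrom = factor(c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2")),
  chromStart = c(10089, 10132, 10133, 10148, 210382, 216132)
)

L <- nrow(df)
chrom <- as.character(df$chrom)      # assumption: compare labels, not level codes
prev.chrom <- c(NA, chrom[-L])       # chrom of the previous row (NA for row 1)
delta.start <- c(NA, df$chromStart[-1] - df$chromStart[-L])

# new bin on the first row, on a chrom change, or on a jump of 115341 or more
new.bin <- is.na(prev.chrom) | chrom != prev.chrom | delta.start >= 115341
df$bin <- cumsum(new.bin)
df
```

Because the comparison is between two character vectors, this behaves the same whether data.frame() created a factor or a character column.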

On 02-07-2012 20:03, pguilha wrote:
It's crazy, I'm still getting 1,2,3,4,5,6 in the bin column.
Also (this is an unrelated problem, I think), unless I've misunderstood
it, I think your code will only create a new bin if the difference
between chromStart at positions i and i-1 is >=115341. What I want is
for a new bin to be created each time the difference between
chromStart at i and i-j is >=115341, where 'i-j' corresponds to the
first row of the last bin. I'm not sure if I'm being clear: chromStart
values correspond to coordinates along a chromosome, so I basically
want to cut up each chromosome into sections/bins of approximately
115341.

thanks again for all your efforts with this, they're much appreciated!
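
The rule described above (measure the distance from the first row of the current bin rather than from the previous row) could be sketched as follows. This is only an illustration under stated assumptions: `assign_bins` is a hypothetical helper, not something posted in the thread, and it assumes the rows are already sorted by chrom and then by chromStart. It loops over plain vectors instead of indexing the data frame row by row, which is usually much cheaper.

```r
# Hypothetical helper: a new bin opens when chrom changes or when chromStart
# is at least `width` past the first row of the current bin.
assign_bins <- function(chrom, start, width = 115341) {
  chrom <- as.character(chrom)       # avoid factor/character surprises
  n <- length(start)
  bin <- integer(n)
  if (n == 0L) return(bin)
  bin[1] <- 1L
  bin_start <- start[1]              # chromStart of the first row of the bin
  for (i in seq_len(n)[-1]) {
    if (chrom[i] != chrom[i - 1] || start[i] - bin_start >= width) {
      bin[i] <- bin[i - 1] + 1L      # open a new bin
      bin_start <- start[i]
    } else {
      bin[i] <- bin[i - 1]           # stay in the current bin
    }
  }
  bin
}

df <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
  chromStart = c(10089, 10132, 10133, 10148, 210382, 216132)
)
df$bin <- assign_bins(df$chrom, df$chromStart)

# Where the two rules differ: consecutive gaps stay below 115341,
# but the third row is more than 115341 past the start of the bin.
assign_bins(rep("chr1", 3), c(0, 100000, 200000))
```

On the example data both rules give the same bins; they only diverge when several small steps add up to more than 115341 within one chromosome.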

On 2 July 2012 19:36, Jean V Adams [via R]
<ml-node+s789695n4635185...@n4.nabble.com> wrote:

Try this (I changed some of the object names, but the meat of the code is
the same):

df <- data.frame(
         chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
         chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
         chromEnd = c(10309, 10536, 10362, 10418, 210578, 216352),
         name = c("ZBTB33", "TAF7_(SQ-8)", "Pol2-4H8", "MafF_(M8194)",
                  "ZBTB33", "CTCF"),
         cumsum = c(10089, 20221, 30354, 40502, 50884, 67016)
         )

# assign a new bin every time chrom changes and every time chromStart
# changes by 115341 or more
L <- nrow(df)
prev.chrom <- c(NA, df$chrom[-L])
delta.start <- c(NA, df$chromStart[-1] - df$chromStart[-L])
new.bin <- is.na(prev.chrom) | df$chrom != prev.chrom |
        delta.start >= 115341
df$bin <- cumsum(new.bin)

pguilha <[hidden email]> wrote on 07/02/2012 10:23:36 AM:

Jean, that's exactly what it should be, but yes I copied and pasted
from your email so I don't see how I could have introduced an error in

On 2 July 2012 15:57, Jean V Adams [via R]
<[hidden email]> wrote:

Are you submitting the exact code that I included in my previous

When I submit that code, I get this ...

   chrom chromStart chromEnd         name cumsum bin
1  chr1      10089    10309       ZBTB33  10089   1
2  chr1      10132    10536  TAF7_(SQ-8)  20221   1
3  chr2      10133    10362     Pol2-4H8  30354   2
4  chr2      10148    10418 MafF_(M8194)  40502   2
5  chr2     210382   210578       ZBTB33  50884   3
6  chr2     216132   216352         CTCF  67016   3


Paul Guilhamon <[hidden email]> wrote on 07/02/2012 08:59:00 AM:

Thanks for your reply Jean,

I think your interpretation is correct but when I run your code I end
up with the below dataframe, and obviously the bins created there do not
correspond to a chromStart change of 115341:

   chrom chromStart chromEnd         name cumsum bin
1  chr1      10089    10309       ZBTB33  10089   1
2  chr1      10132    10536  TAF7_(SQ-8)  20221   2
3  chr2      10133    10362     Pol2-4H8  30354   3
4  chr2      10148    10418 MafF_(M8194)  40502   4
5  chr2     210382   210578       ZBTB33  50884   5
6  chr2     216132   216352         CTCF  67016   6

the first two rows should have the same bin number (same chrom,
<115341 diff), then rows 3&4 should be in another bin (different chrom
from rows 1&2, <115341 diff), and rows 5&6 in another one (same chrom
but >115341 difference between row 4 and row 5).

it seems the new.bin line of your code isn't quite doing what it
should but I can't pinpoint the error there...

On 2 July 2012 14:19, Jean V Adams <[hidden email]> wrote:

My interpretation is that you are trying to assign a new bin number to
a row every time the variable chrom changes and every time the variable
chromStart changes by 115341 or more.  Is that right?  If so, you don't
need a loop at all.  Check out the code below.  I made a couple changes
to the example data frame so that it would have two changes in bin
number, one based on the chrom variable and one based on the chromStart
variable.


all.tf7 <- data.frame(
         chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
         chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
         chromEnd = c(10309, 10536, 10362, 10418, 210578, 216352),
         name = c("ZBTB33", "TAF7_(SQ-8)", "Pol2-4H8", "MafF_(M8194)",
                  "ZBTB33", "CTCF"),
         cumsum = c(10089, 20221, 30354, 40502, 50884, 67016),
         bin = rep(NA, 6)
         )

# assign a new bin every time chrom changes and every time chromStart
# changes by 115341 or more
L <- nrow(all.tf7)
prev.chrom <- c(NA, all.tf7$chrom[-L])
delta.start <- c(NA, all.tf7$chromStart[-1] - all.tf7$chromStart[-L])
new.bin <- is.na(prev.chrom) | all.tf7$chrom != prev.chrom |
        delta.start >= 115341
all.tf7$bin <- cumsum(new.bin)
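
As a small illustration of the two building blocks used here (not part of the original message): the lagged difference is what diff() computes, and cumsum() over a logical vector turns "a new bin starts here" flags into running bin numbers. The vector names `start`, `delta`, `same` and `bin` are chosen for this sketch only.

```r
start <- c(10089, 10132, 10133, 10148, 210382, 216132)
L <- length(start)

# difference between each value and the previous one; NA for the first row
delta <- c(NA, start[-1] - start[-L])
same  <- c(NA, diff(start))          # diff() gives the same lagged differences

# TRUE marks the first row of a bin; cumsum() counts TRUE as 1, FALSE as 0
new.bin <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
bin <- cumsum(new.bin)
bin
```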

pguilha <[hidden email]> wrote on 07/02/2012 06:25:13 AM:

Hello all,

I have written a for loop to act on a dataframe with close to [...]
and 6 columns and I would like to pass it to apply() to speed the [...]
(I let the loop run for 2 days before stopping it and it had only [...]
through 200,000 rows) but I am really struggling to find a way to pass the
arguments. Below are the loop and the head of the dataframe I am [...]
Any hints would be much appreciated, thank you! (I have searched [...]
but could not find any other posts doing quite what I want)

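# NOTE: several lines of this loop were lost in the archive; the lines
# marked "assumed" are a reconstruction, not the poster's original code.
x <- as.numeric(all.tf7[1,2])            # assumed: start of the first bin
all.tf7[1,6] <- 1                        # assumed: first row opens bin 1
for (i in 2:nrow(all.tf7)) {
   if (all.tf7[i,1]==all.tf7[i-1,1] & (all.tf7[i,2]-x)<115341) {
     all.tf7[i,6] <- all.tf7[i-1,6] }    # assumed: stay in the same bin
   else if (all.tf7[i,1]==all.tf7[i-1,1] & (all.tf7[i,2]-x)>=115341) {
     all.tf7[i,6] <- all.tf7[i-1,6]+1    # assumed: open a new bin
     x<-as.numeric(all.tf7[i,2]) }
   else if (all.tf7[i,1]!=all.tf7[i-1,1])  {
     all.tf7[i,6] <- all.tf7[i-1,6]+1    # assumed: open a new bin
     x<-as.numeric(all.tf7[i,2]) }
}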

# the aim here is to attribute a bin number to each row so that I can
# split the dataframe according to those bins.

chrom chromStart chromEnd         name cumsum bin
chr1       10089    10309       ZBTB33  10089   1
chr1       10132    10536  TAF7_(SQ-8)  20221   1
chr1       10133    10362     Pol2-4H8  30354   1
chr1       10148    10418 MafF_(M8194)  40502   1
chr1       10382    10578       ZBTB33  50884   1
chr1       16132    16352         CTCF  67016   1
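
Once a bin column exists, the stated aim of splitting the data frame by bin might look like the sketch below. It uses base R's split(); the data frame and the bin values are taken from the example discussed in this thread, and the name `pieces` is an assumption for illustration.

```r
df <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
  chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
  bin = c(1, 1, 2, 2, 3, 3)
)

# one data frame per bin, returned as a named list
pieces <- split(df, df$bin)
names(pieces)          # the bin labels
sapply(pieces, nrow)   # number of rows in each bin
```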
         [[alternative HTML version deleted]]

[hidden email] mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
