Hello,

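Sorry to intrude, but I think it's a factor issue: c(NA, df$chrom[-L])
drops the factor class, so prev.chrom holds integer codes, not labels.
Try changing the disjunction to the following (split over several lines):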


new.bin <- is.na(prev.chrom) |
                df$chrom != levels(df$chrom)[prev.chrom] |
                delta.start >= 115341

It should work now.
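
To make the factor point concrete, here is a minimal sketch. It assumes chrom is a factor, which data.frame() produced by default before R 4.0.0; the as.character() variant at the end is my own suggestion rather than something from this thread, and the vector names are only illustrative.

```r
# Build the column explicitly as a factor (the pre-R-4.0.0 default for strings).
chrom <- factor(c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"))
chromStart <- c(10089, 10132, 10133, 10148, 210382, 216132)
L <- length(chrom)

# c() with a leading NA drops the factor class and keeps the integer codes.
prev.code <- c(NA, chrom[-L])

# Comparing the factor with those codes compares labels such as "chr1"
# with text such as "1", so every non-missing comparison is TRUE.
naive <- chrom != prev.code

# Type-agnostic variant (an assumption, not from the thread): compare as text,
# which behaves the same whether chrom is a factor or a character vector.
chrom.chr <- as.character(chrom)
prev.chrom <- c(NA, chrom.chr[-L])
delta.start <- c(NA, diff(chromStart))
new.bin <- is.na(prev.chrom) | chrom.chr != prev.chrom | delta.start >= 115341
bin <- cumsum(new.bin)
```

Note that levels(df$chrom)[prev.chrom] only makes sense when chrom really is a factor; on a plain character column levels() returns NULL, so the as.character() form may be the safer choice if the column type is uncertain.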

Hope this helps,

Rui Barradas

On 02-07-2012 20:03, pguilha wrote:
Jean,
It's crazy, I'm still getting 1,2,3,4,5,6 in the bin column.
Also (this is an unrelated problem, I think), unless I've misunderstood
it, your code will only create a new bin if the difference between
chromStart at positions i and i-1 is >=115341. What I want is for a new
bin to be created each time the difference between chromStart at i and
i-j is >=115341, where 'i-j' corresponds to the first row of the last
bin. I'm not sure if I'm being clear: chromStart values correspond to
coordinates along a chromosome, so I basically want to cut up each
chromosome into sections/bins of approximately 115341.

thanks again for all your efforts with this, they're much appreciated!
Paul
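
Because the threshold is measured from the first row of the current bin, each row depends on where the previous bin started, so a comparison of adjacent rows alone cannot express it. One possible sketch is a plain loop over vectors instead of data frame cells, which is usually far cheaper per iteration. The function name assign_bins and its arguments are hypothetical, not from this thread.

```r
# Hypothetical helper (not from the thread): anchor-based binning.
# A new bin starts when chrom changes, or when chromStart is `width` or more
# beyond the chromStart of the first row of the current bin (the anchor).
assign_bins <- function(chrom, start, width = 115341) {
  chrom <- as.character(chrom)   # avoid factor-vs-code comparisons
  n <- length(start)
  bin <- integer(n)
  if (n == 0L) return(bin)
  bin[1] <- 1L
  anchor <- start[1]
  for (i in seq_len(n)[-1]) {
    if (chrom[i] != chrom[i - 1] || start[i] - anchor >= width) {
      bin[i] <- bin[i - 1] + 1L  # open a new bin and move the anchor
      anchor <- start[i]
    } else {
      bin[i] <- bin[i - 1]       # stay in the current bin
    }
  }
  bin
}

# Adjacent rows differ by only 60000, but row 3 is 120000 from row 1,
# so the anchor rule opens a second bin there.
toy <- assign_bins(rep("chr1", 4), c(0, 60000, 120000, 180000))
```

With a data frame this would be called as df$bin <- assign_bins(df$chrom, df$chromStart), assuming the rows are already sorted by chrom and chromStart.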

On 2 July 2012 19:36, Jean V Adams [via R]
<ml-node+s789695n4635185...@n4.nabble.com> wrote:
Paul,

Try this (I changed some of the object names, but the meat of the code is
the same):

df <- data.frame(
         chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
         chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
         chromEnd = c(10309, 10536, 10362, 10418, 210578, 216352),
         name = c("ZBTB33", "TAF7_(SQ-8)", "Pol2-4H8", "MafF_(M8194)",
                  "ZBTB33", "CTCF"),
         cumsum = c(10089, 20221, 30354, 40502, 50884, 67016)
         )

# assign a new bin every time chrom changes and every time chromStart
# changes by 115341 or more
L <- nrow(df)
prev.chrom <- c(NA, df$chrom[-L])
delta.start <- c(NA, df$chromStart[-1] - df$chromStart[-L])
new.bin <- is.na(prev.chrom) | df$chrom != prev.chrom |
           delta.start >= 115341
df$bin <- cumsum(new.bin)
df


pguilha <[hidden email]> wrote on 07/02/2012 10:23:36 AM:

Jean, that's exactly what it should be, but yes I copied and pasted
from your email so I don't see how I could have introduced an error in
there....
paul

On 2 July 2012 15:57, Jean V Adams [via R]
<[hidden email]> wrote:
Paul,

Are you submitting the exact code that I included in my previous
e-mail?

When I submit that code, I get this ...

   chrom chromStart chromEnd         name cumsum bin
1  chr1      10089    10309       ZBTB33  10089   1
2  chr1      10132    10536  TAF7_(SQ-8)  20221   1
3  chr2      10133    10362     Pol2-4H8  30354   2
4  chr2      10148    10418 MafF_(M8194)  40502   2
5  chr2     210382   210578       ZBTB33  50884   3
6  chr2     216132   216352         CTCF  67016   3

Jean


Paul Guilhamon <[hidden email]> wrote on 07/02/2012 08:59:00 AM:

Thanks for your reply Jean,

I think your interpretation is correct but when I run your code I end
up with the below dataframe and obviously the bins created there don't
correspond to a chromStart change of 115341:

   chrom chromStart chromEnd         name cumsum bin
1  chr1      10089    10309       ZBTB33  10089   1
2  chr1      10132    10536  TAF7_(SQ-8)  20221   2
3  chr2      10133    10362     Pol2-4H8  30354   3
4  chr2      10148    10418 MafF_(M8194)  40502   4
5  chr2     210382   210578       ZBTB33  50884   5
6  chr2     216132   216352         CTCF  67016   6

The first two rows should have the same bin number (same chrom,
<115341 diff), then rows 3&4 should be in another bin (different chrom
from rows 1&2, <115341 diff), and rows 5&6 in another one (same chrom
but >115341 difference between row 4 and row 5).

it seems the new.bin line of your code isn't quite doing what it
should but I can't pinpoint the error there...
Paul


On 2 July 2012 14:19, Jean V Adams <[hidden email]> wrote:
Paul,

My interpretation is that you are trying to assign a new bin number to a
row every time the variable chrom changes and every time the variable
chromStart changes by 115341 or more.  Is that right?  If so, you don't
need a loop at all.  Check out the code below.  I made a couple changes
to the all.tf7 example data frame so that it would have two changes in
bin number, one based on the chrom variable and one based on the
chromStart variable.

Jean

all.tf7 <- data.frame(
         chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
         chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
         chromEnd = c(10309, 10536, 10362, 10418, 210578, 216352),
         name = c("ZBTB33", "TAF7_(SQ-8)", "Pol2-4H8", "MafF_(M8194)",
                  "ZBTB33", "CTCF"),
         cumsum = c(10089, 20221, 30354, 40502, 50884, 67016),
         bin = rep(NA, 6)
         )

# assign a new bin every time chrom changes and every time chromStart
# changes by 115341 or more
L <- nrow(all.tf7)
prev.chrom <- c(NA, all.tf7$chrom[-L])
delta.start <- c(NA, all.tf7$chromStart[-1] - all.tf7$chromStart[-L])
new.bin <- is.na(prev.chrom) | all.tf7$chrom != prev.chrom |
           delta.start >= 115341
all.tf7$bin <- cumsum(new.bin)
all.tf7


pguilha <[hidden email]> wrote on 07/02/2012 06:25:13 AM:

Hello all,

I have written a for loop to act on a dataframe with close to 3 million
rows and 6 columns and I would like to pass it to apply() to speed the
process up (I let the loop run for 2 days before stopping it and it had
only gone through 200,000 rows) but I am really struggling to find a way
to pass the arguments. Below are the loop and the head of the dataframe
I am working on. Any hints would be much appreciated, thank you! (I have
searched for this but could not find any other posts doing quite what I
want)
Paul

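x <- as.numeric(all.tf7[1,2])   # chromStart of the first row of the current bin
for (i in 2:nrow(all.tf7)) {
   if (all.tf7[i,1]==all.tf7[i-1,1] && (all.tf7[i,2]-x)<115341) {
     # same chrom, still within 115341 of the bin start: keep the bin
     all.tf7[i,6] <- all.tf7[i-1,6]
   } else if (all.tf7[i,1]==all.tf7[i-1,1] && (all.tf7[i,2]-x)>=115341) {
     # same chrom, 115341 or more from the bin start: new bin, reset x
     all.tf7[i,6] <- all.tf7[i-1,6]+1
     x <- as.numeric(all.tf7[i,2])
   } else if (all.tf7[i,1]!=all.tf7[i-1,1]) {
     # chrom changed: new bin, reset x
     all.tf7[i,6] <- all.tf7[i-1,6]+1
     x <- as.numeric(all.tf7[i,2])
   }
}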

# the aim here is to attribute a bin number to each row so that I can then
# split the dataframe according to those bins.


chrom chromStart chromEnd         name cumsum bin
chr1       10089    10309       ZBTB33  10089   1
chr1       10132    10536  TAF7_(SQ-8)  20221   1
chr1       10133    10362     Pol2-4H8  30354   1
chr1       10148    10418 MafF_(M8194)  40502   1
chr1       10382    10578       ZBTB33  50884   1
chr1       16132    16352         CTCF  67016   1
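
Once a bin column exists, the splitting step mentioned above could look roughly like this. It is only a sketch on a small made-up frame; the object names df and by.bin are illustrative, and the bin values are filled in by hand, not computed.

```r
# Small illustrative data frame with bins already assigned by hand.
df <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr2", "chr2"),
  chromStart = c(10089, 10132, 10133, 10148, 210382, 216132),
  bin = c(1, 1, 2, 2, 3, 3)
)

# split() returns a named list with one data frame per distinct bin value.
by.bin <- split(df, df$bin)

# Number of rows that ended up in each bin.
rows.per.bin <- sapply(by.bin, nrow)
```

Each element, for example by.bin[["2"]], can then be processed on its own or passed to lapply().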

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

