Glad to help -- I haven't taken a look at Dennis' solution (which may be far
better than mine), but if you do want to keep going down the path outlined
below you might consider the following:

Instead of throwing away a simulation if something starts negative, why not
just multiply the entire sample by -1: that lets you still use the sample
and saves you some computations: of course you'll have to remember to adjust
your final results accordingly.

This might avoid the loop:

x = ## Whatever x is.
xLag = c(0,x[-length(x)]) # 'lag' x by 1 step.
which.max((x>=0) & (xLag <0)) + 1 # Depending on how you've decided to count
things, this +1 may be extraneous.

The inner expression sets a 0 except where there is a switch from negative
to positive and a one there: the which.max function returns the location of
the first maximum, which is the first 1, in the vector. If you are
guaranteed the run starts negative, then the location of the first positive
should give you the length of the negative run. This all gives you,

f4 <- function(n = 100000, # number of simulations
               length = 100000) # length of iterated sum
{
       R = matrix(sample(c(-1L,1L), length*n,replace=T),nrow=n)

>        R = apply(R,1,cumsum)
>
                      R[R[,1]==(1),] = -1 * R[R[,1]==(-1),] # If the first
element in the row is positive, flip the entire row

>        fTemp <- function(x) {
>
                            xLag = c(0,x[-length(x)])
                            return(which.max((x>=0) & (xLag <0))+1)

>        countNegative = apply(R,2,fTemp)
>        tabulate(as.vector(countNegative), length)
> }


That just crashed my computer though, so I wouldn't recommend it for large
n,length. Instead, you can help a little by combining the lagging and the &
all in one.

f4 <- function(n = 100000, llength = 100000)
{
    R = matrix(sample(c(-1L,1L), length*n,replace=T),nrow=n)
    R = apply(R,1,cumsum)
    R[R[,1]==(1),] = -1 * R[R[,1]==(-1),] # If the first element in the row
is positive, flip the entire row
    R = (cbind(rep(0,NROW(R)),R)<0)&(cbind(R,rep(0,NROW(R)))>=0)
    countNegative = apply(R,1,which.max) + 1
    return (tabulate(as.vector(countNegative), length) )

> }

Of course, this is all starting to approach a very specific question that
could actually be approached much more efficiently if it's your end goal
(though I think I remember from your first email a different end goal):

We can use the symmetry and "restart"ability of RW to do the following:

x = cumsum(sample(c(-1L,1L),BIGNUMBER,replace=T)
D  = diff(which(x == 0))

This will give you a vector of how long x stays positive or negative at a
time. Thinking through some simple translations lets you see that this set
has the same distribution as how long a RW that starts negative stays
negative. Again, this is only good for answering a very specific question
about random walks and may not be useful if you have other more complicated
questions in sight.

Hope this helps,

Michael Weylandt


On Sun, Jul 31, 2011 at 6:36 PM, Paul Menzel <
paulepan...@users.sourceforge.net> wrote:

> Am Mittwoch, den 27.07.2011, 19:59 -0400 schrieb R. Michael Weylandt :
> > Some more skilled folks can help with the curve fitting, but the general
> > answer is yes -- R will handle this quite ably.
>
> Great to read that.
>
> > Consider the following code:
> >
> > <<-------------------------------------->>
> > n = 1e5
> > length = 1e5
> >
> > R = matrix(sample(c(-1,1),length*n,replace=T),nrow=n)
> > R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and will
> make
> > your life INFINITELY better
> > R = cbind(rep(0,n),R) ## Now each row is a random walk as you desired.
> >
> > <<---------------------------------------->>
> >
> > There are actually even faster ways to do what you are asking for, but
> this
> > introduces you to some useful R architecture, above all the apply
> function.
>
> Thank you very much. I realized the the 0 column is not need when
> summing this up. Additionally I posted the wrong example code and I
> actually am only interested how long it stays negative from the
> beginning.
>
> > To see how long the longest stretch of negatives in each row is, the
> > following is a little sneaky but works pretty well:
> >
> > countNegative = apply(R,1,function(x){which.max(table(cumsum(x>=0))})
> >
> > then you can study these random numbers to do whatever with them.
> >
> > The gist of this code is that it counts how many positive number have
> been
> > seen up to each point: for any stretch this doesn't increase, you must be
> > negative, so this identifies the longest such stretch on each row and
> > records the length. (It may be off by one so check it on a smaller R
> matrix.
>
> That is a great example. It took me a while what `table()` does here but
> thanks to your explanation I finally understood it.
>
> […]
>
> > So all together:
> >
> > <<-------------------------------------->>
> > n = 1e3
> > length = 1e3
> >
> > R = matrix(sample(c(-1,1),length*n,replace=T),nrow=n)
> > R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and will
> make
> > your life INFINITELY better
> > R = cbind(rep(0,n),R) ## Now each row is a random walk as you desired.
> > fTemp <- function(x) {
> >     return(max(table(cumsum(x>=0))))
> > }
> > countNegative = apply(R,1,fTemp)
> > mu = mean(countNegative)
> > sig = sd(countNegative)/sqrt(length(countNegative))
> >
> > <<---------------------------------------->>
> >
> > This runs pretty fast on my laptop, but you'll need to look into the
> > memory.limit() function if you want to increase the simulation
> parameters.
> > There are much faster ways to handle the simulation as well, but this
> should
> > get you off to a nice start with R.
> >
> > Hope this helps,
>
> It did. Thank you again for the kind and elaborate introduction.
>
> Trying to run your example right away froze my system using `n = 1000`
> and `length = 1e5` [1]. So I really need to be careful how big such a
> matrix can get. One thing is to use integers as suggested in [2].
>
> My current code looks like the following.
>
> -------- 8< -------- code -------- >8 --------
> f4 <- function(n = 100000, # number of simulations
>                length = 100000) # length of iterated sum
> {
>        R = matrix(sample(c(-1L,1L),length*n,replace=T),nrow=n)
>         R = apply(R,1,cumsum) ## this applies cumsum `row-wise' to R and
> will make your life INFINITELY better
>         fTemp <- function(x) {
>                if (x[1] >= 0 ) {
>                        return(1)
>                }
>
>                for (i in 1:length-1) {
>                        if (x[i] < 0 && x[i + 1] >= 0) {
>                                return(as.integer(i/2 + 2)) # simple random
> walks only hit 0 on even “times”
>                        }
>                }
>        }
>        countNegative = apply(R,2,fTemp)
>        tabulate(as.vector(countNegative), length)
> }
> -------- 8< -------- code -------- >8 --------
>
> 1.I could actually avoid `cumsum()` half the time, when the first entry
> is already positive. So I am still looking for a way to speed that up in
> comparison to a simple two loops scenario.
> 2. The counting of the length how long the walk stayed negative is
> probably also inefficient and I should find a better way on how to
> return the values.
>
> I am still thinking about both cases, but to come up with
> vectoriazations of the problem is quite hard.
>
> So I welcome any suggestions. ;-)
>
>
> Thanks,
>
> Paul
>
>
> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=635832
> [2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=635832#10
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to