2009/11/15 Barry Rowlingson <b.rowling...@lancaster.ac.uk>

> On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman <dimitri...@gmail.com>
> wrote:
> > Hello,
> >
> > This is what I am trying to do: I wrote a little function that takes
> > addresses (coordinates) as input, and returns the road distance between
> > every two points using Google Maps. The catch is, there are 2000 addresses,
> > so I have to get around 2x10^6 distances. On my first go, this is what I did:
>
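
For reference, the 2x10^6 figure follows from counting pairs among 2000 points. A small sketch of the arithmetic in R (the variable names are only illustrative, and whether the ordered or unordered count applies depends on whether the road distance A to B is assumed equal to B to A):

```r
n <- 2000
pairs_unordered <- choose(n, 2)   # each pair counted once: n * (n - 1) / 2
pairs_ordered   <- n * (n - 1)    # both directions, if one-way roads matter
```

Either way that is millions of queries, which is why a small test run comes first.
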
>  I hope on your first go you didn't run it with 2000 addresses. You
> did test it with 13 addresses first, didn't you?
>

I did, and it worked well.


>  Another idea is to replace your Distance function with a function
> that returns runif(1). This will either make your code fail much much
> quicker or identify that the problem is in the Distance function (some
> memory leak there).
>
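
A minimal sketch of that stub idea, assuming Distance takes two lat-longs (as in the rewrite suggested further down) and using made-up coordinates. Nothing here touches the network, so a failure or slowdown in this version points at the loop rather than at the query:

```r
# Stand-in for the real Distance(): no network access, just a random number.
Distance <- function(lat1, lon1, lat2, lon2) runif(1)

# Made-up coordinates for 13 points (hypothetical test data).
set.seed(1)
X <- cbind(lat = runif(13, 53, 55), lon = runif(13, -3, -1))

n <- nrow(X)
Dmat <- matrix(NA_real_, n, n)
elapsed <- system.time(
  for (i in 2:n) {
    for (j in 1:(i - 1)) {
      Dmat[i, j] <- Distance(X[i, 1], X[i, 2], X[j, 1], X[j, 2])
    }
  }
)
filled <- sum(!is.na(Dmat))   # number of pairs actually computed
```

If this runs cleanly at full size, the memory or time problem is more likely inside the real Distance function.
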
>  Also, you should check the return value from your google query - I've
> seen google get a bit upset about repeated automated queries and
> return a message saying "This looks like an automated query" and a
> CAPTCHA test.


Hmm, I wasn't aware of that.
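
One way to check the return value might look like the sketch below. The text patterns are guesses based on the message quoted above, not a documented response format, and `looks_blocked`, `safe_fetch` and `fetch` are hypothetical names for illustration only:

```r
# TRUE when a response body looks like a block page (or is empty) rather than data.
# The patterns are assumptions taken from the message described above.
looks_blocked <- function(body) {
  if (length(body) == 0 || all(is.na(body))) return(TRUE)
  txt <- paste(body, collapse = "\n")
  grepl("automated quer", txt, ignore.case = TRUE) ||
    grepl("captcha", txt, ignore.case = TRUE)
}

# Hypothetical wrapper: `fetch` is whatever function performs the actual query.
safe_fetch <- function(fetch, ...) {
  body <- fetch(...)
  if (looks_blocked(body)) {
    stop("query refused or empty; stopping instead of retrying")
  }
  body
}
```

Stopping on the first refused query avoids writing thousands of bogus values to the output file.
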


> > grid2 = grid[!is.na(grid)]
> > n = length(grid2)
> > for (i in seq_len(n)) {
> >   temp = Distances(grid2[i])
> >   write.table(temp, "distances.csv", col.names=FALSE, row.names=FALSE, append=TRUE)
> > }
>
> This won't work - you're overwriting distances.csv with the new value
> of 'temp' every time.


No, I am not, because "append=TRUE". I did this, and I managed to get 20,000
distances or so.
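
A small self-contained check of the append behaviour, writing to a throwaway temporary file (the data are made up):

```r
out <- tempfile(fileext = ".csv")   # throwaway file for the demonstration
for (i in 1:3) {
  temp <- data.frame(from = i, dist = i * 10)
  # append = TRUE adds rows to the end of the file instead of replacing it
  write.table(temp, out, col.names = FALSE, row.names = FALSE, append = TRUE)
}
rows <- nrow(read.table(out))   # one row per iteration survives
unlink(out)
```
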


> Another good reason to test with 13 values
> before waiting and failing after six hours, and then having to hammer
> google's map server again.
>
> I'd write this as a simple loop, and dump all the apply stuff. And
> rewrite Distance to be a function of two lat-longs:
>
> Distance=function(lat1,lon1,lat2,lon2){
> ....
> return(distance)
> }
>
> Then (untested):
>
> Dmat = matrix(NA,nrow(X),nrow(X))
>
> for(i in 2:nrow(X)){
>   for(j in 1:(i-1)){   # j < i, so no query for a point against itself
>     d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
>     Dmat[i,j] = d
>   }
> }
>
>  I'm not sure apply wins much here.
>
>
Thanks. The reason I didn't want to do something like that is because, in
the event of a crash, I'll lose everything that was done. That's why I
thought of appending the results often.
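
The two approaches can be combined: keep the simple double loop, but append each result as it arrives and skip pairs already on disk when restarting. A rough sketch under stated assumptions: `run_pairs` is a hypothetical helper, the Euclidean `Distance` is only a placeholder for the real query, and X has at least two rows with lat-long in columns 1 and 2.

```r
# Placeholder for the real query; swap in the actual Distance().
Distance <- function(lat1, lon1, lat2, lon2) {
  sqrt((lat1 - lat2)^2 + (lon1 - lon2)^2)
}

run_pairs <- function(X, out) {
  n <- nrow(X)
  # Mark pairs already saved by an earlier (possibly crashed) run.
  done <- matrix(FALSE, n, n)
  if (file.exists(out) && file.info(out)$size > 0) {
    prev <- read.table(out, col.names = c("i", "j", "d"))
    done[cbind(prev$i, prev$j)] <- TRUE
  }
  for (i in 2:n) {
    for (j in 1:(i - 1)) {
      if (done[i, j]) next                       # finished before the crash
      d <- Distance(X[i, 1], X[i, 2], X[j, 1], X[j, 2])
      write.table(data.frame(i, j, d), out, col.names = FALSE,
                  row.names = FALSE, append = TRUE)
    }
  }
  # Rebuild the lower-triangular distance matrix from the file.
  res <- read.table(out, col.names = c("i", "j", "d"))
  Dmat <- matrix(NA_real_, n, n)
  Dmat[cbind(res$i, res$j)] <- res$d
  Dmat
}
```

Calling run_pairs(X, "distances.csv") again after a crash would pick up where the file left off, so at most the pair in flight is lost.
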


> Barry
>


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
