2009/11/15 Barry Rowlingson <b.rowling...@lancaster.ac.uk>

> On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman <dimitri...@gmail.com>
> wrote:
> > Hello,
> >
> > This is what I am trying to do: I wrote a little function that takes
> > addresses (coordinates) as input, and returns the road distance between
> > every two points using Google Maps. Catch is, there are 2000 addresses, so I
> > have to get around 2x10^6 distances. On my first go, this is what I did:
>
> I hope on your first go you didn't run it with 2000 addresses. You
> did test it with 13 addresses first didn't you?
>
I did, and it worked well.

> Another idea is to replace your Distance function with a function
> that returns runif(1). This will either make your code fail much much
> quicker or identify that the problem is in the Distance function (some
> memory leak there).
>
> Also, you should check the return value from your google query - I've
> seen google get a bit upset about repeated automated queries and
> return a message saying "This looks like an automated query" and a
> CAPTCHA test.

Mmmm, I wasn't aware of that.

> > grid2=grid[!is.na(grid)]
> > n = length(grid2)
> > for (i in 1:n) {
> > temp = Distances(grid2[i])
> > write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
> > }
>
> This won't work - you're overwriting distances.csv with the new value
> of 'temp' every time.

No, I am not, because "append=TRUE". I did this, and I managed to get
20,000 distances or so.

> Another good reason to test with 13 values
> before waiting and failing after six hours, and then having to hammer
> google's map server again.
>
> I'd write this as a simple loop, and dump all the apply stuff. And
> rewrite Distance to be a function of two lat-longs:
>
> Distance=function(lat1,lon1,lat2,lon2){
> ....
> return(distance)
> }
>
> Then (untested):
>
> Dmat = matrix(NA,nrow(X),nrow(X))
>
> for(i in 2:nrow(X)){
> for(j in 1:i){
> d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
> Dmat[i,j]=d
> }
> }
>
> I'm not sure apply wins much here.

Thanks. The reason I didn't want to do something like that is because, in
the event of a crash, I'll lose everything that was done. That's why I
thought of appending the results often.

> Barry
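For what it's worth, the two approaches discussed in this thread can be combined: keep the simple double loop, but append every result to disk as it arrives and skip finished pairs on restart. The following is only a sketch under assumptions. The function name pairwise_to_file, its dist_fun argument, and the three-column file layout (i, j, d) are made up for illustration, X is taken to be a two-column matrix of latitudes and longitudes as in the loop above, and Distance is a stub returning runif(1), as suggested earlier in the thread, so nothing here touches Google Maps.

```r
# Hypothetical stand-in for the real lookup: returns a random number so the
# loop logic can be tested without any network traffic.
Distance <- function(lat1, lon1, lat2, lon2) {
  runif(1)
}

# Compute distances for every pair of rows of X (columns: lat, lon) and
# append each one to `outfile` as "i,j,d". After a crash, calling the
# function again skips the pairs already in the file.
pairwise_to_file <- function(X, outfile, dist_fun = Distance) {
  n <- nrow(X)
  done <- matrix(FALSE, n, n)
  if (file.exists(outfile) && file.info(outfile)$size > 0) {
    prev <- read.csv(outfile, header = FALSE, col.names = c("i", "j", "d"))
    done[cbind(prev$i, prev$j)] <- TRUE
  }
  n_new <- 0
  for (i in seq_len(n)[-1]) {
    for (j in seq_len(i - 1)) {   # lower triangle only, no diagonal
      if (done[i, j]) next
      d <- dist_fun(X[i, 1], X[i, 2], X[j, 1], X[j, 2])
      # Stop as soon as the result is not a single finite number, e.g. when
      # the service sends back an error page instead of a distance.
      if (!is.numeric(d) || length(d) != 1 || !is.finite(d)) {
        stop("unexpected result for pair ", i, ", ", j)
      }
      write.table(data.frame(i, j, d), outfile, sep = ",",
                  col.names = FALSE, row.names = FALSE, append = TRUE)
      n_new <- n_new + 1
    }
  }
  invisible(n_new)   # number of queries made in this call
}

# Small test run with 13 points before trying all 2000
X <- cbind(lat = runif(13), lon = runif(13))
outfile <- tempfile(fileext = ".csv")
pairwise_to_file(X, outfile)
nrow(read.csv(outfile, header = FALSE))   # one row per pair: choose(13, 2)
```

Reopening the file for every write is slow, but it means a crash costs at most the query in flight. Swapping the stub for the real lookup should only require passing a different dist_fun.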