> I have a couple of questions with respect to creating a postal / >zip code proximity search that is remotely accurate. The system I am >using now is OK for small distances, but is terrible at calculating >large distances.
What formula are you using to calculate distance? You see, the Earth is not flat :-) So, after a certain distance, the error margin of the curvature of the Earth becomes quite significant. There are several formulas you could be using... Geographics and Cartographers have been known to challenge each other to duels over the relative merits of these formulas, so be careful which one you pick :-) A quick search for "longitude, latitude, distance, and Great Circle" should turn up several options, with explaining text about their relative merits. (Or it did when I did this same thing years ago...) In the end, though, *I* ended up just using Cartesian distance, since anybody searching for things "close by" is not going to be worried about curvature of the Earth in their calculations. So, yeah, it's way wrong for those establishments thousands of miles away... So what? Nobody cares about how far is the Jiffy Lube 2000 miles away from them :-) In *SOME* applications, you'd need a *MUCH* more accurate formula -- But if you're searching for "nearby" establishments, forget worrying about the long-distance error. The performance on the trigonometric functions for the "better" formula were intolerable, even in PostgreSQL user-defined functions. YMMV. But there's just no way you're going to want to download all the establishments to PHP and do the calculations there. Dog-slow. You'll *HAVE* to get them into SQL some way. Also, don't even think of doing a JOIN with your establishments and the Zip Code table. It's rule-breaking time. :-) Add long/lat columns to your establishments and fill them in with the Zip Code data if you haven't already. Again, YMMV, but the JOIN of 2000 music venues and the ~10K Zip codes was intolerably slow. > What I have is a database of establishments and their corresponding >latitudes and longitudes, and I need to have users enter their own >postal / zip code, and have the system calculate which establishments >fall into a set range ($range) from their location. Not only does it >need to find the establishments, it needs to calculate the distances >(straight line, of course). So long as you keep that "distance" small -- a few score miles, the curvature should not be that big a deal... You will need a "fudge factor" of "arc" (long/lat) to miles to divide your Cartesian long/lat "distance" (using the term distance very loosely). Also note that the closer you get to the North/South Pole, the worse your error rate will be, since the longitude lines converge. > Now, I have dug through the PHP archives, but there is no single >clear explanation of how to do this, from beginning to end. I will >start by putting down what I need, and then maybe other people can >contribute to this and in the end we will have the solution, nice and >tidy, in one place. There really can't be a tidy solution, due to the underlying formulas' relative merits and the performance issues, and the widely different requirements of the definition of "distance"... Still, a summary of the options would be nice. >*** NOTE: I know doing this in PHP will be SLOW, but I do not want to >write a MySQL user-defined function and recompile it and blah, blah, >blah... (unless someone wants to cover that aspect!!! :D) If you switch to PostgreSQL instead of MySQL, you can define a function in PostgreSQL for *MUCH* better performance. (Actually, PostgreSQL may even have a good built-in long/lat distance function by now...) > Step 1: Take the postal / zip code that the user entered, and >search the national postal / zip code database (which I already have) >and return its corresponding latitude and longitude ($userLat and >$userLong). > > Step 2: ??? > > Step 3: Return the establishments information to the user, complete >with the distances to them, ordered by smallest distance to largest. For sure, limit the number of responses. In fact, let the *USER* decide how far they will travel. Somebody in Idaho is probably willing to drive a lot further than, say, a Manhattan resident... For the short distances you are talking about, you could *ALMOST* get away with: # Assume $longitude/$latitude is the user's location. select name, address from establishments where sqrt((longitude - $longitude)*(longitude - $longitude) + (latitude - $latitude)*(latitude - $latitude)) IE, the plain old cartesian (x, y) distance from Algebra II class. You could then take those results and divide by a "fudge factor" of arcs to miles (long/lat to mile conversion) to get a pretty close approximation for *NEARBY* establishments. Once you get past a few score miles, though, it will be noticeably "off" Seek out the formulas as I suggested, though, and you'll be able to get pretty accurate distances, but you'll have to use the Trigonometric functions MySQL might not have, so then you get to review Trigonometry and play some identity function games to convert it to the ones that MySQL *does* have, assuming it has any. If MySQL has *no* Trig, your options just got real easy, if not attractive, to choose... Sample with the Cartesian crappy fake "distance" and fudge factor I chose: http://chatmusic.com/venues.htm When I calculate LA to Chicago, it's way off, but so what? Nobody needs it to work that far away if they're using the application for what it's intended. -- Like Music? http://l-i-e.com/artists.htm -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php