The index filters using bounding boxes. A long, diagonal route has a large bounding box relative to the area you actually care about (a narrow strip along the route). Use ST_Segmentize() to add points to your route, ST_DumpPoints() to dump those out as points, and ST_MakeLine() to build new lines from those points, each line very short. Maximum index effectiveness comes when your line length is close to your buffer width.
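In case it's useful, here is a rough, untested sketch of that approach against your table (the CTE names and the 600 m segment length are my own choices; adjust to taste):

WITH route AS (
  SELECT ST_GeographyFromText(
           'SRID=4326;LINESTRING(-150.008056 61.179167,-156.77 71.285833)') AS line
),
pts AS (
  -- ST_Segmentize on geography takes a max segment length in meters;
  -- ST_DumpPoints wants geometry, hence the cast
  SELECT (dp).path[1] AS idx, (dp).geom AS pt
  FROM (SELECT ST_DumpPoints(ST_Segmentize(line, 600)::geometry) AS dp
        FROM route) q
),
segs AS (
  -- pair each vertex with the next one to get many short two-point lines
  SELECT ST_MakeLine(pt, next_pt) AS seg
  FROM (SELECT pt, lead(pt) OVER (ORDER BY idx) AS next_pt FROM pts) p
  WHERE next_pt IS NOT NULL
)
SELECT max(d.elevation) AS max_elevation
FROM segs s
JOIN data d ON ST_DWithin(d.location, s.seg::geography, 600);

Each short segment carries a small bounding box, so each index probe pulls in far fewer candidate rows that then have to be discarded by the exact distance filter.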
On Thu, Jan 5, 2017 at 9:45 AM, Israel Brewster <isr...@ravnalaska.net> wrote:
> I have a database (PostgreSQL 9.6.1) containing 62,702,675 rows of
> latitude (numeric), longitude (numeric), elevation (integer) data, along
> with a PostGIS (2.3.0) geometry column (location), running on a CentOS 6.8
> box with 64GB RAM and a RAID10 SSD data drive. I'm trying to get the
> maximum elevation along a path, for which purpose I've come up with the
> following query (for one particular path example):
>
> SELECT elevation FROM data
>     WHERE ST_DWithin(location,
>         ST_GeographyFromText('SRID=4326;LINESTRING(-150.008056 61.179167,-156.77 71.285833)'),
>         600)
>     ORDER BY elevation DESC LIMIT 1;
>
> The EXPLAIN ANALYZE output of this particular query
> (https://explain.depesz.com/s/heZ) shows:
>
>                                QUERY PLAN
> --------------------------------------------------------------------------
>  Limit  (cost=4.83..4.83 rows=1 width=4) (actual time=22653.840..22653.842 rows=1 loops=1)
>    ->  Sort  (cost=4.83..4.83 rows=1 width=4) (actual time=22653.837..22653.837 rows=1 loops=1)
>          Sort Key: elevation DESC
>          Sort Method: top-N heapsort  Memory: 25kB
>          ->  Index Scan using location_gix on data  (cost=0.42..4.82 rows=1 width=4) (actual time=15.786..22652.041 rows=11081 loops=1)
>                Index Cond: (location && '0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography)
>                Filter: (('0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography && _st_expand(location, '600'::double precision)) AND _st_dwithin(location, '0102000020E6100000020000002C11A8FE41C062C0DFC2BAF1EE964E40713D0AD7A39863C086C77E164BD25140'::geography, '600'::double precision, true))
>                Rows Removed by Filter: 4934534
>  Planning time: 0.741 ms
>  Execution time: 22653.906 ms
> (10 rows)
>
> So it is using the index properly, but still takes a good 22 seconds to
> run, most of which appears to be in the Index Scan.
>
> Is there any way to improve this, or is this going to be about as good as
> it gets with the number of rows being dealt with? I was planning to use
> this for a real-time display - punch in a couple of points, get some
> information about the route between, including maximum elevation - but
> with it taking 22 seconds for the longer routes at least, that doesn't
> make for the best user experience.
>
> It's perhaps worth noting that the example above is most likely a worst
> case scenario. I would expect the vast majority of routes to be
> significantly shorter, and I want to say the shorter routes query much
> faster [testing needed]. That said, the faster the better, even for short
> routes :-)
> -----------------------------------------------
> Israel Brewster
> Systems Analyst II
> Ravn Alaska
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7293
> -----------------------------------------------