FYI for those with spatial interests…

From: "Smiley, David W." <dsmi...@mitre.org>
Date: Friday, January 17, 2014 at 11:53 AM
To: Demeter Sztanko <szta...@gmail.com>
Cc: jts-topo-suite-u...@lists.sourceforge.net
Subject: Re: [Jts-topo-suite-user] Persistent STR tree

So 4x not 10x; I’m not feeling depressed anymore ;-)

Your approach of using QuadTree and a coordinate reference system makes sense.  
One day hopefully not too far away, I expect Lucene-spatial/Spatial4j will have 
built-in projection support.  But instead of indexing bounding boxes, you 
should ideally be indexing the actual shapes, and then you can pull the WKB and 
check for actual intersection.
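
To make the "pull the WKB and check for actual intersection" step concrete, here is a minimal filter-and-refine sketch in plain Java. It is an illustration only, not Lucene-spatial or JTS code: the IndexedShape class is hypothetical, the polygon vertex arrays stand in for a geometry decoded from WKB, and a point query is used instead of a full shape intersection to keep it short.

```java
import java.util.ArrayList;
import java.util.List;

/** Filter-and-refine sketch: bounding boxes act as a cheap filter, exact geometry decides. */
final class FilterRefineSketch {

    /** Hypothetical indexed record: a polygon plus its precomputed bounding box. */
    static final class IndexedShape {
        final String id;
        final double[] xs, ys; // polygon vertices (assumed to come from decoded WKB)
        final double minX, minY, maxX, maxY;

        IndexedShape(String id, double[] xs, double[] ys) {
            this.id = id;
            this.xs = xs;
            this.ys = ys;
            double loX = xs[0], hiX = xs[0], loY = ys[0], hiY = ys[0];
            for (int i = 1; i < xs.length; i++) {
                loX = Math.min(loX, xs[i]);
                hiX = Math.max(hiX, xs[i]);
                loY = Math.min(loY, ys[i]);
                hiY = Math.max(hiY, ys[i]);
            }
            minX = loX; maxX = hiX; minY = loY; maxY = hiY;
        }

        /** Cheap filter: is the point inside the bounding box? */
        boolean boundingBoxContains(double px, double py) {
            return px >= minX && px <= maxX && py >= minY && py <= maxY;
        }

        /** Exact check: even-odd ray casting against the polygon edges. */
        boolean contains(double px, double py) {
            boolean inside = false;
            for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
                boolean crosses = (ys[i] > py) != (ys[j] > py);
                if (crosses && px < (xs[j] - xs[i]) * (py - ys[i]) / (ys[j] - ys[i]) + xs[i]) {
                    inside = !inside;
                }
            }
            return inside;
        }
    }

    /** Step 1 only: everything whose bounding box contains the point (may include false positives). */
    static List<String> boundingBoxCandidates(List<IndexedShape> shapes, double px, double py) {
        List<String> ids = new ArrayList<>();
        for (IndexedShape s : shapes) {
            if (s.boundingBoxContains(px, py)) ids.add(s.id);
        }
        return ids;
    }

    /** Steps 1 and 2: bounding-box filter, then the exact geometry check on the survivors. */
    static List<String> exactMatches(List<IndexedShape> shapes, double px, double py) {
        List<String> ids = new ArrayList<>();
        for (IndexedShape s : shapes) {
            if (s.boundingBoxContains(px, py) && s.contains(px, py)) ids.add(s.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<IndexedShape> shapes = new ArrayList<>();
        // Right triangle with vertices (0,0), (10,0), (0,10); its bounding box is the 10x10 square.
        shapes.add(new IndexedShape("triangle", new double[] {0, 10, 0}, new double[] {0, 0, 10}));
        System.out.println("bbox candidates at (8,8): " + boundingBoxCandidates(shapes, 8, 8));
        System.out.println("exact matches at (8,8):   " + exactMatches(shapes, 8, 8));
        System.out.println("exact matches at (2,2):   " + exactMatches(shapes, 2, 2));
    }
}
```

The point (8,8) shows why the second step matters: it lies inside the triangle's bounding box but outside the triangle itself, so a bounding-box-only index would report a false match.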

I’m excited to announce to you and others reading this that I’m currently 
working on a much more sophisticated system for indexing shapes and computing 
intersections that will be much faster.  The first release (within 2-3 weeks) 
will index shapes using the grid, and then any matches will be double-checked 
against a WKB representation stored in Lucene "doc-values".  The subsequent 
release, due within the next ~30 days, will tweak the grid encoding to 
include a little more metadata such that most queries will be completely 
satisfied by examining the fast index grid; only shapes that barely touch an 
indexed shape will have to be double-checked against the WKB representation.  
The net effect should be a dramatic increase in spatial accuracy and 
performance over the current scheme.  You can expect to see a blog post with 
illustrations about this within 30 days.
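
One possible reading of "a little bit more metadata" is a per-cell flag saying whether a grid cell lies fully inside the indexed shape or only touches its boundary. The sketch below illustrates that idea in plain Java. It is an assumption for illustration, not the design of the announced release: the class name, the CellType flag, the use of a circle as the indexed shape, and the point query are all hypothetical simplifications.

```java
/** Grid sketch: cells fully inside a shape answer queries alone; boundary cells need an exact check. */
final class GridCellSketch {

    enum CellType { OUTSIDE, BOUNDARY, INSIDE }

    final CellType[][] cells;
    final double cellSize, cx, cy, r;
    int exactChecks = 0; // how often a query had to fall back to the exact geometry

    /** Indexes a circle (centre cx, cy and radius r) on an n x n grid covering [0, extent) in x and y. */
    GridCellSketch(int n, double extent, double cx, double cy, double r) {
        this.cellSize = extent / n;
        this.cx = cx;
        this.cy = cy;
        this.r = r;
        this.cells = new CellType[n][n];
        for (int row = 0; row < n; row++) {
            for (int col = 0; col < n; col++) {
                cells[row][col] = classify(col * cellSize, row * cellSize);
            }
        }
    }

    /** Classifies the cell whose lower-left corner is (x0, y0) against the circle. */
    private CellType classify(double x0, double y0) {
        double x1 = x0 + cellSize, y1 = y0 + cellSize;
        // Nearest point of the cell to the centre: beyond the radius means no overlap at all.
        double nx = Math.max(x0, Math.min(cx, x1));
        double ny = Math.max(y0, Math.min(cy, y1));
        if (dist2(nx, ny) > r * r) return CellType.OUTSIDE;
        // Farthest corner of the cell from the centre: within the radius means fully covered.
        double fx = Math.max(Math.abs(cx - x0), Math.abs(cx - x1));
        double fy = Math.max(Math.abs(cy - y0), Math.abs(cy - y1));
        return (fx * fx + fy * fy <= r * r) ? CellType.INSIDE : CellType.BOUNDARY;
    }

    private double dist2(double x, double y) {
        return (x - cx) * (x - cx) + (y - cy) * (y - cy);
    }

    /** Point query: the grid alone decides for INSIDE and OUTSIDE cells. */
    boolean matches(double px, double py) {
        int col = (int) Math.floor(px / cellSize);
        int row = (int) Math.floor(py / cellSize);
        if (col < 0 || row < 0 || col >= cells.length || row >= cells.length) return false;
        switch (cells[row][col]) {
            case INSIDE:
                return true;
            case OUTSIDE:
                return false;
            default:
                exactChecks++; // only boundary cells pay for the exact geometry test
                return dist2(px, py) <= r * r;
        }
    }

    public static void main(String[] args) {
        GridCellSketch index = new GridCellSketch(16, 16.0, 8.0, 8.0, 5.0);
        System.out.println("deep inside:   " + index.matches(8.5, 8.5));
        System.out.println("near the edge: " + index.matches(12.5, 8.5));
        System.out.println("far outside:   " + index.matches(15.5, 15.5));
        System.out.println("exact checks:  " + index.exactChecks);
    }
}
```

In this sketch only the query that lands in a boundary cell triggers the exact test; the other two are answered from the grid alone, which is the effect the paragraph above describes.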

~ David

From: Demeter Sztanko <szta...@gmail.com>
Date: Friday, January 17, 2014 at 11:13 AM
To: "Smiley, David W." <dsmi...@mitre.org>
Cc: jts-topo-suite-u...@lists.sourceforge.net
Subject: Re: [Jts-topo-suite-user] Persistent STR tree

Hi David,

First of all, thanks for the development of Lucene - it is an amazing and 
unique library.

Sorry, 10x was a very rough estimate - Lucene is actually 4 times slower.

When using Lucene, I can perform around 700 queries/second (that's 8 threads on 
an 8-core MacBook Pro with SSD disks). With JTS STRtree I was able to get 
around 2800 queries/sec, so that's around a 4x slowdown. I was counting only 
query performance, not indexing.

I am storing bounding-box rectangles as the indexed geometry and the real 
geometry in WKB format as the value field of the record. And I am using 
QuadPrefixTree.

One thing I have noticed is that Lucene deals with lat/lng coordinates only 
and therefore won't allow any other reference system (I am using the British 
National Grid: http://spatialreference.org/ref/epsg/27700/ ), so I had to 
scale down all bounding boxes so the coordinates fit into the 0-180 interval.
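
A minimal sketch of that kind of scaling, in plain Java. The email does not give the actual factor, so everything here is an assumption for illustration: the grid extent of 700,000 m by 1,300,000 m, the target bound of 90 (chosen so both axes stay within a valid latitude range), and the class and method names.

```java
/** Uniform scaling of British National Grid metres into a small degree-like range (sketch). */
final class GridScalingSketch {

    // Assumed extent of the grid in metres (eastings and northings both start at 0).
    static final double MAX_EASTING = 700_000.0;
    static final double MAX_NORTHING = 1_300_000.0;

    // Assumed upper bound for the scaled values; not taken from the email.
    static final double TARGET_MAX = 90.0;

    // One factor for both axes, so shapes keep their proportions.
    static final double SCALE = TARGET_MAX / Math.max(MAX_EASTING, MAX_NORTHING);

    /** Converts grid metres to scaled index units, returned as {x, y}. */
    static double[] toIndexUnits(double easting, double northing) {
        if (easting < 0 || easting > MAX_EASTING || northing < 0 || northing > MAX_NORTHING) {
            throw new IllegalArgumentException("coordinate outside the assumed grid extent");
        }
        return new double[] {easting * SCALE, northing * SCALE};
    }

    /** Converts scaled index units back to grid metres, returned as {easting, northing}. */
    static double[] toGridMetres(double x, double y) {
        return new double[] {x / SCALE, y / SCALE};
    }

    public static void main(String[] args) {
        double[] scaled = toIndexUnits(530_000.0, 180_000.0);
        double[] back = toGridMetres(scaled[0], scaled[1]);
        System.out.println("scaled: " + scaled[0] + ", " + scaled[1]);
        System.out.println("back:   " + back[0] + ", " + back[1]);
    }
}
```

Because the same factor is applied to both axes and to every geometry, intersection results are unchanged; the scaled values are not real latitudes or longitudes, though, so any distance computed in degrees would be meaningless.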

I haven't tried any of the standalone databases, as I believe the network 
overhead alone will kill all possible performance benefits. Also, for other 
reasons, I do not want to deal with those.

I still believe the operations I am performing on the R-tree are relatively 
simple and Lucene is optimised for much more general use, so I have some hopes 
of enhancing its performance.

D.


On Fri, Jan 17, 2014 at 3:39 PM, Smiley, David W. <dsmi...@mitre.org> wrote:
Whoops; forgot to reply-all.


From: "Smiley, David W." <dsmi...@mitre.org>
Date: Friday, January 17, 2014 at 10:15 AM
To: Demeter Sztanko <szta...@gmail.com>
Subject: Re: [Jts-topo-suite-user] Persistent STR tree

Thanks for sharing your experience with Lucene-spatial.  I’m responsible for a 
large part of it.  I don’t think you’re ever going to get the performance of an 
in-memory structure to compare to an on-disk one (even SSD).  Of course if you 
find one then let me know.  FWIW I’m looking to improve the accuracy & 
performance of lucene-spatial a lot this year.  Can you tell me if the indexed 
spatial objects are all points or if it’s mostly non-points?  And was the 10x 
slower just query performance or did that include indexing?

In the NoSQL space (or shall we say… not a relational database space), the 
systems with the best spatial support to my knowledge are MongoDB, CouchDB 
(its spatial module is a separate add-on), and Lucene-spatial.  Your data set 
isn’t huge though; I’d try PostGIS if I were you.  And I’m very impressed with 
what I see in SQL Server.

Good luck,
  ~ David Smiley

From: Demeter Sztanko <szta...@gmail.com>
Date: Friday, January 17, 2014 at 7:56 AM
To: jts-topo-suite-u...@lists.sourceforge.net
Subject: [Jts-topo-suite-user] Persistent STR tree

Hi all,

I need to store around 50M objects in a spatial index (I only need support for 
bulk insert and concurrent intersection() operations). I then need to 
semi-randomly access the objects (that is, I will probably have 300 requests 
within one location, then another 300 in another random location, etc.)

STRtree is great and fast; however, I need around 50GB of RAM to fit the 
tree, which is unfortunately too expensive for me to maintain in the long term.

I need a solution that can run on 1GB of RAM and SSD disks (it's a DigitalOcean 
cloud instance).

I have also tried using Lucene for storing the spatial index, which is also 
feasible but around 10 times slower, even on SSD disks.

I was wondering if you know of any other minimal Java libraries that can do 
what I am looking for and are still relatively fast.


Thanks,

D.
