[ 
https://issues.apache.org/jira/browse/LUCENE-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282976#comment-14282976
 ] 

David Smiley commented on LUCENE-6191:
--------------------------------------

BTW I took a peek at ElasticSearch's geohash aggregations feature to see how 
that similar feature works.  It's quite different.  AFAICT, it's only for 
point data and works off of DocValues, and at least presently it always exposes 
the counts as facets on geohashes (i.e. frequency-ordered geohash terms with 
counts).  Its algorithmic complexity is proportional to the number of documents 
matching your search, O(docs), whereas this patch is O(log(terms)) multiplied 
by the number of grid cells you request.
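
To make the O(docs) side of that comparison concrete, here is a minimal sketch 
of per-document counting into a 2-D grid.  It is only an illustration: it is 
neither ElasticSearch's code nor the code in this patch, and the class and 
method names (PerDocGridCounter, count) are made up for the example.  The 
loop visits every matching point once, so the cost grows with the number of 
matching documents rather than with the number of indexed terms.

```java
import java.util.List;

/** Illustrative only: not ElasticSearch's code and not the LUCENE-6191 patch. */
final class PerDocGridCounter {

    /**
     * Counts points into a rows x cols grid covering [minX, maxX) x [minY, maxY).
     * Each point is an {x, y} pair. Points outside the bounds are skipped.
     * The work done is proportional to points.size(), i.e. O(docs).
     */
    static int[][] count(List<double[]> points,
                         double minX, double maxX,
                         double minY, double maxY,
                         int cols, int rows) {
        int[][] counts = new int[rows][cols];
        double cellW = (maxX - minX) / cols;
        double cellH = (maxY - minY) / rows;
        for (double[] p : points) {
            double x = p[0];
            double y = p[1];
            if (x < minX || x >= maxX || y < minY || y >= maxY) {
                continue; // outside the requested region
            }
            // Clamp to guard against floating-point rounding at the upper edge.
            int col = Math.min(cols - 1, (int) ((x - minX) / cellW));
            int row = Math.min(rows - 1, (int) ((y - minY) / cellH));
            counts[row][col]++;
        }
        return counts;
    }
}
```

Row 0 holds the lowest y values here; that orientation is a choice made for 
the sketch, not something taken from the patch.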

> Spatial 2D faceting (heatmaps)
> ------------------------------
>
>                 Key: LUCENE-6191
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6191
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.1
>
>         Attachments: LUCENE-6191__Spatial_heatmap.patch
>
>
> Lucene spatial's PrefixTree (grid) based strategies index data in a way 
> highly amenable to faceting on grid cells to compute a so-called _heatmap_. 
> The underlying code in this patch uses the PrefixTreeFacetCounter utility 
> class which was recently refactored out of faceting for NumberRangePrefixTree 
> (LUCENE-5735).  At a low level, the terms (== grid cells) are navigated 
> per-segment, forward only with TermsEnum.seek, so it's pretty quick and 
> furthermore requires no extra caches & no docvalues.  Ideally you should use 
> QuadPrefixTree (or Flex once it comes out) to maximize the number of grid levels 
> which in turn maximizes the fidelity of choices when you ask for a grid 
> covering a region.  Conveniently, the provided capability returns the data in 
> a 2-D grid of counts, so the caller needn't know a thing about how the data 
> is encoded in the prefix tree.  Well almost... at this point they need to 
> provide a grid level, but I'll soon provide a means of deriving the grid 
> level based on a min/max cell count.
> I recommend QuadPrefixTree with geo=false so that you can provide a square 
> world-bounds (360x360 degrees), which means square grid cells which are more 
> desirable to display than rectangular cells.
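
The square-versus-rectangular point in the quoted description comes down to 
simple arithmetic, sketched below.  The sketch assumes that each quad-tree 
level halves the world bounds along both axes, with level 0 taken to be the 
whole world; the class and method names (QuadCellSize, cellSize) are 
hypothetical and are not part of Lucene's API.

```java
/** Illustrative arithmetic only; not part of Lucene's API. */
final class QuadCellSize {

    /**
     * Returns {width, height} of one cell at the given level, in the same
     * units as the world bounds. Assumes level 0 is the whole world and each
     * further level splits every cell 2x2.
     */
    static double[] cellSize(double worldWidth, double worldHeight, int level) {
        double divisions = Math.pow(2, level); // cells per axis at this level
        return new double[] { worldWidth / divisions, worldHeight / divisions };
    }

    public static void main(String[] args) {
        double[] square = cellSize(360, 360, 3); // square world bounds
        double[] rect = cellSize(360, 180, 3);   // 360x180 world bounds
        System.out.println(square[0] + " x " + square[1]);
        System.out.println(rect[0] + " x " + rect[1]);
    }
}
```

With 360x360 bounds the width and height of a cell are equal at every level, 
while 360x180 bounds give cells twice as wide as they are tall.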



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
