[
https://issues.apache.org/jira/browse/SOLR-7005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286979#comment-14286979
]
David Smiley commented on SOLR-7005:
------------------------------------
The perf tests on the Lucene issue covered the whole world. For fun I went to
a small part of the globe, the NYC region, and generated a heatmap PNG at 9.5m
per pixel (cell), 467x467 cells, on this 16-segment (unoptimized) index, and it
only took 44ms (including returning the base64'ed PNG to the client). For
more fun I upped the size to a huge 2096x2096 (4.4M pixels aka cells, bigger
than most people's screens) and it took 521ms. It's sparse though; PNG
encoding consumed 90% of the time since it was so large; normally it's a few
percent. A spatial filter of the same region shows 1298 docs, so yes, it's
sparse relative to the heatmap/image as a whole. This shows this technique can
be used for point-plotting purposes when you might want to plot all the points
in a search region without them necessarily being in your top-X search list.
That use case (high-res but sparse) suggests a different response format than
a grid of ints, such as simply listing point coordinates with counts. Doing
that is for another time though.
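
As a rough illustration of the sparse response idea above, here is a minimal
client-side sketch that turns a grid of ints into a list of (x, y, count)
entries. It reuses the field names from the output shown in the issue
description quoted below (columns, rows, minX, maxX, minY, maxY, counts). The
function name sparse_cells is hypothetical, and the row order (first row is
the northernmost) is an assumption, since the description does not state it.

```python
from typing import List, Optional, Tuple


def sparse_cells(heatmap: dict) -> List[Tuple[float, float, int]]:
    """Return (x, y, count) for every non-empty cell, using cell centers."""
    counts: Optional[list] = heatmap.get("counts")
    if counts is None:
        # The description says counts is null when every cell would be 0.
        return []
    cols, rows = heatmap["columns"], heatmap["rows"]
    cell_w = (heatmap["maxX"] - heatmap["minX"]) / cols
    cell_h = (heatmap["maxY"] - heatmap["minY"]) / rows
    out = []
    for r, row in enumerate(counts):
        if row is None:
            # Tolerate null rows; the description only floats this as an idea.
            continue
        for c, n in enumerate(row):
            if n:
                x = heatmap["minX"] + (c + 0.5) * cell_w
                # ASSUMPTION: row 0 is the top (northernmost) row.
                y = heatmap["maxY"] - (r + 0.5) * cell_h
                out.append((x, y, n))
    return out
```

With 1298 docs spread over 4.4M cells, a list like this would hold at most
1298 entries instead of millions of mostly-zero ints.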
> facet.heatmap for spatial heatmap faceting on RPT
> -------------------------------------------------
>
> Key: SOLR-7005
> URL: https://issues.apache.org/jira/browse/SOLR-7005
> Project: Solr
> Issue Type: New Feature
> Components: spatial
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 5.1
>
> Attachments: SOLR-7005_heatmap.patch, heatmap_512x256.png,
> heatmap_64x32.png
>
>
> This is a new feature that uses the new spatial Heatmap / 2D PrefixTree cell
> counter in Lucene spatial LUCENE-6191. This is a form of faceting, and
> as such, I think it should live in the "facet" parameter namespace. Here's
> what the parameters are:
> * facet=true
> * facet.heatmap=fieldname
> * facet.heatmap.bbox=\["-180 -90" TO "180 90"]
> * facet.heatmap.gridLevel=6
> * facet.heatmap.distErrPct=0.10
> Like other faceting features, the fieldname can have local-params to exclude
> filter queries or specify an output key.
> The bbox is optional; by default you get the whole world, or you can specify
> a box or actually any shape that WKT supports (the bounding box of whatever
> you supply is used).
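
A minimal sketch of how a client might assemble these parameters into a query
string, using only the parameter names proposed above. The function name
heatmap_query and the field name "geo" are hypothetical, and q=*:* is just a
placeholder query.

```python
from urllib.parse import urlencode


def heatmap_query(field, bbox=None, grid_level=None, dist_err_pct=None):
    """Build a URL-encoded query string for the proposed heatmap facet."""
    params = [("q", "*:*"), ("facet", "true"), ("facet.heatmap", field)]
    if bbox is not None:
        # Optional; per the description, omitting it means the whole world.
        params.append(("facet.heatmap.bbox", bbox))
    if grid_level is not None:
        params.append(("facet.heatmap.gridLevel", str(grid_level)))
    if dist_err_pct is not None:
        params.append(("facet.heatmap.distErrPct", str(dist_err_pct)))
    return urlencode(params)


# "geo" is a hypothetical RPT field name.
qs = heatmap_query("geo", bbox='["-180 -90" TO "180 90"]', grid_level=6)
```

urlencode takes care of escaping the brackets, quotes, and spaces in the bbox
value, so it can be written as it appears in the parameter list.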
> Ultimately, this feature needs to know the grid level, which together with
> the input shape will yield a certain number of cells. You can specify
> gridLevel exactly, or omit it and instead provide distErrPct, which is
> computed the same way as for the RPT field type in the schema. 0.10 yielded ~4k
> cells but it'll vary. There's also a facet.heatmap.maxCells safety net
> defaulting to 100k. Exceed this and you get an error.
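
To make the cell arithmetic concrete, here is a small sketch. It assumes a
quad-style prefix tree over the whole world in which each grid level doubles
the resolution along each axis; that matches the gridLevel=6, columns=64,
rows=64 sample output further down and the "~4k cells" figure, but other
prefix trees (geohash, for instance) divide space differently. The function
names are hypothetical, and the check only mimics the maxCells safety net; it
is not the server-side code.

```python
def quad_grid(grid_level):
    """Columns and rows for a whole-world quad grid at the given level."""
    side = 2 ** grid_level  # ASSUMPTION: each level halves the cell size per axis
    return side, side


def check_max_cells(columns, rows, max_cells=100_000):
    """Return the cell count, or raise if it exceeds the safety limit."""
    cells = columns * rows
    if cells > max_cells:
        raise ValueError(
            "%d cells exceeds the limit of %d" % (cells, max_cells))
    return cells


columns, rows = quad_grid(6)
cells = check_max_cells(columns, rows)  # 64 * 64 = 4096
```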
> The output is (JSON):
> {noformat}
> {gridLevel=6,columns=64,rows=64,minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0,counts=[[0,
> 0, 2, 1, ....],[1, 1, 3, 2, ...],...]}
> {noformat}
> counts is null if all would be 0. Perhaps individual row arrays should
> likewise be null... I welcome feedback.
> I'm toying with an output format option in which you can specify a base-64'ed
> grayscale PNG.
> Obviously this should support sharded / distributed environments.