[
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701990#comment-16701990
]
Rinka Singh edited comment on LUCENE-7745 at 11/28/18 3:08 PM:
---------------------------------------------------------------
[~jpountz]
{quote}(Unrelated to your comment Rinka, but seeing activity on this issue
reminded me that I wanted to share something) There are limited use-cases for
GPU acceleration in Lucene due to the fact that query processing is full of
branches, especially since we added support for impacts and WAND.{quote}
Yes, branches do hurt performance, but well-designed GPU code combines CPU
code (the decision-making part) with GPU code. For example, I wrote a
histogram as a test case that saw significant acceleration, and I also
identified further areas of the code that can be improved. My gut feel is that
I can squeeze out an improvement of at least 40-50x on a mid-sized GPU, given
the time. I expect things to be much better on a high-end GPU, with further
scale-up on a multi-GPU system.
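As an illustration only (this is not the test case mentioned above), here is a minimal CPU-side Java sketch of a pattern commonly used for GPU histograms: each worker fills a private partial histogram, and the partials are merged at the end so that workers never contend on shared counters. The class name, the `chunks` parameter, and the "value mod bins" binning rule are all assumptions made for the example.

```java
import java.util.stream.IntStream;

/** Hypothetical sketch; the names and the binning rule are illustrative assumptions. */
final class ParallelHistogramSketch {

    /**
     * Counts values into {@code bins} buckets using {@code chunks} independent workers.
     * Assumes bins > 0 and chunks > 0. Bin index is assumed to be value mod bins.
     */
    static int[] histogram(int[] values, int bins, int chunks) {
        int chunkSize = (values.length + chunks - 1) / chunks; // ceiling division
        return IntStream.range(0, chunks)
            .parallel()
            .mapToObj(c -> {
                // Private partial histogram: no sharing between workers.
                int[] local = new int[bins];
                int from = Math.min(values.length, c * chunkSize);
                int to = Math.min(values.length, from + chunkSize);
                for (int i = from; i < to; i++) {
                    local[Math.floorMod(values[i], bins)]++;
                }
                return local;
            })
            // Merge step: element-wise sum into a fresh array, so the identity is never mutated.
            .reduce(new int[bins], (a, b) -> {
                int[] sum = new int[bins];
                for (int i = 0; i < bins; i++) {
                    sum[i] = a[i] + b[i];
                }
                return sum;
            });
    }
}
```

On a GPU the private histograms would typically live in per-block shared memory and the merge would use atomic adds; the Java version only shows the shape of the computation.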
My point is that thinking in (GPU) parallel is a completely different ball game
and requires a mind shift. Once that happens, the value add will be massive,
and my gut tells me Lucene is a huge opportunity.
Incidentally, this is why I want to develop a library that I can put out there
for integration.
{quote}That said Mike initially mentioned that BooleanScorer might be one
scorer that could benefit from GPU acceleration as it scores large blocks of
documents at once. I just attached a specialization of a disjunction over term
queries that should make it easy to experiment with CUDA, see the TODO at the
end, on top of the computeScores method.
{quote}
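To make "scores large blocks of documents at once" concrete, here is a rough Java sketch of block-wise scoring for a disjunction over terms. It is not the attached TermDisjunctionQuery.java and uses no Lucene APIs: the class name, the array-based postings layout, and the window size are assumptions chosen for illustration. The inner accumulation loop is the regular, branch-light part that one might try to offload to a GPU.

```java
import java.util.Arrays;

/** Hypothetical sketch; none of these names are Lucene APIs. */
final class BlockDisjunctionSketch {

    /** Number of consecutive doc IDs scored per window (assumed value for the example). */
    static final int WINDOW_SIZE = 2048;

    /**
     * Scores the window of doc IDs [windowBase, windowBase + WINDOW_SIZE).
     *
     * @param postings   postings[t] is the sorted list of doc IDs matching term t
     * @param termScores termScores[t][i] is term t's score contribution for postings[t][i]
     * @return scores indexed by (docId - windowBase); 0 means no term matched
     */
    static float[] scoreWindow(int windowBase, int[][] postings, float[][] termScores) {
        float[] scores = new float[WINDOW_SIZE];
        int windowEnd = windowBase + WINDOW_SIZE;
        for (int t = 0; t < postings.length; t++) {
            int[] docs = postings[t];
            // Find the first posting that falls inside the window.
            int start = Arrays.binarySearch(docs, windowBase);
            if (start < 0) {
                start = -start - 1; // convert "not found" result to the insertion point
            }
            // Accumulate this term's contributions for every doc in the window.
            for (int i = start; i < docs.length && docs[i] < windowEnd; i++) {
                scores[docs[i] - windowBase] += termScores[t][i];
            }
        }
        return scores;
    }
}
```

For example, with postings `{{1, 5, 2049}, {5, 7}}`, calling `scoreWindow(0, ...)` sums both terms' contributions for doc 5, while doc 2049 is only picked up by the next window, `scoreWindow(2048, ...)`.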
Lucene is really new to me, and so is working with Apache (sorry, I am a
newbie to Apache) :). Could you please post links here?
> Explore GPU acceleration
> ------------------------
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ishan Chattopadhyaya
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: gsoc2017, mentor
> Attachments: TermDisjunctionQuery.java, gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be sped up if computations
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as
> much as 12GB of high bandwidth RAM, we might be able to leverage GPUs to
> speed up parts of Lucene (indexing, search).
> The first thing that comes to mind is spatial filtering, which is traditionally known
> to be a good candidate for GPU based speedup (esp. when complex polygons are
> involved). In the past, Mike McCandless has mentioned that "both initial
> indexing and merging are CPU/IO intensive, but they are very amenable to
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I
> volunteer to mentor any GSoC student willing to work on this this summer.