[
https://issues.apache.org/jira/browse/LUCENE-7745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16701855#comment-16701855
]
Rinka Singh commented on LUCENE-7745:
-------------------------------------
Hi everyone,
I wanted to check if this issue was still open. I have been experimenting with
CUDA for a bit and would love to take a stab at this.
A few thoughts:
* This is something I'll do over weekends and so I'm going to be horribly slow
(it's going to be just me on this unless you have someone working on it and I
can collaborate with them) - would that be OK?
* I think the right thing to do would be to build a CUDA library (C/C++), put
a JNI layer on top of it, and then integrate it into Lucene (a rough sketch of
what I mean is below, after this list). If done right, I think this library
will be useful to (and be possible to integrate with) other analytics tools.
* If I get it right, then I'd love to create an open-source library that other
open-source tools can integrate and use (yes, I'm thinking of an OpenCL port in
the future, but given the tools available in CUDA and my familiarity with it...)
* Licensing is not an issue as I prefer the Apache License.
* Testing (especially scalability testing) will be an issue - like you said,
your setups won't have GPUs, but would it be possible to rent a few GPU
instances on the cloud (AWS, Google)? I can do my dev testing locally, as I
have a GPU on my dev machine (it's a pretty old and obsolete one, but good
enough for my needs).
* It is important to get a few users who will experiment with this. Can you
guys help find someone to deploy it, experiment, and give feedback?
* I would rather work on something that is used by everyone, so indexing,
filtering, and searching are what I'm thinking of taking up:
[http://lucene.apache.org/core/7_5_0/demo/overview-summary.html#overview.description]
** These can certainly be accelerated. I think I should be able to get some
acceleration out of a GPU-enabled search.
** The good part of this is that one should be able to scale volumes almost
linearly on a multi-GPU machine.
** Related to the previous point (though this is in the future): I don't have
a multi-GPU setup and will not be able to develop multi-GPU versions. I'll need
help getting the infrastructure to do that. We can talk about that once a
single-GPU version is done.
** Yes, I agree that it will be better to have a separate library / classes
doing this rather than integrating it directly into Lucene's class library.
This suits me too, as I can develop it as a separate library that other
open-source components can integrate, and I can package it as part of NVIDIA's
open-source libraries.
* I'm open to other alternatives - I scanned the ideas above but didn't take
them up, as they would not bring massive value to users, and I'd rather stick
to areas where I know what I'm doing than experiment.
* Related to the previous point, I don't know Lucene (help!! - do I really
need to?) and will need support/hand-holding in reviewing the identification,
interfacing, design, code, etc.
* Finally, this IS GOING TO take time, because thinking (and programming)
massively parallel is completely different from writing a simple sequential
search and sort. How much time? Think 7-10x at least, given all my constraints.
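To make the CUDA-library-behind-JNI idea (and the parallel filtering/search I
mentioned above) a bit more concrete, here is a very rough sketch. Everything
in it is hypothetical - the org.apache.lucene.gpu.GpuFilter class, the
rangeFilter method, and the kernel are names I made up to show the shape I have
in mind (a .cu file exporting a JNI entry point, copying a batch of
per-document values to the GPU, running a one-thread-per-document filter
kernel, and copying the match flags back); it is not an actual Lucene API:
{code:cpp}
// gpu_filter.cu -- hypothetical sketch of a CUDA library exposed to Lucene via JNI.
// The (equally hypothetical) Java side would declare something like:
//   package org.apache.lucene.gpu;
//   public class GpuFilter {
//       static { System.loadLibrary("lucenegpu"); }
//       public static native void rangeFilter(long[] values, byte[] matches,
//                                             long lo, long hi);
//   }
#include <jni.h>
#include <cuda_runtime.h>

// One thread per document value: flag the values that fall inside [lo, hi].
__global__ void rangeFilterKernel(const long long* values, unsigned char* matches,
                                  long long lo, long long hi, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        matches[i] = (values[i] >= lo && values[i] <= hi) ? 1 : 0;
    }
}

// JNI entry point matching the hypothetical native method declared above.
extern "C" JNIEXPORT void JNICALL
Java_org_apache_lucene_gpu_GpuFilter_rangeFilter(JNIEnv* env, jclass,
                                                 jlongArray jvalues, jbyteArray jmatches,
                                                 jlong lo, jlong hi) {
    jsize n = env->GetArrayLength(jvalues);

    // Pin/copy the Java arrays into host buffers.
    jlong* hostValues = env->GetLongArrayElements(jvalues, nullptr);
    jbyte* hostMatches = env->GetByteArrayElements(jmatches, nullptr);

    // Allocate device buffers and push the per-document values to the GPU.
    long long* devValues = nullptr;
    unsigned char* devMatches = nullptr;
    cudaMalloc(&devValues, n * sizeof(long long));
    cudaMalloc(&devMatches, n * sizeof(unsigned char));
    cudaMemcpy(devValues, hostValues, n * sizeof(long long), cudaMemcpyHostToDevice);

    // Launch one thread per value.
    int threads = 256;
    int blocks = (int)((n + threads - 1) / threads);
    rangeFilterKernel<<<blocks, threads>>>(devValues, devMatches, lo, hi, (int)n);
    cudaDeviceSynchronize();

    // Pull the match flags back and release everything.
    cudaMemcpy(hostMatches, devMatches, n * sizeof(unsigned char), cudaMemcpyDeviceToHost);
    cudaFree(devValues);
    cudaFree(devMatches);
    env->ReleaseLongArrayElements(jvalues, hostValues, JNI_ABORT);  // values were read-only
    env->ReleaseByteArrayElements(jmatches, hostMatches, 0);        // copy flags back to Java
}
{code}
Obviously a real version would keep the data resident on the GPU across calls
(the host/device copies above would dominate otherwise) and batch work up, but
splitting the value array across devices is also what should make the multi-GPU
scaling I mentioned fairly natural.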
If you guys like, I can write a brief (one- or two-paragraph) description of
what is possible for indexing, searching, and filtering (with zero knowledge of
Lucene, of course) to start off...
Your thoughts please...
> Explore GPU acceleration
> ------------------------
>
> Key: LUCENE-7745
> URL: https://issues.apache.org/jira/browse/LUCENE-7745
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ishan Chattopadhyaya
> Assignee: Ishan Chattopadhyaya
> Priority: Major
> Labels: gsoc2017, mentor
> Attachments: gpu-benchmarks.png
>
>
> There are parts of Lucene that can potentially be speeded up if computations
> were to be offloaded from CPU to the GPU(s). With commodity GPUs having as
> high as 12GB of high-bandwidth RAM, we might be able to leverage GPUs to
> speed up parts of Lucene (indexing, search).
> First that comes to mind is spatial filtering, which is traditionally known
> to be a good candidate for GPU based speedup (esp. when complex polygons are
> involved). In the past, Mike McCandless has mentioned that "both initial
> indexing and merging are CPU/IO intensive, but they are very amenable to
> soaking up the hardware's concurrency."
> I'm opening this issue as an exploratory task, suitable for a GSoC project. I
> volunteer to mentor any GSoC student willing to work on this this summer.