I was partly right. In earlier versions (pre-1.0), OutputCollector was not 
thread-safe (due to the shuffle grouping implementation). 1.0 introduced a new 
shuffle grouping implementation that is thread-safe, in addition to a 
load-aware shuffle grouping. I had missed the fact that the new shuffle 
grouping is thread-safe.

-Taylor

> On Apr 28, 2016, at 1:48 PM, Steven Lewis <[email protected]> wrote:
> 
> We tried implementing our own thread safe output collector but it seems 
> ridiculous for such a concurrent system. Why don’t they implement it in core 
> Storm? They can have the current one, and then build a separate one called 
> Concurrent_OutputCollector or some such. I was hoping they fixed that in 
> version 1.0 and I missed it.
> 
> From: Stephen Powis <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Thursday, April 28, 2016 at 10:58 AM
> To: "[email protected]" <[email protected]>
> Subject: Re: thread safe output collector
> 
> So the Spout documentation (assuming it's correct...) here 
> (http://storm.apache.org/releases/current/Concepts.html#spouts) mentions 
> this:
> 
> "The main method on spouts is nextTuple. nextTuple either emits a new tuple 
> into the topology or simply returns if there are no new tuples to emit. It is 
> imperative that nextTuple does not block for any spout implementation, 
> because Storm calls all the spout methods on the same thread."
> 
> When developing a custom spout we interpreted it to mean that any "real work" 
> done by a spout should be done in a separate thread, and decided on the 
> following pattern, which seems somewhat relevant to what you are trying to do 
> in your bolts.
> 
> On Spout prepare, we create a concurrent/thread safe queue.  We then create a 
> new Thread passing it a reference to our thread safe queue.  This thread 
> handles finding new data that needs to be emitted.  When that thread finds 
> data, it adds it to the shared queue.  When the spout's nextTuple() method is 
> called, it looks for data on the shared queue and emits it.
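
For reference, a minimal sketch of the queue-backed spout pattern described 
above, in plain Java so it runs without Storm on the classpath. The class name 
QueueBackedSpoutSketch, the Consumer<String> emitter (standing in for the 
spout's collector), and the open(List)/awaitFetcher() helpers are assumptions 
made for illustration only; they are not Storm APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for a spout. A real spout would implement Storm's
// spout interface and emit through the collector it is given.
class QueueBackedSpoutSketch {
    // The only state shared between the fetcher thread and the caller.
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    private final Consumer<String> emitter; // assumption: stands in for emit
    private Thread fetcher;

    QueueBackedSpoutSketch(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    // Plays the role of the spout's setup hook: start the background thread
    // that finds new data and puts it on the shared queue.
    void open(List<String> source) {
        fetcher = new Thread(() -> {
            for (String item : source) {
                pending.add(item);
            }
        }, "spout-fetcher");
        fetcher.setDaemon(true);
        fetcher.start();
    }

    // Plays the role of nextTuple(): never blocks, emits at most one item.
    boolean nextTuple() {
        String item = pending.poll(); // returns null at once if queue is empty
        if (item == null) {
            return false;
        }
        emitter.accept(item); // the emit happens on the caller's thread
        return true;
    }

    // Demo helper: wait until the fetcher has queued everything.
    void awaitFetcher() {
        try {
            fetcher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        QueueBackedSpoutSketch spout = new QueueBackedSpoutSketch(emitted::add);
        spout.open(Arrays.asList("a", "b", "c"));
        spout.awaitFetcher();
        while (spout.nextTuple()) {
            // keep draining until the queue is empty
        }
        System.out.println(emitted); // prints [a, b, c]
    }
}
```

The point of the design is that the fetcher thread only ever touches the 
queue; every emit is made from the thread that calls nextTuple().
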
> 
> I imagine doing async processing in a bolt using one or more threads could 
> work with a similar pattern.  On prepare you set up your thread(s) with 
> references to a shared queue.  The bolt passes work to be completed to the 
> thread(s), and the thread(s) communicate the result back to the bolt via a 
> shared queue.  Add in the concept of tick tuples to ensure your bolt checks 
> for completed work on a regular basis?
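
A rough sketch of the bolt variant described above, again in plain Java with 
no Storm dependency. AsyncBoltSketch, the Consumer<String> emitter, onTick(), 
and shutdownAndWait() are hypothetical names chosen for illustration, and the 
upper-casing is just a placeholder for the real asynchronous work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for a bolt that does its work on background threads
// but only emits from the thread that calls into it.
class AsyncBoltSketch {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    private final Queue<String> completed = new ConcurrentLinkedQueue<>();
    private final Consumer<String> emitter; // assumption: stands in for emit

    AsyncBoltSketch(Consumer<String> emitter) {
        this.emitter = emitter;
    }

    // Plays the role of execute() for a normal tuple: hand the work to a
    // worker thread and return immediately.
    void execute(String input) {
        workers.execute(() -> completed.add(input.toUpperCase()));
    }

    // Plays the role of execute() for a tick tuple: drain finished results
    // and emit them on the calling thread. Returns how many were emitted.
    int onTick() {
        int emittedCount = 0;
        String result;
        while ((result = completed.poll()) != null) {
            emitter.accept(result);
            emittedCount++;
        }
        return emittedCount;
    }

    // Demo helper: stop accepting work and wait for the workers to finish.
    void shutdownAndWait() {
        workers.shutdown();
        try {
            workers.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        AsyncBoltSketch bolt = new AsyncBoltSketch(emitted::add);
        bolt.execute("a");
        bolt.execute("b");
        bolt.shutdownAndWait(); // demo only: make sure the work is done
        int drained = bolt.onTick();
        System.out.println(drained + " results emitted");
    }
}
```

Results can arrive in any order because two workers run concurrently. In a 
real bolt, acking the input tuples would presumably also need to happen on the 
bolt's own thread, since acks go through the same collector.
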
> 
> Is there a better way to do this?
> 
> On Thu, Apr 28, 2016 at 11:22 AM, Julien Nioche <[email protected]> wrote:
> Thanks for the clarification
> 
> On 28 April 2016 at 15:12, P. Taylor Goetz <[email protected]> wrote:
> The documentation is wrong. See:
> 
> https://issues.apache.org/jira/browse/STORM-841
> 
> At some point it looks like the change made there got reverted. I will reopen 
> it to make sure the documentation is corrected.
> 
> OutputCollector is NOT thread-safe.
> 
> -Taylor
> 
>> On Apr 28, 2016, at 9:06 AM, Stephen Powis <[email protected]> wrote:
>> 
>> "Its perfectly fine to launch new threads in bolts that do processing 
>> asynchronously. OutputCollector 
>> <http://storm.apache.org/releases/current/javadocs/org/apache/storm/task/OutputCollector.html>
>>  is thread-safe and can be called at any time."
>> 
>> 
>> From the docs for 0.9.6: 
>> http://storm.apache.org/releases/0.9.6/Concepts.html#bolts
>> 
>> On Thu, Apr 28, 2016 at 9:03 AM, P. Taylor Goetz <[email protected]> wrote:
>> IIRC there was discussion about making it thread safe, but I don't believe 
>> it was implemented.
>> 
>> -Taylor
>> 
>> On Apr 28, 2016, at 3:52 AM, Julien Nioche <[email protected]> wrote:
>> 
>>> Hi Stephen
>>> 
>>> I asked the same question in February but did not get a reply
>>> 
>>> https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E
>>> 
>>> Anyone who could confirm this?
>>> 
>>> Thanks
>>> 
>>> On 27 April 2016 at 14:05, Steven Lewis <[email protected]> wrote:
>>> I have conflicting information and have not checked personally, but has 
>>> the output collector finally been made thread-safe for emitting in version 
>>> 1.0 or 0.10? I know it was a huge problem in 0.9.5 when trying to do 
>>> threading in a bolt for async future calls and emitting once they return.
>>> 
>>> This email and any files transmitted with it are confidential and intended 
>>> solely for the individual or entity to whom they are addressed. If you have 
>>> received this email in error destroy it immediately. *** Walmart 
>>> Confidential ***
>>> 
>>> 
>>> 
>>> --
>>> 
>>> Open Source Solutions for Text Engineering
>>> 
>>> http://www.digitalpebble.com
>>> http://digitalpebble.blogspot.com/
>>> #digitalpebble <http://twitter.com/digitalpebble>
>> 
> 
