RJ, could you provide a code example that reproduces the bug you
observed in local testing? Breeze's += is not thread-safe, but in a
Spark job, calls to the resultHandler are synchronized:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala#L52
Let's move our discussion to the JIRA page. -Xiangrui
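
To make the distinction concrete, here is a minimal sketch in plain Scala.
It uses neither Breeze nor Spark: Array[Double] stands in for a Breeze
vector, and addInPlace, add, sumPartials and resultHandler are hypothetical
names chosen for illustration, not the actual MLlib or JobWaiter code. It
only shows the general pattern under discussion: an in-place update is safe
when every call that touches the shared accumulator goes through one lock.

```scala
import java.util.concurrent.{Executors, TimeUnit}

object SynchronizedMergeSketch {
  // Stand-in for an in-place update such as `+=`: mutates `acc`.
  def addInPlace(acc: Array[Double], x: Array[Double]): Unit = {
    var i = 0
    while (i < acc.length) { acc(i) += x(i); i += 1 }
  }

  // Stand-in for an allocating `+`: returns a new array, inputs untouched.
  def add(a: Array[Double], b: Array[Double]): Array[Double] =
    Array.tabulate(a.length)(i => a(i) + b(i))

  // Sums `numTasks` partial results of ones into a shared accumulator.
  def sumPartials(numTasks: Int, dim: Int): Array[Double] = {
    val acc = new Array[Double](dim)
    val lock = new Object

    // Hypothetical handler: every merge into `acc` holds the same lock,
    // so the in-place update is never executed by two threads at once.
    def resultHandler(partial: Array[Double]): Unit = lock.synchronized {
      addInPlace(acc, partial)
    }

    val pool = Executors.newFixedThreadPool(4)
    (1 to numTasks).foreach { _ =>
      pool.execute(new Runnable {
        // Each task builds its own partial result; only the merge is shared.
        def run(): Unit = resultHandler(Array.fill(dim)(1.0))
      })
    }
    pool.shutdown()
    pool.awaitTermination(30, TimeUnit.SECONDS)
    acc
  }
}
```

Calling SynchronizedMergeSketch.sumPartials(1000, 8) should give an array
whose entries all equal 1000.0. Without the lock.synchronized wrapper,
updates from different threads could interleave and some could be lost,
which is the failure mode an unguarded += risks in a parallel context.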

On Wed, Sep 3, 2014 at 12:07 PM, RJ Nowling <rnowl...@gmail.com> wrote:
> Here's the JIRA:
>
> https://issues.apache.org/jira/browse/SPARK-3384
>
> Even if the current implementation uses += in a thread-safe manner, it is
> easy to accidentally use += in a parallelized context. I suggest changing
> all instances of += to +.
>
> I would encourage others to reproduce and validate this issue, though.
>
>
> On Wed, Sep 3, 2014 at 3:02 PM, David Hall <d...@cs.berkeley.edu> wrote:
>
>> Mutating operations are not thread-safe. Operations that don't mutate
>> should be thread-safe. I can't speak to what Evan said, but I would guess
>> that the way they're using += should be safe.
>>
>>
>> On Wed, Sep 3, 2014 at 11:58 AM, RJ Nowling <rnowl...@gmail.com> wrote:
>>
>>> David,
>>>
>>> Can you confirm that += is not thread safe but + is?  I'm assuming +
>>> allocates a new object for the write, while += doesn't.
>>>
>>> Thanks!
>>> RJ
>>>
>>>
>>> On Wed, Sep 3, 2014 at 2:50 PM, David Hall <d...@cs.berkeley.edu> wrote:
>>>
>>>> In general, in Breeze we allocate separate work arrays for each call to
>>>> lapack, so it should be fine. In general concurrent modification isn't
>>>> thread safe of course, but things that "ought" to be thread safe really
>>>> should be.
>>>>
>>>>
>>>> On Wed, Sep 3, 2014 at 10:41 AM, RJ Nowling <rnowl...@gmail.com> wrote:
>>>>
>>>>> No, not in all cases. Since Breeze uses LAPACK under the hood,
>>>>> concurrent changes to memory from different threads are a problem.
>>>>>
>>>>> There's actually a potential bug in the KMeans code where it uses +=
>>>>> instead of +.
>>>>>
>>>>>
>>>>> On Wed, Sep 3, 2014 at 1:26 PM, Ulanov, Alexander <alexander.ula...@hp.com> wrote:
>>>>>
>>>>> > Hi,
>>>>> >
>>>>> > Is the Breeze library thread-safe when called from Spark MLlib code,
>>>>> > in the case where native libs for BLAS and LAPACK are used? Might it
>>>>> > be an issue when running Spark locally?
>>>>> >
>>>>> > Best regards, Alexander
>>>>> > ---------------------------------------------------------------------
>>>>> > To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>>>>> > For additional commands, e-mail: dev-h...@spark.apache.org
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>> --
>>>>> em rnowl...@gmail.com
>>>>> c 954.496.2314
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
