Was anyone able to find a solution or a recommended configuration for this? I
am running into the same "java.lang.OutOfMemoryError: Direct buffer memory",
but during snappy compression.
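For anyone who lands here later, a minimal sketch of the kind of settings I am
experimenting with (both keys are real Spark settings; the values are
illustrative guesses, not a confirmed fix):

    // illustrative values only
    val conf = new org.apache.spark.SparkConf()
      // raise the JVM's cap on direct buffers used by compression/NIO
      .set("spark.executor.extraJavaOptions", "-XX:MaxDirectMemorySize=1g")
      // or sidestep snappy for shuffle/IO compression altogether
      .set("spark.io.compression.codec", "lzf")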
Thanks,
Aniket
On Tue, Sep 23, 2014 at 7:04 PM Aaron Davidson [via Apache Spark Developers
List] wrote:
Thanks Ryan. I am running into this rarer issue. For now, I have moved away
from parquet, but I will create a bug in JIRA if I am able to produce code
that easily reproduces this.
Thanks,
Aniket
On Mon, Nov 21, 2016, 3:24 PM Ryan Blue [via Apache Spark Developers List]
wrote:
Hi Chris,
This is super cool. I was wondering whether this will be an open source
project so that people can contribute to it or reuse it?
Thanks,
Aniket
On Thu Jan 15 2015 at 07:39:29 Mattmann, Chris A (3980) [via Apache Spark
Developers List] wrote:
> Hi Spark Devs,
>
> Just wanted to FYI t
Hi Patrick,
I am wondering if this version will address the issues around certain
artifacts not getting published in 1.2, which are blocking people from
migrating to 1.2. One such issue is
https://issues.apache.org/jira/browse/SPARK-5144
Thanks,
Aniket
On Wed Jan 28 2015 at 15:39:43 Patrick Wendell [via Apache Spark Developers
List] wrote:
painful and I share the pain :)
Thanks,
Aniket
On Tue, Sep 15, 2015, 5:06 AM sim [via Apache Spark Developers List] <
ml-node+s1001551n14116...@n3.nabble.com> wrote:
> I'd like to get some feedback on an API design issue pertaining to RDDs.
>
> The design goal is to avoid RDD nesting
My apologies in advance if this is the wrong mailing list for this topic. I
am working on a small project to provide a web interface to the Spark REPL.
The interface will allow people to use the Spark REPL and perform exploratory
analysis on their data. I already have a Play application running that
provides the web interface
I too would like this feature. Erik's post makes sense. However, shouldn't
the RDD also repartition itself after a drop, to make effective use of
cluster resources?
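A sketch of the idea in current API terms (drop does not exist yet, so the
filter below stands in for it; zipWithIndex, filter, and repartition are all
real RDD methods):

    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    // emulate drop(n), then rebalance, since the surviving rows may be
    // skewed towards the later partitions
    def dropAndRebalance[T: ClassTag](rdd: RDD[T], n: Long): RDD[T] =
      rdd.zipWithIndex()
         .filter { case (_, idx) => idx >= n } // keep all rows after the first n
         .map(_._1)
         .repartition(rdd.partitions.length)   // spread the remainder across the cluster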
On Jul 21, 2014 8:58 PM, "Andrew Ash [via Apache Spark Developers List]" <
ml-node+s1001551n7434...@n3.nabble.com> wrote:
> Personally
Looks like the same issue as
http://mail-archives.apache.org/mod_mbox/spark-dev/201409.mbox/%3ccajob8btdxks-7-spjj5jmnw0xsnrjwdpcqqtjht1hun6j4z...@mail.gmail.com%3E
On Sep 20, 2014 11:09 AM, "tian zhang [via Apache Spark Developers List]" <
ml-node+s1001551n8481...@n3.nabble.com> wrote:
>
>
> Hi,
Hi all,
I am stuck on a problem that I ran into recently. The problem statement is as
follows:
An Event bean consists of eventId, eventTag, text, etc.
We need to run a Spark job that aggregates on the eventTag column and picks
the top K1 of them.
Additionally, we need, for each eventTag, the list of eventIds (first K2
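To make it concrete, here is a rough RDD sketch of both parts (the Event
class and all names are mine; K1 and K2 are assumed to be small):

    import org.apache.spark.rdd.RDD

    case class Event(eventId: String, eventTag: String, text: String)

    def topTagsWithIds(events: RDD[Event], k1: Int, k2: Int) = {
      // part 1: top K1 tags by count
      val topTags = events.map(e => (e.eventTag, 1L))
                          .reduceByKey(_ + _)
                          .top(k1)(Ordering.by(_._2))

      // part 2: first K2 eventIds per tag, bounded on both map and reduce sides
      val idsPerTag = events.map(e => (e.eventTag, e.eventId))
                            .aggregateByKey(Vector.empty[String])(
                              (acc, id) => if (acc.size < k2) acc :+ id else acc,
                              (a, b)    => (a ++ b).take(k2))
      (topTags, idsPerTag)
    }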
> ...serialize individual objects
> cheaply. Right now, that’s only possible at the stream level. (There are
> hacks around this, but this would enable more idiomatic use in efficient
> shuffle implementations.)
>
>
> Have serializers indicate whether they are deterministic. This provides
> much of
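For what it's worth, a purely hypothetical sketch of that shape, just to make
the quoted proposal concrete (none of these names are Spark's actual API):

    // hypothetical interface, not Spark's Serializer
    trait ObjectSerializer {
      def serialize[T](obj: T): Array[Byte]  // per-object, no per-stream setup cost
      def isDeterministic: Boolean           // same object => same bytes, always?
    }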
...safely. I am also happy mutating the original SparkContext, just not
breaking backward compatibility, as long as the returned SparkContext is
not affected by set/unset of job groups on the original SparkContext.
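To make the concern concrete (setJobGroup/clearJobGroup are real SparkContext
methods; the derived context is the hypothetical part):

    sc.setJobGroup("team-a", "ad-hoc queries")  // tags jobs submitted via sc
    val derived = deriveContext(sc)             // hypothetical: an isolated view of sc
    sc.clearJobGroup()                          // must not affect jobs run via `derived`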
Thoughts please?
Thanks,
Aniket
would be a great help for Windows users (like me).
Thanks,
Aniket
Ohh right, it is. I will mark my defect as a duplicate and cross-check my
notes with the fixes in the pull request. Thanks for pointing it out, Zsolt :)
On Mon, Jan 12, 2015, 7:42 PM Zsolt Tóth wrote:
> Hi Aniket,
>
> I think this is a duplicate of SPARK-1825, isn't it?
>
> Zsolt
...schema upfront?
Thanks,
Aniket
Thanks Reynold and Cheng. It does seem like quite a bit of heavy lifting to
have a schema per row. For now, I will settle for doing a union of all the
schema versions and complaining about any incompatibilities :-)
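A rough sketch of the union-and-complain idea, assuming Spark SQL's
StructType (the strict merge rule is my own):

    import org.apache.spark.sql.types._

    // union two schema versions; fail loudly if a field name maps to two types
    def unionSchema(a: StructType, b: StructType): StructType = {
      val existing = a.fields.map(f => f.name -> f.dataType).toMap
      b.fields.foreach { f =>
        existing.get(f.name).foreach { t =>
          require(t == f.dataType,
            s"Incompatible types for '${f.name}': $t vs ${f.dataType}")
        }
      }
      StructType(a.fields ++ b.fields.filterNot(f => existing.contains(f.name)))
    }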
Looking forward to doing great things with the API!
Thanks,
Aniket
On Thu Jan 29 2015
large relation
broadcast-able. Thoughts?
Aniket
...be more accurate than Catalyst's prediction.
Therefore, if it's not a fundamental change in Catalyst, I would think this
makes sense.
Thanks,
Aniket
On Sat, Feb 7, 2015, 4:50 AM Reynold Xin wrote:
> We thought about this today after seeing this email. I actually built a
> patch fo
Circling back on this. Did you get a chance to take another look?
Thanks,
Aniket
On Sun, Feb 8, 2015, 2:53 AM Aniket Bhatnagar
wrote:
> Thanks for looking into this. If this is true, isn't it an issue today? The
> default implementation of sizeInBytes is 1 + the broadcast thresh
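For context, a minimal sketch of overriding that estimate in a custom
relation (the class name, schema, and size are made up; BaseRelation and
sizeInBytes are the real extension points):

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.sources.BaseRelation
    import org.apache.spark.sql.types._

    // report an accurate size so the planner can make broadcast decisions
    class EventsRelation(val sqlContext: SQLContext) extends BaseRelation {
      override def schema: StructType =
        StructType(Seq(StructField("eventId", StringType),
                       StructField("eventTag", StringType)))
      override def sizeInBytes: Long = 10L * 1024 * 1024  // e.g. ~10 MB, known upfront
    }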
ances
which makes sense. Maybe the API should provide the ability to specify the
parallelism, defaulting to numShards?
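Something along these lines, purely as a hypothetical shape (this is not the
real KinesisUtils.createStream signature):

    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.streaming.dstream.ReceiverInputDStream

    // hypothetical signature sketch; body elided
    def createStream(ssc: StreamingContext,
                     streamName: String,
                     endpointUrl: String,
                     parallelism: Option[Int] = None  // None => one receiver per shard
                    ): ReceiverInputDStream[Array[Byte]] = ???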
I can submit pull requests for some of the above items, provided the
community agrees and nobody else is working on it.
Thanks,
Aniket
...end user.
My personal preference is OSGi (or at least some support for OSGi), but I
would love to hear how Spark devs are thinking about resolving the
problem.
Thanks,
Aniket
d deal with
>> > some of these issues, but I don't think it works.
>> > On Sep 4, 2014 9:01 AM, "Felix Garcia Borrego"
>> wrote:
>> >
>> > > Hi,
>> > > I ran into the same issue and, apart from the ideas Aniket mentioned, I on
upgrading httpclient? (or jets3t?)
>
> 2014-09-11 19:09 GMT+09:00 Aniket Bhatnagar :
>
>> Thanks everyone for weighing in on this.
>>
>> I had backported the kinesis module from master to Spark 1.0.2, so just to
>> confirm that I am not missing anything, I did a dependenc