question about combining small input splits

2015-11-23 Thread Nezih
y large. I then looked for a way to get the total input size from an RDD to come up with some heuristic to set the partition count, but I couldn't find one. Any help is appreciated. Thanks, Nezih
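
A minimal sketch of the kind of heuristic being asked about (not from the thread), assuming the input lives on an HDFS-compatible filesystem: read the total input size through the Hadoop FileSystem API and derive a partition count from a target bytes-per-partition value. The path and the 128 MB target are placeholders.

    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("split-size-heuristic"))

    // Hypothetical input location; replace with the real dataset path.
    val inputPath = new Path("hdfs:///data/input")

    // Total size of everything under the input path, in bytes.
    val totalBytes = FileSystem.get(sc.hadoopConfiguration)
      .getContentSummary(inputPath).getLength

    // Assumed target of roughly 128 MB per partition.
    val targetBytesPerPartition = 128L * 1024 * 1024
    val numPartitions = math.max(1, (totalBytes / targetBytesPerPartition).toInt)

    // Collapse the many small input splits into the computed number of partitions.
    val combined = sc.textFile(inputPath.toString).coalesce(numPartitions)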

question about combining small parquet files

2015-11-26 Thread Nezih Yigitbasi
ifying the Spark source). Any help is appreciated. Thanks, Nezih PS: this is the same as my previous email; I learned that it ended up in spam for many people since I sent it through Nabble, sorry for the double post.
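
A compaction sketch along the lines of the question, assuming Spark 1.5-era APIs, an existing SparkContext sc, and hypothetical paths: read the many small Parquet files and write them back as a handful of larger ones.

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)

    // Hypothetical input and output locations.
    val small = sqlContext.read.parquet("hdfs:///data/small-parquet-files")

    // Collapse to a few partitions before writing so the output is a few
    // larger files rather than many tiny ones; 16 is an arbitrary example.
    small.coalesce(16).write.parquet("hdfs:///data/compacted-parquet")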

Re: question about combining small parquet files

2015-11-30 Thread Nezih Yigitbasi
of small files is discussed recently > > http://blog.cloudera.com/blog/2015/11/how-to-ingest-and-query-fast-data-with-impala-without-kudu/ > > > AFAIK Spark supports views too. > > > -- > Ruslan Dautkhanov > > On Thu, Nov 26, 2015 at 10:43 AM, Nezih Yigitbasi <

how about a custom coalesce() policy?

2016-02-24 Thread Nezih Yigitbasi
be useful, I already have an implementation and I will be happy to work with the community to contribute it. Thanks, Nezih
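
As a rough illustration of what such a pluggable policy could look like, here is a sketch written against the PartitionCoalescer/PartitionGroup developer API that eventually landed through SPARK-14042 (see the April threads below); the fixed-size grouping logic is invented for the example and is not the actual proposal.

    import org.apache.spark.Partition
    import org.apache.spark.rdd.{PartitionCoalescer, PartitionGroup, RDD}

    // Illustrative policy: pack parent partitions into groups of a fixed size,
    // ignoring locality and partition sizes (a real policy would consider both).
    class FixedSizeCoalescer(groupSize: Int) extends PartitionCoalescer with Serializable {
      override def coalesce(maxPartitions: Int, parent: RDD[_]): Array[PartitionGroup] = {
        parent.partitions
          .grouped(groupSize)
          .map { parts =>
            val group = new PartitionGroup()
            parts.foreach(group.partitions += _)
            group
          }
          .toArray
      }
    }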

SparkContext.stop() takes too long to complete

2016-03-19 Thread Nezih Yigitbasi
Hi Spark experts, I am using Spark 1.5.2 on YARN with dynamic allocation enabled. I see in the driver/application master logs that the app is marked as SUCCEEDED and then SparkContext.stop() is called. However, this stop sequence takes > 10 minutes to complete, and the YARN resource manager kills the app

java.lang.OutOfMemoryError: Unable to acquire bytes of memory

2016-03-21 Thread Nezih Yigitbasi
ra/browse/SPARK-10309>, SPARK-10379 <https://issues.apache.org/jira/browse/SPARK-10379>. Any workarounds to this issue or any plans to fix it? Thanks a lot, Nezih
16/03/19 05:12:09 INFO memory.TaskMemoryManager: Memory used in task 46870
16/03/19 05:12:09 INFO memory.

Re: java.lang.OutOfMemoryError: Unable to acquire bytes of memory

2016-03-21 Thread Nezih Yigitbasi
Andrew, thanks for the suggestion, but unfortunately it didn't work -- still getting the same exception. On Mon, Mar 21, 2016 at 10:32 AM Andrew Or wrote: > @Nezih, can you try again after setting `spark.memory.useLegacyMode` to > true? Can you still reproduce the OOM that way? >
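
For anyone replaying this on Spark 1.6+, a minimal sketch of applying Andrew's suggestion; the application name is a placeholder.

    import org.apache.spark.{SparkConf, SparkContext}

    // Fall back to the pre-1.6 (legacy) memory manager and rerun the failing job.
    val conf = new SparkConf()
      .setAppName("oom-repro")  // placeholder
      .set("spark.memory.useLegacyMode", "true")
    val sc = new SparkContext(conf)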

Re: java.lang.OutOfMemoryError: Unable to acquire bytes of memory

2016-03-22 Thread Nezih Yigitbasi
Interesting. After experimenting with various parameters, increasing spark.sql.shuffle.partitions and decreasing spark.buffer.pageSize helped my job go through. BTW I will be happy to help get this issue fixed. Nezih On Tue, Mar 22, 2016 at 1:07 AM james wrote: Hi, > I also found 'U
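
A sketch of the workaround described above; both values are illustrative starting points, not tuned recommendations.

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // Raise from the default of 200 so each shuffle task handles less data.
      .set("spark.sql.shuffle.partitions", "2000")
      // Use smaller memory pages so acquiring a page needs a smaller contiguous chunk.
      .set("spark.buffer.pageSize", "2m")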

Re: how about a custom coalesce() policy?

2016-04-01 Thread Nezih Yigitbasi
Hey Reynold, Created an issue (and a PR) for this change to get discussions started. Thanks, Nezih On Fri, Feb 26, 2016 at 12:03 AM Reynold Xin wrote: > Using the right email for Nezih > > > On Fri, Feb 26, 2016 at 12:01 AM, Reynold Xin wrote: > >> I think this can be u

Re: how about a custom coalesce() policy?

2016-04-02 Thread Nezih Yigitbasi
Sure, here <https://issues.apache.org/jira/browse/SPARK-14042> is the jira and this <https://github.com/apache/spark/pull/11865> is the PR. Nezih On Sat, Apr 2, 2016 at 10:40 PM Hemant Bhanawat wrote: > correcting email id for Nezih > > Hemant Bhanawat <https://ww
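
With the change from that PR (merged for Spark 2.x), a custom policy can be plugged in through the extra coalesce argument. A usage sketch, where rdd stands for any RDD and FixedSizeCoalescer is the hypothetical policy sketched earlier in this digest.

    // numPartitions is passed to the custom coalescer as the maxPartitions hint.
    val coalesced = rdd.coalesce(
      numPartitions = 32,
      shuffle = false,
      partitionCoalescer = Some(new FixedSizeCoalescer(groupSize = 4)))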

Re: java.lang.OutOfMemoryError: Unable to acquire bytes of memory

2016-04-04 Thread Nezih Yigitbasi
Nope, I didn't have a chance to track down the root cause, and IIRC we didn't observe it when dynamic allocation was off. On Mon, Apr 4, 2016 at 6:16 PM Reynold Xin wrote: > BTW do you still see this when dynamic allocation is off? > > On Mon, Apr 4, 2016 at 6:16 PM, Reynold Xin

Re: java.lang.OutOfMemoryError: Unable to acquire bytes of memory

2016-04-14 Thread Nezih Yigitbasi
Thanks Imran. I will give it a shot when I have some time. Nezih On Thu, Apr 14, 2016 at 9:25 AM Imran Rashid wrote: > Hi Nezih, > > I just reported a somewhat similar issue, and I have a potential fix -- > SPARK-14560, looks like you are already watching it :). You can try out