Re: Enabling mapreduce.input.fileinputformat.list-status.num-threads in Spark?

2016-01-12 Thread Cheolsoo Park
eads"? > > Thanks. > > On Thu, Jul 23, 2015 at 8:50 PM, Cheolsoo Park > wrote: > >> Hi, >> >> I am wondering if anyone has successfully enabled >> "mapreduce.input.fileinputformat.list-status.num-threads" in Spark jobs. I >> usually se

Re: Flaky test in DAGSchedulerSuite?

2015-09-04 Thread Cheolsoo Park
> isn't one already. I have a simple fix. >> >> On 4 September 2015 at 19:09, Cheolsoo Park wrote: >> >>> Hi devs, >>> >>> I noticed this test case fails intermittently in Jenkins. >>> >>> For eg, see the followin

Flaky test in DAGSchedulerSuite?

2015-09-04 Thread Cheolsoo Park
Hi devs, I noticed this test case fails intermittently in Jenkins. For example, see the following builds: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41991/ https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/41999/ The test failed in different PRs, and the failu

Re: Jenkins having issues?

2015-08-18 Thread Cheolsoo Park
/* > > other than that, i'm looking around the codebase some older builds and > seeing if i can't find the culprit. > -- Forwarded message -- > From: Cheolsoo Park > Date: Fri, Aug 14, 2015 at 4:11 PM > Subject: Jenkins having issues? >

Jenkins having issues?

2015-08-14 Thread Cheolsoo Park
Hi devs, Jenkins failed twice in my PR with an unknown error: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40930/console https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/40931/console Can you help? Thank you! Cheols

Re: pyspark.sql.tests: is test_time_with_timezone a flaky test?

2015-07-13 Thread Cheolsoo Park
bug in when run with Python3.4, will send out a fix soon. > > > > On Sun, Jul 12, 2015 at 1:33 PM, Cheolsoo Park > wrote: > >> Hi devs, > >> > >> For some reason, I keep getting this test failure (3 out of 4 builds) > in my > >> PR- >

pyspark.sql.tests: is test_time_with_timezone a flaky test?

2015-07-12 Thread Cheolsoo Park
Hi devs, For some reason, I keep getting this test failure (3 out of 4 builds) in my PR - == FAIL: test_time_with_timezone (__main__.SQLTests) ---

Re: SparkSQL errors in 1.4 rc when using with Hive 0.12 metastore

2015-05-24 Thread Cheolsoo Park
igating on this also. > > > > Hao > > > > > > *From:* Mark Hamstra [mailto:m...@clearstorydata.com] > *Sent:* Sunday, May 24, 2015 9:06 PM > *To:* Cheolsoo Park > *Cc:* u...@spark.apache.org; dev@spark.apache.org > *Subject:* Re: SparkSQL errors in 1.4 rc w

Re: Spark Sql reading hive partitioned tables?

2015-04-14 Thread Cheolsoo Park
Is there a plan to fix this? I also ran into this issue with a *"select * from tbl where ... limit 10"* query. Spark SQL is 100x slower than Presto in the worst case (a table with 1.6M partitions). This is a serious blocker for us since we have many tables with nearly (or over) 1M partitions, and any query aga