Re: Found a typo in Catalyst's exception and want to write a test -- help needed

2016-08-17 Thread Reynold Xin
I'd use the new SQLQueryTestSuite. Test cases defined in sql files. On Wed, Aug 17, 2016 at 11:46 PM, Jacek Laskowski wrote: > Hi devs, > > While reviewing the code in Catalyst for doing query parsing I found > that UnresolvedStar has this typo in the exception [1]. > > I do understand that it'

Found a typo in Catalyst's exception and want to write a test -- help needed

2016-08-17 Thread Jacek Laskowski
Hi devs, While reviewing the code in Catalyst for doing query parsing I found that UnresolvedStar has this typo in the exception [1]. I do understand that it's a very trivial issue but I thought I'd write a test for it as part of the change so I could improve my understanding of the low-level bit

Re: Aggregations with scala pairs

2016-08-17 Thread Jean-Baptiste Onofré
Agreed. Regards JB On Aug 18, 2016, 07:32, at 07:32, Olivier Girardot wrote: >CC'ing dev list, you should open a Jira and a PR related to it to >discuss it c.f. >https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingCodeChanges > > > > > >On W

Re: Aggregations with scala pairs

2016-08-17 Thread Olivier Girardot
CC'ing dev list, you should open a Jira and a PR related to it to discuss it c.f. https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingCodeChanges On Wed, Aug 17, 2016 4:01 PM, Andrés Ivaldi iaiva...@gmail.com wrote: Hello, I'd like to report

Re: Spark SQL and Kryo registration

2016-08-17 Thread Olivier Girardot
Hi everyone, it seems that it now works out of the box. So never mind, registration is compatible with Spark 2.0 when using DataFrames. Regards, Olivier. On Fri, Aug 5, 2016 10:07 AM, Maciej Bryński mac...@brynski.pl wrote: Hi Olivier, Did you check the performance of Kryo? I have observations th

How is mapped LogicalPlan to RDDs eventually if ever? How about Dataset?

2016-08-17 Thread Jacek Laskowski
Hi, I'm wondering how far off base I am with the question: Is a LogicalPlan in #SparkSQL similar to an RDD in #ApacheSpark Core, in that both seem to be metadata describing a computation that eventually gets executed to produce records? What am I missing, if anything? How imprecise am I in comparing Log

Re: Spark R - Loading Third Party R Library in YARN Executors

2016-08-17 Thread Shivaram Venkataraman
I think you can also pass in a zip file using the --files option (http://spark.apache.org/docs/latest/running-on-yarn.html has some examples). The files should then be present in the current working directory of the driver R process. Thanks Shivaram On Wed, Aug 17, 2016 at 4:16 AM, Felix Cheung
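
A minimal sketch of the `--files` suggestion above, written in Python only to assemble the command line. `--master` and `--files` are existing `spark-submit` options; the archive name `BreakoutDetection.zip` and the script name `job.R` are hypothetical placeholders, not taken from the thread.

```python
import shlex


def build_submit_command(script, files, master="yarn"):
    """Return a spark-submit argument list that ships `files` with the job."""
    if not files:
        raise ValueError("expected at least one file to ship")
    # --files takes a single comma-separated list of paths.
    return ["spark-submit", "--master", master, "--files", ",".join(files), script]


if __name__ == "__main__":
    # Hypothetical names: replace with the real archive and R script.
    cmd = build_submit_command("job.R", ["BreakoutDetection.zip"])
    # Print the command rather than run it, so the sketch works without a cluster.
    print(shlex.join(cmd))
```

Per the message above, the shipped files should then appear in the working directory of the R process, so the script can refer to them by a relative path.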

Re: [master] ERROR RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)

2016-08-17 Thread Yin Huai
Yea. Please create a jira. Thanks! On Tue, Aug 16, 2016 at 11:06 PM, Jacek Laskowski wrote: > On Tue, Aug 16, 2016 at 10:51 PM, Yin Huai wrote: > > > Do you want to try it? > > Yes, indeed! I'd be more than happy. Guide me if you don't mind. Thanks. > > Should I create a JIRA for this? > > Jace

Re: Spark R - Loading Third Party R Library in YARN Executors

2016-08-17 Thread Felix Cheung
When you call library(), that is the library loading function in native R. As of now it does not support HDFS but there are several packages out there that might help. Another approach is to have a prefetch/installation mechanism to call HDFS command to download the R package from HDFS onto the
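
A minimal sketch of the prefetch idea above, assuming the `hdfs` command-line client is on the PATH of the machine that runs it. The helper names `hdfs_get_command` and `prefetch` and the local directory `/tmp/rlibs` are hypothetical; `hdfs dfs -get` is the standard HDFS shell command for copying a path to the local filesystem.

```python
import subprocess


def hdfs_get_command(hdfs_path, local_dir):
    """Build the argument list for `hdfs dfs -get <src> <localdst>`."""
    return ["hdfs", "dfs", "-get", hdfs_path, local_dir]


def prefetch(hdfs_path, local_dir, run=subprocess.run):
    """Copy a package directory from HDFS to local disk; `run` is injectable for testing."""
    cmd = hdfs_get_command(hdfs_path, local_dir)
    # check=True makes a failed download raise instead of passing silently.
    run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Assumption: /tmp/rlibs is a writable local directory on the node.
    prefetch("hdfs://xx/BreakoutDetection/", "/tmp/rlibs")
```

Once the package is on local disk, native R can load it by pointing `lib.loc` at the local directory instead of an `hdfs://` URI.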

Spark R - Loading Third Party R Library in YARN Executors

2016-08-17 Thread Senthil Kumar
Hi All, We are using the Spark 1.6 R library. Below is our code, which loads the third-party library. library("BreakoutDetection", lib.loc = "*hdfs://xx/BreakoutDetection/*") : library("BreakoutDetection", lib.loc = "*//xx/BreakoutDetection/*") : When I try to execute the code u