RE: Need advice for Spark newbie

2015-02-26 Thread Steve Nunez
Hi Vikram, There was a recent presentation at Strata that you might find useful: Hive on Spark is Blazing Fast .. Or Is It? Generally those conclusions mirror my own observations: on large data sets, Hive s

Re: Surprising Spark SQL benchmark

2014-11-05 Thread Steve Nunez
world record at the Nürburgring in a 2014 1000hp LaFerrari and somehow forgetting to mention that the last record was held by a 2001 Toyota Celica. - Steve From: Nicholas Chammas Date: Wednesday, November 5, 2014 at 15:56 To: Steve Nunez Cc: Patrick Wendell, dev Subject: Re: Surprising Spark

Re: Surprising Spark SQL benchmark

2014-10-31 Thread Steve Nunez
To be fair, we (the Spark community) haven’t been any better. Take, for example, this benchmark: https://databricks.com/blog/2014/10/10/spark-petabyte-sort.html for which no details or code have been released to allow others to reproduce it. I would encourage anyone doing a Spark benchmark in futur

Re: Breaking the previous large-scale sort record with Spark

2014-10-10 Thread Steve Nunez
Great stuff. Wonderful to see such progress in so short a time. How about some links to code and instructions so that these benchmarks can be reproduced? Regards, - Steve From: Debasish Das Date: Friday, October 10, 2014 at 8:17 To: Matei Zaharia Cc: user, dev Subject: Re: Breaking the

Re: Issues with HDP 2.4.0.2.1.3.0-563

2014-08-04 Thread Steve Nunez
's the vendor's problem.) > >This isn't any argument about being purist but just that I am not sure >these are things that the project can meaningfully bother with. > >It makes sense to set vendor repos in the pom for convenience, and >makes sense to run smoke tests

Re: Issues with HDP 2.4.0.2.1.3.0-563

2014-08-04 Thread Steve Nunez
I don’t think there is an hwx profile, but there probably should be. - Steve From: Patrick Wendell Date: Monday, August 4, 2014 at 10:08 To: Ron's Yahoo! Cc: Ron's Yahoo!, Steve Nunez, "dev@spark.apache.org" Subject: Re: Issues with HDP 2.4.0.2.1.3.0-563 Ah I

Re: Issues with HDP 2.4.0.2.1.3.0-563

2014-08-04 Thread Steve Nunez
Provided you've got the HWX repo in your pom.xml, you can build with this line: mvn -Pyarn -Phive -Phadoop-2.4 -Dhadoop.version=2.4.0.2.1.1.0-385 -DskipTests clean package I haven't tried building a distro, but it should be similar. - SteveN On 8/4/14, 1:25, "Sean Owen" wrote: >For a

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
> > with a stub. >> > [WARNING] Class org.antlr.runtime.Token not found - continuing with a >> stub. >> > [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing >>with a >> > stub. >> > [ERROR] >> > while compiling: >

Re: Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
4:47 PM, Ted Yu >> wrote: >> >> > >> > hive-exec (as of 0.13.1) is published here: >> >> > >> > >> >> > >> >> >> > >> >> >> >>http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C >>0.13.1%7Cj

'Proper' Build Tool

2014-07-28 Thread Steve Nunez
Gents, It seems that until recently, building via sbt was a documented process in the 0.9 overview: http://spark.apache.org/docs/0.9.0/ The section on building mentions using sbt/sbt assembly. However, in the latest overview: http://spark.apache.org/docs/latest/index.html There's no mention of b

Working Formula for Hive 0.13?

2014-07-28 Thread Steve Nunez
I saw a note earlier, perhaps on the user list, that at least one person is using Hive 0.13. Anyone got a working build configuration for this version of Hive? Regards, - Steve

Re: No such file or directory errors running tests

2014-07-27 Thread Steve Nunez
Whilst we're on this topic, I'd be interested to see if you get hive failures. I'm trying to build on a Mac using HDP and seem to be getting failures related to Parquet. I'll know for sure once I get in tomorrow and confirm with engineering, but this is likely because the version of Hive is 0.12.0,