Re: [SPARK-3324] make yarn module as a unified maven jar project

2014-08-31 Thread Sean Owen
This isn't possible since the two versions of YARN are mutually incompatible at compile time. However, see my comments about how this could be restructured to be a little more standard, so that IntelliJ would parse it out of the box. Still, I imagine it is not worth it if YARN alpha will go away

Re: [SPARK-3324] make yarn module as a unified maven jar project

2014-08-31 Thread Yi Tian
Hi Sean, before compile time, Maven could dynamically add either the stable or the alpha source to the yarn/ project, so there is no incompatibility at compile time. Here is an example: yarn/pom.xml org.codehaus.mojo build-helper-maven-plugin a
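
The original example in this message is cut off in the archive. As a rough illustration only of the approach described above, a build-helper-maven-plugin configuration along these lines could add one of two source trees before compilation. The `yarn.api` property and the directory names are assumptions made for this sketch, not taken from the message; the property would be set to `stable` or `alpha` by a Maven profile.

```xml
<!-- Sketch of a fragment for yarn/pom.xml, placed under <build><plugins>. -->
<!-- Assumption: a profile defines the hypothetical property yarn.api as "stable" or "alpha". -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>add-yarn-api-sources</id>
      <!-- Run before the compile phase so the extra directories are compiled too. -->
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <sources>
          <!-- Hypothetical layout: shared code plus one API-specific tree. -->
          <source>common/src/main/scala</source>
          <source>${yarn.api}/src/main/scala</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With a setup like this, only one of the two API-specific trees is on the source path in any given build, which is the point being argued in the message.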

Re: [SPARK-3324] make yarn module as a unified maven jar project

2014-08-31 Thread Sean Owen
Yes, alpha and stable need to stay in two separate modules. I think this is a little less standard than simply having three modules: common, stable, alpha. On Sun, Aug 31, 2014 at 1:32 PM, Yi Tian wrote: > Hi Sean > > Before compile-time, maven could dynamically add either stable or alpha > sour

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Sean Owen
All the signatures are correct. The licensing all looks fine. The source builds fine. Now, let me ask about unit tests, since I had a more detailed look, which I should have done before. dev/run-tests fails two tests (1 Hive, 1 Kafka Streaming) for me locally on 1.1.0-rc3. Does anyone else see t

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Will Benton
- Original Message - > dev/run-tests fails two tests (1 Hive, 1 Kafka Streaming) for me > locally on 1.1.0-rc3. Does anyone else see that? It may be my env. > Although I still see the Hive failure on Debian too: > > [info] - SET commands semantics for a HiveContext *** FAILED *** > [info]

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Sean Owen
Fantastic. As it happens, I just fixed up Mahout's tests for Java 8 and observed a lot of the same type of failure. I'm about to submit PRs for the two issues I identified. AFAICT these 3 then cover the failures I mentioned: https://issues.apache.org/jira/browse/SPARK-3329 https://issues.apache.o

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Patrick Wendell
For my part I'm +1 on this, though Sean, it would be great to fix the test environment separately. For those who voted on rc2, this is almost identical, so feel free to +1 unless you think there are issues with the two minor bug fixes. On Sun, Aug 31, 2014 at 10:18 AM, Sean Owen wrote: > Fantasti

Re: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-31 Thread chutium
Hi Cheng, thank you very much for helping me to finally find out the secret of this magic... actually we defined this external table with SID STRING REQUEST_ID STRING TIMES_DQ TIMESTAMP TOTAL_PRICE FLOAT ... using "desc table ext_fullorders" it is only shown as [# col_name

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Nicholas Chammas
-1: I believe I've found a regression from 1.0.2. The report is captured in SPARK- . On Sat, Aug 30, 2014 at 6:07 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.1.0! > > The tag to b

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread chutium
Has anyone tried to build it with hadoop.version=2.0.0-mr1-cdh4.3.0 or hadoop.version=1.0.3-mapr-3.0.3? See the comments in https://issues.apache.org/jira/browse/SPARK-3124 https://github.com/apache/spark/pull/2035 I built a Spark snapshot with hadoop.version=1.0.3-mapr-3.0.3 and the ticket creator built

RE: HiveContext, schemaRDD.printSchema get different dataTypes, feature or a bug? really strange and surprised...

2014-08-31 Thread Cheng, Hao
Yes, the root cause is that the output ObjectInspector in the SerDe implementation doesn't reflect the real typeinfo. Hive actually provides an API such as TypeInfoUtils.getStandardJavaObjectInspectorFromTypeInfo(TypeInfo) for the mapping. You probably need to update the code at https://github.

Re: [VOTE] Release Apache Spark 1.1.0 (RC3)

2014-08-31 Thread Nicholas Chammas
On Sun, Aug 31, 2014 at 6:38 PM, chutium wrote: > has anyone tried to build it on hadoop.version=2.0.0-mr1-cdh4.3.0 or > hadoop.version=1.0.3-mapr-3.0.3 ? > Is the behavior you're seeing a regression from 1.0.2, or does 1.0.2 have this same problem? Nick

Re: [Spark SQL] off-heap columnar store

2014-08-31 Thread Ian O'Connell
I'm not sure what you mean here. Parquet is at its core just a format; you could store that data anywhere. Though it sounds like you're saying (correct me if I'm wrong) that you basically want a columnar abstraction layer where you can provide a different backing implementation to keep the columns rather