RE: [SparkSQL] Reuse HiveContext to different Hive warehouse?

2015-03-10 Thread Haopu Wang
Hao, thanks for the response. For Q1, in my case, I have a tool on the Spark shell which serves multiple users, each of whom may use a different Hive installation. I took a look at the code of HiveContext. It looks like I cannot do that today, because the "catalog" field cannot be changed after initialization.

[RESULT] [VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-10 Thread Patrick Wendell
This vote passes with 13 +1 votes (6 binding) and no 0 or -1 votes. +1 (13): Patrick Wendell*, Marcelo Vanzin, Krishna Sankar, Sean Owen*, Matei Zaharia*, Sandy Ryza, Tom Graves*, Sean McNamara*, Denny Lee, Kostas Sakellis, Joseph Bradley*, Corey Nolet, GuoQiang Li. 0: (none). -1: (none). I will finalize the release notes a…

Re: Spark-perf terasort WIP branch

2015-03-10 Thread Reynold Xin
Hi Ewan, sorry it took a while for us to reply. I don't know spark-perf that well, but I think this would be problematic if it works with only a specific version of Hadoop. Maybe we can take a different approach -- just have a bunch of tasks using the HDFS client API to read data, and not relying…
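A minimal sketch of the approach Reynold suggests -- reading benchmark input directly through the HDFS client API (`org.apache.hadoop.fs.FileSystem`) rather than a version-specific input format. The path is a placeholder, and this assumes hadoop-common on the classpath; it is an illustration of the idea, not code from the thread.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object HdfsReadSketch {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml from the classpath, so the same code works
    // against any Hadoop cluster the client jars can talk to.
    val fs = FileSystem.get(new Configuration())
    val in = fs.open(new Path("/benchmark/terasort-input")) // hypothetical path
    val buf = new Array[Byte](64 * 1024)
    var total = 0L
    var n = in.read(buf)
    while (n >= 0) { total += n; n = in.read(buf) } // read() returns -1 at EOF
    in.close()
    println(s"read $total bytes")
  }
}
```

Because only the stable client API is used, a task like this should not need to be rebuilt per Hadoop version.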

GitHub Syncing Down

2015-03-10 Thread Michael Armbrust
FYI: https://issues.apache.org/jira/browse/INFRA-9259

RE: [SparkSQL] Reuse HiveContext to different Hive warehouse?

2015-03-10 Thread Cheng, Hao
I am not sure Hive supports changing the metastore after initialization; I guess not. Spark SQL relies entirely on the Hive Metastore in HiveContext, which is probably why it doesn't work as expected for Q1. BTW, in most cases, people configure the metastore settings in hive-site.xml and will not c…

Re: Spark tests hang on local machine due to "testGuavaOptional" in JavaAPISuite

2015-03-10 Thread Sean Owen
Yes, and I remember it was caused by... well, something related to the Guava shading and the fact that you're running a mini cluster and then talking to it. I can't remember what exactly resolved it, but try a clean build. Somehow I think it had to do with multiple assembly files or something like th…

Spark tests hang on local machine due to "testGuavaOptional" in JavaAPISuite

2015-03-10 Thread Ganelin, Ilya
Hi all, building Spark on my local machine with build/mvn clean package, the test run proceeds until it hits JavaAPISuite, where it hangs indefinitely. Through some experimentation, I've narrowed it down to the following test: /** * Test for SPARK-3647. This test needs to use the maven-built assembly t…

Re: Spark 1.3 SQL Type Parser Changes?

2015-03-10 Thread Yin Huai
Hi Nitay, can you try using backticks to quote the column name? Like org.apache.spark.sql.hive.HiveMetastoreTypes.toDataType( "struct<`int`:bigint>")? Thanks, Yin On Tue, Mar 10, 2015 at 2:43 PM, Michael Armbrust wrote: > Thanks for reporting. This was a result of a change to our DDL parser…
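Yin's suggestion, spelled out as a spark-shell sketch against the 1.3 API. The backticks mark `int` as a field identifier rather than the (now reserved) type keyword; the unquoted form is what started failing in 1.3.0-rc3. This assumes `HiveMetastoreTypes` is reachable from the shell, as in Nitay's 1.2 transcript.

```scala
import org.apache.spark.sql.hive.HiveMetastoreTypes

// Fails in 1.3.0-rc3 (SPARK-6250): "int" became a reserved word
// in the new DDL parser, so it cannot appear bare as a field name.
// HiveMetastoreTypes.toDataType("struct<int:bigint>")

// Works: backticks quote the identifier, so the parser treats
// "int" as a column name, not a type keyword.
val dt = HiveMetastoreTypes.toDataType("struct<`int`:bigint>")
println(dt) // a StructType with a single LongType field named "int"
```

This is a workaround, not a fix; whether bare reserved words can be re-allowed is what SPARK-6250 tracks.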

Re: Spark 1.3 SQL Type Parser Changes?

2015-03-10 Thread Michael Armbrust
Thanks for reporting. This was a result of a change to our DDL parser that resulted in types becoming reserved words. I've filed a JIRA and will investigate whether this is something we can fix. https://issues.apache.org/jira/browse/SPARK-6250 On Tue, Mar 10, 2015 at 1:51 PM, Nitay Joffe wrote: >…

Spark 1.3 SQL Type Parser Changes?

2015-03-10 Thread Nitay Joffe
In Spark 1.2 I used to be able to do this: scala> org.apache.spark.sql.hive.HiveMetastoreTypes.toDataType("struct<int:bigint>") res30: org.apache.spark.sql.catalyst.types.DataType = StructType(List(StructField(int,LongType,true))) That is, the name of a column can be a keyword like "int". This is no longer t…

RE: Using CUDA within Spark / boosting linear algebra

2015-03-10 Thread Ulanov, Alexander
I can run the benchmark on another machine with an nVidia Titan GPU and an Intel Xeon E5-2650 v2, although it runs Windows and I have to run the Linux tests in VirtualBox. It would also be interesting to add results on netlib+nvblas; however, I am not sure I understand in detail how to build this and will ap…

[SparkSQL] Reuse HiveContext to different Hive warehouse?

2015-03-10 Thread Haopu Wang
I'm using the Spark 1.3.0 RC3 build with Hive support. In the Spark shell, I want to reuse the HiveContext instance across different warehouse locations. Below are the steps for my test (assume I have loaded a file into table "src"). == 15/03/10 18:22:59 INFO SparkILoop: Created sql context (with…
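A hedged sketch of the pattern the thread concludes does not work: HiveContext builds its catalog once, so issuing SET on the warehouse dir afterwards does not re-point an existing context at another Hive installation. The path is a placeholder.

```scala
import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc) // sc: the shell's existing SparkContext

// Takes effect as a conf setting, but the catalog was already
// initialized against the original metastore/warehouse...
hc.sql("SET hive.metastore.warehouse.dir=/data/warehouse_b") // hypothetical path

// ...so new tables still land in the warehouse the context started with.
hc.sql("CREATE TABLE src_copy AS SELECT * FROM src")
```

Given the "catalog cannot be changed after initialization" constraint, the only reliable approach today appears to be constructing a fresh HiveContext per configuration.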

SparkSQL 1.3.0 (RC3) failed to read parquet file generated by 1.1.1

2015-03-10 Thread Pei-Lun Lee
Hi, I found that if I try to read a parquet file generated by Spark 1.1.1 using 1.3.0-rc3 with default settings, I get this error: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'StructType': was expecting ('true', 'false' or 'null') at [Source: StructType(List(StructField(a,Integ…
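A hedged illustration of the likely mismatch behind this error (not a fix): the footer Spark 1.1 wrote into the Parquet file looks like the case-class toString of the schema, while 1.3 tries to parse that string as JSON, so Jackson chokes on the literal token "StructType". Using the 1.3 types package:

```scala
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}

val schema = StructType(Seq(StructField("a", IntegerType)))

// Roughly the string form 1.1 stored in the file metadata:
println(schema.toString)

// The JSON form 1.3's reader expects to deserialize:
println(schema.json)
```

If that is the cause, files written by 1.1 would need either a compatibility path in the 1.3 reader or a rewrite of the stored schema.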