Re: Are we running SparkR tests in Jenkins?

2016-01-15 Thread Jeff Zhang
Created https://issues.apache.org/jira/browse/SPARK-12846

Re: Are we running SparkR tests in Jenkins?

2016-01-15 Thread Jeff Zhang
Right, I forgot about the documentation; I will create a follow-up JIRA.

Re: Are we running SparkR tests in Jenkins?

2016-01-15 Thread Shivaram Venkataraman
Ah I see. I wasn't aware of that PR. We should do a find and replace in all the documentation and rest of the repository as well. Shivaram

Re: Are we running SparkR tests in Jenkins?

2016-01-15 Thread Shivaram Venkataraman
Yes - we should be running R tests AFAIK. That error message is a deprecation warning about the script `bin/sparkR`, which needs to be changed in https://github.com/apache/spark/blob/7cd7f2202547224593517b392f56e49e4c94cabc/R/run-tests.sh#L26 to `bin/spark-submit`. Thanks Shivaram

Re: Are we running SparkR tests in Jenkins?

2016-01-15 Thread Reynold Xin
+Shivaram Ah damn - we should fix it. This was broken by https://github.com/apache/spark/pull/10658, which removed functionality that had been deprecated since Spark 1.0.

Are we running SparkR tests in Jenkins?

2016-01-15 Thread Herman van Hövell tot Westerflier
Hi all, I just noticed the following log entry in Jenkins:

> Running SparkR tests
> Running R applications through 'sparkR' is not supported as of Spark 2.0.

Partitioned parquet files missing partition columns from data

2016-01-15 Thread vonnagy
When writing a DataFrame into partitioned parquet files, the partition columns are removed from the data. For example:

df.write.mode(SaveMode.Append).partitionBy("year", "month", "day", "hour").parquet(somePath)

This creates a directory structure like:

events
 |-> 2016
      |-> 1
           |-> 15
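A side note on why the columns look "missing": Spark normally encodes the partition values into Hive-style key=value directory names rather than into the data files themselves, and reconstructs the columns from the path when you read back through the partitioned root. The sketch below (plain Python, not Spark code; the path is hypothetical) shows how partition values can be recovered from such a path:

```python
def partition_values(path):
    """Extract Hive-style key=value partition segments from a file path."""
    parts = {}
    for segment in path.split("/"):
        if "=" in segment:
            key, _, value = segment.partition("=")
            parts[key] = value
    return parts

path = "events/year=2016/month=1/day=15/hour=10/part-00000.parquet"
print(partition_values(path))
# -> {'year': '2016', 'month': '1', 'day': '15', 'hour': '10'}
```

This is why reading the top-level path gives you the partition columns back, while reading an individual leaf directory directly does not.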

Re: Spark Streaming KafkaUtils missing Save API?

2016-01-15 Thread Benjamin Fradet
There was a PR regarding this which was closed, but the author of the PR created a spark-package: https://github.com/cloudera/spark-kafka-writer. I don't know exactly why it was decided not to be incorporated into Spark, however.

Re: Tungsten in a mixed endian environment

2016-01-15 Thread Nong Li
On Fri, Jan 15, 2016 at 1:30 AM, Tim Preece wrote: > So if Spark does not support heterogeneous endianness clusters, should Spark at least always support homogeneous endianness clusters? I ask because I just noticed https://issues.apache.org/jira/browse/SPARK-12785, which appears to be

Spark Streaming KafkaUtils missing Save API?

2016-01-15 Thread Renyi Xiong
Hi, We noticed there's no Save method in KafkaUtils. We do have scenarios where we want to save an RDD back to a Kafka queue to be consumed by downstream streaming applications. I wonder if this is a common scenario; if yes, any plan to add it? Thanks, Renyi.
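The pattern people usually use in the absence of a built-in save API is to open one producer per partition inside foreachPartition, so the (non-serializable) connection is created on the executor rather than shipped with the closure. A minimal sketch of that pattern, with a stub standing in for a real Kafka producer (the topic name and stub are made up; no broker is contacted):

```python
class StubProducer:
    """Stand-in for a Kafka producer; records sends instead of networking."""
    def __init__(self):
        self.sent = []

    def send(self, topic, value):
        self.sent.append((topic, value))

    def flush(self):
        pass

def send_partition(records, make_producer, topic):
    # One producer per partition, as you would do inside rdd.foreachPartition,
    # so the connection is created where the records are, not in the driver.
    producer = make_producer()
    for record in records:
        producer.send(topic, record)
    producer.flush()
    return producer

producer = send_partition([b"a", b"b"], StubProducer, "events")
print(producer.sent)  # -> [('events', b'a'), ('events', b'b')]
```

With a real client you would swap the stub for the producer class your Kafka library provides and call send_partition from inside foreachPartition.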

[1.6] Coalesce/binary operator on casted named column

2016-01-15 Thread Robert Kruszewski
Hi Spark devs, I have been debugging a failing unit test in our application and it led me to believe that the bug is in Spark itself. The exception I am getting is org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: unresolvedalias(cast(

Re: Tungsten in a mixed endian environment

2016-01-15 Thread Tim Preece
So if Spark does not support heterogeneous endianness clusters, should Spark at least always support homogeneous endianness clusters? I ask because I just noticed https://issues.apache.org/jira/browse/SPARK-12785, which appears to be introducing a new feature designed for Little Endian only.

Re: Elasticsearch sink for metrics

2016-01-15 Thread Nick Pentreath
I haven't come across anything, but could you provide more detail on what issues you're encountering?

Elasticsearch sink for metrics

2016-01-15 Thread Pete Robbins
Has anyone tried pushing Spark metrics into Elasticsearch? We have other metrics, e.g. some runtime information, going into ES and would like to be able to combine this with the Spark metrics for visualization with Kibana. I experimented with a new sink using ES's ElasticsearchReporter for the Codahale metrics library.
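Whatever reporter is used, the essential step is shaping each metric reading into a timestamped JSON document that Elasticsearch can index and Kibana can plot. A minimal sketch of that shaping (the metric name and field names here are made up for illustration; nothing is sent over the network):

```python
import json
from datetime import datetime, timezone

def metric_doc(name, value, timestamp):
    """Shape one gauge reading into an indexable JSON-ready document."""
    return {
        "@timestamp": timestamp.isoformat(),  # Kibana's default time field
        "name": name,
        "value": value,
    }

doc = metric_doc(
    "jvm.heap.used",
    12345678,
    datetime(2016, 1, 15, 11, 9, tzinfo=timezone.utc),
)
print(json.dumps(doc, sort_keys=True))
```

A reporter would emit one such document per metric per polling interval, typically via the ES bulk API, and the rest is an index template plus Kibana visualizations.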