Hi there!
Let me explain my problem to see if you have a good solution to help me :)
Let's imagine that I have all my data in a DB or a file, which I load into a
DataFrame DF with the following columns:
*id | timestamp(ms) | value*
A | 100 | 100
A | 110 | 50
B | 100 | 100
B | 11
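(The table is cut off above.) To make the layout concrete, here is a minimal
PySpark sketch that builds this sample DataFrame; the column names come from
the header above, and the types are my assumption:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example").getOrCreate()
    # Sample rows matching the layout above: id | timestamp(ms) | value
    df = spark.createDataFrame(
        [("A", 100, 100), ("A", 110, 50), ("B", 100, 100)],
        ["id", "timestamp", "value"],
    )
    df.show()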
It seems to be because of this issue:
https://issues.apache.org/jira/browse/SPARK-10925
I added a checkpoint, as suggested, to break the lineage and it worked.
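For anyone hitting the same issue, a minimal sketch of the workaround,
assuming Spark 2.1+ where Dataset.checkpoint is available (the checkpoint
directory is just an example path):

    # A reliable checkpoint directory (e.g. on HDFS) must be set first.
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")
    # checkpoint() materializes df and truncates its lineage, which avoids
    # the self-join analysis error described in SPARK-10925.
    df = df.checkpoint()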
Best regards,
Thanks Didac,
My bad, actually this code is incomplete; it should have been:

    dfAgg = df.groupBy("S_ID").agg(...)

I want to access the aggregated values (of dfAgg) for each row of 'df',
which is why I do a left outer join.
Also, regarding the second parameter, I am using this signature of join
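The message is cut off before the exact signature, but the pattern described
reads roughly like this minimal PySpark sketch (the sum aggregation and its
alias are placeholders for the real agg(...)):

    from pyspark.sql import functions as F

    # Aggregate per S_ID, then attach the aggregates to every row of df.
    dfAgg = df.groupBy("S_ID").agg(F.sum("value").alias("total_value"))
    # Joining on the column name keeps a single S_ID column in the result.
    result = df.join(dfAgg, on="S_ID", how="left_outer")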
Hi,
I'm working on integrating some pyspark code with Kafka. We'd like to use
SSL/TLS, and so want to use Kafka 0.10. Because structured streaming is
still marked alpha, we'd like to use Spark streaming. On this page,
however, it indicates that the Kafka 0.10 integration in Spark does not
support Python.
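For the record, the Structured Streaming Kafka source does expose the 0.10
client from Python, so one possible route to SSL (alpha status
notwithstanding) looks roughly like the sketch below; the broker, topic, and
truststore details are placeholders:

    # Requires the spark-sql-kafka-0-10 package on the classpath.
    stream = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9093")
              .option("subscribe", "mytopic")
              # "kafka."-prefixed options go to the underlying 0.10 consumer.
              .option("kafka.security.protocol", "SSL")
              .option("kafka.ssl.truststore.location", "/path/to/truststore.jks")
              .option("kafka.ssl.truststore.password", "changeit")
              .load())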
To initialize it per executor, I used a class with only class attributes and
class methods (like an `object` in Scala), but because I was using PySpark, it
was actually being instantiated once per process ☹
What I went for was the broadcast variable, but there is still something
suspicious with my application.
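To compare notes, this is the broadcast pattern I mean, with a hypothetical
lookup table standing in for the real per-executor state:

    # Hypothetical expensive state, built once on the driver.
    lookup = {"A": 1, "B": 2}
    bv = spark.sparkContext.broadcast(lookup)

    def process(record):
        # bv.value is read back from the broadcast once per worker process,
        # rather than being rebuilt for every task.
        return bv.value.get(record, 0)

    rdd = spark.sparkContext.parallelize(["A", "B", "C"])
    print(rdd.map(process).collect())  # [1, 2, 0]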
This is the code: I created a Java class by extending
org.apache.hadoop.hive.ql.udf.generic.GenericUDTF, and I create a SparkSession
as:

    SparkSession spark = SparkSession.builder()
        .enableHiveSupport()
        .master("yarn-client")
        .appName("SampleSparkUDTF_yarnV1")
        .getOrCreate();

and try to read
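The message is cut off here, but for completeness, a Hive GenericUDTF is
typically exposed to Spark SQL along these lines (shown from PySpark for
brevity; the function, class, and table names are made up for illustration):

    # Register the UDTF class from a jar on the classpath as a Hive function.
    spark.sql("CREATE TEMPORARY FUNCTION sample_udtf AS 'com.example.SampleUDTF'")
    # A UDTF expands each input row into zero or more output rows.
    spark.sql("SELECT sample_udtf(some_column) FROM some_table").show()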