Why not push sort into shuffle in the Exchange operator?

2017-12-10 Thread John Fang
Hi All, Spark RDD pushes sorting operations and delegates aggregation into the shuffle layer by specifying a key ordering as part of the shuffle dependency. Currently, Spark SQL doesn't push sorting or delegate aggregation into the shuffle layer, as the SPARK-8317

Infer JSON schema in structured streaming Kafka.

2017-12-10 Thread satyajit vegesna
Hi All, I would like to infer a JSON schema from a sample of the data that I receive from Kafka Streams (a specific topic). I have to infer the schema because I am going to receive random JSON strings with a different schema for each topic, so I chose to go ahead with the steps below: a. readStream from Kafka(la

Re: GenerateExec, CodegenSupport and supportCodegen flag off?!

2017-12-10 Thread Stephen Boesch
A relevant observation: there was a closed/executed jira last year to remove the option to disable the codegen flag (and unsafe flag as well): https://issues.apache.org/jira/browse/SPARK-11644
2017-12-10 13:16 GMT-08:00 Jacek Laskowski:
> Hi,
>
> I'm wondering why a physical operator like Gener

GenerateExec, CodegenSupport and supportCodegen flag off?!

2017-12-10 Thread Jacek Laskowski
Hi, I'm wondering why a physical operator like GenerateExec would extend CodegenSupport [1] but have the supportCodegen flag turned off. What's the meaning of such a combination -- being a CodegenSupport with supportCodegen off? [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scal