date:20180707

Re: [SPARK on MESOS] Avoid re-fetching Spark binary

2018-07-07 Thread Mark Hamstra

Essentially correct. The latency to start a Spark Job is nowhere close to 2-4 seconds under typical conditions. Creating a new Spark Application every time instead of running multiple Jobs in one Application is not going to lead to acceptable interactive or real-time performance, nor is that an exe

Re: Unable to see the table created using saveAsTable From Beeline. Please help!

2018-07-07 Thread Mich Talebzadeh

Hi Anna, Google this: Spark Dataframe and HIVE. HTH, Dr Mich Talebzadeh. LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com *Disc

Re: Unable to see the table created using saveAsTable From Beeline. Please help!

2018-07-07 Thread anna stax

Is some configuration missing? Appreciate any help.
On Fri, Jul 6, 2018 at 4:10 PM, anna stax wrote:
> I am running Spark 2.1.0 on AWS EMR
>
> In my Zeppelin Note I am creating a table
>
> df.write
> .format("parquet")
> .saveAsTable("default.1test")
>
> and I see the table when I

Structured streaming

2018-07-07 Thread amin MH

Can anyone help me understand which use cases Spark Structured Streaming is suited for? We currently deal with flat files that are generated daily and contain time series data. We push the data into InfluxDB and visualise it in Grafana. My question is: can we us