Re: Structured Streaming & Enrichment Broadcasts

2019-11-18 Thread Burak Yavuz
If you store the data that you're going to broadcast as a Delta table (see delta.io) and perform a stream-batch join (where your Delta table is the batch side), it will auto-update once the table receives any updates. Best, Burak
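For illustration only, a minimal PySpark sketch of the pattern Burak describes, assuming a Delta table at a hypothetical path /tables/enrichment, a delta-core package on the classpath, and a Kafka source with placeholder column names (none of these come from the original thread):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stream-delta-join").getOrCreate()

# The "batch" side of the join: a Delta table that other jobs keep updating.
enrichment = spark.read.format("delta").load("/tables/enrichment")

# The streaming side: any structured stream; a Kafka source is used as a placeholder.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(key AS STRING) AS id", "CAST(value AS STRING) AS payload"))

# Stream-batch join: because the batch side is a Delta table, each micro-batch reads
# the table's latest snapshot, so updates show up without restarting the query.
enriched = events.join(enrichment, on="id", how="left")

query = (enriched.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()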

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
Hello Nicolas, Well the issue is that with Hive 3, Spark gets its own metastore, separate from the Hive 3 metastore. So how do you reconcile this separation of metastores? Can you continue to "enableHivemetastore" and be able to connect to Hive 3? Does this connection take advantage of Hive's L

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Nicolas Paris
Hi Alfredo, my 2 cents: to my knowledge, and from reading the Spark 3 pre-release notes, it will handle Hive metastore 2.3.5 - no mention of a Hive 3 metastore. I made several tests on this in the past [1] and it seems to handle any Hive metastore version. However, Spark cannot read Hive managed tables AKA tra
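As a rough PySpark sketch of the kind of test Nicolas mentions (pointing Spark at an external Hive metastore version through configuration); the version string, metastore URI, and jar setting below are illustrative assumptions, not values from his tests:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-metastore-check")
         # Version of the Hive metastore client Spark should use (assumed value).
         .config("spark.sql.hive.metastore.version", "2.3.5")
         # Let Spark resolve matching metastore client jars (or point this at a local path).
         .config("spark.sql.hive.metastore.jars", "maven")
         # Thrift URI of the external metastore (placeholder host; often set in hive-site.xml).
         .config("spark.hadoop.hive.metastore.uris", "thrift://metastore-host:9083")
         .enableHiveSupport()
         .getOrCreate())

# External (non-transactional) Hive tables should be listable and readable this way;
# Hive managed (ACID/transactional) tables are the case Spark cannot read.
spark.sql("SHOW DATABASES").show()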

SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
Hello, Our company is moving to Hive 3, and they are saying that there is no SparkR implementation in Spark 2.3.x+ that will connect to Hive 3. Is this true? If it is true, will this be addressed in the Spark 3 release? I don't use Python, so losing SparkR to get work done on Hadoop is a huge

Performance of PySpark 2.3.2 on Microsoft Windows

2019-11-18 Thread Wim Van Leuven
Hello, we are writing a lot of data processing pipelines for Spark using PySpark and adding a lot of integration tests. In our enterprise environment, a lot of people run Windows PCs, and we notice that build times are really slow on Windows because of the integration tests. These metrics are
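Not from Wim's mail, but as a hedged illustration of the kind of PySpark integration-test setup being described: a session-scoped pytest fixture that shares one local SparkSession across tests, since per-test session startup and default shuffle settings usually dominate test runtime; names and settings are assumptions for the sketch:

import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    # One local SparkSession reused by every test in the run.
    session = (SparkSession.builder
               .master("local[2]")
               .appName("integration-tests")
               .config("spark.sql.shuffle.partitions", "4")  # tiny test data, few partitions
               .getOrCreate())
    yield session
    session.stop()

def test_filter_keeps_expected_rows(spark):
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    assert df.filter("id > 1").count() == 1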

Structured Streaming & Enrichment Broadcasts

2019-11-18 Thread Bryan Jeffrey
Hello. We're running applications using Spark Streaming. We're going to begin work to move to Structured Streaming. One of our key scenarios is to look up values from an external data source for each record in an incoming stream. In Spark Streaming we currently read the external data, broa
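For context, a rough PySpark sketch of the DStream-era pattern the mail starts to describe (read the external data, broadcast it, and enrich each record, refreshing the broadcast periodically); the source, loader, and refresh interval are illustrative assumptions, not Bryan's code:

import time
from pyspark.sql import SparkSession
from pyspark.streaming import StreamingContext

spark = SparkSession.builder.appName("enrichment-broadcast").getOrCreate()
sc = spark.sparkContext
ssc = StreamingContext(sc, batchDuration=10)

def load_lookup():
    # Hypothetical loader for the external data source, collected to the driver
    # as a small dict keyed by the field used for enrichment.
    rows = spark.read.parquet("/data/lookup").collect()
    return {r["id"]: r["value"] for r in rows}

state = {"bc": sc.broadcast(load_lookup()), "loaded_at": time.time()}

def enrich(rdd):
    # transform() runs on the driver for each batch, so the broadcast can be
    # refreshed periodically without restarting the streaming application.
    if time.time() - state["loaded_at"] > 600:
        state["bc"].unpersist()
        state["bc"] = sc.broadcast(load_lookup())
        state["loaded_at"] = time.time()
    bc = state["bc"]
    return rdd.map(lambda rec: (rec, bc.value.get(rec)))

lines = ssc.socketTextStream("localhost", 9999)  # placeholder stream source
lines.transform(enrich).pprint()

ssc.start()
ssc.awaitTermination()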