If you store the data that you're going to broadcast as a Delta table (see
delta.io) and perform a stream-batch join (where your Delta table is the
batch side), the join will automatically pick up any updates to the table.
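Something along these lines, for example (just a sketch: the Kafka source,
path, and column names are illustrative, and Delta needs the delta-core
package on the classpath):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("stream-static-join").getOrCreate()

# The "batch" side: a Delta table. It is re-resolved on every micro-batch,
# so the join picks up updates to the table automatically.
lookup = spark.read.format("delta").load("/data/lookup")  # illustrative path

# The streaming side (Kafka is just an example source).
stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "host:9092")
          .option("subscribe", "events")
          .load()
          .selectExpr("CAST(key AS STRING) AS key",
                      "CAST(value AS STRING) AS value"))

# Stream-static join: each micro-batch joins against the latest table state.
joined = stream.join(lookup, "key", "left")

query = joined.writeStream.format("console").start()
query.awaitTermination()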
Best,
Burak
On Mon, Nov 18, 2019, 6:21 AM Bryan Jeffrey wrote:

Hello Nicolas,
Well, the issue is that with Hive 3, Spark gets its own metastore, separate
from the Hive 3 metastore. So how do you reconcile this separation of
metastores?
Can you continue to use enableHiveSupport() and still connect to Hive 3?
Does this connection take advantage of Hive's L…
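A minimal sketch of the kind of setup in question, assuming an external Hive
metastore reachable over thrift (the version, jars setting, host, and port
below are all illustrative):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-metastore-check")
         # Version of the Hive metastore client Spark should use.
         .config("spark.sql.hive.metastore.version", "2.3.5")
         # Let Spark download matching client jars from Maven.
         .config("spark.sql.hive.metastore.jars", "maven")
         # Thrift URI of the external metastore (illustrative host/port).
         .config("hive.metastore.uris", "thrift://metastore-host:9083")
         .enableHiveSupport()
         .getOrCreate())

# If the connection works, this lists the databases in that metastore.
spark.sql("SHOW DATABASES").show()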
Hi Alfredo,
my 2 cents:
To my knowledge, reading the Spark 3 pre-release notes, it will handle
Hive metastore 2.3.5; there is no mention of the Hive 3 metastore. I made
several tests on this in the past [1], and it seems to handle any Hive
metastore version.
However, Spark cannot read Hive managed tables, AKA transactional tables…
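A sketch of the usual workaround under that assumption: keep the data in
EXTERNAL (non-transactional) tables, which Spark can read directly. Table
name, schema, and location below are illustrative:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive3-external-table")
         .enableHiveSupport()
         .getOrCreate())

# Hive 3 makes managed tables transactional (ACID) by default, which plain
# Spark cannot read; external tables stay non-transactional and readable.
spark.sql("""
CREATE EXTERNAL TABLE IF NOT EXISTS events_ext (id BIGINT, payload STRING)
STORED AS PARQUET
LOCATION '/warehouse/external/events_ext'
""")

spark.table("events_ext").show()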
Hello,
Our company is moving to Hive 3, and they are saying that there is no
SparkR implementation in Spark 2.3.x+ that will connect to Hive 3. Is
this true?
If it is true, will this be addressed in the Spark 3 release?
I don't use Python, so losing SparkR to get work done on Hadoop is a huge…

Hello,
we are writing a lot of data processing pipelines for Spark using pyspark,
and we add a lot of integration tests.
In our enterprise environment a lot of people run Windows PCs, and we
notice that build times are really slow on Windows because of the
integration tests. These metrics are…
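For context, one common shape for such tests is a session-scoped local
SparkSession shared across all tests, since JVM startup per test tends to
dominate the run time. A minimal pytest sketch (fixture name, settings, and
the sample test are illustrative, not the poster's actual setup):

import pytest
from pyspark.sql import SparkSession

@pytest.fixture(scope="session")
def spark():
    # One local SparkSession for the whole test session.
    session = (SparkSession.builder
               .master("local[2]")
               .appName("integration-tests")
               .config("spark.sql.shuffle.partitions", "2")
               .getOrCreate())
    yield session
    session.stop()

def test_pipeline_counts(spark):
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])
    assert df.count() == 2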
Hello.
We're running applications using Spark Streaming. We're going to begin
work to move to Structured Streaming. One of our key scenarios is to
look up values from an external data source for each record in an incoming
stream. In Spark Streaming we currently read the external data, broadcast…
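For reference, a minimal sketch of that DStream-era pattern (the socket
source, lookup path, and record format are illustrative): read the lookup
data once, broadcast it, and consult it per record.

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext(appName="dstream-lookup")
ssc = StreamingContext(sc, batchDuration=10)

# Read the external lookup data once and broadcast it to the executors.
# Note the broadcast is never refreshed while the job runs, which is the
# limitation the Delta stream-static join above avoids.
lookup = dict(sc.textFile("/data/lookup.csv")        # illustrative path
                .map(lambda line: line.split(",", 1))
                .collect())
lookup_bc = sc.broadcast(lookup)

# Enrich each incoming record from the broadcast map (example socket source).
stream = ssc.socketTextStream("host", 9999)
enriched = stream.map(lambda rec: (rec, lookup_bc.value.get(rec)))
enriched.pprint()

ssc.start()
ssc.awaitTermination()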