OK, so as expected the underlying database is Hive, and Hive uses HDFS storage. You said you encountered limitations on concurrent writes. The ordering and concurrency limitations are introduced by the Hive metastore, so to speak. Since this is all happening through Spark, the default implementation of the Hive metastore …
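One quick way to confirm which catalog and warehouse your Thrift Server is actually using is to read its configuration from the same beeline session; a minimal diagnostic sketch, assuming stock Spark settings:

    -- Run from a beeline session connected to the Spark Thrift Server.
    -- Reports whether Spark is backed by the Hive catalog or the
    -- in-memory one:
    SET spark.sql.catalogImplementation;
    -- Reports where managed tables are written on HDFS:
    SET spark.sql.warehouse.dir;

If the catalog implementation comes back as hive, the concurrency behaviour you are seeing would be governed by the Hive metastore rather than by Spark itself.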
Hi Mich and Pol,
Thanks for the feedback. The database layer is Hadoop 3.3.5. The cluster restarted, so I lost the stack trace in the application UI. In the snippets I saved, it looks like the exception being thrown was from Hive. Given the feedback you've provided, I suspect the issue is with how …
Hi Patrick,
You can have multiple writers simultaneously writing to the same table in HDFS by using an open table format with concurrency control. Several formats, such as Apache Hudi, Apache Iceberg, Delta Lake, and Qbeast Format, offer this capability. All of them provide advanced features …
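To make that concrete, here is a minimal sketch with Delta Lake; the table name, schema, and HDFS path are just illustrative, and it assumes your Thrift Server is launched with the Delta jars plus spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension and spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog:

    -- Declare the table as Delta rather than a plain Hive table.
    CREATE TABLE sales (
      id     BIGINT,
      amount DOUBLE
    ) USING delta
    LOCATION 'hdfs:///warehouse/sales';

    -- Plain appends like this can run from several beeline sessions at
    -- once: Delta's optimistic concurrency control treats blind INSERTs
    -- as non-conflicting transactions.
    INSERT INTO sales VALUES (1, 9.99);

Iceberg and Hudi follow the same pattern through their own Spark catalog and extension settings.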
It is not Spark SQL that throws the error; it is the underlying database layer that throws the error.
Spark acts as an ETL tool. What is the underlying DB where the table resides? Is concurrency supported? Please send the error to this list.
HTH
Mich Talebzadeh,
Solutions Architect/Engineer …
Hello,
I'm building an application on Spark SQL. The cluster is set up in standalone mode with HDFS as storage. The only Spark application running is the Spark Thrift Server using FAIR scheduling mode. Queries are submitted to the Thrift Server using beeline.
I have multiple queries that insert rows into the same table …
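Simplified, the pattern looks like the following, with each INSERT issued from a separate beeline connection to the Thrift Server (the table and values here are placeholders standing in for the real queries):

    -- Session 1: beeline -u jdbc:hive2://<thrift-server-host>:10000
    INSERT INTO events VALUES (1, 'first writer');

    -- Session 2, connected the same way and running at the same time
    -- against the same table:
    INSERT INTO events VALUES (2, 'second writer');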