Re: spark ETL and spark thrift server running together

Christophe Préaud Wed, 30 Mar 2022 06:01:13 -0700

Hi Alex,

As stated in the Hive documentation 
(https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration):


*An embedded metastore database is mainly used for unit tests. Only one process 
can connect to the metastore database at a time, so it is not really a 
practical solution but works well for unit tests.*


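The Thrift Server and your spark-submit job each try to boot the same embedded 
Derby database, so whichever starts second fails with XSDB6. You need to set up 
a remote metastore database (e.g. MariaDB / MySQL) for production use.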
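
As a rough sketch of what that could look like on the Spark side: the snippet below builds the standard Hive `javax.jdo.option.*` connection properties (passed through Spark's `spark.hadoop.` prefix) and renders them as `--conf` arguments. The host name, schema name, user, password and `etl_job.py` are placeholders I made up, and the MySQL Connector/J driver class is only one possible choice; adapt them to your environment.

```python
import shlex


def metastore_conf(jdbc_url, driver, user, password):
    """Return Spark settings that point the Hive metastore client
    at a shared external database instead of embedded Derby."""
    return {
        "spark.hadoop.javax.jdo.option.ConnectionURL": jdbc_url,
        "spark.hadoop.javax.jdo.option.ConnectionDriverName": driver,
        "spark.hadoop.javax.jdo.option.ConnectionUserName": user,
        "spark.hadoop.javax.jdo.option.ConnectionPassword": password,
    }


def as_submit_args(conf):
    """Render a settings dict as a flat list of --conf key=value arguments."""
    args = []
    for key, value in conf.items():
        args.extend(["--conf", f"{key}={value}"])
    return args


# All values below are hypothetical placeholders, not taken from this thread.
conf = metastore_conf(
    jdbc_url="jdbc:mysql://metastore-db.example.com:3306/hive_metastore",
    driver="com.mysql.cj.jdbc.Driver",
    user="hive",
    password="change-me",
)

# Print a command line that could be reviewed before running it.
print("spark-submit", shlex.join(as_submit_args(conf)), "etl_job.py")
```

The same settings would have to be given to the Thrift Server as well (start-thriftserver.sh accepts the spark-submit options), so that both processes talk to the same database. I am assuming here that the JDBC driver JAR is on the classpath (e.g. via `--jars`) and that the metastore schema has already been created, for instance with Hive's schematool. Putting the properties in a hive-site.xml under Spark's conf directory is an alternative that keeps the password off the command line.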

Regards,
Christophe.

On 3/30/22 13:31, Alex Kosberg wrote:
>
> Hi,
>
> Some details:
>
> - Spark SQL (version 3.2.1)
> - Driver: Hive JDBC (version 2.3.9)
> - ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 
>   5...500 worker threads
> - The BI tool connects via an ODBC driver
>
> After starting the Spark Thrift Server, I'm unable to run a PySpark script 
> with spark-submit, because both use the same metastore_db.
>
> error:
>
> Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@3acaa384, see the next exception for details.
>         at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>         at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
>         ... 140 more
> Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /tmp/metastore_db.
>
>  
>
> I need to be able to run PySpark (Spark ETL) while having spark thrift server 
> up for BI tool queries. Any workaround for it?
>
> Thanks!
