Hi Alex,

As stated in the Hive documentation (https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration):

*An embedded metastore database is mainly used for unit tests. Only one process can connect to the metastore database at a time, so it is not really a practical solution but works well for unit tests.*

Your Spark Thrift Server and your spark-submit job are both trying to boot the same embedded Derby database (/tmp/metastore_db), which is why the second process fails with ERROR XSDB6. You need to set up a remote metastore database (e.g. MariaDB / MySQL) for production use.

Regards,
Christophe.

On 3/30/22 13:31, Alex Kosberg wrote:
> Hi,
>
> Some details:
>
> · Spark SQL (version 3.2.1)
> · Driver: Hive JDBC (version 2.3.9)
> · ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads
> · BI tool is connected via ODBC driver
>
> After activating Spark Thrift Server I'm unable to run a pyspark script using
> spark-submit, as they both use the same metastore_db.
>
> error:
>
> Caused by: ERROR XJ040: Failed to start database 'metastore_db' with class loader org.apache.spark.sql.hive.client.IsolatedClientLoader$$anon$1@3acaa384, see the next exception for details.
>     at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
>     at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
>     ... 140 more
> Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /tmp/metastore_db.
>
> I need to be able to run PySpark (Spark ETL) while having spark thrift server
> up for BI tool queries. Any workaround for it?
>
> Thanks!
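
P.S. To make the remote metastore suggestion more concrete, here is a rough sketch (not a tested production setup) of a small Python helper that renders a hive-site.xml pointing at a MySQL-compatible database. The `javax.jdo.option.*` property names are the standard Hive metastore connection settings; the host name, database name, user, and password are placeholders I made up, and the driver class assumes MySQL Connector/J (MariaDB's driver would be `org.mariadb.jdbc.Driver`).

```python
import xml.etree.ElementTree as ET

# Hypothetical connection details: replace host, database, user, and password
# with your own. The driver class assumes MySQL Connector/J is on the classpath.
METASTORE_PROPS = {
    "javax.jdo.option.ConnectionURL": "jdbc:mysql://dbhost.example.com:3306/hive_metastore",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.cj.jdbc.Driver",
    "javax.jdo.option.ConnectionUserName": "hive",
    "javax.jdo.option.ConnectionPassword": "change-me",
}


def render_hive_site(props):
    """Return hive-site.xml content with one <property> element per entry."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # Pretty-print when available (ET.indent exists from Python 3.9 on).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hive_site(METASTORE_PROPS))
```

The generated file would go into Spark's conf directory so that the Thrift Server and spark-submit jobs both read it, and the matching JDBC driver jar has to be available to Spark. Depending on your setup, the metastore schema may also need to be initialized in that database first.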
