Hi all,

I set up HiveServer2 (version 1.2.1) with a remote metastore database (MySQL) and an embedded metastore server (hive.metastore.uris left empty). Strange things happened when I ran ALTER TABLE operations:

1. I opened Beeline and the Hive CLI in two windows. Beeline was connected to localhost (where HiveServer2 is running), and the Hive CLI was opened on the same machine.
2. I created a table with two columns in Beeline and ran DESC on it in both Beeline and the Hive CLI. Both returned the correct result.
3. I altered the table in Beeline to add another column. DESC in Beeline showed three columns, but in the Hive CLI no columns were shown at all! I looked at the COLUMNS_V2 table in the MySQL metastore, and all three columns were there.

I tried this because I'm working on a SparkSQL project, and I found a similar problem when I run an ALTER TABLE statement in a SparkSession and then DESC the table in the Hive CLI or Beeline, and vice versa.

However, when I start the metastore Thrift service and set hive.metastore.uris in hive-site.xml, the problem no longer appears: no matter whether the Hive CLI, Beeline, or a SparkSession alters a table, the others see the correct new schema.
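For reference, the hive-site.xml entry for the working (remote metastore) setup looks roughly like the sketch below. The host and port are illustrative assumptions (localhost, and 9083, which is the default metastore port), and the metastore service is assumed to have been started with `hive --service metastore`.

```xml
<!-- hive-site.xml: use a standalone metastore Thrift service instead of the embedded one. -->
<!-- Assumption: the metastore runs on localhost at the default port 9083. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://localhost:9083</value>
</property>
```

Leaving this property empty gives the embedded-metastore setup described above, where each process talks to MySQL directly.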
I have two questions:

1. According to https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin, the Hive metastore is stateless, so I thought that when HiveServer2 or a SparkSession has an embedded metastore, it just acts as a metastore client and fetches metadata from MySQL. It seems it's not that simple. How does the embedded metastore work? Even supposing it is stateful (for example, metadata is cached), that would not explain why DESC shows no columns at all instead of the two original columns.
2. Why do the Hive CLI and Beeline, on the same machine, get different metadata? I know the Hive CLI is a legacy tool, but if the implementation differs (for example, in how metadata is fetched), why do they see the same metadata once I start the metastore Thrift service?

Thanks,
John.xu