Hi Valentine,
I think your issue is related to https://issues.apache.org/jira/browse/HIVE-28198. IMO, other upstream engines (Trino/Spark) will also encounter this issue. As a workaround, you can set the property metastore.metadata.transformer.class to empty to disable the transformer; the behavior of create table in hive4 will then be the same as in hive3.

    <property>
      <name>metastore.metadata.transformer.class</name>
      <value> </value>
    </property>

Thanks,
Butao Zhang

---- Replied Message ----
| From | l<mr.tols...@gmail.com> |
| Date | 12/7/2024 18:42 |
| To | <dev@hive.apache.org> |
| Subject | Transition from hive3 to hive4 |

Thank you for developing hive. We have all been waiting for hive4 to come out for a long time. We have a huge DWH on hive3, and in our build we are switching to hive4. We do not understand how it works, and the ASF advised us to contact you.

A little introduction: we do not have ACID tables. We just have parquet tables that we create via create table and create external table.

We have 2 problem areas:

Hive DB:
Hive3 has just location=hive.warehouse.dir (both managed and unmanaged tables are stored here).
Hive4 has 2 paths:
1 - location=hive.warehouse.external.dir for unmanaged tables?
2 - managedlocation=hive.warehouse.dir for managed tables?
If we already have storage on hive3, what should we do? As far as I understand, we should take the path from the hive3 hive.warehouse.dir and write it to managedlocation, and create a new path for location via hive.warehouse.external.dir?
Next, we want to understand by what principle and through which DDL constructs tables are written to the location dir and to the managedlocation dir. Let me remind you that we do not have ACID tables.

Hive DDL:
We noticed that all the tables we created show up as create external table. The only difference between them is that some have
translated_to_external=true
external.table.purge=true
and some do not. Do we consider the tables that do not have external.table.purge=true to be unmanaged?
It does not matter whether we write create table or create external table: for all of them, the DDL shows create external table, and they are written exclusively to the location dir. We were unable to write a table to the managedlocation dir. When we write create external table, there is no external.table.purge=true inside; when we write create table, there is external.table.purge=true. But there is not a single table, old or new, whose DDL shows create table. That is, all tables automatically become create external table, even new ones, although we did not ask for it. In addition, some tables became managed, but by what principle is also unclear.

We are completely confused and discouraged by the behavior of hive4. Guys, help us figure it out: we have a petabyte of data and thousands of users, but we cannot currently explain to them how hive4 works.

Valentine Smith
Big Data Solution Architect