Hi Team,

Several of us are working on optimizing the ACID transaction related 
metastore calls.
Specifically, we are aiming for non-blocking openTxns calls (so that 2 parallel 
openTxns calls can run without blocking each other), which could increase the 
throughput of Hive considerably.

Currently, based on this document 
(https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration), 
we support:

Database        Minimum version  Parameter value  See also
MySQL           5.6.17           mysql
Postgres        9.1.13           postgres
Oracle          11g              oracle           hive.metastore.orm.retrieveMapNullsAsEmptyStrings
MS SQL Server   2008 R2          mssql

(hive.metastore.orm.retrieveMapNullsAsEmptyStrings: 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.orm.retrieveMapNullsAsEmptyStrings)

All of the databases above support some way of generating IDENTITY/AUTO 
INCREMENT values, except Oracle 11g. We could use different SQL queries for 
Oracle 11g (like SEQUENCE + NEXTVAL), but that would mean generating and 
maintaining separate queries just for Oracle backward compatibility.
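
To make the extra branching concrete, here is a rough sketch of what the 
Oracle 11g special case could look like. It assumes a simplified TXNS table 
with only TXN_ID and TXN_STATE columns and a hypothetical TXNS_SEQ sequence; 
the TxnIdSql class, the Db enum and the method names are made up for 
illustration and are not existing Hive code.

```java
/** Hypothetical helper, not part of Hive: picks the INSERT used to open a transaction. */
final class TxnIdSql {

    /** Database flavours discussed in this thread. */
    enum Db { MYSQL, POSTGRES, MSSQL, ORACLE_11G, ORACLE_12C }

    /** True only for the backend that cannot generate the id on its own. */
    static boolean needsSequence(Db db) {
        return db == Db.ORACLE_11G;
    }

    /**
     * Returns a parameterized INSERT for the simplified TXNS table
     * (assumed columns: TXN_ID, TXN_STATE).
     */
    static String insertSql(Db db) {
        if (needsSequence(db)) {
            // No IDENTITY columns: the id has to come from an explicit sequence
            // (TXNS_SEQ is an assumed name).
            return "INSERT INTO TXNS (TXN_ID, TXN_STATE) VALUES (TXNS_SEQ.NEXTVAL, ?)";
        }
        // IDENTITY / AUTO_INCREMENT backends: the database assigns TXN_ID itself.
        return "INSERT INTO TXNS (TXN_STATE) VALUES (?)";
    }
}
```

With Oracle 12c as the minimum version, needsSequence would always return 
false and the first branch could be dropped, leaving a single statement for 
every backend.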

Oracle 11g Extended Support is due to expire on 31 December 2020, see: 
https://www.oracle.com/webfolder/community/oracle_database/3905940.html
I do not see much overlap between Oracle 11g and Hive 4.0.0.

Since Oracle 12c supports IDENTITY columns as well, I propose that for Hive 
4.0.0 we support only Oracle 12c instead of adding complexity that will 
quickly become outdated.

Any thoughts or ideas are welcome.

Thanks,
Peter




