[ 
https://issues.apache.org/jira/browse/HIVE-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549918#comment-15549918
 ] 

Sergey Shelukhin edited comment on HIVE-14870 at 10/5/16 9:02 PM:
------------------------------------------------------------------

We only need very limited functionality compared to DN. A layer like this 
already exists in ACID, so I don't see why it cannot be reused and augmented. 
The only change needed would be the ability to replace some parts to optimize 
for Oracle (or other DBs), which will not be pretty but is imho preferable to 
the alternatives.

As I see it, I would be merely -0 on the thing in itself - it's bad enough to 
have 2.5 SQL "engines" (ORM, the one in ACID, and directsql) without adding a 
third, plus another federation thing that, unlike the direct sql one, is not 
hidden on a lower level. The direct sql one caused (and will probably cause ;)) 
a few problems and special cases, simple as it is... plus the confusion with 
failures-that-are-not-really-failures, failure to fall back, sudden unexplained 
slowdowns when the fallback is successful, etc.
There are probably all kinds of other issues; e.g. off the top of my head, how 
does this work with upgrade scripts - would we need to create and maintain 
another set? Would scripts to switch the schema between the old and the new 
always be the same, or would there need to be a back and forth script for every 
version eventually (I don't think one would ever need that but it is a 
possibility)? Etc.

However, my main meta concern is about the approach - what do we do if someone 
wants to have an optimized MySqlEngine, or MsSqlEngine, AzureEngine, etc.? They 
would totally copy-paste the Oracle one, rewrite a few critical SQL queries, 
and submit a patch. That can quickly turn into a maintenance nightmare.

It appears to me that the existing custom-SQL layer in ACID could be reused if 
desired (or used as inspiration) to make this store ANSI-ish (does it have any 
significant limitations currently?). That way we can keep query optimizations 
in a plugin (or even a switch statement if need be).
This also has an additional advantage of being able to deprecate and then ditch 
ORM altogether, which would simplify things instead of making them more complex.
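
To make the plugin idea concrete, here is a minimal sketch in Java of one 
portable query set with per-database overrides. Everything in it is 
hypothetical and for illustration only: the class, enum, and method names are 
not taken from the Hive codebase or from the attached proposal, and the SQL 
strings (including the Oracle hint) are placeholders rather than tuned queries.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch: one ANSI-ish query set, with optional per-DB overrides.
public class DialectSketch {
    // Databases the store might run against (illustrative list).
    enum Db { ANSI, ORACLE, MYSQL }

    // Logical queries the store needs (illustrative list).
    enum Query { GET_TABLE_BY_ID, LIST_PARTITION_NAMES }

    // Portable baseline used by every database unless overridden.
    static final Map<Query, String> ANSI_SQL = new EnumMap<>(Query.class);

    // Only the handful of queries worth optimizing per database live here.
    static final Map<Db, Map<Query, String>> OVERRIDES = new EnumMap<>(Db.class);

    static {
        // Placeholder SQL; table and column names are assumptions.
        ANSI_SQL.put(Query.GET_TABLE_BY_ID,
            "SELECT TBL_NAME FROM TBLS WHERE TBL_ID = ?");
        ANSI_SQL.put(Query.LIST_PARTITION_NAMES,
            "SELECT PART_NAME FROM PARTITIONS WHERE TBL_ID = ? ORDER BY PART_NAME");

        // Oracle overrides a single critical query; the rest stays shared.
        Map<Query, String> oracle = new EnumMap<>(Query.class);
        oracle.put(Query.LIST_PARTITION_NAMES,
            "SELECT /*+ FIRST_ROWS(100) */ PART_NAME FROM PARTITIONS "
                + "WHERE TBL_ID = ? ORDER BY PART_NAME");
        OVERRIDES.put(Db.ORACLE, oracle);
    }

    // Returns the DB-specific SQL if one is registered, else the baseline.
    static String sqlFor(Db db, Query query) {
        Map<Query, String> perDb = OVERRIDES.get(db);
        String sql = (perDb == null) ? null : perDb.get(query);
        return (sql != null) ? sql : ANSI_SQL.get(query);
    }

    public static void main(String[] args) {
        // MySQL has no overrides registered, so it gets the baseline text.
        System.out.println(sqlFor(Db.MYSQL, Query.LIST_PARTITION_NAMES));
        // Oracle gets its override for this one query only.
        System.out.println(sqlFor(Db.ORACLE, Query.LIST_PARTITION_NAMES));
    }
}
```

With this shape, an optimized MySQL variant would be a few extra map entries 
instead of a copy of the whole store, which is the maintenance concern above.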

Another alternative path (that could be pursued in parallel) is making RawStore 
pluggable so that such specific implementations could be used, while not being 
a supported part of the Hive codebase.



was (Author: sershe):
We only need very limited functionality compared to DN. The layer like this 
already exists in ACID so I don't see why it cannot be reused and augmented. 
The only changes needed would be the ability to replace some parts to optimize 
for Oracle (or other DBs), which will not be pretty but is imho preferable to 
the alternatives.

As I see it, I would be merely -0 on the thing in itself - it's bad enough to 
have 2.5 SQL "engines" (ORM, the one in acid, and directsql), to add the third, 
plus another federation thing that is also not hidden on a lower level like the 
direct sql one - and that one caused enough problems and special cases, simple 
as it is... and confusion with the logs of failures that are not really 
failures and sudden unexplained slowdowns).
There are all kinds of issues e.g. how does it work with upgrade scripts - 
would we need to create and maintain the other set? Would scripts to switch the 
schema always be the same, or would there need to be a back and forth script 
for every version eventually? Etc.

However, what do we do if someone wants to have an optimized MySqlEngine, or 
MsSqlEngine, etc. and c/p-s the Oracle one, rewrites a few critical SQL 
queries, and submits a patch? That will quickly turn into a maintenance 
nightmare.

It appears to me that the existing custom-SQL layer in ACID is "good enough" 
and could be reused, if desired (or used as inspiration) to make this ANSI, and 
keep query optimizations to a plugin (or even a switch statement if need be).
This also has an additional advantage of being able to deprecate and then ditch 
ORM altogether, which would simplify things instead of making them more complex.

Another alternative path (that could be pursued in paralle) is making RawStore 
pluggable so that such specific implementations would not be a supported part 
of Hive codebase.


> OracleStore: RawStore implementation optimized for Oracle
> ---------------------------------------------------------
>
>                 Key: HIVE-14870
>                 URL: https://issues.apache.org/jira/browse/HIVE-14870
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: OracleStoreDesignProposal.pdf
>
>
> The attached document is a proposal for a RawStore implementation which is 
> optimized for Oracle and replaces DataNucleus. The document outlines schema 
> changes, OracleStore implementation details, and performance tests against 
> ObjectStore, ObjectStore+DirectSQL, and OracleStore.



