[ 
https://issues.apache.org/jira/browse/HIVE-14870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549918#comment-15549918
 ] 

Sergey Shelukhin edited comment on HIVE-14870 at 10/5/16 9:07 PM:
------------------------------------------------------------------

We only need very limited functionality compared to DN. A layer like this 
already exists in ACID, so I don't see why it cannot be reused and augmented. 
The only change needed would be the ability to replace some parts to optimize 
for Oracle (or other DBs) via some sort of plugin option (or even a switch 
statement), which will not be pretty but is imho preferable to the alternatives.

As I see it, I would be merely -0 on the thing in itself - it's bad enough to 
have 2.5 SQL "engines" (ORM, the one in acid, and directsql) without adding a 
third, and then another federation thing that is not hidden on a lower level 
like the direct sql one. The direct sql one has caused (and will probably 
cause ;)) a few problems and special cases, simple as it is... plus the 
confusion with failures-that-are-not-really-failures, failure to fall back, 
sudden unexplained slowdowns when the fallback is successful, etc.
There are probably all kinds of other issues; e.g. off the top of my head, how 
does this work with upgrade scripts - would we need to create and maintain 
another set? Would scripts to switch the schema between the old and the new 
always be the same, or would there need to be a back and forth script for every 
version eventually (I don't think one would ever need that but it is a 
possibility)? Etc.

However, my main meta concern is about the approach - what do we do if someone 
wants to have an optimized MySqlEngine, or MsSqlEngine, AzureEngine, etc? They 
would totally copy-paste the Oracle one, rewrite a few critical SQL queries, 
and submit a patch. That can quickly turn into a maintenance nightmare.

It appears to me that the existing custom-SQL layer in ACID could be reused if 
desired (or used as inspiration) to make this store ANSI-ish (does it have any 
significant limitations currently?). That way we can keep query optimizations 
in a plugin (or even a switch statement if need be).
This also has an additional advantage of being able to deprecate and then ditch 
ORM altogether, which would simplify things instead of making them more complex.
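
To illustrate the "ANSI-ish store with per-database overrides in a switch 
statement" idea, here is a minimal sketch. It is not taken from Hive or from 
the ACID layer: the class PagedQuerySketch, the DbProduct enum and the 
firstRows helper are hypothetical names, and TBLS is used only as an example 
table. The default branch emits standard SQL, and only the databases that need 
a different form get their own case.

```java
import java.util.Locale;

public class PagedQuerySketch {
    /** Hypothetical set of backing databases; not Hive's actual enum. */
    enum DbProduct { ORACLE, MYSQL, OTHER }

    /** Maps a JDBC product name to a DbProduct; unknown names take the ANSI path. */
    static DbProduct detect(String productName) {
        String name = productName.toLowerCase(Locale.ROOT);
        if (name.contains("oracle")) return DbProduct.ORACLE;
        if (name.contains("mysql")) return DbProduct.MYSQL;
        return DbProduct.OTHER;
    }

    /** Builds a "first n rows" query: standard SQL by default, overridden per database. */
    static String firstRows(DbProduct db, String table, int n) {
        String base = "SELECT * FROM " + table;
        switch (db) {
            case ORACLE:
                // Oracle-specific form using the ROWNUM pseudocolumn.
                return "SELECT * FROM (" + base + ") WHERE ROWNUM <= " + n;
            case MYSQL:
                // MySQL-specific form using LIMIT.
                return base + " LIMIT " + n;
            default:
                // Standard (SQL:2008) row-limiting clause.
                return base + " FETCH FIRST " + n + " ROWS ONLY";
        }
    }

    public static void main(String[] args) {
        System.out.println(firstRows(detect("Oracle Database 12c"), "TBLS", 10));
    }
}
```

The point of the sketch is that a new database only adds a case for the few 
queries that need it, instead of a copy of the whole store.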

Another alternative path (that could be pursued in parallel) is making RawStore 
pluggable so that such specific implementations could be used without being 
a supported part of the Hive codebase.

Perhaps if there is already a patch we can have a collective effort to do the 
ANSI SQL thing. Making an entirely SQL-based access layer is a very valuable 
thing for the Hive community... however, we want to make sure that we don't 
actually go in the opposite direction with this effort.




> OracleStore: RawStore implementation optimized for Oracle
> ---------------------------------------------------------
>
>                 Key: HIVE-14870
>                 URL: https://issues.apache.org/jira/browse/HIVE-14870
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Chris Drome
>            Assignee: Chris Drome
>         Attachments: OracleStoreDesignProposal.pdf
>
>
> The attached document is a proposal for a RawStore implementation which is 
> optimized for Oracle and replaces DataNucleus. The document outlines schema 
> changes, OracleStore implementation details, and performance tests against 
> ObjectStore, ObjectStore+DirectSQL, and OracleStore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
