Hello!

I've implemented simple solution with some hard code by now.
It's tested with oracle database.

{code:sql}
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000:
Enter password for jdbc:hive2://localhost:10000:
Connected to: Apache Hive (version 2.2.0-SNAPSHOT)
Driver: Hive JDBC (version 2.2.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> SET 
hive.metastore.warehouse.dir=${env:HOME}/Documents/hive-warehouse;
No rows affected (0.158 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id            INT,
. . . . . . . . . . . . . . . .>  book_name          STRING,
. . . . . . . . . . . . . . . .>  author_name        STRING,
. . . . . . . . . . . . . . . .>  book_isbn          STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY 
"org.apache.hive.storagehandler.JDBCStorageHandler"
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = 
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = 
"jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .>  "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .>  "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );
No rows affected (2.297 seconds)
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000>
0: jdbc:hive2://localhost:10000> select * from books3;
+-----------------+-------------------+---------------------+-------------------+
| books3.book_id  | books3.book_name  | books3.author_name  | books3.book_isbn  
|
+-----------------+-------------------+---------------------+-------------------+
| 124123          | name              | author              | 132321adsaf31     
|
| 13              | name2             | author2             | asd213fadsf       
|
| 2345236         | name3             | author3             | asdfds1234123     
|
+-----------------+-------------------+---------------------+-------------------+
3 rows selected (2.146 seconds)
0: jdbc:hive2://localhost:10000> explain select * from books3;
+----------------------------------------------------+
|                      Explain                       |
+----------------------------------------------------+
| STAGE DEPENDENCIES:                                |
|   Stage-0 is a root stage                          |
|                                                    |
| STAGE PLANS:                                       |
|   Stage: Stage-0                                   |
|     Fetch Operator                                 |
|       limit: -1                                    |
|       Processor Tree:                              |
|         TableScan                                  |
|           alias: books3                            |
|           Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column 
stats: NONE |
|           Select Operator                          |
|             expressions: book_id (type: string), book_name (type: string), 
author_name (type: string), book_isbn (type: string) |
|             outputColumnNames: _col0, _col1, _col2, _col3 |
|             Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column 
stats: NONE |
|             ListSink                               |
|                                                    |
+----------------------------------------------------+
17 rows selected (0.508 seconds)
{code}

This solution works with two steps:
        1. First grab all meta info from external table 
        2. Configure DBInputFormat, DBOutputFormat with table meta

What do you think about to ask user specify all needed information about 
columns and types inside serde properties section?

Smth like that:

0: jdbc:hive2://localhost:10000> CREATE EXTERNAL TABLE books3 (
. . . . . . . . . . . . . . . .>  book_id            INT,
. . . . . . . . . . . . . . . .>  book_name          STRING,
. . . . . . . . . . . . . . . .>  author_name        STRING,
. . . . . . . . . . . . . . . .>  book_isbn          STRING
. . . . . . . . . . . . . . . .> )
. . . . . . . . . . . . . . . .> STORED BY 
“org.apache.hive.storagehandler.JDBCStorageHandler"
                WITH SERDEPROPERTIES (
                                        "hive.jdbc.columns.mapping" = 
“book_id:int(32), book_name:varchar(20), author_name:varchar(20), 
book_isbn:varchar(20)")         
. . . . . . . . . . . . . . . .> TBLPROPERTIES (
. . . . . . . . . . . . . . . .>  "mapred.jdbc.driver.class" = 
"oracle.jdbc.OracleDriver",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.url" = 
"jdbc:oracle:thin:@//localhost:49161/XE",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.username" = "*",
. . . . . . . . . . . . . . . .>  "mapred.jdbc.password" = "*",
. . . . . . . . . . . . . . . .>  "hive.jdbc.update.on.duplicate" = "true",
. . . . . . . . . . . . . . . .>  "mapreduce.jdbc.input.table.name" = "books"
. . . . . . . . . . . . . . . .> );


 

Reply via email to