[ 
https://issues.apache.org/jira/browse/HIVE-26046?focusedWorklogId=780513&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780513
 ]

ASF GitHub Bot logged work on HIVE-26046:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jun/22 14:32
            Start Date: 11/Jun/22 14:32
    Worklog Time Spent: 10m 
      Work Description: zhangbutao commented on code in PR #3276:
URL: https://github.com/apache/hive/pull/3276#discussion_r895032467


##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/dataconnector/jdbc/MySQLConnectorProvider.java:
##########
@@ -90,10 +90,20 @@ protected String getDataType(String dbDataType, int size) {
     // map any db specific types here.
     switch (dbDataType.toLowerCase())
     {
+    case "bit":
+      return toHiveBitType(size);
     default:
       mappedType = ColumnType.VOID_TYPE_NAME;
       break;
     }
     return mappedType;
   }
+
+  private String toHiveBitType(int size) {
+    if (size <= 1) {
+      return ColumnType.BOOLEAN_TYPE_NAME;
+    } else {
+      return ColumnType.BIGINT_TYPE_NAME;

Review Comment:
   @nrg4878 Sorry for not updating the issue for a long time. 
   In my opinion, we have two ways to get origina MySQL BITs values, that is 
b'111000' in MySQL will be read as '111000'(binary number) in hive.
   
   First way:
   In MySQL, we can get bit values using 'select bin(days) from work_calendar'. 
So i was thinking how we use MySQL native language to get bit values in hive. 
After some research, I found we could use jdbc table computation 
pushdown(hive.sql.query) to rewrite query sql. That is to say, when we query a 
bit type table using 'select days from work_calendar' in hive, we can rewrite 
the pushdown query to 'select bin(days) from work_calendar'. In this way, we 
may need check original MySQL data type and rewrite bit type query when 
pushdown. Maybe we should do some change in JdbcSerde.java.
   
https://cwiki.apache.org/confluence/display/Hive/JDBC+Storage+Handler#JDBCStorageHandler-ComputationPushdown
   
   Second way:
   I was also thinking using udf to to cast valus to BITs as you suggested. 
That is, 56(bigint) is cast to '111000'(binary number) in hive internal. 
Fortunately, I found hive already has this udf bin:
   
https://github.com/apache/hive/blob/f29cb2245c97102975ea0dd73783049eaa0947a0/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBin.java#L28-L37
   So, based on this pr, we can get bit values using hive query 'select 
bin(days) from work_calendar'. And we no need change current jdbc 
code(JdbcSerDe.java or others). I like this simple way as we can reuse bin udf 
to acheive our goal.
   
   wdyt? if you have better ways, feel free to come up with. thank you.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 780513)
    Time Spent: 2h 10m  (was: 2h)

> MySQL's bit datatype is default to void datatype in hive
> --------------------------------------------------------
>
>                 Key: HIVE-26046
>                 URL: https://issues.apache.org/jira/browse/HIVE-26046
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Standalone Metastore
>    Affects Versions: 4.0.0
>            Reporter: Naveen Gangam
>            Assignee: zhangbutao
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> describe on a table that contains a "bit" datatype gets mapped to void. We 
> need a explicit conversion logic in the MySQL ConnectorProvider to map it to 
> a suitable datatype in hive.
> {noformat}
> +-------------------------------+---------------------------------------------------+----------------------------------------------------+
> |           col_name            |                     data_type               
>       |                      comment                       |
> +-------------------------------+---------------------------------------------------+----------------------------------------------------+
> | tbl_id                        | bigint                                      
>       | from deserializer                                  |
> | create_time                   | int                                         
>       | from deserializer                                  |
> | db_id                         | bigint                                      
>       | from deserializer                                  |
> | last_access_time              | int                                         
>       | from deserializer                                  |
> | owner                         | varchar(767)                                
>       | from deserializer                                  |
> | owner_type                    | varchar(10)                                 
>       | from deserializer                                  |
> | retention                     | int                                         
>       | from deserializer                                  |
> | sd_id                         | bigint                                      
>       | from deserializer                                  |
> | tbl_name                      | varchar(256)                                
>       | from deserializer                                  |
> | tbl_type                      | varchar(128)                                
>       | from deserializer                                  |
> | view_expanded_text            | string                                      
>       | from deserializer                                  |
> | view_original_text            | string                                      
>       | from deserializer                                  |
> | is_rewrite_enabled            | void                                        
>       | from deserializer                                  |
> | write_id                      | bigint                                      
>       | from deserializer  
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to