[ https://issues.apache.org/jira/browse/SQOOP-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937896#comment-16937896 ]
Mahesh Balakrishnan edited comment on SQOOP-3451 at 9/25/19 5:17 PM: --------------------------------------------------------------------- [~dionusos], Using oracle dump command for float CREATE TABLE "SQOOP"."FLOATTEST" ( "IFLOAT" FLOAT(30) ) SELECT dump(IFLOAT) FROM SQOOP.FLOATTEST *OUTPUT* >>>> Typ=2 Len=2: 193,13,13 Then to understand more I looked into the Datatypes here: Oracle 11g: https://docs.oracle.com/cd/B28359_01/server.111/b28285/sqlqr06.htm#CHDBBHHE Oracle 12c: https://docs.oracle.com/database/121/SQLRF/sql_elements001.htm#SQLRF0021 Where they have mentioned that Code 2 is for both NUMBER and FLOAT but need to look at the scale which shows whether its a NUMBER or FLOAT ++++++++++++++++++++++++++++++++++++++++++++ The NUMERIC and DECIMAL datatypes can specify only fixed-point numbers. For those datatypes, the scale (s) defaults to 0. The FLOAT datatype is a floating-point number with a binary precision b. The default precision for this datatype is 126 binary, or 38 decimal. ++++++++++++++++++++++++++++++++++++++++++++ Since Sqoop is able to get the Scale of the datatype, is it possible to get the scale and then map it across as Float? I know it may take time to fix this but we need to document this so the users leveraging sqoop to load data from Oracle to Hive know what they need to do. was (Author: mbalakrishnan): [~dionusos], Using oracle dump command for float CREATE TABLE "SQOOP"."FLOATTEST" ( "IFLOAT" FLOAT(30) ) SELECT dump(IFLOAT) FROM SQOOP.FLOATTEST *OUTPUT* >>>> Typ=2 Len=2: 193,13,13 Then to understand more I looked into the Datatypes here: https://docs.oracle.com/cd/B28359_01/server.111/b28285/sqlqr06.htm#CHDBBHHE Where they have mentioned that Code 2 is for both NUMBER and FLOAT but need to look at the scale which shows whether its a NUMBER or FLOAT ++++++++++++++++++++++++++++++++++++++++++++ The NUMERIC and DECIMAL datatypes can specify only fixed-point numbers. For those datatypes, the scale (s) defaults to 0. The FLOAT datatype is a floating-point number with a binary precision b. The default precision for this datatype is 126 binary, or 38 decimal. ++++++++++++++++++++++++++++++++++++++++++++ Since Sqoop is able to get the Scale of the datatype, is it possible to get the scale and then map it across as Float? I know it may take time to fix this but we need to document this so the users leveraging sqoop to load data from Oracle to Hive know what they need to do. > Importing FLOAT from Oracle to Hive results in INTEGER > ------------------------------------------------------ > > Key: SQOOP-3451 > URL: https://issues.apache.org/jira/browse/SQOOP-3451 > Project: Sqoop > Issue Type: Bug > Components: codegen, connectors/oracle, hive-integration > Affects Versions: 1.4.7 > Reporter: Denes Bodo > Priority: Major > > We ran into an issue where there is a table created in Oracle 11g: > {noformat} > create table floattest (column1 float(30), column2 number(30,-127), column3 > number(30)); > {noformat} > We want to import date from Oracle to Hive: > {noformat} > sqoop import -D > mapred.child.java.opts='-Djava.security.egd=file:/dev/../dev/urandom' > -Dmapreduce.job.queuename=default --connect > "jdbc:oracle:thin:@DBHOST:1521/xe" --username sqoop --password sqoop --table > floattest --hcatalog-database default --hcatalog-table floattest > --create-hcatalog-table --hcatalog-external-table --hcatalog-storage-stanza > "stored as orc" -m 1 --columns COLUMN1,COLUMN2,COLUMN3 --verbose > {noformat} > In Sqoop logs we see the following: > {noformat} > 19/09/24 13:51:45 INFO manager.SqlManager: Executing SQL statement: SELECT > t.* FROM floattest t WHERE 1=0 > 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN1 of type [2, > 30, -127] > 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN2 of type [2, > 30, -84] > 19/09/24 13:51:45 DEBUG manager.SqlManager: Found column COLUMN3 of type [2, > 30, 0] > 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column names > projected : [COLUMN1, COLUMN2, COLUMN3] > 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Database column name - info > map : > COLUMN3 : [Type : 2,Precision : 30,Scale : 0] > COLUMN2 : [Type : 2,Precision : 30,Scale : -84] > COLUMN1 : [Type : 2,Precision : 30,Scale : -127] > 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: Creating HCatalog table > default.floattest for import > 19/09/24 13:51:45 INFO hcat.SqoopHCatUtilities: HCatalog Create table > statement: > create external table `default`.`floattest` ( > `column1` decimal(30), > `column2` decimal(30), > `column3` decimal(30)) > stored as orc > {noformat} > From this output we can see that Oracle states about column1 has Type=2 which > is NUMERIC (regarding to > https://docs.oracle.com/javase/7/docs/api/constant-values.html#java.sql.Types.FLOAT). > Sqoop translates NUMERIC to DECIMAL > (https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java#L1050L1107). > Due to Oracle uses {{scale=-127}} to sign about a NUMERIC that it is a FLOAT > instead of stating {{Type=6}}, Sqoop creates integers (decimal with 0 scale) > from NUMBER. > I think it is the fault of Oracle as it does not use Java Type=6 to sign type > of a float. What do you think? > ---- > Thank you for the details and investigation to [~mbalakrishnan] and Andrew > Miller -- This message was sent by Atlassian Jira (v8.3.4#803005)