[ https://issues.apache.org/jira/browse/SQOOP-3123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932989#comment-15932989 ]
ASF subversion and git services commented on SQOOP-3123:
--------------------------------------------------------

Commit e280b47eacc3428040669df5f91cedccd5be7e46 in sqoop's branch refs/heads/trunk from [~maugli]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=e280b47 ]

SQOOP-3123: Introduce escaping logic for column mapping parameters (the same escaping Sqoop already uses for DB column names), so that special column names (e.g. names containing the '#' character) and the mappings related to those columns can be written in the same format (and thus do not confuse end users). This also eliminates the related Avro format clashing issues. (Dmitry Zagorulkin via Attila Szabo)

> Import from Oracle using OraOop with --map-column-java to Avro fails if special characters occur in the table name or column name
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                  Key: SQOOP-3123
>                  URL: https://issues.apache.org/jira/browse/SQOOP-3123
>              Project: Sqoop
>           Issue Type: Bug
>           Components: codegen
>     Affects Versions: 1.4.6, 1.4.7
>             Reporter: Dmitry Zagorulkin
>              Fix For: 1.4.7
>
>          Attachments: SQOOP_3123.patch
>
>
> I'm trying to import data from Oracle to Avro using OraOop.
> My table:
> {code}
> CREATE TABLE "IBS"."BRITISH#CATS"
>    ( "ID" NUMBER,
>      "C_CODE" VARCHAR2(10),
>      "C_USE_START#DATE" DATE,
>      "C_USE_USE#NEXT_DAY" VARCHAR2(1),
>      "C_LIM_MIN#DAT" DATE,
>      "C_LIM_MIN#TIME" TIMESTAMP,
>      "C_LIM_MIN#SUM" NUMBER,
>      "C_OWNCODE" VARCHAR2(1),
>      "C_LIMIT#SUM_LIMIT" NUMBER(17,2),
>      "C_L@M" NUMBER(17,2),
>      "C_1_THROW" NUMBER NOT NULL ENABLE,
>      "C_#_LIMITS" NUMBER NOT NULL ENABLE
>    ) SEGMENT CREATION IMMEDIATE
>   PCTFREE 70 PCTUSED 40 INITRANS 2 MAXTRANS 255
>   NOCOMPRESS LOGGING
>   STORAGE(INITIAL 2097152 NEXT 524288 MINEXTENTS 1 MAXEXTENTS 2147483645
>   PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
>   BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
>   TABLESPACE "WORK" ;
> {code}
> My first script is:
> {code}
> ./sqoop import \
>   -Doraoop.timestamp.string=false \
>   --direct \
>   --connect jdbc:oracle:thin:@localhost:49161:XE \
>   --username system \
>   --password oracle \
>   --table IBS.BRITISH#CATS \
>   --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
>   --as-avrodatafile \
>   --map-column-java ID=String,C_CODE=String,C_USE_START#DATE=String,C_USE_USE#NEXT_DAY=String,C_LIM_MIN#DAT=String,C_LIM_MIN#TIME=String,C_LIM_MIN#SUM=String,C_OWNCODE=String,C_LIMIT#SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C_#_LIMITS=String
> {code}
> It fails with:
> {code}
> 2017-01-13 16:11:21,348 ERROR [main] tool.ImportTool (ImportTool.java:run(625)) - Import failed: No column by the name C_LIMIT#SUM_LIMITfound while importing data; expecting one of [C_LIMIT_SUM_LIMIT, C_OWNCODE, C_L_M, C___LIMITS, C_LIM_MIN_DAT, C_1_THROW, C_CODE, C_USE_START_DATE, C_LIM_MIN_SUM, ID, C_LIM_MIN_TIME, C_USE_USE_NEXT_DAY]
> {code}
> After that I found that Sqoop had replaced all special characters with underscores.
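> For context, the names in the error message come from Sqoop sanitizing each column name into a valid Java/Avro identifier by replacing illegal characters with underscores, which is why C_LIMIT#SUM_LIMIT only exists in the generated schema as C_LIMIT_SUM_LIMIT. A minimal sketch of that behavior follows; the class and method names and the regex are illustrative approximations, not Sqoop's actual implementation:
> {code}
> // Illustrative sketch only: approximates Sqoop's identifier cleanup.
> // toAvroIdentifier is a hypothetical name, not a real Sqoop API.
> public class IdentifierSketch {
>     static String toAvroIdentifier(String columnName) {
>         // Characters such as '#' and '@' are not legal in Avro field
>         // names, so each one is replaced with an underscore.
>         return columnName.replaceAll("[^A-Za-z0-9_]", "_");
>     }
>
>     public static void main(String[] args) {
>         System.out.println(toAvroIdentifier("C_LIMIT#SUM_LIMIT")); // C_LIMIT_SUM_LIMIT
>         System.out.println(toAvroIdentifier("C_L@M"));             // C_L_M
>         System.out.println(toAvroIdentifier("C_#_LIMITS"));        // C___LIMITS
>     }
> }
> {code}
> The committed fix applies the same escaping to the --map-column-java keys, so the mapping can be written with the original '#' names instead of the sanitized ones.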
> My second script, using the sanitized column names, is:
> {code}
> ./sqoop import \
>   -D oraoop.timestamp.string=false \
>   --direct \
>   --connect jdbc:oracle:thin:@localhost:49161:XE \
>   --username system \
>   --password oracle \
>   --table IBS.BRITISH#CATS \
>   --target-dir /Users/Dmitry/Developer/Java/sqoop/bin/imported \
>   --as-avrodatafile \
>   --map-column-java ID=String,C_CODE=String,C_USE_START_DATE=String,C_USE_USE_NEXT_DAY=String,C_LIM_MIN_DAT=String,C_LIM_MIN_TIME=String,C_LIM_MIN_SUM=String,C_OWNCODE=String,C_LIMIT_SUM_LIMIT=String,C_L_M=String,C_1_THROW=String,C___LIMITS=String \
>   --verbose
> {code}
> It fails with org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0:
> {code}
> 2017-01-13 16:14:54,687 WARN [Thread-26] mapred.LocalJobRunner (LocalJobRunner.java:run(560)) - job_local1372531461_0001
> java.lang.Exception: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:308)
> 	at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:112)
> 	at org.apache.sqoop.mapreduce.AvroOutputFormat$1.write(AvroOutputFormat.java:108)
> 	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:655)
> 	at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> 	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> 	at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:73)
> 	at org.apache.sqoop.mapreduce.AvroImportMapper.map(AvroImportMapper.java:39)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> 	at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null","long"]: 2017-01-13 11:22:53.0
> 	at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:709)
> 	at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
> 	at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
> 	at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
> 	at org.apache.avro.specific.SpecificDatumWriter.writeField(SpecificDatumWriter.java:90)
> 	at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:182)
> 	at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
> 	at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
> 	at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
> 	at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
> 	at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
> 	... 17 more
> {code}
> I found an older report of this problem which suggests that *oraoop.timestamp.string=false* should solve it, but it does not.
> What do you think?
> Also, please assign this issue to me.
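> The UnresolvedUnionException itself is easy to reproduce outside Sqoop: the generated Avro schema declares the timestamp columns as the union ["null","long"], but the mapper hands the writer a timestamp string. A minimal standalone sketch (the record schema and values here are illustrative, not Sqoop's generated code):
> {code}
> // Minimal repro sketch (illustrative, not Sqoop code): writing a String
> // into a ["null","long"] union field raises UnresolvedUnionException.
> import java.io.ByteArrayOutputStream;
> import org.apache.avro.Schema;
> import org.apache.avro.generic.GenericData;
> import org.apache.avro.generic.GenericDatumWriter;
> import org.apache.avro.io.EncoderFactory;
>
> public class UnionRepro {
>     public static void main(String[] args) throws Exception {
>         Schema schema = new Schema.Parser().parse(
>             "{\"type\":\"record\",\"name\":\"Row\",\"fields\":"
>           + "[{\"name\":\"C_LIM_MIN_TIME\",\"type\":[\"null\",\"long\"]}]}");
>         GenericData.Record record = new GenericData.Record(schema);
>         // A timestamp string instead of the long the union expects:
>         record.put("C_LIM_MIN_TIME", "2017-01-13 11:22:53.0");
>         GenericDatumWriter<GenericData.Record> writer = new GenericDatumWriter<>(schema);
>         // Throws org.apache.avro.UnresolvedUnionException:
>         // Not in union ["null","long"]: 2017-01-13 11:22:53.0
>         writer.write(record, EncoderFactory.get()
>             .binaryEncoder(new ByteArrayOutputStream(), null));
>     }
> }
> {code}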
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)