[ https://issues.apache.org/jira/browse/SQOOP-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Samet Karadag updated SQOOP-3480:
---------------------------------
    Description:
If the enclosed-by and escaped-by characters are both the double quote (\"), the field is escaped twice, so double quotes inside it are duplicated.

Example:

    gcloud dataproc jobs submit hadoop --cluster=sqoop --region=europe-west4 \
      --class=org.apache.sqoop.Sqoop --jars=$libs -- import \
      -Dmapreduce.job.user.classpath.first=true --connect=jdbc:**** \
      --target-dir=gs://my-oracle-extract/EMPLOYEES --table=HR.EMPLOYEES \
      --enclosed-by '\"' --escaped-by \" --fields-terminated-by '|' \
      --null-string '' --null-non-string '' --as-textfile

turns this field: <test field " >
into this enclosed and escaped value: <"test field """"">
which has 2 double quotes too many.

BigQuery requires the double quote as the escape character, and the field must also be enclosed by " so that newlines are handled.

The code in FieldFormatter.java should be changed from:

    if (escapingLegal) {
      // escaping is legal. Escape any instances of the escape char itself.
      withEscapes = str.replace("" + escape, "" + escape + escape);
    } else {
      // no need to double-escape
      withEscapes = str;
    }
    // if we have an enclosing character, and escaping is legal, then the
    // encloser must always be escaped.
    if (escapingLegal) {
      withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
    }

to:

    boolean alreadyEscaped = false;

    if (escapingLegal && !alreadyEscaped) {
      // escaping is legal. Escape any instances of the escape char itself.
      withEscapes = str.replace("" + escape, "" + escape + escape);
      alreadyEscaped = true;
    } else {
      // no need to double-escape
      withEscapes = str;
    }
    // if we have an enclosing character, and escaping is legal, then the
    // encloser must always be escaped.
    if (escapingLegal && !alreadyEscaped) {
      withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
    }

  was:
If the enclosed-by and escaped-by characters are both the double quote (\"), the field is escaped twice, so double quotes inside it are duplicated.

Example:

    gcloud dataproc jobs submit hadoop --cluster=sqoop --region=europe-west4 \
      --class=org.apache.sqoop.Sqoop --jars=$libs -- import \
      -Dmapreduce.job.user.classpath.first=true --connect=jdbc:**** \
      --target-dir=gs://my-oracle-extract/EMPLOYEES --table=HR.EMPLOYEES \
      --enclosed-by '\"' --escaped-by \" --fields-terminated-by '|' \
      --null-string '' --null-non-string '' --as-textfile

turns this field: <test field " >
into this enclosed and escaped value: <"test field """"">
which has 2 double quotes too many.

BigQuery requires the double quote as the escape character, and the field must also be enclosed by " so that newlines are handled.


> if enclosed-by and escaped-by characters are both double quote (\"). This
> causes duplicate escapes and thus duplicate characters in double quotes
> ------------------------------------------------------------------------
>
>                 Key: SQOOP-3480
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3480
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Samet Karadag
>            Priority: Blocker
>
> If the enclosed-by and escaped-by characters are both the double quote (\"),
> the field is escaped twice, so double quotes inside it are duplicated.
>
> Example:
>
>     gcloud dataproc jobs submit hadoop --cluster=sqoop --region=europe-west4 \
>       --class=org.apache.sqoop.Sqoop --jars=$libs -- import \
>       -Dmapreduce.job.user.classpath.first=true --connect=jdbc:**** \
>       --target-dir=gs://my-oracle-extract/EMPLOYEES --table=HR.EMPLOYEES \
>       --enclosed-by '\"' --escaped-by \" --fields-terminated-by '|' \
>       --null-string '' --null-non-string '' --as-textfile
>
> turns this field: <test field " >
> into this enclosed and escaped value: <"test field """"">
> which has 2 double quotes too many.
>
> BigQuery requires the double quote as the escape character, and the field
> must also be enclosed by " so that newlines are handled.
>
> The code in FieldFormatter.java should be changed from:
>
>     if (escapingLegal) {
>       // escaping is legal. Escape any instances of the escape char itself.
>       withEscapes = str.replace("" + escape, "" + escape + escape);
>     } else {
>       // no need to double-escape
>       withEscapes = str;
>     }
>     // if we have an enclosing character, and escaping is legal, then the
>     // encloser must always be escaped.
>     if (escapingLegal) {
>       withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
>     }
>
> to:
>
>     boolean alreadyEscaped = false;
>
>     if (escapingLegal && !alreadyEscaped) {
>       // escaping is legal. Escape any instances of the escape char itself.
>       withEscapes = str.replace("" + escape, "" + escape + escape);
>       alreadyEscaped = true;
>     } else {
>       // no need to double-escape
>       withEscapes = str;
>     }
>     // if we have an enclosing character, and escaping is legal, then the
>     // encloser must always be escaped.
>     if (escapingLegal && !alreadyEscaped) {
>       withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
>     }



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
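
The double escaping described in this issue can be reproduced outside Sqoop with a few lines of Java. This is only a minimal sketch, not the actual FieldFormatter code: the class name EscapeSketch and the methods currentBehavior and guarded are hypothetical, and wrapping the result in the enclosing character is an assumption based on the example output in the report. The guarded variant is one possible alternative to the alreadyEscaped flag proposed above: it skips the second replace only when the two characters are identical, so the encloser is still escaped when they differ.

```java
// Hypothetical stand-alone sketch; not part of Sqoop.
class EscapeSketch {

    // Mirrors the two replace calls quoted in the report (escaping legal).
    static String currentBehavior(String str, char escape, char enclose) {
        // Escape any instances of the escape char itself.
        String withEscapes = str.replace("" + escape, "" + escape + escape);
        // Escape the encloser; when enclose == escape this doubles again.
        withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
        // Assumption: the field is then wrapped in the enclosing character.
        return enclose + withEscapes + enclose;
    }

    // Alternative guard (an assumption, not the reporter's patch):
    // skip the second pass only when both characters are the same.
    static String guarded(String str, char escape, char enclose) {
        String withEscapes = str.replace("" + escape, "" + escape + escape);
        if (enclose != escape) {
            withEscapes = withEscapes.replace("" + enclose, "" + escape + enclose);
        }
        return enclose + withEscapes + enclose;
    }

    public static void main(String[] args) {
        String field = "test field \" ";
        // prints "test field """" "  (the inner quote is escaped twice)
        System.out.println(currentBehavior(field, '"', '"'));
        // prints "test field "" "    (the inner quote is escaped once)
        System.out.println(guarded(field, '"', '"'));
        // prints "a\"b"              (distinct chars: encloser still escaped)
        System.out.println(guarded("a\"b", '\\', '"'));
    }
}
```

With the alreadyEscaped flag as written in the proposal, the second replace would never run when escaping is legal, so a configuration such as --escaped-by '\\' --enclosed-by '\"' would appear to lose its encloser escaping; comparing the two characters avoids that.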