[ https://issues.apache.org/jira/browse/SQOOP-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124564#comment-15124564 ]
Sqoop QA bot commented on SQOOP-2811:
-------------------------------------

Testing file [SQOOP-2811.patch|https://issues.apache.org/jira/secure/attachment/12785295/SQOOP-2811.patch] against branch sqoop2 took 1:05:57.655409.

{color:red}Overall:{color} -1 due to an error(s), see details below:

{color:green}SUCCESS:{color} Clean was successful
{color:green}SUCCESS:{color} Patch applied correctly
{color:red}ERROR:{color} Patch does not add/modify any test case
{color:green}SUCCESS:{color} License check passed
{color:green}SUCCESS:{color} Patch compiled
{color:green}SUCCESS:{color} All unit tests passed (executed 1676 tests)
{color:orange}WARNING:{color} Test coverage has decreased ([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/artifact/patch-process/cobertura_report.txt])
  * Package {{connector/connector-hdfs}} has lower test coverage: Line coverage decreased by 5% (from 80% to 75%), Branch coverage decreased by 0% (from 59% to 59%)
{color:green}SUCCESS:{color} No new findbugs warnings ([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/artifact/patch-process/findbugs_report.txt])
{color:green}SUCCESS:{color} All integration tests passed (executed 190 tests)

Console output is available [here|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/console].

This message is automatically generated.

> Sqoop2: Extracting sequence files may result in duplicates
> ----------------------------------------------------------
>
>                 Key: SQOOP-2811
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2811
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.6
>            Reporter: Abraham Fine
>            Assignee: Abraham Fine
>         Attachments: SQOOP-2811.patch
>
>
> In the hdfs extractor we use:
> {code:java}
> if (start > filereader.getPosition()) {
>   filereader.sync(start); // sync to start
> }
> {code}
> to jump to the correct point in the sequence file that we want to extract.
> If the sequence file is small, multiple start points may `sync` to the same
> point and we could end up extracting the same record multiple times.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
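
The failure mode in the issue description can be illustrated with a small, self-contained sketch. This is a toy model and not the Sqoop extractor or the Hadoop SequenceFile API: the class name, the record and sync-marker offsets, the split boundaries, and the "guard" are all assumptions made up for illustration. The model assumes that sync(start) jumps to the first sync marker at or after start, and that each split reads at least one record before checking whether it has passed its end. The guard shown is only one possible way to avoid the duplicate; it is not claimed to be what SQOOP-2811.patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of split-based extraction; NOT the Hadoop SequenceFile API. */
class SyncDuplicatesSketch {
    // Hypothetical small file: six 50-byte records and a single sync marker at offset 250.
    static final int[] RECORD_OFFSETS = {0, 50, 100, 150, 200, 250};
    static final int[] SYNC_OFFSETS = {250};
    static final int FILE_LENGTH = 300;

    /** Assumed semantics: first sync marker at or after start, else end of file. */
    static int sync(int start) {
        for (int s : SYNC_OFFSETS) {
            if (s >= start) {
                return s;
            }
        }
        return FILE_LENGTH;
    }

    static boolean isSync(int offset) {
        for (int s : SYNC_OFFSETS) {
            if (s == offset) {
                return true;
            }
        }
        return false;
    }

    /** Returns the offsets of the records that one split [start, end) extracts. */
    static List<Integer> extractSplit(int start, int end, boolean guard) {
        List<Integer> out = new ArrayList<>();
        int pos = 0; // the reader starts at the beginning of the file
        if (start > pos) {
            pos = sync(start); // mirrors the snippet quoted in the issue
        }
        // Hypothetical guard: the sync landed in a later split, so read nothing here.
        if (guard && pos >= end) {
            return out;
        }
        for (int offset : RECORD_OFFSETS) {
            if (offset < pos) {
                continue; // record lies before the synced position
            }
            // Stop at a sync marker at or past the split end, but only after
            // at least one record has been read (the assumed naive behaviour).
            if (!out.isEmpty() && offset >= end && isSync(offset)) {
                break;
            }
            out.add(offset);
        }
        return out;
    }

    /** Runs three hypothetical splits over the file and concatenates their output. */
    static List<Integer> extractAll(boolean guard) {
        int[][] splits = {{0, 100}, {100, 200}, {200, 300}};
        List<Integer> all = new ArrayList<>();
        for (int[] split : splits) {
            all.addAll(extractSplit(split[0], split[1], guard));
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println("without guard: " + extractAll(false));
        System.out.println("with guard:    " + extractAll(true));
    }
}
```

In this model the splits starting at 100 and 200 both sync to offset 250, so without the guard the record at 250 is extracted twice (seven records in total, from a file holding six). With the guard, the split starting at 100 reads nothing and each record is extracted exactly once.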