[ https://issues.apache.org/jira/browse/FLINK-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14623977#comment-14623977 ]
ASF GitHub Bot commented on FLINK-1085:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/854#discussion_r34425008

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/operators/GroupReduceCombineDriver.java ---
    @@ -170,7 +171,14 @@ public void run() throws Exception {
         				// write the value again
         				if (!this.sorter.write(value)) {
        -					throw new IOException("Cannot write record to fresh sort buffer. Record too large.");
        +					if (oversizedRecordCount == Long.MAX_VALUE) {
    --- End diff --

    I think we can skip this test. It's going to be very hard to process 2^63 oversized records in a single Flink thread. Might take a millennium or so ;-)


> Unnecessary failing of GroupReduceCombineDriver
> -----------------------------------------------
>
>                 Key: FLINK-1085
>                 URL: https://issues.apache.org/jira/browse/FLINK-1085
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 0.6.1-incubating, 0.7.0-incubating
>            Reporter: Fabian Hueske
>              Labels: starter
>
> With a recent update (commit cbbcf7820885a8a9734ffeba637b0182a6637939) the
> GroupReduceCombineDriver was changed to not use an asynchronous partial
> sorter. Instead, the driver fills a sort buffer with records, sorts it,
> combines them, clears the buffer, and continues to fill it again.
> The GroupReduceCombineDriver fails if a record cannot be serialized into an
> empty sort buffer, i.e., if the record is too large for the buffer.
> Alternatively, we should emit a WARN message for the first record that is too
> large and just forward all records which do not fit into the empty sort
> buffer (maybe continue to count how many records were simply forwarded and
> give a second WARN message with this statistic).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
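The handling proposed in the issue description can be sketched as follows. This is a minimal standalone illustration, not the actual Flink driver code: the class name, the `forward()`/`flush()` helpers, and the tiny buffer capacity are hypothetical stand-ins for the real sorter and collector; only the pattern (warn on the first oversized record, forward it un-combined, keep a count for a final WARN) comes from the issue.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of the oversized-record handling proposed in FLINK-1085:
 * instead of failing when a record does not fit into an empty sort
 * buffer, warn once, forward the record directly, and count it.
 */
public class OversizedRecordSketch {
    static final int SORT_BUFFER_CAPACITY = 16; // bytes; tiny, for demonstration only

    long oversizedRecordCount = 0;               // records forwarded without combining
    final List<byte[]> buffer = new ArrayList<>();
    final List<byte[]> output = new ArrayList<>();
    int bufferedBytes = 0;

    /** Try to add a record to the sort buffer; forward it if it cannot fit even when the buffer is empty. */
    void write(byte[] record) {
        if (bufferedBytes + record.length <= SORT_BUFFER_CAPACITY) {
            buffer.add(record);
            bufferedBytes += record.length;
            return;
        }
        // Buffer full: the real driver would sort, combine, emit, and clear here, then retry.
        flush();
        if (record.length <= SORT_BUFFER_CAPACITY) {
            buffer.add(record);
            bufferedBytes = record.length;
        } else {
            // Too large even for an empty buffer: WARN on the first occurrence, then just forward.
            if (oversizedRecordCount == 0) {
                System.err.println("WARN: record larger than the sort buffer; forwarding without combining");
            }
            oversizedRecordCount++;
            output.add(record); // forwarded un-combined
        }
    }

    /** Stand-in for sort + combine + emit of the buffered records. */
    void flush() {
        output.addAll(buffer);
        buffer.clear();
        bufferedBytes = 0;
    }

    public static void main(String[] args) {
        OversizedRecordSketch driver = new OversizedRecordSketch();
        driver.write(new byte[8]);
        driver.write(new byte[32]); // oversized: warned about, forwarded, counted
        driver.write(new byte[8]);
        driver.flush();
        System.out.println("forwarded oversized records: " + driver.oversizedRecordCount);
    }
}
```

A second WARN with the final `oversizedRecordCount` could be emitted when the driver closes, as the description suggests; Stephan's review comment notes that guarding that counter against `Long.MAX_VALUE` overflow is unnecessary in practice.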