[ https://issues.apache.org/jira/browse/FLINK-1085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14612355#comment-14612355 ]
ASF GitHub Bot commented on FLINK-1085:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/854#discussion_r33809590

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/operators/GroupReduceCombineDriver.java ---
    @@ -170,7 +171,14 @@ public void run() throws Exception {
                     // write the value again
                     if (!this.sorter.write(value)) {
    -                    throw new IOException("Cannot write record to fresh sort buffer. Record too large.");
    +                    if (oversizedRecordCount == Long.MAX_VALUE) {
    +                        LOG.debug("Number of oversized record has exceeded MAX Long");
    +                    } else {
    +                        ++oversizedRecordCount;
    +                        LOG.debug("Cannot write record to fresh sort buffer. Record too large. Oversized record count: {}", oversizedRecordCount);
    +                    }
    +                    // simply forward the record
    +                    this.output.collect((OUT)value);
    --- End diff --

    I think calling `out.collect((OUT) value)` is fine. The generics are a bit confusing here, I think: for combine functions, OUT is really the same thing as IN, because they are not allowed to change the type. Agreed, this should really be simplified, otherwise it is confusing.


> Unnecessary failing of GroupReduceCombineDriver
> -----------------------------------------------
>
>                 Key: FLINK-1085
>                 URL: https://issues.apache.org/jira/browse/FLINK-1085
>             Project: Flink
>          Issue Type: Bug
>          Components: Local Runtime
>    Affects Versions: 0.6.1-incubating, 0.7.0-incubating
>            Reporter: Fabian Hueske
>              Labels: starter
>
> With a recent update (commit cbbcf7820885a8a9734ffeba637b0182a6637939) the
> GroupReduceCombineDriver was changed to not use an asynchronous partial
> sorter. Instead, the driver fills a sort buffer with records, sorts it,
> combines them, clears the buffer, and continues to fill it again.
> The GroupReduceCombineDriver fails if a record cannot be serialized into an
> empty sort buffer, i.e., if the record is too large for the buffer.
> Alternatively, we should emit a WARN message for the first record that is too
> large and just forward all records which do not fit into the empty sort
> buffer (maybe continue to count how many records were simply forwarded and
> give a second WARN message with this statistic).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
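
For readers of the archive, here is a minimal, self-contained sketch of the behaviour proposed in the issue description: fill the buffer, sort and emit it when full, and forward any record that does not fit even into an empty buffer while counting it. Everything here is hypothetical and for illustration only: `ToySortBuffer`, `Result`, and `run` are not Flink classes, capacity is measured in characters rather than serialized bytes, and the combine step is left out (sorted records are emitted unchanged).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative stand-in for the driver loop; none of these names are Flink APIs.
final class OversizedRecordSketch {

    // Toy sort buffer with a fixed capacity measured in characters (assumption).
    static final class ToySortBuffer {
        private final int capacity;
        private final List<String> records = new ArrayList<>();
        private int used;

        ToySortBuffer(int capacity) {
            this.capacity = capacity;
        }

        // Returns false if the record does not fit into the remaining space.
        boolean write(String record) {
            if (used + record.length() > capacity) {
                return false;
            }
            records.add(record);
            used += record.length();
            return true;
        }

        // Sorts the buffered records, emits them, and clears the buffer.
        // A real driver would run the combine function here instead of emitting as-is.
        void sortAndFlush(List<String> output) {
            Collections.sort(records);
            output.addAll(records);
            records.clear();
            used = 0;
        }
    }

    // Emitted records plus the number of records that were simply forwarded.
    static final class Result {
        final List<String> output;
        final long oversizedRecordCount;

        Result(List<String> output, long oversizedRecordCount) {
            this.output = output;
            this.oversizedRecordCount = oversizedRecordCount;
        }
    }

    static Result run(List<String> input, int capacity) {
        ToySortBuffer sorter = new ToySortBuffer(capacity);
        List<String> output = new ArrayList<>();
        long oversized = 0;

        for (String value : input) {
            if (sorter.write(value)) {
                continue;
            }
            // Buffer is full: sort, emit, clear, then write the value again.
            sorter.sortAndFlush(output);
            if (!sorter.write(value)) {
                // Record is too large even for an empty buffer: forward it instead of failing.
                if (oversized == 0) {
                    System.err.println("WARN: record too large for sort buffer, forwarding it");
                }
                if (oversized < Long.MAX_VALUE) {
                    oversized++;
                }
                output.add(value);
            }
        }
        sorter.sortAndFlush(output);

        if (oversized > 0) {
            System.err.println("WARN: forwarded " + oversized + " oversized record(s)");
        }
        return new Result(output, oversized);
    }
}
```

Forwarded records bypass sorting and combining, so the downstream reducer still has to handle them; that is acceptable only because a combiner is an optional pre-aggregation step.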