[ https://issues.apache.org/jira/browse/BEAM-12754?focusedWorklogId=637548&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-637548 ]
ASF GitHub Bot logged work on BEAM-12754:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 12/Aug/21 21:11
            Start Date: 12/Aug/21 21:11
    Worklog Time Spent: 10m
      Work Description: steveniemitz commented on a change in pull request #15327:
URL: https://github.com/apache/beam/pull/15327#discussion_r688087440

##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/coders/RowCoderGenerator.java
##########
@@ -316,27 +318,43 @@ static void encodeDelegate(
     // Encode the field count. This allows us to handle compatible schema changes.
     VAR_INT_CODER.encode(value.getFieldCount(), outputStream);
-    // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
-    NULL_LIST_CODER.encode(scanNullFields(value, hasNullableFields), outputStream);
-    for (int encodingPos = 0; encodingPos < value.getFieldCount(); ++encodingPos) {
-      @Nullable Object fieldValue = value.getValue(encodingPosToIndex[encodingPos]);
-      if (fieldValue != null) {
-        coders[encodingPos].encode(fieldValue, outputStream);
+
+    if (hasNullableFields) {
+      // If the row has null fields, extract the values out once so that both scanNullFields and
+      // the encoding can share it and avoid having to extract them twice.
+
+      List<Object> fieldValues = value.getValues();
+      // Encode a bitmap for the null fields to save having to encode a bunch of nulls.
+      NULL_LIST_CODER.encode(scanNullFields(fieldValues), outputStream);
+      for (int encodingPos = 0; encodingPos < fieldValues.size(); ++encodingPos) {
+        @Nullable Object fieldValue = fieldValues.get(encodingPosToIndex[encodingPos]);

Review comment:
ugh well that's very confusing. I'll change this to copy into an array instead then using `getValue`.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
Issue Time Tracking
-------------------

    Worklog Id:     (was: 637548)
    Time Spent: 0.5h  (was: 20m)

> RowCoderGenerator calls getValue multiple times
> -----------------------------------------------
>
>                 Key: BEAM-12754
>                 URL: https://issues.apache.org/jira/browse/BEAM-12754
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>    Affects Versions: 2.31.0
>            Reporter: Steve Niemitz
>            Assignee: Steve Niemitz
>            Priority: P2
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> RowCoderGenerator.encodeDelegate calls getValue for each field on a row
> twice: once to check whether it is null in scanNullFields, and once to
> actually get the value to be encoded.
> If getValue is expensive (for example, it has to recursively adapt a type to
> a Beam Row), this causes unneeded extra work.
> Instead we could call value.getValues to get all values once, then pass them
> to scanNullFields and re-use them when encoding the values.
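
The cost described in the issue can be illustrated with a small, self-contained sketch. It does not use the Beam API: `CountingRow`, `encodeTwoPass`, and `encodeSinglePass` are hypothetical names invented for this illustration, "encoding" is reduced to collecting the non-null values, and the `encodingPosToIndex` mapping from the real code is left out. The single-pass variant copies the values into an array with `getValue`, along the lines mentioned in the review comment; it is an assumption about the general shape of the fix, not the code that was merged.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical stand-in for a row with costly field access; not the Beam Row class. */
final class CountingRow {
  private final Object[] fields;
  int getValueCalls = 0; // counts how often a field is extracted

  CountingRow(Object... fields) {
    this.fields = fields;
  }

  int getFieldCount() {
    return fields.length;
  }

  Object getValue(int index) {
    getValueCalls++; // imagine an expensive recursive conversion happening here
    return fields[index];
  }
}

final class RowEncodeSketch {
  /** Scans for nulls, then reads every field again to encode it: two getValue calls per field. */
  static List<Object> encodeTwoPass(CountingRow row, BitSet nulls) {
    for (int i = 0; i < row.getFieldCount(); i++) {
      if (row.getValue(i) == null) {
        nulls.set(i);
      }
    }
    List<Object> encoded = new ArrayList<>();
    for (int i = 0; i < row.getFieldCount(); i++) {
      Object fieldValue = row.getValue(i);
      if (fieldValue != null) {
        encoded.add(fieldValue);
      }
    }
    return encoded;
  }

  /** Extracts each field once into an array, then reuses the array for both steps. */
  static List<Object> encodeSinglePass(CountingRow row, BitSet nulls) {
    Object[] fieldValues = new Object[row.getFieldCount()];
    for (int i = 0; i < fieldValues.length; i++) {
      fieldValues[i] = row.getValue(i);
      if (fieldValues[i] == null) {
        nulls.set(i);
      }
    }
    List<Object> encoded = new ArrayList<>();
    for (Object fieldValue : fieldValues) {
      if (fieldValue != null) {
        encoded.add(fieldValue);
      }
    }
    return encoded;
  }
}
```

Both methods produce the same null bitmap and the same list of non-null values; the difference is only in how many times `getValue` runs, which matters when each call has to adapt a nested type.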