Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21617 )
Change subject: IMPALA-13256: Support more than 2G rows for COUNT(*) on jdbc table
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21617/4/be/src/exec/data-source-scan-node.cc
File be/src/exec/data-source-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/21617/4/be/src/exec/data-source-scan-node.cc@149
PS4, Line 149: DCHECK_LE(input_batch_->rows.num_rows, 0x7FFFFFFF);
nit: std::numeric_limits<int32_t>::max() would also work, and be more standard.


http://gerrit.cloudera.org:8080/#/c/21617/4/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
File fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java:

http://gerrit.cloudera.org:8080/#/c/21617/4/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@220
PS4, Line 220: numRows = totalNumberOfRecords_ - currRow_ <= Integer.MAX_VALUE ?
Should this be enforced via a limit or something when making the query? It's not clear to me that we still get correct results if we do exceed Integer.MAX_VALUE.


--
To view, visit http://gerrit.cloudera.org:8080/21617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I47db58300cbe3270bab07da02c3fcde6d7072334
Gerrit-Change-Number: 21617
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Pranav Lodha <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
Gerrit-Comment-Date: Wed, 31 Jul 2024 21:31:41 +0000
Gerrit-HasComments: Yes
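
For reference on the two comments above, a minimal sketch of the idea under discussion. It is not the code in the patch. It assumes the row total and the current position are kept as 64-bit counters and that each reported batch is capped at the 32-bit maximum. The names NextBatchRows and CountAllRows are hypothetical, and the Java expression at line 220 is only paraphrased here in C++.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// The literal used in the DCHECK and the standard constant are the same value.
static_assert(std::numeric_limits<int32_t>::max() == 0x7FFFFFFF,
              "int32_t max is 0x7FFFFFFF");

// Hypothetical helper: number of rows to report in the next batch.
// The remaining count is computed in 64 bits, then capped so that it
// always fits in a 32-bit field.
int32_t NextBatchRows(int64_t total_rows, int64_t curr_row) {
  constexpr int64_t kMaxBatch = std::numeric_limits<int32_t>::max();
  return static_cast<int32_t>(std::min(total_rows - curr_row, kMaxBatch));
}

// Hypothetical driver: keeps requesting capped batches until the 64-bit
// position reaches the total, and returns the number of rows accounted for.
int64_t CountAllRows(int64_t total_rows) {
  int64_t curr_row = 0;
  while (curr_row < total_rows) {
    curr_row += NextBatchRows(total_rows, curr_row);
  }
  return curr_row;
}
```

Under these assumptions a total of 5,000,000,000 rows is reported as three batches and the running sum matches the total. Whether the actual scan node and JDBC data source iterate this way is exactly what the second comment asks the author to confirm.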
