Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/21039 )
Change subject: IMPALA-12786: Optimize count(*) for JSON scans ...................................................................... Patch Set 4: (5 comments) Thanks for the review! http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc File be/src/exec/json/json-parser.cc: http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@37 PS4, Line 37: This class is : /// essentially a simplified version of a rapidjson parser, removing specific data parsing : /// and copying operations, allowing for faster parsing of the number of JSON objects. > Can you put reference link to rapidjson code where this class is based from Done http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@45 PS4, Line 45: class JsonSkipper { > Can you add more tests for this class? Done http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@126 PS4, Line 126: bool SkipNumber() { : Consume('-'); > Can you add comment about valid number literal? What if it begin with '.' o Done http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@182 PS4, Line 182: case 'n': RETURN_IF_FALSE(SkipNull()); break; : case 't': RETURN_IF_FALSE(SkipTrue()); break; : case 'f': RETURN_IF_FALSE(SkipFalse()); break; > Is s_ always lower case? Yes, comments have been added for explanation. http://gerrit.cloudera.org:8080/#/c/21039/4/testdata/workloads/functional-query/queries/QueryTest/malformed_json.test File testdata/workloads/functional-query/queries/QueryTest/malformed_json.test: http://gerrit.cloudera.org:8080/#/c/21039/4/testdata/workloads/functional-query/queries/QueryTest/malformed_json.test@27 PS4, Line 27: select count(*) from malformed_json : ---- TYPES : bigint : ---- RESULTS : 13 > Is the same query yield the same result before this patch? Yes, and the dimension of the query option 'disable_optimized_json_count_star' has been added to TestQueriesJsonTables to ensure that the results are consistent before and after the optimization is enabled. -- To view, visit http://gerrit.cloudera.org:8080/21039 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I97ff097661c3c577aeafeeb1518408ce7a8a255e Gerrit-Change-Number: 21039 Gerrit-PatchSet: 4 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Wed, 27 Mar 2024 07:14:52 +0000 Gerrit-HasComments: Yes
