Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/21131 )
Change subject: IMPALA-11499: Refactor UrlEncode function to handle special characters ...................................................................... Patch Set 18: (2 comments) http://gerrit.cloudera.org:8080/#/c/21131/18/be/src/util/coding-util-test.cc File be/src/util/coding-util-test.cc: http://gerrit.cloudera.org:8080/#/c/21131/18/be/src/util/coding-util-test.cc@96 PS18, Line 96: TestUrl("Impala\x01""dev\x02""\tenv\v\x14\x12""for\x1D\x05\b\x03\x04" : "\x05\x06\x07\x09\x0C\r\x0E""beginners\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18" : "\x19\x10""Amuststart\x1B\x1C\x1D\x1E\x1F\x7F", : "Impala%01dev%02%09env%0b%14%12for%1d%05" : "%08%03%04%05%06%07%09%0c%0d%0ebeginners%0f%10%11%12%13%14%15%16%17%18%19%10Amust" : "start%1b%1c%1d%1e%1f%7f", false); : TestUrl("Impala\x01""dev\x02""\tenv\v\x14\x12""for\x1D\x05\b\x03\x04\x05\x06" : "\x07\x09\x0C\r\x0E""beginners\x0F\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19" : "\x10""Amuststart\x1B\x1C\x1D\x1E\x1F\x7F", : "Impala%01dev%02%09env%0b%14%12for%1d" : "%05%08%03%04%05%06%07%09%0c%0d%0ebeginners%0f%10%11%12%13%14%15%16%17%18%19%10" : "Amuststart%1b%1c%1d%1e%1f%7f", true); The tests look a bit messy for me. Not sure if all escaped characters are covered. Can we add a simple case that the input string just consists of all the escaped characters? Please extract the strings into variables to avoid writing them twice. If the following tests don't add more coverage, we can probably remove them. Just keep this one and add the simple case before this. If they do provide more coverage, please add some comments to explain it. http://gerrit.cloudera.org:8080/#/c/21131/15/be/src/util/coding-util.cc File be/src/util/coding-util.cc: http://gerrit.cloudera.org:8080/#/c/21131/15/be/src/util/coding-util.cc@66 PS15, Line 66: !isalnum(static_cast<unsigned char>(ch)) && !ShouldNotEscape(ch))) { > Done. We can write the test in python codes instead of using test files, e.g. add this in tests/query_test/test_insert.py under class TestInsertPartKey: def test_escaped_partition_values(self, unique_database): tbl = unique_database + ".tbl" self.execute_query("create table {}(i int) partitioned by(p string)".format(tbl)) import struct # TODO: add more values for b in range(1, 32): stmt = "insert into {} partition(p='{}') values (0)".format(tbl, struct.pack("B", b)) self.execute_query(stmt) res = self.execute_query("show partitions " + tbl) # TODO: verify more values for i in range(8): columns = res.data[i].split("\t") # Verify the partition value assert columns[0] == struct.pack("B", i+1) # Verify the partition location assert "p=%0" + str(i+1) in columns[8] assert len(res.data) == 32 Note that we still use python2 in tests so we need 'struct.pack("B", b)' to create a byte string. In python3, we can use bytes directly. The above test is incomplete. Please resolve the TODOs. -- To view, visit http://gerrit.cloudera.org:8080/21131 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb Gerrit-Change-Number: 21131 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Mon, 06 May 2024 07:15:04 +0000 Gerrit-HasComments: Yes
