Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21131 )

Change subject: IMPALA-11499: Refactor UrlEncode function to handle special 
characters
......................................................................


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21131/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21131/6//COMMIT_MSG@8
PS6, Line 8: characters
nit: let's put the title in one line


http://gerrit.cloudera.org:8080/#/c/21131/6//COMMIT_MSG@13
PS6, Line 13: that accurately maps special characters to their URL-encoded 
forms.
It's unclear to me how the Unicode characters are handled incorrectly. E.g. the 
one mentioned in the JIRA description is "运" (U+8FD0), which is encoded into 
3 bytes in UTF-8: 0xe8 0xbf 0x90. Could you mention in the commit message how 
this example fails before the patch and works now?

FWIW, there are online tools to convert Unicode to UTF-8, e.g. 
https://onlinetools.com/unicode/convert-unicode-to-utf8
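
For reference, a minimal sketch of the result I would expect for those three 
bytes. PercentEncodeSketch is a hypothetical helper written only for this 
comment, not the UrlEncode in coding-util.cc, and it assumes that everything 
other than ASCII letters and digits gets percent-encoded:

```cpp
#include <cctype>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper for illustration only (not Impala's UrlEncode).
// Percent-encodes every byte that is not an ASCII letter or digit.
std::string PercentEncodeSketch(const std::string& in) {
  std::ostringstream out;
  out << std::hex << std::uppercase << std::setfill('0');
  for (char ch : in) {
    // Go through unsigned char so that bytes >= 0x80 stay in the range 0..255.
    const unsigned char byte = static_cast<unsigned char>(ch);
    if (std::isalnum(byte)) {
      out << ch;
    } else {
      // setw applies to the next item only, so each byte prints as two digits.
      out << '%' << std::setw(2) << static_cast<int>(byte);
    }
  }
  return out.str();
}
```

With this sketch, PercentEncodeSketch("\xe8\xbf\x90") returns "%E8%BF%90", 
i.e. one %XX group per UTF-8 byte.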


http://gerrit.cloudera.org:8080/#/c/21131/6/be/src/util/coding-util.cc
File be/src/util/coding-util.cc:

http://gerrit.cloudera.org:8080/#/c/21131/6/be/src/util/coding-util.cc@80
PS6, Line 80: std::isalnum(ch)
It's an existing issue, but I think we should cast 'ch' to unsigned char as 
mentioned in https://en.cppreference.com/w/cpp/string/byte/isalnum

> the behavior of std::isalnum is undefined if the argument's value is neither 
> representable as unsigned char nor equal to EOF. To use these functions 
> safely with plain chars (or signed chars), the argument should first be 
> converted to unsigned char:
> std::isalnum(static_cast<unsigned char>(ch));
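
A minimal sketch of what that could look like. IsAlnumSafe is a hypothetical 
name used only for illustration; casting inline at the call site would work 
just as well:

```cpp
#include <cctype>

// Hypothetical wrapper for illustration: converts the argument to
// unsigned char before calling std::isalnum, as cppreference recommends,
// so that bytes >= 0x80 (e.g. UTF-8 continuation bytes) are never passed
// as negative values on platforms where plain char is signed.
inline bool IsAlnumSafe(char ch) {
  return std::isalnum(static_cast<unsigned char>(ch)) != 0;
}
```

In the default "C" locale this returns true for 'a' and '7', and false for 
'%' and for the byte 0xe8.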


http://gerrit.cloudera.org:8080/#/c/21131/6/be/src/util/coding-util.cc@85
PS6, Line 85: std::
nit: I think we can drop the explicit "std::" prefix here since std::uppercase 
is already introduced at L37.

For std::hex, it's introduced in "common/names.h":
https://github.com/apache/impala/blob/b39cd79ae84c415e0aebec2c2b4d7690d2a0cc7a/be/src/common/names.h#L82

Maybe the same for "std::isalnum".
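
To illustrate the idea, a standalone sketch of how using-declarations allow 
the unqualified names. ToUpperHex is a hypothetical function, and the two 
using-declarations here stand in for the ones the file already has:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Stand-ins for the using-declarations that already exist in the real code.
using std::hex;
using std::uppercase;

// Hypothetical function for illustration: formats 'value' as upper-case hex
// using the unqualified manipulator names.
std::string ToUpperHex(int value) {
  std::ostringstream out;
  out << hex << uppercase << value;
  return out.str();
}
```

Here ToUpperHex(255) returns "FF".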



--
To view, visit http://gerrit.cloudera.org:8080/21131
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88c4aba5d811dfcec809583d0c16fcbc0ca730fb
Gerrit-Change-Number: 21131
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Fri, 26 Apr 2024 09:03:05 +0000
Gerrit-HasComments: Yes
