On 2020/05/15 10:32, Daniel Shahaf wrote: > Yasuhito FUTATSUKI wrote on Thu, 14 May 2020 21:44 +00:00: >> Thank you for the review and lesson on my commit message. I intended >> to send this commit message with the patch before commit, but missed >> by mistakes. >> >> I added note about quotation characters which could also cause SyntaxError. >> >> Before replace this log message, I'd like to get a review again. >> > > Thanks for your diligence. > >> [[[ >> entries-dump: Escape string-typed attribute values when serializing >> them as Python string literals. >> >> Before this commit, a filesystem node named "foo\bar" (a single, >> 7-character path component) would cause "e.name = 'foo\bar'" to be >> emitted. The unescaped backslash would manifest as a test failure or >> a SyntaxError, depending on the following characters. >> > > *nod* Changing the Unicode quotes to double quotes is fine. > > (I generally use Unicode quotes so it's clear which quotes delimit code > from English prose and which quotes are literal parts of the code. > However, that's just me.) > >> Also, user names can contain "'" (a single quote character) and/or """ >> (a double quote character), which would potentially cause a SyntaxError >> even if we choose ether of them to quote string literals. > > Something is not clear to me in this paragraph. I do see that if we had > written «printf("'%s'", value)», SyntaxError's could still have resulted; > however, I don't see how that is a reason not to choose single quotes. > We _could_ still have chosen single quotes if we had escaped backslashes, > single quotes, and newlines in «value» before printing it, couldn't we? > > It's not clear to me what this paragraph is trying to explain: whether > it's trying to explain why the code generates a «bytes» object as > opposed to a «str» object, or to explain the choice to escape every byte > (even those that don't _require_ escaping according to Python's bytes > literals syntax), or something else. > > Also, s/ether/either/.
Thanks for the review, again. I intended to to explain there can be more case that we should use escape expression, and I'm sure it wasn't achieved. [[[ entries-dump: Escape string-typed attribute values when serializing them as Python string literals. Before this commit, a filesystem node named "foo\bar" (a single, 7-character path component) would cause "e.name = 'foo\bar'" to be emitted. The unescaped backslash would manifest as a test failure or a SyntaxError, depending on the following characters. This was triggered by update_tests.py 76 windows_update_backslash under Python 3 on Windows. There can be some other characters that should be escaped. For example, user names can contain "'" (a single quote character) and/or """ (a double quote character), which would potentially cause a SyntaxError even if we choose either of them to quote string literals. To avoid to overlook such potentially unsafe characters, I decide to use hex value escape for all characters. Furthermore, to ensure that values are decoded to Unicode as UTF-8 byte sequences when we use hex value escape under Python 3, we print them as bytes value and then encode it. * subversion/tests/cmdline/entries-dump.c (print_prefix): New function. (str_value): - Add argument to specify pool. - Print human readable value of "value" as is in comment, then set it as str value by using hex escaped bytes literal. (entries_dump): Add pool argument to str_value() calls. (main): - Print "Entry" class definition as prefix before entry_dump() or tree_dump() - Style fix on if statement (using blocks). (): Add include files for assert() and svn_xml_escape_attr_cstring() * subversion/tests/cmdline/svntest/main.py (run_entiresdump, run_entriesdump_tree): Move definition of "Entry" class into generated code by entries-dump execution. Found by: svn-windows-ra buildbot Review by: danielsh ]]] Thanks, -- Yasuhito FUTATSUKI <futat...@poem.co.jp>/<futat...@yf.bsdclub.org>