On Sat, May 16, 2026 at 3:06 PM Timofei Zhakov <[email protected]> wrote:
>
> On Fri, May 15, 2026 at 4:02 PM Jun Omae <[email protected]> wrote:
> >
> > On 2026/05/15 0:28, Timofei Zhakov wrote:
> > > There is a test called basic_tests.py:argv_with_best_fit_chars. It
> > > checks that svn rejects Unicode symbols. Functionality which was
> > > illegal before changes introduced in that branch.
> >
> > In the branch, svn command receives the arguments as utf-8 bytes, but the
> > output of the pipe is applied best-fit encoding conversion.
> >
> > [[[
> > diff --git a/subversion/tests/cmdline/basic_tests.py
> > b/subversion/tests/cmdline/basic_tests.py
> > index 88f43bfae7..edb697b795 100755
> > --- a/subversion/tests/cmdline/basic_tests.py
> > +++ b/subversion/tests/cmdline/basic_tests.py
> > @@ -3357,20 +3357,22 @@ def argv_with_best_fit_chars(sbox):
> >          yield chr(c), mbcs
> >
> >    count = 0
> > -  # E721113: Conversion from UTF-16 failed: No mapping for the Unicode
> > -  # character exists in the target multi-byte code page.
> > -  expected_stderr = 'svn: E721113: '
> > +  # The argument is received as utf-8 bytes, but the output to the pipe
> > +  # is applied best-fit encoding conversion.
> >    for wc, mbcs in iter_bestfit_chars():
> >      count += 1
> >      logger.info('Code page %r - U+%04x -> 0x%s', codepage, ord(wc),
> > mbcs.hex())
> >      if mbcs == b'"':
> > -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> > +      expected_stderr = r'^"foo" "bar": unknown command'
> > +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
> >                                            'foo{0} {0}bar'.format(wc))
> >      elif mbcs == b'\\':
> > -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> > +      expected_stderr = r'^"foo\\" \\"bar": unknown command'
> > +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
> >                                            'foo{0}" {0}"bar'.format(wc))
> >      elif mbcs == b' ':
> > -      svntest.actions.run_and_verify_svn2(None, expected_stderr, 1, 'help',
> > +      expected_stderr = r'^"foo bar": unknown command'
> > +      svntest.actions.run_and_verify_svn2(None, expected_stderr, 0, 'help',
> >                                            'foo{0}bar'.format(wc))
> >    if count == 0:
> >      raise svntest.Skip('No best fit characters in code page %r' % codepage)
> > ]]]
>
> I tested this patch and can confirm that it works. I don't know why
> but as far as I remember I was doing exactly the same thing, but it
> didn't work for me.
>
> I remember I once heard that "everything looks like physics if you
> don't know magic". That's exactly the case. Sometimes we just need a
> pair of fresh eyes. :-)
>
> +1 for the changes
>
> > Recently, I'm trying 1.14.x with utf-8 code page using activeCodePage
> > manifest [1]. It almost works fine (e.g. add emoji filenames and checkout,
> > ...) however output to stderr is garbled and not fixed yet.
> >
> > [1]
> > https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>
> That sounds interesting. If as you are saying output is converted to
> the local encoding, it introduces a lot of inconsistency and yeah we
> have no emojis.
>
> Since it's almost always that the encoding is UTF-8 on the majority of
> Unix systems, I think it makes a lot of sense to take the same
> approach on Windows.

I saw you committed the patch in r1934334. Thank you so much! I guess
the CI is back to green!

I'll post an email on dev@ about merging the branch into trunk.

-- 
Timofei Zhakov

