On 2020/09/19 5:26, Yasuhito FUTATSUKI wrote:
> Hi,
> 
> While I tried to make an example that escape_path() does not work as
> expected in specific locale such as ja_JP.SJIS or CP932 (suggested by
> Jun), I found it seems svn_cmdline__edit_file_externally() always 
> passes the file name as UTF-8 even if the LC_CTYPE is other than
> UTF-8.
> 
> I think it is also need svn_path_cstring_from_utf8() conversion for
> file_name in svn_cmdline__edit_file_externally().

As far as I read the code, svn_cmdline__edit_file_externally() is
called with paths come from svn_io_open_unique_file3() or
"local_abspath" from svn_client_conflict_t, so I believe it is
needed. 

I wrote a patch address it.
(attached fix-edit-file-externally-patch.txt)
[[[
Fix file name to edit from utf8 to local style.

* subversion/libsvn_subr/cmdline.c (svn_cmdline__edit_file_externally):
  Apply svn_path_cstring_from_utf8() to target file name.
]]]

> Here is a reproucing script. (Using ja_JP.UTF8 and ja_JP.SJIS locale).

As it is quite few system can use ja_JP.SJIS locale (I only know
FreeBSD and macOS) and there is also escape_path issue in SJIS locale,
I modified this script to use ja_JP.eucJP locale.

[[[
#!/bin/sh
# assuming UTF-8 encoding in this file

testdir=/tmp/svn-conflict-edit-filename-test

if [ ! -d ${testdir} ]; then
  mkdir -p ${testdir}
fi

reposdir=${testdir}/testrepo
reposurl=file://${reposdir}
:${svn:=svn}

svnadmin create ${reposdir}
cat > ${testdir}/record_filename.sh <<EOF
#!/bin/sh
LC_CTYPE=C; export LC_CTYPE
echo \$* > ${testdir}/svn-conflict-edit-file-name.txt
exit 0
EOF

LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE

# add a file "予定表.txt" (it means schedule in Japanese)
# in UTF-8 working copy.
# "予定表.txt" represented in hex are followings:
#    e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8)
#    97 5c 92 e8 95 5c 2e 74 78 74          (SJIS; contains two '\'(== 0x5c))
#    cd bd c4 ea 9c bd 2e 74 78 74          (EUC-JP)
schedfn_utf8="予定表.txt"
schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis`
schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp`

${svn} checkout ${reposurl} ${testdir}/wc-utf-8
cd ${testdir}/wc-utf-8

cat > ${schedfn_utf8} <<EOF
2020/09/19 foo
EOF

${svn} add ${schedfn_utf8}
${svn} commit -m 'add schedule memo.'

# prepare EUC-JP locale wc.
(LC_CTYPE=ja_JP.eucJP; export LC_CTYPE ; \
    ${svn} checkout $reposurl ${testdir}/wc-eucjp)

# update the file in UTF-8 wc and commit it
cat >> ${schedfn_utf8} <<EOF
2020/09/20 bar
EOF
${svn} commit -m 'add schedule at 2020/09/20' 

# add local modification in EUC-JP wc
LC_CTYPE=ja_JP.eucJP ; export LC_CTYPE
cd ${testdir}/wc-eucjp
cat >> ${schedfn_eucjp} <<EOF
2020/09/21 baz
EOF

${svn} update --force-interactive --accept edit \
  --editor-cmd "/bin/sh ${testdir}/record_filename.sh"

LC_CTYPE=C ; export LC_CTYPE
ls | od -t x1
od -t x1 ${testdir}/svn-conflict-edit-file-name.txt
]]]

The result with unpatched svn (last 4 lines):
[[[
0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
0000013
0000000    e4  ba  88  e5  ae  9a  e8  a1  a8  2e  74  78  74  0a        
0000016
]]]

The result with patched svn (last 4 lines):
[[[
0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
0000013
0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
0000013
]]]

Cheers,
-- 
Yasuhito FUTATSUKI <futat...@yf.bsclub.org>
Index: subversion/libsvn_subr/cmdline.c
===================================================================
--- subversion/libsvn_subr/cmdline.c    (revision 1881848)
+++ subversion/libsvn_subr/cmdline.c    (working copy)
@@ -1400,6 +1400,7 @@
                                   apr_pool_t *pool)
 {
   const char *editor, *cmd, *base_dir, *file_name, *base_dir_apr;
+  const char *file_name_local;
   char *old_cwd;
   int sys_err;
   apr_status_t apr_err;
@@ -1423,9 +1424,11 @@
     return svn_error_wrap_apr
       (apr_err, _("Can't change working directory to '%s'"), base_dir);
 
+  SVN_ERR(svn_path_cstring_from_utf8(&file_name_local, file_name, pool));
   /* editor is explicitly documented as being interpreted by the user's shell,
      and as such should already be quoted/escaped as needed. */
-  cmd = apr_psprintf(pool, "%s %s", editor, escape_path(pool, file_name));
+  cmd = apr_psprintf(pool, "%s %s", editor,
+                     escape_path(pool, file_name_local));
   sys_err = system(cmd);
 
   apr_err = apr_filepath_set(old_cwd, pool);

Reply via email to