On Sun, Sep 20, 2020 at 11:06:21PM +0900, Yasuhito FUTATSUKI wrote:
> On 2020/09/19 5:26, Yasuhito FUTATSUKI wrote:
> > Hi,
> > 
> > While I tried to make an example that escape_path() does not work as
> > expected in specific locale such as ja_JP.SJIS or CP932 (suggested by
> > Jun), I found it seems svn_cmdline__edit_file_externally() always 
> > passes the file name as UTF-8 even if the LC_CTYPE is other than
> > UTF-8.
> > 
> > I think it is also need svn_path_cstring_from_utf8() conversion for
> > file_name in svn_cmdline__edit_file_externally().
> 
> As far as I read the code, svn_cmdline__edit_file_externally() is
> called with paths come from svn_io_open_unique_file3() or
> "local_abspath" from svn_client_conflict_t, so I believe it is
> needed. 

Your patch looks good. I have one question below:
 
> I wrote a patch address it.
> (attached fix-edit-file-externally-patch.txt)
> [[[
> Fix file name to edit from utf8 to local style.
> 
> * subversion/libsvn_subr/cmdline.c (svn_cmdline__edit_file_externally):
>   Apply svn_path_cstring_from_utf8() to target file name.
> ]]]
> 
> > Here is a reproucing script. (Using ja_JP.UTF8 and ja_JP.SJIS locale).
> 
> As it is quite few system can use ja_JP.SJIS locale (I only know
> FreeBSD and macOS) and there is also escape_path issue in SJIS locale,
> I modified this script to use ja_JP.eucJP locale.

It looks like apr_escape_shell() will escape 0x5c (backslash, \) when
looking for ASCII character bytes. When 0x5c is escaped this breaks the
SJIS-encoded byte sequence.

Have you already tried always using escape_path() on the UTF-8 version of
the string, and then converting the escaped path to the locale's encoding?
In other words: First use escape_path, then use svn_path_cstring_from_utf8?
Perhaps that will make SJIS work?

Cheers,
Stefan

> [[[
> #!/bin/sh
> # assuming UTF-8 encoding in this file
> 
> testdir=/tmp/svn-conflict-edit-filename-test
> 
> if [ ! -d ${testdir} ]; then
>   mkdir -p ${testdir}
> fi
> 
> reposdir=${testdir}/testrepo
> reposurl=file://${reposdir}
> :${svn:=svn}
> 
> svnadmin create ${reposdir}
> cat > ${testdir}/record_filename.sh <<EOF
> #!/bin/sh
> LC_CTYPE=C; export LC_CTYPE
> echo \$* > ${testdir}/svn-conflict-edit-file-name.txt
> exit 0
> EOF
> 
> LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE
> 
> # add a file "予定表.txt" (it means schedule in Japanese)
> # in UTF-8 working copy.
> # "予定表.txt" represented in hex are followings:
> #    e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8)
> #    97 5c 92 e8 95 5c 2e 74 78 74          (SJIS; contains two '\'(== 0x5c))
> #    cd bd c4 ea 9c bd 2e 74 78 74          (EUC-JP)
> schedfn_utf8="予定表.txt"
> schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis`
> schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp`
> 
> ${svn} checkout ${reposurl} ${testdir}/wc-utf-8
> cd ${testdir}/wc-utf-8
> 
> cat > ${schedfn_utf8} <<EOF
> 2020/09/19 foo
> EOF
> 
> ${svn} add ${schedfn_utf8}
> ${svn} commit -m 'add schedule memo.'
> 
> # prepare EUC-JP locale wc.
> (LC_CTYPE=ja_JP.eucJP; export LC_CTYPE ; \
>     ${svn} checkout $reposurl ${testdir}/wc-eucjp)
> 
> # update the file in UTF-8 wc and commit it
> cat >> ${schedfn_utf8} <<EOF
> 2020/09/20 bar
> EOF
> ${svn} commit -m 'add schedule at 2020/09/20' 
> 
> # add local modification in EUC-JP wc
> LC_CTYPE=ja_JP.eucJP ; export LC_CTYPE
> cd ${testdir}/wc-eucjp
> cat >> ${schedfn_eucjp} <<EOF
> 2020/09/21 baz
> EOF
> 
> ${svn} update --force-interactive --accept edit \
>   --editor-cmd "/bin/sh ${testdir}/record_filename.sh"
> 
> LC_CTYPE=C ; export LC_CTYPE
> ls | od -t x1
> od -t x1 ${testdir}/svn-conflict-edit-file-name.txt
> ]]]
> 
> The result with unpatched svn (last 4 lines):
> [[[
> 0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
> 0000013
> 0000000    e4  ba  88  e5  ae  9a  e8  a1  a8  2e  74  78  74  0a        
> 0000016
> ]]]
> 
> The result with patched svn (last 4 lines):
> [[[
> 0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
> 0000013
> 0000000    cd  bd  c4  ea  c9  bd  2e  74  78  74  0a                    
> 0000013
> ]]]
> 
> Cheers,
> -- 
> Yasuhito FUTATSUKI <futat...@yf.bsclub.org>

> Index: subversion/libsvn_subr/cmdline.c
> ===================================================================
> --- subversion/libsvn_subr/cmdline.c  (revision 1881848)
> +++ subversion/libsvn_subr/cmdline.c  (working copy)
> @@ -1400,6 +1400,7 @@
>                                    apr_pool_t *pool)
>  {
>    const char *editor, *cmd, *base_dir, *file_name, *base_dir_apr;
> +  const char *file_name_local;
>    char *old_cwd;
>    int sys_err;
>    apr_status_t apr_err;
> @@ -1423,9 +1424,11 @@
>      return svn_error_wrap_apr
>        (apr_err, _("Can't change working directory to '%s'"), base_dir);
>  
> +  SVN_ERR(svn_path_cstring_from_utf8(&file_name_local, file_name, pool));
>    /* editor is explicitly documented as being interpreted by the user's 
> shell,
>       and as such should already be quoted/escaped as needed. */
> -  cmd = apr_psprintf(pool, "%s %s", editor, escape_path(pool, file_name));
> +  cmd = apr_psprintf(pool, "%s %s", editor,
> +                     escape_path(pool, file_name_local));
>    sys_err = system(cmd);
>  
>    apr_err = apr_filepath_set(old_cwd, pool);

Reply via email to