On Fri, 19 Feb 2010 20:29:27 +0000, Andy Koppe <andy.ko...@gmail.com> wrote:
>Looks like there's some sort of GBK vs UTF-8 mixup going on, because
>'鏂版煡鏂囩尞' is the same byte sequence in GBK as '新查文献' is in UTF-8:
>\xE6\x96\xB0\xE6\x9F\xA5\xE6\x96\x87\xE7\x8C\xAE

Wonderful analysis! Could you please give me some hints about the tools
you used to reach this conclusion?

>I take it the actual directory name is '新查文献'? (Babelfish seems to be
>able to make some sense of that one but not the other.)

Yes, you're right. The actual directory name is '新查文献'.

>Do you know what the encoding of your batch file is?

GB2312.

>And have you got
>any locale variables (LC_ALL, LC_CTYPE, LANG) set when invoking it?

I use the following settings in the same batch file:

set LC_ALL=zh_CN.UTF-8
set LC_CTYPE="zh_CN.UTF-8"
set LANG=zh_CN.UTF-8

>>>@echo off
>>>C:\cygwin\bin\bash --login "%~dp0myscript"
>>
>> I've found a more strange thing: If I change the batch file into the
>> following form, then it will be run smoothly:
>>
>> @echo off
>> C:\cygwin\bin\bash --login %~dp0myscript
>>
>> The QUOTATION MARK in the former is used to deal with the whitespaces
>> appearing in the myscript's pathname, though this is relatively rare
>> case. But in the latter case, if there're whitespaces in the
>> myscript's pathname, the batch will fail to run.
>
>Hmm, perhaps the argument mangling at program startup is using the
>ANSI codepage (i.e. GBK in this case) when it should be using UTF-8?

But if I convert my batch file to UTF-8 (without BOM, with CR/LF line
endings), I get the following error:

/usr/bin/bash: "F:/zhaohs/Desktop/鏂版煡鏂囩尞/RestoreName4Elsevier.sh": No such file or directory

-- 
.: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
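
P.S. For anyone following the thread: the byte-level coincidence described in the quoted analysis can be checked with a short Python sketch. This is only an illustration under my own assumptions, not necessarily the tool used for the original analysis. The variable names are made up, and it relies on Python's built-in "utf-8" and "gbk" codecs.

```python
# The directory name as it should appear (taken from the thread).
original = "新查文献"

# The UTF-8 bytes of that name, i.e. what ends up in the path argument.
raw = original.encode("utf-8")
print(raw.hex(" "))

# Misread the same bytes as GBK, as a consumer using the ANSI codepage
# (GBK on a Simplified Chinese Windows system) would.
garbled = raw.decode("gbk")
print(garbled)

# Reverse the mistake: turn the garbled text back into bytes with GBK,
# then decode those bytes as UTF-8 to recover the real name.
recovered = garbled.encode("gbk").decode("utf-8")
print(recovered == original)
```

If the garbled string printed here matches the one in the bash error message, that supports the idea that UTF-8 bytes are being interpreted as GBK somewhere between the batch file and bash.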