New submission from Jakub Wilk:

This is how shell quoting in commands.mkarg() is implemented:

def mkarg(x):
    if '\'' not in x:
        return ' \'' + x + '\''
    s = ' "'
    for c in x:
        if c in '\\$"`':
            s = s + '\\'
        s = s + c
    s = s + '"'
    return s

This is unfortunately not compatible with the way bash splits arguments in some locales. The problem is that in a few East Asian encodings (at least BIG5, BIG5-HKSCS, GB18030, GBK), the 0x5C byte (backslash in ASCII) can be the second byte of a two-byte character, and bash apparently decodes the strings before splitting.

PoC:

$ sh --version | head -n1
GNU bash, version 4.3.22(1)-release (i486-pc-linux-gnu)

$ LC_ALL=C python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 Aug 12 16:00 /dev/null
ls: cannot access " ; python -c 'import this' | grep . | shuf | head -n1 | cowsay -y ; ": No such file or directory

$ LC_ALL=zh_CN.GBK python test-mkargs.py
crw-rw-rw- 1 root root 1, 3 8月 12 16:00 /dev/null
ls: 无法访问乗: No such file or directory
 ________________________________
< Simple is better than complex. >
 --------------------------------
        \   ^__^
         \  (..)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
sh: 乗: 未找到命令

----------
components: Library (Lib)
files: test-mkargs.py
messages: 225235
nosy: jwilk
priority: normal
severity: normal
status: open
title: commands.mkarg() buggy in East Asian locales
type: security
versions: Python 2.7

Added file: http://bugs.python.org/file36359/test-mkargs.py

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22187>
_______________________________________
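
The byte-level mechanism described in the report can be illustrated with a small sketch. This is not the attached test-mkargs.py; it is a Python 3 illustration under stated assumptions: mkarg_bytes is a hypothetical byte-string port of the mkarg() logic quoted above, and the payload is an invented example that places a GBK lead byte (0x81) directly before a double quote. The sketch only inspects the bytes and never runs a shell.

```python
def mkarg_bytes(x: bytes) -> bytes:
    """Hypothetical byte-level port of the Python 2.7 mkarg() logic."""
    if b"'" not in x:
        return b" '" + x + b"'"
    s = b' "'
    for i in range(len(x)):
        c = x[i:i + 1]              # one-byte slice, always non-empty here
        if c in b'\\$"`':           # same special characters as the original
            s += b'\\'
        s += c
    s += b'"'
    return s


# Assumed payload: 0x81 is a valid GBK lead byte, and 0x5C (backslash)
# is a valid GBK trail byte, so 0x81 0x5C forms one two-byte character.
payload = b"\x81\" ; echo 'injected' ; \x81\""
quoted = mkarg_bytes(payload)

# Viewed as raw bytes, each embedded double quote is preceded by a backslash.
backslashes_as_bytes = quoted.count(b"\\")

# Viewed as GBK text, each backslash is absorbed into a two-byte character,
# so the embedded double quotes are no longer escaped.
decoded = quoted.decode("gbk")
backslashes_as_gbk = decoded.count("\\")

print("backslashes seen byte-wise:", backslashes_as_bytes)
print("backslashes seen after GBK decoding:", backslashes_as_gbk)
```

If the shell decodes the command line as GBK before splitting, as the report suggests bash does, the first embedded double quote would close the quoted string early and the text after it would be parsed as separate commands.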