Further to my comment on http://bugs.php.net/bug.php?id=45132
Many servers appear to have LANG set to some non-UTF-8 character set. With ext/standard/exec.c version 1.127, released in PHP 5.2.6, this means that apps which try to send UTF-8 in shell arguments are broken: the "invalid" characters are removed. This is (probably) bug 45132.

The standard workaround, promoted everywhere I've seen this discussed, is to use something like setlocale(LC_CTYPE, 'en_US.UTF-8'). This appears to break the security of escapeshellcmd(), putting it back to how it was in PHP 5.2.5. This is because setlocale() does not set the LC_* environment variables, so when the shell is spawned, it inherits the same locale that PHP had originally.

I haven't found anything spelling out the attack that 1.127 was fixing, but I imagine it goes something like this: a sysadmin has Shift-JIS as the default character set in their terminal. They have a PHP/MySQL web app which stores arbitrary input from an attacker. They run a maintenance script from the command line which takes that arbitrary input, escapes it with escapeshellarg(), and passes it through as an argument to some command. Then an incomplete Shift-JIS character in the attacker's input eats the terminating quote from PHP, and the attacker has an arbitrary shell exploit.

So if an application uses setlocale() without putenv("LC_CTYPE=..."), it reopens the same vulnerability. A complete UTF-8 character can be an incomplete Shift-JIS character.

Even if setlocale() set the environment variables by default, bug 45132 would still be a pain to work around. POSIX does not guarantee that en_US.UTF-8 is available, only that "C" is available.

It would be nice if escapeshellarg/escapeshellcmd treated the single-byte character sets with an ASCII subset, such as the ISO 8859 family, as 8-bit clean, since there is no possibility in these character sets of either a partial character eating a metacharacter, or of a metacharacter being created during transcoding.
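
A minimal sketch of the kind of check this would need, in plain C rather than PHP's internal code. The helper name locale_is_single_byte() is hypothetical, and it assumes that MB_CUR_MAX == 1 is a good enough test for "single-byte". It does not verify that the character set actually has an ASCII subset, so a real implementation would probably need an extra check for that.

```c
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of PHP.
 *
 * Returns true if the current LC_CTYPE locale uses at most one byte per
 * character. In such a locale no byte can be the lead byte of a multibyte
 * sequence, so a stray high byte cannot swallow a following quote.
 *
 * Assumption: the encoding is an ASCII superset (e.g. ISO 8859-x).
 * MB_CUR_MAX alone cannot confirm that.
 */
bool locale_is_single_byte(void)
{
    return MB_CUR_MAX == 1;
}
```

The escaping functions could consult a check like this and pass bytes 0x80-0xFF through untouched when it returns true, keeping the multibyte validation only for locales where MB_CUR_MAX is greater than 1.
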
Then applications wouldn't need any setlocale() hack in 99.9% of cases.

And for watertight escapeshellcmd() operation in a nasty encoding like Shift-JIS, I would suggest something along the lines of:

    /* Save the old locale */
    char *oldlocale = estrdup(setlocale(LC_CTYPE, NULL));
    /* Set the locale to match the LC_CTYPE environment variable */
    setlocale(LC_CTYPE, "");

    /* ... usual stuff ... */

    /* Restore locale */
    setlocale(LC_CTYPE, oldlocale);
    efree(oldlocale);

If an application fiddles with LC_CTYPE between calling escapeshellcmd() and calling shell_exec(), they probably deserve what's coming to them.

-- Tim Starling