Edit report at http://bugs.php.net/bug.php?id=52333&edit=1
ID:               52333
Updated by:       ahar...@php.net
Reported by:      a dot dobkin at drweb dot com
Summary:          Metacharacter \d in a regexp causes an error on some Russian letters
-Status:          Open
+Status:          Bogus
Type:             Bug
Package:          *Regular Expressions
Operating System: Windows
PHP Version:      5.3.2

New Comment:

This is an encoding issue, rather than a bug in PHP itself: by default, preg_match() works like most things in PHP and just treats strings as a series of bytes. If Василий is encoded in UTF-16, there are multiple bytes in the range that are digits in ASCII, so \d matches them.

preg_match() does have support for Unicode text when it's encoded as UTF-8 via the /u modifier, so the right way to handle this would be to use iconv() or mb_convert_encoding() to convert the string to UTF-8, then use a regex like "/[\d...@\#\%\$\^&*\(\)\~\=\/\|\"\'\?\:\;\/]+/u" to force UTF-8 mode.

Previous Comments:
------------------------------------------------------------------------
[2010-07-14 07:32:43] a dot dobkin at drweb dot com

OS 2003 Server R2 SP2 English x86

------------------------------------------------------------------------
[2010-07-14 07:03:15] a dot dobkin at drweb dot com

Description:
------------
Metacharacter \d in a regular expression causes an error on some Russian letters on Windows.

Example script:

$user_name_ru = "Василий";
$regexp = "/[\d...@\#\%\$\^&*\(\)\~\=\/\|\"\'\?\:\;\/]+/";
if( preg_match( $regexp,$user_name_ru ) ) {
    echo 'ERR';
} else {
    echo 'OK';
}

preg_match() returns true if the word contains one or more of the characters 'й', 'г', 'в'. If the metacharacter '\d' is removed, preg_match() returns false. With PHP version 5.2.13 everything works correctly.

------------------------------------------------------------------------
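For anyone landing here later, the suggestion in the new comment (convert to UTF-8, then match with /u) could look roughly like the sketch below. The function name contains_forbidden_chars() and its $from_encoding parameter are hypothetical, the mbstring extension is assumed to be loaded, and the character class is a simplified stand-in, because the pattern in the report is partly elided in this copy.

```php
<?php
// Hypothetical helper, not taken from the bug report.
// Assumes ext/mbstring is available and the caller knows the input encoding.
function contains_forbidden_chars($name, $from_encoding = 'UTF-8')
{
    // Normalise the input to UTF-8 so the /u modifier can be applied.
    $utf8 = mb_convert_encoding($name, 'UTF-8', $from_encoding);

    // Simplified character class for illustration only.
    // /u tells PCRE to treat pattern and subject as UTF-8, so matching
    // works on code points instead of individual bytes.
    $regexp = '/[\d@#%$^&*()~=|"\'?:;\/]+/u';

    // preg_match() returns 1 on a match, 0 on no match, false on error
    // (for example invalid UTF-8); only an actual match counts here.
    return preg_match($regexp, $utf8) === 1;
}

// This file is assumed to be saved as UTF-8.
$name = 'Василий';

var_dump(contains_forbidden_chars($name));        // bool(false)
var_dump(contains_forbidden_chars($name . '7'));  // bool(true)

// The same name as UTF-16LE bytes: in byte mode (no /u) the ASCII-digit
// bytes inside the encoded letters are picked up by \d.
$utf16 = mb_convert_encoding($name, 'UTF-16LE', 'UTF-8');
var_dump(preg_match('/\d/', $utf16));                       // int(1)
var_dump(contains_forbidden_chars($utf16, 'UTF-16LE'));     // bool(false)
```

The last two calls show the effect described in the comment: the raw UTF-16 bytes trip \d, while the same text converted to UTF-8 and matched with /u does not.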