Zepp Lu wrote:
$ printf '\x53\xef' | grep -aoP '\x53\xef'
(no output, returns 1)
$ printf '\x53\xc3\xaf' | grep -aoP '\x53\xef'
Sï
$ printf '\x53\xc3\xef' | grep -aoP '\x53\xef'
(no output, returns 1)
I don't see a bug here. PCRE patterns like \xef match code points, not bytes, so the PCRE notation differs from the shell printf notation. If your locale uses UTF-8, the PCRE pattern \xef matches the Unicode character U+00EF LATIN SMALL LETTER I WITH DIAERESIS, which is represented by the byte pair C3 AF.
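For what it's worth, the C3 AF byte pair can be derived by hand from the two-byte UTF-8 layout (110xxxxx 10xxxxxx). The following is only a rough sketch using POSIX shell arithmetic; the variable names cp, lead, cont and encoded are made up for illustration and have nothing to do with grep itself.

```shell
# Two-byte UTF-8 form, used for code points U+0080..U+07FF: 110xxxxx 10xxxxxx.
# Variable names here are illustrative only.
cp=$((0xEF))                      # the code point that the pattern \xef refers to
lead=$(( 0xC0 | (cp >> 6) ))      # upper bits of the code point go into the lead byte
cont=$(( 0x80 | (cp & 0x3F) ))    # low six bits go into the continuation byte
encoded=$(printf '%02x %02x' "$lead" "$cont")
echo "$encoded"                   # prints: c3 af
```

That is why the second command above matches (its input contains C3 AF) while the first and third do not: neither input contains a valid encoding of U+00EF.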
If you want \xef to match a single byte, run grep in a single-byte locale, e.g., set LC_ALL=C in the environment.
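As a minimal sketch of that suggestion, assuming a grep built with PCRE support (-P): the input is written with octal escapes (\123 is 0x53, \357 is 0xEF) only because not every shell's printf accepts \x escapes, and the variable name count is just for illustration.

```shell
# Set LC_ALL=C for the grep process only, so PCRE treats the input as single bytes.
# -a: treat the input as text; -c: print the number of matching lines; -P: PCRE syntax.
count=$(printf '\123\357\n' | LC_ALL=C grep -acP '\x53\xef')
echo "$count"
```

Setting the variable on the command itself, rather than exporting it, leaves the locale of the rest of the session alone.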
grep (version 2.12-2) provided by Debian works just fine.
Actually, it's buggy in this area. Sometimes it can dump core.