On 10/02/2015 02:43 AM, Santiago Ruano Rincón wrote:
grep doesn't match characters with diacritical marks in ISO-8859 files, inside a Unicode environment
That is normal and expected behavior. In a UTF-8 locale, "á" is represented by the two bytes 0xC3 and 0xA1. In an ISO-8859 file, the same character is represented by the single byte 0xE1. Because the pattern you type is encoded in UTF-8, it won't match the ISO-8859 representation.
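To make the byte-level difference concrete, here is a small Python sketch. It assumes the file is ISO-8859-1 specifically (ISO-8859 is a family of encodings), and the sample word "está" is only an illustration, not taken from the original report.

```python
# The same character, encoded two ways.
ch = "\u00e1"  # "á"

utf8_bytes = ch.encode("utf-8")         # what a UTF-8 terminal sends as the pattern
latin1_bytes = ch.encode("iso-8859-1")  # what an ISO-8859-1 file contains

print(utf8_bytes.hex())    # c3a1
print(latin1_bytes.hex())  # e1

# A line as it would be stored in an ISO-8859-1 file (illustrative sample).
line = "est\u00e1".encode("iso-8859-1")

# Searching the raw bytes for the UTF-8 form finds nothing;
# searching for the ISO-8859-1 form succeeds.
print(utf8_bytes in line)    # False
print(latin1_bytes in line)  # True
```

This is only a model of the mismatch at the byte level; it is not a description of grep's internals.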
To avoid this problem, switch to an ISO-8859 locale before using grep to read ISO-8859 text files. This applies to pretty much any standard utility, not just grep. Alternatively, you can translate the text files from ISO-8859 to UTF-8 and then give the resulting text to grep or to other utilities.
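The second option, translating the text to UTF-8 first, can be sketched in Python as follows. The function name latin1_to_utf8 is hypothetical, and the sketch assumes the input really is ISO-8859-1; a different member of the ISO-8859 family would need its own codec name.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO-8859-1 bytes as UTF-8 (assumes the input is ISO-8859-1)."""
    return data.decode("iso-8859-1").encode("utf-8")


# Illustrative sample: "está" as stored in an ISO-8859-1 file.
original = b"est\xe1"
converted = latin1_to_utf8(original)

# After conversion, a UTF-8 encoded pattern matches.
pattern = "\u00e1".encode("utf-8")
print(pattern in original)   # False
print(pattern in converted)  # True
```

On the command line, a tool such as iconv (for example, iconv -f ISO-8859-1 -t UTF-8) is the usual way to do the same conversion before piping the text to grep.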