Santiago Vila <[email protected]> wrote:
> So: Is this a bug, or is the file supposed to be always in UTF-8?
> (Is this documented?)

Hi,

I think it is a bug because:

1) Standard 'grep' works just fine on the described examples

2) The behaviour also breaks the integration with other utilities, such
   as 'find' and 'ls', even in the UTF-8 environment.

For instance, I've the following file names in a directory (attached you
find the corresponding .tar archive containing those files):

   % ls
   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac
   (08) [Debussy] Images, Book2 1 Cloches ? travers les feuilles.flac
   (11) [Mozart] Fantasia in C minor, KV475.flac

where the second file name contains a weird character (here shown as
'?'). Then, the command

   % ls | tre-agrep flac

returns just the first file name (the problematic one is the second
file):

   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac

while

   % ls | grep flac

correctly returns all of them:

   (01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac
   (08) [Debussy] Images, Book2 1 Cloches � travers les feuilles.flac
   (11) [Mozart] Fantasia in C minor, KV475.flac

The command

   % ls | tre-agrep feuilles

returns nothing; neither does this one:

   % ls | tre-agrep KV475

-- 
Douglas A. Augusto
tre-agrep_bug-dir_example.tar
Description: Unix tar archive
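
The observed output is consistent with the input being dropped from the first undecodable line onwards. The following is only a rough sketch of that guess in Python, not tre-agrep's actual code: the function names `match_bytes` and `match_strict_utf8` are hypothetical, and the byte 0xE0 (Latin-1 "à") standing in for the weird character is an assumption, since the real byte is only known from the attached archive.

```python
# Hypothetical illustration only; this is NOT how tre-agrep is implemented.
# Assumption: the odd character is the single byte 0xE0 (Latin-1 "a grave"),
# which is not a valid UTF-8 sequence when followed by a space.
names = [
    b"(01) [Liszt] Annees de pelerinage Premiere annee Suisse 6 Vallee d'Obermann.flac",
    b"(08) [Debussy] Images, Book2 1 Cloches \xe0 travers les feuilles.flac",
    b"(11) [Mozart] Fantasia in C minor, KV475.flac",
]


def match_bytes(lines, pattern):
    """Byte-oriented search: nothing is decoded, so invalid bytes are harmless."""
    return [line for line in lines if pattern in line]


def match_strict_utf8(lines, pattern):
    """Decode each line strictly and give up at the first undecodable line.

    Stopping here is the assumed failure mode that would reproduce the report.
    """
    matches = []
    for line in lines:
        try:
            text = line.decode("utf-8")
        except UnicodeDecodeError:
            break  # everything from this line onwards is lost
        if pattern in text:
            matches.append(text)
    return matches


if __name__ == "__main__":
    print(len(match_bytes(names, b"flac")))         # 3, like 'grep flac'
    print(len(match_strict_utf8(names, "flac")))    # 1, like 'tre-agrep flac'
    print(match_strict_utf8(names, "feuilles"))     # []
    print(match_strict_utf8(names, "KV475"))        # []
```

If this guess is right, it would also explain why 'KV475' is not found even though the third file name is plain ASCII: it simply comes after the problematic line.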

