Hello,

thanks for the information. I have since tested on SLES11 SP2 and SP3 and found that the "bug" is no longer there; grep works as expected.
Best regards
Andreas Bergen

---
Andreas Bergen
Solution Architect
All for One Steeb AG
Gottlieb-Manz-Straße 1
70794 Filderstadt
T +49 711 78807-689
F +49 711 78807-92689
M +49 151 53824-689
andreas.ber...@all-for-one.com
www.all-for-one.com

-----Original Message-----
From: Johannes Meixner [mailto:jsm...@suse.de]
Sent: Thursday, 4 September 2014 10:29
To: Bergen, Andreas
Cc: bug-grep@gnu.org
Subject: Re: bug#18398: Probably found a bug in grep

Hello,

On Sep 3 19:11 Bergen, Andreas wrote (excerpt):
> I've probably found a bug in "grep".
...
> testfile: UTF-8 Unicode text
> testfile2: ASCII text
...
> Name : grep
> Version : 2.5.1a
> Vendor: SUSE LINUX Products GmbH, Nuernberg, Germany
> Build Date: Tue Apr 22 03:47:13 2008
> Install Date: Mon Jul 6 16:21:37 2009
> Source RPM: grep-2.5.1a-20.17.src.rpm

This grep version is very old. I found grep version 2.5.1a only in SUSE Linux Enterprise Server 10. openSUSE distributions with such an old grep are no longer available.

I do not know whether that old grep version was really meant to support UTF-8 character encoding (multibyte characters) well, because I find almost nothing about "UTF" (ignoring case) in the grep-2.5.1a sources. There is some multibyte character support in grep-2.5.1a, but I wonder to what extent it actually works.

In contrast, in the grep-2.7 sources that we have provided since SUSE Linux Enterprise Server 11 Service Pack 2 (SLES11-SP2), there is a lot more about "UTF" (ignoring case).

In the RPM changelog of our grep RPM package for SLES11-SP2 there is in particular:
------------------------------------------------------------------
Version upgrade to grep-2.7 and reset to full compliance with upstream
...
version upgrade to grep-2.6.3, which brings among various compile fixes
vast improvements for UTF-8 / multibyte handling.
------------------------------------------------------------------

In general:

Issues with "traditional" Unix/Linux tools that depend on the locale are very often not real bugs.

For users it is crucial to understand that any kind of behaviour can depend on the locale (from keyboard input via program behaviour to what is shown on the screen).

For basic information see
http://en.opensuse.org/SDB:Plain_Text_versus_Locale

When programs process "plain text files", the user who runs the program must set up the locale environment to match the encoding of the "plain text file" before running the program.

If you want to process your "plain text files" as you did "since ever" with various "traditional" Unix/Linux tools, you must use the POSIX locale, otherwise you will get weird results and unexpected side-effects.

See also
http://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locales.html

Kind Regards
Johannes Meixner
--
SUSE LINUX Products GmbH -- Maxfeldstrasse 5 -- 90409 Nuernberg -- Germany
HRB 16746 (AG Nuernberg) GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer