Re: Dealing with character ranges in grep

2011-06-16 Thread Johannes Meixner
uild (--with-included-regex versus --without-included-regex). I think that those tools are so very basic tools, that consistent behaviour must have topmost priority because neither normal users understand inconsistent behaviour nor experts who work on various Linux systems like subtle inconsist

Re: Dealing with character ranges in grep

2011-06-16 Thread Johannes Meixner
that consistent behaviour has topmost priority. Kind Regards Johannes Meixner -- SUSE LINUX Products GmbH -- Maxfeldstrasse 5 -- 90409 Nuernberg -- Germany HRB 16746 (AG Nuernberg) GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer

Re: Dealing with character ranges in grep

2011-06-16 Thread Johannes Meixner
Hello, On Jun 16 15:51 Stanislav Brabec wrote: Johannes Meixner wrote: Again: I do not care if this or that special feature is supported or not because I think that consistent behaviour has topmost priority. Do you prefer "consistent behavior of regexp in all applications across the

Re: Dealing with character ranges in grep

2011-06-17 Thread Johannes Meixner
automatically prefer it when possible. For the record, at least Fedora's grep and sed both build --without-included-regex, so would be affected. Same for openSUSE and all the "Suse Linux Enterprise" products. Kind Regards Johannes Meixner

Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8

2012-06-12 Thread Johannes Meixner
[AAA][BB]" versus "[aa][bbb]" where [AAA] is a 3-byte upper-case character where [aa] is its 2-byte lower-case counterpart and [BB] is a 2-byte upper-case character where [bbb] is its 3-byte lower-case counterpart. Do such or similar kind of strings actually exist? If yes could such ki
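
A minimal Python sketch suggesting that such strings do exist, at least according to the Unicode tables shipped with Python 3: U+2C6F (3 bytes in UTF-8) lowercases to U+0250 (2 bytes), and U+023A (2 bytes) lowercases to U+2C65 (3 bytes). The helper name utf8_lengths is made up for illustration, and the sketch says nothing about how grep itself treats these characters.

```python
def utf8_lengths(text):
    """Return the UTF-8 byte length of each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


# U+2C6F LATIN CAPITAL LETTER TURNED A, U+023A LATIN CAPITAL LETTER A WITH STROKE
upper = "\u2c6f\u023a"
lower = upper.lower()

# The total byte count is the same, but the character boundaries differ.
print(utf8_lengths(upper))
print(utf8_lengths(lower))
```

Both strings occupy five bytes in total, so a byte-offset computed on the lower-cased copy would not necessarily point at a character boundary in the original.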

Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8

2012-06-14 Thread Johannes Meixner
convert to lower case or is it actually implemented via "case folding"? FYI: http://www.unicode.org/versions/Unicode6.1.0/ch05.pdf describes in particular the "Turkish I" issue in detail... Kind Regards Johannes Meixner
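
To illustrate the distinction being asked about, here is a small Python sketch comparing plain lower-casing with Unicode case folding. It uses Python's own str.lower() and str.casefold(); the two helper names are hypothetical, and nothing here is meant to describe grep's actual implementation.

```python
def matches_lower(a, b):
    """Compare two strings after plain conversion to lower case."""
    return a.lower() == b.lower()


def matches_casefold(a, b):
    """Compare two strings after Unicode (full) case folding."""
    return a.casefold() == b.casefold()


word_upper = "HEISS"
word_sharp_s = "hei\u00df"  # 'hei' + LATIN SMALL LETTER SHARP S

print(matches_lower(word_upper, word_sharp_s))     # lower() keeps the sharp s
print(matches_casefold(word_upper, word_sharp_s))  # casefold() maps it to 'ss'
```

With plain lower-casing the two spellings stay different; with full case folding they compare equal, which is the kind of behaviour a case-insensitive match may or may not be expected to provide.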

Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8

2012-06-15 Thread Johannes Meixner
Hello, On Jun 14 07:44 Paul Eggert wrote (excerpt): On 06/14/2012 04:07 AM, Johannes Meixner wrote: Is grep's -i implemented via plain convert to lower case or is it actually implemented via "case folding"? I'm not sure which you mean by "plain convert" and by

Re: [bug #36567] grep -i (case-insensitive) is broken with UTF8

2012-06-15 Thread Johannes Meixner
Hello, On Jun 15 15:00 Johannes Meixner wrote (excerpt): ... not handled correctly in grep-2.7 Same with grep 2.12 - $ export LC_ALL=el_GR.utf8 ; export LANG=el_GR.utf8 $ echo -e '\0316\0243\0316\0243' >
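
The octal escapes in the echo command above are the UTF-8 encoding of two GREEK CAPITAL LETTER SIGMA characters. The snippet is cut off, so what exactly grep did with them is not shown here; the following Python sketch only decodes the bytes and shows, as an assumption about what the test might be probing, that Sigma has two lower-case forms which Unicode case folding unifies.

```python
# Bytes produced by: echo -e '\0316\0243\0316\0243'
raw = bytes([0o316, 0o243, 0o316, 0o243])
text = raw.decode("utf-8")

# U+03C3 GREEK SMALL LETTER SIGMA followed by U+03C2 GREEK SMALL LETTER FINAL SIGMA
small_with_final = "\u03c3\u03c2"

print([hex(ord(ch)) for ch in text])
print(text.casefold() == small_with_final.casefold())
```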

[bug #36682] Ignore case handling of special unicode characters (case folding)

2012-06-19 Thread Johannes Meixner
URL: Summary: Ignore case handling of special unicode characters (case folding) Project: grep Submitted by: jsmeix Submitted on: Tue 19 Jun 2012 10:35:53 AM GMT Category: None

[bug #36682] Ignore case handling of special unicode characters (case folding)

2012-06-19 Thread Johannes Meixner
Follow-up Comment #1, bug #36682 (project grep): A typo: I wrote --- but 'HEISS' could be also written as 'HEI[LATIN SMALL LETTER SHARP S]' --- which should be ---

Typo in grep-2.13 ChangeLog file

2012-07-05 Thread Johannes Meixner
tml>. -- because I am not a mercer ;-) Best Regards Johannes Meixner

Re: grep --ignore-case efficiency

2013-01-30 Thread Johannes Meixner
ironment that uses single byte character encoding? Kind Regards Johannes Meixner

bug#16812: Eszett handling

2014-02-20 Thread Johannes Meixner
t; in particular together with UTF-8. For example on http://lists.gnu.org/archive/html/bug-grep/2012-06/threads.html#00011 mail threads like "Ignore case handling of special unicode characters (case folding)" which is http://savannah.gnu.org/bugs/?36682 or the mail thread "gre

bug#17012: [3 PATCHES] Whitespace cleanup : Replace code-alignment tabs with spaces.

2014-03-18 Thread Johannes Meixner
t I do not understand the reason behind. By the way: I wonder why is there no INSTALL file in the coreutils git source repository at git://git.sv.gnu.org/coreutils I assume all this is intended and there are good reasons for it but from my point of view it looks somehow strange. Kind Regard

bug#18398: Probably found a bug in grep

2014-09-04 Thread Johannes Meixner
cess your "plain text files" as you did "since ever" with various "traditional" Unix/Linux tools, you must use the POSIX locale, otherwise you will get weird results and unexpected side-effects. See also http://www.gnu.org/software/gawk/manual/html_node/Ranges-and-Locale

bug#18398: AW: bug#18398: Probably found a bug in grep

2014-09-04 Thread Johannes Meixner
Hello, On Sep 4 08:28 Bergen, Andreas wrote (excerpt): ... the newest version available for Suse Linux (grep 2.7-5.7.1 in SLES 11 SP3) FWIW: The newest available grep versions for openSUSE are grep-2.14 for openSUSE:13.1 and grep-2.20 for openSUSE:Factory Kind Regards Johannes Meixner

bug#20526: BUG: text file is detected as binary

2015-05-08 Thread Johannes Meixner
://en.opensuse.org/SDB:Plain_Text_versus_Locale Kind Regards Johannes Meixner -- SUSE LINUX GmbH - GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton - HRB 21284 (AG Nuernberg)

bug#20826: SEEK_HOLE not supported for ext4 for kernel < 3.1

2015-06-16 Thread Johannes Meixner
OLE, it just ignores the flag and returns the current position as is. -- "SLE11" is "SUSE Linux Enterprise 11" which has kernel 3.0. Kind Regards Johannes Meixner -- SUSE LINUX GmbH - GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton - HRB 21284 (AG Nuernberg)

bug#20826: SEEK_HOLE not supported for ext4 for kernel < 3.1

2015-06-16 Thread Johannes Meixner
An addendum FYI: From our (SUSE) kernel developers I got the information that only ext4 has this problem and that kernel git commit c334b1138bd4 (handle SEEK_HOLE/SEEK_DATA generically in /fs/ext4/file.c) fixes it.
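
For readers unfamiliar with SEEK_HOLE, here is a rough Python sketch of how a program can ask the kernel for the first hole in a file. The function name first_hole is hypothetical, and the sketch is not the check grep performs; it only shows the lseek call whose result is at issue in this report.

```python
import os


def first_hole(path):
    """Return the offset of the first hole before end of file, or None.

    None is also returned when SEEK_HOLE is unavailable or the call fails.
    """
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_hole is None:
        return None  # platform does not expose SEEK_HOLE
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        if size == 0:
            return None  # nothing to probe in an empty file
        try:
            hole = os.lseek(fd, 0, seek_hole)
        except OSError:
            return None  # the file system rejected the request
        # A file without holes reports the implicit hole at end of file.
        return hole if hole < size else None
    finally:
        os.close(fd)
```

On a file system that handles SEEK_HOLE as documented, a small file that was written without gaps should yield None. A result of 0 for a file that clearly starts with data would be consistent with the call returning the current position unchanged, as described in this thread.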

bug#20826: SEEK_HOLE not supported for ext4 for kernel < 3.1

2015-06-16 Thread Johannes Meixner
From one of our (SUSE) kernel developers I even got a proposal for a workaround in grep: --- a/src/grep.c +++ b/src/grep.c @@ -575,6 +575,17 @@ file_textbin (char *buf, size_t size, in off_t hole_start = lseek (fd, cur, SEEK_HOLE); if (0 <= hole_start) { +