https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241441

            Bug ID: 241441
           Summary: inconsistency between allowed empty regex for `awk -F`
                    and split()
           Product: Base System
           Version: 12.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: bin
          Assignee: b...@freebsd.org
          Reporter: free...@tim.thechases.com

I get an error when I try to use an empty regex for the field separator:

  $ echo hello | awk -F '' '{print $2}'
  awk: field separator FS is empty

but awk has no issues splitting things on an empty regex:

  $ awk 'BEGIN{s="hello"; split(s, a, ""); print a[1]}'
  h

Over on gawk, I get the expected behavior:

  $ echo hello | awk -F '' '{print $1}'
  h

This is somewhat similar to bug #226112:

  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226112

I get that awk uses EREs and `man re_format` says that "A (modern [Extended])
RE is one or more non-empty branches, separated by '|'", but

1) that's not what split() does

2) it's not what gawk's -F parameter does

3) permitting an empty regex for splitting already seems supported in awk code
(as the split example shows) and shouldn't break any existing usage

4) as a non-workaround, `man re_format` says that the atom "()" matches the
null string, but

  $ echo hello | awk -F '()' '{print $1}'

doesn't split the row on the null regular expression (FWIW, gawk gives the same
results when using "()" as the split pattern).

In an ideal world, the behavior would match the behavior of gawk & the split()
function, splitting the record into each individual character.
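
Until -F '' is accepted, the split() behavior shown above can stand in for it
inside the main rule. This is a minimal sketch, assuming split(s, a, "")
yields one array element per character as in the example earlier in this
report (POSIX leaves that case unspecified). The helper name nth_char and the
variable names are made up for illustration:

```shell
# Hypothetical helper: print the Nth character of every input line.
# Assumes split(s, a, "") produces one element per character, as the
# report shows for this awk; POSIX leaves an empty separator unspecified.
nth_char() {
  awk -v n="$1" '{
    count = split($0, chars, "")    # chars[1..count] hold the characters
    if (n >= 1 && n <= count)
      print chars[n]                # stands in for $n under -F ""
  }'
}

echo hello | nth_char 2
```

On an awk whose split() behaves as described, this should print the second
character of "hello". A loop over substr($0, i, 1) would be the more portable
choice if the empty-separator behavior cannot be relied on.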
