> Under the ERE implementation in RELENG_8, I'm having
> trouble figuring out how to group and backreference this.
>
> Given a line, where:
> If AAA is present, CCC will be too, and B may appear in between.
> If AAA is not present, neither CCC nor B will be present.
> DDDD is always present.
> Junk may be present.
>
> Match good lines and output in chunks.
>
> echo junkAAAABCCCDDDDjunk | \
>
> This works as expected:
> sed -E -n 's,^.*(AAAB?CCC)(DDDD).*$,1 \1 2 \2,p'
> 1 AAABCCC 2 DDDD
>
> But making the leading bits optional per spec does not work:
> sed -E -n 's,^.*(AAAB?CCC)?(DDDD).*$,1 \1 2 \2,p'
> 1 2 DDDD
>
> Nor does adding the usual grouping parens:
> sed -E -n 's,^.*((AAAB?CCC)?)(DDDD).*$,1 \1 2 \2,p'
> 1 2
>
> How do I group off the leading bits?
> Or is this a limitation of ERE's?
> Or a bug?
>
> Thanks.
Regular expressions are greedy by default. In your second and third
examples the leading .* matches "junkAAAABCCC", so the optional group
is left to match the empty string. Try

    sed -E -n 's,^(.*)(AAAB?CCC)?(DDDD).*$,1 \1 2 \2 3 \3,p'

and you'll see what I mean.

In Perl I'd tell you to use .*? instead of .*, but as far as I know
POSIX EREs have no equivalent non-greedy quantifier.

Hope this helps.

Joost Bekkers
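
One possible workaround, as a sketch only: because the greedy .* wins only when the group is optional, try the pattern with the mandatory group first and fall back to a DDDD-only pattern when it fails. This assumes the AAA...CCC chunk, when present, always sits immediately before DDDD, as in the patterns above. The function name `split_chunks` is made up for illustration and does not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# 1st command: AAAB?CCC is mandatory, so the greedy .* cannot swallow it.
# 2nd command: 't' with no label jumps to the end of the script if the
#              first substitution succeeded on this line.
# 3rd command: fallback for lines that contain DDDD but no AAA...CCC chunk;
#              chunk 1 is printed as empty.
split_chunks() {
    sed -E -n \
        -e 's,^.*(AAAB?CCC)(DDDD).*$,1 \1 2 \2,p' \
        -e 't' \
        -e 's,^.*(DDDD).*$,1  2 \1,p'
}

echo junkAAAABCCCDDDDjunk | split_chunks   # prints: 1 AAABCCC 2 DDDD
echo junkDDDDjunk | split_chunks           # prints: 1  2 DDDD
```

Lines without DDDD match neither pattern and print nothing, because of -n.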