[dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread FRIGN
Hello fellow hackers, the current tr(1) implementation has really been giving me nightmares, so I rewrote it. Given that POSIX really sucks in some areas, I deviated from it in places, but not in a way that would break scripts. Here's a comparison; let me know which you prefer: 1) GNU co

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread Nick
Quoth FRIGN: > - UTF-8: not allowed in POSIX, but in my opinion a must. This > finally allows you to work with UTF-8 streams without > problems or unexpected behaviour. I fully agree (unsurprisingly). Anything that relies on the POSIX behaviour to do weird things involving mu

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832
On Fri, Jan 9, 2015, at 16:44, Nick wrote: > Quoth FRIGN: > > - UTF-8: not allowed in POSIX, but in my opinion a must. This > > finally allows you to work with UTF-8 streams without > > problems or unexpected behaviour. > > I fully agree (unsurprisingly). Anything that relies

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread FRIGN
On Fri, 09 Jan 2015 17:41:19 -0500 random...@fastmail.us wrote: > Or, is it possible that FRIGN misinterpreted the prohibition on > "multi-character collating elements" ? Did you read what I said? I explicitly went away from POSIX in this regard, because no human would write "tr '\303\266o' 'o\3

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832
On Fri, Jan 9, 2015, at 17:48, FRIGN wrote: > Did you read what I said? I explicitly went away from POSIX in this > regard, > because no human would write "tr '\303\266o' 'o\303\266'". POSIX doesn't require people to write it, it just requires that it works. POSIX has no problem with also allow

Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way

2015-01-09 Thread FRIGN
On Fri, 9 Jan 2015 20:39:48 +0100 FRIGN wrote: > sin just told me the patch was missing chartorunearr.c which in fact is the case. Here's an updated patch which should cleanly apply to a vanilla codebase at HEAD. Cheers FRIGN -- FRIGN From f626eecfb757ab46cab7f16dc439258a6a497f1b Mon Sep

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread FRIGN
On Fri, 09 Jan 2015 17:55:04 -0500 random...@fastmail.us wrote: > POSIX doesn't require people to write it, it just requires that it > works. POSIX has no problem with also allowing a literally typed > multibyte character to refer to itself. It's basically saying that if > someone _does_ write '\3

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread random832
On Fri, Jan 9, 2015, at 18:08, FRIGN wrote: > > This is madness. If you want the bytes to be collated, I don't see where you're getting that either of us wants the bytes to be collated. I don't even know what you mean by "collated", since collating is not what tr does, except when ordering ranges.

Re: [dev] [sbase] [PATCH] Rewrite tr(1) in a sane way

2015-01-09 Thread FRIGN
On Fri, 09 Jan 2015 18:24:46 -0500 random...@fastmail.us wrote: > Even if octal values could be more than three digits, I have no idea > what you think 50102 is. Its decimal value is 20546. Its hex value is > 0x5042. I have no idea what it has to do with character U+00F6 whose > UTF-8 representati
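
For readers following the arithmetic in this exchange, here is a minimal C sketch that checks the quoted numbers: octal 50102 against decimal 20546 and hex 0x5042, and the UTF-8 bytes of U+00F6 against the '\303\266' escape used earlier in the thread. The helper names (parse_octal, utf8_encode_small, check_thread_numbers) are hypothetical and are not taken from the sbase patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: parse a run of octal digits of any length. */
unsigned long parse_octal(const char *s)
{
    unsigned long v = 0;

    for (; *s >= '0' && *s <= '7'; s++)
        v = v * 8 + (unsigned long)(*s - '0');
    return v;
}

/*
 * Hypothetical helper: encode a code point below U+0800 as UTF-8.
 * Returns the number of bytes written, or 0 if cp is out of range
 * for this deliberately small sketch.
 */
size_t utf8_encode_small(uint32_t cp, unsigned char out[2])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* 110xxxxx */
        out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* 10xxxxxx */
        return 2;
    }
    return 0;
}

/* Re-check the figures quoted in the thread. */
void check_thread_numbers(void)
{
    unsigned char b[2];

    assert(parse_octal("50102") == 20546UL);
    assert(parse_octal("50102") == 0x5042UL);

    /* U+00F6 encodes to 0xC3 0xB6, i.e. octal 303 266. */
    assert(utf8_encode_small(0xF6, b) == 2);
    assert(b[0] == 0303 && b[1] == 0266);
}
```

Calling check_thread_numbers() from a main() should complete without tripping an assertion; the sketch only confirms the number conversions and takes no side on what POSIX requires of tr(1).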

Re: [dev] [sbase] [PATCH-UPDATE] Rewrite tr(1) in a sane way

2015-01-09 Thread Dmitrij D. Czarkoff
FRIGN said: > +#define UPPER "A-Z" > +#define LOWER "a-z" > +#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" These definitions hugely misrepresent corresponding character classes. -- Dmitrij D. Czarkoff
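
One possible reading of this objection (an assumption; the message does not elaborate) is that the literals cover only the ASCII members of each class. The following C sketch compares the quoted PUNCT literal with ispunct() in the default "C" locale, where the two are expected to agree; it does not exercise other locales or wide characters, which is where the classes would diverge. The function name punct_mismatches is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Literal as quoted from the patch in the message above. */
#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"

/*
 * Count the byte values 1..255 on which ispunct() and membership in
 * the PUNCT literal disagree. Assumes the default "C" locale, i.e.
 * setlocale() has not been called. Byte 0 is skipped because
 * strchr() would match the string terminator.
 */
int punct_mismatches(void)
{
    int c, n = 0;

    for (c = 1; c < 256; c++) {
        int in_literal = strchr(PUNCT, c) != NULL;
        int in_class = ispunct(c) != 0;

        if (in_literal != in_class)
            n++;
    }
    return n;
}
```

In the "C" locale this should report no mismatches, since the literal lists the 32 ASCII punctuation characters. Any gap between the literal and the class would only appear once a locale or Unicode-aware classification (for example via the wide-character functions in wctype.h) is in play.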