Please find patch to enable character classes in 'tr' below.
On 2020-07-10 20:33, Rosen Penev wrote:
On Fri, Jul 10, 2020 at 5:15 PM Jordan Geoghegan <jor...@geoghegan.ca> wrote:
On 2020-07-10 16:59, Rosen Penev wrote:
On Fri, Jul 10, 2020 at 4:17 PM Jordan Geoghegan <jor...@geoghegan.ca> wrote:
On 2020-07-10 14:54, Rosen Penev wrote:
On Fri, Jul 10, 2020 at 2:29 PM Jordan Geoghegan <jor...@geoghegan.ca> wrote:
On 2020-07-10 14:15, Magnus Kroken wrote:
Hi Jordan
On 10.07.2020 22:45, Jordan Geoghegan wrote:
Hey folks,
Does the 'tr' utility support character classes in OpenWRT? I was
playing around with an OpenWRT x86_64 VM and I noticed that 'tr'
doesn't seem to support character classes.
The command " echo HELLO | tr '[:upper:]' '[:lower:]' " does not
convert the text to lowercase as it should (and as required by
POSIX).
This would be expected behavior. OpenWrt disables tr character classes
in BusyBox by default, see [1]:
config BUSYBOX_DEFAULT_FEATURE_TR_CLASSES
	bool
	default n
config BUSYBOX_DEFAULT_FEATURE_TR_EQUIV
	bool
	default n
I don't know what the size cost in the BusyBox binary is, but that
will likely be the deciding factor for such a change.
1: https://git.openwrt.org/?p=openwrt/openwrt.git;a=blob;f=package/utils/busybox/Config-defaults.in
Regards,
Magnus Kroken
Hi Magnus,
Thanks for confirming that so quickly.
I obviously understand that space saving is essential to OpenWRT, but
POSIX does require[1] that 'tr' support character classes:
awk '{print toupper($0)}' is an alternative.
Yes, but this means that any script expecting tr to work correctly could
explode, as tr silently ignores the character class and treats all the
characters literally.
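
A script that has to run on builds both with and without
BUSYBOX_DEFAULT_FEATURE_TR_CLASSES could guard against this by probing
tr first. This is only a minimal sketch, assuming a POSIX shell and an
awk that provides tolower(); the helper name to_lower is hypothetical
and not taken from the OpenWrt tree:

```shell
#!/bin/sh
# Hypothetical helper: lowercase stdin, falling back to awk when
# tr treats '[:upper:]' as a set of literal characters.
to_lower() {
    # Probe: a tr with character class support maps "A" to "a".
    if [ "$(printf 'A' | tr '[:upper:]' '[:lower:]')" = "a" ]; then
        tr '[:upper:]' '[:lower:]'
    else
        awk '{ print tolower($0) }'
    fi
}

echo HELLO | to_lower
```

The probe costs one extra tr invocation per call, which is probably
acceptable in init scripts but could be cached in a variable if the
helper is used in a loop.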
git grep upper | grep tr\ | wc -l
3
In the packages feed. All those results are things that run on the
host, not on OpenWrt.
tr a-z A-Z works as an alternative and is used in many places.
tr a-z A-Z is bad practice as it can behave unexpectedly in different
locales; I've also heard tales of folks with Turkish locales having
issues with '0-9' for example.
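
For scripts that keep the range form, one common mitigation is to pin
the locale for that single command so the ranges are interpreted as
the plain ASCII ranges. A small sketch, assuming ASCII-only input; the
variable names are illustrative:

```shell
#!/bin/sh
# LC_ALL=C applies only to this tr invocation, so 'a-z' and 'A-Z'
# are read as the ASCII ranges regardless of the caller's locale.
name="openwrt"
upper="$(printf '%s' "$name" | LC_ALL=C tr 'a-z' 'A-Z')"
echo "$upper"
```

This does not help with non-ASCII letters, which the range form leaves
untouched.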
Is a couple kb of space worth such a loss in portability (not to mention
deviating heavily from POSIX)?
Patches welcome to replace usage of tr with awk.
I don't think anyone runs OpenWrt with any locale other than the default.
I don't think it makes sense to replace usage of 'tr' with awk; it makes
more sense to just make tr work correctly. As requested, here's a patch
below.
[:class:]
    Represents all characters belonging to the defined character
    class, as defined by the current setting of the LC_CTYPE locale
    category. The following character class names shall be accepted
    when specified in string1:

    alnum   blank   digit   lower   punct   upper
    alpha   cntrl   graph   print   space   xdigit
1: https://www.unix.com/man-page/posix/1posix/tr/
Regards,
Jordan
--- Config-defaults.in.orig Fri Jul 10 21:03:57 2020
+++ Config-defaults.in Fri Jul 10 21:03:22 2020
@@ -837,7 +837,7 @@
 	default y
 config BUSYBOX_DEFAULT_FEATURE_TR_CLASSES
 	bool
-	default n
+	default y
 config BUSYBOX_DEFAULT_FEATURE_TR_EQUIV
 	bool
 	default n
_______________________________________________
openwrt-devel mailing list
openwrt-devel@lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel