wacko LC_ALL=zh_TW.utf8 egrep -i
$ printf Me\\nji\\n|LC_ALL=zh_TW.utf8 egrep Me\|ji
Me
ji
$ printf Me\\nji\\n|LC_ALL=zh_TW.utf8 egrep -i Me\|ji
ji
$ printf Me\\nji\\n|LC_ALL=zh_TW.utf8 egrep -i Me
Me
$ printf Me\\nji\\n|LC_ALL=zh_TW.utf8 egrep -i me\|ji
ji

GNU grep 2.5.3

___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
Re: wacko LC_ALL=zh_TW.utf8 egrep -i
According to [EMAIL PROTECTED] on 8/29/2007 6:13 AM:
> $ printf Me\\nji\\n|LC_ALL=zh_TW.utf8 egrep -i me\|ji
> ji
>
> GNU grep 2.5.3

Wrong list. Coreutils does not provide grep.

-- 
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
Re: [PATCH] Command line parsing of ls with genparse
On Tue, Aug 28, 2007 at 09:08:51PM -0600, Eric Blake wrote:
> According to Michael Geng on 8/28/2007 12:33 PM:
> >
> > In the present version of genparse new strings are always printed
> > in new lines. For example (also from the ls command):
> >
> > d / directory   flag    "list directory entries instead of contents,"
> >                         " and do not dereference symbolic links"
>
> Why not make genparse a bit smarter, and let the user supply free-form
> text as the option description? Genparse should then wrap it to fit an
> 80-column screen before generating the resulting usage() in the .c file.
> Then the above example would simply be:
>
> d / directory flag \
> "list directory entries instead of contents, and do not dereference
> symbolic links"
>
> with the __GNU_GLOSSARY__(29) being the formatting hint of where the
> auto-wrapping should occur in the output English text.

I think that's a good idea. How about adding a --linebreak[=width]
command line switch to genparse that breaks lines on the help screen
automatically at the specified width, or at 80 columns if --linebreak
is given without an argument?

Michael
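The automatic wrapping discussed in this message can be sketched as a greedy word wrap. This is only an illustration under stated assumptions: wrap_text, its parameters and its return convention are hypothetical names made up for this sketch, not genparse code, and nothing here reflects how genparse actually implements (or would implement) a --linebreak switch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of genparse.  Greedy word wrap: copy
   TEXT into OUT (capacity OUT_SIZE bytes), breaking between words so
   that a line only exceeds WIDTH columns when a single word is too
   long to fit.  Continuation lines are indented by INDENT spaces.
   Returns 0 on success, -1 if OUT is too small.  */
static int
wrap_text (const char *text, size_t width, size_t indent,
           char *out, size_t out_size)
{
  size_t col = 0;   /* current column on the output line */
  size_t len = 0;   /* bytes written to OUT so far */

  assert (indent < width);

  while (*text)
    {
      /* Skip the whitespace that separates words.  */
      while (*text == ' ' || *text == '\n')
        text++;
      if (!*text)
        break;

      size_t word = strcspn (text, " \n");

      /* Worst case for this word: newline + indent + word + NUL.  */
      if (len + 1 + indent + word + 1 > out_size)
        return -1;

      if (col > 0 && col + 1 + word > width)
        {
          /* The word does not fit: start an indented continuation line.  */
          out[len++] = '\n';
          memset (out + len, ' ', indent);
          len += indent;
          col = indent;
        }
      else if (col > 0)
        {
          out[len++] = ' ';
          col++;
        }

      memcpy (out + len, text, word);
      len += word;
      col += word;
      text += word;
    }

  if (len + 1 > out_size)
    return -1;
  out[len] = '\0';
  return 0;
}
```

Called with the ls description quoted above, a width of 30 and an indent of 2, the sketch yields three lines, the second and third indented by two spaces. A real implementation would also have to decide how to treat multibyte characters, which this byte-counting version ignores.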
Re: [PATCH] Command line parsing of ls with genparse
On Tue, Aug 28, 2007 at 01:21:46PM -0700, Eric Blake-1 wrote:
> > 2. ls.c depends on ls-clp.h (the generated parser)
> >    ls-clp.h depends on ls.gp (the genparse file)
> >    ls.gp depends on ls.c because ls.gp is embedded as a comment in ls.c
> >    -> There is a circular dependency!
>
> That seems wrong to me. Isn't it really:
>
> ls$(EXEEXT) directly depends on ls.o and ls-clp.o
> ls.o directly depends on ls.c and ls-clp.h
> ls-clp.o directly depends on ls-clp.c and ls-clp.h
> ls-clp.c directly depends on ls.gp
> ls-clp.h directly depends on ls.gp
> ls.gp directly depends on ls.c
>
> No cycle there, even though ls.c is an indirect
> dependency of ls$(EXEEXT) through more than one
> leg of the transitive closure.

You are right. I verified this and it builds properly with the following
modifications on src/Makefile.am:

--- coreutils-6.9.orig/src/Makefile.am 2007-03-20 08:24:27.0 +0100
+++ coreutils-6.9/src/Makefile.am 2007-08-29 21:14:29.0 +0200
@@ -48,7 +48,7 @@
 EXTRA_DIST = dcgen dircolors.hin tac-pipe.c \
   groups.sh wheel-gen.pl extract-magic c99-to-c89.diff
 BUILT_SOURCES =
-CLEANFILES = $(SCRIPTS) su
+CLEANFILES = $(SCRIPTS) su *.gp *-clp.c *-clp.h
 
 AM_CPPFLAGS = -I$(top_srcdir)/lib
 
@@ -185,14 +185,16 @@
 __SOURCES = lbracket.c
 
 cp_SOURCES = cp.c copy.c cp-hash.c
-dir_SOURCES = ls.c ls-dir.c
-vdir_SOURCES = ls.c ls-vdir.c
-ls_SOURCES = ls.c ls-ls.c
+dir_SOURCES = ls.c ls-dir.c ls-clp.c ls-clp.h
+vdir_SOURCES = ls.c ls-vdir.c ls-clp.c ls-clp.h
+ls_SOURCES = ls.c ls-ls.c ls-clp.c ls-clp.h
 chown_SOURCES = chown.c chown-core.c
 chgrp_SOURCES = chgrp.c chown-core.c
 
 mv_SOURCES = mv.c copy.c cp-hash.c remove.c
 rm_SOURCES = rm.c remove.c
+tail_SOURCES = tail.c tail-clp.c
+wc_SOURCES = wc.c wc-clp.c
 
 md5sum_SOURCES = md5sum.c
 md5sum_CPPFLAGS = -DHASH_ALGO_MD5=1 $(AM_CPPFLAGS)
@@ -363,3 +365,14 @@
 	  | grep -Ev -f $$t &&\
 	  { echo 'the above variables should have static scope' 1>&2; \
 	    exit 1; } || :
+
+ls.$(OBJEXT): ls-clp.c ls-clp.h
+tail.$(OBJEXT): ls-clp.c tail-clp.h
+wc.$(OBJEXT): ls-clp.c wc-clp.h
+
+%-clp.c %-clp.h: %.gp
+	genparse --longmembers --internationalize -o $(*F)-clp $<
+
+%.gp: %.c
+	sed -n -e '/genparse file starts here/,/genparse file ends here/p' < $(*F).c | \
+	sed -e '/genparse file ends here/d' -n -e '2,$$p' > $@

Michael
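The "no cycle" argument in this message can be checked mechanically. The sketch below encodes the six dependency statements from the quoted list as edges and runs a depth-first search that reports a cycle only if a node reappears on the current path. The names struct dep, reaches_cycle and has_cycle are hypothetical, invented for this illustration; "ls" stands in for ls$(EXEEXT); and this is not how make or automake evaluate dependencies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "TARGET directly depends on PREREQ" edge.  */
struct dep { const char *target; const char *prereq; };

/* The edges from the quoted list ("ls" stands for ls$(EXEEXT)).  */
static const struct dep deps[] = {
  { "ls",       "ls.o" },     { "ls",       "ls-clp.o" },
  { "ls.o",     "ls.c" },     { "ls.o",     "ls-clp.h" },
  { "ls-clp.o", "ls-clp.c" }, { "ls-clp.o", "ls-clp.h" },
  { "ls-clp.c", "ls.gp" },
  { "ls-clp.h", "ls.gp" },
  { "ls.gp",    "ls.c" },
};
enum { NDEPS = sizeof deps / sizeof deps[0], MAX_DEPTH = 32 };

/* Depth-first search: return 1 if following prerequisites from NODE
   revisits a node that is already on the current PATH.  */
static int
reaches_cycle (const struct dep *edges, size_t n, const char *node,
               const char **path, size_t depth)
{
  assert (depth < MAX_DEPTH);

  for (size_t i = 0; i < depth; i++)
    if (strcmp (path[i], node) == 0)
      return 1;

  path[depth] = node;
  for (size_t i = 0; i < n; i++)
    if (strcmp (edges[i].target, node) == 0
        && reaches_cycle (edges, n, edges[i].prereq, path, depth + 1))
      return 1;
  return 0;
}

/* Return 1 if any target in EDGES can reach itself, 0 otherwise.  */
static int
has_cycle (const struct dep *edges, size_t n)
{
  const char *path[MAX_DEPTH];

  for (size_t i = 0; i < n; i++)
    if (reaches_cycle (edges, n, edges[i].target, path, 0))
      return 1;
  return 0;
}
```

ls.c is reached along two different paths (via ls.o directly and via ls-clp.h and ls.gp), but it never appears twice on the same path, which is why the search reports no cycle for this graph. Adding an edge from ls.c to ls-clp.h, as the original claim assumed, would make it report one.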
Re: [PATCH] Command line parsing of ls with genparse
On Tue, Aug 28, 2007 at 01:21:46PM -0700, Eric Blake-1 wrote:
> > +++ coreutils-6.9/src/ls.c 2007-08-26 19:58:20.0 +0200
> > @@ -76,7 +76,6 @@
> > # define SA_RESTART 0
> > #endif
> >
> > -#include "system.h"
> > #include
>
> Why are you deleting this include? Without it, how do you ensure
> that is pulled in before anything else? If you intend for
> ls-clp.h to fill this role, then it must be included before any
> system files. Also, are you sure you are not falling foul of
> any 'make distcheck' rules in Makefile.maint?

I need the following definitions in ls-clp.c:

1. the i18n macro _()
2. the definition of PACKAGE_BUGREPORT
3. the definition of true and false

I got everything by including system.h in ls-clp.c. Unfortunately I then
had to remove it from ls.c because there were duplicate definitions.
I ran into this when I wrote the patch for the wc command, and at that
time I was happy to get it compiled, that's all. I'm sure there is a
better solution. Maybe I have to include other files. I agree that this
has to be fixed.

> > + Cmdline(&cmdline, argc, argv);
>
> GNU coding standards want a space between the function
> name and open (.

Right, thanks.

> > +/* Extract the following section and process it with genparse
> > + (see http://genparse.sourceforge.net) in order to generate a parser
> > + for the command line arguments and a usage function for printing a help
> > + screen. */
> > +
> > +/* genparse file starts here
> > +#include
> > +#include "system.h"
> > +#include "ls.h"
> > +
> > +#exit_value LS_FAILURE
>
> I know the C standard requires this, but in practice, are all
> C preprocessors tolerant of comments that contain lines
> that look like preprocessing directives but which are not?

That's potentially another drawback of embedding the genparse files in
the C sources.

> > +NONE / help     flag    "display this help and exit"
> > +NONE / version  flag    "output version information and exit"
>
> It looks like one drawback of using genparse is that you lose
> the system.h magic that ensures consistency between all
> the apps with --help and --version, since you can't really
> use the preprocessor macros *_HELP_OPTION_* here.

I could imagine solving this by adding the capability to include
parameter definitions in a genparse file, i.e. to include genparse files
in other genparse files. There could be a shared genparse file with the
parameter definitions for help and version which all other genparse
files include.

> > +Report bugs to <__STRING__(PACKAGE_BUGREPORT)>.
>
> What happened to the TRANSLATOR comment that reminds
> them to add a second line, including the address to report
> translation bugs to? Also, it isn't very obvious how this
> will affect xgettext extraction of strings that need
> translation. Are you sure you haven't broken things
> for other locales? Would the generated ls-clp.c need
> to be added to POTFILES.in, or is your intent still to
> have all translatable strings reside in ls.c?

If I understood the i18n mechanism right then the C preprocessor
is needed for the _() macros to take effect. So the genparse files
can't be translated directly, even if they are embedded in C files,
because they are still inside a comment. So I think ls-clp.c
would have to be added to POTFILES.in.

I haven't investigated how a genparse based solution affects
i18n and I generally have very little experience with i18n. I
would expect that problems are caused by different partitioning
of text. In the present version of ls the usage() function calls
fputs() several times. The genparse version prints everything in
a single call to printf(). So the usage() text in the present ls.c
is split into multiple _() macros, whereas ls-clp.c uses a single _()
macro for the whole help screen. Do you agree that this is the main
source of trouble? Do you see other problems? I haven't fully
thought this through, but I think I could change genparse so that
the user can control where a new print command starts, which would
give the user control over how translatable text is partitioned.

Michael
Re: [PATCH] Command line parsing of ls with genparse
[EMAIL PROTECTED] (Michael Geng) wrote:
...
> of text. In the present version of ls the usage() function calls
> fputs() several times. The genparse version prints everything in
> 1 single call to printf(). So the usage() text in the present ls.c
> is split into multiple _() macros, whereas ls-clp.c uses 1 single _()
> macro for the whole help screen.

Consider separating it into strings no longer than 509 bytes each
and printing them separately. That's a portability limitation imposed
by some C89 compilers (see gcc's -Woverlength-strings).

If you were to run "make distcheck", this and some other problems
would be exposed. For example, you added at least one function
that was not declared static.

With your changes does "make check" still pass?
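The advice about keeping each string under 509 bytes and printing the pieces separately could look roughly like the following. This is a sketch under stated assumptions, not code from ls.c or from genparse output: usage_chunks, longest_chunk and print_usage are hypothetical names, the help text is abridged, and the gettext markers that coreutils would put around each string are left out to keep the example self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Longest string literal that C89 requires a compiler to accept.  */
enum { C89_LITERAL_MAX = 509 };

/* Help text split into short pieces (abridged, layout illustrative).
   Adjacent literals inside one array element are concatenated by the
   compiler, so each element is a single string.  */
static const char *const usage_chunks[] = {
  "Usage: ls [OPTION]... [FILE]...\n",
  "  -d, --directory  list directory entries instead of contents,\n"
  "                     and do not dereference symbolic links\n",
  "      --help       display this help and exit\n"
  "      --version    output version information and exit\n",
};
enum { N_CHUNKS = sizeof usage_chunks / sizeof usage_chunks[0] };

/* Length in bytes of the longest chunk, for checking the limit.  */
static size_t
longest_chunk (void)
{
  size_t max = 0;

  for (size_t i = 0; i < N_CHUNKS; i++)
    {
      size_t n = strlen (usage_chunks[i]);
      if (n > max)
        max = n;
    }
  return max;
}

/* Print the help screen one chunk at a time, as several fputs ()
   calls rather than one large printf ().  */
static void
print_usage (FILE *out)
{
  assert (out != NULL);
  for (size_t i = 0; i < N_CHUNKS; i++)
    fputs (usage_chunks[i], out);
}
```

A run-time check like longest_chunk is only a stand-in here; in practice the compiler warning named in the message is the more direct way to catch an overlong literal.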
Re: Minutes of the August 28th 2007 teleconference
According to Andrew Josey on 8/29/2007 12:16 AM:
> Austin Group Minutes of the 28th August 2007 Teleconference
> Austin-379 Page 1 of 1

> XCU ERN 165 mv Accept
>
> Send down the interpretations track.
> The standard is clear, the standard is wrong, concerns
> are being forwarded to the sponsor.
>
> XCU ERN 166 mv Accept as marked below
>
> Send down the interpretations track.
> The standard is ambiguous, no conformance distinctions can be
> made about different implementations, concerns
> are being forwarded to the sponsor.

We should consider editing both of these interpretations to also apply to
ln. The coreutils list noted, just this month, that
'ln a/f b/f c && rm -Rf a b'
risks losing user data; and
'ln a /'
should not attempt to create '//a'.

-- 
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
Re: Minutes of the August 28th 2007 teleconference
According to Eric Blake on 8/29/2007 5:47 PM:
> We should consider editing both of these interpretations to also apply to
> ln. The coreutils list noted, just this month, that
> 'ln a/f b/f c && rm -Rf a b'
> risks losing user data;

That example should have read:
'ln -f a/f b/f c && rm -Rf a b'

-- 
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
Re: [PATCH] Command line parsing of ls with genparse
According to Michael Geng on 8/29/2007 2:29 PM:
> On Tue, Aug 28, 2007 at 01:21:46PM -0700, Eric Blake-1 wrote:
>>> +++ coreutils-6.9/src/ls.c 2007-08-26 19:58:20.0 +0200
>>> @@ -76,7 +76,6 @@
>>> # define SA_RESTART 0
>>> #endif
>>>
>>> -#include "system.h"
>>> #include
>> Why are you deleting this include? Without it, how do you ensure
>> that is pulled in before anything else? If you intend for
>> ls-clp.h to fill this role, then it must be included before any
>> system files. Also, are you sure you are not falling foul of
>> any 'make distcheck' rules in Makefile.maint?
>
> I need the following definitions in ls-clp.c:

My complaint was not that you moved #include "system.h" to ls-clp.h (via
the genparse chunk), but that you forgot to put #include "ls-clp.h" first,
prior to . Remember, in gnulib-based projects, absolutely has to be
included prior to any system headers, because we provide replacement
system headers (such as a replacement ), and our replacements sometimes
depend on the contents of (although we are trying to fix those cases
where we can).

>
> 1. the i18n macro _()
> 2. the definition of PACKAGE_BUGREPORT
> 3. the definition of true and false

Shouldn't true and false just come from C99? Or even the gnulib module?
Here's a case where providing your own definition in ls-clp.h is liable
to break if you first include "system.h" (which picks up the C99 or
gnulib ).

>> What happened to the TRANSLATOR comment that reminds
>> them to add a second line, including the address to report
>> translation bugs to? Also, it isn't very obvious how this
>> will affect xgettext extraction of strings that need
>> translation. Are you sure you haven't broken things
>> for other locales? Would the generated ls-clp.c need
>> to be added to POTFILES.in, or is your intent still to
>> have all translatable strings reside in ls.c?
>
> If I understood the i18n mechanism right then the C preprocessor
> is needed for the _() macros to take effect.

Not quite - xgettext is not a C preprocessor, rather it is a regular
expression matcher. It recognizes C comments, and normally will not
extract any language comments inside them, but on the other hand, the
gettext manual recommends teaching xgettext the patterns to look for so
that translations can be grabbed from original source files rather than
from generated byproducts (i.e. POTFILES.in would list getdate.y, not the
bison-generated getdate.c, if getdate has translatable strings). Really,
the role of the C preprocessor here is to make typing gettext () shorter,
as in _(); provided that xgettext is told that _ marks a translatable
string.

> So the genparse files
> can't be translated directly, even if they are embedded in C files
> because they are still inside of a comment. So I think ls-clp.c
> would have to be added to POTFILES.in.

I think that is true, unless you can teach xgettext to look inside comments.

>
> I haven't investigated how a genparse based solution affects
> i18n and I generally have very little experience with i18n. I
> would expect that problems are caused by different partitioning
> of text. In the present version of ls the usage() function calls
> fputs() several times.

Not only because of the 509-character string literal limit that Jim
mentioned, but also because gettext recommends providing no more than
about 6 or 7 lines of text to the translator at a time. The more lines
there are to translate in one go, the harder it is for a translator to
spot the minor change embedded in those lines when all you do is edit one
word in the string. The gettext manual talks more about this.

> The genparse version prints everything in
> 1 single call to printf(). So the usage() text in the present ls.c
> is split into multiple _() macros, whereas ls-clp.c uses 1 single _()
> macro for the whole help screen. Do you agree that this is the main
> source of trouble?

Yes, that's definitely part of the problem. The other part is that the
_() macro only works if xgettext was able to extract the string to begin
with.

> Do you see other problems? I haven't fully
> thought this through but I think I could change genparse such that
> the user can control when a new print command should start thus
> giving control of partitioning translatable text to the user.

Sounds like it would definitely be needed before coreutils could consider
switching to genparse.

By the way, thanks for your efforts in trying to improve all of this.
Even if Jim doesn't accept your code, it is making genparse better, and it
is finding areas in coreutils that could use improvement regardless of how
option-parsing code is generated.

-- 
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
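The point about _() being mere shorthand for gettext () can be illustrated with a minimal sketch. It assumes a system whose C library provides gettext () through <libintl.h> (as glibc does; elsewhere linking with a separate libintl may be needed), and usage_tail is a hypothetical function name, not one from ls.c. The xgettext side is not shown: the macro only helps at run time, and extraction still depends on xgettext being told about the marker, for example with its --keyword=_ option.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Conventional shorthand.  The preprocessor only saves typing here;
   xgettext finds the strings by looking for the marker in the source
   text, provided it has been told that _ marks a translatable string.  */
#define _(msgid) gettext (msgid)

/* Hypothetical example function: each fputs () call hands the
   translator a small, separate piece of the help screen.  */
static void
usage_tail (FILE *out)
{
  assert (out != NULL);
  fputs (_("      --help     display this help and exit\n"), out);
  fputs (_("      --version  output version information and exit\n"), out);
}
```

With no locale selected and no message catalog bound, gettext () returns its argument unchanged, so the function prints the English text as written.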