05/05/2012 11:20 PM, intrigeri: > anonym wrote (03 May 2012 12:44:47 GMT) : >> See the attached patch. The is_in_list() function therein: >> * interprets white-spaces and newlines as separators; all other >> chacaters form "words", so there should be no more surprises. > I find this design decision sane to make this function useful and as > robust as possible in the context of live-boot. > > However, the function semantics is different from how argument lists > work in shell usually: the shell has a slightly more elaborate vision > of what a list is, e.g. it supports list elements that contain whitespace.
Right, but such lists are fragile, error prone and confusing, and I think (hope?) we all agree that they should be avoided. In fact, that has been avoided in several parts of the code already. For instance, snapshot data are returned as a shell list of colon-separated "lists" in probe_for_filename(), even though shell lists of shell lists could have been used. > Therefore, I think the function's name should reflect the "list" it's > about is *not* a "normal" shell list. Perhaps something like > is_in_whitespace_separated_list. Yes, that's ugly. But with the > current function name, while I am in a shell programming mindset, the > way it is implemented would probably surprise me more than once. Sure. I'll pick is_in_space_sep_list() for something with a bit more brevity. What you suggest is also necessary because we also need a is_in_comma_sep_list() for comma separated lists. >> * doesn't depend on quoting input to form "lists". > What do you mean exactly? You don't have to do quote the lists elements to form the list parameter to send to is_in_list(). >> * doesn't use any fancy grep functionality (no -w, no \< or \>), >> just basic regex stuff that ought to be correctly implemented >> in busybox. > Just to be sure, have you verified it works properly with > busybox grep? Of course. >> * makes the code much more readable. > Sure. > > > Also, I was surprised to discover that `is_in_list a b a` is wrong in > dash, while it's true in bash and in busybox sh. After `local l=${@}' > is evaluated, $l contains "b" in dash, while it contains "b a" in bash > and busybox sh. The rest of the code (where you migrate a few `grep > "$list"' to `is_in_list elt $list', removing quotes along the way) > assumes bash and busybox behaviour. For what it's worth, replacing > that line with `local l="${*}"' makes all these three shells think, in > perfect agreement, that `is_in_list a b a' is true, which I would find > much saner, given the idea is to avoid a certain class of subtle, hard > to understand future bugs :) Thanks! >> + echo ${l} | grep -qe "^\(.*[[:space:]]\)\?${e}\([[:space:]].*\)\?$" > If the regexp language in use knows about them, these should really be > non-capturing, grouping parenthesis -- e.g. in Perl regexp syntax, one > would write the first group like this: (?:.*[[:space:]])? > > (It looks like the busybox package only installed busybox(1) in terms > of busybox documentation on my Debian sid, and alas, this manpage does > not document what kind of pattern is supported by their grep, so > I can't check myself right now.) Non-capturing groups are not part of POSIX regular expressions so they shouldn't be implemented in busybox grep (which btw also lacks GNU grep's -P parameter for perl regular expressions, which otherwise could've been used). Cheers!
signature.asc
Description: OpenPGP digital signature