* Yves-Alexis Perez [2012-04-15 09:18 +0200]:
> On ven., 2012-03-23 at 23:39 +0100, Carsten Hey wrote:
> > I think we should drop ftpmaster from CC in further mails.
>
> Maybe, since they don't seem to care about this.

They provided an IMHO acceptable, but not ideal, way to handle this
(not ideal because there does not seem to be an ideal way).  I suggested
dropping them from CC because there is nothing relevant yet that they
could comment on, which presumably is also the reason they have not
commented so far ;)

> Well, parsing python might not be an option, but what about:
>
> egrep -a "^C[1|2]='..'" waf
> C1='#*'
> C2='#%'

We need to be able to repack a changed wafadmin directory into an
existing waf script to gain anything.  To repack, C1 and C2 need to be
adapted.  If C1 and C2 are adapted via regular expressions, this would
fail, possibly without being noticed, if, for example, the variable
names change in future waf versions, or if the character ' is part of
one of these variables and the regular expression does not handle this.

All in all, this is a rather natural approach to the problem, but it is
anything but robust.  It could be done using regular expressions, but
I assume that the effort required to ensure that it works correctly, and
to keep it updated, is far greater than the effort of just shipping an
unpacked waf in every package that uses waf.  Besides this, the
probability of related errors going unnoticed is presumably
unreasonably high.

A way to handle this that would possibly make everybody happy would
require convincing waf upstream to adapt waf.  As already mentioned, the
reason we are not able to repack waf scripts in a reasonable way using
only essential tools is that waf scripts are not clearly divided into
a data part and a non-data part, i.e., C1 and C2 contain information
that one would expect to be in a header and not in a script.  If waf
scripts contained, instead of the variables C1 and C2, a header like the
one below, and parsed that header themselves to figure out which
replacements to do, then tools that unpack and/or repack waf scripts
reliably could easily be written.

#===
# Waf-Data-Format: 1.0
# Waf-Archive-Type: tar.gz
# Waf-Archive-Base-Directory: wafadmin
# Waf-Line-Feed-Replacement: ab
# Waf-Carriage-Return-Replacement: xy
#==>
#...
#<==

If such a header were used by waf upstream, it would be important that
there is exactly one space between the colon after the field name and
the field's data.  The reason is that a replacement string could begin
with a space character.  Introducing a way to escape some characters
would IMO be over-engineered.  Alternatively, the (uppercased) hex
values could be used instead of the real string, i.e., ' m' would be
written as 206D in the header.

Reasons to brute-force unused sequences instead of simply prefixing all
line feeds and all carriage returns with a number sign are:

 * Keep the size of the encoded string as small as possible.  Prefixing
   two of the possible 256 characters would enlarge the encoded string
   on average by 2/256 or 0.78%, given that the compression method is
   reasonable.

 * Some editors do not wrap lines by default.  One could consider
   a single long unwrapped line more beautiful than multiple lines (on
   average size/128 lines) when a waf script is opened in an editor.

 * The data part ends before a line that only contains the string
   '#<=='.  If you encoded an archive of infinite size using the
   described prefixing, it would also contain this line _in_ the data
   part.  One way to fix this is to additionally prefix the equal sign
   with a number sign.  A presumably better way is to interpret the
   semantics of '#<==' as "the data part ends before the _last_ equal
   line in a comment block" and not "... before the _first_ equal line
   ...".

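To illustrate why the "exactly one space" rule matters, here is a rough
Python sketch of how a tool might read the header proposed above.  It is
only an illustration under assumptions: the function names
parse_waf_header and unhex are made up, waf provides nothing like them,
and the header format itself is only a proposal.

```python
def parse_waf_header(script: bytes) -> dict:
    """Collect the fields between the '#===' and '#==>' marker lines.

    Hypothetical helper: it illustrates the proposed header format only
    and is not part of waf.
    """
    fields = {}
    in_header = False
    for line in script.split(b"\n"):
        if line == b"#===":
            in_header = True
        elif in_header and line == b"#==>":
            return fields
        elif in_header:
            if not line.startswith(b"# "):
                raise ValueError("malformed header line: %r" % line)
            # Split at the first ": " only.  Everything after that single
            # space is data, so a value may itself start with a space.
            name, sep, value = line[2:].partition(b": ")
            if not sep:
                raise ValueError("missing ': ' in header line: %r" % line)
            fields[name.decode("ascii")] = value
    raise ValueError("no complete header found")


def unhex(value: bytes) -> bytes:
    """Decode the alternative hex notation, e.g. b'206D' gives b' m'."""
    return bytes.fromhex(value.decode("ascii"))
```

Because the split happens at the first colon-plus-space only, a header
line written with two spaces after the colon yields a replacement string
that begins with a space, which is exactly the case the rule is meant to
keep unambiguous.
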
Perl one-liner filters to encode and decode the data part using the
described prefixing are:

    perl -e '$_ = do { local $/ = <> }; s/\n/\n#/sg; s/\r/\r#/sg; print "#", $_, "\n"'
    perl -e '$_ = do { local $/ = <> }; $_ = substr($_, 1, -1); s/\r#/\r/sg; s/\n#/\n/sg; print'

They can be used in the same way as all other filters:

    cat file | filter > result

With this approach, the need for C1 and C2 (or the corresponding header
fields) would vanish.  The header would still be very useful, though.

The remaining non-trivial part is to try to convince waf's upstream to
add such a header.  I will not do this, since I think the existing
solution (shipping waf unpacked) is ugly but sufficient, and I don't
even use waf.

With such a header and the corresponding scripts, changes between
different Debian revisions would still not be reviewable as easily as
by running "zrun interdiff *.diff.gz", but I don't think that this is
a blocker, as long as README.source contains easy recipes for changing
waf and reviewing these changes.

> Well, when needed because we need to patch the build script (like for
> the hppa issue) we can do that.

Being able to do something doesn't necessarily mean that it can be done
in an easy way.


Regards
Carsten


P.S.:   Do whatever you want to with this mail's content.  If anything
        I wrote in it (everything that is not quoted from your previous
        mail) is copyrightable, which I doubt, then it is licensed under
        the terms of the WTFPL 2.0, a license that is practically
        equivalent to the public domain.

P.P.S.: If you want to test whether the above can be embedded into
        a python script, set the script's encoding to latin-1, as
        described in PEP 0263 - or just copy the second line of an
        existing waf script.