On Sat, Dec 11, 2010 at 02:26:57AM +0900, Osamu Aoki wrote:
> Hi, maintainer of uscan code:
> 
> In short, let me say that this documentation change caused by the
> original bug report adds no value and should be reverted.

I have to disagree.  I wouldn't have committed the code otherwise.

> I do not understand why this patch was accepted.  Situation can be
> summarized as:
> 
>  * The original documentation with .* was correct documentation and style 
>    is typical one for such cases.

Typical does not mean correct.  It is always best with regular
expressions to be as exact as possible to ensure you're matching only
what is expected.  We have had various people report "problems" with
uscan which were actually due to poorly written watch files.

In most cases, the watch file *should* be using “.+” since the person
expects the grouping to match at least one character.  If they are ok
with matching zero characters, then they can make that informed decision
when writing their watch file.

Our documentation is just a recommendation based on previous experience,
not the One True Way.  Each person needs to write their watch file
understanding what it is they are intending to match.

>  * The changed documentation with .+ is also correct in the sense it works.

It's also correct in the sense that it better expresses the intended
behavior: match 1 or more characters.

>  * The many REGEXs of actual uscan code use ".*" where ".+" may be used 
>    but these are not asked to be changed nor changed.

This is because those regexes are intended to match 0 or more
characters.  Using “.+” would not be correct.

> My thought is "why we make this pedantic useless cosmetic changes with
> uncommon syntax.  It adds confusion and creates resource drains."
> I see no practical benefits of change from .* to .+ in the documentation.

This syntax is not uncommon.  It is very normal and only adds confusion
if someone isn't familiar with PCRE syntax.  Knowledge of PCRE syntax is
required to create a watch file that properly expresses what the watch
file should be matching.

> Seriously, if such null matching is the REAL problem, we have many such
> REGEX in uscan code itself.  I have seen Perl REGEX ".*" in uscan.  I
> have also seen Perl REGEX like ".*?" in uscan which looks even funnier
> than use of ".*" since it is the same as ".*" and not so common one to
> me.

No, “.*” and “.*?” are not the same.  The former is greedy, the latter
is not.  This is an important difference and one which we took into
account when writing uscan.

> If Ben is in mission to spread the use of ".+" as much as possible,
> these look to me better ones to work on.  (I am not suggesting to change
> code here.  The original author may have deep thought behind his choice
> and I see no benefit of using ".+".) 
> 
> Let's at least keep coding style the same between code of uscan and
> documentation.  So this change is not a good idea for uscan.

The type of regular expressions used internal to uscan has no bearing on
the type of regular expressions used in watch files.  They're being used
for different purposes.


On Sat, Dec 11, 2010 at 07:49:17AM +1100, Ben Finney wrote:
> On 11-Dec-2010, Osamu Aoki wrote:
> > Anyway, that is not going to block us.  We pick latest
> > version which is one with version number (null string should be oldest
> > one).  So the use of ".*" is not bug here.
> 
> A Debian package with an empty upstream version string violates Policy
> §5.6.12:

This has no bearing on upstream versioning.  They can use whatever
format they like.  We would adjust it to fit Debian's requirements on
the packaging side.

> Packages which use ‘(.*)’ in the ‘debian/watch’ file to match the
> upstream version string have a watch file that unnecessarily allows a
> Policy-violating version string to be created.
> 
> The documentation should not recommend that. That's the entire scope
> of this bug report.

I'd say it's more about having correct regular expressions.  In all of
the cases that were fixed, we're expecting at least 1 character to be
matched so “.+” should be used.

-- 
James
GPG Key: 1024D/61326D40 2003-09-02 James Vega <james...@debian.org>

Attachment: signature.asc
Description: Digital signature

Reply via email to