On Sat, 08 Jul 2017 20:44:24 +0200
Michał Górny <mgo...@gentoo.org> wrote:

> On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:
> > On Sat, 08 Jul 2017 11:43:39 +0200
> > Michał Górny <mgo...@gentoo.org> wrote:
> >   
> > > Hi, everyone.
> > > 
> > > I think the affairs have settled enough and I've finished filling
> > > in the pre-GLEP for REQUIRED_USE auto-enforcing. It's got all
> > > the algorithms, rationale and a separate reference implementation.
> > > 
> > > If there are no major concerns raised, I will soon start working
> > > on writing an optimized implementation for pkgcore/pkgcheck
> > > and integrating the verification algos with the CI.
> > > 
> > > The pre-GLEP for review is here:
> > > 
> > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse  
> > 
> > 
> > Constraint group reordering algorithm
> > 
> > I really think we should only consider REQUIRED_USE with
> > forced/masked useflags instantiated there. And ban (in repoman)
> > REQUIRED_USE that contains a "False": "a? ( b )" with 'a' free
> > and 'b' masked is perfectly ok now, but it hides a serious problem
> > in the package/profile. Instantiating this would give "a? ( False
> > )" and be an error, just like we have depend.bad & co. This is
> > independent of auto-solving or not; it's already wrong.
> 
> As I've already explained to you multiple times, this achieves
> *exactly the same* effect. However, it's much simpler when it's done
> like this, because it makes it possible to reuse the already defined
> algorithms instead of having to precisely define how to preprocess
> REQUIRED_USE for this and cover all the corner cases.

Simpler??? I don't think so. What I wrote pinpoints exactly that:
when you write the algorithm for "Verifying that the constraints do
not alter immutable flags", you'll notice it is exactly this check; it
can be done as a preprocessing step, and then you can drop all the
corner-case considerations for immutable flags. I never understood why
you insist so much on immutables: they're really not natural, not
simple, not standard, and carrying them through every step seems like
a burden to me.
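
To make that concrete, here is a rough sketch of the preprocessing step
I have in mind, on a single flattened implication. The names and the
representation below are mine, purely for illustration; this is not
taken from your reference implementation, and negated flags and
||/^^/?? groups are left out for brevity:

def substitute(flag, forced, masked):
    """Replace an immutable flag with its constant value; keep free flags."""
    if flag in forced:
        return True
    if flag in masked:
        return False
    return flag

def check_implication(condition, consequences, forced, masked):
    """condition: a flag name; consequences: a list of flag names.

    Report consequences that can never be satisfied, i.e. the
    "a? ( b )" with 'b' masked -> "a? ( False )" case."""
    if substitute(condition, forced, masked) is False:
        return []  # dead branch, nothing is ever enforced
    errors = []
    for flag in consequences:
        if substitute(flag, forced, masked) is False:
            errors.append("%s? ( %s ): '%s' is masked, so this can never "
                          "be satisfied" % (condition, flag, flag))
    return errors

# 'a' free, 'b' masked: reported, just like depend.bad & co.
print(check_implication("a", ["b"], forced=set(), masked={"b"}))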

> > Reordering is a dangerous path, as we've already seen, since it can
> > create unexpected loops for the solver.
> 
> Freeform reordering is dangerous, and I've removed that already.
> Reordering restricted to immutables cannot cause any issues that any
> other solution wouldn't cause.

You're very likely right there. Any proof? Especially any proof that
the checker still guarantees the existence of a solution in all cases?
I'm not asking for a formal proof, just a bit more detail than
a bare assertion that it's fine.

> > Working on instantiated REQUIRED_USE constraints would probably
> > simplify your GLEP quite a bit too: you already have the "Verifying
> > that the constraints do not alter immutable flags" part, which
> > roughly does the same thing as instantiating, except that if you
> > assume it's already true you can skip the reordering.
> 
> Except that the reordering can be described in 2 points, and so can
> the immutability verification. Please prove that you can provide
> a simpler explanation that doesn't fail in any of the corner cases.

Except that reordering is an invention, while immutable checking is
simply a matter of applying boolean logic rules to your implications
and checking that no "False" can appear. You can simply start by
applying boolean logic and forget about reordering.
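
To spell out what I mean by "applying boolean logic": fold the
immutable flags into constants and simplify the expression bottom-up;
an implication whose consequence reduces to False is exactly the error
case. A rough sketch on a tuple-based mini-AST of my own (only !, &&
and || here, not the full REQUIRED_USE syntax, and again not your
reference implementation):

def fold(node, forced, masked):
    """node is a flag name (str), ('!', node) or ('&&'|'||', [nodes]).
    Returns a simplified node, possibly the constant True or False."""
    if isinstance(node, str):
        if node in forced:
            return True
        if node in masked:
            return False
        return node
    op = node[0]
    if op == '!':
        inner = fold(node[1], forced, masked)
        return (not inner) if isinstance(inner, bool) else ('!', inner)
    if op in ('&&', '||'):
        children = [fold(c, forced, masked) for c in node[1]]
        absorbing, neutral = (False, True) if op == '&&' else (True, False)
        if absorbing in children:
            return absorbing
        children = [c for c in children if c is not neutral]
        if not children:
            return neutral
        return children[0] if len(children) == 1 else (op, children)
    raise ValueError("unknown operator: %r" % (op,))

# "a? ( b )" is !a || b; with 'b' masked it folds down to ('!', 'a'),
# i.e. the only way to satisfy it is to never enable 'a' -- the hidden
# problem that instantiation makes visible.
print(fold(('||', [('!', 'a'), 'b']), forced=set(), masked={'b'}))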


> > Concept for transforming REQUIRED_USE into implications
> > 
> > Ok, now I probably understand the concept of common prefix better.
> > I'm definitely biased here, but I would feel better with a more
> > recursive presentation of it. Assume we have 'validate(list of
> > clauses)'; basically, the common prefix idea is that for an
> > implication 'foo? ( consequences = list of clauses )' you first
> > validate the consequences as if they were a REQUIRED_USE (so that
> > the subtree rooted at foo is not self-conflicting) and then consider
> > the whole thing as a clause. The idea would then be to have checks
> > similar to what you've written, but working on trees (ASTs)
> > instead of flattened clauses. This would avoid having to deal with
> > unique identities (these would come for free) and IMHO would be
> > easier to understand. I'm not sure how to do this though; I'll ping
> > you when I have some idea.
> 
> Well, the problem of common prefix is quite complex, and I'm not even
> sure if it's really worth more consideration. After all, we're
> pretty much talking about doing:
> 
>   a? ( !a ... )
> 
> which has extremely low usability and an even lower likelihood of
> occurring.


Hmm. I think you came up with more valid cases; I can't remember
a precise one, though.
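
For what it's worth, spotting the basic 'a? ( !a ... )' shape itself is
a one-liner; the leading '!' encoding below is just my shorthand for
this sketch. The harder part is the nested/shared-prefix case, which is
what the AST idea is meant to address:

def negates_own_condition(condition, consequences):
    """condition: a positive flag name; consequences: flag names where
    a leading '!' marks negation."""
    return ('!' + condition) in consequences

print(negates_own_condition('a', ['!a', 'b']))  # True: needs the common prefix handling
print(negates_own_condition('a', ['b', 'c']))   # False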


> > One big advantage of working on ASTs is that it becomes trivial to
> > suggest a proper reordering.  
> 
> Reordering is never a trivial problem. Unless I'm missing something,
> it is technically possible that a 'reordering' will actually require
> a sub-item to be moved out of its containing group.

Not if done at the AST level.

> And to be honest, I find the output of the verification script in this
> regard quite useful. That is, it saying 'X affects Y, so it needs to
> go before it' is quite clear to me. I don't think most developers
> would actually need the script to pinpoint a specific location for
> every single constraint.

In most cases this is sufficient.
Think of a more complex case:
A -> B
B -> C
A -> D
D -> C

   |-> B -|
A -|      |->C
   |-> D -|

It starts to become a more complex mental exercise to work out the
proper ordering when given only the first form.
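
For the record, once the implications are read as a graph, a plain
topological sort already spells out one valid linear ordering, which is
roughly the kind of suggestion I'd expect a tool to print. Illustration
only, reading 'A -> B' as "A has to be handled before B", which is one
way to read the diagram above:

from graphlib import TopologicalSorter  # Python 3.9+

# Predecessor map for the graph above: A before B and D, B and D before C.
order = TopologicalSorter({'B': {'A'}, 'D': {'A'}, 'C': {'B', 'D'}})
print(list(order.static_order()))
# e.g. ['A', 'B', 'D', 'C'] -- ties between B and D can go either way.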


Actually, considering that people rant against git merges because they
want a linear history in the graph log, yet fail to understand that
'git log' is precisely about displaying such a linear ordering, I'm
ready to bet someone will rant :)


> > -------
> > 
> > Restrictions on REQUIRED_USE format
> > 
> > I still fail to see the point here. One can simply apply the
> > rewriting you suggest below and be done with it. The rationale is
> > not very convincing to me:
> > 
> > - avoiding unpredictable results of automatic flag adjustments:
> >     A deterministic algorithm is, by definition, predictable.  
> 
> s/unpredictable/surprising/?
> 
> The goal is for it to do something that the developer *not reading
> the spec* could reasonably predict happening.


There is a huge gap between deliberately refusing to solve a constraint
because it has one nesting level too many, and having a repoman warning
saying 'it is not recommended to write it that way'. The latter ensures
said developer is able to predict what is happening; the former just
adds annoyance for users for no real reason.


> > - improving readability of REQUIRED_USE constraints:
> >     No need for a restriction for that. If people want to shoot
> >     themselves in the foot, it is not a PMS problem. I see that
> >     as proposing the death penalty for those who commit
> >     suicide :)
> 
> This is not PMS. This is a GLEP, which serves as both a technical
> specification with extended rationale and a policy document.

Isn't the goal to have it in a future EAPI?


> > - keeping the specification and implementation relatively simple:
> >     You already define everything needed to work without the
> > restriction. Plus, unlimited implication nesting has the same
> > complexity.
> 
> No, I don't. I don't cover the meaning of nested groups, and things
> just explode when they come into play.


You mostly do, with the rewriting part; you're just refusing to admit
it :)


> > -------
> > 
> > 
> > Do you have numbers on the checker run on all inputs from
> > gentoo-x86? Since we're dealing with heuristics, those are
> > particularly important to validate that we're not rejecting too
> > many constructs.
> 
> I think I've mentioned that a few times in the GLEP. The validation
> and verification bits were tested on all REQUIRED_USE cases from a few
> days ago. For the former, all cases are listed in the rationale. For
> the latter, the reference implementation has test cases for every
> construct that triggered an issue, including the false positives that
> were fixed. I've only skipped a few that were completely redundant
> (e.g. different versions of the same package that added 1-2
> irrelevant items).

Unless I'm missing something, the rationale seems to be more about
cases rejected by the restricted syntax. The numbers I'm talking about
are the number of rejected constraints vs. accepted (and assumed
solvable) ones.


> The solver was tested on all combinations of most of the corner
> cases. I have only skipped those that would require a humongous number
> of combinations.

That's great!

> I haven't tested all combinations of masked/forced flags from
> the profiles though. I plan to do this by implementing it in pkgcheck.
> I will also have better performance reports based on optimized
> algorithms and real use.

Makes sense.


Alexis.
