Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints

Alexis Ballier Sun, 09 Jul 2017 02:30:12 -0700

On Sat, 08 Jul 2017 23:56:07 +0200
Michał Górny <mgo...@gentoo.org> wrote:


> On sob, 2017-07-08 at 22:34 +0200, Alexis Ballier wrote:
> > On Sat, 08 Jul 2017 20:44:24 +0200
> > Michał Górny <mgo...@gentoo.org> wrote:
> >   
> > > On sob, 2017-07-08 at 16:12 +0200, Alexis Ballier wrote:  
> > > > On Sat, 08 Jul 2017 11:43:39 +0200
> > > > Michał Górny <mgo...@gentoo.org> wrote:
> > > >     
> > > > > Hi, everyone.
> > > > > 
> > > > > I think the affairs have settled enough and I've finished
> > > > > filling in the pre-GLEP for REQUIRED_USE auto-enforcing. It's
> > > > > got all the algorithms, rationale and separated reference
> > > > > implementation.
> > > > > 
> > > > > If there are no major concerns raised, I will soon start
> > > > > working on writing an optimized implementation for
> > > > > pkgcore/pkgcheck and integrating the verification algos with
> > > > > the CI.
> > > > > 
> > > > > The pre-GLEP for review is here:
> > > > > 
> > > > > https://wiki.gentoo.org/wiki/User:MGorny/GLEP:ReqUse    
> > > > 
> > > > 
> > > > Constraint group reordering algorithm
> > > > 
> > > > I really think we should only consider REQUIRED_USE with
> > > > forced/masked useflags instantiated there. And ban (in repoman)
> > > > REQUIRED_USE that contain some "False": "a? ( b )" with 'a' free
> > > > and 'b' masked is perfectly ok now but it hides a serious
> > > > problem in the package/profile. Instantiating this would give:
> > > > "a? ( False )" and be an error just like we have depend.bad &
> > > > co. This is independent of auto solving or not, it's already
> > > > wrong.    
> > > 
> > > As I've already explained you multiple times, this obtains
> > > *exactly the same* effect. However, it's much simpler when it's
> > > done like this because it makes it possible to reuse the already
> > > defined algorithms instead of having to precisely define how to
> > > preprocess REQUIRED_USE for this and cover all the corner cases.  
> > 
> > Simpler??? I don't think so. What I wrote clearly pinpoints that:
> > When you'll write the algorithm for "Verifying that the constraints
> > do not alter immutable flags" you'll notice this is exactly that
> > and can be put as a preprocessing step and then you can drop all
> > the corner cases considerations for immutable flags. I never
> > understood why you're insisting that much on immutables: they're
> > really not natural, not simple, not standard, and carrying them all
> > over seems to be a burden to me.  
> 
> I wrote the algorithms, and they're simple. This specific check is
> the combination of three simple steps:
> 
> a. reordering the groups based on immutables,
> 
> b. transforming the AST into flat form,
> 
> c. verifying each flat constraint.
> 
> The first step is trivial -- it's basically 'move true to front, false
> to back'. The second step is more complex but it's needed anyway,
> and quite well-defined, especially with the assumption that all
> the groups always have at least one flag inside. The third step is
> trivial again because it's just checking the conditions and
> constraints against a list.
> 
> The alternative to reordering is altering the groups. Altering means
> we need to have separate logic for every type of group while sorting
> works the same in all of them. Altering means we need to explicitly
> special case forcing 1 and >1 items, and masking all items, for each
> group. Again, sorting does not need to be concerned about that because
> the check following it (also trivial) will catch it anyway.

You don't seem to get how normalizing and constant
propagation/elimination works.

Basically, reordering would be:
And(list) -> And( forced(list) + free(list) + masked(list) )
Or(list)  -> Or( ... . )
etc.

While normalizing is:
And(list) -> False if False appears in a normalize(x) for x in list,
             True if normalize(x) for x in list is empty or all True,
             And(normalize(x) for x in list if != True and != False)
etc.

That's described in one point: Apply boolean algebra rules.

What I don't like about reordering is that it is too tightly coupled to
the following solving algorithm and the restricted syntax, while really,
having REQUIRED_USE constraints asking you to enable a masked flag is
something we ought to kill even without solving as they hide broken
deps and make all our QA checks less useful.


Finally, reordering, being essentially a local thing, does not have the
proper behavior in a general AST:
'|| ( ( a b ) c )' with 'a' and 'b' masked will be left invariant by
reordering and the resulting expanded form will be rejected while
constant propagation/elimination will reduce that to 'c' and be good.
Hence, the reordering check cannot be used as a good input for checking
for broken REQUIRED_USE: I really think things like 'vulkan? ( ||
( video_cards_i965 video_cards_radeonsi ) )' should be a repoman error
on stable profiles where both those video cards are masked and vulkan is
not. For that, we need to support the whole PMS-defined syntax, not a
reduced one.
Basically, this duplicates the code and work on that side and makes you
having to carry the reordering burden all along your GLEP :)


> Of course, you could say that you will get a little better error
> message, like 'all flags inside || are masked' instead of '!b -> a'
> will alter immutable flag. But that's purely an implementation
> detail. It's not worth making the reference algorithms longer.

Well, as noted above, I don't think this is related to any solver.

> > > > Reordering is a dangerous path as we've already seen since it
> > > > can create unexpected loops for the solver.    
> > > 
> > > Freeform reordering is dangerous, and I've removed that already.
> > > Reordering restricted to immutables can not cause any issues that
> > > any other solution wouldn't cause.  
> > 
> > You're very likely right there. Any proof? Esp. any proof that the
> > checker still guarantees the existence of a solution in all cases?
> > I'm not asking for a formal proof, but simply a bit more details
> > than just an assertion saying it's fine.  
> 
> The case for checker is just the same as with any other kind of
> immutability processing. We need to run the reordering, transform
> and verification separately for every possible combination of
> immutable flags.
> 
> The reordering explicitly alters the results of the transform, and
> with the trivial implication form of the flattened constraints
> the verification stage checks will find any problems that may arise
> from it, just like they would find any problem from doing a similar
> thing verbatim.

Ok that works. I don't see much difference between
'check(reorder(formula,immutables))' and
'check(reduce(formula,immutables))'... except the latter does not
have to deal with any immutable anymore.

> >   
> > > > Working on instantiated REQUIRED_USE constraints would probably
> > > > simplify quite a bit your GLEP too: you already have the
> > > > "Verifying that the constraints do not alter immutable flags"
> > > > part that roughly does the same thing as instantiating, except
> > > > if you assume it's already true you can skip the reordering.    
> > > 
> > > Except that the reordering can be described in 2 points, and so
> > > can be the immutability verification. Please prove that you can
> > > provide a simpler explanation that doesn't fail in any of the
> > > corner cases.  
> > 
> > Except reordering is an invention and immutable checking is simply
> > applying boolean logic rules to your implication and check that no
> > "False" can appear. You can simply start by applying boolean logic
> > and forget about reordering.  
> 
> No, it is not. You do not have the values of all the items inside
> the group, just some of them. Depending on how many of them do you
> have and what are them, you need to transform the group
> appropriately, e.g. by removing items, replacing the group or failing
> entirely.

Yes, that's boolean algebra rules. You propagate constants from leaves
to root in the AST and if some 'False' appears in your AST when you've
reached the root you fail. I agree one needs some practice on recursive
structures to understand that quickly and easily though.


> > > > One big advantage of working on ASTs is that it becomes trivial
> > > > to suggest a proper reordering.    
> > > 
> > > Reordering is never a trivial problem. Unless I'm missing
> > > something, it is technically possible that a 'reordering' will
> > > actually require a sub- item being moved out of the containing
> > > group.  
> > 
> > Not if done at the AST level.
> >   
> > > And to be honest, I find the output of the verification script in
> > > this regard quite useful. That is, it saying 'X affects Y, so it
> > > needs to go before it' is quite clear to me. I don't think most
> > > developers would actually need to script to pinpoint a specific
> > > location for every single constraint.  
> > 
> > In most cases this is sufficient.
> > Think of a more complex case:
> > A -> B
> > B -> C
> > A -> D
> > D -> C
> > 
> >    |-> B -|
> > A -|      |->C
> >    |-> D -|
> > 
> > It's starting to be a more complex mental exercise to get the proper
> > ordering when given the 1st form only.
> > 
> > 
> > Actually, considering people rant against git merges because they
> > want linear history in the graph log but fail to understand 'git
> > log' is precisely about displaying such a linear ordering, I'm
> > ready to bet someone will rant :)  
> 
> We can discuss this when you have a working algorithm. Right now, it's
> a purely theoretical exercise unless someone can come up with
> a reasonable way of implementing it.

Hmmm. lol ?
May I suggest you spend 30 minutes on wikipedia about what topological
sorting is, what it does and what purpose it serves?
It's a bit annoying to see you completely lost every time it comes up.

In a few words, from a list of 'X affects Y' it gives you a compatible
linear ordering. While it can be done by hand on the easy cases, that's
something computers are much better at doing on the more complex cases.


> > > > -------
> > > > 
> > > > Restrictions on REQUIRED_USE format
> > > > 
> > > > I still fail to see the point here. One can simply apply the
> > > > rewriting you suggest below and be done with it. Rationale is
> > > > not very convincing to me:
> > > > 
> > > > - avoiding unpredictable results of automatic flag adjustments:
> > > >         A deterministic algorithm is, by definition,
> > > > predictable.    
> > > 
> > > s/unpredictable/surprising/?
> > > 
> > > The goal is for it do something that the developer *not reading
> > > the spec* could reasonably predict happening.  
> > 
> > 
> > There is a huge gap between failing to solve a constraint
> > voluntarily because it has one too much nesting level and having a
> > repoman warning telling 'it is not recommended you write it that
> > way'. The latter ensures said developer is able to predict what is
> > happening, the former just adds annoyance onto users for no real
> > reason.  
> 
> Except it doesn't because it's extremely uncommon (and even unlikely)
> and I am successfully exterminating the last occurrences. Implementing
> support for something that will be never used is a waste of time.

Except you're already wasting your time exterminating some cases for
which support is already written. You still don't know whether your
restricted syntax will be accepted in PMS (which mostly depends if
people feel comfortable with its expressivity), so you don't know if
you won't have to plug support for it later. May I remind you the
numbers I ran? Among all the rejected "too complex" usages of requse,
only *1* could have led to an invalid solution by the solver.

Feel free to do so, I mean, it cannot hurt to simplify requse in the
tree, but keep in mind that maybe support for more syntaxes will still
be needed. Esp. since I learnt recently that PMS factors the definition
of ||, &&, ^^ and co with all the other usages of it (DEPEND, etc.).

Anyway, I'm thinking it's more important to move on with this, and
iterate later, than getting everything perfect beforehand.


> > > > - improving readability of REQUIRED_USE constraints:
> > > >         No need for a restriction for that. If people want to
> > > > shoot themselves in the foot, it is not a PMS problem. I see
> > > > that like proposing death penalty for those who commit
> > > > suicide :)    
> > > 
> > > This is not PMS. This is a GLEP which serves both the purpose of
> > > a technical specification with extended rationale and a policy
> > > document.  
> > 
> > Isn't the goal of it to have it in a future EAPI ?  
> 
> I don't see how that's relevant. Nobody will be copying the whole GLEP
> verbatim into the PMS.

Unless you're writing a throwaway GLEP, the syntax restrictions are
supposed to go to PMS since that's what defines it...


> > > > - keeping the specification and implementation relatively
> > > > simple: You already define everything for working without
> > > > restriction. Plus, unlimited implication nesting has the same
> > > > complexity.    
> > > 
> > > No, I don't. I don't cover the meaning of nested groups and things
> > > just explode when they come into game.  
> > 
> > 
> > You mostly do with the rewriting part, you're only refusing to admit
> > it :)  
> 
> You mean the transform? It doesn't cover the possibility of those
> groups containing anything but plain flags. As we've already
> established, the results become non-trivial when they do and those
> cases are certainly not covered here.

That's why I said mostly. The missing parts are really small, trust me.


[...]

Alexis.

Re: [gentoo-dev] Pre-GLEP RFC: Automated enforcing of REQUIRED_USE constraints

Reply via email to