On Sep 15, 2013, at 8:55 PM, Henrik Lindberg <[email protected]> 
wrote:

> On 2013-16-09 5:41, Luke Kanies wrote:
>> Hi Henrik,
>> 
>> I know we have some users who just batch all package installs up front.  
>> It'd be interesting to see if that was a feasible solution.  It would bypass 
>> the graph entirely, which I'm sure could have problems, but it would, at 
>> least, be easy to build and understand.  Would that cover a sufficient 
>> number of cases?
>> 
> Well, it naturally misses the optimization opportunity and is obviously 
> difficult to maintain for users since they then have to compose the set of 
> packages manually without the help of the graph / catalog.

They don't have to compose the set of packages; we'd provide a hook that pulled 
all of the packages out of the catalog and ran them in a batch.
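
As a rough illustration of such a hook (a sketch only, written in Python rather 
than Puppet's Ruby; `Resource` and `collect_package_batches` are hypothetical 
names, not Puppet APIs), it could look something like this:

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a catalog resource (not Puppet's real class)."""
    type: str           # e.g. "package", "file", "exec"
    title: str
    provider: str = ""  # only meaningful for package resources in this sketch


def collect_package_batches(catalog):
    """Pull every package out of the catalog and group it by provider.

    Graph ordering is deliberately ignored, which is the simplification
    (and the risk) of the batch-everything-up-front approach.
    """
    batches = defaultdict(list)
    for res in catalog:
        if res.type == "package":
            batches[res.provider].append(res.title)
    return dict(batches)


catalog = [
    Resource("package", "openssl", "yum"),
    Resource("file", "/etc/motd"),
    Resource("package", "httpd", "yum"),
    Resource("package", "rake", "gem"),
]
batches = collect_package_batches(catalog)
for provider, names in batches.items():
    print(provider, names)
```

Each provider would then be handed its whole list in one call, before the rest 
of the catalog is applied.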

I'm not sure which optimization you mean, though.

> The explicit batching being discussed would allow users to partially do this 
> - there seem to be use cases where it is really important to be able to do 
> this (when a package manager must be given a certain combination of 
> packages/versions to do the right thing).
> 
> I think we should be able to come up with an optimization that works in the 
> general case. The next step is to write a simple utility to get metrics from 
> real large scale system deployments to better understand the value of the 
> opportunity.
> 
> - henrik
> 
>> On Sep 13, 2013, at 9:52 AM, Henrik Lindberg 
>> <[email protected]> wrote:
>> 
>>> Hi,
>>> Ideas regarding a potential performance boost that can be gained by 
>>> performing batch processing of package installs/operations have been 
>>> floating around in the Puppet ecosystem for quite some time.
>>> 
>>> There is a discussion (and a somewhat dated implementation/proposal) in 
>>> http://projects.puppetlabs.com/issues/2198 which is good background reading 
>>> for this topic.
>>> 
>>> In issue #2198 (if you skipped reading it ;-)), the idea is that Puppet 
>>> should have the feature to install a list of packages given by the user.
>>> 
>>> It seems doable to generalize this idea and let puppet automatically 
>>> optimize package installs under certain conditions. Performing individual 
>>> package installs is quite expensive and even if the optimization 
>>> opportunities may not be extensive (e.g. say that 20% (number completely 
>>> made up) of packages could at least be paired with one other package) this 
>>> is still a worthwhile activity.
>>> 
>>> To kick this off, we need to do some research and design. So, here is an 
>>> attempt to get this started by asking a bunch of questions.
>>> 
>>> Under what conditions can two (or more) package operations be batched?
>>> ----------------------------------------------------------------------
>>> As an example, say that a class contains a series of package resources 
>>> without any explicit dependencies between them. The idea is that this could 
>>> be optimized. Are there any conditions that make this impossible?
>>> 
>>> What if the resources are chained with explicit dependencies? (Guess is 
>>> that the dependencies were added for a reason, and should be done as 
>>> individual dependencies).
>>> 
>>> What if the packages in the list are of different types? Is a chain of 
>>> implicit dependencies between packages of the same type required to make 
>>> it possible to batch them? (Does it depend on the policy for implicit 
>>> dependencies: parse-order, random, etc.?)
>>> 
>>> What if there are other implicit dependencies? Can it be deduced that an 
>>> intermixed resource has no effect on the outcome of a following package 
>>> operation? (Execs can for sure do things).
>>> 
>>> Is it possible to optimize across class boundaries?
>>> 
>>> Is it enough to look at the queue of actions in the "planned catalog" and 
>>> simply look ahead for packages handled by the same provider, so that an 
>>> unbroken chain of operations handled by the same provider is collected and 
>>> then handed off to the provider? Does this provide enough optimization, or 
>>> are we then likely to miss optimization opportunities?
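
The look-ahead described above can be sketched as follows. This is only an 
illustration under stated assumptions, in Python rather than Puppet's Ruby: 
the queue is modelled as a list of hypothetical (kind, provider, name) tuples, 
and `plan_batches` is an invented name, not an existing Puppet function.

```python
from itertools import groupby


def batch_key(action):
    """Packages are keyed by provider; everything else gets no batch key."""
    kind, provider, _name = action
    return ("package", provider) if kind == "package" else None


def plan_batches(queue):
    """Collect unbroken runs of package actions handled by the same provider.

    groupby only merges *adjacent* items with equal keys, so any intermixed
    resource (an exec, a file, a different provider) breaks the run.
    """
    planned = []
    for key, run in groupby(queue, key=batch_key):
        run = list(run)
        if key is None:
            # Non-package actions are never batched; keep them one by one.
            planned.extend([action] for action in run)
        else:
            planned.append(run)
    return planned


queue = [
    ("package", "yum", "openssl"),
    ("package", "yum", "httpd"),
    ("exec", None, "refresh-repo"),
    ("package", "yum", "mod_ssl"),
    ("package", "gem", "rake"),
]
planned = plan_batches(queue)
for batch in planned:
    print([name for _kind, _provider, name in batch])
```

Here the exec splits the yum packages into two separate batches, which is the 
kind of missed opportunity the question is asking about. Counting the batch 
sizes over real catalogs would be one way to get metrics.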
>>> 
>>> How can we collect metrics for this?
>>> 
>>> What needs to be done to providers?
>>> -----------------------------------
>>> Clearly the capability to handle multiple requests must be implemented for 
>>> package managers that support this. What should the API look like?
>>> 
>>> What needs to be done to the Package type?
>>> ------------------------------------------
>>> Is it all an ordering issue and handing off resources to the provider, or 
>>> do we need to do things to the Package type as well?
>>> 
>>> Are there situations where it is of value to veto batching per resource?
>>> (Depending on how much optimization can be deduced by looking at 
>>> resource dependencies.)
>>> 
>>> Explicit group/list?
>>> --------------------
>>> If we want users to be able to explicitly give a group of packages to 
>>> manage - how should that work? A new resource type? An attribute on 
>>> Package? A defined type?
>>> 
>>> If we cannot optimize across classes, can we support explicit 
>>> grouping/batch operation? (Seems complex with yet another containment 
>>> hierarchy - or can this be done by introspecting a dependency chain of 
>>> custom resources/classes perhaps used specifically for this purpose?)
>>> 
>>> - henrik
>>> 
>>> 
>>> --
>>> You received this message because you are subscribed to the Google Groups 
>>> "Puppet Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/puppet-dev.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>> 
>> 
> 
> 


-- 
Luke Kanies | http://about.me/lak | http://puppetlabs.com/ | +1-615-594-8199

