On Sep 15, 2013, at 8:55 PM, Henrik Lindberg <[email protected]> wrote:
> On 2013-16-09 5:41, Luke Kanies wrote:
>> Hi Henrik,
>>
>> I know we have some users who just batch all package installs up front.
>> It'd be interesting to see if that was a feasible solution. It would
>> bypass the graph entirely, which I'm sure could have problems, but it would,
>> at least, be easy to build and understand. Would that suffice for a
>> sufficient number of cases?
>>
> Well, it naturally misses the optimization opportunity and is obviously
> difficult to maintain for users since they then have to compose the set of
> packages manually without the help of the graph / catalog.

They don't have to compose the set of packages; we'd provide a hook that
pulled all of the packages out of the catalog and ran them in a batch. I'm
not sure which optimization you mean, though.

> The explicit batching being discussed would allow users to partially do this
> - there seem to be use cases where it is really important to be able to do
> this (when a package manager must be given a certain combination of
> packages/versions to do the right thing).
>
> I think we should be able to come up with an optimization that works in the
> general case. The next step is to write a simple utility to get metrics from
> real large-scale system deployments to better understand the value of the
> opportunity.
>
> - henrik
>
>> On Sep 13, 2013, at 9:52 AM, Henrik Lindberg
>> <[email protected]> wrote:
>>
>>> Hi,
>>> Ideas regarding a potential performance boost that can be gained by
>>> performing batch processing of package installs/operations have been
>>> floating around in the Puppet ecosystem for quite some time.
>>>
>>> There is a discussion (and a somewhat dated implementation/proposal) in
>>> http://projects.puppetlabs.com/issues/2198 which is good background reading
>>> for this topic.
>>>
>>> In issue #2198 (if you skipped reading it ;-)), the idea is that Puppet
>>> should have the feature to install a list of packages given by the user.
>>>
>>> It seems doable to generalize this idea and let Puppet automatically
>>> optimize package installs under certain conditions. Performing individual
>>> package installs is quite expensive and even if the optimization
>>> opportunities may not be extensive (e.g. say that 20% (number completely
>>> made up) of packages could at least be paired with one other package) this
>>> is still a worthwhile activity.
>>>
>>> To kick this off, we need to do some research and design. So, here is an
>>> attempt to get this started by asking a bunch of questions.
>>>
>>> Under what conditions can two (or more) package operations be batched?
>>> ----------------------------------------------------------------------
>>> As an example, say that a class contains a series of package resources
>>> without any explicit dependencies between them. The idea is that this could
>>> be optimized. Are there any conditions that make this impossible?
>>>
>>> What if the resources are chained with explicit dependencies? (Guess is
>>> that the dependencies were added for a reason, and should be done as
>>> individual dependencies).
>>>
>>> What if the packages in the list are of different types? Is a chain of
>>> implicit dependencies between packages of the same type required to make it
>>> possible to batch them? (Does it depend on the policy for implicit
>>> dependencies; parse-order, random, etc.)?
>>>
>>> What if there are other implicit dependencies? Can it be deduced that an
>>> intermixed resource has no effect on the outcome of a following package
>>> operation? (Execs can for sure do things).
>>>
>>> Is it possible to optimize across class boundaries?
>>>
>>> Is it enough to look at the queue of actions in the "planned catalog" and
>>> simply look ahead for packages handled by the same provider? An unbroken
>>> chain of operations handled by the same provider is collected and then
>>> handed off to the provider.
>>> Does this provide enough optimization, or are
>>> we then likely to miss optimization opportunities?
>>>
>>> How can we collect metrics for this?
>>>
>>> What needs to be done to providers?
>>> -----------------------------------
>>> Clearly the capability to handle multiple requests must be implemented for
>>> package managers that support this. What should the API look like?
>>>
>>> What needs to be done to the Package type?
>>> ------------------------------------------
>>> Is it all an ordering issue and handing off resources to the provider, or
>>> do we need to do things to the Package type as well?
>>>
>>> Are there situations where it is of value to veto batching per resource?
>>> (Depending on how much optimization can be deduced by looking at
>>> resource dependencies.)
>>>
>>> Explicit group/list?
>>> --------------------
>>> If we want users to be able to explicitly give a group of packages to
>>> manage - how should that work? A new resource type? An attribute on
>>> Package? A defined type?
>>>
>>> If we cannot optimize across classes, can we support explicit
>>> grouping/batch operation? (Seems complex with yet another containment
>>> hierarchy - or can this be done by introspecting a dependency chain of
>>> custom resources/classes perhaps used specifically for this purpose?)
>>>
>>> - henrik
>>>
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Puppet Developers" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an
>>> email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/puppet-dev.
>>> For more options, visit https://groups.google.com/groups/opt_out.

--
Luke Kanies | http://about.me/lak | http://puppetlabs.com/ | +1-615-594-8199
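
For reference, the look-ahead idea from the quoted message (collect an unbroken chain of package operations handled by the same provider, then hand the chain to that provider) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Action`, `batch_runs`, and `batchable_fraction` are hypothetical names, not Puppet APIs, and the queue is assumed to be already ordered by the catalog. Any non-package action (an Exec, say) breaks the chain, since it might affect a later install. The second function is one possible metric for the "how many packages could be paired" question.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional


@dataclass(frozen=True)
class Action:
    """One entry in a hypothetical, already-ordered queue of catalog actions."""
    kind: str                        # e.g. "package", "exec", "file"
    name: str
    provider: Optional[str] = None   # e.g. "yum", "apt"; None for non-packages


def _batch_key(action: Action) -> Optional[str]:
    # Only package actions with a known provider are candidates for batching.
    return action.provider if action.kind == "package" else None


def batch_runs(queue: List[Action]) -> List[List[Action]]:
    """Split the queue into batches without reordering anything.

    Consecutive package actions with the same provider form one batch;
    every other action ends the current run and stands alone.
    """
    batches: List[List[Action]] = []
    for provider, run in groupby(queue, key=_batch_key):
        run_list = list(run)
        if provider is None:
            batches.extend([a] for a in run_list)   # never batch non-packages
        else:
            batches.append(run_list)
    return batches


def batchable_fraction(queue: List[Action]) -> float:
    """Fraction of package actions that share a batch with at least one other."""
    packages = [a for a in queue if a.kind == "package"]
    if not packages:
        return 0.0
    paired = sum(
        len(b) for b in batch_runs(queue)
        if len(b) > 1 and b[0].kind == "package"
    )
    return paired / len(packages)
```

Because the sketch never reorders the queue, it respects whatever ordering the graph produced, at the cost of missing batches that a smarter, dependency-aware pass could find.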
