Heh, I can do either one. The former will happen when I get free time :-|. The latter can happen whenever anyone else has any.
Trevor

On Thu, Dec 18, 2014 at 4:18 PM, Michael Smith <[email protected]> wrote:
>
> Just to be clear, are you going to try that approach out yourself, or are
> you asking for help implementing it?
>
> On Thu, Dec 18, 2014 at 7:07 AM, Trevor Vaughan <[email protected]>
> wrote:
>
>> +1 to all of this. It all makes sense and I think it'll solve all of the
>> use cases that I can think of.
>>
>> On Wed, Dec 17, 2014 at 8:34 PM, Michael Smith <
>> [email protected]> wrote:
>>
>>> Ok, after some discussions with Josh and Andy (Andy's below), came up
>>> with a proposal for how one might write a stash for re-using data. Just for
>>> clarification, in what sense do you mean a 'queueing' mechanism?
>>>
>>> Create a Stash class of some sort, probably in Puppet::Util, that's a
>>> simple key/value store. That class can be instantiated in specific
>>> resources where it's needed, assuming the resource is a class with a
>>> sufficiently long lifetime. We can also instantiate a global stash, which
>>> is created in lib/puppet/configurer.rb as part of push_context when we're
>>> setting up a run. The Stash class could have a static member that's queried
>>> to get the global version in push_context (if it's available); the parsed
>>> data from /proc/mounts can be added to the context instance of the Stash.
>>>
>>> Andy and my discussion on #puppet-dev today:
>>>
>>>> [16:43:15] *<MichaelSmith>* *+zaphod42*: There's a mailing list thread
>>>> on PUP-3116 that tries to cache the result of reading /proc/mounts
>>>> [16:44:06] *<MichaelSmith>* I'm trying to explore whether there are
>>>> any existing patterns for caching data we re-use during a catalog run.
>>>> [16:45:05] *<MichaelSmith>* Puppet::Util::Storage kind of covers that,
>>>> with the added benefit of logging the cached data, but also the cost of
>>>> writing to PuppetDB.
>>>> [16:46:02] *<MichaelSmith>* And also doesn't work with puppet apply,
>>>> so that's problematic.
>>>> [16:46:51] *<+zaphod42>* Puppet::Util::Storage writes to puppetdb? I
>>>> thought it just wrote to a local file
>>>> [16:47:40] *<+zaphod42>* I think henrik's concern about memory leaks
>>>> really just is about the problems we encounter when the cache is never
>>>> flushed
>>>> [16:47:58] *<+zaphod42>* the data really just needs to have a clear
>>>> lifetime
>>>> [16:48:09] *<MichaelSmith>* Oh, I may be confused about
>>>> Puppet::Util::Storage then.
>>>> [16:48:31] *<+zaphod42>* and based on what I'm seeing, is this really
>>>> a cache? or is it really just about having some "stash" where providers can
>>>> store data during a run?
>>>> [16:49:28] *<MichaelSmith>* It would potentially be refreshed if the
>>>> /proc/mounts gets updated, but that's up to the provider. So just a stash
>>>> makes sense.
>>>> [16:49:37] *<+zaphod42>* MichaelSmith: yeah, Storage just writes to a
>>>> local file
>>>> https://github.com/puppetlabs/puppet/blob/master/lib/puppet/util/storage.rb#L86
>>>> [16:50:36] *<MichaelSmith>* Is using Storage to stash data used during
>>>> a run something that's been discouraged in the past?
>>>> [16:50:44] *<+zaphod42>* MichaelSmith: in which case, I would think
>>>> about it as providing a "stash" method for providers. A very simple thing
>>>> would be it just returns a hash that can be manipulated by the provider
>>>> [16:50:55] *<+zaphod42>* the hash needs to be stored somewhere
>>>> [16:51:15] *<+zaphod42>* that can be handled by the Transaction and it
>>>> can just throw all of the contents away at the end of a run
>>>> [16:51:54] *<MichaelSmith>* Yeah, sounds like a reasonable API to
>>>> write. Puppet::Util::Stash, that's cleared after a run and only stored
>>>> in-memory.
>>>> [16:51:57] *<+zaphod42>* there is also the question about what is the
>>>> scope of the data.
>>>> Does just one resource get to see its own data, is it
>>>> shared across all resources of the same provider, all of the same type, or
>>>> all of the same run
>>>> [16:52:45] *<MichaelSmith>* Do you have ideas how to enforce those
>>>> types of restrictions?
>>>> [16:53:43] *<+zaphod42>* Have different stashes for each set? So for
>>>> every resource it has its own stash, the type has a stash, and the
>>>> transaction has a stash and they are all accessed independently
>>>> [16:54:14] *<+zaphod42>* the biggest problem is threading it through
>>>> the APIs. Ideally they would be something that fits in nicely, but I have a
>>>> feeling it will just be another global somewhere
>>>> [16:54:52] *<MichaelSmith>* I think the tricky part becomes how to
>>>> clear them when we have many isolated stashes.
>>>> [16:54:59] *<MichaelSmith>* So they have to register themselves
>>>> globally somewhere.
>>>> [16:56:05] *<+zaphod42>* or they live as instance variables on some
>>>> objects that get thrown away
>>>> [16:56:18] *<+zaphod42>* so the resource stash is just an instance
>>>> variable on a resource
>>>> [16:56:26] *<+zaphod42>* provider stash is on a provider
>>>> [16:56:41] *<+zaphod42>* (there is a problem there that every resource
>>>> is an instance of a provider)
>>>> [16:56:52] *<+zaphod42>* there isn't a shared provider instance across
>>>> the resources
>>>> [16:58:13] *<+zaphod42>* so one way to do it is have a Stashes object
>>>> that is pushed into the context by the transaction and popped when the
>>>> transaction is done
>>>> [16:58:32] *<MichaelSmith>* This particular example is being used in a
>>>> type, and I don't yet see where it creates a persistent instance object.
>>>> The lifetime might be too short to be useful.
>>>> [16:58:39] *<+zaphod42>* the stashes object holds all of the stashes
>>>> for all of the resources, types, etc (whatever scopes are deemed correct)
>>>> [16:59:18] *<+zaphod42>* in a type....Types are tricky because they
>>>> are shared between the master and the agent
>>>> [17:01:44] *<MichaelSmith>* I'm not quite sure of the implications of
>>>> that. I guess that means lifetime on the master is different.
>>>> [17:05:37] *<+zaphod42>* yeah, how types are used on the master versus
>>>> the agent is different. I can't ever remember all of the details though
>>>> [17:06:40] *<+zaphod42>* but if you put all of the stashes in a
>>>> Stashes instance and put that instance in the Context and then use
>>>> context_push (or better context_override), then it should be fine and not
>>>> have a memory leak
>>>> [17:07:15] *<+zaphod42>* however, it will end up holding onto data
>>>> during a transaction longer than it may need to, thus increasing memory
>>>> usage
>>>> [17:07:23] *<+zaphod42>* but I'm not sure how much of a problem that
>>>> would be
>>>> [17:07:37] *<+zaphod42>* so long as there is some point at which the
>>>> objects will be cleaned up
>>>> [17:08:01] *<MichaelSmith>* Is there any advantage of having a Stashes
>>>> instance that's added via push_context, vs just pushing your hash directly
>>>> to it?
>>>> [17:08:22] *<MichaelSmith>* I guess the ability to add arbitrary keys
>>>> after starting.
>>>> [17:08:44] *<+zaphod42>* push_context would just be where some
>>>> collection of stashes would be held and other things can get to (a global,
>>>> but with more control)
>>>> [17:09:12] *<+zaphod42>* you should still provide an API on the
>>>> resources to get to the stashes, instead of having authors go directly to
>>>> Puppet.lookup
>>>> [17:09:29] *<MichaelSmith>* Yeah, makes sense.
>>>> [17:09:55] *<+zaphod42>* and the other part of the context is that it
>>>> controls the lifetime of the stashes
>>>> [17:10:16] *<+zaphod42>* once the context is popped, the stashes
>>>> disappear
>>>> [17:10:51] *<+zaphod42>* I'd much rather have instances of resources
>>>> and such hold onto their own stashes, but it might be difficult
>>>> [17:11:28] *<+zaphod42>* however, I think you should look into that.
>>>> Only use the context system if there isn't a more local way of controlling
>>>> it
>>>> [17:11:33] *<MichaelSmith>* Yeah... not everything seems to have an
>>>> instance.
>>>> [17:12:13] *<+zaphod42>* which is the sad making part :(
>>>
>>>
>>> On Wed, Dec 17, 2014 at 3:53 PM, Michael Smith <
>>> [email protected]> wrote:
>>>>
>>>> I'm doing my own digging to figure out what seems to make sense.
>>>>
>>>> Josh had mentioned Puppet::push_context, set in the configurer. We push
>>>> and pop context for each apply run; however that's a private API that
>>>> doesn't seem to be meant for general use. Piggybacking on it looks like it
>>>> would get messy.
>>>>
>>>> There's also Puppet::Util::Storage, which superficially looks
>>>> appropriate for this kind of caching (
>>>> http://www.rubydoc.info/gems/puppet/Puppet/Util/Storage). I'm still
>>>> trying to wrap my head around what side-effects might occur.
>>>>
>>>>
>>>> On Tue, Dec 16, 2014 at 6:27 PM, Trevor Vaughan <[email protected]>
>>>> wrote:
>>>>>
>>>>> Part of my other heartburn with using a file was revisited hard upon
>>>>> me as I recalled the original extdata function implementation.
>>>>>
>>>>> In the case of extdata, one large extdata file + a lot of extlookups =
>>>>> massive catalog compile times on the server.
>>>>>
>>>>> So, every time I want to call the cache, across potentially large
>>>>> numbers of providers and/or other things requiring state, I *really* don't
>>>>> want to read a file. Particularly, when I don't know what's going to be in
>>>>> it.
>>>>>
>>>>> In this case, we would have to contend with slower client run times
>>>>> and more CPU overhead as well as disk I/O requirements. Indicating that
>>>>> people should change the way their OS is configured inasmuch as using
>>>>> tmpfs when they may not have this choice does not seem ideal unless, of
>>>>> course, it ships with puppet and doesn't require a system reboot. If, for
>>>>> some reason, I have 50 providers that want to use this, this is 50 file
>>>>> reads and writes that could be avoided.
>>>>>
>>>>> Giving people the choice of Disk vice Memory overhead would be ideal
>>>>> if you want both for some reason.
>>>>>
>>>>> I'm honestly not seeing what would be so bad about scope.cache where
>>>>> cache is some top level Puppet::Cache object that holds hashes that expire
>>>>> at the end of a run. You would have to do things very politely in terms of
>>>>> namespacing but you have to do that anyway.
>>>>>
>>>>> I am, of course, not opposed to saving cache state to disk for
>>>>> debugging purposes, and think that should be an option when the --debug
>>>>> flag is used.
>>>>>
>>>>> Trevor
>>>>>
>>>>> On Tue, Dec 16, 2014 at 7:37 PM, Felix Frank <
>>>>> [email protected]> wrote:
>>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> good points - state retention at whatever granular level would be a
>>>>>> good general purpose tool to have. If it's built in a pervasive fashion
>>>>>> (i.e., any provider might use the cache for whatever it deems
>>>>>> appropriate), it gains added visibility and becomes more opaque to the
>>>>>> user - which is a good thing, and addresses one of the major concerns I'm
>>>>>> having with this. The other being that it needs to be tunable for the
>>>>>> user in some fashion.
>>>>>>
>>>>>> I have no qualms about disk I/O - after all, the user can choose
>>>>>> whatever block backend they want. Users who depend on low latency or need
>>>>>> to save IOPS can employ a tmpfs, for example.
>>>>>>
>>>>>> Cheers,
>>>>>> Felix
>>>>>>
>>>>>> On 12/17/2014 12:56 AM, Trevor Vaughan wrote:
>>>>>>
>>>>>> I'm happy with catalog lifetime.
>>>>>>
>>>>>> I'm really not happy with doing anything that involves disk I/O.
>>>>>>
>>>>>> This would be key to getting providers to be able to save state in a
>>>>>> non-hacky way as well.
>>>>>>
>>>>>> Trevor
>>>>>>
>>>>>> On Tue, Dec 16, 2014 at 6:45 PM, Michael Smith <
>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I don't like any of the ideas I raised, but this will take some
>>>>>>> digging. We need to determine what life-time the cache should have, and
>>>>>>> what interface. I'm leaning towards either a cached read API in the
>>>>>>> FileSystem utilities, or a cache tied to the catalog lifetime.
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Puppet Developers" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/puppet-dev/5490D048.7020702%40Alumni.TU-Berlin.de
>>>>>> <https://groups.google.com/d/msgid/puppet-dev/5490D048.7020702%40Alumni.TU-Berlin.de?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Trevor Vaughan
>>>>> Vice President, Onyx Point, Inc
>>>>> (410) 541-6699
>>>>> [email protected]
>>>>>
>>>>> -- This account not approved for unencrypted proprietary information --
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Puppet Developers" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
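For reference, here is a minimal sketch of the stash idea from the #puppet-dev discussion quoted in this thread: an in-memory key/value store whose lifetime is bounded by a run. It is an illustration under stated assumptions, not Puppet code. Stash, Stashes, RunContext, and with_stashes are hypothetical names, and RunContext only stands in for pushing a Stashes instance into Puppet's context for the duration of a transaction. The /proc/mounts data is a placeholder.

```ruby
# Hypothetical sketch: none of these classes exist in Puppet.

# A simple in-memory key/value store.
class Stash
  def initialize
    @data = {}
  end

  # Return the value stored under key, computing and storing it on first use.
  def fetch(key)
    @data.fetch(key) { @data[key] = yield }
  end
end

# Holds one Stash per scope (for example a type name or a resource reference),
# created lazily the first time a scope is asked for.
class Stashes
  def initialize
    @stashes = Hash.new { |hash, scope| hash[scope] = Stash.new }
  end

  def [](scope)
    @stashes[scope]
  end
end

# Stand-in for the context: the Stashes instance exists only inside the block,
# so everything stashed during a run is dropped when the run ends.
module RunContext
  @current = nil

  def self.stashes
    @current or raise "no run in progress"
  end

  def self.with_stashes
    previous = @current
    @current = Stashes.new
    yield
  ensure
    @current = previous
  end
end

# Usage: two lookups during one run trigger a single expensive read.
reads = 0
parse_mounts = lambda do
  reads += 1
  ["/ ext4", "/boot ext4"] # placeholder for parsed /proc/mounts data
end

RunContext.with_stashes do
  2.times { RunContext.stashes[:mount].fetch(:mounts) { parse_mounts.call } }
end
```

Keeping the stashes behind a small accessor, rather than having providers reach into the context directly, follows the suggestion in the chat log; whether per-resource, per-type, or per-run scopes are the right split is left open there and here.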
