Re: [Puppet-dev] PUP-3116 AKA Global Queues

Trevor Vaughan Thu, 18 Dec 2014 14:24:57 -0800

Heh, I can do either one. The former will happen when I get free time :-|.
The latter can happen whenever anyone else has any.


Trevor

On Thu, Dec 18, 2014 at 4:18 PM, Michael Smith <[email protected]
> wrote:
>
> Just to be clear, are you going to try that approach out yourself, or are
> you asking for help implementing it?
>
> On Thu, Dec 18, 2014 at 7:07 AM, Trevor Vaughan <[email protected]>
> wrote:
>
>> +1 to all of this. It all makes sense and I think it'll solve all of the
>> use cases that I can think of.
>>
>> On Wed, Dec 17, 2014 at 8:34 PM, Michael Smith <
>> [email protected]> wrote:
>>
>>> Ok, after some discussions with Josh and Andy (Andy's below), came up
>>> with a proposal for how one might write a stash for re-using data. Just for
>>> clarification, in what sense do you mean a 'queueing' mechanism?
>>>
>>> Create a Stash class of some sort, probably in Puppet::Util, that's a
>>> simple key/value store. That class can be instantiated in specific
>>> resources where it's needed, assuming the resource is a class with a
>>> sufficiently long lifetime. We can also instantiate a global stash, which
>>> is created in lib/puppet/configurer.rb as part of push_context when we're
>>> setting up a run. The Stash class could have a static member that's queried
>>> to get the global version in push_context (if it's available); the parsed
>>> data from /proc/mounts can be added to the context instance of the Stash.
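A minimal sketch of the key/value stash proposed above, assuming a plain in-memory hash. The class name `Stash`, its methods, and the `parse_mounts` helper are hypothetical illustrations and are not existing Puppet API; the sample text stands in for the contents of /proc/mounts.

```ruby
# Hypothetical sketch: Puppet::Util::Stash does not exist in Puppet.
class Stash
  def initialize
    @data = {}
  end

  # Return the stored value for key, computing and storing it on first use.
  def fetch(key)
    return @data[key] if @data.key?(key)

    @data[key] = yield
  end

  def [](key)
    @data[key]
  end

  def []=(key, value)
    @data[key] = value
  end

  def key?(key)
    @data.key?(key)
  end

  # Throw away everything, e.g. at the end of a run.
  def clear
    @data.clear
  end
end

# Hypothetical helper: parse text in the /proc/mounts layout into hashes.
def parse_mounts(text)
  text.each_line.reject { |line| line.strip.empty? }.map do |line|
    device, mount_point, fs_type, options = line.split
    { device: device, mount_point: mount_point, fs_type: fs_type,
      options: options.to_s.split(',') }
  end
end

# The block runs only on the first fetch; later fetches reuse the parsed data.
stash = Stash.new
sample = "proc /proc proc rw,nosuid 0 0\n/dev/sda1 / ext4 rw,relatime 0 0\n"
mounts = stash.fetch(:mounts) { parse_mounts(sample) }
```

Because the stash is only a wrapper around a hash, its lifetime is whatever the lifetime of the object holding it is; nothing here decides when `clear` gets called.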
>>>
>>> Andy and my discussion on #puppet-dev today:
>>>
>>>> [16:43:15] *<MichaelSmith>* *+zaphod42*: There's a mailing list thread
>>>> on PUP-3116 that tries to cache the result of reading /proc/mounts
>>>> [16:44:06] *<MichaelSmith>* I'm trying to explore whether there are
>>>> any existing patterns for caching data we re-use during a catalog run.
>>>> [16:45:05] *<MichaelSmith>* Puppet::Util::Storage kind of covers that,
>>>> with the added benefit of logging the cached data, but also the cost of
>>>> writing to PuppetDB.
>>>> [16:46:02] *<MichaelSmith>* And also doesn't work with puppet apply,
>>>> so that's problematic.
>>>> [16:46:51] *<+zaphod42>* Puppet::Util::Storage writes to puppetdb? I
>>>> thought it just wrote to a local file
>>>> [16:47:40] *<+zaphod42>* I think henrik's concern about memory leaks
>>>> really just is about the problems we encounter when the cache is never
>>>> flushed
>>>> [16:47:58] *<+zaphod42>* the data really just needs to have a clear
>>>> lifetime
>>>> [16:48:09] *<MichaelSmith>* Oh, I may be confused about
>>>> Puppet::Util::Storage then.
>>>> [16:48:31] *<+zaphod42>* and based on what I'm seeing, is this really
>>>> a cache? or is it really just about having some "stash" where providers can
>>>> store data during a run?
>>>> [16:49:28] *<MichaelSmith>* It would potentially be refreshed if the
>>>> /proc/mounts gets updated, but that's up to the provider. So just a stash
>>>> makes sense.
>>>> [16:49:37] *<+zaphod42>* MichaelSmith: yeah, Storage just writes to a
>>>> local file
>>>> https://github.com/puppetlabs/puppet/blob/master/lib/puppet/util/storage.rb#L86
>>>> [16:50:36] *<MichaelSmith>* Is using Storage to stash data used during
>>>> a run something that's been discouraged in the past?
>>>> [16:50:44] *<+zaphod42>* MichaelSmith: in which case, I would think
>>>> about it as providing a "stash" method for providers. A very simple thing
>>>> would be it just returns a hash that can be manipulated by the provider
>>>> [16:50:55] *<+zaphod42>* the hash needs to be stored somewhere
>>>> [16:51:15] *<+zaphod42>* that can be handled by the Transaction and it
>>>> can just throw all of the contents away at the end of a run
>>>> [16:51:54] *<MichaelSmith>* Yeah, sounds like a reasonable API to
>>>> write. Puppet::Util::Stash, that's cleared after a run and only stored
>>>> in-memory.
>>>> [16:51:57] *<+zaphod42>* there is also the question about what is the
>>>> scope of the data. Does just one resource get to see its own data, is it
>>>> shared across all resources of the same provider, all of the same type, or
>>>> all of the same run
>>>> [16:52:45] *<MichaelSmith>* Do you have ideas how to enforce those
>>>> types of restrictions?
>>>> [16:53:43] *<+zaphod42>* Have different stashes for each set? So for
>>>> every resource it has its own stash, the type has a stash, and the
>>>> transaction has a stash and they are all accessed independently
>>>> [16:54:14] *<+zaphod42>* the biggest problem is threading it through
>>>> the APIs. Ideally they would be something that fits in nicely, but I have a
>>>> feeling it will just be another global somewhere
>>>> [16:54:52] *<MichaelSmith>* I think the tricky part becomes how to
>>>> clear them when we have many isolated stashes.
>>>> [16:54:59] *<MichaelSmith>* So they have to register themselves
>>>> globally somewhere.
>>>> [16:56:05] *<+zaphod42>* or they live as instance variables on some
>>>> objects that get thrown away
>>>> [16:56:18] *<+zaphod42>* so the resource stash is just an instance
>>>> variable on a resource
>>>> [16:56:26] *<+zaphod42>* provider stash is on a provider
>>>> [16:56:41] *<+zaphod42>* (there is a problem there that every resource
>>>> is an instance of a provider)
>>>> [16:56:52] *<+zaphod42>* there isn't a shared provider instance across
>>>> the resources
>>>> [16:58:13] *<+zaphod42>* so one way to do it is have a Stashes object
>>>> that is pushed into the context by the transaction and popped when the
>>>> transaction is done
>>>> [16:58:32] *<MichaelSmith>* This particular example is being used in a
>>>> type, and I don't yet see where it creates a persistent instance object.
>>>> The lifetime might be too short to be useful.
>>>> [16:58:39] *<+zaphod42>* the stashes object holds all of the stashes
>>>> for all of the resources, types, etc (whatever scopes are deemed correct)
>>>> [16:59:18] *<+zaphod42>* in a type....Types are tricky because they
>>>> are shared between the master and the agent
>>>> [17:01:44] *<MichaelSmith>* I'm not quite sure of the implications of
>>>> that. I guess that means lifetime on the master is different.
>>>> [17:05:37] *<+zaphod42>* yeah, how types are used on the master versus
>>>> the agent is different. I can't ever remember all of the details though
>>>> [17:06:40] *<+zaphod42>* but if you put all of the stashes in a
>>>> Stashes instance and put that instance in the Context and then use
>>>> context_push (or better context_override), then it should be fine and not
>>>> have a memory leak
>>>> [17:07:15] *<+zaphod42>* however, it will end up holding onto data
>>>> during a transaction longer than it may need to, thus increasing memory
>>>> usage
>>>> [17:07:23] *<+zaphod42>* but I'm not sure how much of a problem that
>>>> would be
>>>> [17:07:37] *<+zaphod42>* so long as there is some point at which the
>>>> objects will be cleaned up
>>>> [17:08:01] *<MichaelSmith>* Is there any advantage of having a Stashes
>>>> instance that's added via push_context, vs just pushing your hash directly
>>>> to it?
>>>> [17:08:22] *<MichaelSmith>* I guess the ability to add arbitrary keys
>>>> after starting.
>>>> [17:08:44] *<+zaphod42>* push_context would just be where some
>>>> collection of stashes would be held and other things can get to (a global,
>>>> but with more control)
>>>> [17:09:12] *<+zaphod42>* you should still provide an API on the
>>>> resources to get to the stashes, instead of having authors go directly to
>>>> Puppet.lookup
>>>> [17:09:29] *<MichaelSmith>* Yeah, makes sense.
>>>> [17:09:55] *<+zaphod42>* and the other part of the context is that it
>>>> controls the lifetime of the stashes
>>>> [17:10:16] *<+zaphod42>* once the context is popped, the stashes
>>>> disappear
>>>> [17:10:51] *<+zaphod42>* I'd much rather have instances of resources
>>>> and such hold onto their own stashes, but it might be difficult
>>>> [17:11:28] *<+zaphod42>* however, I think you should look into that.
>>>> Only use the context system if there isn't a more local way of controlling
>>>> it
>>>> [17:11:33] *<MichaelSmith>* Yeah... not everything seems to have an
>>>> instance.
>>>> [17:12:13] *<+zaphod42>* which is the sad making part :(
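The idea discussed above (a Stashes instance holding one stash per scope, pushed into the context for a run and discarded when the context is popped) could be sketched roughly as follows. `RunContext` is a self-contained stand-in for Puppet's context system so the example runs on its own; it is not Puppet code. `Stashes`, `stash_for`, and the `:stashes` binding name are likewise assumptions made for illustration.

```ruby
# Hypothetical sketch: one hash per scope, all owned by a single object.
class Stashes
  def initialize
    # Create the hash for a scope lazily, on first access.
    @by_scope = Hash.new { |hash, scope| hash[scope] = {} }
  end

  # scope is any hashable identifier, e.g. :transaction or [:type, :mount].
  def [](scope)
    @by_scope[scope]
  end
end

# Stand-in for a context stack with push/pop/lookup semantics.
module RunContext
  @stack = [{}]

  def self.push(bindings)
    @stack.push(@stack.last.merge(bindings))
  end

  def self.pop
    raise 'cannot pop the root context' if @stack.size == 1

    @stack.pop
  end

  def self.lookup(name)
    @stack.last[name]
  end

  # Run the block with extra bindings, popping them even if it raises.
  def self.override(bindings)
    push(bindings)
    begin
      yield
    ensure
      pop
    end
  end
end

# The accessor a resource or provider would call, so that authors do not
# go to the context lookup directly.
def stash_for(scope)
  stashes = RunContext.lookup(:stashes)
  raise 'no run in progress' if stashes.nil?

  stashes[scope]
end

# Everything stored inside the block becomes unreachable once it returns.
RunContext.override(stashes: Stashes.new) do
  stash_for([:type, :mount])[:parsed] = ['/', '/proc']
  stash_for(:transaction)[:started] = true
end
```

The point of routing access through one accessor is that the lifetime question is answered in a single place: when the bindings are popped, every scope's stash goes with them, at the cost of holding data for the whole run.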
>>>
>>>
>>> On Wed, Dec 17, 2014 at 3:53 PM, Michael Smith <
>>> [email protected]> wrote:
>>>>
>>>> I'm doing my own digging to figure out what seems to make sense.
>>>>
>>>> Josh had mentioned Puppet::push_context, set in the configurer. We push
>>>> and pop context for each apply run; however that's a private API that
>>>> doesn't seem to be meant for general use. Piggybacking on it looks like it
>>>> would get messy.
>>>>
>>>> There's also Puppet::Util::Storage, which superficially looks
>>>> appropriate for this kind of caching (
>>>> http://www.rubydoc.info/gems/puppet/Puppet/Util/Storage). I'm still
>>>> trying to wrap my head around what side-effects might occur.
>>>>
>>>>
>>>> On Tue, Dec 16, 2014 at 6:27 PM, Trevor Vaughan <[email protected]
>>>> > wrote:
>>>>>
>>>>> Part of my other heartburn with using a file was revisited hard upon
>>>>> me as I recalled the original extdata function implementation.
>>>>>
>>>>> In the case of extdata, one large extdata file + a lot of extlookups =
>>>>> massive catalog compile times on the server.
>>>>>
>>>>> So, every time I want to call the cache, across potentially large
>>>>> numbers of providers and/or other things requiring state, I *really* don't
>>>>> want to read a file. Particularly, when I don't know what's going to be in
>>>>> it.
>>>>>
>>>>> In this case, we would have to contend with slower client run times
>>>>> and more CPU overhead as well as disk I/O requirements. Indicating that
>>>>> people should change the way their OS is configured inasmuch as using
>>>>> tmpfs when they may not have this choice does not seem ideal unless, of
>>>>> course, it ships with puppet and doesn't require a system reboot. If, for
>>>>> some reason, I have 50 providers that want to use this, this is 50 file
>>>>> reads and writes that could be avoided.
>>>>>
>>>>> Giving people the choice of Disk versus Memory overhead would be ideal
>>>>> if you want both for some reason.
>>>>>
>>>>> I'm honestly not seeing what would be so bad about scope.cache where
>>>>> cache is some top level Puppet::Cache object that holds hashes that expire
>>>>> at the end of a run. You would have to do things very politely in terms of
>>>>> namespacing but you have to do that anyway.
>>>>>
>>>>> I am, of course, not opposed to saving cache state to disk for
>>>>> debugging purposes, and think that should be an option when the --debug
>>>>> flag is used.
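A rough sketch of the run-scoped, namespaced cache described above, with an optional dump for debugging. `RunCache` and its methods are hypothetical; `Puppet::Cache` and `scope.cache` are proposals from this thread, not existing API, and how the dump would be tied to the --debug flag is left open here.

```ruby
require 'json'

# Hypothetical sketch of an in-memory cache that lives for one run.
class RunCache
  def initialize
    @namespaces = {}
  end

  # Each caller works in its own hash, keyed by a name it owns
  # (for example its provider name), so callers do not collide.
  def namespace(name)
    @namespaces[name.to_s] ||= {}
  end

  # Write the current contents to io as JSON, for inspection when debugging.
  def dump(io)
    io.puts(JSON.pretty_generate(@namespaces))
  end

  # Drop everything; intended to be called once at the end of a run.
  def expire!
    @namespaces.clear
  end
end

cache = RunCache.new
cache.namespace(:mount)['parsed'] = ['/', '/proc']
cache.dump($stderr) if ENV['RUN_CACHE_DEBUG'] # env var name is an assumption
cache.expire!
```

Normal operation never touches the disk; writing state out is an explicit, opt-in step.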
>>>>>
>>>>> Trevor
>>>>>
>>>>> On Tue, Dec 16, 2014 at 7:37 PM, Felix Frank <
>>>>> [email protected]> wrote:
>>>>>>
>>>>>>  Hey,
>>>>>>
>>>>>> good points - state retention at whatever granular level would be a
>>>>>> good general purpose tool to have. If it's built in a pervasive fashion
>>>>>> (i.e., any provider might use the cache for whatever it deems
>>>>>> appropriate), it gains added visibility and becomes more opaque to the
>>>>>> user - which is a good thing, and addresses one of the major concerns I'm
>>>>>> having with this. The other being that it needs to be tunable for the
>>>>>> user in some fashion.
>>>>>>
>>>>>> I have no qualms about disk I/O - after all, the user can choose
>>>>>> whatever block backend they want. Users who depend on low latency or need
>>>>>> to save IOPS can employ a tmpfs, for example.
>>>>>>
>>>>>> Cheers,
>>>>>> Felix
>>>>>>
>>>>>> On 12/17/2014 12:56 AM, Trevor Vaughan wrote:
>>>>>>
>>>>>>  I'm happy with catalog lifetime.
>>>>>>
>>>>>> I'm really not happy with doing anything that involves disk I/O.
>>>>>>
>>>>>>  This would be key to getting providers to be able to save state in a
>>>>>> non-hacky way as well.
>>>>>>
>>>>>> Trevor
>>>>>>
>>>>>> On Tue, Dec 16, 2014 at 6:45 PM, Michael Smith <
>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>> I don't like any of the ideas I raised, but this will take some
>>>>>>> digging. We need to determine what life-time the cache should have, and
>>>>>>> what interface. I'm leaning towards either a cached read API in the
>>>>>>> FileSystem utilities, or a cache tied to the catalog lifetime.
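One way to picture the "cached read API" option mentioned above is a reader that memoizes file contents per path and re-reads only when the file's modification time changes. This is a sketch under assumptions: `CachedReader` is a hypothetical name and not part of Puppet::FileSystem, and whether mtime is a useful change signal for a file such as /proc/mounts is not established here.

```ruby
# Hypothetical sketch of a cached file read keyed by path and mtime.
class CachedReader
  Entry = Struct.new(:mtime, :content)

  def initialize
    @entries = {}
  end

  # Return the file's content, re-reading only when its mtime has changed.
  def read(path)
    mtime = File.mtime(path)
    entry = @entries[path]
    if entry.nil? || entry.mtime != mtime
      entry = Entry.new(mtime, File.read(path))
      @entries[path] = entry
    end
    entry.content
  end

  # Forget all cached content.
  def clear
    @entries.clear
  end
end
```

Each call still costs a stat of the file, so this trades repeated reads and parses for a cheaper check; the alternative raised in the same message, a cache tied to the catalog lifetime, avoids even that by not revalidating during a run.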
>>>>>>>
>>>>>>
>>>>>>  --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Puppet Developers" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/puppet-dev/5490D048.7020702%40Alumni.TU-Berlin.de
>>>>>> <https://groups.google.com/d/msgid/puppet-dev/5490D048.7020702%40Alumni.TU-Berlin.de?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Trevor Vaughan
>>>>> Vice President, Onyx Point, Inc
>>>>> (410) 541-6699
>>>>> [email protected]
>>>>>
>>>>> -- This account not approved for unencrypted proprietary information --
>>>>>
>>>>>
>>>
>>
>>
>>
>


-- 
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
[email protected]

-- This account not approved for unencrypted proprietary information --

