Re: [Puppet-dev] RFC - A specification for module schemas

John Bollinger Mon, 01 Feb 2016 09:04:56 -0800


On Sunday, January 31, 2016 at 2:37:53 PM UTC-6, Corey Osman wrote:


> I think we have strayed off topic here. Being able to validate hiera 
> should be something that can easily be done by anyone no matter which 
> version of puppet they use.
>


I agree that being able to validate Hiera data would be useful for 
everyone, no matter what version of Puppet they rely upon.  I have no beef 
at all with anyone who wants to write tools that have broader version 
support, as opposed to narrower.  I am quite open to discussing what such 
tools might look like, how they might work, and what their inputs and 
outputs might be.

 

>   The core problem is bad data going into hiera and then into puppet.  The 
> consensus is that we all know this is a problem.   While my primary goal was 
> to validate hiera, I think there are other use cases for having an 
> intermediate serialization format of the module's interfaces stored in a 
> file or retrieved dynamically with a puppet face.  
>
>

I agree that bad data is a problem, and a widely recognized one.  Tools and 
procedures for validating Hiera data are an excellent idea, and I am open 
to the possibility that a module schema such as you describe might have 
useful broader applications.  Allowing for such schemata to be obtained 
dynamically seems the forward-looking approach, but it does not have to be 
exclusive of static schemata.

Pragmatically, targeting static schemata first may be the best way to get 
such an effort off the ground. If we sacrifice "good" on the altar of 
"best" then we stand a good chance of being eternally stuck at "meh".

 

> To summarize some of the points discussed:
>
> Building a schema:
>   -  We need a higher level API for gathering module types, parameters, 
> and default values given a module, file, class or parameter
>      - Puppet should provide a way to output this information in a 
> serialized format and pure ruby objects
>         - format should be pluggable with customizable formats (JSON, 
> YAML, Module Schema, .hiera data schema, ..)
>         - should leverage puppet's built in datatypes  
>         - build a hiera data schema based on all the modules in puppet's 
> modules path specific for each puppet environment
>
>

I agree that it would be useful for there to be a mechanism for gathering 
such information from Puppet manifests.  To whatever extent that needs to 
be built in to Puppet itself, it seems unlikely that such a feature would 
appear in any version of Puppet older than the development tip.

As far as pluggable formats go, if you mean *output* formats then I'm 
unconvinced.  Or perhaps I would just componentize differently.  It seems 
to me that a single, flexible form that can serve as a *lingua franca* 
should be the immediate target, and I guess I would choose a Ruby object 
form for that.  If the result is wanted in one or more external formats 
then defining and emitting the needed outputs is a separate problem, and 
likely a much simpler one.
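
To illustrate the componentization I have in mind, here is a minimal 
sketch. It is written in Python rather than Ruby purely for illustration, 
and every name in it (ModuleSchema, ClassSchema, Parameter, to_json, 
to_hiera_keys, the "mymodule" data) is hypothetical; nothing here is an 
existing Puppet or Hiera API. The point is only that the object form is 
the single source of truth and each external format is a small function 
over it.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class Parameter:
    """One class parameter. The type is kept as a plain string here."""
    name: str
    data_type: str = "Any"
    default: Optional[Any] = None
    required: bool = False


@dataclass
class ClassSchema:
    """One class (or defined type) and its parameters."""
    name: str
    parameters: List[Parameter] = field(default_factory=list)


@dataclass
class ModuleSchema:
    """The in-memory form that every emitter reads from."""
    name: str
    classes: List[ClassSchema] = field(default_factory=list)


def to_json(schema: ModuleSchema) -> str:
    """One possible external format: a JSON dump of the object form."""
    return json.dumps(asdict(schema), indent=2, sort_keys=True)


def to_hiera_keys(schema: ModuleSchema) -> Dict[str, str]:
    """Another emitter: map each automatic-lookup key to its declared type."""
    return {
        f"{cls.name}::{param.name}": param.data_type
        for cls in schema.classes
        for param in cls.parameters
    }


# Hypothetical module data, for illustration only.
schema = ModuleSchema(
    name="mymodule",
    classes=[
        ClassSchema(
            name="mymodule",
            parameters=[
                Parameter("port", "Integer", default=8080),
                Parameter("servers", "Array[String]", required=True),
            ],
        )
    ],
)

print(to_json(schema))
print(to_hiera_keys(schema))
```

Adding another output format under this arrangement means writing one 
more function like to_json, without touching whatever gathers the schema.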

As far as *input* formats go, I already opined that the best starting point 
would probably be a static, external schema format, at least for schemata 
that are not prepared programmatically in object format from the 
beginning.  There is perhaps room to support more input formats, but I'm 
not immediately seeing why such support would be more than a tiny win.

 

> Validating data
>   -  Given a hiera data schema, hiera should be able to validate its data, 
> implemented by each backend provider
>       - hiera data schemas are unique to every user
>
>

It's unclear to me how building validation directly into Hiera would gain 
anything if the idea is to rely on schemata gleaned dynamically from 
manifests in the first place.  I don't see how Hiera could be any more 
effective than the catalog builder at detecting bad data at runtime if the 
two are relying on the same (meta)data.  If it isn't any better then 
putting validation into Hiera would just move the point at which certain 
data errors are detected, at the cost of additional processing overhead.

On the other hand, I do think that validating on top of Hiera is better 
than validating the underlying data directly.  Puppet sees the data only 
through the lens of Hiera, and if one is validating for Puppet then one 
wants to rely on the same view of the data that Puppet has.  Moreover, 
validating on top of Hiera is independent of any particular Hiera back 
end.  It may be that endowing Hiera with one or two new capabilities would 
facilitate offline data validation.  For example, one might want to request 
a full dump of all data, so as to look for extraneous / misspelled keys.
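
As a rough sketch of that last check, suppose a full dump of resolved 
data were available as a flat mapping of keys to values, and the set of 
keys the modules actually accept were known from a schema. Both of those 
inputs are assumptions: Hiera does not provide such a dump today as far 
as I know, and find_unknown_keys is a hypothetical helper, shown in 
Python only for illustration.

```python
from difflib import get_close_matches


def find_unknown_keys(resolved, known_keys):
    """Return {unknown_key: [closest known key, if any]}.

    resolved   -- mapping of key -> value, as seen through the lookup layer
    known_keys -- iterable of keys that some module parameter accepts
    """
    candidates = sorted(known_keys)  # sorted for deterministic suggestions
    report = {}
    for key in sorted(resolved):
        if key not in known_keys:
            # An empty suggestion list means the key is merely extraneous;
            # a non-empty one hints at a misspelling.
            report[key] = get_close_matches(key, candidates, n=1)
    return report


# Hypothetical inputs, for illustration only.
known = {"mymodule::port", "mymodule::servers"}
dump = {
    "mymodule::port": 8080,
    "mymodule::server": ["a.example.com"],  # likely typo for "servers"
    "unrelated::key": 1,                     # not claimed by any module
}

report = find_unknown_keys(dump, known)
for key, suggestion in report.items():
    print(key, "->", suggestion or "no close match")
```

Because the check runs against the resolved view rather than the raw 
files, it would work the same way regardless of which back end supplied 
the data.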

 

> Help not force people to use puppet 4
>   -  Given a module schema, retrofit puppet 3 code with puppet 4 data 
> types into the module's source code
>      - swagger like functionality, with the exception that its updating 
> code
>      - This helps people move from puppet 3 to puppet 4 
>   - Folks who cannot move to puppet 4 immediately can get the best of both 
> worlds with an easier way to migrate to puppet 4
>


Isn't this what P3's future parser is for?  I could see the value of 
validating data against a more detailed schema than can be extracted from 
P3 manifests, but I don't see it as migration assistance.  If the data are 
wrong then that's an inherent problem, not a migration issue.  Migration is 
in fact a solution, of sorts, to that problem, inasmuch as manifests 
written with P4 explicit data types can do a better job of validating data 
and therefore detecting data problems themselves.

 

>
> Module Schema
>   - This was never discussed, what should this look like?  Schemas are 
> necessary whether they are statically or dynamically generated. 
>      example: 
> https://github.com/logicminds/puppet_module_schemas/blob/master/apache_schema.yaml
>


I think an information model would be a better starting place than a 
physical example of a possible schema manifestation.  What kinds of objects 
must the schema be able to represent?  You mentioned several, but it seems 
that only a couple of them are represented in the YAML data you linked.  
What attributes must each type of object have?


Overall, I think this idea has considerable potential, but I am concerned 
that it is somewhat unfocused once one goes beyond the central ideas, and 
that no path to full implementation is mapped out.  I'm inclined to think 
that the most promising way forward would be to embark on the path that 
Hiera itself took: (1) build a tool; (2) prove it useful; (3) get it 
integrated; (4) expand from there.  The "persuade PL up front that it 
should be done" option that you seem to be on now is commendably audacious, 
and it may yet bear fruit, but it seems like a low-percentage play.  If you 
want to continue along that route, though, then it seems like the next step 
might be to prepare an ARM.


John
