On Sunday, January 31, 2016 at 2:37:53 PM UTC-6, Corey Osman wrote:
> I think we have strayed off topic here. Being able to validate hiera
> should be something that can easily be done by anyone no matter which
> version of puppet they use.

I agree that being able to validate Hiera data would be useful for everyone, no matter what version of Puppet they rely upon. I have no beef at all with anyone who wants to write tools that have broader version support, as opposed to narrower. I am quite open to discussing what such tools might look like, how they might work, and what their inputs and outputs might be.

> The core problem is bad data going into hiera and then into puppet. The
> consensus is that we all know this is problem. While my primary goal was
> to validate hiera, I think there are other use cases for having an
> intermediate serialization format of the module's interfaces stored in a
> file or retrieved dynamically with a puppet face.

I agree that bad data is a problem, and a widely recognized one. Tools and procedures for validating Hiera data are an excellent idea, and I am open to the possibility that a module schema such as you describe might have useful broader applications.

Allowing for such schemata to be obtained dynamically seems the forward-looking approach, but it does not have to be exclusive of static schemata. Pragmatically, targeting static schemata first may be the best way to get such an effort off the ground. If we sacrifice "good" on the altar of "best" then we stand a good chance of being eternally stuck at "meh".

> To summarize some of the points discussed:
>
> Building a schema:
> - We need a higher level API for gathering module types, parameters,
>   and default values given a module, file, class or parameter
> - Puppet should provide a way to output this information in a
>   serialized format and pure ruby objects
> - format should be pluggable with customizable formats (JSON,
>   YAML, Module Schema, .hiera data schema, ..)
> - should leverage puppet's built in datatypes
> - build a hiera data schema based on all the modules in puppet's
>   modules path specific for each puppet environment

I agree that it would be useful for there to be a mechanism for gathering such information from Puppet manifests. To whatever extent that needs to be built into Puppet itself, it seems unlikely that such a feature would appear in any version of Puppet older than the development tip.

As far as pluggable formats go, if you mean *output* formats then I'm unconvinced. Or perhaps I would just componentize differently. It seems to me that a single, flexible form that can serve as a *lingua franca* should be the immediate target, and I guess I would choose a Ruby object form for that. If the result is wanted in one or more external formats then defining and emitting the needed outputs is a separate problem, and likely a much simpler one.

As far as *input* formats go, I already opined that the best starting point would probably be a static, external schema format, at least for schemata that are not prepared programmatically in object format from the beginning. There is perhaps room to support more input formats, but I'm not immediately seeing why such support would be more than a tiny win.

> Validating data
> - Given a hiera data schema, hiera should be able to validate its data,
>   implemented by each backend provider
> - hiera data schemas are unique to every user

It's unclear to me how building validation directly into Hiera would gain anything if the idea is to rely on schemata gleaned dynamically from manifests in the first place. I don't see how Hiera could be any more effective than the catalog builder at detecting bad data at runtime if the two are relying on the same (meta)data. If it isn't any better then putting validation into Hiera would just move the point at which certain data errors are detected, at the cost of additional processing overhead.

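To make the componentization point above a little more concrete, here is a rough sketch of what I have in mind: one in-memory Ruby form for the schema, with output formats and validation layered on top as separate, small pieces. Everything in it is an assumption for illustration. The Param struct, the helper methods, and the apache::* keys are hypothetical names, not an existing Puppet or Hiera API, and the data hash stands in for whatever a real tool would obtain by resolving keys through Hiera.

```ruby
require 'json'
require 'yaml'

# Hypothetical in-memory schema form: one entry per class parameter.
# These names are illustrative only; none come from Puppet or Hiera.
Param = Struct.new(:key, :type, :default, :required)

SCHEMA = [
  Param.new('apache::port',         Integer, 80,  false),
  Param.new('apache::servername',   String,  nil, true),
  Param.new('apache::default_mods', Array,   [],  false)
].freeze

# Convert the object form to plain hashes so any serializer can emit it.
def schema_as_hashes(schema)
  schema.map do |param|
    { 'key' => param.key, 'type' => param.type.name,
      'default' => param.default, 'required' => param.required }
  end
end

# Emitting an external format is a thin layer over the object form.
def to_json_schema(schema)
  JSON.generate(schema_as_hashes(schema))
end

def to_yaml_schema(schema)
  YAML.dump(schema_as_hashes(schema))
end

# Validate a hash of already-resolved data (key => value) against the schema.
# Returns a list of problem descriptions; an empty list means none were found.
def validate(schema, data)
  schema.each_with_object([]) do |param, errors|
    if !data.key?(param.key)
      errors << "#{param.key}: required but missing" if param.required
    elsif !data[param.key].is_a?(param.type)
      errors << "#{param.key}: expected #{param.type}, got #{data[param.key].class}"
    end
  end
end

# Stand-in for data as Puppet would see it after lookup.
resolved = { 'apache::port' => '8080', 'apache::default_mods' => ['ssl'] }
problems = validate(SCHEMA, resolved)
puts problems
```

With the sample data shown, this reports that apache::port is a String where an Integer was expected and that apache::servername is required but missing. The point is only that the emitters and the validator both consume the same object form, so adding another output format does not touch the validation logic.
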
On the other hand, I do think that validating on top of Hiera is better than validating the underlying data directly. Puppet sees the data only through the lens of Hiera, and if one is validating for Puppet then one wants to rely on the same view of the data that Puppet has. Moreover, validating on top of Hiera is independent of any particular Hiera back end.

It may be that endowing Hiera with one or two new capabilities would facilitate offline data validation. For example, one might want to request a full dump of all data, so as to look for extraneous / misspelled keys.

> Help not force people to use puppet 4
> - Given a module schema, retrofit puppet 3 code with puppet 4 data
>   types into the module's source code
> - swagger like functionality, with the exception that its updating
>   code
> - This helps people move from puppet 3 to puppet 4
> - Folks who cannot move to puppet 4 immediately can get the best of both
>   worlds with a easier way to migrate to puppet 4

Isn't this what P3's future parser is for?

I could see the value of validating data against a more detailed schema than can be extracted from P3 manifests, but I don't see it as migration assistance. If the data are wrong then that's an inherent problem, not a migration issue. Migration is in fact a solution, of sorts, to that problem, inasmuch as manifests written with P4 explicit data types can do a better job of validating data and therefore detecting data problems themselves.

> Module Schema
> - This was never discussed, what should this look like? Schemas are
>   necessary whether they are statically or dynamically generated.
>   example:
>   https://github.com/logicminds/puppet_module_schemas/blob/master/apache_schema.yaml

I think an information model would be a better starting place than a physical example of a possible schema manifestation. What kinds of objects must the schema be able to represent?
You mentioned several, but it seems that only a couple of them are represented in the YAML data you linked. What attributes must each type of object have?

Overall, I think this idea has considerable potential, but I am concerned that it is somewhat unfocused once one goes beyond the central ideas, and that no path to full implementation is mapped out. I'm inclined to think that the most promising way forward would be to embark on the path that Hiera itself took: (1) build a tool; (2) prove it useful; (3) get it integrated; (4) expand from there.

The "persuade PL up front that it should be done" option that you seem to be on now is commendably audacious, and it may yet bear fruit, but it seems like a low-percentage play. If you want to continue along that route, though, then it seems like the next step might be to prepare an ARM.


John