On 23 March 2018 at 22:14, Shane Curcuru <a...@shanecurcuru.org> wrote:
> A question was raised today about how we check if bits of organizational
> data that Whimsy processes are consistent or valid.
>
> Obviously Whimsy itself is not the canonical source of data; we usually
> suck it in from Infra-supported tools and simply cache it in more
> convenient forms.  But it might be interesting - and allow for
> experimentation - to add some data integrity checking into whimsy
> tooling.  This would be a best-effort warning of data issues, not a
> comprehensive solution.
>
> Is this interesting enough that people want to work on it?  If so, what
> would be a minimum interesting check to add for our main data?
>
>   https://whimsy.apache.org/public/
>
> The framework that occurs to me is to add any simple data check methods
> inside the various /www/roster/public*.rb scripts that are the cron jobs
> that create /public/*.json files.
>
> We could add a validate_data(json) method to most that - after the
> normal processing is complete - could do any checking desired.  If a
> problem is found, then call a variant of public_json_common.rb
> sendMail() that sends an alert about the issue.
>
> Sound useful?

Some of the public JSON jobs already do some checks.
For example, one verifies that LDAP group members are also in the
LDAP people group.
Here is a sample warning:

https://lists.apache.org/thread.html/3a0fd03a64cee0c9f5773b17d749d5e3fe33ea8d6c9e75d3372fe13c@%3Cnotifications.whimsical.apache.org%3E
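
As a rough illustration of that kind of check (not the actual Whimsy code), a membership test could look something like the sketch below. The helper name find_unknown_members and the sample group and people data are made up for the example; the real jobs read this data from LDAP.

```ruby
require 'set'

# Hypothetical helper, not part of Whimsy.
# Returns a hash of { group_name => [member ids missing from people] }.
def find_unknown_members(groups, people)
  known = people.to_set
  unknown = {}
  groups.each do |name, members|
    missing = members.reject { |id| known.include?(id) }
    unknown[name] = missing unless missing.empty?
  end
  unknown
end

# Made-up sample data standing in for the LDAP groups and people.
groups = { 'alpha' => %w[alice bob], 'beta' => %w[carol dave] }
people = %w[alice bob carol]

# Report each group that lists someone who is not in the people list.
find_unknown_members(groups, people).each do |name, ids|
  warn "WARN: group #{name} has members not in people: #{ids.join(', ')}"
end
```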

It's not always easy to separate the checks from the processing, so
having a separate validate_data function may not always be the best
solution.

In the case of public_ldap_auth_groups, the check is done at the end
of the run, but only if the output has changed.
This avoids generating too many warnings.
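
A minimal sketch of the "only check when the output has changed" idea, assuming the previous and new output are available as JSON strings. The helper name check_if_changed, the sample data, and the empty-group check are all hypothetical and are not taken from public_ldap_auth_groups.

```ruby
require 'json'

# Hypothetical helper, not part of Whimsy.
# Runs the given check block only when the new output differs from the
# previous output, and returns whatever warnings the block produces.
# Returns an empty list, without running the block, when nothing changed.
def check_if_changed(previous_json, new_json)
  return [] if previous_json == new_json
  yield JSON.parse(new_json)
end

# Made-up previous and current output of a job.
old_out = JSON.generate('groups' => { 'alpha' => %w[alice] })
new_out = JSON.generate('groups' => { 'alpha' => %w[alice], 'beta' => [] })

# Example check: flag groups that have no members.
warnings = check_if_changed(old_out, new_out) do |data|
  data['groups'].select { |_name, members| members.empty? }
                .keys
                .map { |name| "group #{name} is empty" }
end

warnings.each { |w| warn "WARN: #{w}" }
```

Comparing the serialized output before checking means an unchanged problem is not reported again on every cron run.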


> --
>
> - Shane
>   Director & Member
>   The Apache Software Foundation
