We have a lot of different tools that generate and consume various consolidated JSON files (and lib/whimsy models), all of which are useful representations of underlying ASF organizational data. But... it's not clear which data comes from where, or what the expected formats are.
Many of the generation scripts include documentation in the code, but this is suboptimal for other tool developers who want to figure out the best place to find a list of X or the condensed source for Y.

What would be the best framework to document the data sources and formats, the specific ASF data they mirror, and how the various lib/whimsy models expose this data? Even when a lib/whimsy model provides the data, some tool writers (and the projects.a.o website) will still use the raw /public/*.json files.

A high-level overview already exists, but it doesn't provide enough detail for new tool writers to figure out what to use without digging into several different code files:

https://whimsy.apache.org/test/dataflow.cgi

It would be great to expose the format (array of hashes of hashes, whatever) of each JSON file, along with the specific way its data is collected from the different sources. Ideally the descriptions would be stored alongside the scripts, but exposed all in one place.

Is RDoc worth configuring, or should we just build a simple source tree scanner that looks for one specific tag within .rb files and pulls out only the "data format and sources description"?

For example, for my brand work, I need a list of the names of all software product releases made by TLPs, including the name of the PMC for each (for sorting and linking), but it's not obvious which data file is the easiest one to get this from.

- Shane
  https://www.apache.org/foundation/marks/resources
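
To make the "simple source tree scanner" option a bit more concrete, here is a rough sketch of what it might look like. Everything in it is an assumption for illustration only: the `@dataflow` comment tag, the constant `DATAFLOW_TAG`, and the method names `extract_dataflow_docs` and `scan_tree` are all hypothetical, and nothing in Whimsy defines them today. The idea is that each generation script would carry a tagged comment block describing its output format and sources, and the scanner would collect those blocks into one place.

```ruby
# Hypothetical marker: a comment line such as "# @dataflow public/foo.json"
# opens a description block. This tag is an assumption, not an existing
# Whimsy or RDoc convention.
DATAFLOW_TAG = /^\s*#\s*@dataflow\b\s*(.*)$/

# Return the tagged comment blocks found in one Ruby source string.
# A block starts at a tagged line and continues over the comment lines
# that directly follow it; any non-comment line ends the block.
def extract_dataflow_docs(source)
  docs = []
  current = nil
  source.each_line do |line|
    if (m = DATAFLOW_TAG.match(line))
      docs << current if current
      current = [m[1].strip]
    elsif current && line =~ /^\s*#\s?(.*)$/
      current << $1.rstrip
    elsif current
      docs << current
      current = nil
    end
  end
  docs << current if current
  docs.map { |lines| lines.reject(&:empty?).join("\n") }
end

# Walk a source tree and map each .rb file to its tagged blocks.
# Files without any tagged block are left out of the result.
def scan_tree(root)
  Dir.glob(File.join(root, '**', '*.rb')).sort.each_with_object({}) do |path, found|
    # scrub replaces invalid bytes so the regex matching cannot raise
    text = File.read(path, encoding: 'UTF-8').scrub
    docs = extract_dataflow_docs(text)
    found[path] = docs unless docs.empty?
  end
end
```

Calling `scan_tree('.')` from the top of a checkout would return a hash of file path to description blocks, which a small CGI or a static page generator could then render as a single index. Compared with configuring RDoc, this keeps the descriptions next to the scripts and pulls out only the tagged text, at the cost of inventing our own tag.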