> When you say "output is parsed", how is that exactly? We don't have any
> scripts in the repository to do this yet (I have some comments on this
> below). We also have to collect machine information and insert that into
> the database. From my perspective we have quite a bit of engineering work
> on this topic ("benchmark execution and data collection") to do.

Yes, I wrote one as a test. It can then POST the JSON structure to the needed endpoint. Everything else will be done in the
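For reference, here is a minimal sketch of what that test collector does, assuming the benchmark binary is run with gbenchmark's `--benchmark_format=json` flag. The endpoint URL, the binary name, and the payload shape are placeholders for illustration, not the final API:

```python
#!/usr/bin/env python3
# Sketch of a gbenchmark collector: run a benchmark binary with JSON output,
# then POST the parsed report to the ingestion endpoint.
# The ENDPOINT URL and the payload layout are assumptions, not a fixed API.
import json
import subprocess
import urllib.request

ENDPOINT = "http://localhost:5000/benchmarks"  # hypothetical ingestion endpoint

def run_gbenchmark(binary):
    """Run a Google Benchmark binary and return its parsed JSON report."""
    out = subprocess.run(
        [binary, "--benchmark_format=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(out.stdout)

def post_results(report):
    """POST the benchmark report as JSON to the database service."""
    data = json.dumps(report).encode("utf-8")
    req = urllib.request.Request(
        ENDPOINT, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    # Example binary name only; any gbenchmark executable would work here.
    report = run_gbenchmark("./arrow-builder-benchmark")
    print("POST returned", post_results(report))
```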
> My team and I have some physical hardware (including an Aarch64 Jetson TX2
> machine, might be interesting to see what the ARM64 results look like) where
> we'd like to run benchmarks and upload the results also, so we need to write
> some documentation about how to add a new machine and set up a cron job of
> some kind.

If it can run Linux, then we can set it up (see the machine-information sketch at the end of this reply).

> I'd like to eventually have a bot that we can ask to run a benchmark
> comparison versus master. Reporting on all PRs automatically might be quite a
> bit of work (and load on the machines)

You should be able to choose the comparison between any two points: master vs. a PR, master now vs. master yesterday, etc.

> I thought the idea (based on our past e-mail discussions) was that we would
> implement benchmark collectors (as programs in the Arrow git repository) for
> each benchmarking framework, starting with gbenchmark and expanding to
> include ASV (for Python) and then others

I'll open a PR and I'm happy to put it into Arrow.

> It seems like writing the benchmark collector script that runs the
> benchmarks, collects machine information, and inserts data into an instance
> of the database is the next milestone. Until that's done it seems difficult
> to do much else

OK, I will update JIRA 5070 and link 5071. Thanks.
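For the machine-information part of that collector, a rough sketch using only the Python standard library (the field names are illustrative, not the final database schema). A cron entry on each machine would then simply invoke the collector on a schedule:

```python
import platform
import os

def machine_info():
    """Collect basic host information to attach to each benchmark run.
    The field names below are assumptions, not a fixed schema."""
    return {
        "hostname": platform.node(),
        "architecture": platform.machine(),  # e.g. "x86_64" or "aarch64"
        "os": platform.system(),
        "os_release": platform.release(),
        "cpu_count": os.cpu_count(),
    }

if __name__ == "__main__":
    print(machine_info())
```

On an Aarch64 box such as the Jetson TX2, `platform.machine()` should report `aarch64`, so ARM64 results can be distinguished from x86 results in the database.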