I see that Francois is doing some work related to this in https://github.com/apache/arrow/pull/4077
On Fri, Mar 29, 2019 at 11:20 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi,
>
> After doing a little research I took a closer look at the shell scripts in
>
> https://github.com/apache/arrow/tree/master/dev/benchmarking
>
> While these may work for importing the gbenchmark data, the general
> approach seems inflexible to me, and I would recommend rewriting them
> as Python programs to enable better extensibility, finer-grained
> control (e.g. to refine and manipulate the output to be "nicer"), and
> to make it easier to support importing output from different kinds of
> benchmark frameworks.
>
> - Wes
>
> On Fri, Mar 29, 2019 at 10:06 AM Wes McKinney <wesmck...@gmail.com> wrote:
> >
> > hi Areg,
> >
> > On Fri, Mar 29, 2019 at 1:25 AM Melik-Adamyan, Areg
> > <areg.melik-adam...@intel.com> wrote:
> > >
> > > Back to the benchmarking per commit.
> > >
> > > Currently I have fired up a community TeamCity Edition here
> > > http://arrow-publi-1wwtu5dnaytn9-2060566241.us-east-1.elb.amazonaws.com
> > > and a dedicated pool of two Skylake bare-metal machines (Intel(R)
> > > Core(TM) i7-6700 CPU @ 3.40GHz). This can go up to 4 if needed.
> > > The machines are prepared for benchmarking in the following way:
> > > - In BIOS/Setup, power-saving features are disabled
> > > - Machines are locked for access using pam_access
> > > - Max frequency is set through cpupower and in /etc/sysconfig/cpupower
> > > - All services that are not needed are switched off; uptime shows:
> > >   23:15:17 up 26 days, 23:24, 1 user, load average: 0.00, 0.00, 0.00
> > > - Transparent huge pages are set on demand:
> > >   cat /sys/kernel/mm/transparent_hugepage/enabled -> always [madvise] never
> > > - Audit control is switched off: auditctl -e 0
> > > - A memory clean is added to the launch scripts: echo 3 > /proc/sys/vm/drop_caches
> > > - pstate=disable is added to the kernel config
> > >
> > > This config gives a relatively clean, low-noise machine.
> > > Commits to master trigger a build and ctest -L benchmarks; the output
> > > is parsed.
> >
> > When you say "output is parsed", how is that exactly? We don't have
> > any scripts in the repository to do this yet (I have some comments on
> > this below). We also have to collect machine information and insert
> > that into the database. From my perspective we have quite a bit of
> > engineering work on this topic ("benchmark execution and data
> > collection") to do.
> >
> > My team and I have some physical hardware (including an Aarch64 Jetson
> > TX2 machine; it might be interesting to see what the ARM64 results look
> > like) where we'd like to run benchmarks and upload the results also,
> > so we need to write some documentation about how to add a new machine
> > and set up a cron job of some kind.
> >
> > > What is missing:
> > > * Where should our Codespeed database reside? I can fire up a VM and
> > >   put it there, or if you have other preferences let's discuss.
> >
> > Since this isn't ASF-owned infrastructure, it can go anywhere. It
> > would be nice to make backups publicly available.
> >
> > > * What address should it have?
> >
> > The address can be anything, really.
> >
> > > * How do we make it available to all developers? Do we want to
> > >   integrate into CI or not?
> >
> > I'd like to eventually have a bot that we can ask to run a benchmark
> > comparison versus master. Reporting on all PRs automatically might be
> > quite a bit of work (and load on the machines).
> >
> > > * What is the standard benchmark output? I suppose Google Benchmark,
> > >   but let's state that.
> >
> > I thought the idea (based on our past e-mail discussions) was that we
> > would implement benchmark collectors (as programs in the Arrow git
> > repository) for each benchmarking framework, starting with gbenchmark
> > and expanding to include ASV (for Python) and then others.
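As a concrete starting point for such a gbenchmark collector: Google Benchmark can already emit JSON (--benchmark_format=json, or --benchmark_out=<file> with --benchmark_out_format=json), so the Python program Wes describes could begin with something like the sketch below. To be clear, this is only a rough illustration -- the function name and the keys I pull out of the report are placeholders of my own, not anything that exists under dev/benchmarking today.

    # Rough sketch of one collector step: read a JSON report produced by a
    # gbenchmark executable and flatten it into rows for later insertion.
    # The row keys are placeholders, not the actual database schema.
    import json

    def parse_gbenchmark_report(path):
        with open(path) as f:
            report = json.load(f)

        # run metadata gbenchmark records itself (date, num_cpus, ...)
        context = report["context"]
        rows = []
        for bench in report["benchmarks"]:
            rows.append({
                "benchmark_name": bench["name"],
                "iterations": bench["iterations"],
                "real_time": bench["real_time"],
                "cpu_time": bench["cpu_time"],
                "time_unit": bench.get("time_unit", "ns"),
                # throughput counters appear only when a benchmark sets them
                "bytes_per_second": bench.get("bytes_per_second"),
                "items_per_second": bench.get("items_per_second"),
            })
        return context, rows

    # e.g., after: ./some-arrow-benchmark --benchmark_format=json > report.json
    # context, rows = parse_gbenchmark_report("report.json")

Such a program could then loop over whatever "ctest -L benchmarks" (or direct invocation of the benchmark executables) produces and hand the rows to the insertion step.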
> > > * My interest is the C++ benchmarks only for now. Do we need to track
> > >   all benchmarks?
> >
> > Yes, I think we want to be able to run the Python benchmarks too and
> > insert that data. Other languages can implement a benchmark collector
> > to arrange their benchmark data according to the database schema.
> >
> > > * What is the process of adding benchmarks?
> >
> > Normal pull requests (see all the C++ programs that end in
> > "-benchmark.cc"). The benchmark collector / insertion scripts may need
> > to recognize when a benchmark has been run for the first time (I
> > haven't looked closely enough at the schema to see if there are any
> > primary keys associated with a particular benchmark name).
> >
> > > Anything else for the short term?
> >
> > It seems like writing the benchmark collector script that runs the
> > benchmarks, collects machine information, and inserts the data into an
> > instance of the database is the next milestone. Until that's done it
> > seems difficult to do much else.
> >
> > > -Areg.
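On the "collects machine information, and inserts the data into an instance of the database" half of that milestone, here is an equally rough sketch. I'm assuming the database is PostgreSQL and using a made-up machine table with placeholder column names; the real definitions would come from the schema under dev/benchmarking, which I have not reproduced here.

    # Rough sketch: gather basic machine information and record the machine
    # in a Postgres-backed benchmark database. Table and column names are
    # placeholders, not the actual dev/benchmarking schema.
    import os
    import platform

    import psycopg2  # assumes a PostgreSQL instance

    def collect_machine_info():
        return {
            "hostname": platform.node(),
            "os_name": platform.system(),
            "kernel": platform.release(),
            "arch": platform.machine(),
            "cpu_model": platform.processor(),
            "num_cores": os.cpu_count(),
        }

    def register_machine(dsn, info):
        with psycopg2.connect(dsn) as conn:
            with conn.cursor() as cur:
                cur.execute(
                    """
                    -- placeholder table; assumes a unique constraint on hostname
                    INSERT INTO machine
                        (hostname, os_name, kernel, arch, cpu_model, num_cores)
                    VALUES
                        (%(hostname)s, %(os_name)s, %(kernel)s, %(arch)s,
                         %(cpu_model)s, %(num_cores)s)
                    ON CONFLICT (hostname) DO NOTHING
                    """,
                    info,
                )

    # register_machine("dbname=benchmarks host=... user=...", collect_machine_info())

The same ON CONFLICT pattern would be one way to handle the "benchmark seen for the first time" question above, once we know which columns the schema actually keys on.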