> Side question: is it expected to be able to connect to the DB directly from the outside? I don't have any clue about the possible security implications.
This is doable by creating different database accounts. Also, Wes's solution was to back up the database periodically (daily?) to protect against accidents. The current setup has a root user (full permissions), an `arrow_anonymous` user (select + insert only), and an `arrow_admin` user (select, insert, update, delete); a sketch of how these grants might look appears at the end of this message.

On Wed, Feb 20, 2019 at 12:19 PM Antoine Pitrou <anto...@python.org> wrote:

> Side question: is it expected to be able to connect to the DB directly from the outside? I don't have any clue about the possible security implications.
>
> Regards
>
> Antoine.
>
> Le 20/02/2019 à 18:55, Melik-Adamyan, Areg a écrit :
>> There is a lot of discussion going on in the PR for ARROW-4313 itself; I would like to bring some of the high-level questions here to discuss. First of all, many thanks to Tanya for the work you are doing.
>>
>> Related to the dashboard itself, I would like to set a scope and stick to it, so that we do not waste any work and get maximum efficiency from the effort we are putting into the dashboard.
>>
>> One thing that IMHO we are missing is the set of requirements the work (the DDL) is being done against, and its scope. For me there are several things:
>>
>> 1. We want continuous *validated* performance tracking against check-ins to catch performance regressions and progressions. Validated means that the running environment is isolated enough that the stddev (assuming the distribution is normal) is as close to 0 as possible. Both hardware and software should be fixed and unchangeable, so there is only one variable to measure.
>>
>> 2. The unit-test framework (google/benchmark) can report the needed benchmark data in textual format, with a preamble containing information about the machine on which the benchmarks are run.
>>
>> 3. So with the environments set up and regular runs, you have all the artifacts, though not in a very comprehensible format. The reason to set up a dashboard is to make that data consumable and to track the performance of the various parts from a historical perspective, much more nicely and with visualizations.
>>
>> And here are the scope restrictions I have in mind:
>>
>> - Disallow entering any single benchmark run into the central repo, as single runs do not mean much when the goal is continuous and statistically relevant measurement. What information do you get if someone reports a single run? You do not know how cleanly it was done, and more importantly whether it can be reproduced elsewhere. That is why, whether it is better, worse, or the same, you cannot compare it with the data already in the DB.
>>
>> - Mandate that contributors have a dedicated environment for measurements. Otherwise they can use TeamCity to run and parse the data and publish it on their own site. Data that enters the Arrow performance DB becomes Arrow community-owned data, and it becomes the community's job to answer why certain things are better or worse.
>>
>> - Because the number of flavors of CPUs/GPUs/accelerators is huge, we cannot satisfy all needs upfront and create a DB that covers all possible variants. I think we should have simple CPU and GPU configs now, even if they are not perfect. By simple I mean the basic brand string; that should be enough. Having all the detailed info in the DB does not make sense: in my experience you never use it, you use the CPUID/brand name to get the info you need.
>>
>> - Scope and requirements will change over time, and going big now will make things complicated later.
>> So I think it will be beneficial to have something quick up and running, get a better understanding of our needs and gaps, and go from there.
>>
>> The needed infra is already up on AWS, so as soon as we resolve the DNS and key-exchange issues we can launch.
>>
>> -Areg.
>>
>> -----Original Message-----
>> From: Tanya Schlusser [mailto:ta...@tickel.net]
>> Sent: Thursday, February 7, 2019 4:40 PM
>> To: dev@arrow.apache.org
>> Subject: Re: Benchmarking dashboard proposal
>>
>> Late, but there's a PR now with first-draft DDL (https://github.com/apache/arrow/pull/3586). Happy to receive any feedback!
>>
>> I tried to think about how people would submit benchmarks, and added a Postgraphile container for HTTP via GraphQL. If others have strong opinions on the data modeling, please speak up, because I'm more a database user than a designer.
>>
>> I can also help with benchmarking work in R/Python given guidance/a roadmap/examples from someone else.
>>
>> Best,
>> Tanya
>>
>> On Mon, Feb 4, 2019 at 12:37 PM Tanya Schlusser <ta...@tickel.net> wrote:
>>
>>> I hope to make a PR with the DDL by tomorrow or Wednesday night—DDL along with a README in a new directory `arrow/dev/benchmarking` unless directed otherwise.
>>>
>>> A "C++ Benchmark Collector" script would be super. I expect some back-and-forth on this to identify naïve assumptions in the data model.
>>>
>>> Attempting to submit actual benchmarks is how to get a handle on that. I recognize I'm blocking downstream work. Better to get an initial PR and some discussion going.
>>>
>>> Best,
>>> Tanya
>>>
>>> On Mon, Feb 4, 2019 at 10:10 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>
>>>> hi folks,
>>>>
>>>> I'm curious where we currently stand on this project. I see the discussion in https://issues.apache.org/jira/browse/ARROW-4313 -- would the next step be to have a pull request with .sql files containing the DDL required to create the schema in PostgreSQL?
>>>>
>>>> I could volunteer to write the "C++ Benchmark Collector" script that will run all the benchmarks on Linux and collect their data to be inserted into the database.
>>>>
>>>> Thanks
>>>> Wes
>>>>
>>>> On Sun, Jan 27, 2019 at 12:20 AM Tanya Schlusser <ta...@tickel.net> wrote:
>>>>
>>>>> I don't want to be the bottleneck and have posted an initial draft data model in the JIRA issue https://issues.apache.org/jira/browse/ARROW-4313
>>>>>
>>>>> It should not be a problem to get the content into a form that would be acceptable for either a static site like ASV (via CORS queries to a GraphQL/REST interface) or a codespeed-style site (via a separate schema organized for Django).
>>>>>
>>>>> I don't think I'm experienced enough to actually write any benchmarks, though, so all I can contribute is backend work for this task.
>>>>>
>>>>> Best,
>>>>> Tanya
>>>>>
>>>>> On Sat, Jan 26, 2019 at 7:37 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>
>>>>>> hi folks,
>>>>>>
>>>>>> I'd like to propose some kind of timeline for getting a first iteration of a benchmark database developed and live, with scripts to enable one or more initial agents to start adding new data on a daily / per-commit basis. I have at least 3 physical machines where I could immediately set up cron jobs to start adding new data, and I could attempt to backfill data as far back as possible.
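
To make the "C++ Benchmark Collector" idea mentioned above a little more concrete, here is a minimal sketch, assuming the C++ benchmarks are built with google/benchmark (so they accept --benchmark_format=json). The BENCHMARK_DIR path and the *-benchmark naming pattern are placeholders, not the actual build layout.

    #!/usr/bin/env python3
    """Sketch of a benchmark collector: run google/benchmark binaries and
    bundle their JSON reports with basic machine information."""
    import json
    import platform
    import subprocess
    from datetime import datetime, timezone
    from pathlib import Path

    # Placeholder location of the built C++ benchmark executables.
    BENCHMARK_DIR = Path("cpp/build/release")


    def run_one(binary: Path) -> dict:
        """Run a single google/benchmark binary and parse its JSON report."""
        # --benchmark_format=json makes google/benchmark print JSON to stdout.
        out = subprocess.run(
            [str(binary), "--benchmark_format=json"],
            check=True, capture_output=True, text=True,
        )
        return json.loads(out.stdout)


    def collect() -> dict:
        """Run every *-benchmark binary and collect results plus machine info."""
        results = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "machine": platform.node(),
            "cpu": platform.processor(),
            "runs": [],
        }
        for binary in sorted(BENCHMARK_DIR.glob("*-benchmark")):
            results["runs"].append({"suite": binary.name, "report": run_one(binary)})
        return results


    if __name__ == "__main__":
        print(json.dumps(collect(), indent=2))

The JSON this produces could be archived as-is per run, or handed to the ingestion step sketched further down.
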
>>>>>> Personally, I would like to see this done by the end of February if not sooner -- if we don't have the volunteers to push the work to completion by then, please let me know, as I will rearrange my priorities to make sure that it happens. Does that sound reasonable?
>>>>>>
>>>>>> Please let me know if this plan sounds reasonable:
>>>>>>
>>>>>> * Set up a hosted PostgreSQL instance, configure backups
>>>>>> * Propose and adopt a database schema for storing benchmark results
>>>>>> * For C++, write a script (or Dockerfile) to execute all google-benchmarks and output results to JSON, then an adapter script (Python) to ingest them into the database
>>>>>> * For Python, a similar script that invokes ASV, then inserts the ASV results into the benchmark database
>>>>>>
>>>>>> This seems to be a prerequisite for having a front end to visualize the results, but the dashboard/front end can hopefully be implemented in such a way that the details of the benchmark database are not too tightly coupled.
>>>>>>
>>>>>> (Do we have any other benchmarks in the project that would need to be inserted initially?)
>>>>>>
>>>>>> Related work to trigger benchmarks on agents when new commits land in master can happen concurrently -- one task need not block the other.
>>>>>>
>>>>>> Thanks
>>>>>> Wes
>>>>>>
>>>>>> On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>
>>>>>>> Sorry, copy-paste failure: https://issues.apache.org/jira/browse/ARROW-4313
>>>>>>>
>>>>>>> On Mon, Jan 21, 2019 at 11:14 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I don't think there is one but I just created https://lists.apache.org/thread.html/278e573445c83bbd8ee66474b9356c5291a16f6b6eca11dbbe4b473a@%3Cdev.arrow.apache.org%3E
>>>>>>>>
>>>>>>>> On Mon, Jan 21, 2019 at 10:35 AM Tanya Schlusser <ta...@tickel.net> wrote:
>>>>>>>>
>>>>>>>>> Areg,
>>>>>>>>>
>>>>>>>>> If you'd like help, I volunteer! No experience benchmarking, but tons of experience databasing—I can mock up the backend (database + HTTP) as a starting point for discussion if this is the way people want to go.
>>>>>>>>>
>>>>>>>>> Is there a Jira ticket for this that I can jump into?
>>>>>>>>>
>>>>>>>>> On Sun, Jan 20, 2019 at 3:24 PM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> hi Areg,
>>>>>>>>>>
>>>>>>>>>> This sounds great -- we've discussed building a more full-featured benchmark automation system in the past but nothing has been developed yet.
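
Picking up the "adapter script (Python) to ingest into database" step from the plan above, here is a minimal sketch. It assumes psycopg2 and a hypothetical benchmark_run table (this is not the schema from the DDL pull request); the host, database name, and credentials are placeholders.

    """Sketch of the ingestion step: take a google/benchmark JSON report
    and insert one row per benchmark into PostgreSQL."""
    import json

    import psycopg2  # assumed driver; any PostgreSQL client would do


    def ingest(report_path: str, commit_hash: str, machine: str) -> None:
        with open(report_path) as f:
            report = json.load(f)

        # One row per benchmark entry; "context" and "benchmarks" are the
        # top-level keys of google/benchmark's JSON output.
        rows = [
            (
                machine,
                commit_hash,
                report["context"]["date"],
                bench["name"],
                bench["real_time"],
                bench["time_unit"],
            )
            for bench in report["benchmarks"]
        ]

        # arrow_anonymous is the select+insert-only account described at the
        # top of this thread; the connection details are placeholders.
        conn = psycopg2.connect(
            host="benchmarks.example.org",
            dbname="arrow_benchmarks",
            user="arrow_anonymous",
            password="...",
        )
        with conn, conn.cursor() as cur:
            cur.executemany(
                """INSERT INTO benchmark_run
                   (machine, git_commit, run_timestamp, benchmark_name,
                    real_time, time_unit)
                   VALUES (%s, %s, %s, %s, %s, %s)""",
                rows,
            )
        conn.close()

A similar adapter for the Python side would read ASV's result files instead of google/benchmark JSON, but the database half would look the same.
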
>>>>>>>>>> Your proposal about the details sounds OK; the single most important thing to me is that we build and maintain a very general-purpose database schema for building the historical benchmark database.
>>>>>>>>>>
>>>>>>>>>> The benchmark database should keep track of:
>>>>>>>>>>
>>>>>>>>>> * Timestamp of benchmark run
>>>>>>>>>> * Git commit hash of codebase
>>>>>>>>>> * Machine unique name (sort of the "user id")
>>>>>>>>>> * CPU identification for machine, and clock frequency (in case of overclocking)
>>>>>>>>>> * CPU cache sizes (L1/L2/L3)
>>>>>>>>>> * Whether or not CPU throttling is enabled (if it can be easily determined)
>>>>>>>>>> * RAM size
>>>>>>>>>> * GPU identification (if any)
>>>>>>>>>> * Benchmark unique name
>>>>>>>>>> * Programming language(s) associated with benchmark (e.g. a benchmark may involve both C++ and Python)
>>>>>>>>>> * Benchmark time, plus mean and standard deviation if available, else NULL
>>>>>>>>>>
>>>>>>>>>> (maybe some other things)
>>>>>>>>>>
>>>>>>>>>> I would rather not be locked into the internal database schema of a particular benchmarking tool, so people in the community can just run SQL queries against the database and use the data however they like. We'll just have to be careful that people don't DROP TABLE or DELETE (but we should have daily backups so we can recover from such cases).
>>>>>>>>>>
>>>>>>>>>> So while we may make use of TeamCity to schedule the runs on the cloud and physical hardware, we should also provide a path for other people in the community to add data to the benchmark database from their hardware on an ad hoc basis. For example, I have several machines in my home on all operating systems (Windows / macOS / Linux, and soon also ARM64) and I'd like to set up scheduled tasks / cron jobs to report in to the database at least on a daily basis.
>>>>>>>>>>
>>>>>>>>>> Ideally the benchmark database would just be a PostgreSQL server with a schema we write down and keep backed up, etc. Hosted PostgreSQL is inexpensive ($200+ per year depending on the size of the instance; this probably doesn't need to be a crazy big machine).
>>>>>>>>>>
>>>>>>>>>> I suspect there will be a manageable amount of development involved in gluing each of the benchmarking frameworks together with the benchmark database. This can also handle querying the operating system for the system information listed above.
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Wes
>>>>>>>>>>
>>>>>>>>>> On Fri, Jan 18, 2019 at 12:14 AM Melik-Adamyan, Areg <areg.melik-adam...@intel.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I want to restart/attach to the discussions about creating an Arrow benchmarking dashboard. I want to propose a performance benchmark run per commit to track the changes.
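
To make Wes's field list above concrete, here is one way a single record might look, sketched as a Python dataclass; the names and types are illustrative only and are not the schema from the DDL pull request.

    """Illustrative shape of one benchmark-database record, mirroring the
    field list Wes gives above."""
    from dataclasses import dataclass
    from datetime import datetime
    from typing import List, Optional


    @dataclass
    class BenchmarkResult:
        run_timestamp: datetime                 # timestamp of benchmark run
        git_commit: str                         # commit hash of the codebase
        machine_name: str                       # unique machine name (the "user id")
        cpu_model: str                          # CPU identification / brand string
        cpu_frequency_hz: int                   # clock frequency, in case of overclocking
        l1_cache_bytes: int                     # CPU cache sizes (L1/L2/L3)
        l2_cache_bytes: int
        l3_cache_bytes: int
        cpu_throttling_enabled: Optional[bool]  # None if not easily determined
        ram_bytes: int                          # RAM size
        gpu_model: Optional[str]                # GPU identification, if any
        benchmark_name: str                     # benchmark unique name
        languages: List[str]                    # e.g. ["C++", "Python"]
        value: float                            # benchmark time
        mean: Optional[float]                   # mean, if available, else NULL
        stddev: Optional[float]                 # standard deviation, if available

In a relational schema the machine-related fields would likely live in their own table, with results referencing it by key, but the set of tracked facts would be the same.
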
>>>>>>>>>>> The proposal includes building infrastructure for per-commit tracking comprising the following parts:
>>>>>>>>>>>
>>>>>>>>>>> - Hosted JetBrains TeamCity for OSS (https://teamcity.jetbrains.com/) as a build system
>>>>>>>>>>> - Agents running both in the cloud as VMs/containers (DigitalOcean, or others) and on bare metal (Packet.net/AWS) and on-premise (Nvidia boxes?)
>>>>>>>>>>> - JFrog Artifactory storage and management for OSS projects (https://jfrog.com/open-source/#artifactory2)
>>>>>>>>>>> - Codespeed as a frontend (https://github.com/tobami/codespeed)
>>>>>>>>>>>
>>>>>>>>>>> I am volunteering to build such a system (if needed, more Intel folks will be involved) so we can start tracking performance on various platforms and understand how changes affect it.
>>>>>>>>>>>
>>>>>>>>>>> Please let me know your thoughts!
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> -Areg.
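
Finally, coming back to the account split described at the top of this thread (and referenced there), here is a minimal sketch of how the arrow_anonymous and arrow_admin grants could look. The host, database name, root credentials, and passwords are placeholders, and the actual setup may differ.

    """Sketch of the role split: an insert/select-only role for submissions
    and a broader role for admins, created from the root/owner account."""
    import psycopg2

    STATEMENTS = [
        "CREATE ROLE arrow_anonymous LOGIN PASSWORD '...'",
        "CREATE ROLE arrow_admin LOGIN PASSWORD '...'",
        # Submitters can read and add results but never modify or delete them.
        "GRANT SELECT, INSERT ON ALL TABLES IN SCHEMA public TO arrow_anonymous",
        # Admins can also fix or remove bad rows, but cannot drop tables.
        "GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO arrow_admin",
    ]

    # Run the statements as the full-permission root/owner account.
    conn = psycopg2.connect(
        host="benchmarks.example.org",   # placeholder
        dbname="arrow_benchmarks",       # placeholder
        user="postgres",                 # placeholder root account
        password="...",
    )
    with conn, conn.cursor() as cur:
        for statement in STATEMENTS:
            cur.execute(statement)
    conn.close()

Combined with the daily backups Wes mentioned, a split like this keeps DROP TABLE out of reach of everyone but the owner and limits DELETE to the admin account.
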