This is an automated email from the ASF dual-hosted git repository. asorokoumov pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/hunter.git
The following commit(s) were added to refs/heads/master by this push: new 251884d Introduce docs and reproducible examples (#28) 251884d is described below commit 251884d95b1121348eb8a38b494db37a4f5686ae Author: Alex Sorokoumov <918393+ger...@users.noreply.github.com> AuthorDate: Mon Jan 27 11:39:29 2025 -0800 Introduce docs and reproducible examples (#28) * Add PostgreSQL example/docs 1. Add a reproducible docker-compose. 2. Move the docs from README to a separate docs page referencing the reproducible example. * Set content-type to application/json when creating an annotation in Grafana Without this change, current Grafana version responds with 400 Bad Request * Move CONTRIBUTING.md to docs/ * Move BigQuery to docs * Add example and docs for csv * Add examples and docs for Graphite/Grafana * Move installation instructions into a separate docs page * Split Usage into individual docs pages * Remove PostgreSQL instructions from README.md * Add Getting Started * Remove trailing spaces in README * Pin poetry version to the same version we use in ci-tools, 1.1.13 The newest version, 2.0.1, fails Docker build. Since the goal of this PR is to introduce docs with reproducible examples, I am not going to upgrade Poetry version _here_, despite it being really, really old. 
* Add table of contents --- Dockerfile | 2 +- README.md | 412 +-------------------- docs/BASICS.md | 177 +++++++++ examples/bigquery/README.md => docs/BIG_QUERY.md | 6 +- CONTRIBUTING.md => docs/CONTRIBUTING.md | 0 docs/CSV.md | 45 +++ docs/GETTING_STARTED.md | 129 +++++++ docs/GRAFANA.md | 65 ++++ docs/GRAPHITE.md | 106 ++++++ docs/INSTALL.md | 19 + docs/POSTGRESQL.md | 119 ++++++ docs/README.md | 16 + examples/csv/data/local_sample.csv | 11 + examples/csv/docker-compose.yaml | 12 + examples/csv/hunter.yaml | 10 + examples/graphite/datagen/datagen.sh | 72 ++++ examples/graphite/docker-compose.yaml | 62 ++++ .../graphite/grafana/dashboards/benchmarks.json | 246 ++++++++++++ .../graphite/grafana/dashboards/dashboards.yaml | 9 + .../graphite/grafana/datasources/graphite.yaml | 12 + examples/graphite/hunter.yaml | 31 ++ examples/postgresql/docker-compose.yaml | 38 ++ examples/{psql => postgresql}/hunter.yaml | 26 +- examples/postgresql/init-db/schema.sql | 85 +++++ examples/psql/README.md | 22 -- examples/psql/schema.sql | 48 --- hunter/grafana.py | 2 +- 27 files changed, 1313 insertions(+), 469 deletions(-) diff --git a/Dockerfile b/Dockerfile index b8ab8b0..67f2e39 100644 --- a/Dockerfile +++ b/Dockerfile @@ -25,7 +25,7 @@ RUN apt-get update --assume-yes && \ && rm -rf /var/lib/apt/lists/* # Get poetry package -RUN curl -sSL https://install.python-poetry.org | python3 - +RUN curl -sSL https://install.python-poetry.org | python3 - --version 1.1.13 # Adding poetry to PATH ENV PATH="/root/.local/bin/:$PATH" diff --git a/README.md b/README.md index dbd99ee..bfa3db8 100644 --- a/README.md +++ b/README.md @@ -4,409 +4,35 @@ Hunter – Hunts Performance Regressions _This is an unsupported open source project created by DataStax employees._ -Hunter performs statistical analysis of performance test results stored -in CSV files or Graphite database. It finds change-points and notifies about -possible performance regressions. 
- -A typical use-case of hunter is as follows: +Hunter performs statistical analysis of performance test results stored +in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about +possible performance regressions. + +A typical use-case of hunter is as follows: - A set of performance tests is scheduled repeatedly. -- The resulting metrics of the test runs are stored in a time series database (Graphite) - or appended to CSV files. -- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded +- The resulting metrics of the test runs are stored in a time series database (Graphite) + or appended to CSV files. +- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded metrics regularly. - Hunter notifies about significant changes in recorded metrics by outputting text reports or sending Slack notifications. - -Hunter is capable of finding even small, but systematic shifts in metric values, -despite noise in data. -It adapts automatically to the level of noise in data and tries not to notify about changes that -can happen by random. Unlike in threshold-based performance monitoring systems, -there is no need to setup fixed warning threshold levels manually for each recorded metric. -The level of accepted probability of false-positives, as well as the -minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing -the level of performance recorded in two different periods of time – which is useful for -e.g. validating the performance of the release candidate vs the previous release of your product. - -This is still work-in-progress, unstable code. -Features may be missing. -Usability may be unsatisfactory. -Documentation may be incomplete. -Backward compatibility may be broken any time. - -See [CONTRIBUTING.md](CONTRIBUTING.md) for development instructions. - -## Installation - -Hunter requires Python 3.8. 
If you don't have python 3.8, -use pyenv to install it. - -Use pipx to install hunter: - -``` -pipx install git+ssh://g...@github.com/datastax-labs/hunter -``` - -## Setup -Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust -Graphite and Grafana addresses and credentials. - -Alternatively, it is possible to leave -the config file as is, and provide credentials in the environment -by setting appropriate environment variables. -Environment variables are interpolated before interpreting the configuration file. - -### Defining tests -All test configurations are defined in the main configuration file. -Hunter supports publishing results to a CSV file, [Graphite](https://graphiteapp.org/), and [PostgreSQL](https://www.postgresql.org/). - -Tests are defined in the `tests` section. - -#### Importing results from CSV -The following definition will import results of the test from a local CSV file: - -```yaml -tests: - local.sample: - type: csv - file: tests/resources/sample.csv - time_column: time - metrics: [metric1, metric2] - attributes: [commit] - csv_options: - delimiter: "," - quote_char: "'" -``` - -The `time_column` property points to the name of the column storing the timestamp -of each test-run. The data points will be ordered by that column. - -The `metrics` property selects the columns tha hold the values to be analyzed. These values must -be numbers convertible to floats. The `metrics` property can be not only a simple list of column -names, but it can also be a dictionary configuring other properties of each metric, -the column name or direction: - -```yaml -metrics: - resp_time_p99: - direction: -1 - column: p99 -``` - -Direction can be 1 or -1. If direction is set to 1, this means that the higher the metric, the -better the performance is. If it is set to -1, higher values mean worse performance. - -The `attributes` property describes any other columns that should be attached to the final -report. 
Special attribute `version` and `commit` can be used to query for a given time-range. - - -#### Importing results from Graphite - -To import data from Graphite, the test configuration must inform Hunter how the -data are published in your history server. This is done by specifying the Graphite path prefix -common for all the test's metrics and suffixes for each of the metrics recorded by the test run. - -```yaml -tests: - my-product.test: - type: graphite - tags: [perf-test, daily, my-product] - prefix: performance-tests.daily.my-product - metrics: - throughput: - suffix: client.throughput - response-time: - suffix: client.p50 - direction: -1 # lower is better - cpu-load: - suffix: server.cpu - direction: -1 # lower is better -``` - -The optional `tags` property contains the tags that are used to query for Graphite events that store -additional test run metadata such as run identifier, commit, branch and product version information. - -The following command will post an event with the test run metadata: -```shell -$ curl -X POST "http://graphite_address/events/" \ - -d '{ - "what": "Performance Test", - "tags": ["perf-test", "daily", "my-product"], - "when": 1537884100, - "data": {"commit": "fe6583ab", "branch": "new-feature", "version": "0.0.1"} - }' -``` - -Posting those events is not mandatory, but when they are available, Hunter is able to -filter data by commit or version using `--since-commit` or `--since-version` selectors. - -#### Importing results from PostgreSQL - -To import data from PostgreSQL, Hunter configuration must contain the database connection details: - -```yaml -# External systems connectors configuration: -postgres: - hostname: ... - port: ... - username: ... - password: ... - database: ... 
-``` - -Test configurations must contain a query to select experiment data, a time column, and a list of columns to analyze: - -```yaml -tests: - aggregate_mem: - type: postgres - time_column: commit_ts - attributes: [experiment_id, config_id, commit] - metrics: - process_cumulative_rate_mean: - direction: 1 - scale: 1 - process_cumulative_rate_stderr: - direction: -1 - scale: 1 - process_cumulative_rate_diff: - direction: -1 - scale: 1 - query: | - SELECT e.commit, - e.commit_ts, - r.process_cumulative_rate_mean, - r.process_cumulative_rate_stderr, - r.process_cumulative_rate_diff, - r.experiment_id, - r.config_id - FROM results r - INNER JOIN configs c ON r.config_id = c.id - INNER JOIN experiments e ON r.experiment_id = e.id - WHERE e.exclude_from_analysis = false AND - e.branch = 'trunk' AND - e.username = 'ci' AND - c.store = 'MEM' AND - c.cache = true AND - c.benchmark = 'aggregate' AND - c.instance_type = 'ec2i3.large' - ORDER BY e.commit_ts ASC; -``` -For more details, see the examples in [examples/psql](examples/psql). +Hunter is capable of finding even small, but persistent shifts in metric values, +despite noise in data. It adapts automatically to the level of noise in data and +tries to notify only about persistent, statistically significant changes, be it in the system +under test or in the environment. -#### Avoiding test definition duplication -You may find that your test definitions are very similar to each other, -e.g. they all have the same metrics. Instead of copy-pasting the definitions -you can use templating capability built-in hunter to define the common bits of configs separately. +Unlike in threshold-based performance monitoring systems, there is no need to setup fixed warning +threshold levels manually for each recorded metric. The level of accepted probability of +false-positives, as well as the minimal accepted magnitude of changes are tunable. 
Hunter is +also capable of comparing the level of performance recorded in two different periods of time – which +is useful for e.g. validating the performance of the release candidate vs the previous release of your product. -First, extract the common pieces to the `templates` section: -```yaml -templates: - common-metrics: - throughput: - suffix: client.throughput - response-time: - suffix: client.p50 - direction: -1 # lower is better - cpu-load: - suffix: server.cpu - direction: -1 # lower is better -``` - -Next you can recall a template in the `inherit` property of the test: - -```yaml -my-product.test-1: - type: graphite - tags: [perf-test, daily, my-product, test-1] - prefix: performance-tests.daily.my-product.test-1 - inherit: common-metrics -my-product.test-2: - type: graphite - tags: [perf-test, daily, my-product, test-2] - prefix: performance-tests.daily.my-product.test-2 - inherit: common-metrics -``` - -You can inherit more than one template. - -## Usage -### Listing Available Tests - -``` -hunter list-groups -hunter list-tests [group name] -``` - -### Listing Available Metrics for Tests - -To list all available metrics defined for the test: -``` -hunter list-metrics <test> -``` - -### Finding Change Points -``` -hunter analyze <test>... -hunter analyze <group>... -``` - -This command prints interesting results of all -runs of the test and a list of change-points. -A change-point is a moment when a metric value starts to differ significantly -from the values of the earlier runs and when the difference -is consistent enough that it is unlikely to happen by chance. -Hunter calculates the probability (P-value) that the change point was caused -by chance - the closer to zero, the more "sure" it is about the regression or -performance improvement. The smaller is the actual magnitude of the change, -the more data points are needed to confirm the change, therefore Hunter may -not notice the regression after the first run that regressed.
- -The `analyze` command accepts multiple tests or test groups. -The results are simply concatenated. - -#### Example - -``` -$ hunter analyze local.sample -INFO: Computing change points for test sample.csv... -sample: -time metric1 metric2 -------------------------- --------- --------- -2021-01-01 02:00:00 +0000 154023 10.43 -2021-01-02 02:00:00 +0000 138455 10.23 -2021-01-03 02:00:00 +0000 143112 10.29 -2021-01-04 02:00:00 +0000 149190 10.91 -2021-01-05 02:00:00 +0000 132098 10.34 -2021-01-06 02:00:00 +0000 151344 10.69 - ········· - -12.9% - ········· -2021-01-07 02:00:00 +0000 155145 9.23 -2021-01-08 02:00:00 +0000 148889 9.11 -2021-01-09 02:00:00 +0000 149466 9.13 -2021-01-10 02:00:00 +0000 148209 9.03 -``` - -### Annotating Change Points in Grafana -Change points found by `analyze` can be exported -as Grafana annotations using the `--update-grafana` flag: - -``` -$ hunter analyze <test or group> --update-grafana -``` - -The annotations generated by Hunter get the following tags: -- `hunter` -- `change-point` -- `test:<test name>` -- `metric:<metric name>` -- tags configured in the `tags` property of the test -- tags configured in the `annotate` property of the test -- tags configured in the `annotate` property of the metric - -Additionally, the `annotate` property supports variable tags: -- `%{TEST_NAME}` - name of the test -- `%{METRIC_NAME}` - name of the metric -- `%{GRAPHITE_PATH}` - resolves to the path to the data in Graphite -- `%{GRAPHITE_PATH_COMPONENTS}` - splits the path of the data in Graphite into separate components - and each path component is exported as a separate tag -- `%{GRAPHITE_PREFIX}` - resolves to the prefix of the path to the data in Graphite - (the part of the path up to the metric suffix) -- `%{GRAPHITE_PREFIX_COMPONENTS}` - similar as `%{GRAPHITE_PATH_COMPONENTS}` but splits the prefix -of the path instead of the path - - -### Validating Performance of the Main Branch -Often we want to know if the most recent product version 
-performs at least as well as one of the previous releases. It is hard to tell that by looking -at the individual change points. Therefore, Hunter provides a separate command for comparing -the current performance with the baseline performance level denoted by `--since-XXX` selector: - -``` -$ hunter regressions <test or group> -$ hunter regressions <test or group> --since <date> -$ hunter regressions <test or group> --since-version <version> -$ hunter regressions <test or group> --since-commit <commit> -``` - -If there are no regressions found in any of the tests, -Hunter prints `No regressions found` message. -Otherwise, it gives a list of tests with metrics and -magnitude of regressions. - -In this test, Hunter compares performance level around the baseline ("since") point with -the performance level at the end of the time series. If the baseline point is not specified, the -beginning of the time series is assumed. The "performance level at the point" -is computed from all the data points between two nearest change points. -Then two such selected fragments are compared using Student's T-test for statistical differences. - -#### Examples -``` -$ hunter regressions local.sample -INFO: Computing change points for test local.sample... -local.sample: - metric2 : 10.5 --> 9.12 ( -12.9%) -Regressions in 1 test found - -$ hunter regressions local.sample --since '2021-01-07 02:00:00' -INFO: Computing change points for test local.sample... -local.sample: OK -No regressions found! -``` - -### Validating Performance of a Feature Branch -The `hunter regressions` command can work with feature branches. - -First you need to tell Hunter how to fetch the data of the tests run against a feature branch. -The `prefix` property of the graphite test definition accepts `%{BRANCH}` variable, -which is substituted at the data import time by the branch name passed to `--branch` -command argument. 
Alternatively, if the prefix for the main branch of your product is different -from the prefix used for feature branches, you can define an additional `branch_prefix` property. - -```yaml -my-product.test-1: - type: graphite - tags: [perf-test, daily, my-product, test-1] - prefix: performance-tests.daily.%{BRANCH}.my-product.test-1 - inherit: common-metrics - -my-product.test-2: - type: graphite - tags: [perf-test, daily, my-product, test-2] - prefix: performance-tests.daily.master.my-product.test-2 - branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2 - inherit: common-metrics -``` - -Now you can verify if correct data are imported by running -`hunter analyze <test> --branch <branch>`. - -The `--branch` argument also works with `hunter regressions`. In this case a comparison will be made -between the tail of the specified branch and the tail of the main branch (or a point of the -main branch specified by one of the `--since` selectors). - -``` -$ hunter regressions <test or group> --branch <branch> -$ hunter regressions <test or group> --branch <branch> --since <date> -$ hunter regressions <test or group> --branch <branch> --since-version <version> -$ hunter regressions <test or group> --branch <branch> --since-commit <commit> -``` - -Sometimes when working on a feature branch, you may run the tests multiple times, -creating more than one data point. To ignore the previous test results, and compare -only the last few points on the branch with the tail of the main branch, -use the `--last <n>` selector. E.g. to check regressions on the last run of the tests -on the feature branch: +Backward compatibility may be broken any time. -``` -$ hunter regressions <test or group> --branch <branch> --last 1 -``` +See the documentation in [docs/README.md](docs/README.md). -Please beware that performance validation based on a single data point is quite weak -and Hunter might miss a regression if the point is not too much different from -the baseline. 
## License diff --git a/docs/BASICS.md b/docs/BASICS.md new file mode 100644 index 0000000..b2461f4 --- /dev/null +++ b/docs/BASICS.md @@ -0,0 +1,177 @@ +# Basics + +## Listing Available Tests + +``` +hunter list-groups +``` + +Lists all available test groups - high-level categories of tests. + +``` +hunter list-tests [group name] +``` + +Lists all tests or the tests within a given group, if the group name is provided. + +## Listing Available Metrics for Tests + +To list all available metrics defined for the test: + +``` +hunter list-metrics <test> +``` + +### Example + +> [!TIP] +> See [hunter.yaml](../examples/csv/hunter.yaml) for the full example configuration. + +``` +$ hunter list-metrics local.sample +metric1 +metric2 +``` + +## Finding Change Points + +``` +hunter analyze <test>... +hunter analyze <group>... +``` + +This command prints interesting results of all +runs of the test and a list of change-points. +A change-point is a moment when a metric value starts to differ significantly +from the values of the earlier runs and when the difference +is persistent and statistically significant enough that it is unlikely to happen by chance. +Hunter calculates the probability (P-value) that the change point was caused +by chance - the closer to zero, the more "sure" it is about the regression or +performance improvement. The smaller is the actual magnitude of the change, +the more data points are needed to confirm the change, therefore Hunter may +not notice the regression immediately after the first run that regressed. +However, it will eventually identify the specific commit that caused the regression, +as it analyzes the history of changes rather than just the HEAD of a branch. + +The `analyze` command accepts multiple tests or test groups. +The results are simply concatenated. + +### Example + +> [!TIP] +> See [hunter.yaml](../examples/csv/hunter.yaml) for the full +> example configuration and [local_sample.csv](../examples/csv/data/local_sample.csv) +> for the data.
+ +``` +$ hunter analyze local.sample --since=2024-01-01 +INFO: Computing change points for test sample.csv... +sample: +time metric1 metric2 +------------------------- --------- --------- +2021-01-01 02:00:00 +0000 154023 10.43 +2021-01-02 02:00:00 +0000 138455 10.23 +2021-01-03 02:00:00 +0000 143112 10.29 +2021-01-04 02:00:00 +0000 149190 10.91 +2021-01-05 02:00:00 +0000 132098 10.34 +2021-01-06 02:00:00 +0000 151344 10.69 + ········· + -12.9% + ········· +2021-01-07 02:00:00 +0000 155145 9.23 +2021-01-08 02:00:00 +0000 148889 9.11 +2021-01-09 02:00:00 +0000 149466 9.13 +2021-01-10 02:00:00 +0000 148209 9.03 +``` + +## Avoiding test definition duplication + +You may find that your test definitions are very similar to each other, e.g. they all have the same metrics. Instead +of copy-pasting the definitions you can use templating capability built-in hunter to define the common bits of configs +separately. + +First, extract the common pieces to the `templates` section: +```yaml +templates: + common-metrics: + throughput: + suffix: client.throughput + response-time: + suffix: client.p50 + direction: -1 # lower is better + cpu-load: + suffix: server.cpu + direction: -1 # lower is better +``` + +Next you can recall a template in the `inherit` property of the test: + +```yaml +my-product.test-1: + type: graphite + tags: [perf-test, daily, my-product, test-1] + prefix: performance-tests.daily.my-product.test-1 + inherit: common-metrics +my-product.test-2: + type: graphite + tags: [perf-test, daily, my-product, test-2] + prefix: performance-tests.daily.my-product.test-2 + inherit: common-metrics +``` + +You can inherit more than one template. + +## Validating Performance of a Feature Branch + +The `hunter regressions` command can work with feature branches. + +First you need to tell Hunter how to fetch the data of the tests run against a feature branch. 
+The `prefix` property of the graphite test definition accepts `%{BRANCH}` variable, +which is substituted at the data import time by the branch name passed to `--branch` +command argument. Alternatively, if the prefix for the main branch of your product is different +from the prefix used for feature branches, you can define an additional `branch_prefix` property. + +```yaml +my-product.test-1: + type: graphite + tags: [perf-test, daily, my-product, test-1] + prefix: performance-tests.daily.%{BRANCH}.my-product.test-1 + inherit: common-metrics + +my-product.test-2: + type: graphite + tags: [perf-test, daily, my-product, test-2] + prefix: performance-tests.daily.master.my-product.test-2 + branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2 + inherit: common-metrics +``` + +Now you can verify if correct data are imported by running +`hunter analyze <test> --branch <branch>`. + +The `--branch` argument also works with `hunter regressions`. In this case a comparison will be made +between the tail of the specified branch and the tail of the main branch (or a point of the +main branch specified by one of the `--since` selectors). + +``` +$ hunter regressions <test or group> --branch <branch> +$ hunter regressions <test or group> --branch <branch> --since <date> +$ hunter regressions <test or group> --branch <branch> --since-version <version> +$ hunter regressions <test or group> --branch <branch> --since-commit <commit> +``` + +Sometimes when working on a feature branch, you may run the tests multiple times, +creating more than one data point. To ignore the previous test results, and compare +only the last few points on the branch with the tail of the main branch, +use the `--last <n>` selector. E.g. 
to check regressions on the last run of the tests +on the feature branch: + +``` +$ hunter regressions <test or group> --branch <branch> --last 1 +``` + +Please beware that performance validation based on a single data point is quite weak +and Hunter might miss a regression if the point is not too much different from +the baseline. However, accuracy improves as more data points accumulate, and it is +a normal way of using Hunter to just merge a feature and then revert if it is +flagged later. diff --git a/examples/bigquery/README.md b/docs/BIG_QUERY.md similarity index 68% rename from examples/bigquery/README.md rename to docs/BIG_QUERY.md index 035d088..e4f9405 100644 --- a/examples/bigquery/README.md +++ b/docs/BIG_QUERY.md @@ -1,6 +1,8 @@ +# BigQuery + ## Schema -See [schema.sql](schema.sql) for the example schema. +See [schema.sql](../examples/bigquery/schema.sql) for the example schema. ## Usage @@ -13,7 +15,7 @@ export BIGQUERY_VAULT_SECRET=... ``` or in `hunter.yaml`. -Also configure the credentials. See [config_credentials.sh](config_credentials.sh) for an example. +Also configure the credentials. See [config_credentials.sh](../examples/bigquery/config_credentials.sh) for an example. The following command shows results for a single test `aggregate_mem` and updates the database with newly found change points: diff --git a/CONTRIBUTING.md b/docs/CONTRIBUTING.md similarity index 100% rename from CONTRIBUTING.md rename to docs/CONTRIBUTING.md diff --git a/docs/CSV.md b/docs/CSV.md new file mode 100644 index 0000000..f342f7f --- /dev/null +++ b/docs/CSV.md @@ -0,0 +1,45 @@ +# Importing results from CSV + +> [!TIP] +> See [hunter.yaml](../examples/csv/hunter.yaml) for the full example configuration. 
+ +## Tests + +```yaml +tests: + local.sample: + type: csv + file: tests/local_sample.csv + time_column: time + attributes: [commit] + metrics: [metric1, metric2] + csv_options: + delimiter: ',' + quotechar: "'" +``` + +## Example + +```bash +docker-compose -f examples/csv/docker-compose.yaml run --build hunter bin/hunter analyze local.sample +``` + +Expected output: + +```bash +time commit metric1 metric2 +------------------------- -------- --------- --------- +2024-01-01 02:00:00 +0000 aaa0 154023 10.43 +2024-01-02 02:00:00 +0000 aaa1 138455 10.23 +2024-01-03 02:00:00 +0000 aaa2 143112 10.29 +2024-01-04 02:00:00 +0000 aaa3 149190 10.91 +2024-01-05 02:00:00 +0000 aaa4 132098 10.34 +2024-01-06 02:00:00 +0000 aaa5 151344 10.69 + ········· + -12.9% + ········· +2024-01-07 02:00:00 +0000 aaa6 155145 9.23 +2024-01-08 02:00:00 +0000 aaa7 148889 9.11 +2024-01-09 02:00:00 +0000 aaa8 149466 9.13 +2024-01-10 02:00:00 +0000 aaa9 148209 9.03 +``` diff --git a/docs/GETTING_STARTED.md b/docs/GETTING_STARTED.md new file mode 100644 index 0000000..7e94221 --- /dev/null +++ b/docs/GETTING_STARTED.md @@ -0,0 +1,129 @@ +# Getting Started + +## Installation + +Hunter requires Python 3.8. If you don't have python 3.8, +use pyenv to install it. + +Use pipx to install hunter: + +``` +pipx install git+ssh://g...@github.com/apache/hunter +``` + +## Setup + +Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust data source configuration. + +> [!TIP] +> See docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md), +[PostgreSQL](POSTGRESQL.md), or [BigQuery](BIG_QUERY.md). + +Alternatively, it is possible to leave the config file as is, and provide credentials in the environment +by setting appropriate environment variables. +Environment variables are interpolated before interpreting the configuration file. + +## Defining tests + +All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/), +[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery). + +Tests are defined in the `tests` section. For example, the following definition will import results of the test from a CSV file: + +```yaml +tests: + local.sample: + type: csv + file: tests/resources/sample.csv + time_column: time + metrics: [metric1, metric2] + attributes: [commit] + csv_options: + delimiter: "," + quote_char: "'" +``` + +The `time_column` property points to the name of the column storing the timestamp +of each test-run. The data points will be ordered by that column. + +The `metrics` property selects the columns that hold the values to be analyzed. These values must +be numbers convertible to floats. The `metrics` property can be not only a simple list of column +names, but it can also be a dictionary configuring other properties of each metric, +the column name or direction: + +```yaml +metrics: + resp_time_p99: + direction: -1 + column: p99 +``` + +Direction can be 1 or -1. If direction is set to 1, this means that the higher the metric, the +better the performance is. If it is set to -1, higher values mean worse performance. + +The `attributes` property describes any other columns that should be attached to the final +report. Special attribute `version` and `commit` can be used to query for a given time-range. + +> [!TIP] To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](BASICS.md#avoiding-test-definition-duplication).
+ +## Listing Available Tests + +``` +hunter list-groups +hunter list-tests [group name] +``` + +## Listing Available Metrics for Tests + +To list all available metrics defined for the test: +``` +hunter list-metrics <test> +``` + +## Finding Change Points + +> [!TIP] +> For more details, see [Finding Change Points](BASICS.md#finding-change-points) and +> [Validating Performance of a Feature Branch](FEATURE_BRANCH.md#validating-performance-of-a-feature-branch). + +``` +hunter analyze <test>... +hunter analyze <group>... +``` + +This command prints interesting results of all runs of the test and a list of change-points. + +A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and +when the difference is statistically significant. + +Hunter calculates the probability (P-value) that the change point was caused by chance - the closer to zero, the more +"sure" it is about the regression or performance improvement. The smaller is the actual magnitude of the change, the +more data points are needed to confirm the change, therefore Hunter may not notice the regression immediately after the first run +that regressed. + +The `analyze` command accepts multiple tests or test groups. +The results are simply concatenated. + +## Example + +``` +$ hunter analyze local.sample +INFO: Computing change points for test sample.csv... 
+sample: +time metric1 metric2 +------------------------- --------- --------- +2021-01-01 02:00:00 +0000 154023 10.43 +2021-01-02 02:00:00 +0000 138455 10.23 +2021-01-03 02:00:00 +0000 143112 10.29 +2021-01-04 02:00:00 +0000 149190 10.91 +2021-01-05 02:00:00 +0000 132098 10.34 +2021-01-06 02:00:00 +0000 151344 10.69 + ········· + -12.9% + ········· +2021-01-07 02:00:00 +0000 155145 9.23 +2021-01-08 02:00:00 +0000 148889 9.11 +2021-01-09 02:00:00 +0000 149466 9.13 +2021-01-10 02:00:00 +0000 148209 9.03 +``` diff --git a/docs/GRAFANA.md b/docs/GRAFANA.md new file mode 100644 index 0000000..da3ad39 --- /dev/null +++ b/docs/GRAFANA.md @@ -0,0 +1,65 @@ +# Annotating Change Points in Grafana + +Change points found by `analyze` can be exported +as Grafana annotations using the `--update-grafana` flag: + +``` +$ hunter analyze <test or group> --update-grafana +``` + +The annotations generated by Hunter get the following tags: +- `hunter` +- `change-point` +- `test:<test name>` +- `metric:<metric name>` +- tags configured in the `tags` property of the test +- tags configured in the `annotate` property of the test +- tags configured in the `annotate` property of the metric + +Additionally, the `annotate` property supports variable tags: +- `%{TEST_NAME}` - name of the test +- `%{METRIC_NAME}` - name of the metric +- `%{GRAPHITE_PATH}` - resolves to the path to the data in Graphite +- `%{GRAPHITE_PATH_COMPONENTS}` - splits the path of the data in Graphite into separate components + and each path component is exported as a separate tag +- `%{GRAPHITE_PREFIX}` - resolves to the prefix of the path to the data in Graphite + (the part of the path up to the metric suffix) +- `%{GRAPHITE_PREFIX_COMPONENTS}` - similar as `%{GRAPHITE_PATH_COMPONENTS}` but splits the prefix +of the path instead of the path + +## Example + +> [!TIP] +> See [hunter.yaml](../examples/graphite/hunter.yaml) for the full Graphite & Grafana example. 
+ +Start docker-compose with Graphite in one tab: + +```bash +docker-compose -f examples/graphite/docker-compose.yaml up --force-recreate --always-recreate-deps --renew-anon-volumes --build +```` + + +Run hunter in another tab: + +```bash +docker-compose -f examples/graphite/docker-compose.yaml run hunter hunter analyze my-product.test --since=-10m --update-grafana +``` + +Expected output: + +```bash +time run branch version commit throughput response_time cpu_usage +------------------------- ----- -------- --------- -------- ------------ --------------- ----------- +2024-12-14 22:45:10 +0000 61160 87 0.2 +2024-12-14 22:46:10 +0000 60160 85 0.3 +2024-12-14 22:47:10 +0000 60960 89 0.1 + ············ ··········· + -5.6% +300.0% + ············ ··········· +2024-12-14 22:48:10 +0000 57123 88 0.8 +2024-12-14 22:49:10 +0000 57980 87 0.9 +2024-12-14 22:50:10 +0000 56950 85 0.7 +``` + +Open local [Grafana](http://localhost:3000) in your browser (use default `admin/admin` credentials), open +`Benchmarks` dashboard, and see your data points and annotations on both charts.: diff --git a/docs/GRAPHITE.md b/docs/GRAPHITE.md new file mode 100644 index 0000000..9035632 --- /dev/null +++ b/docs/GRAPHITE.md @@ -0,0 +1,106 @@ +# Importing results from Graphite + +> [!TIP] +> See [hunter.yaml](../examples/graphite/hunter.yaml) for the full example configuration. + +## Graphite and Grafana Connection + +The following block contains Graphite and Grafana connection details: + +```yaml +graphite: + url: ... + +grafana: + url: ... + user: ... + password: ... +``` + +These variables can be specified directly in `hunter.yaml` or passed as environment variables: + +```yaml +graphite: + url: ${GRAPHITE_ADDRESS} + +grafana: + url: ${GRAFANA_ADDRESS} + user: ${GRAFANA_USER} + password: ${GRAFANA_PASSWORD} +``` + + +## Tests + +### Importing results from Graphite + +Test configuration contains queries selecting experiment data from Graphite. 
This is done by specifying the Graphite +path prefix common for all the test's metrics and suffixes for each of the metrics recorded by the test run. + +```yaml +tests: + my-product.test: + type: graphite + prefix: performance-tests.daily.my-product + metrics: + throughput: + suffix: client.throughput + response-time: + suffix: client.p50 + direction: -1 # lower is better + cpu-load: + suffix: server.cpu + direction: -1 # lower is better +``` + +### Tags + +> [!WARNING] +> Tags do not work as expected in the current version. See https://github.com/datastax-labs/hunter/issues/24 for more details + +The optional `tags` property contains the tags that are used to query for Graphite events that store +additional test run metadata such as run identifier, commit, branch and product version information. + +The following command will post an event with the test run metadata: +```shell +$ curl -X POST "http://graphite_address/events/" \ + -d '{ + "what": "Performance Test", + "tags": ["perf-test", "daily", "my-product"], + "when": 1537884100, + "data": {"commit": "fe6583ab", "branch": "new-feature", "version": "0.0.1"} + }' +``` + +Posting those events is not mandatory, but when they are available, Hunter is able to +filter data by commit or version using `--since-commit` or `--since-version` selectors. 
+ +## Example + +Start docker-compose with Graphite in one tab: + +```bash +docker-compose -f examples/graphite/docker-compose.yaml up --force-recreate --always-recreate-deps --renew-anon-volumes --build +```` + +Run hunter in another tab: + +```bash +docker-compose -f examples/graphite/docker-compose.yaml run hunter hunter analyze my-product.test --since=-10m +``` + +Expected output: + +```bash +time run branch version commit throughput response_time cpu_usage +------------------------- ----- -------- --------- -------- ------------ --------------- ----------- +2024-12-14 22:45:10 +0000 61160 87 0.2 +2024-12-14 22:46:10 +0000 60160 85 0.3 +2024-12-14 22:47:10 +0000 60960 89 0.1 + ············ ··········· + -5.6% +300.0% + ············ ··········· +2024-12-14 22:48:10 +0000 57123 88 0.8 +2024-12-14 22:49:10 +0000 57980 87 0.9 +2024-12-14 22:50:10 +0000 56950 85 0.7 +``` diff --git a/docs/INSTALL.md b/docs/INSTALL.md new file mode 100644 index 0000000..3d284e7 --- /dev/null +++ b/docs/INSTALL.md @@ -0,0 +1,19 @@ +# Installation + +## Install using pipx + +Hunter requires Python 3.8. If you don't have python 3.8, use pyenv to install it. + +Use pipx to install hunter: + +``` +pipx install git+ssh://g...@github.com/datastax-labs/hunter +``` + +## Build Docker container + +To build the Docker container, run the following command: + +```bash +docker build -t hunter . +``` diff --git a/docs/POSTGRESQL.md b/docs/POSTGRESQL.md new file mode 100644 index 0000000..107f691 --- /dev/null +++ b/docs/POSTGRESQL.md @@ -0,0 +1,119 @@ +# Importing results from PostgreSQL + +> [!TIP] +> See [hunter.yaml](../examples/postgresql/hunter.yaml) for the full example configuration. + +## PostgreSQL Connection +The following block contains PostgreSQL connection details: + +```yaml +postgres: + hostname: ... + port: ... + username: ... + password: ... + database: ... 
+``` + +These variables can be specified directly in `hunter.yaml` or passed as environment variables: + +```yaml +postgres: + hostname: ${POSTGRES_HOSTNAME} + port: ${POSTGRES_PORT} + username: ${POSTGRES_USERNAME} + password: ${POSTGRES_PASSWORD} + database: ${POSTGRES_DATABASE} +``` + +## Tests + +Test configuration contains queries selecting experiment data, a time column, and a list of columns to analyze: + +```yaml +tests: + aggregate_mem: + type: postgres + time_column: commit_ts + attributes: [experiment_id, config_id, commit] + metrics: + process_cumulative_rate_mean: + direction: 1 + scale: 1 + process_cumulative_rate_stderr: + direction: -1 + scale: 1 + process_cumulative_rate_diff: + direction: -1 + scale: 1 + query: | + SELECT e.commit, + e.commit_ts, + r.process_cumulative_rate_mean, + r.process_cumulative_rate_stderr, + r.process_cumulative_rate_diff, + r.experiment_id, + r.config_id + FROM results r + INNER JOIN configs c ON r.config_id = c.id + INNER JOIN experiments e ON r.experiment_id = e.id + WHERE e.exclude_from_analysis = false AND + e.branch = 'trunk' AND + e.username = 'ci' AND + c.store = 'MEM' AND + c.cache = true AND + c.benchmark = 'aggregate' AND + c.instance_type = 'ec2i3.large' + ORDER BY e.commit_ts ASC; +``` + +## Example + +### Usage + +Start docker-compose with PostgreSQL in one tab: + +```bash +docker-compose -f examples/postgresql/docker-compose.yaml up --force-recreate --always-recreate-deps --renew-anon-volumes +```` + +Run Hunter in the other tab to show results for a single test `aggregate_mem` and update the database with newly found change points: + +```bash +docker-compose -f examples/postgresql/docker-compose.yaml run --build hunter bin/hunter analyze aggregate_mem --update-postgres +``` + +Expected output: + +```bash 0.0s +time experiment_id commit process_cumulative_rate_mean process_cumulative_rate_stderr process_cumulative_rate_diff +------------------------- ------------------ -------- 
------------------------------ -------------------------------- ------------------------------ +2024-03-13 10:03:02 +0000 aggregate-36e5ccd2 36e5ccd2 61160 2052 13558 +2024-03-25 10:03:02 +0000 aggregate-d5460f38 d5460f38 60160 2142 13454 +2024-04-02 10:03:02 +0000 aggregate-bc9425cb bc9425cb 60960 2052 13053 + ······························ + -5.6% + ······························ +2024-04-06 10:03:02 +0000 aggregate-14df1b11 14df1b11 57123 2052 14052 +2024-04-13 10:03:02 +0000 aggregate-ac40c0d8 ac40c0d8 57980 2052 13521 +2024-04-27 10:03:02 +0000 aggregate-0af4ccbc 0af4ccbc 56950 2052 13532 +``` + +### Configuration + +See [hunter.yaml](../examples/postgresql/hunter.yaml) for the example configuration: +* Block `postgres` contains connection details to the PostgreSQL database. +* Block `templates` contains common pieces of configuration used by all tests - time column and a list of attributes and metrics. +* Block `tests` contains configuration for the individual tests, specifically a query that fetches analyzed columns sorted by commit timestamp. + +[schema.sql](../examples/postgresql/init-db/schema.sql) contains the schema used in this example. + +[docker-compose.yaml](../examples/postgresql/docker-compose.yaml) contains example config required to connect to PosgreSQL: +1. `POSTGRES_*` environment variables are used to pass connection details to the container. +2. `HUNTER_CONFIG` is the path to the configuration file described above. +3. `BRANCH` variable is used within `HUNTER_CONFIG` to analyze experiment results only for a specific branch. + + +### CLI arguments + +* `--update-postgres` - updates the database with newly found change points. 
diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 0000000..3899795 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,16 @@ +# Table of Contents + +## Getting Started +- [Installation](docs/INSTALL.md) +- [Getting Started](docs/GETTING_STARTED.md) +- [Contributing](docs/CONTRIBUTING.md) + +## Basics +- [Basics](docs/BASICS.md) + +## Data Sources +- [Graphite](docs/GRAPHITE.md) +- [PostgreSQL][docs/POSTGRESQL.md) +- [BigQuery](docs/BIG_QUERY.md) +- [CSV](docs/CSV.md) +- [Annotating Change Points in Grafana](docs/GRAFANA.md) diff --git a/examples/csv/data/local_sample.csv b/examples/csv/data/local_sample.csv new file mode 100644 index 0000000..4fd9b90 --- /dev/null +++ b/examples/csv/data/local_sample.csv @@ -0,0 +1,11 @@ +time,commit,metric1,metric2 +2025.01.01 3:00:00 +0100,aaa0,154023,10.43 +2025.01.02 3:00:00 +0100,aaa1,138455,10.23 +2025.01.03 3:00:00 +0100,aaa2,143112,10.29 +2025.01.04 3:00:00 +0100,aaa3,149190,10.91 +2025.01.05 3:00:00 +0100,aaa4,132098,10.34 +2025.01.06 3:00:00 +0100,aaa5,151344,10.69 +2025.01.07 3:00:00 +0100,aaa6,155145,9.23 +2025.01.08 3:00:00 +0100,aaa7,148889,9.11 +2025.01.09 3:00:00 +0100,aaa8,149466,9.13 +2025.01.10 3:00:00 +0100,aaa9,148209,9.03 diff --git a/examples/csv/docker-compose.yaml b/examples/csv/docker-compose.yaml new file mode 100644 index 0000000..d2b58be --- /dev/null +++ b/examples/csv/docker-compose.yaml @@ -0,0 +1,12 @@ +services: + hunter: + build: + context: ../.. 
+ dockerfile: Dockerfile + container_name: hunter + environment: + HUNTER_CONFIG: examples/csv/hunter.yaml + volumes: + - ./tests:/tests + + diff --git a/examples/csv/hunter.yaml b/examples/csv/hunter.yaml new file mode 100644 index 0000000..5e43024 --- /dev/null +++ b/examples/csv/hunter.yaml @@ -0,0 +1,10 @@ +tests: + local.sample: + type: csv + file: /tests/local_sample.csv + time_column: time + attributes: [commit] + metrics: [metric1, metric2] + csv_options: + delimiter: ',' + quotechar: "'" \ No newline at end of file diff --git a/examples/graphite/datagen/datagen.sh b/examples/graphite/datagen/datagen.sh new file mode 100755 index 0000000..426abaf --- /dev/null +++ b/examples/graphite/datagen/datagen.sh @@ -0,0 +1,72 @@ +#!/bin/bash + +GRAPHITE_SERVER="graphite" +GRAPHITE_PORT=2003 + +commits=("a1b2c3" "d4e5f6" "g7h8i9" "j1k2l3" "m4n5o6" "p7q8r9") +num_commits=${#commits[@]} + +throughput_path="performance-tests.daily.my-product.client.throughput" +throughput_values=(56950 57980 57123 60960 60160 61160) + +p50_path="performance-tests.daily.my-product.client.p50" +p50_values=(85 87 88 89 85 87) + +cpu_path="performance-tests.daily.my-product.server.cpu" +cpu_values=(0.7 0.9 0.8 0.1 0.3 0.2) + + +# Function to send throughput to Graphite +send_to_graphite() { + local throughput_path=$1 + local value=$2 + local timestamp=$3 + local commit=$4 + # send the metric + echo "${throughput_path} ${value} ${timestamp}" | nc ${GRAPHITE_SERVER} ${GRAPHITE_PORT} + # annotate the metric + # Commented out, waiting for https://github.com/datastax-labs/hunter/issues/24 to be fixed + # curl -X POST "http://${GRAPHITE_SERVER}/events/" \ + # -d "{ + # \"what\": \"Performance Test\", + # \"tags\": [\"perf-test\", \"daily\", \"my-product\"], + # \"when\": ${timestamp}, + # \"data\": {\"commit\": \"${commit}\", \"branch\": \"new-feature\", \"version\": \"0.0.1\"} + # }" +} + + + +sleep 5 # Wait for Graphite to start + +start_timestamp=$(date +%s) +timestamp=$start_timestamp + +# 
Send metrics for each commit +for ((i=0; i<${num_commits}; i++)); do + send_to_graphite ${throughput_path} ${throughput_values[$i]} ${timestamp} ${commits[$i]} + send_to_graphite ${p50_path} ${p50_values[$i]} ${timestamp} ${commits[$i]} + send_to_graphite ${cpu_path} ${cpu_values[$i]} ${timestamp} ${commits[$i]} + timestamp=$((timestamp - 60)) +done + +## Send each throughput value +#timestamp=$start_timestamp +#for value in "${throughput_values[@]}"; do +# send_to_graphite ${throughput_path} ${value} ${timestamp} +# timestamp=$((timestamp - 60)) +#done +# +## Send each p50 value +#timestamp=$start_timestamp +#for value in "${p50_values[@]}"; do +# send_to_graphite ${p50_path} ${value} ${timestamp} +# timestamp=$((timestamp - 60)) +#done +# +## Send each CPU value +#timestamp=$start_timestamp +#for value in "${cpu_values[@]}"; do +# send_to_graphite ${cpu_path} ${value} ${timestamp} +# timestamp=$((timestamp - 60)) +#done \ No newline at end of file diff --git a/examples/graphite/docker-compose.yaml b/examples/graphite/docker-compose.yaml new file mode 100644 index 0000000..c4a56c3 --- /dev/null +++ b/examples/graphite/docker-compose.yaml @@ -0,0 +1,62 @@ +services: + graphite: + image: graphiteapp/graphite-statsd + container_name: graphite + ports: + - "80:80" + - "2003-2004:2003-2004" + - "2023-2024:2023-2024" + - "8125:8125/udp" + - "8126:8126" + networks: + - hunter-graphite + + grafana: + image: grafana/grafana + container_name: grafana + environment: + - GF_SECURITY_ADMIN_PASSWORD=admin + depends_on: + - graphite + ports: + - "3000:3000" + volumes: + - ./grafana:/etc/grafana/provisioning + networks: + - hunter-graphite + + data-sender: + image: bash + container_name: data-sender + depends_on: + - graphite + volumes: + - ./datagen:/datagen + entrypoint: ["bash", "/datagen/datagen.sh"] + networks: + - hunter-graphite + + hunter: + build: + context: ../.. 
+ dockerfile: Dockerfile + container_name: hunter + depends_on: + - graphite + environment: + GRAPHITE_ADDRESS: http://graphite/ + GRAFANA_ADDRESS: http://grafana:3000/ + GRAFANA_USER: admin + GRAFANA_PASSWORD: admin + HUNTER_CONFIG: examples/graphite/hunter.yaml + networks: + - hunter-graphite + +networks: + hunter-graphite: + driver: bridge + + +# TODO: +# 3. make sure Hunter can connect to graphite and query the data +# 4. make sure it annotates the dashboard correctly \ No newline at end of file diff --git a/examples/graphite/grafana/dashboards/benchmarks.json b/examples/graphite/grafana/dashboards/benchmarks.json new file mode 100644 index 0000000..612551d --- /dev/null +++ b/examples/graphite/grafana/dashboards/benchmarks.json @@ -0,0 +1,246 @@ +{ + "annotations": { + "list": [ + { + "builtIn": 1, + "datasource": { + "type": "grafana", + "uid": "-- Grafana --" + }, + "enable": true, + "hide": true, + "iconColor": "rgba(0, 211, 255, 1)", + "name": "Annotations & Alerts", + "target": { + "limit": 100, + "matchAny": false, + "tags": [], + "type": "dashboard" + }, + "type": "dashboard" + }, + { + "datasource": { + "type": "datasource", + "uid": "grafana" + }, + "enable": true, + "hide": false, + "iconColor": "red", + "name": "New annotation", + "target": { + "fromAnnotations": true, + "queryType": "annotations", + "tags": [""] + } + } + ] + }, + "editable": true, + "fiscalYearStartMonth": 0, + "graphTooltip": 0, + "id": 1, + "links": [], + "panels": [ + { + "datasource": { + "type": "graphite", + "uid": "P1D261A8554D2DA69" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisBorderShow": false, + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "barWidthFactor": 0.6, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "insertNulls": false, + 
"lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 0 + }, + "id": 2, + "options": { + "legend": { + "calcs": [], + "displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "mode": "single", + "sort": "none" + } + }, + "pluginVersion": "11.4.0", + "targets": [ + { + "refCount": 0, + "refId": "A", + "target": "performance-tests.daily.my-product.server.cpu" + } + ], + "title": "Server CPU Usage", + "type": "timeseries" + }, + { + "datasource": { + "type": "graphite", + "uid": "Graphite" + }, + "fieldConfig": { + "defaults": { + "color": { + "mode": "palette-classic" + }, + "custom": { + "axisBorderShow": false, + "axisCenteredZero": false, + "axisColorMode": "text", + "axisLabel": "", + "axisPlacement": "auto", + "barAlignment": 0, + "barWidthFactor": 0.6, + "drawStyle": "line", + "fillOpacity": 0, + "gradientMode": "none", + "hideFrom": { + "legend": false, + "tooltip": false, + "viz": false + }, + "insertNulls": false, + "lineInterpolation": "linear", + "lineWidth": 1, + "pointSize": 5, + "scaleDistribution": { + "type": "linear" + }, + "showPoints": "auto", + "spanNulls": false, + "stacking": { + "group": "A", + "mode": "none" + }, + "thresholdsStyle": { + "mode": "off" + } + }, + "mappings": [], + "thresholds": { + "mode": "absolute", + "steps": [ + { + "color": "green", + "value": null + }, + { + "color": "red", + "value": 80 + } + ] + } + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 0 + }, + "id": 1, + "options": { + "legend": { + "calcs": [], + 
"displayMode": "list", + "placement": "bottom", + "showLegend": true + }, + "tooltip": { + "maxHeight": 600, + "mode": "single", + "sort": "none" + } + }, + "pluginVersion": "11.4.0", + "targets": [ + { + "datasource": { + "type": "graphite", + "uid": "${DS_GRAPHITE}" + }, + "refId": "A", + "target": "performance-tests.daily.my-product.client.throughput" + } + ], + "title": "Throughput", + "type": "timeseries" + } + ], + "preload": false, + "schemaVersion": 40, + "tags": [], + "templating": { + "list": [] + }, + "time": { + "from": "now-5m", + "to": "now" + }, + "timepicker": {}, + "timezone": "browser", + "title": "Benchmarks", + "uid": "adr2geu68gutcd", + "version": 1, + "weekStart": "" +} diff --git a/examples/graphite/grafana/dashboards/dashboards.yaml b/examples/graphite/grafana/dashboards/dashboards.yaml new file mode 100644 index 0000000..4ae98b0 --- /dev/null +++ b/examples/graphite/grafana/dashboards/dashboards.yaml @@ -0,0 +1,9 @@ +apiVersion: 1 + +providers: + - name: 'default' + orgId: 1 + folder: '' + type: file + options: + path: /etc/grafana/provisioning/dashboards \ No newline at end of file diff --git a/examples/graphite/grafana/datasources/graphite.yaml b/examples/graphite/grafana/datasources/graphite.yaml new file mode 100644 index 0000000..f4fb287 --- /dev/null +++ b/examples/graphite/grafana/datasources/graphite.yaml @@ -0,0 +1,12 @@ +apiVersion: 1 + +datasources: + - name: Graphite + type: graphite + access: proxy + url: http://graphite:80 + isDefault: true + jsonData: + graphiteVersion: "1.1" + tlsAuth: false + tlsAuthWithCACert: false \ No newline at end of file diff --git a/examples/graphite/hunter.yaml b/examples/graphite/hunter.yaml new file mode 100644 index 0000000..bf791f1 --- /dev/null +++ b/examples/graphite/hunter.yaml @@ -0,0 +1,31 @@ +# External systems connectors configuration: +graphite: + url: ${GRAPHITE_ADDRESS} + +grafana: + url: ${GRAFANA_ADDRESS} + user: ${GRAFANA_USER} + password: ${GRAFANA_PASSWORD} + +# Define your tests 
here: +tests: + my-product.test: + type: graphite + prefix: performance-tests.daily.my-product + tags: [perf-test, daily, my-product] + metrics: + throughput: + suffix: client.throughput + direction: 1 # higher is better + scale: 1 + response_time: + suffix: client.p50 + direction: -1 # lower is better + scale: 1 + cpu_usage: + suffix: server.cpu + direction: -1 # lower is better + scale: 1 + + + diff --git a/examples/postgresql/docker-compose.yaml b/examples/postgresql/docker-compose.yaml new file mode 100644 index 0000000..569a8cf --- /dev/null +++ b/examples/postgresql/docker-compose.yaml @@ -0,0 +1,38 @@ +version: "3.8" + +services: + postgres: + image: postgres:latest + container_name: postgres + environment: + POSTGRES_USER: exampleuser + POSTGRES_PASSWORD: examplepassword + POSTGRES_DB: benchmark_results + ports: + - "5432:5432" + volumes: + - ./init-db:/docker-entrypoint-initdb.d + networks: + - hunter-postgres + + hunter: + build: + context: ../.. + dockerfile: Dockerfile + container_name: hunter + depends_on: + - postgres + environment: + POSTGRES_HOSTNAME: postgres + POSTGRES_PORT: 5432 + POSTGRES_USERNAME: exampleuser + POSTGRES_PASSWORD: examplepassword + POSTGRES_DATABASE: benchmark_results + HUNTER_CONFIG: examples/postgresql/hunter.yaml + BRANCH: trunk + networks: + - hunter-postgres + +networks: + hunter-postgres: + driver: bridge diff --git a/examples/psql/hunter.yaml b/examples/postgresql/hunter.yaml similarity index 65% rename from examples/psql/hunter.yaml rename to examples/postgresql/hunter.yaml index 664dca2..36b14ac 100644 --- a/examples/psql/hunter.yaml +++ b/examples/postgresql/hunter.yaml @@ -11,7 +11,7 @@ templates: common: type: postgres time_column: commit_ts - attributes: [experiment_id, config_id, commit] + attributes: [experiment_id, commit, config_id] # required for --update-postgres to work update_statement: | UPDATE results @@ -40,7 +40,7 @@ tests: r.process_cumulative_rate_mean, r.process_cumulative_rate_stderr, 
r.process_cumulative_rate_diff, - r.experiment_id, + r.experiment_id, r.config_id FROM results r INNER JOIN configs c ON r.config_id = c.id @@ -52,4 +52,26 @@ tests: c.cache = true AND c.benchmark = 'aggregate' AND c.instance_type = 'ec2i3.large' + ORDER BY e.commit_ts ASC; + + aggregate_time_rocks: + inherit: [ common ] + query: | + SELECT e.commit, + e.commit_ts, + r.process_cumulative_rate_mean, + r.process_cumulative_rate_stderr, + r.process_cumulative_rate_diff, + r.experiment_id, + r.config_id + FROM results r + INNER JOIN configs c ON r.config_id = c.id + INNER JOIN experiments e ON r.experiment_id = e.id + WHERE e.exclude_from_analysis = false AND + e.branch = '${BRANCH}' AND + e.username = 'ci' AND + c.store = 'TIME_ROCKS' AND + c.cache = true AND + c.benchmark = 'aggregate' AND + c.instance_type = 'ec2i3.large' ORDER BY e.commit_ts ASC; \ No newline at end of file diff --git a/examples/postgresql/init-db/schema.sql b/examples/postgresql/init-db/schema.sql new file mode 100644 index 0000000..99ff9cf --- /dev/null +++ b/examples/postgresql/init-db/schema.sql @@ -0,0 +1,85 @@ +\c benchmark_results; + +CREATE TABLE IF NOT EXISTS configs ( + id SERIAL PRIMARY KEY, + benchmark TEXT NOT NULL, + store TEXT NOT NULL, + instance_type TEXT NOT NULL, + cache BOOLEAN NOT NULL, + UNIQUE(benchmark, + store, + cache, + instance_type) +); + +CREATE TABLE IF NOT EXISTS experiments ( + id TEXT PRIMARY KEY, + ts TIMESTAMPTZ NOT NULL, + branch TEXT NOT NULL, + commit TEXT NOT NULL, + commit_ts TIMESTAMPTZ NOT NULL, + username TEXT NOT NULL, + details_url TEXT NOT NULL, + exclude_from_analysis BOOLEAN DEFAULT false NOT NULL, + exclude_reason TEXT +); + +CREATE TABLE IF NOT EXISTS results ( + experiment_id TEXT NOT NULL REFERENCES experiments(id), + config_id INTEGER NOT NULL REFERENCES configs(id), + + process_cumulative_rate_mean BIGINT NOT NULL, + process_cumulative_rate_stderr BIGINT NOT NULL, + process_cumulative_rate_diff BIGINT NOT NULL, + + 
process_cumulative_rate_mean_rel_forward_change DOUBLE PRECISION, + process_cumulative_rate_mean_rel_backward_change DOUBLE PRECISION, + process_cumulative_rate_mean_p_value DECIMAL, + + process_cumulative_rate_stderr_rel_forward_change DOUBLE PRECISION, + process_cumulative_rate_stderr_rel_backward_change DOUBLE PRECISION, + process_cumulative_rate_stderr_p_value DECIMAL, + + process_cumulative_rate_diff_rel_forward_change DOUBLE PRECISION, + process_cumulative_rate_diff_rel_backward_change DOUBLE PRECISION, + process_cumulative_rate_diff_p_value DECIMAL, + + PRIMARY KEY (experiment_id, config_id) +); + +-- configurations -- +INSERT INTO configs (id, benchmark, store, instance_type, cache) VALUES + (1, 'aggregate', 'MEM', 'ec2i3.large', true), + (2, 'aggregate', 'TIME_ROCKS', 'ec2i3.large', true); + +-- experiments -- +INSERT INTO experiments + (id, ts, branch, commit, commit_ts, username, details_url) +VALUES + ('aggregate-36e5ccd2', '2024-03-14 12:03:02+00', 'trunk', '36e5ccd2', '2024-03-13 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-36e5ccd2'), + ('aggregate-d5460f38', '2024-03-27 12:03:02+00', 'trunk', 'd5460f38', '2024-03-25 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-d5460f38'), + ('aggregate-bc9425cb', '2024-04-01 12:03:02+00', 'trunk', 'bc9425cb', '2024-04-02 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-bc9425cb'), + ('aggregate-14df1b11', '2024-04-07 12:03:02+00', 'trunk', '14df1b11', '2024-04-06 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-14df1b11'), + ('aggregate-ac40c0d8', '2024-04-14 12:03:02+00', 'trunk', 'ac40c0d8', '2024-04-13 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-ac40c0d8'), + ('aggregate-0af4ccbc', '2024-04-28 12:03:02+00', 'trunk', '0af4ccbc', '2024-04-27 10:03:02+00', 'ci', 'https://example.com/experiments/aggregate-0af4ccbc'); + + +INSERT INTO results (experiment_id, config_id, process_cumulative_rate_mean, process_cumulative_rate_stderr, 
process_cumulative_rate_diff) +VALUES + ('aggregate-36e5ccd2', 1, 61160, 2052, 13558), + ('aggregate-36e5ccd2', 2, 59250, 2599, 15557), + + ('aggregate-d5460f38', 1, 60160, 2142, 13454), + ('aggregate-d5460f38', 2, 58316, 2573, 16028), + + ('aggregate-bc9425cb', 1, 60960, 2052, 13053), + ('aggregate-bc9425cb', 2, 59021, 2459, 15259), + + ('aggregate-14df1b11', 1, 57123, 2052, 14052), + ('aggregate-14df1b11', 2, 54725, 2291, 15558), + + ('aggregate-ac40c0d8', 1, 57980, 2052, 13521), + ('aggregate-ac40c0d8', 2, 54250, 2584, 15558), + + ('aggregate-0af4ccbc', 1, 56950, 2052, 13532), + ('aggregate-0af4ccbc', 2, 54992, 2311, 15585); diff --git a/examples/psql/README.md b/examples/psql/README.md deleted file mode 100644 index f1333db..0000000 --- a/examples/psql/README.md +++ /dev/null @@ -1,22 +0,0 @@ -## Schema - -See [schema.sql](schema.sql) for the example schema. - -## Usage - -Define PostgreSQL connection details via environment variables: - -```bash -export POSTGRES_HOSTNAME=... -export POSTGRES_USERNAME=... -export POSTGRES_PASSWORD=... -export POSTGRES_DATABASE=... -``` - -or in `hunter.yaml`. 
- -The following command shows results for a single test `aggregate_mem` and updates the database with newly found change points: - -```bash -$ BRANCH=trunk HUNTER_CONFIG=hunter.yaml hunter analyze aggregate_mem --update-postgres -``` diff --git a/examples/psql/schema.sql b/examples/psql/schema.sql deleted file mode 100644 index de825a9..0000000 --- a/examples/psql/schema.sql +++ /dev/null @@ -1,48 +0,0 @@ -CREATE TABLE IF NOT EXISTS configs ( - id SERIAL PRIMARY KEY, - benchmark TEXT NOT NULL, - scenario TEXT NOT NULL, - store TEXT NOT NULL, - instance_type TEXT NOT NULL, - cache BOOLEAN NOT NULL, - UNIQUE(benchmark, - scenario, - store, - cache, - instance_type) -); - -CREATE TABLE IF NOT EXISTS experiments ( - id TEXT PRIMARY KEY, - ts TIMESTAMPTZ NOT NULL, - branch TEXT NOT NULL, - commit TEXT NOT NULL, - commit_ts TIMESTAMPTZ NOT NULL, - username TEXT NOT NULL, - details_url TEXT NOT NULL, - exclude_from_analysis BOOLEAN DEFAULT false NOT NULL, - exclude_reason TEXT -); - -CREATE TABLE IF NOT EXISTS results ( - experiment_id TEXT NOT NULL REFERENCES experiments(id), - config_id INTEGER NOT NULL REFERENCES configs(id), - - process_cumulative_rate_mean BIGINT NOT NULL, - process_cumulative_rate_stderr BIGINT NOT NULL, - process_cumulative_rate_diff BIGINT NOT NULL, - - process_cumulative_rate_mean_rel_forward_change DOUBLE PRECISION, - process_cumulative_rate_mean_rel_backward_change DOUBLE PRECISION, - process_cumulative_rate_mean_p_value DECIMAL, - - process_cumulative_rate_stderr_rel_forward_change DOUBLE PRECISION, - process_cumulative_rate_stderr_rel_backward_change DOUBLE PRECISION, - process_cumulative_rate_stderr_p_value DECIMAL, - - process_cumulative_rate_diff_rel_forward_change DOUBLE PRECISION, - process_cumulative_rate_diff_rel_backward_change DOUBLE PRECISION, - process_cumulative_rate_diff_p_value DECIMAL, - - PRIMARY KEY (experiment_id, config_id) -); \ No newline at end of file diff --git a/hunter/grafana.py b/hunter/grafana.py index 
a643c9a..7cc02eb 100644 --- a/hunter/grafana.py +++ b/hunter/grafana.py @@ -100,7 +100,7 @@ class Grafana: data = asdict(annotation) data["time"] = int(annotation.time.timestamp() * 1000) del data["id"] - response = requests.post(url=url, data=data, auth=(self.__user, self.__password)) + response = requests.post(url=url, json=data, auth=(self.__user, self.__password)) response.raise_for_status() except HTTPError as err: raise GrafanaError(str(err))