Gerrrr commented on code in PR #28:
URL: https://github.com/apache/hunter/pull/28#discussion_r1929949312
##########
docs/INSTALL.md:
##########
@@ -0,0 +1,19 @@
+# Installation

Review Comment:
   README no longer contains installation instructions or anything else really that belongs in the docs.

##########
docs/VALIDATING_PERF.md:
##########
@@ -0,0 +1,45 @@
+# Validating Performance against Baseline

Review Comment:
   We are not using this feature. I agree with you that it is not really helpful if one uses Hunter correctly. If you want, I can omit this feature in the docs to avoid a potential increase of its user base :)

##########
docs/GETTING_STARTED.md:
##########
@@ -0,0 +1,129 @@
+# Getting Started
+
+## Installation
+
+Hunter requires Python 3.8. If you don't have Python 3.8,
+use pyenv to install it.
+
+Use pipx to install Hunter:
+
+```
+pipx install git+ssh://g...@github.com/apache/hunter
+```
+
+## Setup
+
+Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust the data source configuration.
+
+> [!TIP]
+> See the docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md),
+> [PostgreSQL](POSTGRESQL.md), or [BigQuery](BIGQUERY.md).
+
+Alternatively, it is possible to leave the config file as is and provide credentials in the environment
+by setting the appropriate environment variables.
+Environment variables are interpolated before the configuration file is interpreted.
+
+## Defining tests
+
+All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/),
+[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery).
+
+Tests are defined in the `tests` section.
+For example, the following definition will import results of the test from a CSV file:
+
+```yaml
+tests:
+  local.sample:
+    type: csv
+    file: tests/resources/sample.csv
+    time_column: time
+    metrics: [metric1, metric2]
+    attributes: [commit]
+    csv_options:
+      delimiter: ","
+      quote_char: "'"
+```
+
+The `time_column` property points to the name of the column storing the timestamp
+of each test run. The data points will be ordered by that column.
+
+The `metrics` property selects the columns that hold the values to be analyzed. These values must
+be numbers convertible to floats. The `metrics` property can be not only a simple list of column
+names, but also a dictionary configuring other properties of each metric,
+such as the column name or direction:
+
+```yaml
+metrics:
+  resp_time_p99:
+    direction: -1
+    column: p99
+```
+
+Direction can be 1 or -1. If direction is set to 1, the higher the metric, the
+better the performance. If it is set to -1, higher values mean worse performance.
+
+The `attributes` property describes any other columns that should be attached to the final
+report. The special attributes `version` and `commit` can be used to query for a given time range.
+
+> [!TIP]
+> To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](TEMPLATES.md).
+
+## Listing Available Tests
+
+```
+hunter list-groups
+hunter list-tests [group name]
+```
+
+## Listing Available Metrics for Tests
+
+To list all available metrics defined for the test:
+
+```
+hunter list-metrics <test>
+```
+
+## Finding Change Points
+
+> [!TIP]
+> For more details, see the docs about [Finding Change Points](ANALYZE.md), [Validating Performance against Baseline](VALIDATING_PERF.md),
+> and [Validating Performance of a Feature Branch](FEATURE_BRANCH.md).
+
+```
+hunter analyze <test>...
+hunter analyze <group>...
+```
+
+This command prints interesting results of all runs of the test and a list of change-points.
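To make the role of the `direction` property concrete, here is a small illustrative Python sketch of how a detected shift in a metric's mean could be classified as a regression or an improvement. The `classify_change` helper is invented for exposition only; it is not part of Hunter's code:

```python
def classify_change(old_mean, new_mean, direction):
    """Classify a shift in a metric using its direction.

    direction == 1  means higher values are better;
    direction == -1 means higher values are worse.
    Illustrative helper only - not part of Hunter's API.
    """
    delta = new_mean - old_mean
    if delta == 0:
        return "no change"
    return "improvement" if delta * direction > 0 else "regression"


# A p99 response time metric (direction = -1) that goes up is a regression:
print(classify_change(10.0, 12.5, direction=-1))  # regression
```

For a latency metric such as `resp_time_p99` with `direction: -1`, an increase is reported as a regression, while the same increase on a throughput metric with `direction: 1` would be an improvement.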
+
+A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and
+when the difference is consistent enough that it is unlikely to have happened by chance.
+
+Hunter calculates the probability (P-value) that the change point was caused by chance - the closer to zero, the more
+"sure" it is about the regression or performance improvement. The smaller the actual magnitude of the change, the
+more data points are needed to confirm the change, so Hunter may not notice the regression after the first run
+that regressed.

Review Comment:
   https://github.com/apache/hunter/pull/28/commits/2425c0f984f9d15e8d5a69b72ab32de76fb72fb7

##########
docs/README.md:
##########
@@ -0,0 +1,21 @@
+# Table of Contents
+
+## Getting Started
+- [Installation](docs/INSTALL.md)
+- [Getting Started](docs/GETTING_STARTED.md)
+- [Contributing](docs/CONTRIBUTING.md)
+
+## Basics

Review Comment:
   I've combined listing tests/metrics with "Finding Change Points" into Basics in [7cdba1a](https://github.com/apache/hunter/pull/28/commits/7cdba1add93a9b065c52935d34b93446f8ca9f74) and then moved the rest in [1458c06](https://github.com/apache/hunter/pull/28/commits/1458c06f77f69a873e7951c7c41e6490077f5800).

##########
docs/FEATURE_BRANCH.md:
##########
@@ -0,0 +1,52 @@
+# Validating Performance of a Feature Branch
+
+The `hunter regressions` command can work with feature branches.
+
+First you need to tell Hunter how to fetch the data of the tests run against a feature branch.
+The `prefix` property of the Graphite test definition accepts the `%{BRANCH}` variable,
+which is substituted at data import time with the branch name passed to the `--branch`
+command argument. Alternatively, if the prefix for the main branch of your product is different
+from the prefix used for feature branches, you can define an additional `branch_prefix` property.
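The P-value intuition described in the GETTING_STARTED.md hunk above can be illustrated with a toy permutation test: shuffle the pooled runs many times and count how often a mean difference at least as large as the observed one appears by chance. This is only a conceptual sketch, not the change-point algorithm Hunter actually implements:

```python
import random


def permutation_p_value(before, after, n_perm=2000, seed=42):
    """Estimate the probability that the observed difference in means
    between two windows of runs arose by chance.

    Conceptual sketch only - not Hunter's actual change-point algorithm.
    """
    rng = random.Random(seed)
    observed = abs(sum(after) / len(after) - sum(before) / len(before))
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        b, a = pooled[: len(before)], pooled[len(before):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    return hits / n_perm


# A clear shift in the metric yields an estimate close to zero:
before = [100, 101, 99, 100, 102, 98, 100, 101]
after = [110, 112, 111, 109, 113, 110, 111, 112]
print(permutation_p_value(before, after))  # close to 0.0
```

With clearly separated windows like these, almost no shuffle reproduces the observed shift, so the estimate lands near zero; identical windows give 1.0. This also shows why small shifts need more data points: with few points, random shuffles often match the observed difference, keeping the P-value high.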
+
+```yaml
+my-product.test-1:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-1]
+  prefix: performance-tests.daily.%{BRANCH}.my-product.test-1
+  inherit: common-metrics
+
+my-product.test-2:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-2]
+  prefix: performance-tests.daily.master.my-product.test-2
+  branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2
+  inherit: common-metrics
+```
+
+Now you can verify that the correct data are imported by running
+`hunter analyze <test> --branch <branch>`.
+
+The `--branch` argument also works with `hunter regressions`. In this case a comparison is made
+between the tail of the specified branch and the tail of the main branch (or a point of the
+main branch specified by one of the `--since` selectors).
+
+```
+$ hunter regressions <test or group> --branch <branch>
+$ hunter regressions <test or group> --branch <branch> --since <date>
+$ hunter regressions <test or group> --branch <branch> --since-version <version>
+$ hunter regressions <test or group> --branch <branch> --since-commit <commit>
+```
+
+Sometimes when working on a feature branch, you may run the tests multiple times,
+creating more than one data point. To ignore the previous test results and compare
+only the last few points on the branch with the tail of the main branch,
+use the `--last <n>` selector. E.g., to check regressions on the last run of the tests
+on the feature branch:
+
+```
+$ hunter regressions <test or group> --branch <branch> --last 1
+```
+
+Please be aware that performance validation based on a single data point is quite weak,
+and Hunter might miss a regression if the point is not very different from
+the baseline.

Review Comment:
   238c1b4
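The `prefix`/`branch_prefix` selection described in the FEATURE_BRANCH.md hunk above could be sketched as follows. `resolve_graphite_prefix` is a hypothetical helper invented for illustration, not Hunter's actual implementation:

```python
def resolve_graphite_prefix(test, branch=None):
    """Pick and interpolate the Graphite prefix for a test definition.

    Assumption (illustrative, not Hunter's real code): when --branch is
    given and the test defines `branch_prefix`, that template is used;
    otherwise `prefix` is used. %{BRANCH} is replaced with the branch name.
    """
    if branch:
        template = test.get("branch_prefix", test["prefix"])
        return template.replace("%{BRANCH}", branch)
    return test["prefix"]


# Test definition mirroring my-product.test-2 from the example above:
test2 = {
    "prefix": "performance-tests.daily.master.my-product.test-2",
    "branch_prefix": "performance-tests.feature.%{BRANCH}.my-product.test-2",
}
print(resolve_graphite_prefix(test2, branch="my-feature"))
# performance-tests.feature.my-feature.my-product.test-2
```

Without `--branch`, the same test would fall back to the plain `prefix` pointing at the main branch's data.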