Gerrrr commented on code in PR #28:
URL: https://github.com/apache/hunter/pull/28#discussion_r1929949312
##########
docs/INSTALL.md:
##########
@@ -0,0 +1,19 @@
+# Installation

Review Comment:
   README no longer contains installation instructions or anything else really that belongs in the docs.

##########
docs/VALIDATING_PERF.md:
##########
@@ -0,0 +1,45 @@
+# Validating Performance against Baseline

Review Comment:
   We are not using this feature. I agree with you that it is not really helpful if one uses Hunter correctly. If you want, I can omit this feature in the docs to avoid a potential increase of its user base :)

##########
docs/GETTING_STARTED.md:
##########
@@ -0,0 +1,129 @@
+# Getting Started
+
+## Installation
+
+Hunter requires Python 3.8. If you don't have Python 3.8,
+use pyenv to install it.
+
+Use pipx to install Hunter:
+
+```
+pipx install git+ssh://g...@github.com/apache/hunter
+```
+
+## Setup
+
+Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust the data source configuration.
+
+> [!TIP]
+> See the docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md),
+> [PostgreSQL](POSTGRESQL.md), or [BigQuery](BIGQUERY.md).
+
+Alternatively, it is possible to leave the config file as is and provide credentials in the environment
+by setting the appropriate environment variables.
+Environment variables are interpolated before the configuration file is interpreted.
+
+## Defining tests
+
+All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/),
+[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery).
+
+Tests are defined in the `tests` section.
+For example, the following definition will import results of the test from a CSV file:
+
+```yaml
+tests:
+  local.sample:
+    type: csv
+    file: tests/resources/sample.csv
+    time_column: time
+    metrics: [metric1, metric2]
+    attributes: [commit]
+    csv_options:
+      delimiter: ","
+      quote_char: "'"
+```
+
+The `time_column` property points to the name of the column storing the timestamp
+of each test run. The data points will be ordered by that column.
+
+The `metrics` property selects the columns that hold the values to be analyzed. These values must
+be numbers convertible to floats. The `metrics` property can be not only a simple list of column
+names, but also a dictionary configuring other properties of each metric,
+such as the column name or direction:
+
+```yaml
+metrics:
+  resp_time_p99:
+    direction: -1
+    column: p99
+```
+
+Direction can be 1 or -1. If direction is set to 1, the higher the metric, the
+better the performance. If it is set to -1, higher values mean worse performance.
+
+The `attributes` property describes any other columns that should be attached to the final
+report. The special attributes `version` and `commit` can be used to query for a given time range.
+
+> [!TIP]
+> To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](TEMPLATES.md).
+
+## Listing Available Tests
+
+```
+hunter list-groups
+hunter list-tests [group name]
+```
+
+## Listing Available Metrics for Tests
+
+To list all available metrics defined for the test:
+
+```
+hunter list-metrics <test>
+```
+
+## Finding Change Points
+
+> [!TIP]
+> For more details, see the docs about [Finding Change Points](ANALYZE.md), [Validating Performance against Baseline](VALIDATING_PERF.md),
+> and [Validating Performance of a Feature Branch](FEATURE_BRANCH.md).
+
+```
+hunter analyze <test>...
+hunter analyze <group>...
+```
+
+This command prints interesting results of all runs of the test and a list of change-points.
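To make the role of the `direction` property concrete, here is a small illustrative Python sketch of how a detected shift in a metric's mean could be classified as a regression or an improvement. The `classify_change` helper is invented for exposition only; it is not part of Hunter's code:

```python
def classify_change(old_mean, new_mean, direction):
    """Classify a shift in a metric using its direction.

    direction == 1  means higher values are better;
    direction == -1 means higher values are worse.
    Illustrative helper only - not part of Hunter's API.
    """
    delta = new_mean - old_mean
    if delta == 0:
        return "no change"
    return "improvement" if delta * direction > 0 else "regression"


# A p99 response time metric (direction = -1) that goes up is a regression:
print(classify_change(10.0, 12.5, direction=-1))  # regression
```

For a latency metric such as `resp_time_p99` with `direction: -1`, an increase is reported as a regression, while the same increase on a throughput metric with `direction: 1` would be an improvement.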
+
+A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and
+when the difference is consistent enough that it is unlikely to have happened by chance.
+
+Hunter calculates the probability (P-value) that the change point was caused by chance - the closer to zero, the more
+"sure" it is about the regression or performance improvement. The smaller the actual magnitude of the change, the
+more data points are needed to confirm the change, so Hunter may not notice the regression after the first run
+that regressed.

Review Comment:
   https://github.com/apache/hunter/pull/28/commits/2425c0f984f9d15e8d5a69b72ab32de76fb72fb7

##########
docs/README.md:
##########
@@ -0,0 +1,21 @@
+# Table of Contents
+
+## Getting Started
+- [Installation](docs/INSTALL.md)
+- [Getting Started](docs/GETTING_STARTED.md)
+- [Contributing](docs/CONTRIBUTING.md)
+
+## Basics

Review Comment:
   I've combined listing tests/metrics with "Finding Change Points" into Basics in [7cdba1a](https://github.com/apache/hunter/pull/28/commits/7cdba1add93a9b065c52935d34b93446f8ca9f74) and then moved the rest in [1458c06](https://github.com/apache/hunter/pull/28/commits/1458c06f77f69a873e7951c7c41e6490077f5800).

##########
docs/FEATURE_BRANCH.md:
##########
@@ -0,0 +1,52 @@
+# Validating Performance of a Feature Branch
+
+The `hunter regressions` command can work with feature branches.
+
+First you need to tell Hunter how to fetch the data of the tests run against a feature branch.
+The `prefix` property of the Graphite test definition accepts the `%{BRANCH}` variable,
+which is substituted at data import time with the branch name passed to the `--branch`
+command argument. Alternatively, if the prefix for the main branch of your product is different
+from the prefix used for feature branches, you can define an additional `branch_prefix` property.
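The P-value intuition described in the GETTING_STARTED.md hunk above can be illustrated with a toy permutation test: shuffle the pooled runs many times and count how often a mean difference at least as large as the observed one appears by chance. This is only a conceptual sketch, not the change-point algorithm Hunter actually implements:

```python
import random


def permutation_p_value(before, after, n_perm=2000, seed=42):
    """Estimate the probability that the observed difference in means
    between two windows of runs arose by chance.

    Conceptual sketch only - not Hunter's actual change-point algorithm.
    """
    rng = random.Random(seed)
    observed = abs(sum(after) / len(after) - sum(before) / len(before))
    pooled = list(before) + list(after)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        b, a = pooled[: len(before)], pooled[len(before):]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    return hits / n_perm


# A clear shift in the metric yields an estimate close to zero:
before = [100, 101, 99, 100, 102, 98, 100, 101]
after = [110, 112, 111, 109, 113, 110, 111, 112]
print(permutation_p_value(before, after))  # close to 0.0
```

With clearly separated windows like these, almost no shuffle reproduces the observed shift, so the estimate lands near zero; identical windows give 1.0. This also shows why small shifts need more data points: with few points, random shuffles often match the observed difference, keeping the P-value high.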
+
+```yaml
+my-product.test-1:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-1]
+  prefix: performance-tests.daily.%{BRANCH}.my-product.test-1
+  inherit: common-metrics
+
+my-product.test-2:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-2]
+  prefix: performance-tests.daily.master.my-product.test-2
+  branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2
+  inherit: common-metrics
+```
+
+Now you can verify that the correct data are imported by running
+`hunter analyze <test> --branch <branch>`.
+
+The `--branch` argument also works with `hunter regressions`. In this case a comparison is made
+between the tail of the specified branch and the tail of the main branch (or a point of the
+main branch specified by one of the `--since` selectors).
+
+```
+$ hunter regressions <test or group> --branch <branch>
+$ hunter regressions <test or group> --branch <branch> --since <date>
+$ hunter regressions <test or group> --branch <branch> --since-version <version>
+$ hunter regressions <test or group> --branch <branch> --since-commit <commit>
+```
+
+Sometimes when working on a feature branch, you may run the tests multiple times,
+creating more than one data point. To ignore the previous test results and compare
+only the last few points on the branch with the tail of the main branch,
+use the `--last <n>` selector. E.g., to check regressions on the last run of the tests
+on the feature branch:
+
+```
+$ hunter regressions <test or group> --branch <branch> --last 1
+```
+
+Please be aware that performance validation based on a single data point is quite weak,
+and Hunter might miss a regression if the point is not very different from
+the baseline.

Review Comment:
   238c1b4
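The `prefix`/`branch_prefix` selection described in the FEATURE_BRANCH.md hunk above could be sketched as follows. `resolve_graphite_prefix` is a hypothetical helper invented for illustration, not Hunter's actual implementation:

```python
def resolve_graphite_prefix(test, branch=None):
    """Pick and interpolate the Graphite prefix for a test definition.

    Assumption (illustrative, not Hunter's real code): when --branch is
    given and the test defines `branch_prefix`, that template is used;
    otherwise `prefix` is used. %{BRANCH} is replaced with the branch name.
    """
    if branch:
        template = test.get("branch_prefix", test["prefix"])
        return template.replace("%{BRANCH}", branch)
    return test["prefix"]


# Test definition mirroring my-product.test-2 from the example above:
test2 = {
    "prefix": "performance-tests.daily.master.my-product.test-2",
    "branch_prefix": "performance-tests.feature.%{BRANCH}.my-product.test-2",
}
print(resolve_graphite_prefix(test2, branch="my-feature"))
# performance-tests.feature.my-feature.my-product.test-2
```

Without `--branch`, the same test would fall back to the plain `prefix` pointing at the main branch's data.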