henrikingo commented on code in PR #28:
URL: https://github.com/apache/hunter/pull/28#discussion_r1926812812


##########
README.md:
##########
@@ -4,409 +4,38 @@ Hunter – Hunts Performance Regressions
 _This is an unsupported open source project created by DataStax employees._
 
 
-Hunter performs statistical analysis of performance test results stored 
-in CSV files or Graphite database. It finds change-points and notifies about 
-possible performance regressions.  
- 
-A typical use-case of hunter is as follows: 
+Hunter performs statistical analysis of performance test results stored
+in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about
+possible performance regressions.
+
+A typical use-case of hunter is as follows:
 
 - A set of performance tests is scheduled repeatedly.
-- The resulting metrics of the test runs are stored in a time series database (Graphite) 
-   or appended to CSV files. 
-- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded 
+- The resulting metrics of the test runs are stored in a time series database (Graphite)
+   or appended to CSV files.
+- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded
   metrics regularly.
 - Hunter notifies about significant changes in recorded metrics by outputting text reports or
   sending Slack notifications.
-  
-Hunter is capable of finding even small, but systematic shifts in metric values, 
+
+Hunter is capable of finding even small, but systematic shifts in metric values,
 despite noise in data.
-It adapts automatically to the level of noise in data and tries not to notify about changes that 
-can happen by random. Unlike in threshold-based performance monitoring systems, 
-there is no need to setup fixed warning threshold levels manually for each recorded metric.  
-The level of accepted probability of false-positives, as well as the 
-minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing 
+It adapts automatically to the level of noise in data and tries not to notify about changes that
+can happen by random. Unlike in threshold-based performance monitoring systems,
+there is no need to setup fixed warning threshold levels manually for each recorded metric.
+The level of accepted probability of false-positives, as well as the
+minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing
 the level of performance recorded in two different periods of time – which is useful for
-e.g. validating the performance of the release candidate vs the previous release of your product.    
+e.g. validating the performance of the release candidate vs the previous release of your product.
 
-This is still work-in-progress, unstable code. 
-Features may be missing. 
+This is still work-in-progress, unstable code.

Review Comment:
   I disagree. This has been in production for years. But it is a testing tool. I would say we develop Hunter with lower standards than Cassandra and Kafka, the products we use Hunter on.



##########
README.md:
##########
@@ -4,409 +4,38 @@ Hunter – Hunts Performance Regressions
 _This is an unsupported open source project created by DataStax employees._
 
 
-Hunter performs statistical analysis of performance test results stored 
-in CSV files or Graphite database. It finds change-points and notifies about 
-possible performance regressions.  
- 
-A typical use-case of hunter is as follows: 
+Hunter performs statistical analysis of performance test results stored
+in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about
+possible performance regressions.
+
+A typical use-case of hunter is as follows:
 
 - A set of performance tests is scheduled repeatedly.
-- The resulting metrics of the test runs are stored in a time series database (Graphite) 
-   or appended to CSV files. 
-- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded 
+- The resulting metrics of the test runs are stored in a time series database (Graphite)
+   or appended to CSV files.
+- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded
   metrics regularly.
 - Hunter notifies about significant changes in recorded metrics by outputting text reports or
   sending Slack notifications.
-  
-Hunter is capable of finding even small, but systematic shifts in metric values, 
+
+Hunter is capable of finding even small, but systematic shifts in metric values,

Review Comment:
   I use the word "persistent". Systematic to me suggests the change is repeating.
   
   This comment is more conversational, no need to change.



##########
README.md:
##########
@@ -4,409 +4,38 @@ Hunter – Hunts Performance Regressions
 _This is an unsupported open source project created by DataStax employees._
 
 
-Hunter performs statistical analysis of performance test results stored 
-in CSV files or Graphite database. It finds change-points and notifies about 
-possible performance regressions.  
- 
-A typical use-case of hunter is as follows: 
+Hunter performs statistical analysis of performance test results stored
+in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about
+possible performance regressions.
+
+A typical use-case of hunter is as follows:
 
 - A set of performance tests is scheduled repeatedly.
-- The resulting metrics of the test runs are stored in a time series database (Graphite) 
-   or appended to CSV files. 
-- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded 
+- The resulting metrics of the test runs are stored in a time series database (Graphite)
+   or appended to CSV files.
+- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded
   metrics regularly.
 - Hunter notifies about significant changes in recorded metrics by outputting text reports or
   sending Slack notifications.
-  
-Hunter is capable of finding even small, but systematic shifts in metric values, 
+
+Hunter is capable of finding even small, but systematic shifts in metric values,
 despite noise in data.
-It adapts automatically to the level of noise in data and tries not to notify about changes that 
-can happen by random. Unlike in threshold-based performance monitoring systems, 
-there is no need to setup fixed warning threshold levels manually for each recorded metric.  
-The level of accepted probability of false-positives, as well as the 
-minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing 
+It adapts automatically to the level of noise in data and tries not to notify about changes that
+can happen by random. Unlike in threshold-based performance monitoring systems,

Review Comment:
   Here I would be more specific, as "random" means a lot of things. At MongoDB we talked about anomaly detection vs change detection. Anomaly detection is good at finding outliers, which are singular events caused by a bad host or network issue or whatever. Change on the other hand is persistent.
   
   In addition, of course, we call the randomness "noise", which means changes that:
   * Are not systematic, but rather go up and down around some stable mean
   * Are explained by factors we are not measuring, or are not interested in. "Network was slow."
   
   To continue, e-divisive actually can and will also alert to changes in this noise itself. (And such changes can be due to issues in the product or test setup. For example, MongoDB would perform background flushes to disk every 60 seconds. So a 30 second test would have a 50% chance to be affected by flushing or not.) e-divisive will detect changes where the mean is constant but the variance changed.
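   
   A toy illustration of that last point (assuming `numpy` and `scipy`; this is just the statistics, not Hunter's actual pipeline): a series whose mean stays at 100 while the noise triples halfway through. A test on means sees nothing, while a test on variances flags the change.
   
   ```python
   import numpy as np
   from scipy import stats
   
   rng = np.random.default_rng(42)
   before = rng.normal(loc=100.0, scale=1.0, size=60)  # stable noise
   after = rng.normal(loc=100.0, scale=3.0, size=60)   # same mean, 3x the noise
   
   _, p_mean = stats.ttest_ind(before, after)  # compares means
   _, p_var = stats.levene(before, after)      # compares variances
   
   print(f"mean-shift p-value:     {p_mean:.3f}")  # typically large: no mean shift found
   print(f"variance-shift p-value: {p_var:.3g}")   # tiny: variance change found
   ```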



##########
docs/LIST_TESTS.md:
##########
@@ -0,0 +1,13 @@
+# Listing Available Tests

Review Comment:
   These two could IMO be combined into a single page?



##########
docs/VALIDATING_PERF.md:
##########
@@ -0,0 +1,45 @@
+# Validating Performance against Baseline

Review Comment:
   Do you ever use this? I consider this feature heretical myself :-) Hunter should be used to find change points. If you fix all the regressions, then you don't have regressions. If you don't, then you have regressions.
   
   What this feature supports is the old way of doing things: the release candidate is slower than the previous release but we don't know why. We then improve performance in an arbitrary component to compensate. When we get back to the original level, we can make a release.
   
   Oh god, now that I wrote it, it sounds even worse! I suggest deprecating this feature. Its existence misunderstands the whole benefit of Hunter!



##########
docs/GETTING_STARTED.md:
##########
@@ -0,0 +1,129 @@
+# Getting Started
+
+## Installation
+
+Hunter requires Python 3.8.  If you don't have python 3.8,
+use pyenv to install it.
+
+Use pipx to install hunter:
+
+```
+pipx install git+ssh://g...@github.com/apache/hunter
+```
+
+## Setup
+
+Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust data source configuration.
+
+> [!TIP]
+> See docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md),
+[PostgreSQL](POSTGRESQL.md), or [BigQuery](BIGQUERY.md).
+
+Alternatively, it is possible to leave the config file as is, and provide credentials in the environment
+by setting appropriate environment variables.
+Environment variables are interpolated before interpreting the configuration file.
+
+## Defining tests
+
+All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/),
+[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery).
+
+Tests are defined in the `tests` section. For example, the following definition will import results of the test from a CSV file:
+
+```yaml
+tests:
+  local.sample:
+    type: csv
+    file: tests/resources/sample.csv
+    time_column: time
+    metrics: [metric1, metric2]
+    attributes: [commit]
+    csv_options:
+      delimiter: ","
+      quote_char: "'"
+```
+
+The `time_column` property points to the name of the column storing the timestamp
+of each test-run. The data points will be ordered by that column.
+
+The `metrics` property selects the columns that hold the values to be analyzed. These values must
+be numbers convertible to floats. The `metrics` property can be not only a simple list of column
+names, but it can also be a dictionary configuring other properties of each metric, such as
+the column name or direction:
+
+```yaml
+metrics:
+  resp_time_p99:
+    direction: -1
+    column: p99
+```
+
+Direction can be 1 or -1. If direction is set to 1, this means that the higher the metric, the
+better the performance is. If it is set to -1, higher values mean worse performance.
+
+The `attributes` property describes any other columns that should be attached to the final
+report. Special attributes `version` and `commit` can be used to query for a given time-range.
+
+> [!TIP] To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](TEMPLATES.md).
+
+## Listing Available Tests
+
+```
+hunter list-groups
+hunter list-tests [group name]
+```
+
+## Listing Available Metrics for Tests
+
+To list all available metrics defined for the test:
+```
+hunter list-metrics <test>
+```
+
+## Finding Change Points
+
+> [!TIP]
+> For more details, see docs about [Finding Change Points](ANALYZE.md), [Validating Performance against Baseline](VALIDATING_PERF.md),
+> and [Validating Performance of a Feature Branch](FEATURE_BRANCH.md).
+
+```
+hunter analyze <test>...
+hunter analyze <group>...
+```
+
+This command prints interesting results of all runs of the test and a list of change-points.
+
+A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and
+when the difference is consistent enough that it is unlikely to happen by chance.

Review Comment:
   Strictly speaking nothing happens by chance :-)
   
   I would just say "statistically significant", or "according to some statistical significance test (Hunter uses Student's t-test by default)".
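   
   For illustration, a minimal sketch of such a test (assuming `scipy`; this shows the concept, not Hunter's actual implementation):
   
   ```python
   from scipy import stats
   
   before = [101.2, 99.8, 100.5, 100.1, 99.9, 100.3]  # runs before the suspected change
   after = [103.9, 104.2, 103.5, 104.1, 103.8, 104.0]  # runs after it
   
   # Two-sample Student's t-test: is the shift statistically significant?
   t_stat, p_value = stats.ttest_ind(before, after)
   if p_value < 0.001:  # a tunable false-positive threshold
       print(f"statistically significant shift (p={p_value:.2e})")
   ```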



##########
docs/GETTING_STARTED.md:
##########
@@ -0,0 +1,129 @@
+# Getting Started
+
+## Installation
+
+Hunter requires Python 3.8.  If you don't have python 3.8,
+use pyenv to install it.
+
+Use pipx to install hunter:
+
+```
+pipx install git+ssh://g...@github.com/apache/hunter
+```
+
+## Setup
+
+Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust data source configuration.
+
+> [!TIP]
+> See docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md),
+[PostgreSQL](POSTGRESQL.md), or [BigQuery](BIGQUERY.md).
+
+Alternatively, it is possible to leave the config file as is, and provide credentials in the environment
+by setting appropriate environment variables.
+Environment variables are interpolated before interpreting the configuration file.
+
+## Defining tests
+
+All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/),
+[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery).
+
+Tests are defined in the `tests` section. For example, the following definition will import results of the test from a CSV file:
+
+```yaml
+tests:
+  local.sample:
+    type: csv
+    file: tests/resources/sample.csv
+    time_column: time
+    metrics: [metric1, metric2]
+    attributes: [commit]
+    csv_options:
+      delimiter: ","
+      quote_char: "'"
+```
+
+The `time_column` property points to the name of the column storing the timestamp
+of each test-run. The data points will be ordered by that column.
+
+The `metrics` property selects the columns that hold the values to be analyzed. These values must
+be numbers convertible to floats. The `metrics` property can be not only a simple list of column
+names, but it can also be a dictionary configuring other properties of each metric, such as
+the column name or direction:
+
+```yaml
+metrics:
+  resp_time_p99:
+    direction: -1
+    column: p99
+```
+
+Direction can be 1 or -1. If direction is set to 1, this means that the higher the metric, the
+better the performance is. If it is set to -1, higher values mean worse performance.
+
+The `attributes` property describes any other columns that should be attached to the final
+report. Special attributes `version` and `commit` can be used to query for a given time-range.
+
+> [!TIP] To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](TEMPLATES.md).
+
+## Listing Available Tests
+
+```
+hunter list-groups
+hunter list-tests [group name]
+```
+
+## Listing Available Metrics for Tests
+
+To list all available metrics defined for the test:
+```
+hunter list-metrics <test>
+```
+
+## Finding Change Points
+
+> [!TIP]
+> For more details, see docs about [Finding Change Points](ANALYZE.md), [Validating Performance against Baseline](VALIDATING_PERF.md),
+> and [Validating Performance of a Feature Branch](FEATURE_BRANCH.md).
+
+```
+hunter analyze <test>...
+hunter analyze <group>...
+```
+
+This command prints interesting results of all runs of the test and a list of change-points.
+
+A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and
+when the difference is consistent enough that it is unlikely to happen by chance.
+
+Hunter calculates the probability (P-value) that the change point was caused by chance - the closer to zero, the more
+"sure" it is about the regression or performance improvement. The smaller the actual magnitude of the change, the
+more data points are needed to confirm the change, therefore Hunter may not notice the regression after the first run
+that regressed.

Review Comment:
   "immediately after"



##########
docs/ANALYZE.md:
##########
@@ -0,0 +1,46 @@
+# Finding Change Points
+
+```
+hunter analyze <test>... 
+hunter analyze <group>...
+```
+
+This command prints interesting results of all
+runs of the test and a list of change-points.
+A change-point is a moment when a metric value starts to differ significantly
+from the values of the earlier runs and when the difference
+is consistent enough that it is unlikely to happen by chance.  
+Hunter calculates the probability (P-value) that the change point was caused
+by chance - the closer to zero, the more "sure" it is about the regression or
+performance improvement. The smaller the actual magnitude of the change,
+the more data points are needed to confirm the change, therefore Hunter may
+not notice the regression after the first run that regressed.
+
+The `analyze` command accepts multiple tests or test groups.
+The results are simply concatenated.
+
+#### Example
+
+> [!TIP]
+> See [hunter.yaml](../examples/csv/hunter.yaml) for the full example configuration.
+
+```
+$ hunter analyze local.sample --since=2024-01-01
+INFO: Computing change points for test sample.csv...

Review Comment:
   Are the samples distributed in the repo? If yes, maybe preface this with a link to where they are. If no, I added our tigerbeetle demo data set to tests/; you are welcome to use it here.



##########
docs/FEATURE_BRANCH.md:
##########
@@ -0,0 +1,52 @@
+# Validating Performance of a Feature Branch
+
+The `hunter regressions` command can work with feature branches.
+
+First you need to tell Hunter how to fetch the data of the tests run against a feature branch.
+The `prefix` property of the graphite test definition accepts `%{BRANCH}` variable,
+which is substituted at the data import time by the branch name passed to `--branch`
+command argument. Alternatively, if the prefix for the main branch of your product is different
+from the prefix used for feature branches, you can define an additional `branch_prefix` property.
+
+```yaml
+my-product.test-1:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-1]
+  prefix: performance-tests.daily.%{BRANCH}.my-product.test-1
+  inherit: common-metrics
+
+my-product.test-2:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-2]
+  prefix: performance-tests.daily.master.my-product.test-2
+  branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2
+  inherit: common-metrics
+```
+
+Now you can verify that the correct data are imported by running
+`hunter analyze <test> --branch <branch>`.
+
+The `--branch` argument also works with `hunter regressions`. In this case a comparison will be made
+between the tail of the specified branch and the tail of the main branch (or a point of the
+main branch specified by one of the `--since` selectors).
+
+```
+$ hunter regressions <test or group> --branch <branch> 
+$ hunter regressions <test or group> --branch <branch> --since <date>
+$ hunter regressions <test or group> --branch <branch> --since-version <version>
+$ hunter regressions <test or group> --branch <branch> --since-commit <commit>
+```
+
+Sometimes when working on a feature branch, you may run the tests multiple times,
+creating more than one data point. To ignore the previous test results, and compare
+only the last few points on the branch with the tail of the main branch,
+use the `--last <n>` selector. E.g. to check regressions on the last run of the tests
+on the feature branch:

Review Comment:
   The optimal way here is to take all the points after the last change point in the feature branch itself.
   
   This could also be the default behavior if it isn't yet. Meaning, hunter should start with computing change points for both branches, then compare the stable tails for both. (If it isn't already doing that, do you mind filing a ticket?)
   
   Note btw that this isn't just for the feature branch. You want to compare stable regions both for main and feature branch.
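   
   A sketch of the proposed behavior (hypothetical helpers, not Hunter's actual API; the change-point indices would come from e-divisive in practice):
   
   ```python
   from scipy import stats
   
   def stable_tail(values, change_points):
       """Points after the last change point, i.e. the stable region."""
       start = change_points[-1] if change_points else 0
       return values[start:]
   
   def branches_differ(main, main_cps, feature, feature_cps, alpha=0.001):
       # Compare only the stable tails of the two branches.
       main_tail = stable_tail(main, main_cps)
       feature_tail = stable_tail(feature, feature_cps)
       _, p_value = stats.ttest_ind(main_tail, feature_tail)
       return p_value < alpha
   ```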



##########
docs/GETTING_STARTED.md:
##########
@@ -0,0 +1,129 @@
+# Getting Started
+
+## Installation
+
+Hunter requires Python 3.8.  If you don't have python 3.8,
+use pyenv to install it.
+
+Use pipx to install hunter:
+
+```
+pipx install git+ssh://g...@github.com/apache/hunter
+```
+
+## Setup
+
+Copy the main configuration file `resources/hunter.yaml` to `~/.hunter/hunter.yaml` and adjust data source configuration.
+
+> [!TIP]
+> See docs on specific data sources to learn more about their configuration - [CSV](CSV.md), [Graphite](GRAPHITE.md),
+[PostgreSQL](POSTGRESQL.md), or [BigQuery](BIGQUERY.md).
+
+Alternatively, it is possible to leave the config file as is, and provide credentials in the environment
+by setting appropriate environment variables.
+Environment variables are interpolated before interpreting the configuration file.
+
+## Defining tests
+
+All test configurations are defined in the main configuration file.
+Hunter supports reading data from and publishing results to a CSV file, [Graphite](https://graphiteapp.org/),
+[PostgreSQL](https://www.postgresql.org/), and [BigQuery](https://cloud.google.com/bigquery).
+
+Tests are defined in the `tests` section. For example, the following definition will import results of the test from a CSV file:
+
+```yaml
+tests:
+  local.sample:
+    type: csv
+    file: tests/resources/sample.csv
+    time_column: time
+    metrics: [metric1, metric2]
+    attributes: [commit]
+    csv_options:
+      delimiter: ","
+      quote_char: "'"
+```
+
+The `time_column` property points to the name of the column storing the timestamp
+of each test-run. The data points will be ordered by that column.
+
+The `metrics` property selects the columns that hold the values to be analyzed. These values must
+be numbers convertible to floats. The `metrics` property can be not only a simple list of column
+names, but it can also be a dictionary configuring other properties of each metric, such as
+the column name or direction:
+
+```yaml
+metrics:
+  resp_time_p99:
+    direction: -1
+    column: p99
+```
+
+Direction can be 1 or -1. If direction is set to 1, this means that the higher the metric, the
+better the performance is. If it is set to -1, higher values mean worse performance.
+
+The `attributes` property describes any other columns that should be attached to the final
+report. Special attributes `version` and `commit` can be used to query for a given time-range.
+
+> [!TIP] To learn how to avoid repeating the same configuration in multiple tests, see [Avoiding test definition duplication](TEMPLATES.md).
+
+## Listing Available Tests
+
+```
+hunter list-groups
+hunter list-tests [group name]
+```
+
+## Listing Available Metrics for Tests
+
+To list all available metrics defined for the test:
+```
+hunter list-metrics <test>
+```
+
+## Finding Change Points
+
+> [!TIP]
+> For more details, see docs about [Finding Change Points](ANALYZE.md), [Validating Performance against Baseline](VALIDATING_PERF.md),
+> and [Validating Performance of a Feature Branch](FEATURE_BRANCH.md).
+
+```
+hunter analyze <test>...
+hunter analyze <group>...
+```
+
+This command prints interesting results of all runs of the test and a list of change-points.
+
+A change-point is a moment when a metric value starts to differ significantly from the values of the earlier runs and
+when the difference is consistent enough that it is unlikely to happen by chance.
+
+Hunter calculates the probability (P-value) that the change point was caused by chance - the closer to zero, the more

Review Comment:
   Here "chance" is ok.



##########
docs/FEATURE_BRANCH.md:
##########
@@ -0,0 +1,52 @@
+# Validating Performance of a Feature Branch
+
+The `hunter regressions` command can work with feature branches.
+
+First you need to tell Hunter how to fetch the data of the tests run against a feature branch.
+The `prefix` property of the graphite test definition accepts `%{BRANCH}` variable,
+which is substituted at the data import time by the branch name passed to `--branch`
+command argument. Alternatively, if the prefix for the main branch of your product is different
+from the prefix used for feature branches, you can define an additional `branch_prefix` property.
+
+```yaml
+my-product.test-1:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-1]
+  prefix: performance-tests.daily.%{BRANCH}.my-product.test-1
+  inherit: common-metrics
+
+my-product.test-2:
+  type: graphite
+  tags: [perf-test, daily, my-product, test-2]
+  prefix: performance-tests.daily.master.my-product.test-2
+  branch_prefix: performance-tests.feature.%{BRANCH}.my-product.test-2
+  inherit: common-metrics
+```
+
+Now you can verify that the correct data are imported by running
+`hunter analyze <test> --branch <branch>`.
+
+The `--branch` argument also works with `hunter regressions`. In this case a comparison will be made
+between the tail of the specified branch and the tail of the main branch (or a point of the
+main branch specified by one of the `--since` selectors).
+
+```
+$ hunter regressions <test or group> --branch <branch> 
+$ hunter regressions <test or group> --branch <branch> --since <date>
+$ hunter regressions <test or group> --branch <branch> --since-version <version>
+$ hunter regressions <test or group> --branch <branch> --since-commit <commit>
+```
+
+Sometimes when working on a feature branch, you may run the tests multiple times,
+creating more than one data point. To ignore the previous test results, and compare
+only the last few points on the branch with the tail of the main branch,
+use the `--last <n>` selector. E.g. to check regressions on the last run of the tests
+on the feature branch:
+
+```
+$ hunter regressions <test or group> --branch <branch> --last 1  
+```
+
+Please beware that performance validation based on a single data point is quite weak
+and Hunter might miss a regression if the point is not too much different from
+the baseline. 

Review Comment:
   I would continue by saying that accuracy improves as more data points accumulate, and that a normal way of using Hunter is to just merge a feature and then revert it if it is later flagged by Hunter.



##########
docs/INSTALL.md:
##########
@@ -0,0 +1,19 @@
+# Installation

Review Comment:
   I think this is all covered in the README? Could be removed or merged to save space.



##########
docs/ANALYZE.md:
##########
@@ -0,0 +1,46 @@
+# Finding Change Points
+
+```
+hunter analyze <test>... 
+hunter analyze <group>...
+```
+
+This command prints interesting results of all
+runs of the test and a list of change-points.
+A change-point is a moment when a metric value starts to differ significantly
+from the values of the earlier runs and when the difference
+is consistent enough that it is unlikely to happen by chance.  
+Hunter calculates the probability (P-value) that the change point was caused
+by chance - the closer to zero, the more "sure" it is about the regression or
+performance improvement. The smaller the actual magnitude of the change,
+the more data points are needed to confirm the change, therefore Hunter may
+not notice the regression after the first run that regressed.

Review Comment:
   "immediately after"
   
   I would also add that Hunter will eventually find the commit that caused the regression, rather than just focusing on the HEAD of a branch.



##########
README.md:
##########
@@ -4,409 +4,38 @@ Hunter – Hunts Performance Regressions
 _This is an unsupported open source project created by DataStax employees._
 
 
-Hunter performs statistical analysis of performance test results stored 
-in CSV files or Graphite database. It finds change-points and notifies about 
-possible performance regressions.  
- 
-A typical use-case of hunter is as follows: 
+Hunter performs statistical analysis of performance test results stored
+in CSV files, PostgreSQL, BigQuery, or Graphite database. It finds change-points and notifies about
+possible performance regressions.
+
+A typical use-case of hunter is as follows:
 
 - A set of performance tests is scheduled repeatedly.
-- The resulting metrics of the test runs are stored in a time series database (Graphite) 
-   or appended to CSV files. 
-- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded 
+- The resulting metrics of the test runs are stored in a time series database (Graphite)
+   or appended to CSV files.
+- Hunter is launched by a Jenkins/Cron job (or an operator) to analyze the recorded
   metrics regularly.
 - Hunter notifies about significant changes in recorded metrics by outputting text reports or
   sending Slack notifications.
-  
-Hunter is capable of finding even small, but systematic shifts in metric values, 
+
+Hunter is capable of finding even small, but systematic shifts in metric values,
 despite noise in data.
-It adapts automatically to the level of noise in data and tries not to notify about changes that 
-can happen by random. Unlike in threshold-based performance monitoring systems, 
-there is no need to setup fixed warning threshold levels manually for each recorded metric.  
-The level of accepted probability of false-positives, as well as the 
-minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing 
+It adapts automatically to the level of noise in data and tries not to notify about changes that
+can happen by random. Unlike in threshold-based performance monitoring systems,
+there is no need to setup fixed warning threshold levels manually for each recorded metric.
+The level of accepted probability of false-positives, as well as the
+minimal accepted magnitude of changes are tunable. Hunter is also capable of comparing
 the level of performance recorded in two different periods of time – which is useful for
-e.g. validating the performance of the release candidate vs the previous release of your product.    
+e.g. validating the performance of the release candidate vs the previous release of your product.
 
-This is still work-in-progress, unstable code. 
-Features may be missing. 
+This is still work-in-progress, unstable code.

Review Comment:
   In fact, except for the last line I would remove all of this. This list is true for all software.
   
   Instead we can communicate the same with a semantic versioning scheme. We can use versions < 1.0 to make it clear this is semi-mature.



##########
docs/README.md:
##########
@@ -0,0 +1,21 @@
+# Table of Contents
+
+## Getting Started
+- [Installation](docs/INSTALL.md)
+- [Getting Started](docs/GETTING_STARTED.md)
+- [Contributing](docs/CONTRIBUTING.md)
+
+## Basics

Review Comment:
   Possibly all of Basics could be a single file?



##########
docs/ANALYZE.md:
##########
@@ -0,0 +1,46 @@
+# Finding Change Points
+
+```
+hunter analyze <test>... 
+hunter analyze <group>...
+```
+
+This command prints interesting results of all
+runs of the test and a list of change-points.
+A change-point is a moment when a metric value starts to differ significantly
+from the values of the earlier runs and when the difference
+is consistent enough that it is unlikely to happen by chance.  

Review Comment:
   "is persistent and statistically significant"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hunter.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
