markrmiller commented on code in PR #406: URL: https://github.com/apache/solr/pull/406#discussion_r1876519771
########## solr/benchmark/docs/jmh-profilers-setup.md: ########## @@ -0,0 +1,406 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + --> + +# JMH Profiler Setup (Async-Profiler and Perfasm) + +JMH ships with a number of built-in profiler options, and the list has grown over time. The profiler system is also pluggable, +allowing "after-market" profiler implementations to be added on the fly. + +Many of these profilers, most often the ones that stay in the realm of Java, work across platforms and architectures right out +of the box. Others target a specific OS, though a similar profiler often exists for the other OSes. A couple of very valuable +profilers also require additional setup and environment configuration to work fully or at all. + +[TODO: link to page that only lists commands with simple section] + +- [JMH Profiler Setup (Async-Profiler and Perfasm)](#jmh-profiler-setup-async-profiler-and-perfasm) + - [Async-Profiler](#async-profiler) + - [Install async-profiler](#install-async-profiler) + - [Install Java Debug Symbols](#install-java-debug-symbols) + - [Ubuntu](#ubuntu) + - [Arch](#arch) + - [Perfasm](#perfasm) + - [Arch](#arch-1) + - [Ubuntu](#ubuntu-1) + +<br/> +This guide covers setting up both the async-profiler and the Perfasm profiler. Currently, we roughly cover two Linux family trees, +but much of the information can be extrapolated or help point in the right direction for other systems. + +<br/> <br/> + +|<b>Path 1: Arch, Manjaro, etc</b>|<b>Path 2: Debian, Ubuntu, etc</b>| | :---: | :---: | | <image src="https://user-images.githubusercontent.com/448788/137563725-0195a732-da40-4c8b-a5e8-fd904a43bb79.png"/><image src="https://user-images.githubusercontent.com/448788/137563722-665de88f-46a4-4939-88b0-3f96e56989ea.png"/> | <image src="https://user-images.githubusercontent.com/448788/137563909-6c2d2729-2747-47a0-b2bd-f448a958b5be.png"/><image src="https://user-images.githubusercontent.com/448788/137563908-738a7431-88db-47b0-96a4-baaed7e5024b.png"/> | + +<br/> + +If you run `jmh.sh` with the `-lprof` argument, it will attempt to list only the profilers that it detects will work in your particular environment. + +You should do this first to see where you stand.
+ + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png) + +```Shell +./jmh.sh -lprof +``` + +</div> + + +<br/> + +In our case, we will start with clean, very **minimal** Arch and Ubuntu installations, and so we already know there is _**no chance**_ that async-profiler or Perfasm +are going to run. + +In fact, first we have to install a few project build requirements before thinking too much about JMH profiler support. + +We will run on **Arch/Manjaro**, but there should not be any difference from **Debian/Ubuntu** at this stage. + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 5px 10px 10px;padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png) + +```Shell +sudo pacman -S wget jdk11-openjdk +``` + +</div> + +<br/> + +Here we give **async-profiler** a try on **Arch** anyway and observe the failure, which indicates that, at a minimum, we need to obtain the async-profiler library and +put it in the correct location. + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) + +```Shell +./jmh.sh BenchMark -prof async +``` + +<pre> + <image src="https://user-images.githubusercontent.com/448788/137534191-01c2bc7a-5c1f-42a2-8d66-a5d1a5280db4.png"/> Profilers failed to initialize, exiting. + + Unable to load async-profiler. Ensure asyncProfiler library is on LD_LIBRARY_PATH (Linux) + DYLD_LIBRARY_PATH (Mac OS), or -Djava.library.path. + + Alternatively, point to explicit library location with: '-prof async:libPath={path}' + + no asyncProfiler in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib] + </pre> + +</div> + +### Async-Profiler + +#### Install async-profiler + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png) + +```Shell +wget -c https://github.com/jvm-profiling-tools/async-profiler/releases/download/v2.5/async-profiler-2.5-linux-x64.tar.gz -O - | tar -xz +sudo mkdir -p /usr/java/packages/lib +sudo cp async-profiler-2.5-linux-x64/build/* /usr/java/packages/lib +``` + +</div> + +<br/> + +That should work out better, but there is still an issue that will prevent a successful profiling run. async-profiler relies on Linux's perf, +and on any recent Linux kernel, perf is restricted from doing its job until some configuration is loosened. + +Manjaro should have perf available, but you may need to install it in the other cases.
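Before installing anything, a quick check of whether perf is already present can save a step (a minimal sketch using standard commands, nothing Solr-specific assumed); if either command fails, install perf with your distribution's package manager as shown below:

```Shell
# confirm the perf binary is on the PATH and runnable
which perf
perf --version
```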
+ +<br/> + +![](https://user-images.githubusercontent.com/448788/137563908-738a7431-88db-47b0-96a4-baaed7e5024b.png) + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610566-883825b7-e66c-4d8b-a6a5-61542bc08d23.png) + +```Shell +sudo apt-get install linux-tools-common linux-tools-generic linux-tools-`uname -r` +``` + +</div> + +<br/> + +![](https://user-images.githubusercontent.com/448788/137563725-0195a732-da40-4c8b-a5e8-fd904a43bb79.png) + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610566-883825b7-e66c-4d8b-a6a5-61542bc08d23.png) + +```Shell +sudo pacman -S perf +``` + +</div> + + +<br/> + +And now the permissions issue. Note that changes made with `sysctl -w` do not survive a restart, and that is likely how you should leave things; if you do want them to persist, add the settings under `/etc/sysctl.d` instead. + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +```zsh +sudo sysctl -w kernel.kptr_restrict=0 +sudo sysctl -w kernel.perf_event_paranoid=1 +``` + +</div> + +<br/> + +Now we **should** see some success: + +<div style="z-index: 8; background-color: #364850; border-style: solid; border-width: 1px; border-color: #3b4d56;border-radius: 0px; margin: 0px 5px 3px 10px; padding-bottom: 1px;padding-top: 5px;" data-code-wrap="true"> + +![](https://user-images.githubusercontent.com/448788/137610116-eff6d0b7-e862-40fb-af04-452aaf585387.png) + +```Shell +./jmh.sh FuzzyQuery -prof async:output=flamegraph +``` + +</div> + +<br/> + +![](https://user-images.githubusercontent.com/448788/138650315-82adeb18-54cd-43ee-810e-24f1e22719c7.png) + +<br/> + +But you will also find an important _warning_ if you look closely at the logs. + +<br/> + +![](https://user-images.githubusercontent.com/448788/137613526-a188ff03-545c-465d-928d-bc433d2d204f.png) +<span style="color: yellow; margin-left: 5px;">[WARN]</span> `Install JVM debug symbols to improve profile accuracy` + +<br/> + +We do not want **debug symbols** stripped from Java for the best experience. Review Comment: maybe something a little more specific, like "... stripped from Java for optimal profiling accuracy and heap allocation analysis"
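A minimal sketch of one way to satisfy that warning on Debian/Ubuntu-family systems, assuming the distribution publishes a debug-symbol companion package for its OpenJDK build (`openjdk-11-dbg` is the Ubuntu package name for OpenJDK 11; other distributions may instead require a JDK built with debug symbols enabled):

```Shell
# install the debug-symbol companion package for the distro's OpenJDK 11 build
sudo apt-get install openjdk-11-dbg
```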
########## solr/benchmark/README.md: ########## @@ -1,339 +1,423 @@ -JMH-Benchmarks module -===================== +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at -This module contains benchmarks written using [JMH](https://openjdk.java.net/projects/code-tools/jmh/) from OpenJDK. -Writing correct micro-benchmarks in Java (or another JVM language) is difficult and there are many non-obvious -pitfalls (many due to compiler optimizations). JMH is a framework for running and analyzing benchmarks (micro or macro) -written in Java (or another JVM language). + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + --> -* [JMH-Benchmarks module](#jmh-benchmarks-module) - * [Running benchmarks](#running-benchmarks) - * [Using JMH with async profiler](#using-jmh-with-async-profiler) - * [Using JMH GC profiler](#using-jmh-gc-profiler) - * [Using JMH Java Flight Recorder profiler](#using-jmh-java-flight-recorder-profiler) - * [JMH Options](#jmh-options) - * [Writing benchmarks](#writing-benchmarks) - * [SolrCloud MiniCluster Benchmark Setup](#solrcloud-minicluster-benchmark-setup) - * [MiniCluster Metrics](#minicluster-metrics) - * [Benchmark Repeatability](#benchmark-repeatability) +# Solr JMH Benchmark Module -## Running benchmarks +![](https://user-images.githubusercontent.com/448788/140059718-de183e23-414e-4499-883a-34ec3cfbd2b6.png) -If you want to set specific JMH flags or only run certain benchmarks, passing arguments via gradle tasks is cumbersome. -The process has been simplified by the provided `jmh.sh` script. +**_`profile, compare and introspect`_** -The default behavior is to run all benchmarks: +<samp>**A flexible, developer-friendly, microbenchmark framework**</samp> -`./jmh.sh` +![](https://img.shields.io/badge/developer-tool-blue) -Pass a pattern or name after the command to select the benchmarks: +## Table of Contents -`./jmh.sh CloudIndexing` + - [Table of Contents](#table-of-contents) + - [Overview](#overview) + - [Code Organization Breakdown](#code-organization-breakdown) + - [Getting Started](#getting-started) + - [Running `jmh.sh` with no Arguments](#running-jmhsh-with-no-arguments) + - [Pass a regex pattern or name after the command to select the benchmark(s) to run](#pass-a-regex-pattern-or-name-after-the-command-to-select-the-benchmarks-to-run) + - [The argument `-l` will list all the available benchmarks](#the-argument--l-will-list-all-the-available-benchmarks) + - [Check which benchmarks will run by entering a pattern after the -l argument](#check-which-benchmarks-will-run-by-entering-a-pattern-after-the--l-argument) + - [Further Pattern Examples](#further-pattern-examples) + - [The JMH Script Accepts _ALL_ of the Standard JMH Arguments](#the-jmh-script-accepts-all-of-the-standard-jmh-arguments) + - [Overriding Benchmark Parameters](#overriding-benchmark-parameters) + - [Format and Write Results to Files](#format-and-write-results-to-files) + - [JMH Command-Line Arguments](#jmh-command-line-arguments) + - [The JMH Command-Line Syntax](#the-jmh-command-line-syntax) + - [The Full List of JMH Arguments](#the-full-list-of-jmh-arguments) + - [Writing JMH benchmarks](#writing-jmh-benchmarks) + - [Continued Documentation](#continued-documentation) + +--- -Check which benchmarks match the provided pattern: - -`./jmh.sh -l CloudIndexing` - -Run a specific test and overrides the number of forks, iterations and sets warm-up iterations to `2`: - -`./jmh.sh -f 2 -i 2 -wi 2 CloudIndexing` - -Run a specific test with async and GC profilers on Linux and flame graph output: - -`./jmh.sh -prof gc -prof
async:libPath=/path/to/libasyncProfiler.so\;output=flamegraph\;dir=profile-results CloudIndexing` - -### Using JMH with async profiler - -It's good practice to check profiler output for micro-benchmarks in order to verify that they represent the expected -application behavior and measure what you expect to measure. Some example pitfalls include the use of expensive mocks or -accidental inclusion of test setup code in the benchmarked code. JMH includes -[async-profiler](https://github.com/jvm-profiling-tools/async-profiler) integration that makes this easy: - -`./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;dir=profile-results` - -With flame graph output: - -`./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;output=flamegraph\;dir=profile-results` - -Simultaneous cpu, allocation and lock profiling with async profiler 2.0 and jfr output: - -`./jmh.sh -prof async:libPath=/path/to/libasyncProfiler.so\;output=jfr\;alloc\;lock\;dir=profile-results CloudIndexing` - -A number of arguments can be passed to configure async profiler, run the following for a description: - -`./jmh.sh -prof async:help` - -You can also skip specifying libPath if you place the async profiler lib in a predefined location, such as one of the -locations in the env variable `LD_LIBRARY_PATH` if it has been set (many Linux distributions set this env variable, Arch -by default does not), or `/usr/lib` should work. - -#### OS Permissions for Async Profiler - -Async Profiler uses perf to profile native code in addition to Java code. It will need the following for the necessary -access. - -```bash -echo 0 > /proc/sys/kernel/kptr_restrict -echo 1 > /proc/sys/kernel/perf_event_paranoid -``` - -or - -```bash -sudo sysctl -w kernel.kptr_restrict=0 -sudo sysctl -w kernel.perf_event_paranoid=1 -``` - -### Using JMH GC profiler - -You can run a benchmark with `-prof gc` to measure its allocation rate: - -`./jmh.sh -prof gc:dir=profile-results` - -Of particular importance are the `norm` alloc rates, which measure the allocations per operation rather than allocations -per second. - -### Using JMH Java Flight Recorder profiler - -JMH comes with a variety of built-in profilers. Here is an example of using JFR: - -`./jmh.sh -prof jfr:dir=profile-results\;configName=jfr-profile.jfc` - -In this example we point to the included configuration file with configName, but you could also do something like -settings=default or settings=profile. - -### Benchmark Outputs - -By default, output that benchmarks generate is created in the build/work directory. You can change this location by setting the workBaseDir system property like this: - - -jvmArgsAppend -DworkBaseDir=/data3/bench_work - -If a profiler generates output, it will generally be written to the current working directory - that is the benchmark module directory itself. You can usually change this via the dir option, for example: - - ./jmh.sh -prof jfr:dir=build/work/profile-results JsonFaceting - -### Using a Separate MiniCluster Base Directory - -If you have a special case MiniCluster you have generated, such as one you have prepared with very large indexes for a search benchmark run, you can change the base directory used by the profiler -for the MiniCluster with the miniClusterBaseDir system property. This is for search-based benchmarks in general, and the MiniCluster will not be removed automatically by the benchmark.
- -### JMH Options - -Some common JMH options are: - -```text +## Overview + +JMH is a Java **microbenchmark** framework from some of the developers who work on +OpenJDK. Not surprisingly, OpenJDK is where you will find JMH's home today, alongside some +other useful little Java libraries such as JOL (Java Object Layout). + +The significant value in JMH is that you get to stand on the shoulders of some brilliant +engineers who have done some tricky groundwork that many an ambitious Java benchmark writer +has merrily wandered past. + +Rather than simply providing a boilerplate framework for driving iterations and measuring +elapsed times, which JMH does happily do, the focus is on the many forces that +deceive and disorient the earnest benchmark enthusiast. + +From spinning your benchmark into all new generated source code +in an attempt to avoid falling victim to undesirable optimizations, to offering +**BlackHoles** and a solid collection of convention and cleverly thought out yet +simple boilerplate, the goal of JMH is to lift the developer off the +microbenchmark floor and at least to their knees. + +JMH reaches out a hand to both the best and most regular among us in a solid, cautious +effort to promote the willing into the real-world, often obscured game of the microbenchmark. + +## Code Organization Breakdown + +![](https://img.shields.io/badge/data-...move-blue) + +- **JMH:** microbenchmark classes and some common base code to support them. + +- **Random Data:** a framework for easily generating specific and repeatable random data. + +## Getting Started + +Running **JMH** is handled via the `jmh.sh` shell script. This script uses Gradle to +extract the correct classpath and configures a handful of helpful Java +command-line arguments and system properties. For the most part, the `jmh.sh` script +will pass any arguments it receives directly to JMH. You run the script +from the root JMH module directory. + +### Running `jmh.sh` with no Arguments + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> # run all benchmarks found in subdirectories +> ./jmh.sh +> ``` + +### Pass a regex pattern or name after the command to select the benchmark(s) to run + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> ./jmh.sh BenchmarkClass +> ``` + +### The argument `-l` will list all the available benchmarks + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> ./jmh.sh -l +> ``` + +### Check which benchmarks will run by entering a pattern after the -l argument + +Use the full benchmark class name, the simple class name, the benchmark +method name, or a substring. + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> ./jmh.sh -l Ben +> ``` + +### Further Pattern Examples + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```shell +>./jmh.sh -l org.apache.solr.benchmark.search.BenchmarkClass +>./jmh.sh -l BenchmarkClass +>./jmh.sh -l BenchmarkClass.benchmethod +>./jmh.sh -l Bench +>./jmh.sh -l benchme +> ``` + +### The JMH Script Accepts _ALL_ of the Standard JMH Arguments + +Here we tell JMH to fork two trials, running each in a fresh JVM. We also explicitly set the number of warmup +iterations and the measured iterations to 2.
+ +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> ./jmh.sh -f 2 -wi 2 -i 2 BenchmarkClass +> ``` + +### Overriding Benchmark Parameters + +> ![](https://img.shields.io/badge/overridable-params-blue) +> +> ```java +> @Param("1000") +> private int numDocs; +> ``` + +The state objects that can be specified in benchmark classes will often have a +number of input parameters that benchmark method calls will access. The notation +above will default `numDocs` to 1000 and also allow you to override that value +using the `-p` argument. A benchmark might also use a `@Param` annotation such as: + +> ![](https://img.shields.io/badge/sequenced-params-blue) +> +> ```java +> @Param({"1000", "5000", "10000"}) +> private int numDocs; +> ``` + +By default, that would cause the benchmark +to be run enough times to use each of the specified values. If multiple input +parameters are specified this way, the number of runs needed will quickly +expand. You can pass multiple `-p` +arguments and each will completely replace the behavior of any default +annotation values. + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> # use 2000 docs instead of 1000 +> ./jmh.sh BenchmarkClass -p numDocs=2000 +> +> +> # use 5 docs, then 50, then 500 +> ./jmh.sh BenchmarkClass -p numDocs=5,50,500 +> +> +> # run the benchmark enough times to satisfy every combination of two +> # multi-valued input parameters +> ./jmh.sh BenchmarkClass -p numDocs=10,20,30 -p docSize=250,500 +> ``` + +### Format and Write Results to Files + +Rather than just dumping benchmark results to the console, you can specify the +`-rf` argument to control the output format; for example, you can choose CSV or +JSON. The `-rff` argument will dictate the filename and output location. + +> ![](https://user-images.githubusercontent.com/448788/137607441-f083e1fe-b3e5-4326-a9ca-2109c9cef985.png) +> +> ```zsh +> # format output to JSON and write the file to the `work` directory relative to +> # the JMH working directory. +> ./jmh.sh BenchmarkClass -rf json -rff work/jmh-results.json +> ``` +> +> 💡 **If you pass only the `-rf` argument, JMH will write out a file to the +> current working directory with the appropriate extension, e.g.,** `jmh-result.csv`. + +## JMH Command-Line Arguments + +### The JMH Command-Line Syntax + +> ![](https://img.shields.io/badge/Help-output-blue) +> +> ```zsh +> Usage: ./jmh.sh [regexp*] [options] +> [opt] means optional argument. +> <opt> means required argument. +> "+" means comma-separated list of values. +> "time" arguments accept time suffixes, like "100ms". +> +> Command-line options usually take precedence over annotations. +> ``` + +### The Full List of JMH Arguments + +```zsh Usage: ./jmh.sh [regexp*] [options] [opt] means optional argument. <opt> means required argument. - "+" means comma-separated list of values. + "+" means a comma-separated list of values. "time" arguments accept time suffixes, like "100ms". -Command line options usually take precedence over annotations. +Command-line options usually take precedence over annotations. [arguments] Benchmarks to run (regexp+). (default: .*) - -bm <mode> Benchmark mode. Available modes are: [Throughput/thrpt, - AverageTime/avgt, SampleTime/sample, SingleShotTime/ss, + -bm <mode> Benchmark mode. Available modes are: + [Throughput/thrpt, AverageTime/avgt, + SampleTime/sample, SingleShotTime/ss, + All/all].
(default: Throughput) -bs <int> Batch size: number of benchmark method calls per operation. Some benchmark modes may ignore this - setting, please check this separately. (default: - 1) + setting; please check this separately. + (default: 1) -e <regexp+> Benchmarks to exclude from the run. - -f <int> How many times to fork a single benchmark. Use 0 to - disable forking altogether. Warning: disabling - forking may have detrimental impact on benchmark - and infrastructure reliability, you might want - to use different warmup mode instead. (default: - 5) - - -foe <bool> Should JMH fail immediately if any benchmark had - experienced an unrecoverable error? This helps - to make quick sanity tests for benchmark suites, - as well as make the automated runs with checking error + -f <int> How many times to fork a single benchmark. Use 0 + to disable forking altogether. Warning: + disabling forking may have a detrimental impact on + benchmark and infrastructure reliability. You might + want to use a different warmup mode instead. (default: 1) + + -foe <bool> Should JMH fail immediately if any benchmark has + experienced an unrecoverable error? Failing fast + helps to make quick sanity tests for benchmark + suites and allows automated runs to do error + checking. (default: false) -gc <bool> Should JMH force GC between iterations? Forcing - the GC may help to lower the noise in GC-heavy benchmarks, - at the expense of jeopardizing GC ergonomics decisions. + GC may help lower the noise in GC-heavy benchmarks + at the expense of jeopardizing GC ergonomics + decisions. Use with care. (default: false) - -h Display help, and exit. + -h Displays this help output and exits. - -i <int> Number of measurement iterations to do. Measurement - iterations are counted towards the benchmark score. - (default: 1 for SingleShotTime, and 5 for all other - modes) + -i <int> Number of measurement iterations to do. + Measurement iterations are counted towards the + benchmark score. (default: 1 for SingleShotTime, + and 5 for all other modes) - -jvm <string> Use given JVM for runs. This option only affects forked - runs. + -jvm <string> Use given JVM for runs. This option only affects + forked runs. - -jvmArgs <string> Use given JVM arguments. Most options are inherited - from the host VM options, but in some cases you want - to pass the options only to a forked VM. Either single - space-separated option line, or multiple options - are accepted. This option only affects forked runs. + -jvmArgs <string> Use given JVM arguments. Most options are + inherited from the host VM options, but in some + cases, you want to pass the options only to a forked + VM. Either a single space-separated option line or + multiple options are accepted. This option only + affects forked runs. - -jvmArgsAppend <string> Same as jvmArgs, but append these options after the - already given JVM args. + -jvmArgsAppend <string> Same as jvmArgs, but append these options after + the already given JVM args. -jvmArgsPrepend <string> Same as jvmArgs, but prepend these options before the already given JVM arg. - -l List the benchmarks that match a filter, and exit. + -l List the benchmarks that match a filter and exit. - -lp List the benchmarks that match a filter, along with + -lp List the benchmarks that match a filter, along + with parameters, and exit. - -lprof List profilers, and exit. + -lprof List profilers and exit. - -lrf List machine-readable result formats, and exit. + -lrf List machine-readable result formats and exit.
-o <filename> Redirect human-readable output to a given file. - -opi <int> Override operations per invocation, see @OperationsPerInvocation - Javadoc for details. (default: 1) + -opi <int> Override operations per invocation, see + @OperationsPerInvocation Javadoc for details. + (default: 1) - -p <param={v,}*> Benchmark parameters. This option is expected to - be used once per parameter. Parameter name and parameter - values should be separated with equals sign. Parameter - values should be separated with commas. + -p <param={v,}*> Benchmark parameters. This option is expected to + be used once per parameter. The parameter name and + parameter values should be separated with an + equal sign. Parameter values should be separated + with commas. - -prof <profiler> Use profilers to collect additional benchmark data. - Some profilers are not available on all JVMs and/or - all OSes. Please see the list of available profilers - with -lprof. + -prof <profiler> Use profilers to collect additional benchmark + data. + Some profilers are not available on all JVMs or + all OSes. '-lprof' will list the + profilers that are available and that can run + with the current OS configuration and installed dependencies. - -r <time> Minimum time to spend at each measurement iteration. - Benchmarks may generally run longer than iteration - duration. (default: 10 s) + -r <time> Minimum time to spend at each measurement + iteration. Benchmarks may generally run longer + than the iteration duration. (default: 10 s) -rf <type> Format type for machine-readable results. These - results are written to a separate file (see -rff). - See the list of available result formats with -lrf. + results are written to a separate file + (see -rff). See the list of available result + formats with -lrf. (default: CSV) -rff <filename> Write machine-readable results to a given file. - The file format is controlled by -rf option. Please - see the list of result formats for available formats. + The -rf option controls the file format. Please + see the list of result formats available. (default: jmh-result.<result-format>) - -si <bool> Should JMH synchronize iterations? This would significantly - lower the noise in multithreaded tests, by making - sure the measured part happens only when all workers - are running. (default: true) + -si <bool> Should JMH synchronize iterations? Doing so would + significantly lower the noise in multithreaded + tests by ensuring that the measured part happens + when all workers are running. + (default: true) - -t <int> Number of worker threads to run with. 'max' means - the maximum number of hardware threads available - on the machine, figured out by JMH itself. (default: - 1) + -t <int> Number of worker threads to run with. 'max' means + the maximum number of hardware threads available on + the machine, figured out by JMH itself. + (default: 1) -tg <int+> Override thread group distribution for asymmetric benchmarks. This option expects a comma-separated - list of thread counts within the group. See @Group/@GroupThreads + list of thread counts within the group. See + @Group/@GroupThreads Javadoc for more information. - -to <time> Timeout for benchmark iteration. After reaching - this timeout, JMH will try to interrupt the running - tasks. Non-cooperating benchmarks may ignore this + -to <time> Timeout for benchmark iteration. After reaching + this timeout, JMH will try to interrupt the running + tasks. Non-cooperating benchmarks may ignore this + timeout.
(default: 10 min) - -tu <TU> Override time unit in benchmark results. Available - time units are: [m, s, ms, us, ns]. (default: SECONDS) + -tu <TU> Override time unit in benchmark results. Available + time units are: [m, s, ms, us, ns]. + (default: SECONDS) - -v <mode> Verbosity mode. Available modes are: [SILENT, NORMAL, - EXTRA]. (default: NORMAL) + -v <mode> Verbosity mode. Available modes are: [SILENT, + NORMAL, EXTRA]. (default: NORMAL) - -w <time> Minimum time to spend at each warmup iteration. Benchmarks + -w <time> Minimum time to spend at each warmup iteration. + Benchmarks may generally run longer than iteration duration. (default: 10 s) - -wbs <int> Warmup batch size: number of benchmark method calls - per operation. Some benchmark modes may ignore this - setting. (default: 1) + -wbs <int> Warmup batch size: number of benchmark method + calls per operation. Some benchmark modes may + ignore this setting. (default: 1) - -wf <int> How many warmup forks to make for a single benchmark. - All iterations within the warmup fork are not counted - towards the benchmark score. Use 0 to disable warmup - forks. (default: 0) + -wf <int> How many warmup forks to make for a single + benchmark. All benchmark iterations within the + warmup fork do not count towards the benchmark score. + Use 0 to disable warmup forks. (default: 0) - -wi <int> Number of warmup iterations to do. Warmup iterations - are not counted towards the benchmark score. (default: - 0 for SingleShotTime, and 5 for all other modes) + -wi <int> Number of warmup iterations to do. Warmup + iterations do not count towards the benchmark + score. (default: 0 for SingleShotTime, and 5 for + all other modes) -wm <mode> Warmup mode for warming up selected benchmarks. - Warmup modes are: INDI = Warmup each benchmark individually, - then measure it. BULK = Warmup all benchmarks first, - then do all the measurements. BULK_INDI = Warmup - all benchmarks first, then re-warmup each benchmark - individually, then measure it. (default: INDI) - - -wmb <regexp+> Warmup benchmarks to include in the run in addition - to already selected by the primary filters. Harness - will not measure these benchmarks, but only use them - for the warmup. + Warmup modes are INDI = Warm up each benchmark + individually, then measure it. BULK = Warm up all + benchmarks first, then do all the measurements. + BULK_INDI = Warm up all benchmarks first, then + re-warm up each benchmark individually, then + measure it. (default: INDI) + + -wmb <regexp+> Warmup benchmarks to include in the run, in + addition to those already selected by the primary + filters. The harness will not measure these + benchmarks but only use them for the warmup. ``` -## Writing benchmarks - -For help in writing correct JMH tests, the best place to start is +--- + +## Writing JMH benchmarks + +For additional insight into writing correct JMH tests, the best place to start is the [sample code](https://hg.openjdk.java.net/code-tools/jmh/file/tip/jmh-samples/src/main/java/org/openjdk/jmh/samples/) provided by the JMH project. -JMH is highly configurable and users are encouraged to look through the samples for suggestions on what options are -available. A good tutorial for using JMH can be +JMH is highly configurable, and users are encouraged to look through the samples +for exposure to the options that are available.
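Using only flags documented in the list above, a quick sanity-pass invocation that can be handy while iterating on a new benchmark (`BenchmarkClass` is a placeholder name, as elsewhere in this README):

```Shell
# short run while developing: one fork, one warmup iteration,
# two measured iterations, and fail fast on unrecoverable errors
./jmh.sh BenchmarkClass -f 1 -wi 1 -i 2 -foe true
```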
A good tutorial for learning JMH basics is found [here](http://tutorials.jenkov.com/java-performance/jmh.html#return-value-from-benchmark-method) -Many Solr JMH benchmarks are actually closer to a full integration benchmark in that they run a single action against a -full Solr mini cluster. - -See [org.apache.solr.bench.index.CloudIndexing](https://github.com/apache/solr/blob/main/solr/benchmark/src/java/org/apache/solr/bench/index/CloudIndexing.java) -for an example of this. - -### SolrCloud MiniCluster Benchmark Setup - -#### MiniCluster Setup - -- CloudIndexing.java - -```java - @Setup(Level.Trial) - public void doSetup(MiniClusterState.MiniClusterBenchState miniClusterState) throws Exception { - System.setProperty("mergePolicyFactory", "org.apache.solr.index.NoMergePolicyFactory"); - miniClusterState.startMiniCluster(nodeCount); - miniClusterState.createCollection(COLLECTION, numShards, numReplicas); - } -``` - -#### MiniCluster Metrics - -After every iteration, the metrics collected by Solr will be dumped to the build/work/metrics-results folder. You can -disable metrics collection using the metricsEnabled method of the MiniClusterState, in which case the same output files -will be dumped, but the values will all be 0/null. - -### Benchmark Repeatability - -Indexes created for the benchmarks often involve randomness when generating terms, term length and number of terms in a -field. In order to make benchmarks repeatable, a static seed is used for randoms. This allows for generating varying -data while ensuring that data is consistent across runs. - -You can vary that seed by setting a system property to explore a wider range of variation in the benchmark: - -`-jvmArgsAppend -Dsolr.bench.seed=6624420638116043983` - -The seed used for a given benchmark run will be printed out near the top of the output. - -> --> benchmark random seed: 6624420638116043983 - -You can also specify where to place the mini-cluster with a system property: +## Continued Documentation -`-jvmArgsAppend -DminiClusterBaseDir=/benchmark-data/mini-cluster` +### 📚 Profilers -In this case, new data will not be generated for the benchmark, even if you change parameters. The use case for this if -you are running a query based benchmark and want to create a large index for testing and reuse (say hundreds of GB's). -Be aware that with this system property set, that same mini-cluster will be reused for any benchmarks run, regardless of -if that makes sense or not. +- 📒 [docs/jmh-profilers.md](docs/docs/jmh-profilers.md) +- 📒 [docs/jmh-profilers-setup.md](docs/docs/jmh-profilers-setup.md) Review Comment: is docs/docs correct for these two links? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org