(tinkerpop) 24/27: Update developer docs for the AsciidoctorJ documentation build (tinkerpop-6jq.11)

colegreer Mon, 08 Jun 2026 10:17:18 -0700

This is an automated email from the ASF dual-hosted git repository.

Cole-Greer pushed a commit to branch docs-3.7
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git


commit a5d924a5aaa673ba9cc3faf40798db5a1e416840
Author: Cole Greer <[email protected]>
AuthorDate: Thu Jun 4 10:38:33 2026 -0700

    Update developer docs for the AsciidoctorJ documentation build (tinkerpop-6jq.11)
    
    The old shell/AWK preprocessor and postprocessor directories have been
    removed, but the developer documentation still described that system.
    Rewrite the "Documentation Environment" section to describe the
    Maven-based AsciidoctorJ extension: it now states the build is
    Maven-driven, runs OLAP examples against the local filesystem
    (fs.defaultFS=file:///) so no Hadoop cluster is required, notes the
    Spark-on-YARN recipe is rendered from pre-captured output, and adds the
    prerequisite distribution build and --dryRun option. Drop the obsolete
    pseudo-distributed Hadoop / yarn-site / mapred-site instructions and the
    AWK/GNU-utils requirements. Point the OLAP jar-conflict note at the new
    per-book plugin exclusion mechanism, and update stale "preprocessor"
    wording in the committer docs.
    
    Assisted-by: Kiro:claude-opus-4.8 [kiro-cli]
---
 CHANGELOG.asciidoc                                 |   1 +
 .../dev/developer/development-environment.asciidoc | 126 ++++++---------------
 docs/src/dev/developer/for-committers.asciidoc     |   6 +-
 3 files changed, 39 insertions(+), 94 deletions(-)

diff --git a/CHANGELOG.asciidoc b/CHANGELOG.asciidoc
index 2ca89883f0..6dc2f95779 100644
--- a/CHANGELOG.asciidoc
+++ b/CHANGELOG.asciidoc
@@ -26,6 +26,7 @@ image::https://raw.githubusercontent.com/apache/tinkerpop/master/docs/static/ima
 === TinkerPop 3.7.7 (Release Date: NOT OFFICIALLY RELEASED YET)
 
 * Restart the documentation build's Gremlin Console with conflicting plugins excluded per-book (via the `gremlin-docs-plugins-exclude` attribute), so Neo4j (Scala 2.11) and Spark (Scala 2.12) no longer collide on a shared classpath.
+* Updated the developer documentation to describe the Maven/AsciidoctorJ documentation build and removed references to the retired shell/AWK preprocessor.
 
 * Fixed conjoin has incorrect null handling.
 * Expanded `gremlin-python` CI matrix to test against Python 3.9, 3.10, 3.11, 3.12, and 3.13.
diff --git a/docs/src/dev/developer/development-environment.asciidoc b/docs/src/dev/developer/development-environment.asciidoc
index 5e667fe434..4fd92f9be1 100644
--- a/docs/src/dev/developer/development-environment.asciidoc
+++ b/docs/src/dev/developer/development-environment.asciidoc
@@ -122,95 +122,38 @@ an issue when working with SNAPSHOT dependencies.
 [[documentation-environment]]
 === Documentation Environment
 
-The documentation generation process is not Maven-based and uses shell scripts to process the project's asciidoc. The
-scripts should work on Mac and Linux. Javadocs should be built using Java 11.
+The documentation generation process is Maven-based: an link:https://asciidoctor.org/[AsciidoctorJ] extension
+(`tools/tinkerpop-docs`) walks each AsciiDoc book, executes the `[gremlin-groovy]` code blocks against a long-lived
+Gremlin Console subprocess, and renders the console output as tabbed HTML. The orchestration script
+`bin/process-docs.sh` wraps this: it validates the Gremlin Console and Gremlin Server distributions, installs the
+required plugins into the console, starts a Gremlin Server (for the `:remote` examples) and a Gephi mock, then invokes
+Maven to run the extension. Javadocs should be built using Java 11.
+
+NOTE: A previous implementation of this process was not Maven-based and instead relied on a pipeline of shell and AWK
+scripts under `docs/preprocessor` and `docs/postprocessor`. Those scripts have been removed; the console session scope
+also changed from per-file to per-book as a result (see <<docs-plugin-exclusions>>).
+
+The build runs Spark/Hadoop OLAP examples against the local filesystem (`fs.defaultFS=file:///`), so a running Hadoop
+cluster is *not* required for an ordinary documentation build. The one exception is the
+link:https://tinkerpop.apache.org/docs/x.y.z/recipes/#olap-spark-yarn[Spark-on-YARN recipe], whose example targets a
+real YARN cluster and is therefore rendered from pre-captured output rather than executed live. `bin/process-docs.sh`
+sets `HADOOP_GREMLIN_LIBS` for the console automatically. The YARN recipe text also references the `zip` program, so
+install it if you do not already have it.
+
+Before generating documentation, build the Gremlin Console and Gremlin Server distributions that the process consumes
+(include the Neo4j artifacts so the Neo4j examples can run):
 
-TIP: We recommend performing documentation generation on Linux. For the scripts to work on Mac, you will need to
-install GNU versions of the utility programs via `homebrew`, e.g.`grep`, `awk`, `sed`, `findutils`, and `diffutils`.
-
-To generate documentation, it is required that link:https://hadoop.apache.org[Hadoop 3.3.x] is running in
-link:https://hadoop.apache.org/docs/r3.3.1/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation[pseudo-distributed]
-mode. Be sure to set the `HADOOP_GREMLIN_LIBS` environment variable as described in the
-link:https://tinkerpop.apache.org/docs/x.y.z/reference/#hadoop-gremlin[reference documentation]. It is also important
-to set the `CLASSPATH` to point at the directory containing the Hadoop configuration files, like `mapred-site.xml`.
-
-The `/etc/hadoop/yarn-site.xml` file prefers this configuration over the one provided in the Hadoop documentation
-referenced above:
-
-[source,xml]
-----
-<configuration>
-  <property>
-    <name>yarn.nodemanager.aux-services</name>
-    <value>mapreduce_shuffle</value>
-  </property>
-  <property>
-    <name>yarn.nodemanager.vmem-check-enabled</name>
-    <value>false</value>
-  </property>
-  <property>
-    <name>yarn.nodemanager.vmem-pmem-ratio</name>
-    <value>4</value>
-  </property>
-</configuration>
-----
-
-The `/etc/hadoop/mapred-site.xml` file prefers the following configuration:
-
-[source,xml]
-----
-<configuration>
-  <property>
-    <name>mapreduce.framework.name</name>
-    <value>yarn</value>
-  </property>
-  <property>
-    <name>mapred.map.tasks</name>
-    <value>4</value>
-  </property>
-  <property>
-    <name>mapred.reduce.tasks</name>
-    <value>4</value>
-  </property>
-  <property>
-    <name>mapreduce.job.counters.limit</name>
-    <value>1000</value>
-  </property>
-  <property>
-    <name>mapreduce.jobtracker.address</name>
-    <value>localhost:9001</value>
-  </property>
-  <property>
-    <name>mapreduce.map.memory.mb</name>
-    <value>2048</value>
-  </property>
-  <property>
-    <name>mapreduce.reduce.memory.mb</name>
-    <value>4096</value>
-  </property>
-  <property>
-    <name>mapreduce.map.java.opts</name>
-    <value>-Xmx2048m</value>
-  </property>
-  <property>
-    <name>mapreduce.reduce.java.opts</name>
-    <value>-Xmx4096m</value>
-  </property>
-</configuration>
-----
-
-Also note that link:http://www.grymoire.com/Unix/Awk.html[awk] version `4.0.1` is required for documentation generation.
-The link:https://tinkerpop.apache.org/docs/x.y.z/recipes/#olap-spark-yarn[YARN recipe] also uses the `zip` program to
-create an archive so that needs to be installed, too, if you don't have it already.
-
-The Hadoop 3.3.x installation instructions call for installing `pdsh` but installing that seems to cause permission
-problems when executing `sbin/start-dfs.sh`. Skipping that prerequisite seems to solve the problem.
+[source,text]
+mvn clean install -pl :gremlin-console,:gremlin-server -am -DskipTests -DincludeNeo4j
 
-Documentation can be generated locally with:
+Documentation can then be generated locally with:
 
 [source,text]
 bin/process-docs.sh
 
+A `--dryRun` option renders the books without starting a console or server and without executing any code blocks,
+which is useful for quickly checking AsciiDoc/formatting changes.
+
 Documentation is generated to the `target/docs` directory. It is also possible to generate documentation locally with
 Docker. `docker/build.sh -d`.
 
@@ -219,14 +162,15 @@ failed`. It often helps in this case to delete the directories for the dependenc
 in the `.m2` (`~/.m2/`) and in the `grapes` (`~/.groovy/grapes/`) cache. E.g., if the error is about
 `asm#asm;3.2!asm.jar`, then remove the `asm/asm` sub directory in both directories.
 
-NOTE: Unexpected failures with OLAP often point to a jar conflict that arises in scenarios where Hadoop or Spark
-dependencies (or other dependencies for that matter) are modified and conflict. It is not picked up by the enforcer
-plugin because the inconsistency arises through plugin installation in Gremlin Console at document generation time.
-Making adjustments to the various paths by way of the `<manifestEntries>` on the jar given the functionality provided
-by the `DependencyGrabber` class which allows you to manipulate (typically deleting conflicting files from `/lib` and
-`/plugin`) plugin loading will usually resolve it, though it could also be a more general environmental problem with
-Spark or Hadoop. The easiest way to see the error is to simply run the examples in the Gremlin Console which more
-plainly displays the error than the failure of the documentation generation process.
+NOTE: Unexpected failures with OLAP often point to a jar conflict that arises when Hadoop, Spark, or Neo4j
+dependencies are modified and collide on the console's classpath. It is not picked up by the enforcer plugin because
+the inconsistency arises through plugin installation in the Gremlin Console at document generation time. The most
+common case -- Neo4j (Scala 2.11) and Spark (Scala 2.12) -- is handled by the per-book plugin exclusion mechanism
+described in <<docs-plugin-exclusions>>. For other conflicts, the `<manifestEntries>` (`Gremlin-Plugin-Paths` /
+`Gremlin-Lib-Paths`) on the plugin jar control how the `DependencyGrabber` lays jars out under `ext/<plugin>/plugin`
+and `ext/<plugin>/lib`, which can be adjusted to resolve ordering or duplicate-jar problems. The easiest way to see
+the underlying error is to run the offending example directly in the Gremlin Console, which displays it more plainly
+than the documentation build does.
 
 [[docs-plugin-exclusions]]
 ==== Per-book Plugin Exclusions
diff --git a/docs/src/dev/developer/for-committers.asciidoc b/docs/src/dev/developer/for-committers.asciidoc
index 5d40a18757..3e03b9a1b1 100644
--- a/docs/src/dev/developer/for-committers.asciidoc
+++ b/docs/src/dev/developer/for-committers.asciidoc
@@ -873,9 +873,9 @@ of the Apache "Licensing How-to" for more information.
 
 The documentation for TinkerPop is stored in the git repository in `docs/src/` and are then split into several
 subdirectories, each representing a "book" (or its own publishable body of work). If a new AsciiDoc file is added to
-a book, then it should also be included in the `index.asciidoc` file for that book, otherwise the preprocessor will
-ignore it. Likewise, if a whole new book (subdirectory) is added, it must include an `index.asciidoc` file to be
-recognized by the AsciiDoc preprocessor.
+a book, then it should also be included in the `index.asciidoc` file for that book, otherwise the documentation build
+will ignore it. Likewise, if a whole new book (subdirectory) is added, it must include an `index.asciidoc` file to be
+recognized by the documentation build.
 
 Adding a book also requires a change to the root `pom.xml` file. Find the "asciidoc" Maven profile and add a new
 `<execution>` to the `asciidoctor-maven-plugin` configuration. For each book in `docs/src/`, there should be a
