Cole-Greer opened a new pull request, #3418:
URL: https://github.com/apache/tinkerpop/pull/3418
## Replace AWK/shell docs preprocessing with AsciidoctorJ extension
### Overview
This PR introduces the `gremlin-docs` module — a custom AsciidoctorJ
TreeProcessor extension that replaces the entire AWK/shell preprocessing
pipeline used to build TinkerPop documentation. The old system consisted of 10
AWK scripts, 5 shell scripts, and required a running Gremlin Server, Hadoop
daemons, and a built Gremlin Console distribution. The new system runs entirely
within the Maven build via an embedded `GremlinGroovyScriptEngine`.
### What it does
- **Executes `[gremlin-groovy]` code blocks** in an embedded script engine
during Asciidoctor rendering, producing live console output with `gremlin>`
prompts and `==>` results
- **Auto-generates language variant tabs** (Java, Python, JavaScript, C#,
Go) for every translatable Gremlin example using the ANTLR-based
`GremlinTranslator`
- **Supports Hadoop/Spark examples** via local-mode Spark with a sandboxed
filesystem for `hdfs` operations
- **Handles standalone tab groups** (`[source,LANG,tab]` blocks) for
manually-authored multi-language examples
- **Detects console-only blocks** (`:remote`, `:>`, `:submit`) and renders
them as static code
- **Loads GremlinPlugin customizers** via SPI (hadoop-gremlin,
spark-gremlin, gremlin-console) for imports, bindings, and utility functions
like `describeGraph()`
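
As a rough illustration of the console-only detection described above, the check amounts to looking for lines that begin with a console command. This sketch is shell for illustration only: the extension itself is Java, and the function name `is_console_only` and the exact pattern are assumptions, not code from this PR.

```bash
# Hypothetical helper (not part of this PR): reads a code block on stdin and
# succeeds (exit 0) if any line starts with a console-only command.
is_console_only() {
  # Matches :remote, :> or :submit at the start of a line, optionally indented,
  # followed by whitespace or end of line.
  grep -Eq '^[[:space:]]*:(remote|>|submit)([[:space:]]|$)'
}

# Usage: choose a rendering mode for a block held in "$block".
block=':> g.V().count()'
if printf '%s\n' "$block" | is_console_only; then
  echo "render as static code"
else
  echo "execute in embedded engine"
fi
```

Note that a check like this only catches blocks containing console commands; blocks that use the driver API directly would pass through to execution.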
### How to use
```bash
bin/process-docs-new.sh            # full build with live gremlin execution
bin/process-docs-new.sh --dry-run  # skip execution (fast, for layout checks)
```
### Changes
- New `gremlin-docs/` module (not in the Maven reactor — built separately by
the build script)
- New `bin/process-docs-new.sh` build entry point
- Root `pom.xml`: added `gremlin-docs` as `asciidoctor-maven-plugin`
dependency, switched syntax highlighter from CodeRay to highlight.js 11.9.0,
added `tabs-1` CSS rule
- `docs/stylesheets/tinkerpop.css`: added `.tabs-1` CSS for single-tab blocks
### Known Issues
**Callout numbers not rendering in code blocks.** The circled callout
numbers (①②③) that appear inline in code examples are not visible. The
`<b class="conum">` elements are present in the HTML, but highlight.js destroys
them when it rewrites the `innerHTML` of the `<code>` block. The callout
explanation lists below the code blocks render correctly; only the inline
markers are affected.
**Path and other datatype formatting differs from console output.** The
Gremlin Console applies custom string conversions for certain datatypes (e.g.,
`Path` renders as `[v[1],v[2],vadas]` in the console but as
`path[v[1], v[2], vadas]` in the embedded engine). This affects how results are
displayed but not their correctness. Other datatypes with console-specific
formatting may also differ.
**SPARQL examples fail (~18 errors).** The `sparql-gremlin` plugin's
`SparqlTraversalSource` and `g.sparql()` calls fail because the SPARQL plugin
is not fully loaded in the embedded engine. These blocks fall back to dry-run
output showing `gremlin>` prompts without results.
**Neo4j examples fail (~6 errors).** Blocks using `Neo4jGraph`,
`graph.cypher()`, `vertex.addLabel()`, and other Neo4j-specific APIs fail
because Neo4j is not on the classpath. These fall back to dry-run output.
**Sugar plugin examples fail (~8 errors).** Groovy sugar syntax (`g.V.name`,
`g.V[0..2]`, `g.V.outE.weight`) fails because the Sugar plugin's AST
transformations are not active in the embedded engine.
**Remote connection examples fail (~10 errors).** Blocks using
`Cluster.open()`, `DriverRemoteConnection`, and
`traversal().with('conf/remote-graph.properties')` fail because there is no
running Gremlin Server. These are detected and rendered as static code when
they contain `:remote`/`:>` commands, but some blocks use the driver API
directly without console commands.
**A few Hadoop `hdfs.head()` calls fail.** The `hdfs.head('output/~g')` and
`hdfs.head('output', GryoInputFormat)` calls fail because the Spark computation
output format in local mode differs slightly from that of a real cluster.
**No `x.y.z` version replacement in tab content.** The version placeholder
`x.y.z` is replaced in the post-processing step via sed, but content inside
tabs generated by the extension may not be caught if the placeholder appears in
translated code.
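
One way to check for this is to scan the generated output for leftover placeholders after post-processing. The following is a minimal sketch, not the build script's actual sed invocation: the function names `replace_version` and `count_placeholder_lines` and the version `4.0.0` are hypothetical.

```bash
# Hypothetical helpers (not part of this PR); both read from stdin.

# Replace every x.y.z placeholder with the version given as $1.
# Assumes the version string contains no "/" or other sed metacharacters.
replace_version() {
  sed "s/x\.y\.z/$1/g"
}

# Print the number of lines that still contain the x.y.z placeholder.
# grep exits non-zero when nothing matches, so "|| true" keeps this
# safe under "set -e".
count_placeholder_lines() {
  grep -c 'x\.y\.z' || true
}

# Usage: substitute the version, then confirm nothing was missed.
printf 'gremlin-core-x.y.z.jar\n' | replace_version 4.0.0
printf 'gremlin-core-x.y.z.jar\n' | replace_version 4.0.0 | count_placeholder_lines
```

Running the count over the final HTML, including the generated tab content, would show whether any placeholders survived.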
### Not changed
- The old `bin/process-docs.sh` and AWK pipeline are untouched — both
systems can coexist
- All AsciiDoc source files are unchanged
- Document structure, sections, anchors, and static content are preserved