This is an automated email from the ASF dual-hosted git repository.

Cole-Greer pushed a commit to branch 3-7/threat-model-2026-06-05
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git

commit b8e73abdfa771ee995ea62ea393475a314c03bc0
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri Jun 5 03:47:38 2026 +0200

    Add THREAT_MODEL.md + SECURITY.md and wire AGENTS.md for security-model discoverability

    Adds a draft threat model (ASF Security team v0, for the PMC to own and refine),
    a SECURITY.md pointing to it, and a Security section in AGENTS.md so the
    AGENTS.md -> SECURITY.md -> THREAT_MODEL.md discoverability chain resolves.
    Documentation only; no code or behaviour changes.

    Assisted-by: Claude Code:claude-opus-4-8
---
 AGENTS.md       |   8 +-
 SECURITY.md     |  28 +++++
 THREAT_MODEL.md | 348 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 383 insertions(+), 1 deletion(-)

diff --git a/AGENTS.md b/AGENTS.md
index 4cd7d9b5f8..186da57b90 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -241,4 +241,10 @@ If AGENTS.md does not clearly cover a situation:
 3. Surface the question to human maintainers (for example, by leaving a comment, or drafting a minimal PR that asks for guidance).
 
 This file is intended to help tools act like a careful, well‑informed contributor. When in doubt, defer to human
-judgment and the canonical project documentation.
\ No newline at end of file
+judgment and the canonical project documentation.
+
+## Security
+
+For Apache TinkerPop's threat model — trust boundaries, in-scope / out-of-scope, the security properties
+the project does and does not provide, and known non-findings — see [SECURITY.md](SECURITY.md), which
+points to [THREAT_MODEL.md](THREAT_MODEL.md). Consult it before triaging or reporting security issues.
diff --git a/SECURITY.md b/SECURITY.md
new file mode 100644
index 0000000000..825da47158
--- /dev/null
+++ b/SECURITY.md
@@ -0,0 +1,28 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Security Policy
+
+Apache TinkerPop's threat model — the assumptions, trust boundaries, what is in and out of scope, the
+security properties the project does and does not provide, and known non-findings — is documented in
+[THREAT_MODEL.md](THREAT_MODEL.md). Please read it before reporting a security issue.
+
+## Reporting a Vulnerability
+
+Please report security vulnerabilities privately following the
+[ASF security process](https://www.apache.org/security/) — email
+[[email protected]](mailto:[email protected]). Do not open public GitHub issues for security reports.
diff --git a/THREAT_MODEL.md b/THREAT_MODEL.md
new file mode 100644
index 0000000000..cdea5cae82
--- /dev/null
+++ b/THREAT_MODEL.md
@@ -0,0 +1,348 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements. See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Apache TinkerPop — Threat Model (v0 draft)
+
+## §1 Header
+
+- **Project:** Apache TinkerPop (`apache/tinkerpop`) — a graph computing framework: the Gremlin query
+  language, the traversal machine, Gremlin Server (remote query execution), the GraphSON / Gryo /
+  GraphBinary serialization formats, and the Gremlin Language Variants (Java, Python, .NET, Go, JS).
+  *(documented — `README.md`, repo layout)*
+- **Scope of this model:** the **`apache/tinkerpop` monorepo**, active branches `master`, `3.7-dev`,
+  `3.8-dev` *(maintainer — colegreer, scope confirmation)*. The model focuses on the network-facing and
+  deserialization surfaces (see §2); provider graph databases that embed TinkerPop are out of scope (§3).
+- **Date:** 2026-06-05. **Status:** DRAFT v0 — first pass by the ASF Security team via the
+  threat-model-producer rubric, for the TinkerPop PMC to react to. **Author:** ASF Security team, for PMC
+  ratification.
+- **Version binding:** versioned with the project; a report against release *N* is triaged against the
+  model as it stood at *N*, not against `HEAD`.
+- **Reporting cross-reference:** §8-property violations → report privately per the ASF process
+  (`[email protected]`); §3 / §9 findings are closed citing this document.
+- **Provenance legend:** *(documented)* = stated in TinkerPop's own docs / reference / source;
+  *(maintainer)* = confirmed by a TinkerPop PMC member through this process; *(inferred)* = reasoned from
+  the reference docs, architecture, and graph-server domain knowledge, **not yet confirmed** — every
+  *(inferred)* claim has a matching §14 open question.
+- **Draft confidence:** a v0 with **no PMC confirmation folded in yet** — the deployment posture, the
+  default auth/TLS stance, the script-execution sandboxing story, and the serialization (Gryo)
+  disposition are the highest-value items for the PMC to confirm in §14.
+- **What TinkerPop is:** TinkerPop is an embeddable graph-computing framework. Applications and graph
+  databases (providers) embed `gremlin-core` to run Gremlin traversals; **Gremlin Server** exposes that
+  capability over the network (WebSocket sub-protocol + HTTP), accepting both **string-based Gremlin
+  scripts** (evaluated by the `gremlin-groovy` Groovy script engine) and **bytecode-based traversals**
+  (the GLVs). *(documented — reference docs)*
+
+## §2 Scope and intended use
+
+- **Primary use:** an operator-deployed graph-traversal engine. In production it is typically embedded by
+  a graph database (a "provider") or run as **Gremlin Server** behind the application tier, on a
+  **trusted network** *(inferred — confirm deployment posture, §14 Q1)*. Gremlin Server is the main
+  network-facing trust boundary.
+- **Caller roles** (Gremlin Server is a network service):
+  - **remote client** — connects over the WebSocket sub-protocol or HTTP, submits Gremlin (script or
+    bytecode). Trust level depends entirely on whether the operator enabled authentication, authorization,
+    and script restrictions. *(inferred — §14 Q2)*
+  - **embedding application / provider** — links `gremlin-core` in-process and drives traversals directly.
+    Trusted within the compile-time + configuration boundary. *(inferred)*
+  - **operator** — runs Gremlin Server, controls `gremlin-server.yaml`, serializers, auth, TLS, the script
+    engine configuration, and the host. Trusted for the instance. *(inferred)*
+
+**Component-family table** *(in/out = in/out of this model; all rows inferred unless noted)*:
+
+| Family | Entry point | Touches outside the process | In model? |
+| --- | --- | --- | --- |
+| **Gremlin Server** (`gremlin-server`) | WebSocket sub-protocol + HTTP request handlers | network (listens), invokes the script engine + traversal machine | **In** *(documented: remote execution endpoint)* |
+| **Script engine** (`gremlin-groovy`) | `GremlinGroovyScriptEngine` evaluating string scripts | runs supplied Groovy in-process | **In — central code-execution surface; see §9** *(documented: scripts are evaluated)* |
+| **Gremlin language parser** (`gremlin-language`) | ANTLR grammar for string Gremlin (script-engine-free) | — | **In** — parser robustness on untrusted Gremlin strings *(inferred)* |
+| **Core traversal machine + structure API** (`gremlin-core`) | `GraphTraversal`, strategies, `Vertex`/`Edge` | filesystem (IO formats) | **In** *(documented)* |
+| **Serialization** (`gremlin-core`/`gremlin-util`/`gremlin-shaded`) | GraphSON (JSON), GraphBinary, **Gryo (Kryo-based)** readers | deserializes wire/file bytes | **In — deserialization of untrusted bytes; see §9** *(documented: 3 formats; Gryo wraps shaded Kryo)* |
+| **Java driver** (`gremlin-driver`) | client connecting to Gremlin Server; deserializes server responses | network (connects) | **In** (client side) *(inferred)* |
+| **Gremlin Language Variants** (`gremlin-python`, `-dotnet`, `-go`, `-js`) | build bytecode, serialize/deserialize | network (connect) | **In, lower priority** — client-side; deserialize server responses *(inferred — §14 Q9)* |
+| **Reference graph** (`tinkergraph-gremlin`) | in-memory graph implementation | filesystem (persistence option) | **In** — the reference provider *(inferred)* |
+| **OLAP** (`hadoop-gremlin`, `spark-gremlin`) | `GraphComputer` over Hadoop/Spark | network, filesystem, cluster | **In if deployed; operator cluster infra** *(inferred — §14 Q10)* |
+| **Operator tooling** (`gremlin-console`) | interactive REPL run by the operator | — | **Out** — operator-trusted local tool *(inferred)* |
+| `gremlin-test`, `gremlin-examples`, `gremlin-tools`, `gremlin-annotations` | test/build/example code | — | **Out** *(see §3)* |
+
+## §3 Out of scope (explicit non-goals)
+
+- **Provider graph databases that embed TinkerPop** (e.g. third-party graph DBs implementing the
+  structure/`GraphComputer` SPI). A vulnerability in a provider's own code is routed to that provider, not
+  here; this model covers TinkerPop's own code and its *use* of the SPI. *(inferred — §14 Q3)*
+- **The embedding application's own authentication / authorization of its end users.** TinkerPop has no
+  concept of the embedding application's end users. *(inferred)*
+- **Attackers who already control the host, the Gremlin Server process, `gremlin-server.yaml`, or the
+  graph data directory.** They have the operator's authority by definition. *(inferred)*
+- **`gremlin-test/`, `gremlin-examples/`, build/distribution tooling, `gremlin-console`** as a production
+  trust surface. *(inferred)*
+- **Confidentiality of data in transit when the operator has not enabled TLS** — see §10; the TLS posture
+  is the operator's deployment responsibility unless the project claims TLS-by-default (§14 Q5). *(inferred)*
+
+## §4 Trust boundaries and data flow
+
+- **Primary trust boundary: the Gremlin Server remote request surface.** Bytes arriving over the WebSocket
+  sub-protocol or HTTP — whether a **string script** or **bytecode traversal**, plus the serialized request
+  payload — are untrusted. The script engine, traversal machine, and structure/storage layer sit behind
+  this boundary. *(documented: remote endpoint; trust posture inferred — §14 Q2)*
+- **The script-execution boundary (the highest-stakes one).** A string-based request is evaluated by the
+  Groovy script engine. Absent a configured sandbox / allow-list, **evaluating an attacker-supplied script
+  is arbitrary code execution on the server** — this is inherent to the script-based interface, not a bug
+  (§9). The question for triage is whether a given deployment restricts scripting (bytecode-only,
+  allow-list, sandbox) and who is authorized to submit scripts. *(documented: scripts are evaluated;
+  sandbox specifics inferred — §14 Q4)*
+- **The deserialization boundary.** GraphSON, GraphBinary, and Gryo readers parse untrusted request bytes
+  (and, on the driver/GLV side, untrusted server-response bytes). Gryo is Kryo-based; Kryo deserialization
+  of untrusted input is a well-known RCE class unless type registration is locked down. *(documented:
+  formats exist; default-enabled set + Gryo hardening inferred — §14 Q6)*
+- **Reachability preconditions** (the test a triager applies first):
+  - A finding reachable only by submitting a **script** is in-model only subject to the §9 ruling: if the
+    deployment allows scripting to the principal, server-side code execution is by-design. *(inferred)*
+  - A finding in the **Gremlin parser / bytecode / traversal machine / deserializers** reachable from a
+    client operating within its privileges (or pre-auth) is **in-model** for memory-safety / bounded-
+    resource robustness. *(inferred)*
+  - A finding requiring control of `gremlin-server.yaml` or host access is **out-of-model: trusted-input**.
+    *(inferred)*
+
+## §5 Assumptions about the environment
+
+- **Runtime:** JVM (server, core, groovy, driver); the GLVs run on their respective runtimes
+  (CPython, .NET, Go, Node). *(documented — README / build)*
+- **Deployment:** Gremlin Server is assumed to run inside a **trusted network**, fronted by the
+  application tier, not exposed directly to the public internet. *(inferred — §14 Q1)*
+- **Filesystem:** the graph data / config directories are private to the server process and not writable by
+  untrusted local users. *(inferred)*
+- **Concurrency:** the server is multi-threaded and serves concurrent sessions; thread-safety of the
+  traversal/IO path is a correctness assumption. *(inferred)*
+- **What the server does to its host** (negative inventory — predominantly inferred, a confirmation
+  target): listens on network ports; reads/writes its configured graph + data directories; reads config;
+  **executes supplied Groovy when script requests are enabled**; loads provider graph implementations the
+  operator configured. *(inferred — §14 Q4)*
+
+## §5a Build-time and configuration variants
+
+Knobs that change which security properties hold (Gremlin Server, `gremlin-server.yaml`):
+
+- **Authentication** — `Authenticator` (PlainText/SASL with a credentials graph; `Krb5Authenticator` for
+  Kerberos). Whether authentication is **off by default** (the shipped sample/getting-started configs) is
+  the key question. *(documented: mechanisms exist; default-off inferred — §14 Q2)*
+- **Authorization** — an `Authorizer` interface (e.g. allow-listing which Gremlin a principal may run).
+  In-model only when configured; default likely none. *(documented: section exists; default inferred — §14 Q2)*
+- **TLS/SSL** — configurable on the server connector; whether it is **off by default** drives the
+  transport-confidentiality property (§9). *(inferred — §14 Q5)*
+- **Script execution restriction** — "Protecting Script Execution": sandbox / compilation customizers /
+  allow-list / preferring bytecode-only. Whether any restriction is on by default, or scripting is
+  unrestricted out of the box, is the single highest-stakes config question. *(documented: guidance exists;
+  default inferred — §14 Q4)*
+- **Enabled serializers** — which of GraphSON / GraphBinary / **Gryo** are registered by default, and
+  whether Gryo (Kryo) is locked to registered types. *(documented: formats exist; defaults inferred — §14 Q6)*
+
+## §6 Assumptions about inputs
+
+Per-surface trust table *(inferred unless noted)*:
+
+| Surface | Input | Attacker-controllable? | Caller/operator must enforce |
+| --- | --- | --- | --- |
+| Gremlin Server — string script request | Groovy/Gremlin script text | **yes** (pre-auth if auth off) | auth; script restriction / sandbox / bytecode-only; who may script |
+| Gremlin Server — bytecode/traversal request | serialized traversal bytecode | **yes**, within privileges | auth; traversal-step allow-list; resource limits |
+| Request deserialization (GraphSON / GraphBinary / Gryo) | serialized bytes | **yes** (pre-auth) | enabled-serializer choice; Gryo type-registration lockdown |
+| Gremlin string parser (`gremlin-language` ANTLR) | Gremlin string | **yes** | parser robustness (no crash/OOM/hang) |
+| Driver / GLV — server response | serialized bytes from the server | yes if the server (or a MITM without TLS) is hostile | TLS; trust in the server |
+| `gremlin-server.yaml`, host, data dir | local | no — operator-trusted | filesystem permissions |
+
+- **Shape / rate:** whether Gremlin Server bounds per-request CPU/memory, result-set size, traversal depth,
+  or concurrent requests — and the line between a bug and operator-managed capacity — is an open item
+  (§8 resource line, §14 Q7). *(inferred)*
+
+## §7 Adversary model
+
+- **Primary adversary:** a network client that can reach the Gremlin Server port from within the deployment
+  — either **unauthenticated** (if auth is off / pre-auth) or **authenticated with limited privileges** —
+  trying to execute code on the server (via scripts), read/write graph data outside its intent, crash or
+  exhaust the server with malformed requests or expensive traversals, or exploit a deserializer. *(inferred
+  — §14 Q2/Q7)*
+- **Capabilities assumed:** can open connections, send arbitrary protocol bytes / scripts / bytecode /
+  serialized payloads, and supply large/malformed input. *(inferred)*
+- **Out of scope:** anyone with operator/host/config control (already authoritative); a client that only
+  reaches the server because it was directly publicly exposed against guidance (non-supported posture, §3);
+  side-channel/timing adversaries unless the PMC wants them in. *(inferred — §14 Q1)*
+
+## §8 Security properties the project provides
+
+*(Inferred pending PMC confirmation — a property only counts once the project commits to it.)*
+
+- **Authentication + authorization enforcement (when configured).** With an `Authenticator`/`Authorizer`
+  set, an unauthenticated or unauthorized client cannot execute requests beyond its grants. *Violation
+  symptom:* auth/authz bypass. *Severity:* security-critical. *(inferred — §14 Q2)*
+- **Memory / availability safety on the request + deserialization surface.** Malformed or pre-auth input
+  (protocol frames, Gremlin strings, serialized payloads) yields a clean error, not a crash, OOM, hang, or
+  unbounded allocation of the server. *Violation symptom:* server crash / unbounded allocation / deadlock
+  from malformed or pre-auth input. *Severity:* security-critical (remote DoS) if pre-auth. *(inferred —
+  §14 Q7)*
+- **Deserializer integrity.** GraphSON / GraphBinary / Gryo reading attacker bytes does not lead to
+  arbitrary object instantiation / code execution beyond the documented type set. *Violation symptom:*
+  deserialization gadget / RCE. *Severity:* critical. **The strength of this property for Gryo (Kryo)
+  depends on type-registration lockdown** — see §9 / §14 Q6. *(inferred)*
+- **Resource bounds — split, not unspecified.** Malformed/pre-auth input that crashes/OOMs/hangs the server
+  is **in-model** (above). Ordinary expensive traversals are **operator capacity/resource management**, NOT
+  in-model — unless a specific bug applies (super-linear amplification, a missing limit where one is
+  expected, an unbounded traversal). *(inferred — §14 Q7)*
+
+## §9 Security properties the project does *not* provide
+
+*(The highest-value section for integrators — inferred unless tagged; confirm each.)*
+
+- **Script execution is arbitrary code execution by design — not a sandbox.** When string-script requests
+  are enabled, the Groovy script engine evaluates attacker-supplied code in the server process. Submitting
+  a script that runs server-side code is **`BY-DESIGN`** for a principal the deployment permits to script;
+  it is the operator's job to restrict scripting (bytecode-only, allow-list, sandbox) and to authenticate
+  who may script. A scan reporting "Gremlin Server allows arbitrary code execution via scripts" is
+  by-design unless it bypasses a *configured* restriction. *(inferred — §14 Q4)*
+- **No transport confidentiality/integrity unless TLS is enabled.** If TLS is off (see §5a/§14 Q5), the
+  server does not defend against a network attacker reading/modifying client traffic. *(inferred)*
+- **No authentication or authorization by default (assumed).** If the shipped/default configuration runs
+  with auth off, an exposed server is reachable by anyone on the network — an operator deployment concern,
+  not a code bug. *(inferred — §14 Q2)*
+- **Gryo / Kryo deserialization is not a safe boundary against untrusted input** unless type registration is
+  locked down. Operators who enable Gryo on an untrusted surface own that risk; prefer GraphBinary. *(inferred
+  — §14 Q6)*
+- **Ordinary resource exhaustion is not a defended property.** Expensive traversals / large results that
+  consume CPU/memory are an operator capacity concern unless a specific bug applies (§8). *(inferred — §14 Q7)*
+- **No defense against a malicious operator / host.** *(inferred)*
+- **False friends:**
+  - "Authentication is available" does **not** mean it is **on** — it must be configured (§5a). *(inferred)*
+  - **Bytecode is safer than scripts but is not a sandbox** — a bytecode traversal still executes traversal
+    steps server-side; injection/abuse via an embedding app that builds traversals from its own untrusted
+    input is the embedding app's responsibility. *(inferred)*
+  - "Gremlin is just a query language" — the **string** form runs through a Groovy engine, so it is code,
+    not a constrained query. *(inferred)*
+
+## §10 Downstream responsibilities (operator/deployer)
+
+*(Inferred unless tagged — confirm.)*
+
+- Deploy Gremlin Server inside a trusted network; do **not** expose it directly to an untrusted/public
+  network, especially with scripting enabled and auth off. *(inferred — §14 Q1)*
+- Enable authentication (`Authenticator`) and authorization (`Authorizer`) for any non-trivial deployment.
+  *(inferred — §14 Q2)*
+- **Restrict script execution:** prefer bytecode-based traversals; if scripts are needed, apply the
+  "Protecting Script Execution" controls (sandbox / compilation customizers / allow-list) and restrict who
+  may submit scripts. *(documented: guidance exists — §14 Q4)*
+- Enable TLS where traffic crosses an untrusted segment. *(inferred — §14 Q5)*
+- Prefer **GraphBinary**; only enable **Gryo** on a trusted surface, with type registration locked down.
+  *(inferred — §14 Q6)*
+- Apply per-request / per-traversal resource limits and result-size caps appropriate to capacity. *(inferred
+  — §14 Q7)*
+- Set filesystem permissions so only the server user can read the config / data directories. *(inferred)*
+
+## §11 Known misuse patterns
+
+*(Draft one-liners — expand before publishing.)*
+
+- Exposing Gremlin Server to an untrusted network with scripting enabled and authentication off. *(inferred)*
+- Treating the script-engine sandbox as a complete RCE boundary rather than restricting who may script.
+  *(inferred)*
+- Building Gremlin (string or bytecode) by concatenating the embedding application's untrusted input
+  (Gremlin-injection). *(inferred)*
+- Enabling Gryo on an untrusted request surface without type-registration lockdown. *(inferred)*
+
+## §11a Known non-findings (recurring false positives)
+
+*(Inferred unless tagged; the PMC's confirmations here are the highest-leverage suppression input.)*
+
+- "Gremlin Server executes arbitrary code via scripts" — by-design when scripting is enabled for the
+  principal (§9); not a finding unless it bypasses a *configured* restriction. *(inferred — §14 Q4)*
+- "No authentication / no TLS by default" — operator deployment responsibility (§9/§10); not a code bug in
+  itself. *(inferred — §14 Q2/Q5)*
+- "Gryo/Kryo deserialization can be exploited" — when the operator enabled Gryo on an untrusted surface;
+  operator responsibility, prefer GraphBinary (§9/§10). *(inferred — §14 Q6)*
+- "Expensive traversal consumes CPU/memory" — operator capacity concern, unless a specific bug applies
+  (§8/§9). *(inferred — §14 Q7)*
+- "Embedding app builds a traversal from untrusted input (Gremlin-injection)" — the embedding app's
+  responsibility (§9). *(inferred)*
+- Findings in `gremlin-test/`, `gremlin-examples/`, tooling — out of scope (§3). *(inferred)*
+
+## §12 Conditions that would change this model
+
+- A change to the default auth / TLS / scripting posture (e.g. auth-on-by-default, scripts-off-by-default).
+- A new client-reachable surface or protocol on Gremlin Server.
+- A change to the default-enabled serializer set, or Gryo type-registration policy.
+- Promoting `gremlin-console` / `gremlin-examples` into a production trust surface.
+- A change to the OLAP (Hadoop/Spark) trust posture.
+- A report that cannot be routed to a single §13 disposition (→ revise the model).
+
+## §13 Triage dispositions
+
+| Disposition | Meaning | Licensed by |
+| --- | --- | --- |
+| `VALID` | Violates a §8 property via an in-scope adversary/input (auth/authz bypass, pre-auth/malformed-input crash/OOM/hang, deserializer RCE beyond the documented type set, parser memory-safety). | §8, §6, §7 |
+| `VALID-HARDENING` | No §8 property broken, but a §11 misuse is easy enough to harden. | §11 |
+| `OUT-OF-MODEL: trusted-input` | Requires operator/host/config control. | §6, §7 |
+| `OUT-OF-MODEL: adversary-not-in-scope` | Requires a capability the model excludes (host control, side channel, direct public exposure against guidance). | §3, §7 |
+| `OUT-OF-MODEL: unsupported-component` | Lands in `gremlin-test/`, `gremlin-examples/`, tooling, or a separate provider graph DB. | §3 |
+| `OUT-OF-MODEL: non-default-build` | Only manifests under a discouraged/non-default §5a setting (e.g. Gryo enabled on an untrusted surface, scripting unrestricted where the deployment intends bytecode-only). | §5a |
+| `BY-DESIGN: property-disclaimed` | Concerns a §9-disclaimed property (script execution within its grant, no-TLS/no-auth default, ordinary resource exhaustion, malicious operator). | §9 |
+| `KNOWN-NON-FINDING` | Matches a §11a entry. | §11a |
+| `MODEL-GAP` | Cannot be cleanly routed — triggers a §12 revision. | §12 |
+
+## §14 Open questions for the maintainers
+
+Every *(inferred)* claim in the body maps to one of these. Proposed answers are inline; please confirm,
+correct, or strike.
+
+1. **Deployment posture.** Is "Gremlin Server inside a trusted network, behind the app tier, not directly
+   public" the right §2/§5 framing? *Proposed: yes.*
+2. **Default auth/authz posture.** Do the shipped/default Gremlin Server configs run with authentication
+   and authorization **off**, leaving it to the operator to enable? Is an unauthenticated exposed server an
+   operator-misconfiguration (`OUT-OF-MODEL`) rather than a code bug? *Proposed: yes — auth/authz are
+   opt-in.*
+3. **Provider SPI boundary.** Confirm that vulnerabilities in third-party provider graph databases that
+   embed TinkerPop route to those providers, not here (this model covers TinkerPop's own code + its use of
+   the SPI). *Proposed: yes.*
+4. **Script execution (highest-stakes).** Confirm that string-script evaluation = server-side code
+   execution by design, that restricting it (sandbox / allow-list / bytecode-only / who-may-script) is the
+   operator's responsibility, and what — if any — restriction is **on by default**. Is "Gremlin Server runs
+   arbitrary code via scripts" `BY-DESIGN` unless a *configured* restriction is bypassed? *Proposed: yes,
+   by-design; restriction is operator-configured.*
+5. **TLS default.** Is TLS **off by default** on the server connector? Is no-TLS-by-default an operator
+   responsibility (§9/§10)? *Proposed: off by default; operator enables.*
+6. **Serialization / Gryo.** Which serializers are registered by default? Is **Gryo (Kryo)** locked to
+   registered types, and is a Gryo-deserialization finding on an operator-enabled untrusted surface
+   `OUT-OF-MODEL: non-default-build` (with GraphBinary the recommended default)? Conversely, is a
+   deserializer flaw in the **default** set `VALID`? *Proposed: prefer GraphBinary; Gryo-on-untrusted is
+   operator responsibility; a flaw in the default set is VALID.*
+7. **Resource line.** Confirm the split: malformed/pre-auth input causing crash/OOM/hang is `VALID`;
+   ordinary expensive traversals / large results are operator capacity unless a specific bug applies
+   (super-linear amplification, missing-expected-limit, unbounded traversal/recursion). Are there built-in
+   per-request limits (timeout, result cap, traversal depth)? *Proposed: split as stated.*
+8. **Parser robustness.** Confirm that memory-safety / bounded-resource on the `gremlin-language` ANTLR
+   parser and the bytecode path against malformed input is a property TinkerPop commits to (§8). *Proposed:
+   yes.*
+9. **GLV (driver) scope.** Are the Gremlin Language Variants (`gremlin-python`/`-dotnet`/`-go`/`-js`) in
+   scope for this batch, or deferred? They are client-side and deserialize server responses. *Proposed:
+   in-scope but lower priority; flag if any should be deferred.*
+10. **OLAP scope.** Are `hadoop-gremlin` / `spark-gremlin` (`GraphComputer` over a cluster) in scope, or
+    treated as operator cluster infrastructure out of this model? *Proposed: operator infra; flag if you
+    want them fully in.*
+11. **§11a seeds.** What do scanners/researchers most often report that the PMC considers a non-finding,
+    beyond the seed list above? *(seeds §11a)*
+12. **Canonical location / triage policy.** Confirm this model lives as root `THREAT_MODEL.md` referenced
+    from a new `SECURITY.md`, wired from `AGENTS.md`, and that the PMC owns revisions. *Proposed: yes.*
+
+## §15 Machine-readable companion
+
+Deferred for v0. A `threat-model.yaml` can later encode the §6 trust table, §2/§3 component scoping, §8
+property/severity/symptom rows, §9 false friends, §11a non-findings, and §13 dispositions for automated
+triage.
