This is an automated email from the ASF dual-hosted git repository.

kenhuuu pushed a commit to branch 3.7-dev
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git


The following commit(s) were added to refs/heads/3.7-dev by this push:
     new 8e286c5d97 Update session reuse and Python set deserialization 
documentation CTR
8e286c5d97 is described below

commit 8e286c5d971c3f6d0a7e32d9a50df97f04ba509f
Author: Ken Hu <[email protected]>
AuthorDate: Tue Mar 31 13:38:38 2026 -0700

    Update session reuse and Python set deserialization documentation CTR
---
 docs/src/dev/provider/index.asciidoc             |  10 ++
 docs/src/reference/gremlin-applications.asciidoc |   2 +-
 docs/src/reference/gremlin-variants.asciidoc     | 122 ++++++++++++++++++++--
 docs/src/upgrade/release-3.7.x.asciidoc          | 127 ++++++++++++++++++++++-
 4 files changed, 250 insertions(+), 11 deletions(-)

diff --git a/docs/src/dev/provider/index.asciidoc 
b/docs/src/dev/provider/index.asciidoc
index d12335388b..c77d67c2bb 100644
--- a/docs/src/dev/provider/index.asciidoc
+++ b/docs/src/dev/provider/index.asciidoc
@@ -1191,6 +1191,16 @@ in 3.5.0, but has been added back as of 3.5.2. Servers 
wishing to be compatible
 this message (which is what Gremlin Server does as of 3.5.0). Drivers wishing 
to be compatible with servers prior to
 3.3.11 may continue to send the message on calls to `close()`, otherwise such 
code can be removed.
 
+NOTE: As of 3.7.6/3.8.1, the session lifecycle contract described above, where 
sessions are cleaned up only when the
+underlying connection closes, can be modified by the `closeSessionPostGraphOp` 
server setting. When enabled, the
+server will close a session after a successful TX_COMMIT or TX_ROLLBACK 
bytecode request, independent of the
+connection state. This allows multiple short-lived transaction sessions to 
share a single WebSocket connection over
+time, which is required by the Java GLV's `reuseConnectionsForSessions` 
option. This setting defaults to `false`
+to preserve the established 3.5.0 behavior. Providers implementing session 
support should be aware that when this
+setting is enabled, sessions and connections no longer have a one-to-one 
lifecycle relationship as a single connection
+may host many sessions sequentially. Providers wishing to support this 
capability are recommended to use the same
+`closeSessionPostGraphOp` configuration name for consistency across the 
TinkerPop ecosystem.
+
 **`authentication` operation arguments**
 
 [width="100%",cols="2,2,9",options="header"]
diff --git a/docs/src/reference/gremlin-applications.asciidoc 
b/docs/src/reference/gremlin-applications.asciidoc
index e76b718e81..338467bafb 100644
--- a/docs/src/reference/gremlin-applications.asciidoc
+++ b/docs/src/reference/gremlin-applications.asciidoc
@@ -1017,7 +1017,7 @@ The following table describes the various YAML 
configuration options that Gremli
 |authorization.authorizer |The fully qualified classname of an `Authorizer` 
implementation to use. |_none_
 |authorization.config |A `Map` of configuration settings to be passed to the 
`Authorizer` when it is constructed.  The settings available are dependent on 
the implementation. |_none_
 |channelizer |The fully qualified classname of the `Channelizer` 
implementation to use.  A `Channelizer` is a "channel initializer" which 
Gremlin Server uses to define the type of processing pipeline to use.  By 
allowing different `Channelizer` implementations, Gremlin Server can support 
different communication protocols (e.g. WebSocket). |`WebSocketChannelizer`
-|closeSessionPostGraphOp |Controls whether a `Session` will be closed by the 
server after a successful TX_COMMIT or TX_ROLLBACK bytecode request. |_false_
+|closeSessionPostGraphOp |Controls whether a `Session` will be closed by the 
server after a successful TX_COMMIT or TX_ROLLBACK bytecode request. This 
setting should be enabled when clients use the `reuseConnectionsForSessions` 
option (see <<gremlin-java-connection-reuse>>), which allows transaction 
sessions to share pooled connections. Without this setting, sessions opened by 
`reuseConnectionsForSessions` will not be cleaned up after commit or rollback 
and will remain open on the server [...]
 |enableAuditLog |The `AuthenticationHandler`, `AuthorizationHandler` and 
processors can issue audit logging messages with the authenticated user, remote 
socket address and requests with a gremlin query. For privacy reasons, the 
default value of this setting is false. The audit logging messages are logged 
at the INFO level via the `audit.org.apache.tinkerpop.gremlin.server` logger, 
which can be configured using the `logback.xml` file. |_false_
 |graphManager |The fully qualified classname of the `GraphManager` 
implementation to use.  A `GraphManager` is a class that adheres to the 
TinkerPop `GraphManager` interface, allowing custom implementations for storing 
and managing graph references, as well as defining custom methods to open and 
close graphs instantiations. To prevent Gremlin Server from starting when all 
graphs fails, the `CheckedGraphManager` can be used.|`DefaultGraphManager`
 |graphs |A `Map` of `Graph` configuration files where the key of the `Map` 
becomes the name to which the `Graph` will be bound and the value is the file 
name of a `Graph` configuration file. |_none_
diff --git a/docs/src/reference/gremlin-variants.asciidoc 
b/docs/src/reference/gremlin-variants.asciidoc
index 254d1f1e23..cbb7ab9369 100644
--- a/docs/src/reference/gremlin-variants.asciidoc
+++ b/docs/src/reference/gremlin-variants.asciidoc
@@ -883,6 +883,112 @@ Please see the 
link:https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/
 Transactions with Java are best described in <<transactions,The Traversal - 
Transactions>> section of this
 documentation as Java covers both embedded and remote use cases.
 
+[[gremlin-java-connection-reuse]]
+=== Connection Reuse for Transactions
+
+By default, each call to `g.tx()` opens a new dedicated WebSocket connection 
for the session that backs the
+transaction. For workloads that issue many short-lived transactions, the 
overhead of repeatedly establishing and
+tearing down WebSocket connections can become significant, particularly when 
the client and server are separated by
+network latency.
+
+The `reuseConnectionsForSessions` option on `Cluster.Builder` changes this 
behavior so that transaction sessions
+borrow connections from the existing connection pool instead of creating 
dedicated ones. When a transaction commits or
+rolls back, the borrowed connection is returned to the pool and becomes 
available for the next transaction.
+
+==== Enabling Connection Reuse
+
+This feature requires configuration on both the client and the server (if 
using Gremlin Server).
+
+On the client, enable `reuseConnectionsForSessions` when building the 
`Cluster`:
+
+[source,java]
+----
+Cluster cluster = Cluster.build("localhost")
+        .reuseConnectionsForSessions(true)
+        .create();
+GraphTraversalSource g = 
traversal().withRemote(DriverRemoteConnection.using(cluster));
+----
+
+The same setting can be specified in a YAML configuration file used with 
`Cluster.open()`:
+
+[source,yaml]
+----
+reuseConnectionsForSessions: true
+----
+
+On servers based on Gremlin Server, enable `closeSessionPostGraphOp` so 
that sessions are closed immediately after
+a commit or rollback completes:
+
+[source,yaml]
+----
+# gremlin-server.yaml
+closeSessionPostGraphOp: true
+----
+
+IMPORTANT: Both settings must be configured together. If 
`reuseConnectionsForSessions` is enabled on the client but
+`closeSessionPostGraphOp` is not enabled on the server, sessions will not be 
cleaned up after commit or rollback.
+These leaked sessions will accumulate on the server until the configured 
session timeout is reached, consuming server
+resources unnecessarily.
+
+NOTE: Some Remote Gremlin Providers may handle session cleanup automatically 
and may not require explicit
+`closeSessionPostGraphOp` configuration. Consult the provider's documentation 
to determine whether this behavior is
+enabled by default, requires explicit configuration, or is unsupported.
+
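The effect of pairing the two settings can be illustrated with a toy model. The sketch below is not Gremlin Server code: `SessionRegistrySketch` and its methods are hypothetical names, and the model assumes a single long-lived connection that hosts every session and a session timeout that has not yet fired.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of a server-side session registry (hypothetical, for illustration only).
class SessionRegistrySketch {
    private final boolean closeSessionPostGraphOp;
    private final Set<String> openSessions = new HashSet<>();
    private int nextId = 0;

    SessionRegistrySketch(boolean closeSessionPostGraphOp) {
        this.closeSessionPostGraphOp = closeSessionPostGraphOp;
    }

    // A transaction begins: the server opens a session for it.
    String begin() {
        String sessionId = "session-" + nextId++;
        openSessions.add(sessionId);
        return sessionId;
    }

    // A commit or rollback succeeds: the session is closed only when the flag is set.
    // The connection stays open either way, so it cannot act as the cleanup signal.
    void commitOrRollback(String sessionId) {
        if (closeSessionPostGraphOp) {
            openSessions.remove(sessionId);
        }
    }

    int openSessionCount() {
        return openSessions.size();
    }

    // Runs sequential transactions over one reused connection and
    // returns how many sessions are still open afterwards.
    static int sessionsLeftOpen(boolean closeSessionPostGraphOp, int transactions) {
        SessionRegistrySketch server = new SessionRegistrySketch(closeSessionPostGraphOp);
        for (int i = 0; i < transactions; i++) {
            server.commitOrRollback(server.begin());
        }
        return server.openSessionCount();
    }
}
```

In this model, `sessionsLeftOpen(false, 100)` returns 100 (every session lingers until a timeout would remove it), while `sessionsLeftOpen(true, 100)` returns 0.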
+==== Usage
+
+The transaction API itself does not change when connection reuse is enabled. 
The standard pattern of `begin`, mutate,
+and `commit` or `rollback` applies:
+
+[source,java]
+----
+Cluster cluster = Cluster.build("localhost")
+        .reuseConnectionsForSessions(true)
+        .create();
+GraphTraversalSource g = 
traversal().withRemote(DriverRemoteConnection.using(cluster));
+
+GraphTraversalSource gtx = g.tx().begin();
+gtx.addV("person").property("name", "marko").iterate();
+gtx.addV("software").property("name", "lop").iterate();
+gtx.tx().commit();
+----
+
+After `commit()` or `rollback()`, the connection is returned to the pool. A 
subsequent call to `g.tx().begin()` will
+borrow a connection from the pool again, potentially reusing the same 
underlying WebSocket connection.
+
+A `GraphTraversalSource` obtained from `begin()` cannot be reused after its 
transaction has been committed or rolled
+back. Attempting to do so will result in an exception. A fresh call to 
`g.tx().begin()` is required for each new
+transaction.
+
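The borrow-and-return behavior described above can be sketched with a simplified pool. This is a model under stated assumptions, not the driver's implementation: `ConnectionPoolSketch`, `Connection`, `borrow`, and `giveBack` are hypothetical names, and each borrowed connection is treated as exclusive, whereas the real pool is governed by settings such as `minConnectionPoolSize` and `maxSimultaneousUsagePerConnection`.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified connection pool (hypothetical, for illustration only).
class ConnectionPoolSketch {
    // Stand-in for a pooled WebSocket connection.
    static final class Connection {
        final int id;
        Connection(int id) { this.id = id; }
    }

    private final Deque<Connection> idle = new ArrayDeque<>();
    private int created = 0;

    // Borrow an idle connection if one exists, otherwise open a new one.
    Connection borrow() {
        Connection c = idle.pollFirst();
        if (c == null) {
            c = new Connection(created++);
        }
        return c;
    }

    // Return a connection to the pool so the next transaction can reuse it.
    void giveBack(Connection c) {
        idle.addFirst(c);
    }

    int connectionsCreated() {
        return created;
    }

    // Runs sequential transactions and returns how many connections were opened.
    static int connectionsOpenedFor(int transactions) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch();
        for (int i = 0; i < transactions; i++) {
            Connection c = pool.borrow();   // corresponds to g.tx().begin()
            // ... mutations would be submitted over c here ...
            pool.giveBack(c);               // corresponds to commit() or rollback()
        }
        return pool.connectionsCreated();
    }
}
```

In this model, any number of sequential transactions opens a single connection, because each transaction returns its connection before the next one borrows.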
+==== Concurrent Transactions
+
+Multiple transactions can be open simultaneously. Each transaction gets its 
own server-side session regardless of
+whether the underlying connections are shared. Because the underlying 
connection is borrowed rather than created, other
+settings on the `Cluster` such as `minConnectionPoolSize` and 
`maxSimultaneousUsagePerConnection` will have an effect
+on how the connection gets borrowed. These settings may need to be tweaked if 
there are many concurrent transactions.
+
+==== Restrictions
+
+Connection reuse for transactions has the following restrictions:
+
+* It is designed for short-lived transaction sessions that follow the 
begin/mutate/commit-or-rollback pattern. It
+  should not be used for classic long-running sessions such as those used with 
a remote console. For long-running
+  sessions, use the standard `cluster.connect(sessionId)` approach described 
in the
+  <<sessions,Considering Sessions>> section.
+* It is not compatible with `HttpChannelizer`. Attempting to call `tx()` when 
the driver is configured with
+  `HttpChannelizer` will throw an `IllegalStateException`. This restriction 
applies regardless of the
+  `reuseConnectionsForSessions` setting.
+
+==== When to Use
+
+Connection reuse provides the greatest benefit when:
+
+* Network latency between client and server is significant (e.g. cross-region 
deployments).
+* Transactions are lightweight (few operations per transaction).
+* Many short-lived transactions are issued in sequence or concurrently.
+
+For local deployments or transactions that perform substantial graph 
mutations, the connection setup overhead is a
+smaller proportion of the total transaction time and the benefit is 
correspondingly smaller.
+
 [[gremlin-java-serialization]]
 === Serialization
 
@@ -2849,15 +2955,13 @@ therefore cardinality functions that take a value like 
`list()`, `set()`, and `s
 [[gremlin-python-limitations]]
 === Limitations
 
-* Traversals that return a `Set` *might* be coerced to a `List` in Python. In 
the case of Python, number equality
-is different from JVM languages which produces different `Set` results when 
those types are in use. When this case
-is detected during deserialization, the `Set` is coerced to a `List` so that 
traversals return consistent
-results within a collection across different languages. If a `Set` is needed 
then convert `List` results
-to `Set` manually.
-* Traversals that return a `Set` containing non-hashable items, such as 
`Dictionary`, `Set` and `List`, will be coerced
-into a `List` during deserialization. Python requires set elements to be 
hashable, for which Gremlin does not. If a
-`Set` is needed, convert elements to hashable equivalents manually (e.g. 
`dict` to `HashableDict`, `list` to `tuple`,
-`set` to `frozenset`).
+* Traversals that return a `Set` may be coerced to a `List` in Python in two 
cases. First, when the `Set` contains
+mixed numeric types (e.g. `int` and `float`), because Python number equality 
differs from the JVM — a Java `Set` of
+`[1, 1.0d]` has two elements, but Python considers `1 == 1.0` and would 
collapse them to one, so the `Set` is coerced to
+a `List` to preserve all elements consistently across languages. Second, when 
the `Set` contains non-hashable items such
+as `Dictionary`, `Set`, or `List`, because Python requires set elements to be 
hashable while Gremlin does not, the `Set`
+is also coerced to a `List`. For this case, if a `Set` is needed, convert 
elements to hashable equivalents manually
+(e.g. `dict` to `HashableDict`, `list` to `tuple`, `set` to `frozenset`).
 * Gremlin is capable of returning `Dictionary` results that use non-hashable 
keys (e.g. Dictionary as a key) and Python
 does not support that at a language level. Using GraphSON 3.0 or GraphBinary 
(after 3.5.0) makes it possible to return
 such results. In all other cases, Gremlin that returns such results will need 
to be re-written to avoid that sort of
diff --git a/docs/src/upgrade/release-3.7.x.asciidoc 
b/docs/src/upgrade/release-3.7.x.asciidoc
index 57fb43be18..17258b50d9 100644
--- a/docs/src/upgrade/release-3.7.x.asciidoc
+++ b/docs/src/upgrade/release-3.7.x.asciidoc
@@ -61,7 +61,7 @@ Gremlin Javascript now supports Node 22 and 24 alongside Node 
20.
 
 Gremlin Go has been upgraded to Go version 1.25.
 
-==== Python Set Deserialization with Non-Hashable Elements
+==== Python Set-to-List Fallback
 
 Traversals that return a `Set` containing non-hashable items (such as 
`Dictionary`, `Set`, or `List`) previously caused
 a `TypeError` during deserialization in Gremlin-Python. These results are now 
coerced to a `List` to avoid errors. This
@@ -70,10 +70,135 @@ Python hashable types manually (e.g. `dict` to 
`HashableDict`, `list` to `tuple`
 
 See: link:https://issues.apache.org/jira/browse/TINKERPOP-3232[TINKERPOP-3232]
 
+==== Remote Transaction Improvements
+
+The Java driver now supports reusing existing pooled WebSocket connections for 
session-based requests rather than
+establishing a dedicated connection per session. This behavior is controlled 
by the `Cluster.Builder` option
+`reuseConnectionsForSessions`, which defaults to `false`.
+
+When enabled, a `Client.SessionedChildClient` will attempt to borrow a 
connection from the connection pool of a standard
+`Client` rather than opening its own WebSocket connection. This avoids the 
overhead of the TCP handshake and WebSocket
+upgrade for each session, which can be significant when issuing many 
short-lived transactions.
+
+[source,java]
+----
+// Enable connection reuse for sessions
+Cluster cluster = Cluster.build(host)
+        .reuseConnectionsForSessions(true)
+        .create();
+----
+
+This feature was designed specifically for use with remote transactions, where 
sessions are short-lived and terminate
+after a `commit()` or `rollback()`. It should not be used for classic 
long-running session use cases where a session
+is used for purposes other than transactions, such as the remote console.
+
+===== Server Configuration
+
+When using `reuseConnectionsForSessions`, the server should be configured to 
close sessions immediately after a graph
+operation such as `commit()` or `rollback()` completes. Without this behavior, 
sessions may remain open until the session
+timeout expires, potentially leading to a buildup of idle sessions on the 
server side.
+
+Some remote graph providers handle this automatically and require no 
additional configuration. For the reference Gremlin
+Server, this is controlled by the `closeSessionPostGraphOp` setting, which 
should be set to `true`. Users of other graph
+providers should consult their provider's documentation to determine whether 
this behavior is enabled by default,
+requires explicit configuration, or is unsupported.
+
+[source,yaml]
+----
+# gremlin-server.yaml
+closeSessionPostGraphOp: true
+----
+
+IMPORTANT: Failing to enable `closeSessionPostGraphOp` on the server when 
using `reuseConnectionsForSessions` on the
+client will result in sessions that are not properly cleaned up. These leaked 
sessions will accumulate until the
+configured `sessionLifetimeTimeout` is reached, consuming server resources 
unnecessarily.
+
+===== Performance
+
+Performance was measured with an ad-hoc benchmark application. The application 
executes a configurable number of
+complete transaction lifecycles (begin, mutate, commit) and reports throughput 
and latency percentiles. Each transaction
+opens a session, submits one or more `addV()` operations, commits, and closes 
the session.
+
+The benchmark varies the following parameters:
+
+* *Concurrent clients* (`threads`): The number of threads issuing transactions 
simultaneously. A value of 1 means
+  transactions are executed sequentially by a single client. Higher values 
simulate multiple application threads or
+  service instances issuing transactions concurrently against the same server.
+* *Connection pool size* (`pool`): The number of WebSocket connections 
maintained in the pool when
+  `reuseConnectionsForSessions` is enabled. When reuse is disabled, each 
session creates its own dedicated connection
+  and this parameter does not apply (shown as `n/a`).
+* *Transaction weight* (`weight`): "light" transactions perform a single 
`addV()` plus commit. "heavy" transactions
+  perform ten `addV()` operations plus commit, simulating a more substantial 
unit of work per transaction.
+
+Tests were conducted both locally (client and server on the same machine) and 
remotely (client on the US west coast,
+server on the US east coast) to isolate the effect of network latency on 
connection setup overhead. Each scenario
+executed 1000 transactions after a warmup phase of 50 transactions.
+
+*Local Results (same machine)*
+
+[cols="3,1,1,1", options="header"]
+|=========================================================
+|Configuration |No-Reuse (tx/s) |Best-Reuse (tx/s) |Speedup
+|1 client, light |23.1 |26.7 |1.16x
+|8 clients, light |25.2 |28.5 |1.13x
+|16 clients, light |25.4 |27.9 |1.10x
+|1 client, heavy |26.0 |26.9 |1.03x
+|8 clients, heavy |26.4 |27.9 |1.06x
+|16 clients, heavy |25.8 |26.5 |1.03x
+|=========================================================
+
+*Remote Results (west coast to east coast)*
+
+[cols="3,1,1,1", options="header"]
+|=========================================================
+|Configuration |No-Reuse (tx/s) |Best-Reuse (tx/s) |Speedup
+|1 client, light |3.6 |7.6 |2.10x
+|8 clients, light |15.6 |23.0 |1.48x
+|16 clients, light |15.4 |25.3 |1.64x
+|1 client, heavy |1.4 |1.8 |1.26x
+|8 clients, heavy |9.2 |10.8 |1.17x
+|16 clients, heavy |14.5 |15.9 |1.10x
+|=========================================================
+
+The "Best-Reuse" column reflects the highest throughput observed across all 
tested pool sizes (2, 4, and 8 connections)
+for each scenario.
+
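The Speedup column is the ratio of reuse throughput to no-reuse throughput. A minimal sketch of that arithmetic follows; `SpeedupSketch` is a hypothetical helper and not part of the benchmark application. Ratios recomputed from the rounded throughputs shown in the tables can differ from the published column in the last digit, presumably because the published values were derived from unrounded measurements.

```java
import java.util.Locale;

// Hypothetical helper that recomputes a speedup from two throughput figures.
class SpeedupSketch {
    // Speedup is reuse throughput divided by no-reuse throughput (both in tx/s).
    static double speedup(double noReuseTxPerSec, double reuseTxPerSec) {
        return reuseTxPerSec / noReuseTxPerSec;
    }

    // Formats a ratio with two decimals and an "x" suffix, as in the tables.
    static String format(double ratio) {
        return String.format(Locale.ROOT, "%.2fx", ratio);
    }
}
```

For the remote row with 16 clients and light transactions, `format(speedup(15.4, 25.3))` yields `1.64x`, matching the table.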
+The benefit of connection reuse is most pronounced in remote scenarios with 
light transactions. When the network
+round-trip cost is high and the transaction payload is small, the WebSocket 
connection setup overhead represents a
+larger proportion of the total transaction time. In the single-client remote 
light workload, connection reuse yielded a
+2.10x throughput improvement, as the connection handshake cost dominated the 
per-transaction time. With 16 concurrent
+clients in the same remote light scenario, throughput improved from 15.4 tx/s 
to 25.3 tx/s (1.64x), as the connection
+pool amortized the setup cost across many parallel sessions.
+
+As transaction weight increases, the relative benefit diminishes because the 
graph operations themselves become the
+bottleneck rather than connection setup. In the local heavy workload 
scenarios, the improvement was only 3-6%, as the
+connection overhead was already negligible relative to the cost of the graph 
mutations. Even in the remote heavy
+scenarios, the improvement ranged from 10-26%, as the ten `addV()` operations 
per transaction shifted the time
+distribution toward server-side processing.
+
+In summary, `reuseConnectionsForSessions` provides the greatest benefit when:
+
+* Network latency between client and server is significant (remote deployments)
+* Transactions are lightweight (few operations per transaction)
+* Many short-lived transactions are issued in sequence or concurrently
+
+See: link:https://issues.apache.org/jira/browse/TINKERPOP-3213[TINKERPOP-3213]
+
 === Upgrading for Providers
 
 ==== Graph System Providers
 
+===== Session Changes
+
+An option has been added to the Java GLV (`reuseConnectionsForSessions`) that 
allows for borrowing open WebSocket
+connections for sessions. This is primarily to reduce the overhead of new 
connection setup per session. This can lead
+to large performance gains in remote transaction scenarios where there are 
many small mutation traversals.
+
+This option is disabled by default on the driver but providers may want to add 
an option that will allow sessions to end
+on the successful completion of a graph operation (commit/rollback). This will 
prevent a buildup of sessions if a user
+has enabled this option as the driver will *not* close the underlying 
WebSocket connection as a signal to end the
+session. Gremlin Server has added an option like this called 
`closeSessionPostGraphOp`. Remote graph providers are
+encouraged to add the same functionality.
 
 ==== Graph Driver Providers
 
