Niel Markwick created BEAM-10259:
------------------------------------
Summary: Spanner Session leak/overload in Streaming Dataflow
Key: BEAM-10259
URL: https://issues.apache.org/jira/browse/BEAM-10259
Project: Beam
Issue Type: Bug
Components: io-java-gcp
Affects Versions: 2.22.0, 2.21.0, 2.19.0, 2.18.0
Reporter: Niel Markwick
Assignee: Niel Markwick
Fix For: 2.23.0
SpannerIO.WriteToSpannerFn connects to Spanner every time @Setup is called, and
closes the connection every time @Teardown is called.
This creates a separate Spanner connection and session pool for each
WriteToSpannerFn instance, and there is generally one instance per thread.
In single-threaded runners (e.g. batch Dataflow on a single-vCPU machine) this
is not an issue, as there is normally only one WriteToSpannerFn per
node/process.
In multi-threaded runners (e.g. streaming Dataflow, or batch on multi-CPU
machines), many session pools are created (one per thread), which can cause a
resource leak and is in general wasteful.
Spanner connections (and session pools) should be shared among all threads of a
single process, so that the connection is only opened and closed once.
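One possible shape for the sharing proposed above is a process-wide,
reference-counted holder: the first @Setup opens the connection, later ones
reuse it, and only the last @Teardown closes it. The sketch below is
illustrative only and is not the actual fix. FakeConnection, SharedConnection
and SharedConnectionDemo are hypothetical names, and FakeConnection merely
stands in for the real Spanner client and session pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class SharedConnectionDemo {
    public static void main(String[] args) throws InterruptedException {
        int workers = 4; // stands in for four DoFn instances on four threads
        CountDownLatch allAcquired = new CountDownLatch(workers);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                // Roughly what @Setup would do.
                FakeConnection conn = SharedConnection.acquire();
                try {
                    // Wait until every worker holds a reference, so the
                    // lifetimes overlap as they would in a streaming job.
                    allAcquired.countDown();
                    allAcquired.await();
                    // ... writes would use conn here ...
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    // Roughly what @Teardown would do.
                    SharedConnection.release();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("opened=" + FakeConnection.OPENED.get()
            + " closed=" + FakeConnection.CLOSED.get());
    }
}

/** Hypothetical stand-in for a Spanner client plus its session pool. */
final class FakeConnection implements AutoCloseable {
    static final AtomicInteger OPENED = new AtomicInteger();
    static final AtomicInteger CLOSED = new AtomicInteger();

    FakeConnection() {
        OPENED.incrementAndGet();
    }

    @Override
    public void close() {
        CLOSED.incrementAndGet();
    }
}

/** Process-wide holder: opens on first acquire, closes on last release. */
final class SharedConnection {
    private static FakeConnection instance;
    private static int refCount;

    private SharedConnection() {}

    static synchronized FakeConnection acquire() {
        if (refCount == 0) {
            instance = new FakeConnection();
        }
        refCount++;
        return instance;
    }

    static synchronized void release() {
        if (refCount == 0) {
            throw new IllegalStateException("release() without acquire()");
        }
        refCount--;
        if (refCount == 0) {
            instance.close();
            instance = null;
        }
    }
}
```

With four workers whose lifetimes overlap, this prints opened=1 closed=1. If
every holder releases before the next one acquires, the connection is reopened,
so a real implementation may prefer to keep it open for the life of the process.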
[~alxavier]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)