Niel Markwick created BEAM-10259:
------------------------------------

             Summary: Spanner Session leak/overload in Streaming Dataflow
                 Key: BEAM-10259
                 URL: https://issues.apache.org/jira/browse/BEAM-10259
             Project: Beam
          Issue Type: Bug
          Components: io-java-gcp
    Affects Versions: 2.22.0, 2.21.0, 2.19.0, 2.18.0
            Reporter: Niel Markwick
            Assignee: Niel Markwick
             Fix For: 2.23.0


SpannerIO.WriteToSpannerFn connects to Spanner every time @Setup is called, and 
closes the connection every time @Teardown is called. 

This creates a separate Spanner connection and session pool for each 
WriteToSpannerFn instance, and there is generally one instance per thread.

In single-threaded runners (e.g. batch Dataflow on a single-vCPU machine) this 
is not an issue, as there is normally only one WriteToSpannerFn per node/process.

In multi-threaded runners (e.g. streaming Dataflow, or batch on multi-CPU 
machines), many session pools are created (one per thread), which can cause a 
resource leak and is in general wasteful.

Spanner connections (and session pools) should be shared among all threads of a 
single process, so that the connection is only opened and closed once.
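
One way to share a connection across all threads of a process is a
reference-counted holder: the first @Setup opens the connection and the last
@Teardown closes it. The following is only a minimal sketch under stated
assumptions, not the actual change made in Beam. FakeConnection,
SharedConnection and SharedConnectionDemo are hypothetical names, and
FakeConnection merely stands in for a real Spanner client and its session pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an expensive client (connection + session pool). */
class FakeConnection implements AutoCloseable {
  static final AtomicInteger OPENED = new AtomicInteger();
  static final AtomicInteger CLOSED = new AtomicInteger();

  FakeConnection() { OPENED.incrementAndGet(); }

  @Override
  public void close() { CLOSED.incrementAndGet(); }
}

/** Process-wide holder: opens on first acquire, closes on last release. */
final class SharedConnection {
  private static FakeConnection instance; // guarded by SharedConnection.class
  private static int refCount;            // guarded by SharedConnection.class

  private SharedConnection() {}

  /** Intended to be called from a DoFn's @Setup method. */
  static synchronized FakeConnection acquire() {
    if (refCount == 0) {
      instance = new FakeConnection();
    }
    refCount++;
    return instance;
  }

  /** Intended to be called from a DoFn's @Teardown method. */
  static synchronized void release() {
    if (refCount == 0) {
      throw new IllegalStateException("release() without matching acquire()");
    }
    refCount--;
    if (refCount == 0) {
      instance.close();
      instance = null;
    }
  }
}

class SharedConnectionDemo {
  /** Simulates `threads` workers; returns {connections opened, connections closed}. */
  static int[] run(int threads) throws InterruptedException {
    int openedBefore = FakeConnection.OPENED.get();
    int closedBefore = FakeConnection.CLOSED.get();
    // The latch keeps every worker alive until all have acquired, so their
    // lifetimes overlap the way long-lived DoFn instances do.
    CountDownLatch allAcquired = new CountDownLatch(threads);
    List<Thread> workers = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      Thread t = new Thread(() -> {
        SharedConnection.acquire();   // what @Setup would do
        allAcquired.countDown();
        try {
          allAcquired.await();        // stand-in for processing elements
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          SharedConnection.release(); // what @Teardown would do
        }
      });
      workers.add(t);
      t.start();
    }
    for (Thread t : workers) {
      t.join();
    }
    return new int[] {
      FakeConnection.OPENED.get() - openedBefore,
      FakeConnection.CLOSED.get() - closedBefore
    };
  }

  public static void main(String[] args) throws InterruptedException {
    int[] counts = run(8);
    System.out.println("opened=" + counts[0] + " closed=" + counts[1]);
  }
}
```

With eight overlapping workers the sketch opens and closes a single connection,
whereas the per-instance pattern described above would open eight. If workers
do not overlap in time, the holder closes and later reopens the connection,
which is still bounded by one live connection per process.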

[~alxavier]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
