Hey folks, I'm interested in streaming some data to Segment
<https://segment.com/docs/sources/server/java/>, using their existing Java
library. This is a pretty high-throughput stream, so I wanted each
parallel operator to have its own instance of the Segment client. From what
I could tell, defining a custom SinkFunction should satisfy this, since
each parallel operator gets its own SinkFunction object
automatically. So my code looks like this:

class SegmentSink() extends SinkFunction[Data] {

  // Marked transient so the client itself isn't serialized with the function;
  // the field is initialized when the object is constructed
  @transient
  val segmentClient: Analytics = Analytics.builder("key").build()

  override def invoke(value: Data, context: SinkFunction.Context[_]): Unit = {
    segmentClient.enqueue(...)
  }
}

Can anyone verify whether this is the right pattern to use? Is there any
risk of the SinkFunction getting repeatedly serialized/deserialized,
resulting in a new Segment client being created each time?
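To make the serialization concern concrete, here's a minimal, Flink-free sketch (the class and object names are made up for illustration) of what plain Java serialization, the same mechanism used to ship a SinkFunction to the task managers, does to a @transient val that is initialized at construction time:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Stand-in for a sink holding a non-serializable client as a @transient val
class TransientHolder extends Serializable {
  @transient val client: String = "initialized-once"
}

object TransientDemo {
  // Serialize and immediately deserialize an object, as happens when a
  // function is shipped from the client to a task manager
  def roundTrip[T <: Serializable](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    in.readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    val original = new TransientHolder
    val copy = roundTrip(original)
    println(original.client) // prints "initialized-once"
    println(copy.client)     // prints "null": the transient field is skipped
                             // and the constructor does not rerun on deserialization
  }
}
```

If this mirrors what Flink does, the field would be null (not a fresh client) on the task managers, which is part of why I'm asking whether this is the right pattern.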

Thanks,
Li
