That explains a lot. For example, why I was able to intercept calls to PubsubJsonClient with AspectJ locally but not in Dataflow, and why custom changes to the beam-sdks-java-io-google-cloud-platform artifact are ignored in Dataflow.
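For context, a minimal sketch of the kind of AspectJ interception meant here (the aspect and logger names are illustrative, and it assumes load-time weaving is configured; note that, as far as I can tell, PubsubJsonClient.publish returns only the count of accepted messages, not the Pub/Sub-assigned IDs):

import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.annotation.AfterReturning;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical aspect: logs every publish made through Beam's JSON Pub/Sub
// client. publish() returns only the number of accepted messages, so the
// message IDs assigned by Pub/Sub are not visible at this join point.
@Aspect
public class PubsubPublishLogger {
  private static final Logger LOG = LoggerFactory.getLogger(PubsubPublishLogger.class);

  @AfterReturning(
      pointcut =
          "execution(int org.apache.beam.sdk.io.gcp.pubsub.PubsubJsonClient.publish(..))",
      returning = "count")
  public void afterPublish(JoinPoint joinPoint, int count) {
    // getArgs()[0] is the TopicPath argument of publish().
    LOG.info("Published {} messages to topic {}", count, joinPoint.getArgs()[0]);
  }
}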
Thank you, Boyuan Zhang!

Andrei Litvinov, Senior Java Developer
Grid Dynamics
5000 Executive Parkway # 520, San Ramon, CA 94583
Cell: +1 (510) 366 4205

On Tue, Nov 24, 2020 at 5:00 PM Boyuan Zhang <boyu...@google.com> wrote:

> Hi Andrei,
>
> Dataflow doesn't execute the pubsub write transform on the user worker;
> it runs it as an internal operation. Because of that, it is not possible
> to log messages when the pubsub write happens.
>
> However, you can use --experiments=enable_custom_pubsub_sink to force
> Dataflow to use the expansion
> <https://github.com/apache/beam/blob/d7655c12f7e0d575513c937ad53c45fbe1339c2d/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSink.java#L411-L436>,
> which will be executed on the user worker. Then you can add your log
> somewhere here
> <https://github.com/apache/beam/blob/d7655c12f7e0d575513c937ad53c45fbe1339c2d/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSink.java#L246>.
>
> If you believe there is something wrong with Dataflow, please contact GCP
> customer support.
>
> On Tue, Nov 24, 2020 at 11:34 AM Andrei Litvinov <
> alitvi...@griddynamics.com> wrote:
>
>> Hello Beam Community,
>>
>> We are using a streaming Dataflow job that writes to Google Pub/Sub
>> through PubsubIO.Write. Does anybody know how we can log the IDs that
>> Pub/Sub assigns to messages? We need that to localize where messages
>> might be lost.
>>
>> Case 1. A Pub/Sub topic is used as an interface between two projects.
>> We are working on the sending side and do not control the receiving
>> side. When the job was deployed the first time, Pub/Sub did not accept
>> messages (because of a permission problem) and the job got stuck, but we
>> did not get any errors in the Dataflow logs, nor were we able to tell
>> whether the messages had been accepted by Pub/Sub.
>>
>> Case 2. We suspect some messages are lost while going end-to-end through
>> our system, but we do not know exactly where. We would like to know the
>> IDs of the messages that were actually accepted by Pub/Sub so that we
>> can trace the missing ones.
>>
>> Thank you,
>> Andrei Litvinov
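For readers hitting the same issue, a minimal sketch of enabling the experiment Boyuan mentions programmatically (the class name, project, and topic are placeholders; only the flag itself comes from the thread):

import java.util.Arrays;
import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;

public class CustomPubsubSinkPipeline {
  public static void main(String[] args) {
    DataflowPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(DataflowPipelineOptions.class);

    // Same effect as passing --experiments=enable_custom_pubsub_sink on the
    // command line: Dataflow keeps the SDK's PubsubUnboundedSink expansion
    // instead of replacing the write with an internal operation.
    options.setExperiments(Arrays.asList("enable_custom_pubsub_sink"));

    Pipeline p = Pipeline.create(options);
    p.apply(Create.of("test message"))
        .apply(PubsubIO.writeStrings().to("projects/my-project/topics/my-topic"));
    p.run();
  }
}

With the expansion in place, the publish calls should go through SDK code on the user workers, so AspectJ interception of PubsubJsonClient or a patched beam-sdks-java-io-google-cloud-platform artifact should take effect there.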