Hi, community!
Our team has prepared a PR for SparkReceiverIO Read via SDF [1]. We have started 
working on integration tests for the SparkReceiverIO connector, which will allow 
reading data from custom Spark Receivers in Apache Beam pipelines.

A general Apache Beam recommendation is to implement “write then read” style 
integration tests. But in our case, only the Read interface was implemented 
because Spark Receivers cannot be used for writing.

Since SparkReceiverIO is an abstract IO working with Spark Receivers, there is 
no exact implementation for a particular source. Therefore, we propose choosing 
RabbitMQ as a test source for the following reasons:

  *   It’s possible to implement a Custom Spark Receiver on RabbitMQ as a test 
streaming receiver
  *   RabbitMQ is lightweight and easy to deploy
  *   There is a test container for RabbitMQ
  *   It’s possible to generate as much test input for RabbitMQ as we need
  *   Apache Beam has a RabbitMQ IO [2] that could hypothetically be used in 
the “write” step of the test

The downsides of this choice are:

  *   We would need a RabbitMQ test container and additional Kubernetes 
configuration in ./test-infra
  *   RabbitMQ's peak throughput is lower than Kafka's, for example [3]


Based on this, two questions arise:

  1.  Are there any restrictions when choosing a test source? Can we use 
RabbitMQ in our case?

  2.  If RabbitMQ is suitable for our purposes, can we use the RabbitMQ IO to 
write data in the integration test “write” step, or should we use the RabbitMQ 
API directly without adding a dependency on the Apache Beam RabbitMQ IO?


Any ideas or comments would be greatly appreciated!


Thank you in advance,

Elizaveta


[1] [BEAM-14378] [CdapIO] SparkReceiverIO Read via SDF #17828 – 
https://github.com/apache/beam/pull/17828

[2] Apache Beam RabbitMQ IO – 
https://github.com/apache/beam/tree/master/sdks/java/io/rabbitmq

[3] Benchmarking Apache Kafka, RabbitMQ article (2020) – 
https://www.confluent.io/blog/kafka-fastest-messaging-system/

