Ok. This makes sense.
I will try it out.
One more question regarding performance: which of the two connectors would
scan the existing collection faster? Say the existing collection has 10
million records and is about 1 GB in storage size.
Thanks
Sachin
On Fri, 16 Aug 2024 at 4:09 PM, Jiabao Sun wrote:
Yes, you can use flink-connector-mongodb-cdc to process both existing and new
data.
See
https://nightlies.apache.org/flink/flink-cdc-docs-release-3.1/docs/connectors/flink-sources/mongodb-cdc/#startup-reading-position
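For example, a minimal DataStream sketch with the default "initial" startup
mode (which first snapshots the existing collection and then streams
subsequent changes) could look like the below. The host, database, and
collection names are placeholders, and the package layout assumes Flink CDC
3.x:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.cdc.connectors.base.options.StartupOptions;
import org.apache.flink.cdc.connectors.mongodb.source.MongoDBSource;
import org.apache.flink.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class MongoCdcInitialExample {
    public static void main(String[] args) throws Exception {
        // StartupOptions.initial() snapshots the existing collection first,
        // then switches to reading the change stream for new inserts/updates.
        MongoDBSource<String> source = MongoDBSource.<String>builder()
                .hosts("localhost:27017")        // placeholder host
                .databaseList("mydb")            // placeholder database
                .collectionList("mydb.mycoll")   // placeholder collection
                .startupOptions(StartupOptions.initial())
                .deserializer(new JsonDebeziumDeserializationSchema())
                .build();

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "MongoDB CDC Source")
                .print();
        env.execute("mongodb-cdc-initial");
    }
}

With initial(), the snapshot and the subsequent change stream are handled by
the same source, so you don't need a separate batch connector for the
backfill.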
Best,
Jiabao
On 2024/08/16 10:26:55 Sachin Mittal wrote:
Hi Jiabao,
My use case is that when I start my Flink job, it should load and process all
the existing data in a collection and also wait for and process any new data
that comes along the way.
Since I noticed that flink-connector-mongodb would process all the existing
data, do I still need this connector, or can the CDC connector alone handle
both the existing and the new data?
Hi Sachin,
flink-connector-mongodb supports batch reading from and writing to MongoDB,
similar to flink-connector-jdbc, while flink-connector-mongodb-cdc supports
streaming MongoDB changes.
If you need to stream MongoDB changes, you should use
flink-connector-mongodb-cdc.
You can refer to the following documentation for more details.
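For illustration, a minimal bounded read with flink-connector-mongodb could
look like the below. The URI, database, and collection are placeholders, and
the builder and deserializer names follow my reading of the connector's API:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.connector.mongodb.source.MongoSource;
import org.apache.flink.connector.mongodb.source.reader.deserializer.MongoDeserializationSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.bson.BsonDocument;

public class MongoBatchReadExample {
    public static void main(String[] args) throws Exception {
        // A bounded source: the job reads the whole collection once and finishes.
        MongoSource<String> source = MongoSource.<String>builder()
                .setUri("mongodb://localhost:27017") // placeholder URI
                .setDatabase("mydb")                 // placeholder database
                .setCollection("mycoll")             // placeholder collection
                .setDeserializationSchema(new MongoDeserializationSchema<String>() {
                    @Override
                    public String deserialize(BsonDocument document) {
                        // Emit each document as its JSON representation.
                        return document.toJson();
                    }

                    @Override
                    public TypeInformation<String> getProducedType() {
                        return BasicTypeInfo.STRING_TYPE_INFO;
                    }
                })
                .build();

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "MongoDB Batch Source")
                .print();
        env.execute("mongodb-bounded-read");
    }
}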
Hi,
I have a scenario where I load a collection from MongoDB inside Flink using
flink-connector-mongodb.
What I additionally want is that any future changes (inserts/updates) to that
collection are also streamed into my Flink job.
What I was thinking of is to use a CDC connector to stream that data into my
Flink job as well.