Hi Stefan,
the problem is that you cannot directly influence the scheduling of tasks
to nodes, so you cannot guarantee that a task runs on the node that holds
the data you put in its local filesystem. HDFS provides a shared file
system, which means that each node can read data from anywhere in the cluster.
I assumed t
Hi Fabian,
I think we might have a misunderstanding here. I have already copied the
first file to five nodes, and the second file to five other nodes, outside
of Flink. In the open() method of the operator, I just read that file via
normal Java means. I do not see why this is tricky or how HDFS s
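To make it concrete, this is roughly what I do in open(). The class and file
names here are made up for illustration, and I've written it as a plain Java
class (in the real job this logic sits in a RichMapFunction's open(Configuration)
method) so the sketch is self-contained:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical operator: loads a node-local file once at startup,
// then serves lookups from it per record.
class LocalFileEnricher {

    private List<String> lookup;

    // In the actual Flink job this body lives in
    // RichMapFunction.open(Configuration); the file was copied to the
    // node beforehand, outside of Flink.
    public void open(String localPath) throws IOException {
        lookup = Files.readAllLines(Path.of(localPath));
    }

    // Per-record logic: look up a value from the preloaded file.
    public String map(int index) {
        return lookup.get(index);
    }
}
```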
Hey Martin,
I don't think anybody has used Google Cloud Pub/Sub with Flink yet.
There are no tutorials for implementing streaming sources and sinks, but
Flink has a few connectors that you can use as a reference.
For the sources, you basically have to extend RichSourceFunction (or
RichParallelSourceFu
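The overall shape of such a source looks roughly like this. Since I can't
paste a full Flink program into a mail, SourceContext is reduced to a minimal
stand-in interface, and the Pub/Sub pull is replaced by a placeholder list;
the point is only the run()/cancel() control flow that Flink's source
interface expects:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal stand-in for Flink's SourceFunction.SourceContext so the
// sketch compiles on its own.
interface SourceContext<T> {
    void collect(T element);
}

// Sketch of a custom streaming source, modeled on the contract of
// Flink's RichParallelSourceFunction: run() emits elements until
// cancel() flips a flag.
class PubSubLikeSource {

    private final AtomicBoolean running = new AtomicBoolean(true);

    // Placeholder for a real Pub/Sub subscription pull.
    private final List<String> fakeSubscription;

    PubSubLikeSource(List<String> messages) {
        this.fakeSubscription = messages;
    }

    public void run(SourceContext<String> ctx) {
        // A real source would block on the external system; here we
        // just drain a list to show the control flow.
        for (String msg : fakeSubscription) {
            if (!running.get()) {
                break;
            }
            ctx.collect(msg);
        }
    }

    // Called by the framework from another thread to stop the source.
    public void cancel() {
        running.set(false);
    }
}
```

The existing connectors (e.g. the Kafka consumer) follow this same pattern and
are the best reference for the checkpointing and threading details.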
I don't think it's working.
According to the Kafka documentation (
https://kafka.apache.org/documentation.html#upgrade):
> 0.8, the release in which we added replication, was our first
> backwards-incompatible release: major changes were made to the API,
> ZooKeeper data structures, and protocol, and co