Hi Arvid,

Thanks for the suggestion! I will try it out and see how it works.

Best,
Eleanore

On Mon, May 18, 2020 at 8:04 AM Arvid Heise <ar...@ververica.com> wrote:

> Hi Eleanore,
>
> The general question is what you mean by edge data centers, as the term
> is pretty fuzzy. Since Flink runs on Java, it's not suitable for embedded
> clusters as of now. Plenty of work has already been done to test that
> Flink runs on ARM clusters [1].
>
> If you just mean moving away from a monolithic hub cluster to smaller
> clusters, then this is easily done with Flink on the compute side. The
> question is rather how data storage should look in such an edge setting
> and how the interfaces look.
>
> From your example, it seems as if you want to use Flink as a reactive
> server, possibly easily scalable. If so, then yes, it is possible with
> Flink, even though I'd say it's not the primary use case for Flink. In any
> case, synchronous requests will be a bit difficult/unnatural. I'd probably
> go for an async job pattern. So Flink listens on some port for requests
> (socketTextStream [2]) with a job id, processes the data, and keeps the
> result in state keyed by job id. The client then uses the job id to fetch
> the job state through queryable state [3]. The responses eventually time
> out through TTL [4].
>
> Of course, you'd put a small proxy in front of that composite job
> (separate input/query port) that translates the queries from the client
> to the Flink job. The proxy would most likely also generate the job id and
> return it to the client. Ultimately, that proxy could offer a synchronous
> interface and poll for the result itself, but that makes the proxy
> suddenly quite heavy.
>
> The proxy setup can be reused for different edge clusters, making it a
> one-time investment. Note that there are other software stacks for
> reactive servers that offer this functionality out of the box.
>
> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-ARM-support-for-Flink-td30298.html
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.10/dev/datastream_api.html#data-sources
> [3] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html
> [4] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state.html#state-time-to-live-ttl
>
> On Mon, May 18, 2020 at 4:39 AM Eleanore Jin <eleanore....@gmail.com>
> wrote:
>
>> Hi Community,
>>
>> Currently we are running Flink in 'hub' data centers, where data is
>> ingested into the platform via Kafka; a Flink job reads from Kafka,
>> does the transformations, and publishes to another Kafka topic.
>>
>> I would also like to see if the same logic (read input message -> do
>> transformation -> return output message) can be applied in 'edge' data
>> centers.
>>
>> The requirement for running on the 'edge' is to return the response
>> synchronously, like a synchronous HTTP-based request/response.
>>
>> Can you please provide some guidance/thoughts on this?
>>
>> Thanks a lot!
>> Eleanore
>
> --
> Arvid Heise | Senior Java Developer
> Ververica GmbH
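
For anyone following the thread, the async job pattern from Arvid's reply (the proxy generates a job id, the result is kept keyed by job id, the client fetches it by id, entries expire via TTL) can be sketched roughly as below. This is only an illustration in plain Python, not Flink code: `JobResultStore` is a hypothetical in-memory stand-in for keyed state with TTL, `Proxy` is a hypothetical stand-in for the small proxy in front of the job, and a background thread plays the role of the job itself. All names and parameters here are assumptions made for the example.

```python
import threading
import time
import uuid


class JobResultStore:
    """Hypothetical in-memory stand-in for keyed state with a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised in tests
        self._lock = threading.Lock()
        self._results = {}  # job_id -> (result, expiry timestamp)

    def put(self, job_id, result):
        with self._lock:
            self._results[job_id] = (result, self._clock() + self._ttl)

    def get(self, job_id):
        """Return the stored result, or None if it is missing or expired."""
        with self._lock:
            entry = self._results.get(job_id)
            if entry is None:
                return None
            result, expires_at = entry
            if self._clock() >= expires_at:
                del self._results[job_id]  # lazily drop expired entries
                return None
            return result


class Proxy:
    """Hypothetical proxy: generates job ids and translates client calls."""

    def __init__(self, store, transform):
        self._store = store
        self._transform = transform  # stands in for the job's processing logic

    def submit(self, payload):
        """Asynchronous path: start processing and return the job id at once."""
        job_id = uuid.uuid4().hex
        worker = threading.Thread(
            target=self._process, args=(job_id, payload), daemon=True
        )
        worker.start()
        return job_id

    def _process(self, job_id, payload):
        self._store.put(job_id, self._transform(payload))

    def fetch(self, job_id):
        """Query path: look up the result by job id (None if not available)."""
        return self._store.get(job_id)

    def request_sync(self, payload, timeout=2.0, poll_interval=0.01):
        """Synchronous facade: submit, then poll until a result or timeout.

        A result of None is indistinguishable from "not ready" in this sketch.
        """
        job_id = self.submit(payload)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            result = self.fetch(job_id)
            if result is not None:
                return result
            time.sleep(poll_interval)
        raise TimeoutError(f"no result for job {job_id} within {timeout}s")


if __name__ == "__main__":
    proxy = Proxy(JobResultStore(ttl_seconds=60), str.upper)
    print(proxy.request_sync("hello from the edge"))
```

The `request_sync` method is the part that makes the proxy "heavy": it has to hold the client connection open and poll on the client's behalf, whereas `submit` plus `fetch` leaves the waiting to the client.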