Hi Pilgrim,

It sounds to me as if you are planning to use Flink for batch processing,
with some centralized server to which you submit your queries.

While you can use Flink SQL for that (and it is used this way at larger
companies), Flink was originally designed for streaming applications. For
those, the query is not just configuration; the query IS the application.
You'd usually spawn a few K8s pods and let your application plus Flink run
there. Hence the many code interfaces.
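
For illustration, a minimal sketch of such a self-contained streaming job
(the source, sink and job name below are placeholders; a real job would
read from e.g. Kafka):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MyStreamingJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.fromElements("a", "b", "a")       // stand-in for a real source such as Kafka
               .map(value -> "seen: " + value)    // the "query" is ordinary application code
               .print();                          // stand-in for a real sink

            env.execute("my-streaming-job");      // the application drives its own execution
        }
    }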

Even if you use SQL on streaming data, you'd usually start a new ad-hoc
cluster per application to get better isolation. There are quite a few
deployments that use YARN or Mesos to provide a large pool of nodes shared
by many Flink batch and streaming jobs, but I usually wouldn't recommend
that to newer users. I'd even go as far as to say that most of these
organizations wouldn't choose that stack if they were setting up a cluster
today; they, too, would go for a K8s solution.
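
For example, with Flink's native Kubernetes application mode you would
launch one short-lived cluster per job, roughly like this (cluster id,
image and jar path are placeholders):

    ./bin/flink run-application \
        --target kubernetes-application \
        -Dkubernetes.cluster-id=my-flink-job \
        -Dkubernetes.container.image=my-registry/my-flink-app:latest \
        local:///opt/flink/usrlib/my-job.jar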

On Tue, Feb 9, 2021 at 1:05 PM Yun Gao <yungao...@aliyun.com> wrote:

> Hi Pilgrim,
>
> Currently the Table API indeed cannot use low-level APIs such as timers.
> Would a mixture of SQL & DataStream satisfy the requirements? A job might
> be created from multiple SQL queries and connected via DataStream
> operations.
>
> Best,
>  Yun
>
>
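A rough sketch of what such a mixture could look like, assuming Flink
1.13+ for the StreamTableEnvironment#toDataStream bridge (the query, field
name and timeout below are placeholders):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;
    import org.apache.flink.util.Collector;

    public class SqlPlusTimers {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

            // Part of the job expressed as SQL (placeholder query):
            Table events = tEnv.sqlQuery("SELECT 'device-1' AS id");

            // Bridged to a DataStream so low-level features such as timers are available:
            DataStream<Row> stream = tEnv.toDataStream(events);

            stream.keyBy(row -> (String) row.getField("id"))
                  .process(new KeyedProcessFunction<String, Row, String>() {
                      @Override
                      public void processElement(Row row, Context ctx, Collector<String> out) {
                          // Register a processing-time timeout 60s after each element.
                          ctx.timerService().registerProcessingTimeTimer(
                                  ctx.timerService().currentProcessingTime() + 60_000);
                      }

                      @Override
                      public void onTimer(long ts, OnTimerContext ctx, Collector<String> out) {
                          out.collect("timeout for key " + ctx.getCurrentKey());
                      }
                  })
                  .print();

            env.execute("sql-plus-timers");
        }
    }
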
> ------------------------------------------------------------------
> From: Pilgrim Beart <pilgrim.be...@devicepilot.com>
> Date: 2021/02/09 02:22:46
> To: <user@flink.apache.org>
> Subject: Any plans to make Flink configurable with pure data?
>
> To a naive Flink newcomer (me) it's a little surprising that there is no
> pure "data" mechanism for specifying a Flink pipeline, only "code"
> interfaces. With the DataStream interface I can use Java, Scala or Python
> to set up a pipeline and then execute it - but that doesn't really seem to
> *need* a programming model; it seems like configuration, which could be done
> with data? OK, one does occasionally need to specify some custom code, e.g.
> a ProcessFunction, but for any given use-case, a relatively static library
> of such functions would seem fine.
>
> My use case is that I have lots of customers, and I'm doing a similar job
> for each of them, so I'd prefer to have a library of common code (e.g.
> ProcessFunctions), and then specify each customer's specific requirements
> in a single config file.  To do that in Java, I'd have to do
> metaprogramming (to build various pieces of Java out of that config file).
>
> Flink SQL seems to be the closest solution, but doesn't appear to support
> fundamental Flink concepts such as timers (?). Is there a plan to evolve
> Flink SQL to support timers? Timeouts are my specific need.
>
> Thanks,
>
> -Pilgrim
> --
> Learn more at https://devicepilot.com @devicepilot
