Hi Flavio,

Is your requirement to use the blink batch planner to read the tables in Presto?
I'm not familiar with Presto's catalogs. Are they similar to the Hive Metastore?

If so, what needs to be done is similar to the Hive connector: you would
implement a Flink Catalog for Presto that translates Presto tables into
Flink tables. You may also need to deal with partitions, statistics,
and so on.
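
For reference, here is a minimal sketch of how the existing HiveCatalog is
plugged into the blink batch planner; a Presto catalog would be registered
through the same Catalog interface. The catalog name, conf dir, Hive version
and table name below are placeholders (a PrestoCatalog class does not exist
yet):

    import org.apache.flink.table.api.EnvironmentSettings;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.TableEnvironment;
    import org.apache.flink.table.catalog.hive.HiveCatalog;

    public class CatalogExample {
        public static void main(String[] args) {
            // Blink planner in batch mode.
            EnvironmentSettings settings = EnvironmentSettings.newInstance()
                    .useBlinkPlanner()
                    .inBatchMode()
                    .build();
            TableEnvironment tableEnv = TableEnvironment.create(settings);

            // The existing Hive integration: Metastore tables become Flink
            // tables through the Catalog interface. A Presto catalog would
            // be registered the same way once implemented.
            HiveCatalog hive =
                    new HiveCatalog("myhive", "default", "/opt/hive-conf", "2.3.4");
            tableEnv.registerCatalog("myhive", hive);
            tableEnv.useCatalog("myhive");

            // Query a table exposed by the catalog with the batch planner.
            Table result = tableEnv.sqlQuery("SELECT COUNT(*) FROM some_table");
        }
    }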

Best,
Jingsong Lee

On Mon, Jan 27, 2020 at 9:58 PM Itamar Syn-Hershko <
ita...@bigdataboutique.com> wrote:

> Yes, Flink does batch processing by "reevaluating a stream" so to speak.
> Presto doesn't have sources and sinks, only catalogs (which always allow
> reads, and sometimes also writes).
>
> Presto catalogs are a configuration - they are managed on the node
> filesystem as configuration files and nowhere else. Flink sources/sinks
> are programmatically configurable and are compiled into your Flink
> program. So that is not possible at the moment; all you can do is get
> that info from the APIs of both products and visualize it. Definitely
> not manage them from a single place.
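>
> For concreteness, a Presto catalog is nothing more than a properties file
> on each node's filesystem, e.g. etc/catalog/hive.properties. A minimal
> sketch, assuming the Hive connector and an example Metastore URI:
>
>     connector.name=hive-hadoop2
>     hive.metastore.uri=thrift://metastore.example.net:9083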
>
> On Mon, Jan 27, 2020 at 3:54 PM Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Both Presto and Flink make use of a Catalog in order to be able to
>> read/write data from a source/sink.
>> I don't agree that "Flink is about processing data streams", because
>> Flink is also competitive for batch workloads (and this will be further
>> improved in the next releases).
>> I'd like to register my data sources/sinks in one single catalog (e.g.
>> Presto) and then be able to reuse it in Flink as well (with a simple
>> translation).
>> My idea of integration here is thus more at the catalog level: once the
>> configuration part is done, I would use Presto for exploring data from
>> the UI and Flink for processing it (I have many Flink jobs that I don't
>> want to throw away or rewrite).
>>
>> On Mon, Jan 27, 2020 at 2:30 PM Itamar Syn-Hershko <
>> ita...@bigdataboutique.com> wrote:
>>
>>> Hi Flavio,
>>>
>>> Presto contributor and Starburst Partners here.
>>>
>>> Presto and Flink are solving completely different challenges. Flink is
>>> about processing data streams as they come in; Presto is about ad-hoc /
>>> periodic querying of data sources.
>>>
>>> A typical architecture would use Flink to process data streams and
>>> write data and aggregations to some data stores (Redis, MemSQL, SQL
>>> databases, Elasticsearch, etc.) and then use Presto to query those data
>>> stores (and possibly also others, using Query Federation).
>>>
>>> What kind of integration are you looking for?
>>>
>>> On Mon, Jan 27, 2020 at 1:44 PM Flavio Pompermaier <pomperma...@okkam.it>
>>> wrote:
>>>
>>>> Hi all,
>>>> is there any integration between Presto and Flink? I'd like to use
>>>> Presto for the UI part (preview and so on) while using Flink for the
>>>> batch processing. Or would you suggest something else?
>>>>
>>>> Best,
>>>> Flavio
>>>>
>>>
>>>
>>> --
>>>
>>> Itamar Syn-Hershko
>>> CTO, Founder
>>> +972-54-2467860
>>> ita...@bigdataboutique.com
>>> https://bigdataboutique.com
>>>
>>
>>
>
> --
>
> Itamar Syn-Hershko
> CTO, Founder
> +972-54-2467860
> ita...@bigdataboutique.com
> https://bigdataboutique.com
>


-- 
Best, Jingsong Lee
