Thanks Till for your suggestions.

Personally, I like flink-warehouse; it is what we want to convey to
the user, but it implies a bit too much scope.

How about just calling it flink-store?
It simply conveys the impression that this is Flink's store project,
providing a built-in store for the Flink compute engine that can be
used by flink-table as well as flink-datastream.

Best,
Jingsong

On Tue, Dec 28, 2021 at 5:15 PM Till Rohrmann <trohrm...@apache.org> wrote:
>
> Hi Jingsong,
>
> I think that developing flink-dynamic-storage as a separate sub project is
> a very good idea since it allows us to move a lot faster and decouple
> releases from Flink. Hence big +1.
>
> Do we want to name it flink-dynamic-storage or shall we use a more
> descriptive name? dynamic-storage sounds a bit generic to me and I wouldn't
> know that this has something to do with letting Flink manage your tables
> and their storage. I don't have a very good idea but maybe we can call it
> flink-managed-tables, flink-warehouse, flink-olap or so.
>
> Cheers,
> Till
>
> On Tue, Dec 28, 2021 at 9:49 AM Martijn Visser <mart...@ververica.com>
> wrote:
>
> > Hi Jingsong,
> >
> > That sounds promising! +1 from my side to continue development under
> > flink-dynamic-storage as a Flink subproject. I think having a more in-depth
> > interface will benefit everyone.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Tue, 28 Dec 2021 at 04:23, Jingsong Li <jingsongl...@gmail.com> wrote:
> >
> >> Hi all,
> >>
> >> After some experimentation, we found no problem with putting the
> >> dynamic storage outside of Flink, and doing so also allowed us to
> >> design the interface in more depth.
> >>
> >> What do you think? If there is no problem, I am asking for the PMC's
> >> help here: we want to propose flink-dynamic-storage as a Flink
> >> subproject, and we want to build the project under Apache.
> >>
> >> Best,
> >> Jingsong
> >>
> >>
> >> On Wed, Nov 24, 2021 at 8:10 PM Jingsong Li <jingsongl...@gmail.com>
> >> wrote:
> >> >
> >> > Hi Stephan,
> >> >
> >> > Thanks for your reply.
> >> >
> >> > Data never expires automatically.
> >> >
> >> > If there is a need for data retention, the user can choose one of the
> >> > following options:
> >> > - In the SQL that queries the managed table, users filter the data
> >> > themselves.
> >> > - Define a time partition, and users can delete expired partitions
> >> > themselves. (DROP PARTITION ...)
> >> > - In a future version, we will support the "DELETE FROM" statement,
> >> > so users can delete expired data according to conditions.
> >> >
> >> > So to answer your question:
> >> >
> >> > > Will the VMQ send retractions so that the data will be removed from
> >> > > the table (via compactions)?
> >> >
> >> > The current implementation does not send retractions, although I think
> >> > in theory they should be sent. For now, the user can filter with
> >> > subsequent conditions.
> >> > And yes, a subscriber would not see a strictly correct result. I
> >> > think this is something we can improve for Flink SQL.
> >> >
> >> > > Do we want time retention semantics handled by the compaction?
> >> >
> >> > Currently, no. Data never expires automatically.
> >> >
> >> > > Do we want to declare those types of queries "out of scope" initially?
> >> >
> >> > I think we want users to be able to use the three options above to
> >> > meet their requirements.
> >> >
> >> > I will update the FLIP to make the definition clearer and more explicit.
> >> >
> >> > Best,
> >> > Jingsong
> >> >
> >> > On Wed, Nov 24, 2021 at 5:01 AM Stephan Ewen <ewenstep...@gmail.com>
> >> > wrote:
> >> > >
> >> > > Thanks for digging into this.
> >> > > Regarding this query:
> >> > >
> >> > > INSERT INTO the_table
> >> > >   SELECT window_end, COUNT(*)
> >> > >     FROM (TUMBLE(TABLE interactions, DESCRIPTOR(ts), INTERVAL '5' MINUTES))
> >> > > GROUP BY window_end
> >> > >   HAVING now() - window_end <= INTERVAL '14' DAYS;
> >> > >
> >> > > I am not sure I understand what the conclusion is on the data
> >> > > retention question, where the continuous streaming SQL query has
> >> > > retention semantics. I think we would need to answer the following
> >> > > questions (I will call the query that computed the managed table the
> >> > > "view materializer query" - VMQ).
> >> > >
> >> > > (1) I guess the VMQ will send no updates for windows once the
> >> > > "retention period" (14 days) is over, as you said. That makes sense.
> >> > >
> >> > > (2) Will the VMQ send retractions so that the data will be removed
> >> > > from the table (via compactions)?
> >> > >   - if yes, this seems semantically better for users, but it will be
> >> > >     expensive to keep the timers for retractions.
> >> > >   - if not, we can still solve this by adding filters to queries
> >> > >     against the managed table, as long as these queries are in Flink.
> >> > >   - any subscriber to the changelog stream would not see a strictly
> >> > >     correct result if we are not doing the retractions
> >> > >
> >> > > (3) Do we want time retention semantics handled by the compaction?
> >> > >   - if we say that we lazily apply the deletes in the queries that
> >> > >     read the managed tables, then we could also age out the old data
> >> > >     during compaction.
> >> > >   - that is cheap, but it might be too much of a special case to be
> >> > >     very relevant here.
> >> > >
> >> > > (4) Do we want to declare those types of queries "out of scope"
> >> > > initially?
> >> > >   - if yes, how many users are we affecting? (I guess probably not
> >> > >     many, but would be good to hear some thoughts from others on this)
> >> > >   - should we simply reject such queries in the optimizer as "not
> >> > >     possible to support in managed tables"? I would suggest that, always
> >> > >     better to tell users exactly what works and what not, rather than
> >> > >     letting them be surprised in the end. Users can still remove the
> >> > >     HAVING clause if they want the query to run, and that would be
> >> > >     better than if the VMQ just silently ignores those semantics.
> >> > >
> >> > > Thanks,
> >> > > Stephan
> >> > >
> >> >
> >> >
> >> > --
> >> > Best, Jingsong Lee
> >>
> >>
> >>
> >> --
> >> Best, Jingsong Lee
> >>
> >



-- 
Best, Jingsong Lee
