Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

Kurt Young Thu, 24 Sep 2020 01:15:42 -0700

Yeah, JDBC is definitely a popular use case we should consider.

Best,
Kurt



On Thu, Sep 24, 2020 at 4:11 PM Flavio Pompermaier <pomperma...@okkam.it>
wrote:

> Hi Kurt, in the past we had a very interesting use case in this regard: our
> customer (oracle) db was quite big and running too many queries in parallel
> was too heavy and it was causing the queries to fail.
> So we had to limit the source parallelism to 2 threads. After the fetching
> of the data the other operators could use the max parallelism as usual..
>
> Best,
> Flavio
>
> On Thu, Sep 24, 2020 at 9:59 AM Kurt Young <ykt...@gmail.com> wrote:
>
> > Thanks Jingsong for driving this, this is indeed a useful feature and
> lots
> > of users are asking for it.
> >
> > For setting a fixed source parallelism, I'm wondering whether this is
> > necessary. For kafka,
> > I can imagine users would expect Flink will use the number of partitions
> as
> > the parallelism. If it's too
> > large, one can use the max parallelism to make it smaller.
> > But for ES, which doesn't have ability to decide a reasonable parallelism
> > on its own, it might make sense
> > to introduce a user specified parallelism for such table source.
> >
> > So I think it would be better to reorganize the document a little bit, to
> > explain the connectors one by one. Briefly
> > introduce use cases and what kind of options are needed in your opinion.
> >
> > Regarding the interface `DataStreamScanProvider`, a concrete example
> would
> > help the discussion. What kind
> > of scenarios do you want to support? And what kind of connectors need
> such
> > an interface?
> >
> > Best,
> > Kurt
> >
> >
> > On Wed, Sep 23, 2020 at 9:30 PM admin <17626017...@163.com> wrote:
> >
> > > +1,it’s a good news
> > >
> > > > 2020年9月23日 下午6:22，Jingsong Li <jingsongl...@gmail.com> 写道：
> > > >
> > > > Hi all,
> > > >
> > > > I'd like to start a discussion about improving the new TableSource
> and
> > > > TableSink interfaces.
> > > >
> > > > Most connectors have been migrated to FLIP-95, but there are still
> the
> > > > Filesystem and Hive that have not been migrated. They have some
> > > > requirements on table connector API. And users also have some
> > additional
> > > > requirements：
> > > > - Some connectors have the ability to infer parallelism, the
> > parallelism
> > > is
> > > > good for most cases.
> > > > - Users have customized parallelism configuration requirements for
> > source
> > > > and sink.
> > > > - The connectors need to use topology to build their source/sink
> > instead
> > > of
> > > > a single function. Like JIRA[1], Partition Commit feature and File
> > > > Compaction feature.
> > > >
> > > > Details are in [2].
> > > >
> > > > [1]https://issues.apache.org/jira/browse/FLINK-18674
> > > > [2]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-146%3A+Improve+new+TableSource+and+TableSink+interfaces
> > > >
> > > > Best,
> > > > Jingsong
> > >
> > >
>

Re: [DISCUSS] FLIP-146: Improve new TableSource and TableSink interfaces

Reply via email to