Thanks Konstantin for your Faker link.
It looks very interesting, and the generated data looks quite realistic.
We could add this generator to the datagen source.
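To make this concrete, here is a purely hypothetical sketch of what a
Faker-backed datagen table could look like in DDL; the `faker` connector
name and the `fields.<column>.expression` option keys are illustrative,
not an existing API:

-- Hypothetical sketch: a datagen-style source backed by Java Faker.
-- Connector name and option keys below are illustrative only.
CREATE TABLE fake_users (
  name STRING,
  city STRING,
  email STRING
) WITH (
  'connector' = 'faker',
  'fields.name.expression' = '#{Name.fullName}',
  'fields.city.expression' = '#{Address.cityName}',
  'fields.email.expression' = '#{Internet.emailAddress}'
);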
Best,
Jingsong Lee
On Fri, May 1, 2020 at 1:00 AM Konstantin Knauf wrote:
Hi Jark,
my gut feeling is 1), because of its consistency with other connectors
(it does not add two secret keywords), although it is more verbose.
Best,
Konstantin
On Thu, Apr 30, 2020 at 5:01 PM Jark Wu wrote:
Hi Konstantin,
Thanks for the link to Java Faker. It's an interesting project and
could benefit a comprehensive datagen source.
What would the discarding and printing sinks look like, in your view?
1) manually create a table with a `blackhole` or `print` connector, e.g.
CREATE TABLE my_sink (
a INT
) WITH (
'connector' = 'blackhole'
)
2) provide `console` and `blackhole` as built-in system tables that can be
used directly without any DDL
Hi everyone,
sorry for reviving this thread at this point in time. Generally, I think,
this is a very valuable effort. Have we considered only providing a very
basic data generator (+ discarding and printing sink tables) in Apache
Flink and moving a more comprehensive data-generating table source out of
the core project?
Hi all,
I created https://issues.apache.org/jira/browse/FLINK-16743 for follow-up
discussion. FYI.
Best,
Jingsong Lee
On Tue, Mar 24, 2020 at 2:20 PM Bowen Li wrote:
I agree with Jingsong that sink schema inference and system tables can be
considered later. I wouldn't recommend tackling them just for the sake of
simplifying the user experience to the extreme. Providing the above handy
source and sink implementations already offers users a ton of immediate
value.
Hi Benchao,
> do you think we need to add more columns with various types?
I didn't list all the types, but we should support primitive types, VARCHAR,
DECIMAL, TIMESTAMP, etc.
This can be done incrementally.
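As a sketch, a datagen table covering several of these types might look
like the following (assuming the random generator can handle each listed
type):

-- Sketch: a datagen table exercising several column types.
-- Assumes the random generator supports each of the listed types.
CREATE TABLE typed_source (
  c_int INT,
  c_bigint BIGINT,
  c_varchar VARCHAR(32),
  c_decimal DECIMAL(10, 2),
  c_ts TIMESTAMP(3)
) WITH (
  'connector' = 'datagen'
);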
Hi Benchao, Jark,
About console and blackhole: yes, they can have no schema; the schema
can be inferred from the query.
Hi Jingsong,
Regarding (2) and (3), I was thinking of skipping the manual DDL work, so
users can use them directly:
# this will log results to `.out` files
INSERT INTO console
SELECT ...
# this will drop all received records
INSERT INTO blackhole
SELECT ...
Here `console` and `blackhole` are system tables.
Hi Jingsong,
Thanks for bringing this up. Generally, it's a very good proposal.
About the datagen source: do you think we need to add more columns with
various types?
About the print sink: do we need to specify the schema?
On Mon, Mar 23, 2020 at 1:51 PM Jingsong Li wrote:
Thanks Bowen, Jark and Dian for your feedback and suggestions.
I've reorganized it based on your suggestions; here are the proposed DDLs:
1. datagen source:
- easy startup/testing for streaming jobs
- performance testing
DDL:
CREATE TABLE user (
  id BIGINT,
  age INT,
  description STRING
) WITH (
  'connector' = 'datagen'
)
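For the performance-testing use case, a throttling option could be added;
the `rows-per-second` key below matches what the datagen connector
eventually shipped with, but treat it as a sketch of the proposal:

-- Sketch: rate-limiting the generator for performance testing.
-- 'rows-per-second' matches the option the released connector uses.
CREATE TABLE user_behavior (
  user_id BIGINT,
  item_id BIGINT
) WITH (
  'connector' = 'datagen',
  'rows-per-second' = '10000'
);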
Thanks Jingsong for bringing up this discussion. +1 to this proposal.
Bowen's proposal also makes a lot of sense to me.
This is also a painful problem for PyFlink users. Currently there is no
built-in, easy-to-use table source/sink, and it requires users to write a
lot of code just to try out PyFlink.
+1 to Bowen's proposal. I also saw many requirements on such built-in
connectors.
I will leave some of my thoughts here:
> 1. datagen source (random source)
I think we can merge the functionality of the sequence source into the
random source, to allow users to customize their data values.
Flink can then generate random values for each column by default.
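For example, a merged source could let each field choose its generator
kind; the option keys below follow the shape the datagen connector later
shipped with, shown here as a sketch:

-- Sketch: per-field generator kinds (sequence vs. random).
-- Option keys mirror the released datagen connector; treat as illustrative.
CREATE TABLE orders (
  order_id BIGINT,
  price DOUBLE
) WITH (
  'connector' = 'datagen',
  'fields.order_id.kind' = 'sequence',
  'fields.order_id.start' = '1',
  'fields.order_id.end' = '1000000',
  'fields.price.kind' = 'random',
  'fields.price.min' = '1',
  'fields.price.max' = '500'
);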
+1.
I would suggest taking it a step further and looking at what users really
need in order to test/try/play with the Table API and Flink SQL. Besides
this one, here are some more sources and sinks that I have developed or
used previously to facilitate building Flink table/SQL pipelines.
1. random input data source ...
Hi all,
I've heard some users complain that the Table API is difficult to test. Now
with the SQL client, users are more and more inclined to use it for testing
rather than writing programs.
The most common example is the Kafka source. If users need to test their
SQL output and checkpointing, they need to:
- 1. Launch a Kafka standalone cluster ...