Hi Jingsong,
Thanks for the feedback. Could you let me know the concept and timeline of
BulkFormat/ParquetBulkFormat, and how they differ from ParquetInputFormat?
Our use case is backfill: processing Parquet files in case any data issue
is found in the normal processing of the Kafka input. Thus, we
Hi Kevin,
Could you share the code of how you register the FlinkKafkaConsumer as a
table?
Regarding your initialization of the FlinkKafkaConsumer, I would recommend
calling setStartFromEarliest() to guarantee it consumes all the records in
all partitions; see the sketch after this message.
Regarding the flush(), it seems it is in the foreach
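For reference, a minimal sketch of both points, assuming Flink 1.11-era
APIs; the topic name, broker address, and group id are placeholders:

// Register a FlinkKafkaConsumer as a table and read from the earliest offsets.
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class KafkaAsTable {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
        props.setProperty("group.id", "example-group");           // assumed group id

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props);
        consumer.setStartFromEarliest(); // consume every record in all partitions

        DataStream<String> stream = env.addSource(consumer);
        tEnv.createTemporaryView("kafka_input", stream); // register the stream as a table

        Table result = tEnv.sqlQuery("SELECT * FROM kafka_input");
        tEnv.toAppendStream(result, Row.class).print();

        env.execute("kafka-as-table");
    }
}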
Just want to chime in here that it would be fantastic to have a way to do
dependency injection in Flink. Ideally the injected services themselves
don't get serialized at all, since they're just singletons in our case.
E.g. we have an API client that looks up data from our API and caches it
for all the functions that need it.
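For what it's worth, a common workaround today (not real DI) is to keep the
service out of serialization entirely via a static holder wired up in
open(); a minimal sketch, where ApiClient and lookup() are hypothetical
names:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichFunction extends RichMapFunction<String, String> {

    // static: never serialized with the function; one instance per TaskManager JVM
    private static ApiClient client;

    @Override
    public void open(Configuration parameters) {
        synchronized (EnrichFunction.class) {
            if (client == null) {
                client = new ApiClient(); // created once, cached for all subtasks
            }
        }
    }

    @Override
    public String map(String value) {
        return client.lookup(value); // hypothetical enrichment call
    }

    /** Hypothetical stand-in for the real cached API client. */
    static class ApiClient {
        String lookup(String key) { return key + "-enriched"; }
    }
}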
Thanks Henry for the detailed example,
I will explain why there are so many records at time 5.
That is because the retraction mechanism in Flink SQL is triggered per
record, so there is record amplification in your case.
At time 5, the LAST_VALUE aggregation for stream a will first emit -(1,
12345, 0) and t
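To make the -(...)/+(...) notation concrete, here is a small sketch
(assuming the Blink planner's built-in LAST_VALUE; the table name and data
values are invented) that surfaces retractions as a Boolean flag: false is
the retraction of the old aggregate, true is the new one.

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class RetractionDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // two updates for the same key: the second triggers a retraction of the first
        DataStream<Tuple2<Integer, Integer>> input =
                env.fromElements(Tuple2.of(1, 0), Tuple2.of(1, 12345));
        tEnv.createTemporaryView("a", input); // columns default to f0, f1

        Table agg = tEnv.sqlQuery("SELECT f0, LAST_VALUE(f1) AS v FROM a GROUP BY f0");

        // flag=false is the retraction of the previously emitted row,
        // flag=true is the newly emitted row
        tEnv.toRetractStream(agg, Row.class).print();
        env.execute("retraction-demo");
    }
}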
Yes. The dynamism might be a problem.
Does Kafka Connect support discovering new tables and synchronizing them
dynamically?
Best,
Jark
On Thu, 5 Nov 2020 at 04:39, Krzysztof Zarzycki wrote:
> Hi Jark, thanks for joining the discussion!
> I understand your point of view that SQL environment is probably not the
> best for what I was looking to achieve.
Hello,
I have two Kafka topics ("A" and "B") which provide structurally similar data
but with different load patterns, for example hundreds of records per second
in the first topic versus 10 records per second in the second. Events are
processed using the same algorithm and output to a common sink; currently
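One straightforward shape for this, sketched under the assumption of
string-valued records and with placeholder topic, broker, algorithm, and
sink names, is to read each topic with its own source and union the streams
in front of the shared processing and sink:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class TwoTopicsOnePipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker
        props.setProperty("group.id", "two-topics");              // assumed group id

        DataStream<String> a = env.addSource(
                new FlinkKafkaConsumer<>("A", new SimpleStringSchema(), props));
        DataStream<String> b = env.addSource(
                new FlinkKafkaConsumer<>("B", new SimpleStringSchema(), props));

        // union: one downstream pipeline processes both topics into one sink
        a.union(b)
         .map(record -> record.toUpperCase()) // stand-in for the shared algorithm
         .print();                            // stand-in for the common sink

        env.execute("two-topics-one-pipeline");
    }
}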
Hi Flink Users and Community,
For the first part of the question, the 12-hour time difference was caused
by a time-extraction bug of my own. I can get the time translated correctly
now. The type-cast problem does have some workarounds to solve it.
My major blocker right now is the onTimer part is no
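For context, the onTimer part pairs a registered timer with a callback; a
generic sketch of that pattern (the key type, timeout, and output message
are placeholders, and timestamps/watermarks are assumed to be assigned
upstream):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Emits a message 10 seconds (event time) after each element arrives.
public class TimeoutFunction
        extends KeyedProcessFunction<String, Tuple2<String, Long>, String> {

    @Override
    public void processElement(Tuple2<String, Long> value, Context ctx,
                               Collector<String> out) {
        // fire an event-time timer 10s after this element's timestamp
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 10_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        // invoked once the watermark passes the registered timestamp
        out.collect("timer fired for key " + ctx.getCurrentKey() + " at " + timestamp);
    }
}

It would be wired in as stream.keyBy(t -> t.f0).process(new TimeoutFunction()).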
Hi Jark, thanks for joining the discussion!
I understand your point of view that SQL environment is probably not the
best for what I was looking to achieve.
The idea of a configuration tool sounds almost perfect :) Almost, because:
without the "StatementSet" that you mentioned at the end I would b
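For readers following along, the StatementSet referenced here bundles
several INSERT INTO statements so they are optimized and submitted as one
job; a minimal sketch (table names are examples, source/sink DDL omitted):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class StatementSetExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // assume tables src, sink_a, sink_b have been created via DDL
        StatementSet set = tEnv.createStatementSet();
        set.addInsertSql("INSERT INTO sink_a SELECT * FROM src");
        set.addInsertSql("INSERT INTO sink_b SELECT * FROM src WHERE x > 0");

        set.execute(); // all statements run as a single job
    }
}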
This issue had to do with the update strategy for the Flink deployment.
When I changed it to the following, it worked:
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 0
    maxUnavailable: 1
On Tue, Nov 3, 2020 at 1:39 PM Robert Metzger wrote:
> Thanks a lot for prov
Thank you for the info!
Is there a timetable for when the next version with this change might be
released?
On Wed, Nov 4, 2020 at 2:44 AM Timo Walther wrote:
> Hi Rex,
>
> sorry for the late reply. POJOs will have much better support in the
> upcoming Flink versions because they have been fully integrated with the
> new table type system
Dear Flink developers & users,
I have a question about Flink SQL; it is giving me a lot of trouble. Thank
you very much for any help.
Let's assume we have two data streams, `order` and `order_detail`; they
are from MySQL binlog.
Table `order` schema:
id int primary key
Hi,
Let's assume we have two streams, the order stream and the order detail
stream; they are from MySQL binlog.
Table `order` schema: id (primary key), order_id, and order_status.
Table `order_detail` schema: id (primary key), order_id, and quantity.
One order item has several order_detail items.
If we have the following
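A typical query over this one-to-many layout is a changelog join plus
aggregation on order_id; a hedged sketch using the column names above
(registration of the two binlog tables, e.g. via CDC connectors, is
assumed, and `order` must be quoted because it is a reserved word):

import org.apache.flink.table.api.TableEnvironment;

public class OrderJoin {
    // assumes `order` and `order_detail` are already registered as changelog tables
    static void run(TableEnvironment tEnv) {
        tEnv.executeSql(
            "SELECT o.order_id, o.order_status, SUM(d.quantity) AS total_quantity "
          + "FROM `order` o JOIN `order_detail` d ON o.order_id = d.order_id "
          + "GROUP BY o.order_id, o.order_status").print();
    }
}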
Using Flink 1.9.2 with FsStateBackend, Session cluster.
1. Does heap state get cleaned up when a job is cancelled?
We have jobs that we run on a daily basis. We start them each morning and
cancel them each evening. We noticed that the process size does not seem to
shrink. We are looking at the res
Hi Sidney,
could you describe how your aggregation works and what your current
pipeline looks like? Is the aggregation partially applied before shuffling
the data? I'm a bit lost on what aggregation without keyBy looks like.
A decrease in throughput may also be a result of more overhead and less
av
You're right, this is a scale problem (for me that means performance).
As for what you were saying about the data skew, that could be it, so I
tried running the job without keyBy: I wrote an aggregator that accumulates
the events per country and then a FlatMap that takes that map and ret
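A sketch of that non-keyed shape, under the assumption that events carry a
country code and that a processing-time window is acceptable (all names are
made up): a windowAll aggregate builds one map of counts per country, and a
FlatMap fans the map back out as (country, count) pairs.

import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.AggregateFunction;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class CountryCounts {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("PL", "DE", "PL") // one country code per event
            .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
            .aggregate(new AggregateFunction<String, Map<String, Long>, Map<String, Long>>() {
                @Override
                public Map<String, Long> createAccumulator() { return new HashMap<>(); }
                @Override
                public Map<String, Long> add(String country, Map<String, Long> acc) {
                    acc.merge(country, 1L, Long::sum); // count per country
                    return acc;
                }
                @Override
                public Map<String, Long> getResult(Map<String, Long> acc) { return acc; }
                @Override
                public Map<String, Long> merge(Map<String, Long> a, Map<String, Long> b) {
                    b.forEach((k, v) -> a.merge(k, v, Long::sum));
                    return a;
                }
            })
            .flatMap(new FlatMapFunction<Map<String, Long>, Tuple2<String, Long>>() {
                @Override
                public void flatMap(Map<String, Long> counts,
                                    Collector<Tuple2<String, Long>> out) {
                    // fan the aggregated map back out as individual records
                    counts.forEach((country, n) -> out.collect(Tuple2.of(country, n)));
                }
            })
            .print();

        env.execute("country-counts");
    }
}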
Here it is: https://issues.apache.org/jira/browse/FLINK-19969
Best,
Flavio
On Wed, Nov 4, 2020 at 11:08 AM Kostas Kloudas wrote:
> Could you also post the ticket here @Flavio Pompermaier and we will
> have a look before the upcoming release.
>
> Thanks,
> Kostas
>
> On Wed, Nov 4, 2020 at 10:5
Hi Rex,
sorry for the late reply. POJOs will have much better support in the
upcoming Flink versions because they have been fully integrated with the
new table type system mentioned in FLIP-37 [1] (e.g. support for
immutable POJOs and nested DataTypeHints etc).
For queries, scalar, and table functions
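As a taste of the hints mentioned in [1], a small sketch of a scalar
function whose result type is declared with @DataTypeHint (the function
itself is a made-up example):

import java.math.BigDecimal;

import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;

// Declares the exact SQL result type instead of relying on reflection.
public class DecimalSum extends ScalarFunction {
    public @DataTypeHint("DECIMAL(12, 3)") BigDecimal eval(BigDecimal a, BigDecimal b) {
        return a.add(b);
    }
}

Registered via tEnv.createTemporarySystemFunction("DecimalSum",
DecimalSum.class), it can then be called from SQL as DecimalSum(a, b).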
I'm afraid there's nothing in Flink that would make this possible right now.
Have you thought about whether this would be possible using the vanilla
Kafka consumer APIs? I'm not sure it's possible to read messages with
prioritization using their APIs.
Best,
Aljoscha
On 04.11.20 08:34, Rob
Could you also post the ticket here @Flavio Pompermaier and we will
have a look before the upcoming release.
Thanks,
Kostas
On Wed, Nov 4, 2020 at 10:58 AM Chesnay Schepler wrote:
>
> Good find, this is an oversight in the CliFrontendParser; no help is
> printed for the run-application action.
Hi Arvid,
Thank you for your detailed answer. I read your answer and finally realized
that I did not understand well the difference between the micro-batch model
and the continuous (one-by-one) processing model. I am familiar with the
micro-batch model but not with the continuous one. So, I will search for
some d
Good find, this is an oversight in the CliFrontendParser; no help is
printed for the run-application action.
Can you file a ticket?
On 11/4/2020 10:53 AM, Flavio Pompermaier wrote:
Hello everybody,
I was looking into currently supported application-modes when
submitting a Flink job so I tried
Hello everybody,
I was looking into the currently supported application modes when submitting
a Flink job, so I tried to use the CLI help (I'm using Flink 1.11.0), but I
can't find any help for the action "run-application" at the moment... am I
wrong? Is there any JIRA to address this missing documentation
Hi Krzysztof,
This is a very interesting idea.
I think SQL is not a suitable tool for this use case, because SQL is a
structured query language where the table schema is fixed and never changes
while the job is running.
However, I think it could be a configuration tool project on top of Flink
SQL.
The