" (Flink >= 1.12).
>
> Caizhi Weng wrote on Tue, Sep 7, 2021 at 2:47 PM:
>
>> Hi!
>>
>> If you mean batch SQL, then you'll need to prepare enough task slots for
>> all subtasks. The number of task slots needed is the sum of the parallelism
>> of all subtasks, as there is
And my Flink version is 1.11.0.
lec ssmi wrote on Tue, Sep 7, 2021 at 2:11 PM:
> Hi:
>I'm not familar with batch api .And I write a program just like
> "insert into tab_a select * from tab_b".
>From the picture, there are only two tasks, one is the source task
> whic
Hi:
I'm not familiar with the batch API, and I wrote a program just like
"insert into tab_a select * from tab_b".
From the picture, there are only two tasks: one is the source task, which
is in RUNNING state, and the other is the sink task, which is always in
CREATED state.
According to the logs, I
A checkpoint can be done synchronously or asynchronously; the latter is
the default.
If you choose the synchronous way, it may cause this problem.
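To illustrate the point above, here is a minimal sketch (plain Python, not the Flink API, with a toy `Operator` class): a synchronous snapshot makes processing wait while state is written out, whereas an asynchronous snapshot copies the state and writes it in the background so processing can continue.

```python
# Sketch: synchronous vs. asynchronous state snapshots.
import copy
import threading

class Operator:
    def __init__(self):
        self.state = {}

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1

    def snapshot_sync(self, sink):
        # Processing must wait until the full state is written out.
        sink.append(copy.deepcopy(self.state))

    def snapshot_async(self, sink):
        # Take a copy first, then write it out in the background so
        # process() can keep mutating self.state in the meantime.
        snap = copy.deepcopy(self.state)
        t = threading.Thread(target=lambda: sink.append(snap))
        t.start()
        return t

op = Operator()
op.process("a"); op.process("a")
sink = []
t = op.snapshot_async(sink)
op.process("a")                 # processing continues during the snapshot
t.join()
assert sink[0] == {"a": 2}      # snapshot reflects state at snapshot time
assert op.state == {"a": 3}     # live state kept moving
```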
nick toker wrote on Wed, Dec 23, 2020 at 3:53 PM:
> Hi Yun,
>
> Sorry but we didn't understand your questions.
> The delay we are experiencing is on the *read
Hi
> No matter what interval you set, Flink will take care of the
> checkpoints (removing useless checkpoints when it can), but when you set a
> very small checkpoint interval, there may be very high pressure on the
> storage system (here, RPC pressure on the HDFS NameNode).
>
> Best,
>
Hi, if I set the checkpoint interval to be very small, such as 5 seconds,
will there be a lot of state files on HDFS? In theory, no matter what
interval is set, every time you checkpoint, the old files will be deleted
and new files will be written, right?
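The behavior being asked about can be sketched as follows (a simplification, assuming the default of retaining only one completed checkpoint, as with `state.checkpoints.num-retained: 1`): Flink keeps only the newest completed checkpoints and deletes older ones, so a small interval mainly adds write/delete (RPC) pressure rather than file accumulation.

```python
# Sketch: retained-checkpoint cleanup. Only the newest NUM_RETAINED
# completed checkpoints are kept; older ones are deleted on completion.
NUM_RETAINED = 1

def complete_checkpoint(stored, new_id):
    """Record a completed checkpoint, dropping the oldest beyond the limit."""
    stored.append(new_id)
    deleted = []
    while len(stored) > NUM_RETAINED:
        deleted.append(stored.pop(0))
    return deleted

stored = []
removed = []
for ckpt in range(1, 6):        # five checkpoints at a 5s interval
    removed += complete_checkpoint(stored, ckpt)

assert stored == [5]            # only the newest checkpoint remains on HDFS
assert removed == [1, 2, 3, 4]  # each older one was deleted along the way
```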
jobs are pre-planned for each
> operator.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Aug 31, 2020 at 1:33 PM lec ssmi wrote:
>
>> HI:
>> Generally speaking, when we submit the Flink program, the number of
>> taskmanagers and the memory of each tm
Hi:
Generally speaking, when we submit the Flink program, the number of
taskmanagers and the memory of each taskmanager will be specified. And the
smallest real execution unit of Flink should be the operator.
Since the calculation logic corresponding to each operator is different,
some need to save the
First of all, MongoDB itself does not support transactions. But you can
initialize a buffer to save each row when the transaction begins, and save
all the buffered records to the database at once when the transaction
completes.
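The buffering approach described above can be sketched like this (plain Python, not a real Flink `TwoPhaseCommitSinkFunction`; `fake_db` is a stand-in for MongoDB): rows written during a transaction go into a per-transaction buffer and reach the store only on commit, while an abort simply discards the buffer.

```python
# Sketch: buffer rows per transaction, flush on commit, drop on abort.
class BufferingMongoSink:
    def __init__(self, db):
        self.db = db            # hypothetical store: a plain list of rows
        self.buffers = {}

    def begin_transaction(self, txn_id):
        self.buffers[txn_id] = []

    def invoke(self, txn_id, row):
        self.buffers[txn_id].append(row)

    def commit(self, txn_id):
        self.db.extend(self.buffers.pop(txn_id))

    def abort(self, txn_id):
        self.buffers.pop(txn_id, None)

fake_db = []
sink = BufferingMongoSink(fake_db)
sink.begin_transaction("t1")
sink.invoke("t1", {"id": 1})
sink.invoke("t1", {"id": 2})
sink.commit("t1")
assert fake_db == [{"id": 1}, {"id": 2}]

sink.begin_transaction("t2")
sink.invoke("t2", {"id": 3})
sink.abort("t2")                # aborted rows never reach the store
assert fake_db == [{"id": 1}, {"id": 2}]
```

Note that true exactly-once delivery also needs the flush itself to be atomic or idempotent, which this sketch does not address.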
C DINESH wrote on Wed, Jul 15, 2020 at 9:28 AM:
> Hello all,
>
> Can we implement TwoPhaseCommitPro
The old value is already counted in a partition, and when the above update
occurs, will the count value of the old partition be decremented by 1, and
then added to the new partition?
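What the question describes can be sketched as follows (a simplification; whether this actually happens depends on the source emitting retraction/update-before messages, e.g. from a binlog): the old row is retracted from the old group's count before the new value is accumulated, so the old partition is indeed decremented by 1.

```python
# Sketch: maintaining grouped counts under updates via retract + accumulate.
from collections import Counter

counts = Counter()

def apply_change(old_type, new_type):
    if old_type is not None:        # retract the old value from its group
        counts[old_type] -= 1
    if new_type is not None:        # accumulate the new value in its group
        counts[new_type] += 1

apply_change(None, "A")             # insert
apply_change(None, "A")             # insert
apply_change("A", "B")              # update: type A -> B

assert counts["A"] == 1             # old partition decremented by 1
assert counts["B"] == 1             # new partition incremented by 1
```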
Benchao Li wrote on Wed, Jul 1, 2020 at 1:11 PM:
> Hi lec ssmi,
>
> > If the type value of a record is up
I think the old value will not be retracted, because when the type value
> is updated, it will be counted under the new value; the old value will not be updated
>
>
> -----Original Message-----
> *From:* "lec ssmi"
> *Sent:* 2020-07-01 09:53:39 (Wednesday)
> *To:* flink-user
> *Cc:*
> *Subject:* the g
Hi:
When we use SQL for an aggregation operation, for example, the following SQL:
>select count(distinct name) cnt, type from table group by type
The source data can be regarded as binlog data.
If the type value of a record is updated in the database, the values
before and after the
As we know, Flink can set a certain TTL config for the state, so that
the state can be automatically cleared after being idle for a period of
time.
But if no new record comes in after setting the TTL config,
will the state be automatically cleared after a certain time? Or does it
re
Hi:
I received a Java POJO serialized as a JSON string from Kafka, and I want
to use a UDTF to restore it to a table with a structure similar to the POJO.
Some member variables of the POJO use the List type or Map type,
whose generic type is also a POJO.
The sample code is as below:
public class Car {
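The sample is cut off above, but the general shape of such a UDTF can be sketched roughly like this (plain Python rather than a Flink Java UDTF; `Car`, `wheels`, and the field names are hypothetical): a JSON string containing a list of nested objects is flattened into one output row per nested element.

```python
# Sketch: flatten a nested-POJO-style JSON payload into rows.
import json

def car_to_rows(payload):
    """Yield one row per wheel: (brand, wheel_position, wheel_size)."""
    car = json.loads(payload)
    for wheel in car.get("wheels", []):
        yield (car["brand"], wheel["position"], wheel["size"])

raw = ('{"brand": "Audi", "wheels": ['
       '{"position": "FL", "size": 18}, {"position": "FR", "size": 18}]}')
rows = list(car_to_rows(raw))
assert rows == [("Audi", "FL", 18), ("Audi", "FR", 18)]
```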
Hi:
Regarding retraction, there was the following SQL:
*select count(1) count, word group by word*
Suppose the current aggregation result is:
* 'hello'->3*
When another record comes in, the count of 'hello' will be changed to
4.
The following two records will be generated i
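The two records in question can be sketched as a retract/add pair (a simplification of Flink's changelog encoding, using `(is_add, word, count)` tuples): a delete for the old aggregate followed by an insert for the new one.

```python
# Sketch: the retract pair emitted when an aggregate result changes.
def retract_update(word, old_count, new_count):
    msgs = []
    if old_count is not None:
        msgs.append((False, word, old_count))   # retract the old result
    msgs.append((True, word, new_count))        # emit the new result
    return msgs

pair = retract_update("hello", 3, 4)
assert pair == [
    (False, "hello", 3),    # -(hello, 3)
    (True, "hello", 4),     # +(hello, 4)
]
```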
operators.
>
> lec ssmi wrote on Tue, May 12, 2020 at 8:43 PM:
>
>> Then if I don't write time constraints,
>> will it expire with the TTL time configured by TableConfig?
>> Benchao Li wrote on Tue, May 12, 2020 at 20:27:
>>
>>> The state will be cleaned with watermark movement.
&
Then if I don't write time constraints,
will it expire with the TTL time configured by TableConfig?
Benchao Li wrote on Tue, May 12, 2020 at 20:27:
> The state will be cleaned with watermark movement.
>
> lec ssmi wrote on Tue, May 12, 2020 at 5:55 PM:
>
>> Hi:
>> If I join two streams in S
Hi:
If I join two streams in SQL, with a time range used as a condition,
similar to the interval join in DataStream, will this join state
expire as the watermark moves, or will it expire with the TTL time
configured by TableConfig? Or both?
Best
Lec Ssmi
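The question above can be illustrated with a small simulation (a simplification of Flink's internals, assuming a join upper bound of 10 time units): interval-join state is scoped by the join's time bounds, and buffered rows are dropped once the watermark passes their timestamp plus the upper bound, independently of any idle-state TTL configured via TableConfig.

```python
# Sketch: watermark-driven state cleanup in an interval join.
UPPER_BOUND = 10    # join condition: |t_left - t_right| <= 10

def expire(buffered, watermark):
    """Keep only rows that could still match some future row."""
    return [(ts, row) for ts, row in buffered if ts + UPPER_BOUND >= watermark]

buffered = [(5, "a"), (20, "b"), (30, "c")]
buffered = expire(buffered, watermark=25)
assert buffered == [(20, "b"), (30, "c")]   # row at ts=5 expired with the watermark
```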
Hi:
Is there any way to implement async I/O in UDFs (scalar function,
table function, aggregate function)?
ime attribute, the planner will give you
> an Exception if you did that.
>
>
> lec ssmi wrote on Tue, May 5, 2020 at 8:34 PM:
>
>> As you said, if I select all the time attribute fields from
>> both, which will be the final one?
>>
>> Benchao Li wrote on Tue, May 5, 2020
> from one of the input, then it will be time attribute automatically.
>
> lec ssmi wrote on Tue, May 5, 2020 at 4:42 PM:
>
>> But I have not found any syntax to specify the time
>> attribute field and watermark again in pure SQL.
>>
>> Fabian Hueske wrote on
here's no need to feed data back to Kafka just to inject it again to
> assign new watermarks.
>
> On Tue, May 5, 2020 at 01:45, lec ssmi wrote:
>
>> I mean using a pure SQL statement to do it. Is that possible?
>>
>> Fabian Hueske wrote on Mon, May 4, 2020 at 4:04 PM:
nsures that the watermark will be aligned with both of them.
>
> Best, Fabian
>
> On Mon, May 4, 2020 at 00:48, lec ssmi wrote:
>
>> Thanks for your reply.
>> But as far as I know, if the time attribute will be retained and the time
>> attribute field of both stre
e will be preserved after time interval join.
> Could you share your DDL and SQL queries with us?
>
> lec ssmi wrote on Thu, Apr 30, 2020 at 5:48 PM:
>
>> Hi:
>>I need to join multiple stream tables using time interval join. The
>> problem is that the time attribute will disa
Thanks, but is the bottom layer of the Table API really implemented like
this?
Konstantin Knauf wrote on Thu, Apr 30, 2020 at 22:02:
> Hi Lec Ssmi,
>
> yes, DataStream#connect on two streams, both keyed on the productId, with a
> KeyedCoProcessFunction is the way to go.
>
> Cheers,
>
Hi:
I need to join multiple stream tables using time interval joins. The
problem is that the time attribute will disappear after the join, and
pure SQL cannot declare the time attribute field again. So, to make it
succeed, I need to insert the last result of the join into Kafka and consume
it
Maybe, the connect method?
lec ssmi wrote on Thu, Apr 30, 2020 at 3:59 PM:
> Hi:
> As the following sql:
>
>SELECT * FROM Orders INNER JOIN Product ON Orders.productId =
> Product.id
>
> If we use the DataStream API instead of SQL, how should it be achieved?
> Because the APIs
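A KeyedCoProcessFunction-style join for the query in this thread can be sketched in plain Python (not the Flink API; note that the per-key buffers here grow without bound, which a real job must address with TTL or timers): connect the two streams keyed by productId, buffer each side's rows as keyed state, and emit a joined pair whenever a new row finds matches on the other side.

```python
# Sketch: inner join of two keyed streams via per-key buffers.
from collections import defaultdict

class JoinFunction:
    def __init__(self):
        self.orders = defaultdict(list)     # keyed state, left side
        self.products = defaultdict(list)   # keyed state, right side

    def process_order(self, key, order):
        self.orders[key].append(order)
        return [(order, p) for p in self.products[key]]

    def process_product(self, key, product):
        self.products[key].append(product)
        return [(o, product) for o in self.orders[key]]

j = JoinFunction()
assert j.process_order(1, "order-A") == []              # no product yet
assert j.process_product(1, "p-1") == [("order-A", "p-1")]
assert j.process_order(1, "order-B") == [("order-B", "p-1")]
```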
above state capacity problem in SQL
is using TableConfig. But TableConfig itself can only solve the state TTL
problem of non-time operators. So I think the above SQL's implementation
is neither a tumble window join, nor a sliding window join, nor an interval join.
Best Regards
Lec Ssmi
Hi:
Can we get data later than the watermark in SQL?
Best
Lec Ssmi
and
TableConfig? If I use StateTtlConfig and program with the Table API, can the
configuration take effect?
Best regards
Lec Ssmi
According to my practice, it seems that declaring the timestamp and
watermark once again will not work; SQL seems to have no such syntax.
But it is supported in the DataStream API.
Leonard Xu wrote on Sun, Apr 26, 2020 at 6:30 PM:
>
>
> On Apr 23, 2020, at 22:03, lec ssmi wrote:
>
> can assignT
rmark, then the new
> `assignTimestampAndWatermark` will override the previous watermark.
>
> Best,
> Jark
>
> On Thu, 23 Apr 2020 at 22:03, lec ssmi wrote:
>
>> can assignTimestampAndWatermark again on a watermarked table?
>>
>> Jark Wu wrote on Thu, Apr 23, 2020 at 2
Can we call assignTimestampAndWatermark again on a watermarked table?
Jark Wu wrote on Thu, Apr 23, 2020 at 20:18:
> Hi Matyas,
>
> You can create a new table based on the existing table using LIKE syntax
> [1] in the upcoming 1.11 version, e.g.
>
> CREATE TABLE derived_table (
> WATERMARK FOR tstmp AS tsmp
Hi:
Is there an aggregation operation or window operation whose result
has retract characteristics?
Watermark is updated
> - if currentWatermark - lastWatermark > watermarkInterval
> - emit watermark to downstream, and update lastWatermark
>
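The rule quoted above can be sketched as follows (plain Python, with a hypothetical interval of 100 time units): a new watermark is forwarded downstream only when it has advanced by more than the configured interval since the last one emitted.

```python
# Sketch: periodic watermark emission with a minimum advance interval.
WATERMARK_INTERVAL = 100

class WatermarkEmitter:
    def __init__(self):
        self.last_emitted = float("-inf")
        self.downstream = []

    def on_watermark(self, current):
        if current - self.last_emitted > WATERMARK_INTERVAL:
            self.downstream.append(current)   # emit watermark downstream
            self.last_emitted = current       # update lastWatermark

e = WatermarkEmitter()
for wm in [0, 50, 120, 150, 260]:
    e.on_watermark(wm)

assert e.downstream == [0, 120, 260]   # 50 and 150 were suppressed
```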
> lec ssmi wrote on Fri, Apr 17, 2020 at 4:50 PM:
>
>> Maybe you are all right. I was more confused .
>> As the cwiki said, flink could us
Maybe you are all right; I was just more confused.
As the cwiki said, Flink could use BoundedOutOfOrderTimestamps,
[image: image.png]
but I have heard about WatermarkAssignerOperator from Blink developers.
Benchao Li wrote on Fri, Apr 17, 2020 at 4:33 PM:
> Hi lec ssmi,
>
> It's a good ques
Hi:
In the SQL API, the declaration of the watermark is realized by a DDL
statement. But which way is it implemented:
*PeriodicWatermark* or *PunctuatedWatermark*?
There seems to be no explanation on the official website.
Thanks.
Appreciate your reply.
Hi:
I always wonder how many instances have been initialized in the whole
Flink application.
Suppose there is such a scenario:
I have a UDTF called '*mongo_join*' through which the Flink
table can join with different external mongo tables according to the
parameters passed in.
In KeyedProcessFunction, we can register a timer based on event time, but in
the register method, we don't need to specify the time column. So if the
elements of this KeyedStream are not the classes that originally specified
the timestamp and watermark, can this timer run normally?
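The mechanics behind this question can be sketched as follows (plain Python, not the Flink API): an event-time timer fires only when a watermark passes its timestamp, so if the stream never has timestamps/watermarks assigned, the watermark never advances and the timer never fires.

```python
# Sketch: event-time timers fire only as the watermark advances.
class TimerService:
    def __init__(self):
        self.timers = []
        self.fired = []

    def register_event_time_timer(self, ts):
        self.timers.append(ts)

    def advance_watermark(self, wm):
        due = [t for t in self.timers if t <= wm]
        self.timers = [t for t in self.timers if t > wm]
        self.fired.extend(sorted(due))

svc = TimerService()
svc.register_event_time_timer(100)
assert svc.fired == []          # no watermark yet: the timer stays pending
svc.advance_watermark(99)
assert svc.fired == []          # watermark still below the timer timestamp
svc.advance_watermark(100)      # watermark reaches the timer's timestamp
assert svc.fired == [100]
```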