Sorry, I forgot to add the user ML. I would also like to gather some user
feedback on this, since I didn't get any feedback on this topic from users
before.
Best,
Kurt
On Thu, Nov 18, 2021 at 6:33 PM Kurt Young wrote:
> (added user ML to this thread)
>
> Hi all,
>
> I w
What kind of behavior do you expect after you add the "max(a)" aggregation:
a. Keep summing a and start calculating max(a) only from the point it was
added. In other words, max(a) won't take the historical data into account.
b. First process all the historical data to get a result of max(a), and
then start to compu
Hi, please use the user mailing list only to discuss these issues.
Best,
Kurt
On Sat, May 8, 2021 at 1:05 PM 1095193...@qq.com <1095193...@qq.com> wrote:
> Hi
>I have tried the cumulate window function in Flink 1.13 SQL to accumulate
> data from Kafka. When I restart a cumulate window SQL job, las
t us know.
Best,
Kurt
On Sun, Apr 11, 2021 at 9:18 PM Flavio Pompermaier
wrote:
> Thanks for the suggestions, Kurt. Actually I could use the Table API, I think;
> it's just that most of our Flink code uses the DataSet API.
>
> On Sun, Apr 11, 2021 at 13:44, Kurt Young wrote:
>
>&
really migrate legacy code.
> Also reduceGroup would help but less urgent.
> I hope that my feedback as a Flink user could be useful.
>
> Best,
> Flavio
>
> On Fri, Apr 9, 2021 at 12:38 PM Kurt Young wrote:
>
>> Converting from table to DataStream in batch mode is indeed a
ment.create(streamEnv, envSettings);
> but in that case I can't use inBatchMode..
>
> On Fri, Apr 9, 2021 at 11:44 AM Kurt Young wrote:
>
>> `format.ignore-first-line` is unfortunately a regression compared to the
>> old one.
>> I've created a ticket [1] to
"format.ignore-first-line" but now I can't find
>> another way to skip it.
>> I could set csv.ignore-parse-errors to true but then I can't detect other
>> parsing errors, otherwise I need to manually transform the header into a
>> comment adding the # ch
My DDL is:
CREATE TABLE csv (
  id BIGINT,
  name STRING
) WITH (
  'connector' = 'filesystem',
  'path' = '.',
  'format' = 'csv'
);
Best,
Kurt
On Fri, Apr 9, 2021 at 10:00 AM Kurt Young wrote:
>
Hi Flavio,
We would recommend using the new table source & sink interfaces, which
have different
property keys compared to the old ones, e.g. 'connector' vs.
'connector.type'.
You can follow the 1.12 doc [1] to define your CSV table; everything should
work just fine.
*Flink SQL> set table.dml-
morrow morning.
>>>
>>> This is a critical fix as now predicate pushdown won't work for any
>>> stream which generates a watermark and wants to push down predicates.
>>>
>>> On Thu, Apr 1, 2021, 10:56 Kurt Young wrote:
>>>
>>>> Thanks D
reen, it's separate from other
> components, and it's a super useful feature for Flink users.
>
> Best,
>
> Arvid
>
> [1] https://github.com/apache/flink/pull/15054
>
> On Thu, Apr 1, 2021 at 6:21 AM Kurt Young wrote:
>
>> Hi Guowei and Dawid,
>>
Hi Guowei and Dawid,
I want to request permission to merge this feature [1]; it's a useful
improvement to the SQL client and won't affect
other components too much. We were planning to merge it yesterday but met a
tricky multi-process issue which
has a very high possibility of hanging the tests. It to
Hi Theo,
Regarding your first two questions, the answer is yes, Flink supports
streaming writes to Hive.
Flink also supports automatically compacting small files during
streaming writes [1].
(Hive and Filesystem share the same compaction mechanism; we forgot
to add a dedicated document for h
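To illustrate, here is a minimal sketch written against the filesystem connector,
which shares the mechanism; the table name, path, format and option values are
illustrative assumptions, not taken from this thread:

// Minimal sketch, assuming Flink 1.12+; names, path and option values are illustrative.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class CompactingSinkSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Filesystem and Hive streaming sinks share the same small-file compaction options.
        tEnv.executeSql(
                "CREATE TABLE compacted_sink (id BIGINT, name STRING) WITH (" +
                " 'connector' = 'filesystem'," +
                " 'path' = '/tmp/compacted-sink'," +     // hypothetical output path
                " 'format' = 'parquet'," +
                " 'auto-compaction' = 'true'," +         // merge small files produced per checkpoint
                " 'compaction.file-size' = '128MB')");   // target size after compaction
    }
}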
Hi Timo,
First of all I want to thank you for introducing this planner design back
in 1.9; it is a great piece of work
that allowed lots of Blink features to be merged into Flink in a reasonably
short time. It greatly
accelerated the evolution of Table & SQL.
Everything comes with a cost, as you sa
cc'ing this to the user & user-zh mailing lists because this will affect lots of
users, and quite a lot of users
have been asking questions around this topic.
Let me try to understand this from a user's perspective.
Your proposal will affect five functions, which are:
- PROCTIME()
- NOW()
- CURREN
Yes, I think this is a bug; feel free to open a JIRA issue and a pull request.
Best,
Kurt
On Fri, Oct 16, 2020 at 4:13 PM Jon Alberdi wrote:
> Hello to all,
>
> on flink-1.11.2 the program written at
> https://gist.github.com/yetanotherion/d007fa113d97411226eaea4f20cd4c2d
>
> creates unexpected sta
Congratulations Dian!
Best,
Kurt
On Thu, Aug 27, 2020 at 7:28 PM Rui Li wrote:
> Congratulations Dian!
>
> On Thu, Aug 27, 2020 at 5:39 PM Yuan Mei wrote:
>
>> Congrats!
>>
>> On Thu, Aug 27, 2020 at 5:38 PM Xingbo Huang wrote:
>>
>>> Congratulations Dian!
>>>
>>> Best,
>>> Xingbo
>>>
>>> ji
Hi Kostas,
Thanks for starting this discussion. The first part of this FLIP: "Batch vs
Streaming Scheduling" looks reasonable to me.
However, there is another dimension I think we should also take into
consideration, which is whether checkpointing is enabled.
This option is orthogonal (but not fu
+1, looking forward to the follow up FLIPs.
Best,
Kurt
On Thu, Jul 30, 2020 at 6:40 PM Arvid Heise wrote:
> +1 for getting rid of the DataSet API. Is DataStream#iterate already
> superseding DataSet iterations or would that also need to be accounted for?
>
> In general, all surviving APIs shoul
Hi Flavio,
In 1.11 we provided an easier way to print table content: after you
get the `table` object,
all you need to do is call `table.execute().print();`
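For reference, a minimal self-contained sketch of that call; the datagen source and
the query here are illustrative assumptions:

// Minimal sketch, assuming Flink 1.11+; the datagen source and the query are illustrative.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class PrintTableSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // A small generated source just so the example runs on its own.
        tEnv.executeSql(
                "CREATE TABLE orders (id BIGINT, amount DOUBLE) WITH (" +
                " 'connector' = 'datagen', 'number-of-rows' = '5')");

        Table table = tEnv.sqlQuery("SELECT id, amount FROM orders");
        // Since 1.11, printing the content of a table is just:
        table.execute().print();
    }
}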
Best,
Kurt
On Thu, Jul 16, 2020 at 9:35 AM Leonard Xu wrote:
> Hi, Flavio
>
>
> On Jul 16, 2020, at 00:19, Flavio Pompermaier wrote:
>
> f
+dev
Best,
Kurt
On Fri, May 8, 2020 at 3:35 PM Caizhi Weng wrote:
> Hi Jeff,
>
> Thanks for the response. However I'm using executeAsync so that I can run
> the job asynchronously and get a JobClient to monitor the job. JobListener
> only works for the synchronous execute method. Is there other w
A more straightforward way after FLIP-84 would be:
TableResult result = tEnv.executeSql("select xxx ...");
result.print();
And if you are using 1.10 now, you can use TableUtils#collectToList(table)
to collect the
result into a list, and then print the rows yourself.
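A minimal sketch of the FLIP-84 style mentioned above; the source table and the
query are illustrative assumptions:

// Minimal sketch, assuming Flink 1.11+; the datagen source and the query are illustrative.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.TableResult;

public class ExecuteSqlPrintSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        tEnv.executeSql(
                "CREATE TABLE t (id BIGINT, name STRING) WITH (" +
                " 'connector' = 'datagen', 'number-of-rows' = '3')");

        // Run the query and print the rows; on 1.10 you would instead collect the
        // Table into a list (TableUtils#collectToList) and print the rows yourself.
        TableResult result = tEnv.executeSql("SELECT id, name FROM t");
        result.print();
    }
}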
Best,
Kurt
On Tue, May 5, 2020
IIUC FLIP-122 already delegates the responsibility for designing and parsing
connector properties to connector developers.
So frankly speaking, no matter which style we choose, there is no strong
guarantee for either of these. So it's also possible
that developers choose a totally different way
Hi Benchao, you can create a jira issue to track this.
Best,
Kurt
On Thu, Apr 23, 2020 at 2:27 PM Benchao Li wrote:
> Hi Jingsong,
>
> Thanks for your quick response. I've CC'ed Chongchen who understands the
> scenario much better.
>
>
> Jingsong Li 于2020年4月23日周四 下午12:34写道:
>
>> Hi, Benchao,
already fixed in 1.9.2 and 1.10.0.
> Could you try it using 1.9.2?
>
> Best,
> Jark
>
> On Mon, 20 Apr 2020 at 21:00, Kurt Young <[hidden email]> wrote:
>
>> Can you reproduce this in a local program with mini-cluster?
>>
>> Best,
>> Kurt
>&
or not it is pretty well defined.
> Maybe there is some fundamental concept that I don't understand, I don't
> have much experience with this to be fair.
>
> Gyula
>
> On Mon, Apr 20, 2020 at 4:03 PM Kurt Young wrote:
>
>> The reason here is Flink doesn't know the
The reason here is that Flink doesn't know the Hive table is static. After you
create these two tables and
try to join them, Flink will assume both tables will be changing over
time.
Best,
Kurt
On Mon, Apr 20, 2020 at 9:48 PM Gyula Fóra wrote:
> Hi!
>
> The problem here is that I don't have a temp
Can you reproduce this in a local program with mini-cluster?
Best,
Kurt
On Mon, Apr 20, 2020 at 8:07 PM Zahid Rahman wrote:
> You can read this for this type error.
>
>
> https://stackoverflow.com/questions/28189446/i-always-get-this-error-exception-in-thread-main-java-lang-arrayindexoutofbou#
Hi Dev and User,
The Blink planner for Table API & SQL was introduced in Flink 1.9 and is already
the default planner for
the SQL client in Flink 1.10. And since we already decided not to introduce any
new features to the
original Flink planner, it already lacks so many great features that
the community
I think this requirement can be satisfied by the temporal table function [1];
am I missing anything?
[1]
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/temporal_tables.html#temporal-table-function
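A rough sketch of how such a temporal table function is registered and joined
against; the table names, columns and time attribute are illustrative assumptions
loosely following the linked documentation, and the packages assume Flink 1.11+:

// Rough sketch; assumes an append-only table "Rates" (r_currency, r_rate, r_proctime)
// and a table "Orders" already registered. All names are illustrative assumptions.
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.table.functions.TemporalTableFunction;

public class TemporalTableFunctionSketch {
    public static Table joinOrdersWithRates(StreamTableEnvironment tEnv) {
        Table rates = tEnv.sqlQuery("SELECT r_currency, r_rate, r_proctime FROM Rates");

        // A temporal table function keyed by currency and versioned by processing time.
        TemporalTableFunction ratesAt =
                rates.createTemporalTableFunction("r_proctime", "r_currency");
        tEnv.registerFunction("RatesAt", ratesAt);

        // Join each order against the rate version that was valid at the order's time.
        return tEnv.sqlQuery(
                "SELECT o.amount * r.r_rate AS converted " +
                "FROM Orders AS o, LATERAL TABLE (RatesAt(o.o_proctime)) AS r " +
                "WHERE o.o_currency = r.r_currency");
    }
}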
Best,
Kurt
On Sat, Mar 28, 2020 at 2:47 PM Maatary Okouya
wrote:
> Hi al
esult from flink sql-client
>
> Thanks,
> Lei
>
> ------
> wangl...@geekplus.com.cn
>
> *From:* Kurt Young
> *Date:* 2020-03-18 17:41
> *To:* wangl...@geekplus.com.cn; lirui
> *CC:* user
> *Subject:* Re: flink sql-client read hive orc table
My guess is we don't support Hive bucketed tables yet.
cc Rui Li for confirmation.
Best,
Kurt
On Wed, Mar 18, 2020 at 5:19 PM wangl...@geekplus.com.cn <
wangl...@geekplus.com.cn> wrote:
> Hive table store as orc format:
> CREATE TABLE `robot_tr`(`robot_id` int, `robot_time` bigint,
> `linear_
Hi, could you share the SQL you wrote for your original purpose, not the
one you attached a ProcessFunction to for debugging?
Best,
Kurt
On Tue, Mar 17, 2020 at 3:08 AM Dominik Wosiński wrote:
> Actually, I just put this process function there for debugging purposes.
> My main goal is to join the
> The second reason is this query needs to scan the whole table. I think we
can do better :-)
Not necessarily; you said all the changes will trigger a DDB stream, so you
can use Flink to consume that
stream incrementally.
For the 1st problem, I think you can use the DataStream API and register a
timer on
Hi Jiawai,
Sorry I still didn't fully get your question. What's wrong with your
proposed SQL?
> select vendorId, sum(inventory units)
> from dynamodb
> where today's time - inbound time > 15
> group by vendorId
My guess is that such a query would only trigger calculations on new events.
So if a ver
@Gyula Fóra I think your query is right, we should
produce insert-only results if you have event time and watermark defined.
I've created https://issues.apache.org/jira/browse/FLINK-16466 to track this
issue.
Best,
Kurt
On Fri, Mar 6, 2020 at 12:14 PM Kurt Young wrote:
> Actually
ctions that
>> happened in the 5 seconds before the query.
>> Every query.id belongs to a single query (event_time, itemid) but still
>> I have to group :/
>>
>> Gyula
>>
>> On Thu, Mar 5, 2020 at 3:45 PM Kurt Young wrote:
>>
>>> I think the i
ie in the together with the
> CustomCommandLine logic)
>
> Gyula
>
> On Thu, Mar 5, 2020 at 3:49 PM Kurt Young wrote:
>
>> IIRC the tricky thing here is that not all the config options belonging to
>> flink-conf.yaml can be adjusted dynamically in a user's program.
>&g
User-defined formats also sound like an interesting extension.
Best,
Kurt
On Thu, Mar 5, 2020 at 3:06 PM Jark Wu wrote:
> Hi Lei,
>
> Currently, Flink SQL doesn't support registering a binlog format (i.e.
> defining just "order_id" and "order_no" while the JSON schema has other binlog
> fields).
IIRC the tricky thing here is that not all the config options belonging to
flink-conf.yaml can be adjusted dynamically in a user's program.
So it will end up with some of the configurations being overridable but
some not. The experience is not great for users.
Best,
Kurt
On Thu, Mar 5, 2020 at 10:1
ssages").
>
> As for the data completion, in my example above it is basically an
> event-time interval join.
> With watermarks defined, Flink should be able to compute results once, in
> exactly the same way as for the tumbling window.
>
> Gyula
>
> On Thu, Mar 5, 2020
5, 2020 at 10:24 PM Kurt Young wrote:
> > I also don't completely understand at this point why I can write the
> result of a group, tumble window aggregate to Kafka and not this window
> join / aggregate.
>
> If you are doing a tumble window aggregate with watermark enable
> I also don't completely understand at this point why I can write the
result of a group, tumble window aggregate to Kafka and not this window
join / aggregate.
If you are doing a tumble window aggregate with watermark enabled, Flink
will only fire a final result for
each window at once, no modifi
Hi everyone,
I'm very happy to announce that Jingsong Lee accepted the offer of the
Flink PMC to
become a committer of the Flink project.
Jingsong Lee has been an active community member for more than a year now.
He is
mainly focused on Flink SQL and played an essential role during the Blink planner
mergi
Congratulations to everyone involved!
Great thanks to Yu & Gary for being the release manager!
Best,
Kurt
On Thu, Feb 13, 2020 at 10:06 AM Hequn Cheng wrote:
> Great thanks to Yu & Gary for being the release manager!
> Also thanks to everyone who made this release possible!
>
> Best, Hequn
>
>
Hi all,
I'd like to bring up a discussion about removing registration of
TableSource and
TableSink in TableEnvironment as well as in ConnectTableDescriptor. The
affected
methods would be:
TableEnvironment::registerTableSource
TableEnvironment::fromTableSource
TableEnvironment::registerTableSink
Co
ON l.coListAgentKeyL = ac.ucPKA AND
> l.coListAgentKeyL IS NOT NULL" +
>
>
> I tried this but noticed that it didn't work as the data skew (and heavy load
> on one task) continued. Could you please let me know if I missed anything?
>
>
> Thanks,
>
> Eva
>
&g
Hi,
You can try to filter NULL values with an explicit condition like
"<join key> IS NOT NULL".
Best,
Kurt
On Sat, Jan 11, 2020 at 4:10 AM Eva Eva
wrote:
> Thank you both for the suggestions.
> I did a bit more analysis using UI and identified at least one
> problem that's occurring with the job rn
Hi,
Could you try to find out what the bottleneck of your current job is? This
would lead to
different optimizations, such as whether it's CPU bound, or whether you have too
big a local
state and thus get stuck on too many slow IOs.
Best,
Kurt
On Wed, Jan 8, 2020 at 3:53 PM 贺小令 wrote:
> hi sunfulin,
> you ca
ontribute to the Flink codebase.
>
> Anyway, shout out to Jark for resolving the bug and providing a patch! I
> believe this will be a real enabler for CQRS architectures on Flink (we had
> subscriptions with regular joins, and this patch enables querying the same
> thing with very mi
at's
> expected as I need to process latest records, as long as it is sending only
> the record(s) that's been updated.
>
> Thanks,
> RKandoji
>
> On Fri, Jan 3, 2020 at 9:57 PM Kurt Young wrote:
>
>> Hi RKandoji,
>>
>> It looks like you have a data
Hi Benoît,
Before discussing all the options you listed, I'd like to understand more
about your requirements.
The part I don't fully understand is that both your fact (Event) and dimension
(DimensionAtJoinTimeX) tables are
coming from the same table, Event or EventRawInput in your case. So it will
resul
Looks like a bug to me, could you file an issue for this?
Best,
Kurt
On Thu, Jan 2, 2020 at 9:06 PM jeremyji <18868129...@163.com> wrote:
> Two streams as table1, table2. We know that grouping with a regular join won't
> work,
> so we have to use a time-windowed join. So here is what my Flink SQL looks like:
;>>
>>>>>> Yes, I'm planning to try out DeDuplication when I'm done upgrading to
>>>>>> version 1.9. Hopefully deduplication is done by only one task and reused
>>>>>> everywhere else.
>>>>>>
>>>>&g
BTW, you could also have a more efficient version of deduplicating
the user table by using the Top-N feature [1].
Best,
Kurt
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sql.html#top-n
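As a rough sketch, the deduplication pattern from [1] keeps only the latest row
per key; the Users table and its columns here are illustrative assumptions:

// Rough sketch of the Top-N / deduplication pattern from [1] (blink planner);
// the Users table and its columns are illustrative assumptions.
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class DeduplicateUsersSketch {
    public static Table latestRowPerUser(TableEnvironment tEnv) {
        // ROW_NUMBER() partitioned by the key and ordered by time, keeping only row 1,
        // is recognized by the planner as a deduplication query.
        return tEnv.sqlQuery(
                "SELECT userId, userName, updateTime FROM (" +
                "  SELECT *, ROW_NUMBER() OVER (" +
                "    PARTITION BY userId ORDER BY proctime DESC) AS row_num FROM Users" +
                ") WHERE row_num = 1");
    }
}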
On Tue, Dec 31, 2019 at 9:24 AM Jingsong Li wrote:
> Hi RKandoji,
>
> In theory, you
I created an issue to track this feature:
https://issues.apache.org/jira/browse/FLINK-15440
Best,
Kurt
On Tue, Dec 31, 2019 at 8:00 AM Fanbin Bu wrote:
> Kurt,
>
> Is there any update on this or roadmap that supports savepoints with Flink
> SQL?
>
> On Sun, Nov 3, 2019 at 1
window it keeps
> history of several days. If I want to put the logic of #2 there, I will need to
> manage it with timers, correct?
>
> On Thu, Dec 26, 2019 at 6:28 AM Kurt Young wrote:
>
>> *This Message originated outside your organization.*
>>
Hi,
You can merge the logic of #2 into #4, it will be much simpler.
Best,
Kurt
On Wed, Dec 25, 2019 at 7:36 PM Avi Levi wrote:
> Hi ,
>
> I have the following pipeline :
> 1. single hour window that counts the number of records
> 2. single day window that accepts the aggregated data from #1
Hi Eva,
Correct me if I'm wrong: you have an unbounded Task stream and you
want to enrich the task events with User info. Meanwhile, the User table
is also changing over time, so you basically want that when a task event
comes, it joins the latest data of the User table and emits the results. Even if the
Hi James,
If I understand correctly, you can use `TableEnvironment#sqlQuery` to
achieve
what you want. You can pass the whole SQL statement in and get a `Table`
back
from the method. I believe this is the table you want, which is semantically
equivalent to the stream you mentioned.
For example,
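A minimal sketch along those lines; the table and the query are illustrative
assumptions:

// Minimal sketch; the source table and the query are illustrative assumptions.
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SqlQuerySketch {
    // Pass the whole SQL statement in and get a Table back; the returned Table is
    // semantically equivalent to the stream produced by the query.
    public static Table perUserClickCounts(TableEnvironment tEnv) {
        return tEnv.sqlQuery(
                "SELECT userId, COUNT(url) AS cnt FROM clicks GROUP BY userId");
    }
}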
> OK, I believe I know enough to get my hands dirty with the code. I can
> share later on what I was able to accomplish. And probably more questions
> will show up when I finally start the implementation.
>
> Thanks
> Krzysztof
>
> On Mon, Dec 16, 2019 at 03:14, Kurt Young wrote:
that feasible?
> Do I understand correctly, that this option is available only with Blink
> engine and also only with use of Flink SQL, no Table API?
>
> Same question comes up regarding reprocessing: do you think it would be
> possible to use the same logic / SQL for reprocessing?
On Fri, Dec 13, 2019 at 4:37 PM Kurt Young wrote:
> Hi Krzysztof,
>
> What you raised is also something we would like to achieve in Flink.
> Unfortunately, there
> is no in-place solution in the Table/SQL API yet, but you have 2 options which
> are both
> close to this and thus need some
Hi Krzysztof,
What you raised is also something we would like to achieve in Flink.
Unfortunately, there
is no in-place solution in the Table/SQL API yet, but you have 2 options which
are both
close to this and thus need some modifications.
1. The first one is to use the temporal table function [1]. It needs you to wri
Hi Chris,
If you are only interested in the latest data of the dimension table, maybe you can
try
the temporal table join:
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sql.html#operations
see "Join with Temporal Table"
Best,
Kurt
On Fri, Dec 6, 2019 at 11:13 PM Fabian Hueske wr
It's not possible for SQL and Table API jobs to play with savepoints yet,
but I
think this is a popular requirement and we should definitely discuss
solutions
in the following versions.
Best,
Kurt
On Sat, Nov 2, 2019 at 7:24 AM Fanbin Bu wrote:
> Kurt,
>
> What do you recommend for Flink SQ
3 and see if there is a better solution to
> combine these two functions. I am very willing to join this development.
>
>
>
> ------ Original Message ------
> *From:* "Kurt Young";
> *Sent:* Tuesday, Sep 17, 2019 at 11:19 AM
> *To:* "Jun Zhang"<825875..
Kurt:
> thank you very much.
> I will take a closer look at the FLIP-63.
>
> In this PR I developed, the underlying sink is StreamingFileSink, not
> BucketingSink, but I gave it a name, called Bucket.
>
>
> On 09/17/2019 10:57,Kurt Young
> wrote:
>
> Hi Ju
Hi Jun,
Thanks for bringing this up; in general I'm +1 on this feature. As
you might know, there is another ongoing effort around this kind
of table sink, which is covered in the newly proposed partition support
reworking [1]. In this proposal, we also want to introduce a new
file system connector, which
Hi Debasish,
I think there is a good chance of having 1.9.1, the only question is when.
1.9.0 was released ~2 weeks ago, and I think some users are still in the middle
of migrating to 1.9.0. Waiting another 1 or 2 weeks and also
seeing whether there are any critical bugs in 1.9.0 sounds reasonable
Great to hear! Thanks Gordon for driving the release, and it's been a
great pleasure to work with you as release managers for the last couple of
weeks. And thanks everyone who contributed to this version, you're making
Flink an even better project!
Best,
Kurt
On Fri, Aug 23, 2019 at 02:17, Yun Tang wrote:
> G
Congratulations Rong!
Best,
Kurt
On Thu, Jul 11, 2019 at 10:53 PM Kostas Kloudas wrote:
> Congratulations Rong!
>
> On Thu, Jul 11, 2019 at 4:40 PM Jark Wu wrote:
>
>> Congratulations Rong Rong!
>> Welcome on board!
>>
>> On Thu, 11 Jul 2019 at 22:25, Fabian Hueske wrote:
>>
>>> Hi everyone,
Thanks for being the release manager and great job! @Jincheng
Best,
Kurt
On Wed, Jul 3, 2019 at 10:19 AM Tzu-Li (Gordon) Tai
wrote:
> Thanks for being the release manager @jincheng sun
> :)
>
> On Wed, Jul 3, 2019 at 10:16 AM Dian Fu wrote:
>
>> Awesome! Thanks a lot for being the release ma
t;
> class implementation.
>
> Best,
> Felipe
>
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipeogutierrez.blogspot.com>*
>
>
> On Wed, Apr 17, 2019 at 4:13 AM Kurt Young w
I mean no particular reason.
Best,
Kurt
On Wed, Apr 17, 2019 at 7:44 PM Kurt Young wrote:
> There is no particular reason for it; the operator and function don't rely on the
> logic that AbstractUdfStreamOperator supplies.
>
> Best,
> Kurt
>
>
> On Wed, Apr 17, 201
/MqttSensorDataSkewedCombinerByKeySkewedDAG.java#L86>
> .
>
> Thanks,
> Felipe
>
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipeogutierrez.blogspot.com>*
>
>
> On Tue, Apr 16,
t;
>
>
> implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
> org.sense.flink.App
>
>
> false
>
>
>
>
>
>
>
>
>
> Thanks
>
>
>
> *--*
> *-- Felipe Gutierrez*
>
> *-- sky
gt; <https://felipeogutierrez.blogspot.com>*
>
>
> On Mon, Apr 15, 2019 at 9:49 AM Felipe Gutierrez <
> felipe.o.gutier...@gmail.com> wrote:
>
>> Cool, thanks Kurt!
>> *-*
>> *- Felipe Gutierrez*
>>
>> *- skype: felipe.o.gutierrez*
>> *- **https://felip
Hi,
You can check out the bundle operator used in Blink to perform the kind of
thing you mentioned:
https://github.com/apache/flink/blob/blink/flink-libraries/flink-table/src/main/java/org/apache/flink/table/runtime/bundle/BundleOperator.java
Best,
Kurt
On Fri, Apr 12, 2019 at 8:05 PM Felipe G
+1 (non-binding)
Checked items:
- checked checksums and GPG files
- verified that the source archives do not contain any binaries
- checked that all POM files point to the same version
- build from source successfully
Best,
Kurt
On Wed, Mar 20, 2019 at 2:12 PM jincheng sun
wrote:
> Hi Aljosc
Hi Dongwon,
AFAIK, Flink doesn't support usage like "myscalar(col1, col2, col3) as
(col4, col5)". Am I missing something?
If you want to split a Tuple2 into two different columns, you can use a UDTF.
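A rough sketch of such a UDTF, using the pre-1.11 registration style; the class,
column and table names are illustrative assumptions:

// Rough sketch of a UDTF that splits a Tuple2 column into two columns;
// all names are illustrative assumptions.
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;

public class SplitTuple2 extends TableFunction<Row> {
    public void eval(Tuple2<String, Integer> value) {
        // Emit one row with two columns for every incoming Tuple2.
        collect(Row.of(value.f0, value.f1));
    }

    @Override
    public TypeInformation<Row> getResultType() {
        return Types.ROW(Types.STRING, Types.INT);
    }
}
// Usage, e.g.: tEnv.registerFunction("splitT", new SplitTuple2());
//   SELECT col4, col5 FROM myTable, LATERAL TABLE(splitT(myTuple)) AS T(col4, col5)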
Best,
Kurt
On Wed, Mar 20, 2019 at 9:59 AM Dongwon Kim wrote:
> Hi,
>
> I want to split Tupl
ors there.
> Or would you typically in such scenarios go the route of either having a
> retractable sink / sink that can update partial results by key?
>
>
>
> Thanks,
>
>
>
> -- Piyush
>
>
>
>
>
> *From: *Kurt Young
> *Date: *Tuesday, Marc
Keep one thing in mind: if you want the element to remain valid after the
function call ends (maybe map(), flatMap(), depending on what you are using),
you should copy the element.
Typical scenarios include:
1. Saving the elements into some collection like an array, list, or map for later
usage; you should c
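A minimal sketch of that first scenario; the element type and what is done with
the buffer are illustrative assumptions:

// Minimal sketch; the element type and the buffering logic are illustrative.
// Because the runtime may reuse the input object after the call returns, buffered
// elements must be copies rather than the original references.
import java.util.ArrayList;
import java.util.List;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;

public class BufferingMapSketch
        implements MapFunction<Tuple2<String, Integer>, Tuple2<String, Integer>> {

    private final List<Tuple2<String, Integer>> buffer = new ArrayList<>();

    @Override
    public Tuple2<String, Integer> map(Tuple2<String, Integer> value) {
        // Copy before storing for later use instead of keeping `value` itself.
        buffer.add(new Tuple2<>(value.f0, value.f1));
        return value;
    }
}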
rror like that.
Best,
Kurt
On Tue, Mar 12, 2019 at 9:46 PM Piyush Narang wrote:
> Thanks for getting back Kurt. Yeah this might be an option to try out. I
> was hoping there would be a way to express this directly in the SQL though
> ☹.
>
>
>
> -- Piyush
>
>
>
Hi Piyush,
Could you try adding clientId into your aggregate function, tracking
the map inside your new aggregate
function, and assembling whatever result you want when emitting?
The SQL would look like:
SELECT userId, some_aggregation(clientId, eventType, `timestamp`,
dataField)
FROM my_kafka_stream_t
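A rough sketch of what such an aggregate function could look like; the accumulator
layout, parameter types and emitted result are illustrative assumptions:

// Rough sketch of a user-defined aggregate function that keeps per-clientId state in a
// map and assembles a result on emit; all names and types are illustrative assumptions.
import java.util.HashMap;
import java.util.Map;
import org.apache.flink.table.functions.AggregateFunction;

public class PerClientAggregation
        extends AggregateFunction<String, PerClientAggregation.Acc> {

    public static class Acc {
        public Map<String, Long> countsPerClient = new HashMap<>();
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    // Called once per input row; keep whatever per-client state you need in the map.
    public void accumulate(Acc acc, String clientId, String eventType,
                           Long timestamp, String dataField) {
        acc.countsPerClient.merge(clientId, 1L, Long::sum);
    }

    @Override
    public String getValue(Acc acc) {
        // Assemble whatever result shape you want when emitting (a plain string here).
        return acc.countsPerClient.toString();
    }
}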
Congrats Thomas!
Best,
Kurt
On Wed, Feb 13, 2019 at 10:02 AM Shaoxuan Wang wrote:
> Congratulations, Thomas!
>
> On Tue, Feb 12, 2019 at 5:59 PM Fabian Hueske wrote:
>
>> Hi everyone,
>>
>> On behalf of the Flink PMC I am happy to announce Thomas Weise as a new
>> member of the Apache Flink P
Hi,
We have implemented ANALYZE TABLE in our internal version of Flink, and we
will try to contribute it back to the community.
Best,
Kurt
On Thu, Nov 29, 2018 at 9:23 PM Fabian Hueske wrote:
> I'd try to tune it in a single query.
> If that does not work, go for as few queries as possible, spli
Hi,
A partition is the output of a JobVertex, which you can simply think of as
containing an operator. In the real world, a JobVertex will run in parallel;
each parallel instance
will output some data, which is conceptually called a subpartition.
Best,
Kurt
On Thu, Oct 11, 2018 at 10:27 AM Renjie Liu wrote:
> Hi, Chris:
>
return null;
> }
> });
>
> Because it looks like aggregate would only transfer WindowedStream to a
> DataStream. But for a global aggregation phase (a reducer), should I
> extract the window again?
>
>
> Thanks! I apologize if that s
> Le
>
> On Sun, Oct 22, 2017 at 8:52 PM, Kurt Young wrote:
>
>> Hi,
>>
>> The document you are looking at is pretty old, you can check the newest
>> version here: https://ci.apache.org/projects/flink/flink-docs-releas
>> e-1.3/dev/batch/dataset_tran
Hi,
The document you are looking at is pretty old, you can check the newest
version here:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/batch/dataset_transformations.html
Regarding your question, you can use combineGroup
Best,
Kurt
On Mon, Oct 23, 2017 at 5:22 AM, Le Xu wr
Copied from my earlier response to a similar question:
"Here is a short description of how it works: there are three threads in total
working together, one for reading, one for sorting partial data in memory,
and the last one is responsible for spilling. Flink will first figure out
how many memory
Hi Marchant,
I'm afraid that the serde cost still exists even if both operators run in the
same TaskManager.
Best,
Kurt
On Tue, Sep 5, 2017 at 9:26 PM, Marchant, Hayden
wrote:
> I have a streaming application that has a keyBy operator followed by an
> operator working on the keyed values (a custom
aintain my ListStates
>
> 1. Does each ArrayList have its own ListState?
> 2. I am not clear on the open function in the example given by Flink. I
> wonder how I can initialize my ArrayLists with a ListStateDescriptor.
>
>
>
> Desheng Zhang
> E-mail: gzzhangdesh...@corp.
Hi,
I think you can use State to achieve your goal:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/state.html
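A minimal sketch of keyed state used after a keyBy; the key/value types and the
counting logic are illustrative assumptions:

// Minimal sketch of keyed ValueState inside a RichMapFunction applied after keyBy();
// the key/value types and the counting logic are illustrative assumptions.
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;

public class RunningCountPerKey extends RichMapFunction<String, Long> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        // State is scoped to the current key of the keyed stream.
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Types.LONG));
    }

    @Override
    public Long map(String value) throws Exception {
        Long current = count.value();
        long next = (current == null ? 0L : current) + 1;
        count.update(next);
        return next;
    }
}
// Usage, e.g.: stream.keyBy(s -> s).map(new RunningCountPerKey());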
Best,
Kurt
On Thu, Jul 13, 2017 at 1:14 PM, ZalaCheung wrote:
> Hi all,
>
> I am stuck with a problem. I have a stream, I want to keyBy it and then do a
> map func
+1 for dropping Java 7; we have been using Java 8 for more than one year at
Alibaba and everything works fine.
Best,
Kurt
On Thu, Jul 13, 2017 at 2:53 AM, Bowen Li wrote:
> +1 for dropping Java 7
>
> On Wed, Jul 12, 2017 at 9:04 AM, Gyula Fóra wrote:
>
> > +1 for dropping 1.7 from me as well.
>
Hi,
Ufuk wrote up an excellent document about Netty's memory allocation [1]
inside Flink, and I want to add one more note after running some large-scale
jobs.
The only inaccurate thing about [1] is how much memory
LengthFieldBasedFrameDecoder
will use. From our observations, it will cost at m
I think the only way is adding more managed memory.
The large record handler only takes effect on the reduce side, which is used by the
merge sorter. According to
the exception, it is thrown during the combining phase, which only uses an
in-memory sorter that doesn't
have the large record handling mechanism.
Be
Hi,
Can you paste a code snippet to show how you use the DataSet API?
Best,
Kurt
On Tue, Jun 13, 2017 at 4:29 PM, Sebastian Neef <
gehax...@mailbox.tu-berlin.de> wrote:
> Hi Kurt,
>
> thanks for the input.
>
> What do you mean with "try to disable your combiner"? Any tips on how I
> can do t
Hi,
I think the reason is that your record is too large to do an in-memory combine.
You can try disabling your combiner.
Best,
Kurt
On Mon, Jun 12, 2017 at 9:55 PM, Sebastian Neef <
gehax...@mailbox.tu-berlin.de> wrote:
> Hi,
>
> when I'm running my Flink job on a small dataset, it successfully
> fi