Hi Tarandeep,
the AvroInputFormat was recently extended to support GenericRecords. [1]
You could also try to run the latest SNAPSHOT version and see if it works
for you.
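A rough sketch of how reading GenericRecords could look (the path and variable
names are placeholders, and the exact package of AvroInputFormat may differ in
the SNAPSHOT you build against):
// read an Avro file as GenericRecords (hypothetical path)
ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
AvroInputFormat<GenericRecord> avroFormat =
    new AvroInputFormat<>(new Path("hdfs:///data/events.avro"), GenericRecord.class);
DataSet<GenericRecord> records = env.createInput(avroFormat);
records.first(10).print();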
Cheers, Fabian
[1] https://issues.apache.org/jira/browse/FLINK-3691
2016-05-12 10:05 GMT+02:00 Tarandeep Singh :
> I think I
Hi,
Flink's exactly-once semantics do not mean that events are processed
exactly-once but that events will contribute exactly-once to the state of
an operator such as a counter.
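To get these guarantees, checkpointing has to be enabled, e.g. (a minimal
sketch, the interval is arbitrary):
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// take a checkpoint every 10 seconds; EXACTLY_ONCE is the default mode
env.enableCheckpointing(10000);
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);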
Roughly, the mechanism works as follows:
- Flink periodically injects checkpoint markers into the data stream. This
happe
master/apis/streaming/fault_tolerance.html#fault-tolerance-guarantees-of-data-sources-and-sinks
>
> If not what other Sinks can I use to have the exactly once output since
> getting exactly once output is critical for our use case.
>
>
>
> Thanks,
> Naveen
>
> From: Fabian Hueske
>
> pipeline.
>
>
>
> Thanks,
> Naveen
>
> From: Fabian Hueske
> Reply-To: "user@flink.apache.org"
> Date: Friday, May 13, 2016 at 4:26 PM
>
> To: "user@flink.apache.org"
> Subject: Re: Flink recovery
>
> Hi Naveen,
>
> the Ro
Hi Prateek,
the missing numbers are an artifact from how the stats are collected.
ATM, Flink only collects these metrics for data that is sent over
connections *between* Flink operators.
Since sources and sinks connect to external systems (and not Flink
operators), the dashboard does not sho
".valid-length" file.
>>>
>>> The fix you mentioned is part of later Flink releases (like 1.0.3)
>>>
>>> Stephan
>>>
>>>
>>> On Mon, May 16, 2016 at 11:46 PM, Madhire, Naveen <
>>> naveen.madh...@capitalone
I think union is what you are looking for.
Note that all data sets must be of the same type.
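A minimal sketch, assuming three DataSet<Tuple2<String, Integer>> d1, d2, d3:
DataSet<Tuple2<String, Integer>> all = d1.union(d2).union(d3);
DataSet<Tuple2<String, Integer>> reduced = all
    .groupBy(0) // group on the first tuple field
    .reduce((a, b) -> new Tuple2<>(a.f0, a.f1 + b.f1));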
2016-05-18 16:15 GMT+02:00 Ritesh Kumar Singh :
> Hi,
>
> How can I perform a reduce operation on a group of datasets using Flink?
> Let's say my map function gives out n datasets: d1, d2, ... dN
> Now I
I think that sentence is misleading and refers to the internals of Flink.
It should be removed, IMO.
You can only union two DataSets. If you want to union more, you have to do
it one by one.
Btw. union does not cause additional processing overhead.
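For example, to union a whole list of DataSets of the same type (a sketch,
MyType and the dataSets list are placeholders):
DataSet<MyType> result = dataSets.get(0);
for (int i = 1; i < dataSets.size(); i++) {
    result = result.union(dataSets.get(i));
}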
Cheers, Fabian
2016-05-19 14:44 GMT+02:00 Rites
The problem seems to occur quite often.
Did you update your Flink version recently? If so, could you try to
downgrade and see if the problem disappears.
Is it otherwise possible that it is caused by faulty hardware?
2016-05-20 18:05 GMT+02:00 Flavio Pompermaier :
> This time (Europed instead of E
Actually, the program works correctly (according to the DataStream API)
Let me explain what happens:
1) You do not initialize the count variable, so it will be 0 (summing 0s
results in 0)
2) DataStreams are considered to be unbounded (have an infinite size). KeyBy
does not group the records because
Hi Kirsti,
I'm not aware of anybody working on this issue.
Would you like to create a JIRA issue for it?
Best, Fabian
2016-05-23 16:56 GMT+02:00 KirstiLaurila :
> Is there any plans to implement this kind of feature (possibility to write
> to
> data specified partitions) in the near future?
>
>
No, that is not supported yet.
Beam provides a common API, but the Flink runner translates Beam programs
against batch sources into DataSet API programs and Beam programs against
streaming sources into DataStream programs.
It is not possible to mix both.
2016-05-26 10:00 GMT+02:00 Ashutosh Kumar :
>
Hi Elias,
yes, reduce, fold, and the aggregation functions (sum, min, max, minBy,
maxBy) on WindowedStream perform eager aggregation, i.e., the functions are
applied for each value that enters the window and the state of the window
will consist of a single value. In case you need access to the Windo
Hi Dongwon,
Maybe you can add your use case to the FLIP-107 discussion thread [1] and
thereby support the proposal (after checking that it would solve your
problem).
It's always helpful to learn about the requirements of users when designing
new features.
It also helps to prioritize which feature
Hi Marco,
You cannot really synchronize data that is being emitted via different
streams (without bringing them together in an operator).
I see two options:
1) emit the event to create the partition and the data to be written into
the partition to the same stream. Flink guarantees that records d
Hi Laurent,
Thanks for trying out Ververica platform!
However, please note that this is the mailing list of the Apache Flink
project.
Please post further questions using the "Community Edition Feedback" button
on this page: https://ververica.zendesk.com/hc/en-us
We are working on setting up a bett
Hi Giriraj,
This looks like the deserialization of a String failed.
Can you isolate the problem to a pair of sending and receiving tasks?
Best, Fabian
Am So., 5. Apr. 2020 um 20:18 Uhr schrieb Giriraj Chauhan <
graj.chau...@gmail.com>:
> Hi,
>
> We are submitting a flink(1.9.1) job for data pro
Hi Kristoff,
I'm not aware of any concrete plans for such a feature.
Best,
Fabian
Am So., 5. Apr. 2020 um 22:33 Uhr schrieb KristoffSC <
krzysiek.chmielew...@gmail.com>:
> Hi,
> according to [1] operator state and broadcast state (which is a "special"
> type of operator state) are not stored in
Hi,
With Flink streaming operators
However, these parts are currently being reworked to enable a better
integration of batch and streaming use cases (or hybrid use cases such as
yours).
A while back, we wrote a blog post about these plans [1]:
> *"Unified Stream Operators:* Blink extends the Fli
Hi Anil,
Here's a pointer to Flink's end-2-end test that's checking the integration
with schema registry [1].
It was recently updated so I hope it works the same way in Flink 1.9.
Best,
Fabian
[1]
https://github.com/apache/flink/blob/master/flink-end-to-end-tests/flink-confluent-schema-registry/
Hi Sudan,
I noticed a few issues with your code:
1) Please check the computation of timestamps. Your code
public long extractAscendingTimestamp(Eventi.Event element) {
return element.getEventTime().getSeconds() * 1000;
}
only seems to look at the seconds of a timestamp. Typically, you wou
> }
>
> Then used a new object of GenericSerializer in the FlinkKafkaProducer
>
> FlinkKafkaProducer producer =
> new FlinkKafkaProducer<>(topic, new GenericSerializer(topic, schema,
> schemaRegistryUrl), kafkaConfig, Semantic.AT_LEAST_ONCE);
>
> Thanks , Anil.
>
>
Hi,
If the interval join emits the time attributes of both its inputs, you can
use either of them as a time attribute in a following operator because the
join ensures that the watermark will be aligned with both of them.
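As a sketch (table and column names are made up, assuming a
StreamTableEnvironment tEnv and that both rowtime columns are event-time
attributes):
Table joined = tEnv.sqlQuery(
    "SELECT o.rowtime AS o_rowtime, s.rowtime AS s_rowtime, o.id " +
    "FROM Orders o, Shipments s " +
    "WHERE o.id = s.orderId " +
    "AND s.rowtime BETWEEN o.rowtime AND o.rowtime + INTERVAL '1' HOUR");
// o_rowtime or s_rowtime can be used as a time attribute in a following operator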
Best, Fabian
Am Mo., 4. Mai 2020 um 00:48 Uhr schrieb lec ssmi :
> Thanks
ement to make it . Can it be possible?
>
> Fabian Hueske 于2020年5月4日周一 下午4:04写道:
>
>> Hi,
>>
>> If the interval join emits the time attributes of both its inputs, you
>> can use either of them as a time attribute in a following operator because
>> the join ensur
There's also the Table API approach if you want to avoid typing a "full"
SQL query:
Table t = tEnv.from("myTable");
Cheers,
Fabian
Am Di., 5. Mai 2020 um 16:34 Uhr schrieb Őrhidi Mátyás <
matyas.orh...@gmail.com>:
> Thanks guys for the prompt answers!
>
> On Tue, May 5, 2020 at 2:49 PM Kurt You
one?
>>>>
>>>> Benchao Li 于 2020年5月5日周二 17:26写道:
>>>>
>>>>> Hi lec,
>>>>>
>>>>> You don't need to specify time attribute again like `TUMBLE_ROWTIME`,
>>>>> you just select the time attribute field
Hi,
The code of the implementation is linked in the paper:
https://github.com/DataSystemsGroupUT/Adaptive-Watermarks
Since this is a prototype for a research paper, I'm doubtful that the
project is maintained.
I also didn't find an open-source license attached to the code.
Hence adding the project
Hi Josson,
I don't have much experience setting memory bounds in Kubernetes myself,
but my colleague Andrey (in CC) reworked Flink's memory configuration for
the last release to ease the configuration in container envs.
He might be able to help.
Best, Fabian
Am Do., 21. Mai 2020 um 18:43 Uhr sch
Congrats Yu!
Cheers, Fabian
Am Mi., 17. Juni 2020 um 10:20 Uhr schrieb Till Rohrmann <
trohrm...@apache.org>:
> Congratulations Yu!
>
> Cheers,
> Till
>
> On Wed, Jun 17, 2020 at 7:53 AM Jingsong Li
> wrote:
>
> > Congratulations Yu, well deserved!
> >
> > Best,
> > Jingsong
> >
> > On Wed, Jun
Hi Jie Feng,
As you said, Flink translates SQL queries into streaming programs with
auto-generated operator IDs.
In order to start a SQL query from a savepoint, the operator IDs in the
savepoint must match the IDs in the newly translated program.
Right now this can only be guaranteed if you transl
Hi Joris,
I don't think that the approach of "add methods in operator class code that
can be called from the main Flink program" will work.
The most efficient approach would be implementing a ProcessFunction that
counts in 1-min time buckets (using event-time semantics) and updates the
metrics.
I
Hi Brian,
AFAIK, Arvid and Piotr (both in CC) have been working on the threading
model of the checkpoint coordinator.
Maybe they can help with this question.
Best, Fabian
Am Mo., 20. Juli 2020 um 03:36 Uhr schrieb :
> Anyone can help us on this issue?
>
>
>
> Best Regards,
>
> Brian
>
>
>
> *Fr
Hi White,
Can you describe your problem in more detail?
* What is your Flink version?
* How do you deploy the job (application / session cluster), (Kubernetes,
Docker, YARN, ...)
* What kind of job are you running (DataStream, Table/SQL, DataSet)?
Best, Fabian
Am Mo., 20. Juli 2020 um 08:42 Uhr
Hi,
When running your code in the IDE, everything runs in the same local JVM.
When you run the job on Kubernetes, the situation is very different.
Your code runs in multiple JVM processes distributed in a cluster.
Flink provides a metrics collection system that you should use to collect
metrics f
Hi Boxiu,
This sounds like a good feature.
Please have a look at our contribution guidelines [1].
To propose a feature, you should open a Jira issue [2] and start a
discussion there.
Please note that the feature freeze for the Flink 1.9 release happened a
few weeks ago.
The community is currentl
. 2019 um 20:00 Uhr schrieb Ahmad Hassan <
ahmad.has...@gmail.com>:
>
> Hi Fabian,
>
> > On 4 Jul 2018, at 11:39, Fabian Hueske wrote:
> >
> > - Pre-aggregate records in a 5 minute Tumbling window. However,
> pre-aggregation does not work for FoldFunctions.
Hi,
Flink does not distinguish between streams and tables. For the Table API /
SQL, there are only tables that are changing over time, i.e., dynamic
tables.
A Stream in the Kafka Streams or KSQL sense is, in Flink, a Table with
append-only changes, i.e., records are only inserted and never deleted
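For illustration, a rough sketch in the Table API (the stream, field names,
and the query are made up):
// an append-only table derived from a stream
Table clicks = tEnv.fromDataStream(clickStream, "user, url, clickTime");
DataStream<Row> appendOnly = tEnv.toAppendStream(clicks, Row.class);
// an aggregation turns it into a continuously updating table,
// which must be converted into a retract stream
Table counts = clicks.groupBy("user").select("user, url.count as cnt");
DataStream<Tuple2<Boolean, Row>> updates = tEnv.toRetractStream(counts, Row.class);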
Hi,
Regarding step 3, it is sufficient to check that you got one message from
each parallel task of the previous operator. That's because a task
processes the timers of all keys before moving forward.
Timers are always processed per key, but you could deduplicate on the
parallel task id and check t
Hi Joern,
Thanks for sharing your connectors!
The Flink community is currently working on a website that collects and
lists externally maintained connectors and libraries for Flink.
We are still figuring out some details, but hope that it can go live soon.
Would be great to have your repositories
>
>
> However, with your proposed solution, how would we be able to achieve this
> sliding window mechanism of emitting 24 hour window every 5 minute using
> processfunction ?
>
>
> Best,
>
>
> On Fri, 2 Aug 2019 at 09:48, Fabian Hueske wrote:
>
>> Hi Ahmad,
Thanks for the bug report Jacky!
Would you mind opening a Jira issue, preferably with a code snippet that
reproduces the bug?
Thank you,
Fabian
Am Fr., 2. Aug. 2019 um 16:01 Uhr schrieb Jacky Du :
> Hi, All
>
> Just find that Flink Table API have some issue if define nested object in
> an objec
Hi Jungtaek,
I would recommend to implement the logic in a ProcessFunction and avoid
Flink's windowing API.
IMO, the windowing API is difficult to use, because there are many pieces
like WindowAssigner, Window, Trigger, Evictor, WindowFunction that are
orchestrated by Flink.
This makes it very har
>> how much state the query will need to maintain.
>>
>>
>> I am not sure to understand the problem. If i have to append-only table
>> and perform some join on it, what's the issue ?
>>
>>
>> On Tue, Aug 6, 2019 at 8:03 PM Maatary Okouya
Congratulations Hequn!
Am Mi., 7. Aug. 2019 um 14:50 Uhr schrieb Robert Metzger <
rmetz...@apache.org>:
> Congratulations!
>
> On Wed, Aug 7, 2019 at 1:09 PM highfei2...@126.com
> wrote:
>
> > Congrats Hequn!
> >
> > Best,
> > Jeff Yang
> >
> >
> > Original Message
> > Subject:
Hi Vincent,
I don't think there is such a flag in Flink.
However, this sounds like a really good idea.
Would you mind creating a Jira ticket for this?
Thank you,
Fabian
Am Di., 6. Aug. 2019 um 17:53 Uhr schrieb Vincent Cai <
caidezhi...@foxmail.com>:
> Hi Users,
> In Spark, we can invoke Data
x this issue could be
> pretty simple .
>
> Thanks
> Jacky Du
>
> Fabian Hueske 于2019年8月2日周五 下午12:07写道:
>
>> Thanks for the bug report Jacky!
>>
>> Would you mind opening a Jira issue, preferably with a code snippet that
>> reproduces the bug?
>
Thanks for reporting this issue.
It is already discussed on Flink's dev mailing list in this thread:
->
https://lists.apache.org/thread.html/10f0f3aefd51444d1198c65f44ffdf2d78ca3359423dbc1c168c9731@%3Cdev.flink.apache.org%3E
Please continue the discussion there.
Thanks, Fabian
Am Di., 13. Aug.
Congrats Andrey!
Am Do., 15. Aug. 2019 um 07:58 Uhr schrieb Gary Yao :
> Congratulations Andrey, well deserved!
>
> Best,
> Gary
>
> On Thu, Aug 15, 2019 at 7:50 AM Bowen Li wrote:
>
> > Congratulations Andrey!
> >
> > On Wed, Aug 14, 2019 at 10:18 PM Rong Rong wrote:
> >
> >> Congratulations A
Hi,
Just to clarify. You cannot dynamically switch the join strategy while a
job is running.
What Hequn suggested was to have a util method Util.joinDynamically(ds1,
ds2) that chooses the join strategy when the program is generated (before
it is submitted for execution).
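A sketch of the core of such a util method (the POJOs, field names, and the
size estimate are placeholders; JoinHint is from the DataSet API):
// the strategy is fixed when the plan is assembled, not while the job runs
JoinHint hint = customersAreSmall // e.g. based on your own size estimate
    ? JoinHint.BROADCAST_HASH_SECOND
    : JoinHint.REPARTITION_HASH_FIRST;
DataSet<Tuple2<Order, Customer>> joined =
    orders.join(customers, hint).where("customerId").equalTo("id");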
The problem is that distr
ess a
> checkpoint I can change the join strategy.
>
> and if you do, do you have any toy example of this?
>
> Thanks,
> Felipe
> *--*
> *-- Felipe Gutierrez*
>
> *-- skype: felipe.o.gutierrez*
> *--* *https://felipeogutierrez.blogspot.com
> <https://felipe
sult of that queries taking into account only the last
> values of each row. The result is inserted/updated in a in-memory K-V
> database for fast access.
>
>
>
> Thanks in advance!
>
>
>
> Best
>
>
>
> *De: *Fabian Hueske
> *Fecha: *miércoles, 7 de agost
Hi Padarn,
What you describe is essentially publishing Flink's watermarks to an
outside system.
Flink processes time windows by waiting for a watermark that's past the
window end time. When it receives such a WM, it processes and emits all
ended windows and forwards the watermark.
When a sink rece
Hi Theo,
The main problem is that the semantics of your join (Join all events that
happened on the same day) are not well-supported by Flink yet.
In terms of true streaming joins, Flink supports the time-windowed join
(with the BETWEEN predicate) and the time-versioned table join (which does
not
Hi Tony,
I'm sorry I cannot help you with this issue, but Becket (in CC) might have
an idea what went wrong here.
Best, Fabian
Am Mi., 14. Aug. 2019 um 07:00 Uhr schrieb Tony Wei :
> Hi,
>
> Currently, I was trying to update our kafka cluster with larger `
> transaction.max.timeout.ms`. The
> o
dows.
>
> Thanks.
>
> Best,
>
> On 2 Aug 2019, at 12:49, Fabian Hueske wrote:
>
> Ok, I won't go into the implementation detail.
>
> The idea is to track all products that were observed in the last five
> minutes (i.e., unique product ids) in a five minute tumb
Great!
Thanks for the feedback.
Cheers, Fabian
Am Mo., 19. Aug. 2019 um 17:12 Uhr schrieb Ahmad Hassan <
ahmad.has...@gmail.com>:
>
> Thank you Fabian. This works really well.
>
> Best Regards,
>
> On Fri, 16 Aug 2019 at 09:22, Fabian Hueske wrote:
>
>> Hi
Hi Anissa,
Are you using combineGroup or reduceGroup?
Your question refers to combineGroup, but the code only shows reduceGroup.
combineGroup is non-deterministic by design to enable efficient partial
results without network and disk IO.
reduceGroup is deterministic given a deterministic key extr
Hi Manvi,
A NoSuchMethodError typically indicates a version mismatch.
I would check if the Flink versions of your program, the client, and the
cluster are the same.
Best, Fabian
Am Di., 20. Aug. 2019 um 21:09 Uhr schrieb manvmali :
> Hi, I am facing the issue of writing the data stream result i
Hi Dongwon,
I'm not super familiar with Flink's MATCH_RECOGNIZE support, but Dawid (in
CC) might have some ideas about it.
Best,
Fabian
Am Mi., 21. Aug. 2019 um 07:23 Uhr schrieb Dongwon Kim <
eastcirc...@gmail.com>:
> Hi,
>
> Flink relational apis with MATCH_RECOGNITION looks very attractive a
Hi Sung,
There is no switch to configure the WM to be the max of both streams and it
would also in fact violate the core principles of the mechanism.
Watermarks are used to track the progress of event time in streams.
The implementations of operators rely on the fact that (almost) all records
tha
Hi Anissa,
This looks strange. If I understand your code correctly, your GroupReduce
function is summing up a field.
Looking at the results that you posted, it seems as if there is some data
missing (the total sum does not seem to match).
For groupReduce it is important that the grouping keys are
gt; My key fields is array of multiple type, in this case is string and long.
> The result that i'm posting is just represents sampling of output dataset.
>
> Thank you in advance !
>
> Anissa
>
> Le jeu. 22 août 2019 à 11:24, Fabian Hueske a écrit :
>
>> H
Hi,
Can you share a few more details about the data source?
Are you continuously ingesting files from a folder?
You are correct, that the parallelism should not affect the results, but
there are a few things that can affect that:
1) non-deterministic keys
2) out-of-order data with inappropriate wa
Hi Vinod,
This sounds like a watermark issue to me.
The commonly used watermark strategies (like bounded out-of-order) are only
advancing when there is a new record.
Moreover, the current watermark is the minimum of the current watermarks of
all input partitions.
So, the watermark only moves forwa
ler. The latest timestamp
> will be handled first ?
>
>
>
>
>
> BTW I tried to use a ContinuousEventTimeTrigger to make sure the window
> is calculated ? and got the processing to trigger multiple times so I’m
> not sure exactly how this type of trigger works..
>
>
>
> Tha
I'd like to thank, I'm learning Flink with the new book "Stream
> Processing with Apache Flink". :) Thanks for your amazing efforts on
> publishing nice book!
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>
> On Mon, Aug 5, 2019 at 10:21 PM Fabian Hueske wr
ng on Flink’s monitoring page - for the watermarks I see
> different vales even after all my files were processed. Which is
> something I would not expect
> I would expect that eventually the WM will be the highest EVENT_TIME on
> my set of files..
>
>
>
>
>
> than
Hi all,
Flink 1.9 Docker images are available at Docker Hub [1] now.
Due to some configuration issue, there are only Scala 2.11 images at the
moment, but this has been fixed [2].
Flink 1.9 Scala 2.12 images should be available soon.
Cheers,
Fabian
[1] https://hub.docker.com/_/flink
[2]
https://github.
Hi Theo,
The work on custom triggers has been put on hold due to some major
refactorings (splitting the modules, porting Scala code to Java, new type
system, new catalog interfaces, integration of the Blink planner).
It's also not on the near-time roadmap AFAIK.
To be honest, I'm not sure how much
27;t output a result any more when testing all of those
> combinations. Now the second attempt works but isn't really what I wanted
> to query (as the "same day"-predicate is still missing).
>
> Best regards
> Theo
>
> --
> *Von: *&qu
eems to have some quirks).
>
> I think ideally each partition of the kafka topic would have some regular
> information about watermarks. Perhaps the kafka producer can be modified to
> support this.
>
> Padarn
>
> On Fri, Aug 16, 2019 at 3:50 PM Fabian Hueske wrote:
>
>&
D*
> The World's Fastest Human Translation Platform.
> oy...@motaword.com — www.motaword.com
>
>
> On Tue, Aug 27, 2019 at 8:11 AM Fabian Hueske wrote:
>
>> Hi all,
>>
>> Flink 1.9 Docker images are available at Docker Hub [1] now.
>> Due to some configur
Hi Sushant,
It's hard to tell what's going on.
Maybe the thread pool of the async io operator is too small for the
ingested data rate?
This could cause the backpressure on the source and eventually also the
failing checkpoints.
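If that is the case, you can give the async operator more capacity (and/or a
larger executor inside your AsyncFunction); a sketch with made-up names and
numbers:
DataStream<Result> enriched = AsyncDataStream.unorderedWait(
    input,
    new MyAsyncLookupFunction(), // your AsyncFunction implementation
    60, TimeUnit.SECONDS,        // timeout per request
    1000);                       // capacity: max concurrent in-flight requests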
Which Flink version are you using?
Best, Fabian
Am Do., 29. Aug. 2
up with a partition
>> containing element out of ‘window order’.
>>
>> I was also thinking this problem is very similar to that of checkpoint
>> barriers. I intended to dig into the details of the exactly once Kafka sink
>> for some inspiration.
>>
>> Padarn
>
9 Uhr schrieb Hanan Yehudai <
hanan.yehu...@radcom.com>:
> Im not sure what you mean by use process function and not window process
> function , as the window operator takes in a windowprocess function..
>
>
>
> *From:* Fabian Hueske
> *Sent:* Monday, August 26, 20
Hi all,
The registration for the Flink Forward Europe training sessions closes in
four weeks.
The training takes place in Berlin at October 7th and is followed by two
days of talks by speakers from companies like Airbus, Goldman Sachs,
Netflix, Pinterest, and Workday [1].
The following four train
Hi,
Flink does not have good support for mixing bounded and unbounded streams
in its DataStream API yet.
If the dimension table is static (and small enough), I'd use a
RichMapFunction and load the table in the open() method into the heap.
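A minimal sketch of that pattern (the types and the loading logic are
placeholders):
public class DimensionEnricher extends RichMapFunction<Event, EnrichedEvent> {
    private transient Map<String, Dimension> dimTable;

    @Override
    public void open(Configuration parameters) throws Exception {
        // load the small, static dimension table once per parallel task
        dimTable = loadDimensionTable();
    }

    @Override
    public EnrichedEvent map(Event event) {
        return new EnrichedEvent(event, dimTable.get(event.getDimensionKey()));
    }
}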
In this case, you'd probably need to restart the job (can b
Hi,
A window needs to keep the data as long as it expects new data.
This is clearly the case before the end time of the window was reached. If
my window ends at 12:30, I want to wait (at least) until 12:30 before I
remove any data, right?
In case you expect some data to be late, you can configure an allowed lateness for the window.
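For example (a sketch; the key selector, intervals, and reduce function are
made up):
input.keyBy(r -> r.getKey())
    .window(TumblingEventTimeWindows.of(Time.minutes(30)))
    .allowedLateness(Time.minutes(10)) // keep window state 10 more minutes for late records
    .reduce((a, b) -> a.merge(b));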
Hi Vishwas,
This is a log statement from Kafka [1].
Not sure when the AppInfoParser is created (the log message is written by
its constructor).
For Kafka versions > 1.0, I'd recommend the universal connector [2].
Not sure how well it works if producers and consumers have different
versions.
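A minimal sketch of the universal consumer (broker, group, and topic are made
up):
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
Properties props = new Properties();
props.setProperty("bootstrap.servers", "broker-1:9092");
props.setProperty("group.id", "my-group");
FlinkKafkaConsumer<String> consumer =
    new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);
DataStream<String> stream = env.addSource(consumer);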
Mayb
Hi,
Are you getting this error repeatedly or was this a single time?
If it's just a single time error, it's probably caused by a task manager
process that died for some reason (as suggested by the error message).
You should have a look at the TM logs to see whether you can find something that
would exp
Hi Steve,
Maybe you could implement a custom TableSource that queries the data from
the rest API and converts the JSON directly into a Row data type.
This would also avoid going through the DataStream API just for ingesting
the data.
Best, Fabian
Am Mi., 4. Sept. 2019 um 15:57 Uhr schrieb Steve
Hi,
Kostas (in CC) might be able to help.
Best, Fabian
Am Mi., 4. Sept. 2019 um 22:59 Uhr schrieb sidhartha saurav <
sidsau...@gmail.com>:
> Hi,
>
> Can someone suggest a workaround so that we do not get this issue while
> changing the S3 bucket ?
>
> On Thu, Aug 22, 2019 at 4:24 PM sidhartha s
Hi everyone,
I'm very happy to announce that Kostas Kloudas is joining the Flink PMC.
Kostas has been contributing to Flink for many years and puts lots of effort
into helping our users and growing the Flink community.
Please join me in congratulating Kostas!
Cheers,
Fabian
String key = iterator.next();
> row.setField(pos, jsonNode.get(key).asText());
> pos++;
> }
> return row;
> }
> }).returns(convert);
>
> Table tableA = tEnv.fromDataStream(dataStreamRow);
>
>
> Le jeu. 5 sept. 2019 à 13:23, Fabian Hueske a écr
Hi,
CheckpointableInputFormat is only relevant if you plan to use the
InputFormat in a MonitoringFileSource, i.e., in a streaming application.
If you plan to use it in a DataSet (batch) program, InputFormat is fine.
Btw. the latest release Flink 1.9.0 has major improvements for the recovery
of ba
Hi Niels,
I think (not 100% sure) you could also cast the event time attribute to
TIMESTAMP before you emit the table.
This should remove the event time property (and thereby the
TimeIndicatorTypeInfo) and you wouldn't need to fiddle with the output
types.
Best, Fabian
Am Mi., 21. Aug. 2019 um 1
database
systems have to deal with.
Best,
Fabian
Am Do., 5. Sept. 2019 um 13:37 Uhr schrieb Hanan Yehudai <
hanan.yehu...@radcom.com>:
> Thanks Fabian.
>
>
> is there any advantage using broadcast state VS using just CoMap function
> on 2 connected streams ?
>
>
>
Hi,
that would be regular SQL cast syntax:
SELECT a, b, c, CAST(eventTime AS TIMESTAMP) FROM ...
Am Di., 10. Sept. 2019 um 18:07 Uhr schrieb Niels Basjes :
> Hi.
>
> Can you give me an example of the actual syntax of such a cast?
>
> On Tue, 10 Sep 2019, 16:30 Fabian Hueske,
Hi Theo,
I would implement this with a KeyedProcessFunction.
These are the important points to consider:
1) partition the output of the Kafka source by Kafka partition (or the
attribute that determines the partition). This will ensure that the data
stay in order (per partition).
2) The KeyedProce
Hi,
This is clearly a Scala version issue.
You need to make sure that all Flink dependencies have the same version and
are compiled for Scala 2.11.
The "_2.11" postfix in the dependency name indicates that it is a Scala
2.11 dependency ("_2.12 indicates Scala 2.12 compatibility).
Best, Fabian
Am
Hi,
There is no upper limit for state size in Flink. There are applications
with 10+ TB state.
However, it is natural that checkpointing time increases with state size as
more data needs to be serialized (in case of FSStateBackend) and written to
stable storage.
(The same is btw true for recovery
Congrats Zili Chen :-)
Cheers, Fabian
Am Mi., 11. Sept. 2019 um 12:48 Uhr schrieb Biao Liu :
> Congrats Zili!
>
> Thanks,
> Biao /'bɪ.aʊ/
>
>
>
> On Wed, 11 Sep 2019 at 18:43, Oytun Tez wrote:
>
>> Congratulations!
>>
>> ---
>> Oytun Tez
>>
>> *M O T A W O R D*
>> The World's Fastest Human Tran
Thanks for reporting back Catlyn!
Am Do., 12. Sept. 2019 um 19:40 Uhr schrieb Catlyn Kong :
> Turns out there was some other deserialization problem unrelated to this.
>
> On Mon, Sep 9, 2019 at 11:15 AM Catlyn Kong wrote:
>
>> Hi fellow streamers,
>>
>> I'm trying to support avro BYTES type in
Hi,
A GROUP BY query on a streaming table requires that the result is
continuously updated.
Updates are propagated as a retraction stream (see
tEnv.toRetractStream(table, Row.class).print(); in your code).
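For illustration, a minimal sketch (table and column names are made up):
Table counts = tEnv.sqlQuery("SELECT user, COUNT(*) AS cnt FROM clicks GROUP BY user");
// each record is a (flag, row) pair, e.g.
//   (true, alice, 1)                      first result for alice
//   (false, alice, 1), (true, alice, 2)   retraction of the old row plus the new row
tEnv.toRetractStream(counts, Row.class).print();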
A retraction stream encodes the type of the update as a boolean flag, the
"true" and "false
Hi,
No, this is not possible at the moment. You can only pass a single
expression as primary key.
A workaround might be to put the two fields into a nested field (I haven't
tried whether this works) or to combine them into a single attribute, for
example by casting them to VARCHAR and concatenating them.
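For example (a sketch with made-up column names):
Table withKey = tEnv.sqlQuery(
    "SELECT CONCAT(CAST(k1 AS VARCHAR), '_', CAST(k2 AS VARCHAR)) AS pk, v " +
    "FROM myTable");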
Best, Fa
Hi,
This can be set via the environment file.
Please have a look at the documentation [1] (see "execution:
min-idle-state-retention: " and "execution: max-idle-retention: " keys).
Fabian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/sqlClient.html#environment-files
A
Hi,
The query that you wrote is not a time-windowed join.
INSERT INTO sourceKafkaMalicious
SELECT sourceKafka.* FROM sourceKafka INNER JOIN badips ON
sourceKafka.`source.ip`=badips.ip
WHERE sourceKafka.timestamp_received BETWEEN CURRENT_TIMESTAMP - INTERVAL
'15' MINUTE AND CURRENT_TIMESTAMP;
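For comparison, a time-windowed join bounds the event-time attributes of both
inputs against each other, roughly like this (a sketch; it assumes badips has
an event-time attribute, here called update_time, which is made up):
SELECT sourceKafka.*
FROM sourceKafka INNER JOIN badips ON sourceKafka.`source.ip`=badips.ip
WHERE sourceKafka.timestamp_received
  BETWEEN badips.update_time - INTERVAL '15' MINUTE AND badips.update_time;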
The
But with that 60 gb memory getting run out
>
> So i used below query.
> Can u please guide me in this regard
>
> On Wed, 18 Sep 2019 at 5:53 PM, Fabian Hueske wrote:
>
>> Hi,
>>
>> The query that you wrote is not a time-windowed join.
>>
>> INSERT IN
Hi Ken,
Changing the parallelism can affect the generation of input splits.
I had a look at BinaryInputFormat, and it adds a bunch of empty input
splits if the number of generated splits is less than the minimum number of
splits (which is equal to the parallelism).
See -->
https://github.com/apac