you to store operator state as BLOB directly if that would be a
>doable option for you.
>
>
>
> Sincere greetings
>
>
>
> Thias
>
> *From:* Zakelly Lan
> *Sent:* Wednesday, February 21, 2024 8:04 AM
> *To:* Lorenzo Nicora
> *Cc:* Flin
cce4ec1a5/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/rank/FastTop1Function.java#L165
>
> Best,
> Zakelly
>
> On Sat, Feb 17, 2024 at 1:48 AM Lorenzo Nicora
> wrote:
>
>> Hi Thias
>>
>> I considered CheckpointedFuncti
tate(…) you also get access to the state of a different operator
> key.
>
> snapshotState(…) is called as part of each checkpoint in order to
> store the data.
>
>
>
> Sincere greetings
>
>
>
> Thias
>
>
>
> *From:* Lorenzo Nicora
> *Sent:* Thursday, Fe
Hello everyone,
I have a convoluted problem.
I am implementing a KeyedProcessFunction that keeps some non-serializable
"state" in memory, in a transient Map (key = stream key, value = the
non-serializable "state").
I can extract a serializable representation to put in Flink state, and I
can load
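A minimal sketch of that pattern follows (the event type is reduced to String, and HeavyModel is a hypothetical stand-in for the non-serializable object): the heavyweight object lives only in a transient map, while its serializable form is kept in keyed state and used to rebuild the object lazily after a restore.

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class LazyStateFunction extends KeyedProcessFunction<String, String, String> {

    // hypothetical non-serializable working object
    public static class HeavyModel {
        private final StringBuilder buffer = new StringBuilder();
        void update(String event) { buffer.append(event); }
        byte[] toBytes() { return buffer.toString().getBytes(StandardCharsets.UTF_8); }
        static HeavyModel fromBytes(byte[] b) {
            HeavyModel m = new HeavyModel();
            m.buffer.append(new String(b, StandardCharsets.UTF_8));
            return m;
        }
    }

    // per-key working objects; transient, never checkpointed
    private transient Map<String, HeavyModel> models;
    // serializable representation, kept in Flink keyed state
    private transient ValueState<byte[]> snapshot;

    @Override
    public void open(Configuration parameters) {
        models = new HashMap<>();
        snapshot = getRuntimeContext().getState(
                new ValueStateDescriptor<>("model-snapshot", byte[].class));
    }

    @Override
    public void processElement(String event, Context ctx, Collector<String> out) throws Exception {
        HeavyModel model = models.get(ctx.getCurrentKey());
        if (model == null) {
            // first element for this key after start or restore:
            // rebuild the non-serializable object from its serialized form
            byte[] bytes = snapshot.value();
            model = bytes == null ? new HeavyModel() : HeavyModel.fromBytes(bytes);
            models.put(ctx.getCurrentKey(), model);
        }
        model.update(event);
        // keep the serializable representation up to date in keyed state
        snapshot.update(model.toBytes());
        out.collect(event);
    }
}

Note that the transient map itself grows with the number of keys seen on the subtask, so some eviction or TTL strategy may be needed for large key spaces.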
Hi team
In the Kafka Sink docs [1], with EXACTLY_ONCE it is recommended to set:
transaction_timeout > maximum_checkpoint_duration + maximum_restart_duration.
I understand transaction_timeout > maximum_checkpoint_duration,
but why add maximum_restart_duration?
If the application recovers from a ch
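For illustration, a hedged sketch of where that timeout is configured on the KafkaSink builder (broker, topic and transactional-id prefix are placeholders); the broker-side transaction.max.timeout.ms must be at least as large.

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceSinkExample {
    public static KafkaSink<String> buildSink() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("broker:9092")                   // placeholder
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("output-topic")                     // placeholder
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("my-app")                   // placeholder
                // must cover max checkpoint duration + max restart duration,
                // and must stay below the broker's transaction.max.timeout.ms
                .setProperty("transaction.timeout.ms", String.valueOf(15 * 60 * 1000))
                .build();
    }
}

The usual reasoning for the restart term: a transaction opened before a failure is only committed after the job has restarted from the last checkpoint, so it has to survive the whole restart window without being aborted by the broker.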
Hi
I understand the FileSystem DataStream FileSource remembers all the
processed files in state, forever.
This causes the state to grow unbounded, making FileSource impractical to
use in a stateful application.
Is there any known workaround?
Thanks
Lorenzo
he.org/confluence/pages/viewpage.action?pageId=184615300
> [2]
> https://github.com/apache/flink-ml/blob/master/flink-ml-iteration/src/main/java/org/apache/flink/iteration/Iterations.java
>
> Lorenzo Nicora 于2023年2月18日周六 17:00写道:
> >
> > Hi all,
> >
> > I am
Hi all,
I am trying to implement an iterative streaming job that processes the loop
with a KeyedProcessFunction.
I need a KeyedProcessFunction to use keyed state and to emit a side output
(which, after further transformations, becomes the feedback).
The problem is that IterativeStream.process() only accepts
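As an aside, a minimal sketch of the side-output part (types reduced to String, names are placeholders): the KeyedProcessFunction emits the feedback records through an OutputTag, and the downstream code retrieves them with getSideOutput.

import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class FeedbackEmitter extends KeyedProcessFunction<String, String, String> {

    // tag for the side output that later becomes the feedback edge
    public static final OutputTag<String> FEEDBACK = new OutputTag<String>("feedback") {};

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        out.collect(value);                 // main output
        ctx.output(FEEDBACK, value);        // side output, retrieved downstream
    }
}

// downstream:
//   SingleOutputStreamOperator<String> main = keyedStream.process(new FeedbackEmitter());
//   DataStream<String> feedback = main.getSideOutput(FeedbackEmitter.FEEDBACK);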
Hi
I have a linear streaming flow with a single source and multiple sinks to
publish intermediate results.
The time characteristic is Event Time, and I am adding
an AssignerWithPeriodicWatermarks immediately after the source.
I need to add a different assigner, in the middle of the flow, to change
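For reference, a hedged sketch of attaching a second periodic assigner in the middle of the flow with the pre-1.11 API mentioned above (MyEvent and the out-of-orderness bound are placeholders); the downstream assigner overrides the watermarks arriving from upstream at that point.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class MidFlowWatermarks {

    public static class MyEvent {          // placeholder event type
        public long timestamp;
    }

    public static DataStream<MyEvent> reassign(DataStream<MyEvent> midFlow) {
        // a second assigner attached downstream replaces the watermarks
        // produced by the one placed right after the source
        return midFlow.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
                    @Override
                    public long extractTimestamp(MyEvent event) {
                        return event.timestamp;
                    }
                });
    }
}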
Hi
I was wondering whether there is any reasonably optimised DynamoDB Sink.
I am surprised I only found some old, partial discussions about
implementing your own.
Am I the only one with the requirement of sending output to DynamoDB?
Am I missing something obvious?
I am obviously looking for an
inary libraries (specifically I
> think for me the issue was related to snappy), because my HADOOP_HOME was
> not (properly) set.
>
> I have never used S3 so I don't know if what I mentioned could be the
> problem here too, but worth checking.
>
> Best regards,
>
Hi
I need to run my streaming job as a *standalone* Java application, for
testing.
The job uses the Hadoop S3 FS and I need to test it (not a unit test).
The job works fine when deployed (I am using AWS Kinesis Data Analytics, so
Flink 1.8.2)
I have *org.apache.flink:flink-s3-fs-hadoop* as a "com
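For what it's worth, a hedged sketch of how such a standalone run can register the S3 filesystem, assuming flink-s3-fs-hadoop is on the classpath as a compile dependency; the explicit FileSystem.initialize call and the s3.* keys are the parts to double-check against your Flink version, and the bucket path is a placeholder.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StandaloneS3Job {
    public static void main(String[] args) throws Exception {
        // with flink-s3-fs-hadoop on the classpath, register the s3:// scheme
        // before the job graph is built
        Configuration fsConfig = new Configuration();
        fsConfig.setString("s3.access-key", "<access-key>");   // or rely on the default AWS credential chain
        fsConfig.setString("s3.secret-key", "<secret-key>");
        FileSystem.initialize(fsConfig);

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.readTextFile("s3://my-bucket/input/").print();      // placeholder pipeline
        env.execute("standalone-s3-test");
    }
}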
single record.
>
> Cheers,
> Till
>
> On Mon, Jun 29, 2020 at 3:52 PM Lorenzo Nicora
> wrote:
>
>> Hi
>>
>> My streaming job uses a set of rules to process records from a stream.
>> The rule set is defined in simple flat files, one rule per line.
>>
Hi
I need to set up a dockerized *session cluster* using Flink *1.8.2* for
development and troubleshooting. We are bound to 1.8.2 as we are deploying
to AWS Kinesis Data Analytics for Flink.
I am using an image based on the semi-official flink:1.8-scala_2.11
I need to add to my dockerized cluster
Hi
My streaming job uses a set of rules to process records from a stream.
The rule set is defined in simple flat files, one rule per line.
The rule set can change from time to time. A user will upload a new file
that must replace the old rule set completely.
My problem is with reading and updatin
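For what it's worth, a minimal sketch of the usual broadcast-state pattern for this (rule and event types reduced to String, source wiring omitted): each uploaded file becomes a single record carrying the whole rule set, which atomically replaces the previous one in broadcast state.

import java.util.List;

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class RuleBroadcastExample {

    // one entry under a fixed key: the complete, current rule set
    static final MapStateDescriptor<String, List<String>> RULES =
            new MapStateDescriptor<>("rules", Types.STRING, Types.LIST(Types.STRING));

    public static DataStream<String> apply(DataStream<String> events,
                                           DataStream<List<String>> ruleSets) {
        BroadcastStream<List<String>> broadcastRules = ruleSets.broadcast(RULES);

        return events
                .connect(broadcastRules)
                .process(new BroadcastProcessFunction<String, List<String>, String>() {

                    @Override
                    public void processBroadcastElement(List<String> newRules, Context ctx,
                                                        Collector<String> out) throws Exception {
                        // the whole rule set arrives as one record and replaces the old one
                        ctx.getBroadcastState(RULES).put("current", newRules);
                    }

                    @Override
                    public void processElement(String event, ReadOnlyContext ctx,
                                               Collector<String> out) throws Exception {
                        List<String> rules = ctx.getBroadcastState(RULES).get("current");
                        if (rules != null) {
                            // apply the rules to the event here (placeholder)
                            out.collect(event);
                        }
                    }
                });
    }
}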
tag missing values, use 0 = 01.01.1970 to identify missing values.
>
> Deep copies are used whenever the same record has to be used multiple
> times (state, broadcast). That's why I thought your idea of switching to
> POJOs asap should help. Where do you see issues?
>
> [1] htt
d they belong by using the IndexedRecord methods.
>
> [1]
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/test/resources/avro/user.avsc
> [2]
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/pom.xml
> [3] https://gist.github.com/AHeise/
Hi,
related to the same case I am discussing in another thread, but not related
to AVRO this time :)
I need to ingest files that an S3 Sink Kafka Connector periodically adds to
an S3 bucket.
Files are bucketed by date/time, as often happens.
Is there any way, using Flink only, to monitor a base-path
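One hedged sketch, assuming the legacy readFile API available around Flink 1.8-1.10 (bucket path and scan interval are placeholders): the monitoring source keeps re-scanning the base path and can descend into the date/time sub-directories.

import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

public class ContinuousS3Ingest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        String basePath = "s3://my-bucket/topics/my-topic/";       // placeholder
        TextInputFormat format = new TextInputFormat(new Path(basePath));
        format.setNestedFileEnumeration(true);   // descend into the date/time buckets

        DataStream<String> lines = env.readFile(
                format,
                basePath,
                FileProcessingMode.PROCESS_CONTINUOUSLY,
                60_000L);                        // re-scan the base path every 60 s

        lines.print();
        env.execute("continuous-s3-ingest");
    }
}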
>> if (conversion != null) {
>>   datum = readWithConversion(
>>       oldDatum, f.schema(), f.schema().getLogicalType(), conversion, in);
>> } else {
>>   datum = readWithoutConversion(oldDatum, f.schema(), in);
>> }
>>
>> getData().setField(r, f.name(), f.pos(), datum);
been
one of the most reactive and helpful I ever interacted with.
On Thu, 11 Jun 2020 at 10:25, Guowei Ma wrote:
> Hi,
> for the 1.8.2(case 1) you could try the format.setReuseAvroValue(false);
>
> Best,
> Guowei
>
>
> Lorenzo Nicora 于2020年6月11日周四 下午5:02写道:
>
>>
d also never cache values like
> class ShiftElements extends MapFunction {
> Object lastElement;
>
> Object map(Object newElement, Collector out) {
> out.collect(lastElement);
> lastElement = newElement; // <- never cache with enableObjectReuse
> }
> }
>
t serializer (but that would cause back and forth type
> transformation).
>
> On Wed, Jun 10, 2020 at 4:31 PM Lorenzo Nicora
> wrote:
>
>> Thanks Timo,
>>
>> the stacktrace with 1.9.2-generated specific file is the following
>>
>>
a test. From the error
> description and the ticket, it looks like the issue is not the
> AvroInputFormat, but the serializer. So it would probably work with a
> different serializer (but that would cause back and forth type
> transformation).
>
> On Wed, Jun 10, 2020 at 4:31 PM Loren
mental in Avro. Maybe 1.9.2 has finally fixed the biggest
> shortcomings such that Flink can properly support them as well.
>
> Regards,
> Timo
>
> [1]
>
> https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/Avr
Hi,
I need to continuously ingest AVRO files as they arrive.
Files are written by an S3 Sink Kafka Connector, but S3 is not the point here.
I started by trying to ingest a static set of files from the local fs first,
and I am having weird issues with AVRO deserialization.
I have to say, the records contai
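For completeness, a hedged sketch of reading such files as GenericRecord with an explicit Avro type information (path and schema are placeholders), which should keep the records out of Kryo.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.avro.AvroInputFormat;
import org.apache.flink.formats.avro.typeutils.GenericRecordAvroTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AvroIngest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // placeholder schema; in practice load the writer schema of the files
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"User\",\"fields\":"
              + "[{\"name\":\"name\",\"type\":\"string\"}]}");

        // read the Avro files as GenericRecord to sidestep generated-class issues
        AvroInputFormat<GenericRecord> format =
                new AvroInputFormat<>(new Path("file:///data/avro/"), GenericRecord.class);

        DataStream<GenericRecord> records =
                env.createInput(format, new GenericRecordAvroTypeInfo(schema));
        records.print();
        env.execute("avro-ingest");
    }
}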