Sounds to me like, for your use case, you could check out sliding windows.
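For reference, a minimal sketch of what I mean (the window sizes, key selector, and window function are made-up placeholders):

stream
        .keyBy(event -> event.getKey())
        .window(SlidingEventTimeWindows.of(Time.minutes(10), Time.minutes(1)))
        .apply(new MyWindowFunction()); // hypothetical WindowFunction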
I think seeing the watermark in the UI is possible now, or you can use debug mode to see it.
The watermark you use won't wait for all topics (partitions). It's possible if you implement your own watermark assigner.
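If it helps, a sketch of what I mean (an assumption on my side: this needs a Flink version where the Kafka consumer itself accepts the assigner; class names reuse my other snippets):

FlinkKafkaConsumer09<JSONObject> consumer =
        new FlinkKafkaConsumer09<>(topicList, new JSONSchema(), properties);
// The assigner then runs per Kafka partition; the operator watermark is the
// minimum over all partitions, so it effectively waits for every partition.
consumer.assignTimestampsAndWatermarks(new CorrelationWatermark());
DataStream<JSONObject> streams = env.addSource(consumer);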
Cheers,
Sendoh
If you don't see any event, it means the window is not triggered, which would mean the watermark is not increasing. The issue can be that the timestamp is not extracted correctly, or that you are missing the trigger because the window function you use doesn't come with one.
Cheers,
Sendoh
As I understood it, if you want a smooth shutdown, it's recommended to throw an exception; cancel() is then called, and there you can even add your own logic.
I don't think it's the same as System.exit().
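A minimal sketch of that pattern, assuming a custom source (class and helper names are made up):

import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class FooSource implements SourceFunction<String> {
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (running) {
            ctx.collect(fetchNext());
        }
        // cancel() was called; close connections and clean up here
    }

    @Override
    public void cancel() {
        running = false;
    }

    private String fetchNext() {
        return "event"; // placeholder for real I/O
    }
}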
Cheers,
Sendoh
Hi,
Isn't an accumulator what fits your use case? Accumulators are shared.
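A sketch of what I mean, assuming a rich function so getRuntimeContext() is available (class and accumulator names are made up):

import org.apache.flink.api.common.accumulators.IntCounter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class CountingMapper extends RichMapFunction<String, String> {
    private final IntCounter counter = new IntCounter();

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext().addAccumulator("num-elements", counter);
    }

    @Override
    public String map(String value) {
        counter.add(1);
        return value;
    }
}

The merged result should then be readable from the JobExecutionResult returned by env.execute().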
I think the first requirement is possible by using an accumulator or a metric, no?
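For the metric variant, a sketch (assuming a Flink version with the metric system; the class and counter names are made up):

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

public class MeteredMapper extends RichMapFunction<String, String> {
    private transient Counter counter;

    @Override
    public void open(Configuration parameters) {
        counter = getRuntimeContext().getMetricGroup().counter("myCounter");
    }

    @Override
    public String map(String value) {
        counter.inc();
        return value;
    }
}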
Cheers,
Sendoh
a type
is not supporting tumbling windows, no matter whether I use Types.LONG() or Types.SQL_TIMESTAMP().
Is there anything else I should notice?
Best,
Sendoh
les",
"country"},
new TypeInformation[] { Types.STRING(),
Types.DOUBLE(), Types.INT()}
);
Cheers,
Sendoh
Thanks! I didn't know this works as well.
Cheers,
Sendoh
Found it. I should use .returns(typeInformation) after the map function.
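In case it helps someone later, roughly like this (toRow() and typeInformation are placeholders for whatever your map produces):

DataStream<Row> result = stream
        .map(value -> toRow(value))
        .returns(typeInformation); // hints the erased result type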
://gist.github.com/HungUnicorn/8a5c40fcf1e25c51cf77dc24a227d6d4
Best,
Sendoh
I would recommend also printing the input and output counts of each operator by using an accumulator.
Cheers,
Sendoh
Exactly! I initially thought this class was in the Table API. I was building a custom table source and found I had to add the Kafka connector dependency for reading JSON-encoded data, although my table source doesn't need it.
Cheers,
Hung
Shouldn't JsonRowDeserializationSchema be more general?
For example, in our case we want to deserialize JSON objects received via the REST API, not through Kafka.
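What I have in mind, as a sketch (assuming the constructor that takes a TypeInformation<Row>; the field names are made up):

TypeInformation<Row> typeInfo = new RowTypeInfo(
        new TypeInformation[]{ Types.STRING(), Types.INT() },
        new String[]{ "name", "count" });
JsonRowDeserializationSchema schema = new JsonRowDeserializationSchema(typeInfo);
Row row = schema.deserialize(jsonBytes); // bytes from a REST call, not Kafka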
Best,
Sendoh
flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/ElasticsearchSinkBase.java
Cheers,
Sendoh
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi Flink users,
Did someone use an accumulator with the Elasticsearch sink? So we can better compare the last timestamps in the sink with the last timestamps in Elasticsearch, in order to see how long it takes from the Elasticsearch sink to Elasticsearch.
Best,
Sendoh
(mappers/reducers). But what would the latter look like? Would it be to put the data in the ExecutionConfig and let the workers read the ExecutionConfig repeatedly? Would there be a critical factor that determines which one is better than the other?
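For the first option (broadcast variables, DataSet API), my understanding is roughly this (names are made up):

DataSet<Integer> toBroadcast = env.fromElements(1, 2, 3);

data.map(new RichMapFunction<String, String>() {
    private List<Integer> shared;

    @Override
    public void open(Configuration parameters) {
        // materialized once per task, not re-read for every record
        shared = getRuntimeContext().getBroadcastVariable("shared");
    }

    @Override
    public String map(String value) {
        return value + "/" + shared.size();
    }
}).withBroadcastSet(toBroadcast, "shared");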
Best,
Sendoh
    }

    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(currentMaxTimestamp - maxTimeLag);
    }
}
Best,
Sendoh
es are not busy (more memory left and task slots)? Or does the job manager not consider that?
Best,
Sendoh
Could I also ask whether this behavior is the same on master?
I saw that when the master uses more than 100% of memory (starting a new job uses 35%, and the master already uses 70%), Ubuntu shuts down and restarts.
Best,
Sendoh
Hi Fabian,
Thank you for the quick reply. I run the job in a streaming environment.
So I think in a streaming environment memory is allocated up to the configured amount and never returned until Flink is shut down, as you said, if I understand correctly.
Best,
Sendoh
itself, or something else.
Best,
Sendoh
size limit, batch size and so on).
The data accessed by this Flink job is quite small. I would estimate it to
be no more than 50 MB, and each node has 2 GB of RAM. Is there anything you could suggest I work on?
Best,
Sendoh
Hi Robert,
Could I ask which endpoint you use to get the memory statistics of a Flink
job? I checked here but don't know which one to use.
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/rest_api.html
Or should we put the memory in the metrics?
Best,
Sendoh
,
        LOG_PATH);
StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironment(4, ENVCONFIG);

I got a 404. Could this be a problem with the version of flink-runtime-web?
Best,
Sendoh
ob manager?
The savepoint seems to need to be stored somewhere the job manager can find it, so that it can start the job again. It looks like this can work without S3 or HDFS; is that true? (If so, it means we can use EBS.)
Best,
Sendoh
lance();
DataStream smallStream = env.setParallelism(1).addSource(
        new FooSource(properties, smallTopic)
).rebalance();
env.setParallelism(3);
// do .map(), window(), ...
Would it have the same effect?
Best,
Sendoh
Found the reason.
I saw that with ParallelSourceFunction my overridden open() is called 4 times, whereas with SourceFunction open() is called only once; my overridden open() constructs the connections to the sources, which determines how many sources are going to be read.
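In other words, a sketch (connect() and readNext() are my own placeholders):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

public class FooSource extends RichParallelSourceFunction<String> {
    private volatile boolean running = true;

    @Override
    public void open(Configuration parameters) throws Exception {
        connect(); // with parallelism 4 this runs in 4 subtasks -> 4 connections
    }

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (running) {
            ctx.collect(readNext());
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    private void connect() { /* placeholder for real connection setup */ }

    private String readNext() { return "event"; /* placeholder */ }
}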
Cheers,
Sendoh
Probably I didn't
understand SourceFunction and ParallelSourceFunction correctly.
Best,
Sendoh
    }

    @Override
    public void run(final SourceContext ctx) throws Exception {
        LOG.info("Processing");
        // It won't work; however, a parallel for-loop is fine for
        // performance concerns?
        for (FooSource fooSource : fooSourceList) {
            fooSource.run(ctx);
have as expected, we can still union those two streams into one stream and use a similar solution.
Best,
Sendoh
Thanks, I followed your suggestion and it works as we expected.
Best,
Sendoh
Hi Flink users,
Can I ask whether my understanding of onEventTime() is correct?
In my custom trigger, I have something like the following:
onElement(JSONObject element, long timestamp, W window, TriggerContext ctx) {
    if (count == 3) {
        ctx.registerEventTimeTimer(ctx.getCurrentWatermark() + 10);
        return Trigger
Hi Flink users,
Can I ask how to avoid the default allowedLateness(0), so that late events become single-element windows, as version 1.0.3 behaves?
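For reference, the knob I mean looks roughly like this (my assumption: it exists from 1.2 on, and it re-fires the existing window for late events rather than creating single-element windows):

stream
        .keyBy(event -> event.getKey())
        .window(TumblingEventTimeWindows.of(Time.minutes(1)))
        .allowedLateness(Time.minutes(10)) // default 0: late events are dropped
        .apply(new MyWindowFunction());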
Best,
Sendoh
Thank you for your reply.
It sounds to me like this should not be an error that brings the job manager down, or can it?
Currently we use 1.1.3.
Best,
Sendoh
Hi Flink users,
Suddenly I discovered this "Could not find job with id". What would be the possible causes for this?
It would be good to know the job name for that job id, but I can neither go to the web UI nor use ./bin/flink list.
2016-11-16 16:26:21,276 WARN
org.apache.flink.runtime.webmonitor.Ru
    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(currentMaxTimestamp - maxTimeLag);
    }
}
Cheers,
Sendoh
split(), or would implementing an event-type-aware AssignerWithPeriodicWatermarks along with a custom EventTimeTrigger be the solution?
Best,
Sendoh
, 10-30-XX
.
.
.
eventA, 11-04-XX
eventA is much, much larger than eventB, and it looks like we lost the counts of eventA at 10-29 and 10-30, while we have the count of eventA at 11-04-XX.
Could the problem be that the watermark is global rather than per event type?
Best,
Sendoh
only reads eventA, we can see all of them.
It looks like data is stuck in that operator, and the watermark of the event that should trigger the window comes too late when there is a lot of data, or?
Best,
Sendoh
parallelisms for the window operation if reprocessing a skewed input from Kafka, because it works with fewer events, and small topics always appear while big topics disappear.
Best,
Sendoh
ollector) throws Exception {
        Iterator it = iterable.iterator();
        while (it.hasNext()) {
            collector.collect(it.next());
        }
    }
}).writeAsText("test", FileSystem.WriteMode.OVERWRITE).setParallelism(1);
Is there any sugges
Hi Flink users,
I saw a strange behavior: data is missing with reduce() but not with apply(). We don't see this behavior with 1.0.3, but we do with 1.1.3. Losing data means we don't see any events for the assigned keys at all, not just a lower count of events.
The code is as follows.
DataStream> s
Thank you for your reply.
Does it mean that when we call the monitoring REST API (/joboverview/completed), we actually query the execution graphs?
I thought the REST API read a log file somewhere.
Cheers,
Sendoh
Hi Flink users,
Does anyone know how to clean up the job history, especially for failed Flink jobs?
Our use case is to clean up the history of failed jobs after we have seen them.
Best,
Sendoh
reams = env.addSource(
        new FlinkKafkaConsumer09<>(topicList, new JSONSchema(), properties))
        .rebalance()
        .assignTimestampsAndWatermarks(new CorrelationWatermark());
Best,
Sendoh
, and it would look like joining the configuration of the Kafka connector with the data stream?
With this feature we could append static data to each event in the stream, which would be very useful.
Best,
Sendoh
Probably `processAt` is not used adequately, because after increasing the maxDelay in the watermark to 10 minutes it works as expected.
Is there any upper limit for this maxDelay? Because there might be too many windows waiting for the last element?
Best,
Sendoh
Thank you for helping with the issue.
Those single-element windows arrive within seconds, and the delay configured in the watermark is 6 seconds.
Here are some samples from the investigation:
...
{"hashCode":-1798107744,"count":1,"processAt":"2016-08-01T11:08:05.846","startDate":"2016-07-19T21:34:00.
        return timestamp;
    }

    @Override
    public Watermark getCurrentWatermark() {
        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
    }
}
We have no problem with a smaller Kafka topic on Flink 1.0.3. Did we make a mistake somewhere?
Please let me know if any further information is required to resolve this
issue.
Best,
Sendoh
Thank you. It works exactly as we want, unioning the data streams.
Best,
Sendoh
Iterator<DataStream> streamsIt = streams.iterator();
DataStream currentStream = streamsIt.next();
while (streamsIt.hasNext()) {
    DataStream nextStream = streamsIt.next();
    currentStream = currentStream.union(nextStream);
}
Cheers,
Sendoh
s/437
Best,
Sendoh
properties.setProperty("bootstrap.servers", Config.bootstrapServers);
properties.setProperty("group.id", parameter.getRequired("groupId"));
properties.setProperty("auto.offset.reset", Config.autoOffsetReset);
Best,
Sendoh
Glad to see it's developing.
Can I ask whether the same feature (reconnect) would be useful for the Kafka connector too?
For example, if the IP of a broker changes.
Flink environment create a new data sink?
We use the Flink elasticsearch-connector2 (for Elasticsearch 2.x) on AWS.
Best,
Sendoh
Maybe you can refer to this: Kafka + Flink
http://data-artisans.com/kafka-flink-a-practical-how-to/
ter is, which sounds not very robust?
I would be glad to know of any better implementation and any mistakes I have made.
Best,
Sendoh
Thank you! Totally works.
Best,
Sendoh
();
env_config.setBoolean("printProgressDuringExecution", false);
StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironment(4, env_config);
env.execute();
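One related thing I might try (an assumption on my side, not something verified): the ExecutionConfig also has disableSysoutLogging(), e.g.:

StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironment(4);
env.getConfig().disableSysoutLogging(); // suppress per-task progress output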
Do you have any other suggestions?
Best,
Sendoh