Hi,
I am currently using FlinkDynamoStreamsConsumer in production. For monitoring
lag I am relying on the millisBehindLatest metric, but it always returns -1
even when the DynamoDB stream already contains millions of records upfront.
Also, it would be great if we could add documentation mentioning that Flink
su
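For context, a minimal sketch of this kind of consumer setup, assuming the Kinesis connector's FlinkDynamoDBStreamsConsumer; the stream ARN, region, and initial position below are placeholders, not values from the original report:
```
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kinesis.FlinkDynamoDBStreamsConsumer;
import org.apache.flink.streaming.connectors.kinesis.config.ConsumerConfigConstants;

public class DynamoStreamsJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties config = new Properties();
        config.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1"); // placeholder region
        // TRIM_HORIZON starts from the oldest available record, i.e. the existing backlog.
        config.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "TRIM_HORIZON");

        env.addSource(new FlinkDynamoDBStreamsConsumer<>(
                "arn:aws:dynamodb:...:table/my-table/stream/...", // placeholder stream ARN
                new SimpleStringSchema(),
                config))
           .print();

        env.execute("dynamo-streams-lag-check");
    }
}
```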
Hi Fabian,
Thank you for the response. I am currently using .writeAsText() to write out
9 different datastreams in one Flink job, since I am writing my original
datastream with various filters applied to it. I usually see around 6-7 of my
datastreams successfully list the JSON file in my S3 buc
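Roughly, such a job might look like the following sketch; the bucket paths and filter predicates are placeholders:
```
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.streaming.api.datastream.DataStream;

// One source stream, several filtered copies, each written to its own S3 path.
DataStream<String> events = env.addSource(/* ... */);

events.filter(e -> e.contains("login"))      // placeholder filter
      .writeAsText("s3://my-bucket/login/", FileSystem.WriteMode.OVERWRITE)
      .setParallelism(1);

events.filter(e -> e.contains("purchase"))   // placeholder filter
      .writeAsText("s3://my-bucket/purchase/", FileSystem.WriteMode.OVERWRITE)
      .setParallelism(1);
```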
Recently we moved from Oracle to Cassandra. In Oracle we were heavily using
advanced analytical functions such as LAG, LEAD, and MATCH_RECOGNIZE.
I was trying to identify equivalent functionality in Java, and came across
Apache Flink; however, I'm not sure if I should use that library in
stand-alone
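For what it's worth, Flink SQL has supported MATCH_RECOGNIZE since 1.7; a minimal sketch, where the Ticker table, its columns, and the tableEnv variable are hypothetical:
```
// Detect a strictly rising price run per symbol, roughly analogous to
// Oracle's MATCH_RECOGNIZE usage.
Table result = tableEnv.sqlQuery(
    "SELECT * FROM Ticker " +
    "MATCH_RECOGNIZE ( " +
    "  PARTITION BY symbol " +
    "  ORDER BY rowtime " +
    "  MEASURES START_ROW.price AS start_price, LAST(UP.price) AS last_price " +
    "  ONE ROW PER MATCH " +
    "  AFTER MATCH SKIP PAST LAST ROW " +
    "  PATTERN (START_ROW UP+) " +
    "  DEFINE UP AS UP.price > PREV(UP.price) " +
    ")");
```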
Hello everybody,
Has anyone tried testing AggregateFunction() and ProcessWindowFunction() on a
KeyedDataStream? I have reviewed the testing page on Flink’s official website
(https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/testing.html)
and I am not quite sure how I could utiliz
Flink is merely stream processing. I would not use it in a synchronous web call.
However, I would not make any complex analytic function available on a
synchronous web service. I would deploy a messaging bus (RabbitMQ, ZeroMQ, etc.)
and send the request there (if the source is a web app potentially
Thanks for the feedback; I am clear about non-blocking interfaces now.
However, can you clarify or point me to any other libraries that can be
used with Java collections for complex analytics?
On Mon, Oct 28, 2019, 11:29 Jörn Franke wrote:
> Flink is merely stream processing. I would not use it in a syn
Hi Pankaj,
It seems it is a bug. You can report it by opening a Jira issue.
Best,
Vino
Pankaj Chand wrote on Mon, Oct 28, 2019 at 10:51 AM:
> Hello,
>
> I am trying to modify the parallelism of a streaming Flink job (wiki-edits
> example) multiple times on a standalone cluster (one local machine) having
> t
It seems to be the case. But when I use timeWindow or CEP with
fromCollection, it works well. For example,
```
sEnv.fromCollection(Seq[Long](1, 1002, 2002, 3002))
  .assignAscendingTimestamps(identity[Long])
  .keyBy(_ % 2)
  .timeWindow(Time.seconds(1))
  .sum(0)
  .print()
```
prints
```
1
1002
2002
3002
```
Before a program closes, it emits Long.MaxValue as the watermark, and that
watermark triggers all the windows. This is the reason why your
`timeWindow` program could work. However, for the first program, you have not
registered the event-time timer (though
context.timerService.registerEven
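To make the truncated suggestion concrete, a sketch of registering an event-time timer in a KeyedProcessFunction; the Event type is hypothetical:
```
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class BufferingFunction extends KeyedProcessFunction<String, Event, Event> {

    @Override
    public void processElement(Event value, Context ctx, Collector<Event> out) {
        // Fires once the watermark passes this element's timestamp; the
        // windowed job gets an equivalent trigger from the final
        // Long.MaxValue watermark.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp());
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Event> out) {
        // Drain whatever was buffered up to `timestamp` here.
    }
}
```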
Hi Michael,
You may want to look at the `KeyedOneInputStreamOperatorTestHarness` test class.
You can consider
`WindowTranslationTest#testAggregateWithWindowFunctionEventTime` or
`WindowTranslationTest#testAggregateWithWindowFunctionProcessingTime` [1] (both
of them call `processElementAndEnsureOutput`) as
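A rough sketch of how the harness is typically wired up; the operator under test and the types here are assumptions:
```
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;
import org.apache.flink.streaming.util.KeyedOneInputStreamOperatorTestHarness;

// `operator` is the OneInputStreamOperator<Long, Long> under test (hypothetical).
KeyedOneInputStreamOperatorTestHarness<String, Long, Long> harness =
        new KeyedOneInputStreamOperatorTestHarness<>(
                operator,
                value -> "key",                   // key selector
                BasicTypeInfo.STRING_TYPE_INFO);  // key type

harness.open();
harness.processElement(new StreamRecord<>(1L, 100L));    // value with timestamp
harness.processWatermark(new Watermark(Long.MAX_VALUE)); // flush event-time windows
List<Object> output = new ArrayList<>(harness.getOutput());
harness.close();
```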
Hi Vino,
This is a great example – thank you!
It looks like I need to instantiate a StreamExecutionEnvironment in order to
get my OneInputStreamOperator. Would I need to set up a local Flink cluster using
MiniClusterWithClientResource in order to use StreamExecutionEnvironment?
Best,
Michael
Hi all,
I have my own stream operator which triggers an aggregation based on the
number of items received
(OneInputStreamOperator#processElement(StreamRecord)). However, it is
possible that my aggregation is never triggered if my operator does not receive
the maximum number of items that has been set. So, I need a timeo
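One way to get such a timeout, sketched below: arm a processing-time timer from inside the operator when a batch starts. The aggregation logic and constants here are hypothetical:
```
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

// Hypothetical: sums up to MAX_ITEMS elements, but also flushes after a
// timeout so a partial batch is not stuck waiting forever.
public class CountOrTimeoutOperator extends AbstractStreamOperator<Long>
        implements OneInputStreamOperator<Long, Long> {

    private static final int MAX_ITEMS = 100;       // assumed batch size
    private static final long TIMEOUT_MS = 60_000L; // assumed timeout

    private long sum;
    private int count;

    @Override
    public void processElement(StreamRecord<Long> element) throws Exception {
        if (count == 0) {
            // Arm a processing-time timeout when a new batch starts.
            long now = getProcessingTimeService().getCurrentProcessingTime();
            getProcessingTimeService().registerTimer(now + TIMEOUT_MS, ts -> flush());
        }
        sum += element.getValue();
        if (++count >= MAX_ITEMS) {
            flush();
        }
    }

    private void flush() {
        if (count > 0) {
            output.collect(new StreamRecord<>(sum));
            sum = 0;
            count = 0;
        }
    }
}
```
A real implementation would also need to handle stale timers, e.g. ignore a timeout that belongs to a batch that was already flushed by the count condition.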
Hi Michael,
From the WindowTranslationTest, I did not see anything about the
initialization of a mini-cluster. Here we are testing an operator; it seems
the operator test harness provides the necessary infrastructure.
You can try to see if anything is missing.
Best,
Vino
Nguyen, Michael wrote on 2019
The reason why the watermark is not advancing is that
assignAscendingTimestamps is a periodic watermark generator. This
style of watermark generator is called at regular intervals to create
watermarks -- by default, this is done every 200 msec. With only a
tiny bit of data to process, the job doesn
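The interval is configurable; a small sketch:
```
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
// Periodic watermark generators such as assignAscendingTimestamps are
// polled at this interval; the default is 200 ms.
env.getConfig().setAutoWatermarkInterval(50L);
```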
Hi,
I am trying to execute Wordcount.jar on Flink 1.8.1 with Hadoop version 2.6.5.
HDFS is enabled with Kerberos+SSL. While writing output to HDFS, I am facing the
exception below and the job fails. Please let me know if you have any suggestions
for debugging this issue.
Caused by: org.apache.flink.runtime.
Thank you for your response. Registering a timer at Long.MaxValue works,
and I have found the mistake in my original code.
When a timer fires and there are elements in the priority queue with a
timestamp greater than the current watermark, they do not get processed. A new
timer should be registered for
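A sketch of the fix being described: after draining, re-register a timer for the head of the queue. The queue and the Event type are hypothetical:
```
// Inside a KeyedProcessFunction; `queue` is a hypothetical
// PriorityQueue<Event> ordered by event timestamp.
@Override
public void onTimer(long timestamp, OnTimerContext ctx, Collector<Event> out) {
    long watermark = ctx.timerService().currentWatermark();
    while (!queue.isEmpty() && queue.peek().timestamp <= watermark) {
        out.collect(queue.poll());
    }
    // Elements still newer than the watermark stay queued; register the next
    // timer so they are eventually processed instead of being dropped.
    if (!queue.isEmpty()) {
        ctx.timerService().registerEventTimeTimer(queue.peek().timestamp);
    }
}
```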
From the exception stack trace, it seems that the CryptoCodec is null. This may
occur when "hadoop.security.crypto.codec.classes.aes.ctr.nopadding" is
misconfigured. You could change the log level to "DEBUG" and it will show more
detailed information about why the CryptoCodec is null.
> On Oct 28, 2019
Hi,
From the debug logs I could see the lines below in the taskmanager. Please have a look.
org.apache.hadoop.ipc.ProtobufRpcEngine Call: addBlock took 372ms
org.apache.hadoop.ipc.ProtobufRpcEngine Call: addBlock took 372ms
org.apache.hadoop.hdfs.DFSClient pipeline = 10.76.113.216:1044
org.apache.hadoop
Thanks for your replies.
We use Flink from within a standalone Java 8 application (no Hadoop, no
clustering), so it basically boils down to running simple code like this:
import java.util.*;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.graph.*;
import org.ap
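For reference, a runnable sketch along those lines, with hypothetical edge data:
```
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.graph.Edge;
import org.apache.flink.graph.Graph;
import org.apache.flink.types.NullValue;

public class GellyStandalone {
    public static void main(String[] args) throws Exception {
        // Local environment: no cluster, runs inside the application's JVM.
        ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment();

        DataSet<Edge<Long, Double>> edges = env.fromElements(
                new Edge<>(1L, 2L, 1.0),   // hypothetical edges
                new Edge<>(2L, 3L, 1.0));

        Graph<Long, NullValue, Double> graph = Graph.fromDataSet(edges, env);
        System.out.println(graph.numberOfVertices());
    }
}
```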
I guess this is a bug in Hadoop 2.6.5 that has been fixed in Hadoop 2.8.0 [1].
You can work around it by explicitly setting the configuration
"hadoop.security.crypto.codec.classes.aes.ctr.nopadding" to
"org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,
org.apache.hadoop.crypto.JceAesCtrCryptoCod
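This property normally lives in core-site.xml; a programmatic sketch for completeness:
```
import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Listing both codecs lets Hadoop fall back to the pure-Java JCE codec
// when the native OpenSSL codec is unavailable.
conf.set("hadoop.security.crypto.codec.classes.aes.ctr.nopadding",
        "org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec,"
        + "org.apache.hadoop.crypto.JceAesCtrCryptoCodec");
```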
Thanks for the information. Without setting this parameter explicitly, is there
any possibility that it may work intermittently?
From: Dian Fu
Sent: Tuesday, October 29, 2019 7:12 AM
To: V N, Suchithra (Nokia - IN/Bangalore)
Cc: user@flink.apache.org
Subject: Re: Flink 1.8.1 HDFS 2.6.5 issue
I
Hi there,
I'm querying JSON data and it is working fine. I would like to add custom
fields alongside the query result.
My query looks like: select ROW(`source`), ROW(`destination`), ROW(`dns`),
organization, cnt from (select (source.`ip`,source.`isInternalIP`) as
source, (destination.`ip`,destinatio
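If the goal is to add constant or computed columns next to the queried fields, one option, sketched with hypothetical table and column names:
```
// Hypothetical: add a literal column and a processing timestamp to the result.
Table result = tableEnv.sqlQuery(
    "SELECT " +
    "  'my-pipeline' AS pipeline_name, " +  // custom constant field
    "  CURRENT_TIMESTAMP AS processed_at, " +
    "  organization, cnt " +
    "FROM enriched_events");
```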