Good to know!
On January 10, 2017 at 1:06:29 PM, Renjie Liu (liurenjie2...@gmail.com) wrote:
Hi, all:
I used the Kafka connector 0.10 and the problem is fixed. I think this may have
been caused by an incompatibility between the 0.9 consumer and the 0.10 broker.
Thanks Henri and Gordon.
On Tue, Jan 10, 2017 at 4:46 AM H
Hi, all:
I used the Kafka connector 0.10 and the problem is fixed. I think this may have
been caused by an incompatibility between the 0.9 consumer and the 0.10 broker.
Thanks Henri and Gordon.
On Tue, Jan 10, 2017 at 4:46 AM Henri Heiskanen
wrote:
> Hi,
>
> We had the same problem when running 0.9 consumer against 0.10
Hi Telco,
What do you mean by the “keyBy value”? Is it the string parameter value,
i.e. “partition” in your case, or the real key value of an actual element
being processed?
If you mean the string parameter value, it seems that this is currently not
supported. If you mean the latter one,
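A minimal sketch of the distinction (MyEvent and getPartition() are assumed names, not from your code):

    KeyedStream<MyEvent, Tuple> keyed = stream.keyBy("partition"); // the string parameter value

    keyed.map(new MapFunction<MyEvent, String>() {
        @Override
        public String map(MyEvent event) {
            // the real key value of the element currently being processed
            return event.getPartition();
        }
    });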
Hi,
I found the root cause of the problem: the listEligibleFiles method in
ContinuousFileMonitoringFunction scans only the topmost files and ignores
the nested files. By fixing that I was able to get the expected output. I
created a Jira issue: https://issues.apache.org/jira/browse/FLINK-5432.
@Ko
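For reference, a rough sketch of the recursive variant (not the actual patch attached to FLINK-5432, just the idea), using Flink's FileSystem API:

    private List<FileStatus> listEligibleFiles(FileSystem fileSystem, Path path) throws IOException {
        List<FileStatus> files = new ArrayList<>();
        for (FileStatus status : fileSystem.listStatus(path)) {
            if (status.isDir()) {
                // recurse into nested folders instead of ignoring them
                files.addAll(listEligibleFiles(fileSystem, status.getPath()));
            } else {
                files.add(status);
            }
        }
        return files;
    }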
Hi,
Unfortunately I cannot use a reduce function.
I am now going with a WindowFunction and will see how it works on our
production load.
Br,
Henkka
On Wed, Jan 4, 2017 at 2:46 PM, Fabian Hueske wrote:
> Hi Henri,
>
> can you express the logic of your FoldFunction (or WindowFunction) as a
> combinatio
Hi,
We have been using HashMap and it has been working fine so far.
Br,
Henkka
On Mon, Jan 9, 2017 at 5:35 PM, Aljoscha Krettek
wrote:
> You could try using JSON for all your data; this might be slow, however.
> The other route, which I would suggest, is to have your own custom
> TypeSerializers
Hi,
We had the same problem when running the 0.9 consumer against 0.10 Kafka.
Upgrading the Flink Kafka connector to 0.10 fixed our issue.
Br,
Henkka
On Mon, Jan 9, 2017 at 5:39 PM, Tzu-Li (Gordon) Tai
wrote:
> Hi,
>
> Not sure what might be going on here. I’m pretty certain that for
> FlinkKafkaConsu
Hi Tzu-Li,
Huge thanks for the input, I'll try to implement a prototype of your idea and
see if it meets my requirements.
On 9 January 2017 at 08:02, Tzu-Li (Gordon) Tai wrote:
> Hi Igor!
>
> What you can actually do is let a single FlinkKafkaConsumer consume from
> both topics, producing a sing
So I created a minimal working example where this behaviour can still be
seen. It is 15 LOC and can be downloaded here:
https://github.com/JonasGroeger/flink-inetaddress-zeroed
To run it, use sbt:
If you don't want to do the above, fear not; here is the code:
For some reason, java.net.InetAddress
Hi,
We are using a Sliding Event Time Window with the Kafka Consumer. The window
size is 6 minutes, and the slide is 2 minutes. We have written a window
function to select a particular window out of multiple windows for a keyed
stream, e.g. we select about 16 windows out of multiple windows for the keyed st
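For context, a minimal sketch of that setup (Event, Result, isSelectedWindow and summarize are placeholders for our own types and logic):

    DataStream<Event> events = env.addSource(kafkaConsumer)
            .assignTimestampsAndWatermarks(new MyTimestampExtractor());

    events.keyBy("id")
          .window(SlidingEventTimeWindows.of(Time.minutes(6), Time.minutes(2)))
          .apply(new WindowFunction<Event, Result, Tuple, TimeWindow>() {
              @Override
              public void apply(Tuple key, TimeWindow window,
                                Iterable<Event> input, Collector<Result> out) {
                  // only emit for the window(s) we are interested in
                  if (isSelectedWindow(window)) {
                      out.collect(summarize(key, input));
                  }
              }
          });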
Hi Kostas,
I debugged the code and the nestedFileEnumeration parameter was always true
during the execution. I noticed however that in the following loop
in ContinuousFileMonitoringFunction, for some reason, the fileStatus was
null for files in nested folders, and non-null for files directly under
Thanks a lot Aljoscha, this was a perfect answer to my vague question.
On 09-Jan-2017 4:52 pm, "Aljoscha Krettek" wrote:
Hi,
to clarify what Kostas said. A "single window" in this case is a window for
a given key and time period so the window for "key1" in time t1 to t2 can
be processed on a dif
Hi,
did you create a Jira issue for this? (I'm just getting up to speed after
vacation, so sorry if you already did this; I didn't yet read the Jira mail.)
Cheers,
Aljoscha
On Fri, 6 Jan 2017 at 19:08 Stephan Ewen wrote:
> I think you are right, enabling checkpointing should not override the
> c
Hi,
to clarify what Kostas said. A "single window" in this case is a window for
a given key and time period so the window for "key1" in time t1 to t2 can
be processed on a different machine from the window for "key2" in time t1
to t2.
Cheers,
Aljoscha
On Thu, 5 Jan 2017 at 21:56 Kostas Kloudas
w
You could try using JSON for all your data; this might be slow, however.
The other route, which I would suggest, is to have your own custom
TypeSerializers that can efficiently deal with different types and dynamic
schemas.
Cheers,
Aljoscha
On Thu, 5 Jan 2017 at 07:02 ljwagerfield
wrote:
> I sh
I too was able to confirm that the issue was fixed in the 1.2
release-candidate 0 (zero) tag. I was unable to get it working with 1.1.4,
but with 1.2 it worked without any additional modifications.
Best,
--Scott Kidder
On Mon, Jan 9, 2017 at 7:03 AM, Aljoscha Krettek
wrote:
> I just verified t
Hi,
I think your approach with two window() operations is fine. There is no way
to retrieve the result from a previous window because it is not strictly
defined what the previous window is. Also, keeping data inside your user
functions (in fields) is problematic because these function instances are
Hi,
Not sure what might be going on here. I’m pretty certain that for
FlinkKafkaConsumer09 when checkpointing is turned off, the internally used
KafkaConsumer client will auto commit offsets back to Kafka at a default
interval of 5000ms (the default value for “auto.commit.interval.ms”).
Could
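For illustration, a hedged sketch of how those properties are typically passed to the consumer (topic, group id and broker address here are placeholders):

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("group.id", "my-group");
    // with checkpointing disabled, the internal KafkaConsumer commits on its own
    props.setProperty("enable.auto.commit", "true");
    props.setProperty("auto.commit.interval.ms", "5000"); // the 5000ms default mentioned above

    FlinkKafkaConsumer09<String> consumer =
            new FlinkKafkaConsumer09<>("my-topic", new SimpleStringSchema(), props);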
Yes, thanks for the effort. I will look into it.
Kostas
> On Jan 9, 2017, at 4:24 PM, Lukas Kircher
> wrote:
>
> Thanks for your suggestions:
>
> @Timo
> 1) Regarding the recursive.file.enumeration parameter: I think what counts
> here is the enumerateNestedFiles parameter in FileInputFormat
Hi,
Flink will serialise user functions when distributing work across the
cluster. Therefore your shared objects will not be shared objects anymore
once your program executes. You will still get object sharing because only
one instance of your function is used to process data on one parallel
instan
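As a small illustration (the class and field names here are made up): any field of the function is serialised and deserialised with it, so each parallel task ends up with its own copy.

    public class CachingMapper extends RichMapFunction<String, String> {
        // serialised together with the function; NOT shared across parallel instances
        private final Map<String, String> cache = new HashMap<>();

        @Override
        public String map(String value) {
            return cache.computeIfAbsent(value, v -> v.toUpperCase());
        }
    }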
Thanks for your suggestions:
@Timo
1) Regarding the recursive.file.enumeration parameter: I think what counts here
is the enumerateNestedFiles parameter in FileInputFormat.java. Calling the
setter for enumerateNestedFiles is expected to overwrite
recursive.file.enumeration. Not literally - I th
Hi,
I'm assuming you also have the call to assignTimestampsAndWatermarks()
somewhere in there as well, as in:
stream
    .assignTimestampsAndWatermarks(new TimestampGenerator()) // or somewhere else in the pipeline
    .keyBy("id")
    .map(...)
    .filter(...)
    .map(...)
    .keyB
I just verified that this does indeed work on master and the Flink 1.2
branch. I'm also pushing a test that verifies this.
Stephan is right that this doesn't work on Flink 1.1. Is that critical for
you? If yes, then we should think about pushing out another Flink 1.x
bugfix release.
Cheers,
Aljos
Hi,
it's not possible to access the key in the open method because without an
element that is being processed there is no key. The user function is being
used to produce elements of different keys that are being processed on the
same shard (instance of a parallel operator). You can get the key manu
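A minimal sketch of getting the key from each element instead (Event and MyKeySelector are assumed names):

    stream.keyBy(new MyKeySelector())
          .map(new RichMapFunction<Event, Event>() {
              private final MyKeySelector keySelector = new MyKeySelector();

              @Override
              public Event map(Event value) throws Exception {
                  // derive the key from the element itself; in open() there is no element yet
                  String key = keySelector.getKey(value);
                  // ... use the key, e.g. to look up per-key configuration
                  return value;
              }
          });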
Hi team,
any suggestions on the topic below?
I have a requirement to output two different values from a time window
reduce function. Here is the basic workflow:
1. fetch data from Kafka
2. flow the data to an event session window: kafka source -> keyBy ->
session window -> reduce
3. inside the
On Jan 9, 2017 3:41 PM, "Aljoscha Krettek" wrote:
> I think the split/select variant should be a bit faster because it creates
> fewer object copies internally. It should also be more future-proof because
> it will benefit from improvements (if any) in the way split/select works.
>
> There is also
Hi Yassine,
I suspect that the problem is in the way the input format (and not the reader)
scans nested files,
but could you see if in the code that is executed by the tasks, the
nestedFileEnumeration parameter is still true?
I am asking in order to pin down if the problem is in the way we shi
Hi Lukas,
Are you sure that the tempFile.deleteOnExit() does not remove the files before
the test completes?
I am just asking to be sure.
Also from the code, I suppose that you run it locally. I suspect that the
problem is in the way the input
format scans nested files, but could you see if in
+Till, off the top of your head, do you know how this could be done?
On Fri, 23 Dec 2016 at 11:21 Jim Raney wrote:
> Hello all,
>
> Is there any simple way to get the current address of the jobmanager
> cluster leader from any of the jobmanagers? I know it gives you a
> redirect on the webui, bu
I think the split/select variant should be a bit faster because it creates
fewer object copies internally. It should also be more future-proof because
it will benefit from improvements (if any) in the way split/select works.
There is also some ongoing work in adding support for side outputs which
a
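For reference, a rough sketch of the split/select variant (Result, isTypeA() and the windowed stream variable are placeholder names):

    SplitStream<Result> split = windowed.split(new OutputSelector<Result>() {
        @Override
        public Iterable<String> select(Result value) {
            // route each element to one of the named outputs
            return Collections.singletonList(value.isTypeA() ? "a" : "b");
        }
    });

    DataStream<Result> streamA = split.select("a");
    DataStream<Result> streamB = split.select("b");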
Hi Lukas,
have you tried to set the parameter "recursive.file.enumeration" to true?

    // create a configuration object
    Configuration parameters = new Configuration();
    // set the recursive enumeration parameter
    parameters.setBoolean("recursive.file.enumeration", true);

If this also does not
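If it helps, a sketch of how that configuration is usually attached to the source in the DataSet API (the input path is a placeholder):

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    Configuration parameters = new Configuration();
    parameters.setBoolean("recursive.file.enumeration", true);

    DataSet<String> lines = env.readTextFile("hdfs:///input/dir")
                               .withParameters(parameters);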
Hi all,
this is probably related to the problem that I reported in December. In case it
helps you can find a self-contained example below. I haven't looked deeply into
the problem but it seems like the correct file splits are determined but
somehow not processed. If I read from HDFS nested file
Hi,
Any updates on this issue? Thank you.
Best,
Yassine
On Dec 20, 2016 6:15 PM, "Aljoscha Krettek" wrote:
+Kostas, who probably has the most experience with this by now. Do you have
an idea what might be going on?
On Fri, 16 Dec 2016 at 15:45 Yassine MARZOUGUI
wrote:
> Looks like this is
I'm not a Kafka expert but maybe Gordon (in CC) knows more.
Timo
Am 09/01/17 um 11:51 schrieb Renjie Liu:
Hi, all:
I'm using Flink 1.1.3 and the Kafka consumer 09. I read its code and it
says that the Kafka consumer will turn on auto offset commit if
checkpointing is not enabled. I've turned off ch
Anything that I could check or collect for you for investigation?
On Sat, Jan 7, 2017 at 1:35 PM, Chakravarthy varaga <
chakravarth...@gmail.com> wrote:
> Hi Stephen
>
> . Kafka version is: 0.9.0.1 the connector is flinkconsumer09
> . The flatmap n coflatmap are connected by keyBy
> . No data is
Hi, all:
I'm using Flink 1.1.3 and the Kafka consumer 09. I read its code and it says
that the Kafka consumer will turn on auto offset commit if checkpointing is
not enabled. I've turned off checkpointing and it seems that the Kafka client
is not committing offsets to Kafka? The offset is important for helpin
Hey Dawid! Thanks for reporting this. I will try to have a look over
the course of the day. From a first impression, this seems like a bug
to me.
On Sun, Jan 8, 2017 at 4:43 PM, Dawid Wysakowicz
wrote:
> Hi I was experimenting with the Query State feature and I have some problems
> querying the s
Hi,
I am running Flink on YARN mode. My app consumes data from Kafka and then
does some internal processing and then indexes the data to Solr cloud. I
have a big Solr cloud (one collection for each month of a language and the
collection name is decided from the date of the message read from kafka
Hi Niels,
Thank you for bringing this up. I recall there was some previous discussion
related to this before: [1].
I don’t think this is possible at the moment, mainly because of how the API is
designed.
On the other hand, a KeyedStream in Flink is basically just a DataStream with a
hash part