Hi Matt,
Here’s an example of writing a DeserializationSchema for your POJOs: [1].
As for simply writing messages from WebSocket to Kafka using a Flink job, while
it is absolutely viable, I would not recommend it,
mainly because you’d never know if you might need to temporarily shut down
Flink
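Just to sketch the idea behind [1]: a minimal DeserializationSchema for a POJO could look roughly like the following. This is only a sketch; "Event" is a hypothetical POJO, and I'm assuming a JSON wire format with Jackson on the classpath, so adapt it to your actual types and format.

  import java.io.IOException;
  import com.fasterxml.jackson.databind.ObjectMapper;
  import org.apache.flink.api.common.typeinfo.TypeInformation;
  import org.apache.flink.api.java.typeutils.TypeExtractor;
  import org.apache.flink.streaming.util.serialization.DeserializationSchema;

  // Sketch: "Event" stands in for your own POJO.
  public class EventDeserializationSchema implements DeserializationSchema<Event> {

      private transient ObjectMapper mapper;

      @Override
      public Event deserialize(byte[] message) throws IOException {
          if (mapper == null) {
              mapper = new ObjectMapper();
          }
          return mapper.readValue(message, Event.class);
      }

      @Override
      public boolean isEndOfStream(Event nextElement) {
          return false; // unbounded stream
      }

      @Override
      public TypeInformation<Event> getProducedType() {
          return TypeExtractor.getForClass(Event.class);
      }
  }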
Hi Philipp,
When used against Kinesalite, can you tell if the connector is already reading
data from the test shard before any
of the shard discovery messages? If you have any spare time to test this, you
can set a larger value for the
`ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS` in the consumer
properties.
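For example, roughly like this in the job setup (region, stream name and interval are just placeholders, and I'm leaving out whatever credentials/endpoint settings you use for Kinesalite):

  Properties consumerConfig = new Properties();
  consumerConfig.put(ConsumerConfigConstants.AWS_REGION, "us-east-1");
  // look for new shards only once a minute instead of the default interval
  consumerConfig.put(ConsumerConfigConstants.SHARD_DISCOVERY_INTERVAL_MILLIS, "60000");

  FlinkKinesisConsumer<String> consumer =
      new FlinkKinesisConsumer<>("test-stream", new SimpleStringSchema(), consumerConfig);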
"So in your case I would directly ingest my messages into Kafka"
I will do that through a custom SourceFunction that reads the messages from
the WebSocket, creates simple Java objects (POJOs), and sinks them into a Kafka
topic using a FlinkKafkaProducer, if that makes sense.
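Roughly, I'm thinking of a skeleton like the one below ("Event" is a placeholder for my POJO and readNextMessage() stands in for whatever WebSocket client I end up using), and then calling addSink(...) with the FlinkKafkaProducer on the resulting stream:

  import org.apache.flink.streaming.api.functions.source.SourceFunction;

  public class WebSocketSource implements SourceFunction<Event> {

      private volatile boolean running = true;

      @Override
      public void run(SourceContext<Event> ctx) throws Exception {
          // open the WebSocket connection here
          while (running) {
              Event event = readNextMessage(); // blocking read + mapping to the POJO
              synchronized (ctx.getCheckpointLock()) {
                  ctx.collect(event);
              }
          }
      }

      @Override
      public void cancel() {
          running = false;
      }

      private Event readNextMessage() throws Exception {
          // read from the WebSocket client of choice and build the POJO
          throw new UnsupportedOperationException("plug in the WebSocket client here");
      }
  }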
The problem now is I need
Hi Ufuk and Till,
Thanks a lot. Both of these suggestions were useful. An older version of Xerces
was being loaded from one of the dependencies, and I also fixed the
serialization glitch in my code, and now checkpointing works.
I have 5 value states apart from a custom trigger. Is
Hi Dominique,
I totally agree with you on opening a JIRA.
I would also suggest adding these “collision” checks
for other types, like in-progress and pending, for example.
Thanks for reporting it,
Kostas
> On Nov 15, 2016, at 10:34 PM, Dominique Rondé
> wrote:
>
> Thanks Kostas for this fast and
Thanks Kostas for this fast and helpful response! I was able to reproduce it,
and changing the prefix and suffix really solved my problem.
I would like to suggest checking both values and writing a warning into the
logs. If there is no doubt, I would like to open a JIRA and add this message.
Greets
Dominique
Hi there,
I am looking into AWS Kinesis and wanted to test with a local install of
Kinesalite. This is on the Flink 1.2-SNAPSHOT. However it looks like my
subtask keeps on discovering new shards, indicated by the following log
messages which are constantly written:
21:45:42,867 INFO
org.apache.flin
Hi,
Anything keeping this from being merged into master?
On Thu, Nov 3, 2016 at 10:56 AM, Ufuk Celebi wrote:
> A fix is pending here: https://github.com/apache/flink/pull/2750
>
> The behaviour on graceful shut down/suspension respects the
> cancellation behaviour with this change.
>
> On Thu,
Which version of Flink are you using? I tested exactly this scenario
with the latest 1.2-SNAPSHOT on a local Flink cluster and it worked both
ways (independent of which jar was specified by -C or provided as the user
code jar).
Can you maybe share the stack trace of the error? Then I could
Hi Dominique,
Just wanted to add that the RollingSink is deprecated and will eventually
be replaced by the BucketingSink, so it is worth migrating to that.
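For reference, a rough sketch of the equivalent setup with the BucketingSink could look like this (paths, date format and batch size are only examples; BucketingSink and DateTimeBucketer live in the org.apache.flink.streaming.connectors.fs.bucketing package):

  BucketingSink<String> sink = new BucketingSink<>("/some/hdfs/directory");
  sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));
  sink.setBatchSize(128 * 1024 * 1024); // roll part files at roughly 128 MB

  stream.addSink(sink); // "stream" being your DataStream<String>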
Cheers,
Kostas
> On Nov 15, 2016, at 3:51 PM, Kostas Kloudas
> wrote:
>
> Hello Dominique,
>
> I think the problem is that you set both
Ok, please let me know when you manage to reproduce.
Cheers,
Aljoscha
On Tue, 15 Nov 2016 at 15:00 Dominique Rondé
wrote:
> Hi Aljoscha,
>
> I will try to reproduce this. The job is started only once and fills my /tmp
> directory within minutes.
>
> Greets
>
> Dominique
>
> On 14.11.2016 at 16:31, Aljoscha Krettek wrote:
It seems what I tried did indeed not work.
Can you explain to me why that doesn't work, though?
Hello Dominique,
I think the problem is that you set both pending prefix and suffix to “”.
Doing this makes the “committed” or “finished” filepaths indistinguishable from
the pending ones.
Thus they are cleaned up upon restoring.
Could you undo this, and put for example a suffix “pending” or s
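In code that would be roughly the following (the concrete prefix/suffix values are only examples; the point is just that pending files stay distinguishable from finished ones):

  RollingSink<String> sink = new RollingSink<>("/some/hdfs/directory");
  sink.setPendingPrefix("_");
  sink.setPendingSuffix(".pending");
  sink.setInProgressPrefix("_");
  sink.setInProgressSuffix(".in-progress");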
Good afternoon,
Currently we are using UnifiedViews (unifiedviews.eu) for RDF data
processing. In UnifiedViews you may define various RDF data processing tasks,
e.g.: 1) extract data from a certain SPARQL endpoint A, 2)
extract data from a certain folder and convert it to RDF, 3) merge RDF data
out
Hi Miguel,
I'm sorry for the late reply; this e-mail got stuck in my spam folder. I'm
glad that you've found a solution :)
I've never used Flink with Docker, so I'm probably not the best person to
advise you on this. However, if I understand correctly, you're changing the
configuration before sub
Hi Till,
Sorry, I understand that this was confusing.
The JarWithMain contains a class called Launcher with a main method that does
something like:
Class.forName("classFromUserJar").newInstance().launch()
So it doesn't have any "static" dependency on the UserJar. JarWithMain has a
lot of API cla
Hi @all!
I figured out a strange behavior with the Rolling HDFS sink. We consume
events from a Kafka topic and write them into an HDFS filesystem. We use
the RollingSink implementation in this way:

RollingSink sink = new RollingSink("/some/hdfs/directory") //
    .setBucketer(new DateT
Hi Aljoscha,
I will try to reproduce this. The job is started only once and fills my /tmp
directory within minutes.
Greets
Dominique
On 14.11.2016 at 16:31, Aljoscha Krettek wrote:
> Hi,
> could it be that the job has restarted due to failures a large number
> of times?
>
> Cheers,
> Aljoscha
>
> O
Hi Gyula,
did I understand it correctly that JarWithMain depends on UserJar because the
former will execute classes from the latter, and UserJar depends on
JarWithMain because it contains classes depending on classes from
JarWithMain? This sounds like a cyclic dependency to me.
The command line help f
Hi,
I understand now. For early (speculative) firing I would suggest writing a
custom trigger that repeatedly fires on processing time. We're also working
on a Trigger DSL that will make such cases simpler; for example, you would
be able to write:
window.trigger(EventTime.pastEndOfWindow().withEa
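Until that is available, a hand-written trigger along those lines could look roughly like this (just a sketch against the current Trigger API; the firing interval and the FIRE vs. FIRE_AND_PURGE choice are things you would adapt):

  import org.apache.flink.streaming.api.windowing.triggers.Trigger;
  import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
  import org.apache.flink.streaming.api.windowing.windows.Window;

  public class EarlyFiringTrigger<W extends Window> extends Trigger<Object, W> {

      private final long intervalMillis;

      public EarlyFiringTrigger(long intervalMillis) {
          this.intervalMillis = intervalMillis;
      }

      @Override
      public TriggerResult onElement(Object element, long timestamp, W window, TriggerContext ctx) throws Exception {
          // final firing at the end of the window, plus a periodic processing-time timer
          ctx.registerEventTimeTimer(window.maxTimestamp());
          ctx.registerProcessingTimeTimer(System.currentTimeMillis() + intervalMillis);
          return TriggerResult.CONTINUE;
      }

      @Override
      public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) throws Exception {
          // speculative early firing: emit the current window contents and keep them
          ctx.registerProcessingTimeTimer(time + intervalMillis);
          return TriggerResult.FIRE;
      }

      @Override
      public TriggerResult onEventTime(long time, W window, TriggerContext ctx) throws Exception {
          return time == window.maxTimestamp() ? TriggerResult.FIRE : TriggerResult.CONTINUE;
      }

      @Override
      public void clear(W window, TriggerContext ctx) throws Exception {
          ctx.deleteEventTimeTimer(window.maxTimestamp());
      }
  }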
Hi Matt,
as you've stated, Flink is a stream processor and as such it needs to get
its inputs from somewhere. Flink can provide you with up to exactly-once
processing guarantees. But in order to do this, it requires a re-playable
source, because in case of a failure you might have to reprocess parts of
t
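To make that a bit more concrete, with a re-playable source such as Kafka the usual wiring is roughly the following (topic name, schema and properties are just placeholders):

  StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
  env.enableCheckpointing(10000); // needed so source offsets can be rewound and replayed on failure

  Properties props = new Properties();
  props.setProperty("bootstrap.servers", "localhost:9092");
  props.setProperty("group.id", "my-group");

  env.addSource(new FlinkKafkaConsumer09<>("my-topic", new SimpleStringSchema(), props))
     .print();

  env.execute();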
Hi Satish,
your problem seems to be related to reading Hadoop's
configuration. According to the internet [1,2,3], try to select a proper
Xerces version to resolve the problem.
[1]
http://stackoverflow.com/questions/26974067/org-apache-hadoop-conf-configuration-loadresource-error
[2]
I found the solution here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-Broadcast-of-datasets-in-streaming-context-td6456.html
2016-11-14 9:41 GMT+01:00 Ufuk Celebi :
> I think this is independent of streaming. If you want to compute the
> aggregate over all keys
Hello,
How exactly do you represent the DataSet of DataSets? I'm asking
because if you have something like a
DataSet<DataSet<...>>
that unfortunately doesn't work in Flink.
Best,
Gábor
2016-11-14 20:44 GMT+01:00 otherwise777 :
> Hey There,
>
> I'm trying to calculate the betweenness in a graph with Flin
Hi Scott,
Thanks, I am familiar with the ways you suggested. Unfortunately, packaging
everything together is not really an option in our case; we specifically
want to avoid having to do this, as many people will set up their own builds
and they will inevitably fail to include everything necessary wi
Hey Aljoscha,
that sounds very promising, awesome! Though, I would still need to implement my
own window management logic (window assignment and window state purging),
right? I was thinking about reusing some of the existing components
(TimeWindow and WindowAssigner), but running my own WindowOpera