Correction: I cannot work around the problem. If I exclude hadoop1, I get the
following exception which appears to be due to flink-java-1.1.0's dependency on
Hadoop1.
Failed to submit job 4b6366d101877d38ef33454acc6ca500
(com.expedia.www.flink.jobs.DestinationCountsHistoryJob$)
org.apache.flink
Hi folks, congrats on 1.1.0!
FYI, after updating to Flink 1.1.0 I get the exception at the bottom when
attempting to run a job that uses AvroParquetInputFormat wrapped in a Flink
HadoopInputFormat. ContextUtil.java:71 is trying to execute:
Class.forName("org.apache.hadoop.mapreduce.task.JobCont
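For reference, a minimal sketch of the kind of wiring being described, assuming parquet-avro's org.apache.parquet.avro.AvroParquetInputFormat and a hypothetical input path; this is an illustration of the pattern, not the poster's actual job:

import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.parquet.avro.AvroParquetInputFormat;

public class AvroParquetReadJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        Job job = Job.getInstance();
        FileInputFormat.addInputPath(job, new Path("hdfs:///data/events")); // hypothetical path

        // Wrap the mapreduce AvroParquetInputFormat in Flink's Hadoop compatibility
        // wrapper; Parquet uses Void keys and emits the Avro record as the value.
        HadoopInputFormat<Void, GenericRecord> input = new HadoopInputFormat<>(
                new AvroParquetInputFormat<GenericRecord>(), Void.class, GenericRecord.class, job);

        DataSet<Tuple2<Void, GenericRecord>> records = env.createInput(input);
        records.first(10).print();
    }
}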
Hi Janardhan,
the fixed partitioner is the only one shipped with Flink.
However, it should be fairly simple to implement one that uses the key to
determine the partition; see the sketch after the quoted message below.
On Mon, Aug 8, 2016 at 7:16 PM, Janardhan Reddy wrote:
> Hi,
> The Flink kafka producer uses fixed partitioner by default an
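A minimal sketch of such a key-based partitioner; the KafkaPartitioner base class and its partition() signature are taken from the Flink 1.1 Kafka connector as I recall it, so treat the exact names as an assumption:

import java.util.Arrays;
import org.apache.flink.streaming.connectors.kafka.partitioner.KafkaPartitioner;

// Routes records with the same serialized key to the same Kafka partition.
public class KeyHashPartitioner<T> extends KafkaPartitioner<T> {

    @Override
    public int partition(T next, byte[] serializedKey, byte[] serializedValue, int numPartitions) {
        if (serializedKey == null) {
            // No key available: fall back to a fixed partition.
            return 0;
        }
        // Mask the sign bit so the result is always a valid partition index.
        return (Arrays.hashCode(serializedKey) & 0x7fffffff) % numPartitions;
    }
}

It would then be handed to the producer via one of the FlinkKafkaProducer09 constructors that accepts a custom KafkaPartitioner.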
Thanks Stephan, I had a MapFunction using Unirest and that was the origin
of the leak.
On Tue, Aug 2, 2016 at 7:36 AM, Stephan Ewen wrote:
> My guess would be that you have a thread leak in the user code.
> More memory will not solve the problem, only push it a bit further away.
>
> On Mon, Aug
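A sketch of how such a leak is typically avoided with unirest-java 1.x: shut the client down in the function's close() so its background threads don't outlive the task. The function name and URL below are placeholders.

import com.mashape.unirest.http.Unirest;
import org.apache.flink.api.common.functions.RichMapFunction;

public class HttpEnrichMapper extends RichMapFunction<String, String> {

    @Override
    public String map(String value) throws Exception {
        // Placeholder lookup call; Unirest lazily creates a shared client
        // with its own thread pool behind this static API.
        return Unirest.get("http://example.com/lookup")
                .queryString("q", value)
                .asString()
                .getBody();
    }

    @Override
    public void close() throws Exception {
        // Release Unirest's background threads when the task is disposed.
        // Note: the client is JVM-global, so this is a simplification if
        // several subtasks in the same TaskManager share it.
        Unirest.shutdown();
    }
}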
Hi Prabhu,
I'm pretty sure that the Kafka 09 consumer commits offsets to Kafka when
checkpointing is turned on.
In FlinkKafkaConsumerBase.notifyCheckpointComplete(), we call
fetcher.commitSpecificOffsetsToKafka(checkpointOffsets), which calls
this.consumer.commitSync(offsetsToCommit) in
Kaf
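For reference, a minimal sketch of the consumer setup this refers to, with placeholder topic, group, and broker addresses: once checkpointing is enabled, committed offsets follow the checkpoint cycle described above instead of Kafka's auto-commit.

import java.util.Properties;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaOffsetCommitJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // With checkpointing on, offsets are committed to Kafka when a
        // checkpoint completes, not via enable.auto.commit.
        env.enableCheckpointing(5000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "offset-monitoring-demo");  // placeholder

        env.addSource(new FlinkKafkaConsumer09<>("my-topic", new SimpleStringSchema(), props))
           .print();

        env.execute("Kafka 0.9 offset commit example");
    }
}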
Hi,
you can get the offsets (current and committed offsets) in Flink 1.1 using
the Flink metrics.
In Flink 1.0, we expose the Kafka internal metrics via the accumulator
system (so you can access them from the web interface as well). IIRC, Kafka
exposes a metric for the lag as well.
On Mon, Aug 8,
Hi Stephan,
The Flink Kafka 09 connector does not do offset commits to Kafka when
checkpointing is turned on. Is there a way to monitor the offset lag in this
case,
I am turning on a Flink job that reads data from Kafka (it has about a week of
data, around 7 TB); currently the approximate way that I
From the code in Kafka09Fetcher.java:
// if checkpointing is enabled, we are not automatically committing to Kafka.
kafkaProperties.setProperty(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
        Boolean.toString(!runtimeContext.isCheckpointingEnabled()));
If flink checkpointing
Great work, all. Many thanks to Ufuk as RE :)
On Monday, August 8, 2016, Stephan Ewen wrote:
> Great work indeed, and big thanks, Ufuk!
>
> On Mon, Aug 8, 2016 at 6:55 PM, Vasiliki Kalavri <vasilikikala...@gmail.com>
> wrote:
>
> > yoo-hoo finally announced 🎉
> > Thanks for managing the rele
Great work indeed, and big thanks, Ufuk!
On Mon, Aug 8, 2016 at 6:55 PM, Vasiliki Kalavri
wrote:
> yoo-hoo finally announced 🎉
> Thanks for managing the release Ufuk!
>
> On 8 August 2016 at 18:36, Ufuk Celebi wrote:
>
> > The Flink PMC is pleased to announce the availability of Flink 1.1.0.
>
Thanks! I’ll be watching that issue then
Adam
> On 08 Aug 2016, at 05:01, Aljoscha Krettek wrote:
>
> Hi Adam,
> sorry for the inconvenience. This is caused by a new file read operator,
> specifically how it treats watermarks/timestamps. I opened an issue here that
> describes the situation:
Hi,
The Flink Kafka producer uses the fixed partitioner by default, and all our
events are ending up in 1-2 partitions. Is there any built-in partitioner
that distributes keys such that the same key maps to the same partition?
Thanks
yoo-hoo finally announced 🎉
Thanks for managing the release Ufuk!
On 8 August 2016 at 18:36, Ufuk Celebi wrote:
> The Flink PMC is pleased to announce the availability of Flink 1.1.0.
>
> On behalf of the PMC, I would like to thank everybody who contributed
> to the release.
>
> The release anno
The Flink PMC is pleased to announce the availability of Flink 1.1.0.
On behalf of the PMC, I would like to thank everybody who contributed
to the release.
The release announcement:
http://flink.apache.org/news/2016/08/08/release-1.1.0.html
Release binaries:
http://apache.openmirror.de/flink/fli
Thank you for the help Robert!
Regarding the static field alternative you provided, I'm a bit confused
about the difference between slots and instances.
When you say that by using a static field it will be shared by all
instances of the Map on the slot, does that mean that if the TM has
multiple
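For reference, a minimal sketch of the static-field pattern under discussion (all names are hypothetical): the static field exists once per TaskManager JVM and user-code classloader, so every parallel instance of the map running in that JVM shares it, regardless of which slot it occupies.

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class SharedResourceMapper extends RichMapFunction<String, String> {

    // Created once per TaskManager JVM and shared by all parallel
    // subtasks of this operator running in that JVM.
    private static ExpensiveResource shared;

    @Override
    public void open(Configuration parameters) {
        synchronized (SharedResourceMapper.class) {
            if (shared == null) {
                shared = new ExpensiveResource(); // hypothetical heavyweight object
            }
        }
    }

    @Override
    public String map(String value) {
        return shared.lookup(value);
    }

    // Stand-in for whatever thread-safe resource is being shared.
    static class ExpensiveResource {
        String lookup(String key) {
            return key;
        }
    }
}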
Hi Paul,
the example in the code is outdated; StringToByteSerializer was probably
removed quite a while ago. I'll update the documentation once we have
figured out the other problem you reported.
What's the exception you are getting?
Regards,
Robert
On Mon, Aug 8, 2016 at 4:33 PM, Paul Joireman
Hi all,
The documentation describing the use of RabbitMQ as a sink gives the following
example:
RMQConnectionConfig connectionConfig = new RMQConnectionConfig.Builder()
    .setHost("localhost").setPort(5000).setUserName(..)
    .setPassword(..).setVirtualHost("/").build();
stream.addSink(new RMQSink(
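For what it's worth, a hedged sketch of how that snippet can be completed against the 1.1 RabbitMQ connector; the queue name and credentials are placeholders, and SimpleStringSchema stands in for the removed StringToByteSerializer:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.rabbitmq.RMQSink;
import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class RabbitSinkExample {
    static void attachSink(DataStream<String> stream) {
        RMQConnectionConfig connectionConfig = new RMQConnectionConfig.Builder()
                .setHost("localhost")
                .setPort(5000)            // port taken from the docs example
                .setUserName("guest")     // placeholder credentials
                .setPassword("guest")
                .setVirtualHost("/")
                .build();

        // RMQSink(connection config, queue name, serialization schema)
        stream.addSink(new RMQSink<String>(
                connectionConfig,
                "my-queue",               // placeholder queue name
                new SimpleStringSchema()));
    }
}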
Hi Theo,
I think there are some variants you can try out for the problem; it depends
a bit on the performance characteristics you expect:
- The simplest variant is to run one TM per machine with one slot only.
This is probably not feasible because you can't use all the CPU cores.
- ... to s
So if I change the input data to (I added a uID value to identify the
individual data sets):
new Data(1, 123, new Date(116, 8,8,11,11,11), 5),
new Data(2, 123, new Date(116, 8,8,12,10,11), 8),
new Data(3, 123, new Date(116, 8,8,12,12,11), 10),
new Data(4, 123, new
I have to admit that the difference between the two methods is subtle, and
in my opinion it doesn't make much sense to have the two variants:
- max() returns a tuple with the max value at the specified position; the
other fields of the tuple/pojo are undefined
- maxBy() returns a tuple with the ma
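A small self-contained illustration of that difference, using the batch API just to keep the output easy to read (my own example, not from the thread):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class MaxVsMaxBy {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> data = env.fromElements(
                new Tuple2<>("a", 1),
                new Tuple2<>("b", 3),
                new Tuple2<>("c", 2));

        // max(1): position 1 holds the maximum (3), but the String field is
        // undefined (it is taken from some record, not necessarily "b").
        data.max(1).print();

        // maxBy(1): returns the complete record holding the maximum at
        // position 1, i.e. ("b", 3).
        data.maxBy(1).print();
    }
}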
Hi Stephan, hi Ufuk,
thank you very much for your insights, and sorry for the late reply;
there was a lot going on recently.
We finally figured out what the problem was: As you pointed out, the
Flink job simply waited for new YARN resources. But when a new YARN
session started, the Flink job did n
OK, I found my mistake regarding question 2). I keyed by the id value and gave
all the data sets different values there, so of course all 4 data sets are
printed. Sorry :)
But question 1.) still remains.
From: Claudia Wegmann [mailto:c.wegm...@kasasi.de]
Sent: Monday, 8 August 2016 14:27
To: u
Hey,
I have some questions about aggregate functions such as max or min. Take the
following example:
//create Stream with event time where Data contains an ID, a timestamp and a
temperature value
DataStream<Data> oneStream = env.fromElements(
new Data(123, new Date(116, 8,8,11,11,11), 5),
Hi Adam,
sorry for the inconvenience. This is caused by a new file read operator,
specifically how it treats watermarks/timestamps. I opened an issue here
that describes the situation:
https://issues.apache.org/jira/browse/FLINK-4329.
I think this should be fixed for an upcoming 1.1.1 bug fixing r
Hi,
what does the reduce function do exactly? Something like this?
(a: String, b: String) -> b.toUpperCase
If yes, then I would expect a) to be the output you get.
If it is this:
(a: String, b: String) -> a + b.toUpperCase
then I would expect this: a,b,cC,d,eE,f,gG,h (both variants are spelled out in the sketch below).
Cheers,
Aljoscha
On Sun,
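To make the two variants concrete, here is a small sketch of them as Flink ReduceFunctions (my own illustration; the letters in the expected output come from the thread's original data, which is not shown here):

import org.apache.flink.api.common.functions.ReduceFunction;

public class ReduceVariants {

    // Variant a): the result is just the upper-cased latest element;
    // the previously accumulated value is discarded.
    public static class KeepLatestUpper implements ReduceFunction<String> {
        @Override
        public String reduce(String a, String b) {
            return b.toUpperCase();
        }
    }

    // Variant b): the running result is kept and the upper-cased new
    // element is appended, so earlier inputs stay visible.
    public static class ConcatUpper implements ReduceFunction<String> {
        @Override
        public String reduce(String a, String b) {
            return a + b.toUpperCase();
        }
    }
}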
Hi Davood,
right now, you can only inspect the timestamps by writing a custom operator
that you would use with DataStream.transform(). Measuring latency this way
has some pitfalls, though. The timestamp might be assigned on a different
machine than the machine that will process the tuple at the sin
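A rough sketch of such a custom operator against the 1.1 operator API (the class name and the logging are mine, and measuring latency this way inherits the pitfalls mentioned above, e.g. clock skew between machines):

import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

// Forwards elements unchanged while logging how far their event-time
// timestamps lag behind the local wall clock.
public class TimestampInspector<T> extends AbstractStreamOperator<T>
        implements OneInputStreamOperator<T, T> {

    @Override
    public void processElement(StreamRecord<T> element) throws Exception {
        // Assumes timestamps have been assigned upstream.
        long lagMillis = System.currentTimeMillis() - element.getTimestamp();
        LOG.info("observed element lag: {} ms", lagMillis);
        output.collect(element);
    }

    @Override
    public void processWatermark(Watermark mark) throws Exception {
        output.emitWatermark(mark);
    }
}

// usage (illustrative):
// stream.transform("timestamp-inspector", stream.getType(), new TimestampInspector<MyEvent>());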