Re: Error running on Hadoop 2.7

2018-03-22 Thread Ken Krugler
Hi Ashish, Are you using Flink 1.4? If so, what does the “hadoop classpath” command return from the command line where you’re trying to start the job? Asking because I’d run into issues with https://issues.apache.org/jira/browse/FLINK-7477 , wh
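For context, on a Hadoop-free Flink 1.4 distribution the usual approach is to put the output of `hadoop classpath` onto Flink's classpath before submitting. A sketch of the commands involved (paths and the example jar are assumptions, not taken from this thread):

```
# See what the Hadoop client would put on the classpath
hadoop classpath

# Export it so the Hadoop-free Flink distribution picks it up
export HADOOP_CLASSPATH=$(hadoop classpath)

# Then submit as usual, e.g.:
# ./bin/flink run -m yarn-cluster ./examples/streaming/WordCount.jar
```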

Re: Restart hook and checkpoint

2018-03-22 Thread Ashish Pokharel
Fabian, that sounds good. Should I recap some bullets in an email and start a new thread then? Thanks, Ashish > On Mar 22, 2018, at 5:16 AM, Fabian Hueske wrote: > > Hi Ashish, > > Agreed! > I think the right approach would be to gather the requirements and start a > discussion on the dev m

Re: Error running on Hadoop 2.7

2018-03-22 Thread Ashish Pokharel
Hi All, Looks like we are out of the woods for now (so we think) - we went with the Hadoop-free version and relied on client libraries on the edge node. However, I am still not very confident, as I started digging into that stack as well and realized what Till pointed out (traces lead to a class that

Re: Example PopularPlacesFromKafka fails to run

2018-03-22 Thread James Yu
Just figured out the data format in nycTaxiRides.gz doesn't match the way TaxiRide.java interprets the lines fed into it. Then I checked the exercise training GitHub and found the TaxiRide.java ( https://github.com/dataArtisans/flink-training-exercises/tree/master/src/main/java/com/dataartisans/

Re: Rowtime

2018-03-22 Thread Fabian Hueske
Hi Gregory, Event-time timestamps are handled a bit differently in Flink's SQL compared to the DataStream API. In the DataStream API, timestamps are hidden from the user and implicitly used for time-based operations such as windows. In SQL, the query semantics cannot depend on hidden fields. There

Example PopularPlacesFromKafka fails to run

2018-03-22 Thread James Yu
Hi, I fail to run the PopularPlacesFromKafka example with the following exception, and I wonder what might cause this "Invalid record" error when running within IntelliJ IDEA --> 07:52:23.960 [Source: Custom Source -> Map (7/8)] INFO org.apache.flink.runtime.taskmanager.Task - Source: Custom S

Rowtime

2018-03-22 Thread Gregory Fee
Hello! I have found that even though I am processing in event time and provide an event timestamp for all my events, the events produced in a RetractStream I create from a Table do not have timestamps. That is to say, I put a ProcessFunction on the RetractStream and ctx.timestamp() always r

Re: Lucene SPI class loading fails with shaded flink-connector-elasticsearch

2018-03-22 Thread Till Rohrmann
Hi Manuel, thanks for reporting this issue. It sounds to me like a bug we should fix. I've pulled Gordon into the conversation since he will most likely know more about the ElasticSearch connector shading. Cheers, Till On Thu, Mar 22, 2018 at 5:09 PM, Haddadi Manuel wrote: > Hello, > > When up

Lucene SPI class loading fails with shaded flink-connector-elasticsearch

2018-03-22 Thread Haddadi Manuel
Hello, When upgrading from flink-1.3.2 to flink-1.4.2, I faced this error on runtime of a Flink job : java.util.ServiceConfigurationError: An SPI class of type org.apache.lucene.codecs.PostingsFormat with classname org.apache.lucene.search.suggest.document.Completion50PostingsFormat does not

Re: Error running on Hadoop 2.7

2018-03-22 Thread Kien Truong
Hi Ashish, Yeah, we also had this problem before. It can be solved by recompiling Flink with the HDP version of Hadoop according to the instructions here: https://ci.apache.org/projects/flink/flink-docs-release-1.4/start/building.html#vendor-specific-versions Regards, Kien On 3/22/2018 12:25 AM,
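The vendor-specific build that the linked page describes boils down to a Maven build with the vendor repositories profile enabled. A sketch of the command (the exact HDP version string is an illustrative assumption):

```
# Build Flink against a vendor (HDP) Hadoop release, per the linked docs
mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.7.3.2.6.4.0-91
```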

Re: [ANNOUNCE] Weekly community update #12

2018-03-22 Thread Till Rohrmann
Eron pointed out to me that the Flink improvement proposal 6 (short Flip-6) deserves some more comments since not everyone will be aware of what it actually means. I totally agree and will try to give a bit more context for everyone interested. Flip-6 is intended to solve some of Flink's short

Re: Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Tzu-Li (Gordon) Tai
Hi, I think you’ve made a good point: there are currently no logs that tell anything about discovering a new partition. We should probably add this. And yes, it would be great if you can report back on this using either the latest master, release-1.5 or release-1.4 branches. On 22 March 2018 at

Re: Migration to Flip6 Kubernetes

2018-03-22 Thread Till Rohrmann
Hi Edward and Eron, you're right that there is currently no JobClusterEntrypoint implementation for Kubernetes. How this entrypoint looks like mostly depends on how the job is stored and retrieved. There are multiple ways conceivable: - The entrypoint connects to an external system from which it

Re: Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Juho Autio
Thanks, that sounds promising. I don't know how to check if it's consuming all partitions? For example I couldn't find any logs about discovering a new partition. However, did I understand correctly that this is also fixed in Flink dev? If yes, I could rebuild my 1.5-SNAPSHOT and try again. On Thu

Re: Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Tzu-Li (Gordon) Tai
Hi Juho, Can you confirm that the new partition is consumed, but only that Flink’s reported metrics do not include them? If yes, then I think your observations can be explained by this issue:  https://issues.apache.org/jira/browse/FLINK-8419 This issue should have been fixed in the recently rele

Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Juho Autio
According to the docs*, flink.partition-discovery.interval-millis can be set to enable automatic partition discovery. I'm testing this, but apparently it doesn't work. I'm using Flink Version: 1.5-SNAPSHOT Commit: 8395508 and FlinkKafkaConsumer010. I had my flink stream running, consuming an existin
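For reference, the property in question is passed via the consumer's `Properties`. A minimal sketch (the broker address, group id, and interval are illustrative assumptions; the `FlinkKafkaConsumer010` construction is left as a comment since it needs the Flink connector on the classpath):

```java
import java.util.Properties;

public class PartitionDiscoveryConfig {
    public static Properties consumerProperties() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumption
        props.setProperty("group.id", "my-group");                // assumption
        // Check for newly created partitions every 30 seconds
        props.setProperty("flink.partition-discovery.interval-millis", "30000");
        return props;
    }

    public static void main(String[] args) {
        // Would be used like:
        // new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), consumerProperties());
        System.out.println(consumerProperties()
                .getProperty("flink.partition-discovery.interval-millis"));
    }
}
```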

Re: Let BucketingSink roll file on each checkpoint

2018-03-22 Thread XilangYan
Ok, then maybe I have a misunderstanding about checkpoints. I thought Flink uses checkpoints to store offsets, but when the Kafka connector makes a checkpoint it doesn't know whether data is in an in-progress file or a pending file, so a whole offset is saved in the checkpoint. I used to guess, the data in in-p

Re: Connect more than one stream!!

2018-03-22 Thread miki haiat
You can join streams http://training.data-artisans.com/exercises/eventTimeJoin.html On Thu, 22 Mar 2018, 11:36 Puneet Kinra, wrote: > Hi > > Is there any way of connecting multiple streams(more than one) in flink > > -- > *Cheers * > > *Puneet Kinra* > > *Mobile:+918800167808 | Skype : puneet.

Re: how does SQL mode work with PopularPlaces example?

2018-03-22 Thread Fabian Hueske
Hi James, the exercise does not require to filter on pickup events. It says: "This is done by counting every five minutes the number of taxi rides that started and ended in the same area within the last 15 minutes. Arrival and departure locations should be separately counted." That is achieved b

Connect more than one stream!!

2018-03-22 Thread Puneet Kinra
Hi, Is there any way of connecting multiple streams (more than one) in Flink? -- *Cheers * *Puneet Kinra* *Mobile:+918800167808 | Skype : puneet.ki...@customercentria.com * *e-mail :puneet.ki...@customercentria.com *

Re: ListCheckpointed function - what happens prior to restoreState() being called?

2018-03-22 Thread Fabian Hueske
Hi Ken, ListCheckpointed.restoreState() is part of the operator state initialization, but you are right that it is not explicitly mentioned in the docs. That documentation page is rather about the internals and does not directly relate to the public API. I think you can also initialize the variable in

Re: Let BucketingSink roll file on each checkpoint

2018-03-22 Thread Fabian Hueske
Hi, Flink maintains its own Kafka offsets in its checkpoints and does not rely on Kafka's offset management. That way Flink guarantees that read offsets and checkpointed operator state are always aligned. The BucketingSink is designed to not lose any data and the mode of operation is described in

Re: Flink remote debug not working

2018-03-22 Thread Ankit Chaudhary
Thanks a lot Fabian :). I found the information on the configuration page this morning :))) https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/config.html Cheers, Ankit On Thu, Mar 22, 2018 at 9:56 AM, Fabian Hueske wrote: > Hi Ankit, > > The env.java.opts parameter is used for al

Re: Restart hook and checkpoint

2018-03-22 Thread Fabian Hueske
Hi Ashish, Agreed! I think the right approach would be to gather the requirements and start a discussion on the dev mailing list. Contributors and committers who are more familiar with the checkpointing and recovery internals should discuss a solution that can be integrated and doesn't break with

Re: CsvSink

2018-03-22 Thread Fabian Hueske
Great, thanks for reporting back! Best, Fabian 2018-03-20 22:40 GMT+01:00 karim amer : > Never mind I found the error and has nothing to do with flink. > Sorry > > On Tue, Mar 20, 2018 at 12:12 PM, karim amer > wrote: > >> here is the output after fixing the scala issues >> >> https://gist.gith

Re: Flink CEP window for 1 working day

2018-03-22 Thread Fabian Hueske
Hi, I don't think the CEP library is that flexible, but I loop in Kostas (CC) who knows more about it. I'm not exactly sure what you mean by "manipulate" event-time, but I don't think that's necessary. You can implement rules also with state and timers in the ProcessFunction. The function ingests

Re: Flink remote debug not working

2018-03-22 Thread Fabian Hueske
Hi Ankit, The env.java.opts parameter is used for all JVMs started by Flink, i.e., JM and TM. Since the JM process is started before the TM, the port is already in use when you start the TM. You can use env.java.opts.taskmanager to pass parameters only for TM JVMs. Best, Fabian 2018-03-20 14
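The fix Fabian describes is a one-line change in flink-conf.yaml. A sketch, where the debug-agent string is the standard JDWP example and the port is an assumption, not something taken from this thread:

```yaml
# Applies only to TaskManager JVMs, avoiding the port clash with the JobManager
env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```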

Re: Error running on Hadoop 2.7

2018-03-22 Thread Till Rohrmann
Hi Ashish, the class `RequestHedgingRMFailoverProxyProvider` was only introduced with Hadoop 2.9.0. My suspicion is thus that you start the client with some Hadoop 2.9.0 dependencies on the class path. Could you please check the logs of the client what's on its class path? Maybe you could also sha

Query regarding the ContinuousFileMonitoring operator

2018-03-22 Thread Puneet Kinra
If I set parallelism to 1, it still creates multiple splits while processing. -- *Cheers * *Puneet Kinra*