Hi,
I’ll use your code to explain.
public IndexRequest createIndexRequest(String element) {
    HashMap<String, Object> esJson = new HashMap<>();
    esJson.put("data", element);
What you should do here is parse the field values from `element`, and put each parsed field into the map under its own key, instead of putting the whole string under a single "data" key.
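For illustration, a minimal sketch of that idea (assuming `element` is a JSON string and Jackson is on the classpath; the index and type names are placeholders):

import java.io.IOException;
import java.util.Map;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

public IndexRequest createIndexRequest(String element) throws IOException {
    // Parse the JSON string into a map so Elasticsearch stores it as an
    // object with individual fields, not as one opaque string field.
    ObjectMapper mapper = new ObjectMapper();
    Map<String, Object> esJson =
            mapper.readValue(element, new TypeReference<Map<String, Object>>() {});
    return Requests.indexRequest()
            .index("logs")     // placeholder index name
            .type("object")    // placeholder type name
            .source(esJson);
}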
Hi Sathi,
The `getPartitionId` method is invoked for each record in the stream. Inside it, you can extract values / fields from the record and use them to determine the target partition id.
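For example, a rough sketch of a custom partitioner (the event type and field are made up for illustration; `schema` and `producerConfig` are assumed to be defined already):

// Route each record to a Kinesis partition based on one of its fields.
public class DeviceIdPartitioner extends KinesisPartitioner<MyEvent> {
    @Override
    public String getPartitionId(MyEvent element) {
        // All events with the same device id end up in the same shard.
        return element.getDeviceId();
    }
}

// Wiring it up on the producer side (sketch):
FlinkKinesisProducer<MyEvent> producer =
        new FlinkKinesisProducer<>(schema, producerConfig);
producer.setCustomPartitioner(new DeviceIdPartitioner());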
Is this what you had in mind?
Cheers,
Gordon
On February 21, 2017 at 11:54:21 AM, Sathi Chowdhury wrote:
Hi Flink users and experts,
In my Flink processor I am trying to use the Flink Kinesis connector. I read from a Kinesis stream, and after the transformation (for which I use a RichCoFlatMapFunction), the JSON event needs to sink to a Kinesis stream k1.
DataStream myStream = see.addSource(new
FlinkKine
Hi Sonex,
All windows under the same key (e.g., TimeWindow(0, 3600) and TimeWindow(3600, 7200)) will be processed by the flatMap function. You can put the variable drawn from TimeWindow(0, 3600) into state. When you receive TimeWindow(3600, 7200), you can access that state and apply the function.
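A rough sketch of that pattern (the `WindowResult`/`EnrichedResult` types and field names are placeholders; it assumes the window results are keyed by the same key before the flatMap):

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CarryOverAcrossWindows
        extends RichFlatMapFunction<WindowResult, EnrichedResult> {

    private transient ValueState<Double> previousWindowValue;

    @Override
    public void open(Configuration parameters) {
        // Keyed state: one value is kept per key, surviving across windows.
        previousWindowValue = getRuntimeContext().getState(
                new ValueStateDescriptor<>("prev-window-value", Double.class));
    }

    @Override
    public void flatMap(WindowResult current, Collector<EnrichedResult> out) throws Exception {
        Double prev = previousWindowValue.value();        // value from the previous window, or null
        out.collect(new EnrichedResult(current, prev));   // combine it with the current window's result
        previousWindowValue.update(current.getValue());   // remember for the next window of this key
    }
}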
There's nothing stopping me from assigning timestamps and generating watermarks on a keyed stream in the implementation, and the KeyedStream API supports it. However, it appears that the underlying operator created by DataStream.assignTimestampsAndWatermarks() isn't key-aware and tracks timestamps globally.
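For reference, the usual pattern is to assign timestamps and watermarks on the stream before the keyBy; a sketch (the event type and its fields are illustrative):

// Assign timestamps/watermarks on the un-keyed stream, then key it.
DataStream<MyEvent> withTimestamps = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
            @Override
            public long extractTimestamp(MyEvent event) {
                return event.getEventTime();  // event time in epoch millis (illustrative field)
            }
        });

// ... then keyBy / window as usual on withTimestamps.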
Hi Stephan,
Just saw your mail while I was explaining the answer to your earlier questions. I have attached some more screenshots, taken from the latest run today.
Yes, I will try to set it to a higher value and check if performance improves.
Let me know your thoughts.
Regards,
Vinay Patil
Hi Stephan,
I am using Flink 1.2.0, and running the job on YARN using c3.4xlarge EC2 instances having 16 cores and 30GB RAM each.
In total I have set 32 slots and allotted 1200 network buffers.
I have attached the latest checkpointing snapshot, its configuration, CPU load average, physi
@Vinay!
Just saw the screenshot you attached to the first mail. The checkpoint that failed came after one that had an incredibly heavy alignment phase (14 GB).
I think that working that off threw off the next checkpoint, because the workers were still working through the alignment backlog.
I think you can
How about adding this to the "logging" docs - a section on how to run log4j2?
On Mon, Feb 20, 2017 at 8:50 AM, Robert Metzger wrote:
> Hi Chet,
>
> These are the files I have in my lib/ folder with the working log4j2
> integration:
>
> -rw-r--r-- 1 robert robert 79966937 Oct 10 13:49 flink-dist_
Hi Shannon!
In the latest HA and BlobStore changes (1.3) it uses "/tmp" only for
caching and will re-obtain the files from the persistent storage.
I think we should make this a bigger point, even:
- Flink should not use "/tmp" at all (except for mini cluster mode)
- Yarn and Mesos should alwa
With pre-aggregation (which the Reduce does), Flink can handle many windows
and many keys, as long as you have the memory and storage to support that.
Your case should work.
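For context, a minimal sketch of what such a pre-aggregating reduce looks like (the `Event` type and its fields are made up for illustration):

import org.apache.flink.api.common.functions.ReduceFunction;

// With a ReduceFunction, Flink keeps only one combined value per key and
// window instead of buffering every element, which keeps many windows cheap.
public class CountSummingReducer implements ReduceFunction<Event> {
    @Override
    public Event reduce(Event a, Event b) {
        return new Event(a.getKey(), a.getCount() + b.getCount());
    }
}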
On Mon, Feb 20, 2017 at 4:58 PM, Vadim Vararu
wrote:
> It's something like:
>
> DataStreamSource stream =
> env.addSourc
Hi!
Exactly-once end-to-end requires sinks that support that kind of behavior
(typically some form of transaction support).
Kafka currently does not have the mechanisms in place to support
exactly-once sinks, but the Kafka project is working on that feature.
For ElasticSearch, it is also not sim
Hi Vinay!
Can you start by giving us a bit of an environment spec?
- What Flink version are you using?
- What is your rough topology (what operations does the program use)?
- Where is the state (windows, keyBy)?
- What is the rough size of your checkpoints and where does the time go?
Can y
Hi Xiaogang,
Thank you for your inputs.
Yes, I have already tried setting MaxBackgroundFlushes and MaxBackgroundCompactions to higher values (tried 2, 4, and 8), but I'm still not getting the expected results.
System.getProperty("java.io.tmpdir") points to /tmp, but I could not find the RocksDB logs there; can y
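For reference, one way to set those options in code (a sketch against the Flink 1.2 RocksDB state backend API; the checkpoint URI is a placeholder):

// Tune RocksDB background flushes/compactions via an OptionsFactory.
RocksDBStateBackend backend =
        new RocksDBStateBackend("hdfs:///flink/checkpoints");  // placeholder URI; constructor may throw IOException
backend.setOptions(new OptionsFactory() {
    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        return currentOptions
                .setMaxBackgroundFlushes(4)
                .setMaxBackgroundCompactions(4);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        return currentOptions;  // leave column family options unchanged
    }
});
env.setStateBackend(backend);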
Hi,
I'm using Flink 1.2.0 and trying to do "exactly once" data transfer
from Kafka to Elasticsearch, but I cannot achieve it.
(Scala 2.11, Kafka 0.10, without YARN)
There are 2 Flink TaskManager nodes, and when processing
with parallelism 2, I shut down one of them (simulating a node failure).
Using flink-connecto
Hi,
I'm using Flink and Elasticsearch and I want to receive a JSON object in Elasticsearch ({"id":1, "name":"X"}, etc.). I already have a string with this information, but I don't want to save it as a string.
I receive this:
{
"_index": "logs",
"_type": "object",
"_id": "AVpcARfkfYWqSubr0Zv
It's something like:
DataStreamSource stream = env.addSource(getKafkaConsumer(parameterTool));
stream
.map(getEventToDomainMapper())
.keyBy(getKeySelector())
.window(ProcessingTimeSessionWindows.withGap(Time.minutes(90)))
.reduce(getReducer())
.map(getToJs
Hi Vadim,
this of course depends on your use case. The question is how large is
your state per pane and how much memory is available for Flink?
Are you using incremental aggregates such that only the aggregated value
per pane has to be kept in memory?
Regards,
Timo
On 20/02/17 at 16:34, ... wrote:
Hi guys,
Is it okay to have very many (tens of thousands or hundreds of thousands)
of session windows?
Thanks, Vadim.
Yes, you are correct. A window will be created for each key/group and then
you can apply a function, or aggregate elements per key.
Hi guys,
I can see in many examples that the window method is always preceded by keyBy:

data.keyBy(<key selector>)
    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
    .<windowed transformation>(<window function>)

Does it mean that a new window will be created for each group/key?
Thanks, Vadim.
A while back on the mailing list, there was a discussion on validating a stream, and splitting the stream into two sinks, depending on how the validation went:

(operator generating errors) --> (filter) --> stream without errors --> sink
                             --> (filter) --> error stream --> sink

Is there an exam
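For illustration, one straightforward way to get the two branches is a pair of filters on the same stream (a sketch; `Event`, `isValid()`, `ValidatingMapper`, and the sinks are placeholders):

// Split by validation result: each filter keeps one side of the stream.
DataStream<Event> validated = input.map(new ValidatingMapper());  // attaches a validity flag (illustrative)

validated.filter(e -> e.isValid())
         .addSink(goodSink);     // records that passed validation

validated.filter(e -> !e.isValid())
         .addSink(errorSink);    // records that failed validation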
Hey Shannon,
good idea! We currently have this:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/ops/production_ready.html
It has a strong focus on managed state and not the points you mentioned.
Would you like to create an issue for adding this to the production
check list? I think i
Hi,
Flink 1.2 partitions all keys into key groups, the atomic units for rescaling. This partitioning is done by hash partitioning and is also in sync with the routing of tuples to operator instances (each parallel instance of a keyed operator is responsible for some range of key groups). T
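For illustration, a simplified sketch of how a key ends up at an operator instance (modeled loosely on Flink's KeyGroupRangeAssignment; not the exact internal code):

import org.apache.flink.util.MathUtils;

// Key -> key group: hash the key and map it into [0, maxParallelism).
public static int keyGroupFor(Object key, int maxParallelism) {
    return MathUtils.murmurHash(key.hashCode()) % maxParallelism;
}

// Key group -> operator index: each parallel instance owns a contiguous range.
public static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
    return keyGroup * parallelism / maxParallelism;
}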
Hi there,
I’m having problems running a job on Flink 1.2.0 that successfully
executes on Flink 1.1.3. The job is supposed to read events from a
Kinesis stream and to send outputs to Elasticsearch, and it actually
initializes successfully on a Flink 1.2.0 cluster running on YARN, but as
soon as I
I don't think you understood the question correctly. I do not care about
information between windows at the same time (i.e., start of window = 0, end
of window 3600). I want to pass a variable, let's say for key 1, from the
apply function of window 0-3600 to the apply function of window 3600-7200,
Hi Sonex,
I think you can accomplish this by using a PassThroughFunction as the apply
function and then processing the elements in a rich flatMap function that
follows. You can keep the information in the flatMap function (via state) so
that it can be shared among different windows.
The program may look
val stream = inputStream
  .assignAscendingTimestamps(_.eventTime)
  .keyBy(_.inputKey)
  .timeWindow(Time.seconds(3600), Time.seconds(3600))
stream.apply{...}
Given the above example I want to transfer information (variables and
values) from the current apply function to the apply function of the next
win
I have seen the log but did not find any relevant information, just some
information about the machine running this node. Disk is less than 10%.
2017-02-20 14:03 GMT+08:00 wangzhijiang999:
> The log just indicates that the SignalHandler handled the kill signal and
> the JobManager process exited, and it can not get the