Hi,
I’ll use your code to explain.
public IndexRequest createIndexRequest(String element) {
    HashMap<String, Object> esJson = new HashMap<>();
    esJson.put("data", element);
What you should do here is parse the field values from `element`, and put each parsed field into the map under its own key, instead of putting the whole string under a single "data" key.
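For illustration, a minimal sketch of that idea (assuming `element` is a JSON string and Jackson is on the classpath; the index and type names are placeholders):

import java.io.IOException;
import java.util.Map;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.Requests;

public IndexRequest createIndexRequest(String element) throws IOException {
    // Parse the JSON string into a map so Elasticsearch stores it as an
    // object with individual fields, not as one opaque string field.
    ObjectMapper mapper = new ObjectMapper();
    Map<String, Object> esJson =
            mapper.readValue(element, new TypeReference<Map<String, Object>>() {});
    return Requests.indexRequest()
            .index("logs")     // placeholder index name
            .type("object")    // placeholder type name
            .source(esJson);
}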
Hi Sathi,
The `getPartitionId` method is invoked for each record in the stream. Inside it, you can extract values / fields from the record and use them to determine the target partition id.
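For example, a rough sketch of a custom partitioner (the event type and field are made up for illustration; `schema` and `producerConfig` are assumed to be defined already):

// Route each record to a Kinesis partition based on one of its fields.
public class DeviceIdPartitioner extends KinesisPartitioner<MyEvent> {
    @Override
    public String getPartitionId(MyEvent element) {
        // All events with the same device id end up in the same shard.
        return element.getDeviceId();
    }
}

// Wiring it up on the producer side (sketch):
FlinkKinesisProducer<MyEvent> producer =
        new FlinkKinesisProducer<>(schema, producerConfig);
producer.setCustomPartitioner(new DeviceIdPartitioner());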
Is this what you had in mind?
Cheers,
Gordon
On February 21, 2017 at 11:54:21 AM, Sathi Chowdhury wrote:
Hi Flink users and experts,
In my Flink processor I am trying to use the Flink Kinesis connector. I read from a Kinesis stream, and after the transformation (for which I use a RichCoFlatMapFunction), the JSON event needs to sink to a Kinesis stream k1.
DataStream myStream = see.addSource(new
FlinkKine
Hi Sonex,
All windows under the same key (e.g., TimeWindow(0, 3600) and TimeWindow(3600, 7200)) will be processed by the flatMap function. You can put the variable drawn from TimeWindow(0, 3600) into state. When you receive TimeWindow(3600, 7200), you can access that state and apply the function.
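A rough sketch of that pattern (the `WindowResult`/`EnrichedResult` types and field names are placeholders; it assumes the window results are keyed by the same key before the flatMap):

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CarryOverAcrossWindows
        extends RichFlatMapFunction<WindowResult, EnrichedResult> {

    private transient ValueState<Double> previousWindowValue;

    @Override
    public void open(Configuration parameters) {
        // Keyed state: one value is kept per key, surviving across windows.
        previousWindowValue = getRuntimeContext().getState(
                new ValueStateDescriptor<>("prev-window-value", Double.class));
    }

    @Override
    public void flatMap(WindowResult current, Collector<EnrichedResult> out) throws Exception {
        Double prev = previousWindowValue.value();        // value from the previous window, or null
        out.collect(new EnrichedResult(current, prev));   // combine it with the current window's result
        previousWindowValue.update(current.getValue());   // remember for the next window of this key
    }
}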
There's nothing stopping me from assigning timestamps and generating watermarks on a keyed stream in the implementation, and the KeyedStream API supports it. However, it appears that the underlying operator created by DataStream.assignTimestampsAndWatermarks() isn't key-aware and tracks timestamps globally.
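For reference, the usual pattern is to assign timestamps and watermarks on the stream before the keyBy; a sketch (the event type and its fields are illustrative):

// Assign timestamps/watermarks on the un-keyed stream, then key it.
DataStream<MyEvent> withTimestamps = events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
            @Override
            public long extractTimestamp(MyEvent event) {
                return event.getEventTime();  // event time in epoch millis (illustrative field)
            }
        });

// ... then keyBy / window as usual on withTimestamps.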
Hi Stephan,
Just saw your mail while I was explaining the answer to your earlier questions. I have attached some more screenshots, taken from the latest run today.
Yes, I will try to set it to a higher value and check if performance improves.
Let me know your thoughts.
Regards,
Vinay Patil
Hi Stephan,
I am using Flink 1.2.0, and running the job on YARN using c3.4xlarge EC2 instances having 16 cores and 30GB RAM each.
In total I have set 32 slots and allotted 1200 network buffers.
I have attached the latest checkpointing snapshot, its configuration, CPU load average, physi
@Vinay!
Just saw the screenshot you attached to the first mail. The checkpoint that failed came after one that had an incredibly heavy alignment phase (14 GB).
I think that working that off threw off the next checkpoint, because the workers were still working through the alignment backlog.
I think you can
How about adding this to the "logging" docs - a section on how to run log4j2?
On Mon, Feb 20, 2017 at 8:50 AM, Robert Metzger wrote:
> Hi Chet,
>
> These are the files I have in my lib/ folder with the working log4j2
> integration:
>
> -rw-r--r-- 1 robert robert 79966937 Oct 10 13:49 flink-dist_
Hi Shannon!
In the latest HA and BlobStore changes (1.3) it uses "/tmp" only for
caching and will re-obtain the files from the persistent storage.
I think we should make this a bigger point, even:
- Flink should not use "/tmp" at all (except for mini cluster mode)
- Yarn and Mesos should alwa
With pre-aggregation (which the Reduce does), Flink can handle many windows
and many keys, as long as you have the memory and storage to support that.
Your case should work.
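For context, a minimal sketch of what such a pre-aggregating reduce looks like (the `Event` type and its fields are made up for illustration):

import org.apache.flink.api.common.functions.ReduceFunction;

// With a ReduceFunction, Flink keeps only one combined value per key and
// window instead of buffering every element, which keeps many windows cheap.
public class CountSummingReducer implements ReduceFunction<Event> {
    @Override
    public Event reduce(Event a, Event b) {
        return new Event(a.getKey(), a.getCount() + b.getCount());
    }
}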
On Mon, Feb 20, 2017 at 4:58 PM, Vadim Vararu
wrote:
> It's something like:
>
> DataStreamSource stream =
> env.addSourc
Hi!
Exactly-once end-to-end requires sinks that support that kind of behavior
(typically some form of transaction support).
Kafka currently does not have the mechanisms in place to support
exactly-once sinks, but the Kafka project is working on that feature.
For ElasticSearch, it is also not sim
Hi Vinay!
Can you start by giving us a bit of an environment spec?
- What Flink version are you using?
- What is your rough topology (what operations does the program use)?
- Where is the state (windows, keyBy)?
- What is the rough size of your checkpoints and where does the time go?
Can y
Hi Xiaogang,
Thank you for your inputs.
Yes, I have already tried setting MaxBackgroundFlushes and MaxBackgroundCompactions to higher values (tried 2, 4, and 8), but I'm still not getting the expected results.
System.getProperty("java.io.tmpdir") points to /tmp, but I could not find the RocksDB logs there; can y
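For reference, one way to set those options in code (a sketch against the Flink 1.2 RocksDB state backend API; the checkpoint URI is a placeholder):

// Tune RocksDB background flushes/compactions via an OptionsFactory.
RocksDBStateBackend backend =
        new RocksDBStateBackend("hdfs:///flink/checkpoints");  // placeholder URI; constructor may throw IOException
backend.setOptions(new OptionsFactory() {
    @Override
    public DBOptions createDBOptions(DBOptions currentOptions) {
        return currentOptions
                .setMaxBackgroundFlushes(4)
                .setMaxBackgroundCompactions(4);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(ColumnFamilyOptions currentOptions) {
        return currentOptions;  // leave column family options unchanged
    }
});
env.setStateBackend(backend);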
Hi,
I'm using Flink 1.2.0 and trying to do "exactly once" data transfer
from Kafka to Elasticsearch, but I cannot achieve it.
(Scala 2.11, Kafka 0.10, without YARN)
There are 2 Flink TaskManager nodes, and when processing
with parallelism 2, I shut down one of them (simulating a node failure).
Using flink-connecto
Hi,
I'm using Flink and Elasticsearch and I want to receive a JSON object in Elasticsearch ({"id":1, "name":"X"}, etc.). I already have a string with this information, but I don't want to save it as a string.
I receive this:
{
"_index": "logs",
"_type": "object",
"_id": "AVpcARfkfYWqSubr0Zv
It's something like:
DataStreamSource stream = env.addSource(getKafkaConsumer(parameterTool));
stream
.map(getEventToDomainMapper())
.keyBy(getKeySelector())
.window(ProcessingTimeSessionWindows.withGap(Time.minutes(90)))
.reduce(getReducer())
.map(getToJs
Hi Vadim,
this of course depends on your use case. The question is how large is
your state per pane and how much memory is available for Flink?
Are you using incremental aggregates such that only the aggregated value
per pane has to be kept in memory?
Regards,
Timo
On 20/02/17 at 16:34, ... wrote:
Hi guys,
Is it okay to have very many (tens of thousands or hundreds of thousands)
of session windows?
Thanks, Vadim.
Yes, you are correct. A window will be created for each key/group and then
you can apply a function, or aggregate elements per key.
Hi guys,
I can see in many examples that the window method is always preceded by keyBy:

data.keyBy(<key selector>)
    .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
    .<windowed transformation>(<window function>)

Does it mean that a new window will be created for each group/key?
Thanks, Vadim.
A while back on the mailing list, there was a discussion on validating a stream, and splitting the stream into two sinks, depending on how the validation went:

(operator generating errors) --> (filter) --> stream without errors --> sink
                             --> (filter) --> error stream --> sink

Is there an exam
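For illustration, one straightforward way to get the two branches is a pair of filters on the same stream (a sketch; `Event`, `isValid()`, `ValidatingMapper`, and the sinks are placeholders):

// Split by validation result: each filter keeps one side of the stream.
DataStream<Event> validated = input.map(new ValidatingMapper());  // attaches a validity flag (illustrative)

validated.filter(e -> e.isValid())
         .addSink(goodSink);     // records that passed validation

validated.filter(e -> !e.isValid())
         .addSink(errorSink);    // records that failed validation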
Hey Shannon,
good idea! We currently have this:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/ops/production_ready.html
It has a strong focus on managed state and not the points you mentioned.
Would you like to create an issue for adding this to the production
check list? I think i
Hi,
Flink 1.2 partitions all keys into key groups, the atomic units for rescaling. This partitioning is done by hash partitioning and is also in sync with the routing of tuples to operator instances (each parallel instance of a keyed operator is responsible for some range of key groups). T
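For illustration, a simplified sketch of how a key ends up at an operator instance (modeled loosely on Flink's KeyGroupRangeAssignment; not the exact internal code):

import org.apache.flink.util.MathUtils;

// Key -> key group: hash the key and map it into [0, maxParallelism).
public static int keyGroupFor(Object key, int maxParallelism) {
    return MathUtils.murmurHash(key.hashCode()) % maxParallelism;
}

// Key group -> operator index: each parallel instance owns a contiguous range.
public static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
    return keyGroup * parallelism / maxParallelism;
}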
Hi there,
I’m having problems running a job on Flink 1.2.0 that successfully
executes on Flink 1.1.3. The job is supposed to read events from a
Kinesis stream and to send outputs to Elasticsearch, and it actually
initializes successfully on a Flink 1.2.0 cluster running on YARN, but as
soon as I
I don't think you understood the question correctly. I do not care about
information between windows at the same time (i.e., start of window = 0, end
of window 3600). I want to pass a variable, let's say for key 1, from the
apply function of window 0-3600 to the apply function of window 3600-7200,
Hi Sonex,
I think you can accomplish this by using a PassThroughFunction as the apply
function and then processing the elements in a rich flatMap function that
follows. You can keep the information in the flatMap function (via state) so
that it can be shared among different windows.
The program may look
val stream = inputStream
  .assignAscendingTimestamps(_.eventTime)
  .keyBy(_.inputKey)
  .timeWindow(Time.seconds(3600), Time.seconds(3600))
stream.apply{...}
Given the above example I want to transfer information (variables and
values) from the current apply function to the apply function of the next
win
I have seen the log but did not find any relevant information, just some
information about the machine running this node. Disk is less than 10%.
2017-02-20 14:03 GMT+08:00 wangzhijiang999:
> The log just indicates that the SignalHandler handled the kill signal and
> the JobManager process exited, and it can not get the