I have a file in which each line is one JSON record. I run the following:

val env = ExecutionEnvironment.getExecutionEnvironment
val data = env.readTextFile("file:///somefile")
  .map(line => JSON.parseFull(line))

and get the following for one JSON record. For simplicity, the ke
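For comparison, a rough Java sketch of the same read-and-parse step using
Jackson instead of scala.util.parsing.json (Record is a hypothetical POJO
matching the JSON records; only an illustration, not the poster's code):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

public class JsonLines {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Record> records = env.readTextFile("file:///somefile")
            .map(new RichMapFunction<String, Record>() {
                private transient ObjectMapper mapper;

                @Override
                public void open(Configuration parameters) {
                    mapper = new ObjectMapper(); // created once per task, not per line
                }

                @Override
                public Record map(String line) throws Exception {
                    return mapper.readValue(line, Record.class);
                }
            });
        records.print();
    }

    // Hypothetical POJO; field names must match the JSON keys.
    public static class Record {
        public String key;
        public String value;
    }
}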
It is complicated:
1. If you have a file, you should consider using the DataSet API. It is more
complicated to use the DataStream API with files, as you have to simulate a
stream from a file.
2. You need a tokenizer (e.g., a flatMap operator) unless you have one word
per line.
3. The sum operator is fine; it will do the counting. A sketch putting the
three points together follows below.
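A minimal sketch, assuming non-word-character tokenization and a placeholder
input path:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class WordCount {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<String> text = env.readTextFile("file:///path/to/input"); // placeholder path

        text.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    // the tokenizer from point 2: one (word, 1) pair per token
                    for (String word : line.toLowerCase().split("\\W+")) {
                        if (!word.isEmpty()) {
                            out.collect(new Tuple2<>(word, 1));
                        }
                    }
                }
            })
            .groupBy(0)  // group by word
            .sum(1)      // the sum from point 3: counts occurrences
            .print();
    }
}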
Seems a bit convoluted for such a simple problem. I am thinking a custom
streaming count() operator would simplify this. I wasn't able to find examples
of custom streaming operators.
-roshan
On 7/21/16, 8:00 PM, "hrajaram" wrote:
>Can't you use a KeyedStream, I mean keyBy with the same key? something
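A rough sketch of that suggestion: map every element to the same key and let
sum maintain a running count (the socket source here is only a placeholder):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamingCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> lines = env.socketTextStream("localhost", 9999); // placeholder source

        lines
            .map(new MapFunction<String, Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> map(String line) {
                    return new Tuple2<>("all", 1L); // same key for every element
                }
            })
            .keyBy(0)  // everything lands in one keyed partition
            .sum(1)    // emits a running count
            .print();

        env.execute("streaming count");
    }
}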
I'm evaluating Flink for processing batches of data. As a simple example, say I
have 2000 points which I would like to pass through an FIR filter using
functionality provided by the Python scipy library. The scipy filter is a
simple function which accepts a set of coefficients and the data to
Hi All-
I am having trouble using the FlinkShell against a standalone HA cluster
(recovery.mode: zookeeper).
If I remove the zookeeper conf from flink-conf.yaml and restart the
cluster, I can execute stuff from the shell just fine. (One master is
running)
Adding back the config, and restarti
Initializing in "open(Configuration)" means that the ObjectMapper is
created only in the cluster, once the MapFunction is started.
Otherwise it is created earlier (on the client) and copied into the cluster
via serialization, together with the MapFunction.
If the second approach works well (i.e., the O
Declare the objectMapper outside the map class:

final ObjectMapper objectMapper = new ObjectMapper();
source.map(str -> objectMapper.readValue(str, Request.class));
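For completeness, a sketch of the open()-based variant described above
(Request stands in for the POJO type from this thread, and stream for the
DataStream<String> being mapped):

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

stream.map(new RichMapFunction<String, Request>() {
    private transient ObjectMapper objectMapper;

    @Override
    public void open(Configuration parameters) {
        // runs once per parallel task on the cluster, so the mapper is
        // never serialized and shipped from the client
        objectMapper = new ObjectMapper();
    }

    @Override
    public Request map(String value) throws Exception {
        return objectMapper.readValue(value, Request.class);
    }
});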
On Sat, Jul 23, 2016 at 12:28 AM, Yassin Marzouki wrote:
> Thank you Stephan and Kim, that solved the problem.
> Just to make sure, is u
Thank you Stephan and Kim, that solved the problem.
Just to make sure, is using a MapFunction as in the following code any
different? i.e. does it initialize the objectMapper for every element in
the stream?
.map(new MapFunction<String, Request>() {
    private ObjectMapper objectMapper = new ObjectMapper();
Oops, Stephan already answered.
Sorry. T^T
On Sat, Jul 23, 2016 at 12:16 AM, Dong iL, Kim wrote:
> Is the open method signature right, or a typo?
>
> void open(Configuration parameters) throws Exception;
>
> On Sat, Jul 23, 2016 at 12:09 AM, Stephan Ewen wrote:
>
>> I think you overrode the open meth
Is the open method signature right, or a typo?
void open(Configuration parameters) throws Exception;
On Sat, Jul 23, 2016 at 12:09 AM, Stephan Ewen wrote:
> I think you overrode the open method with the wrong signature. The right
> signature would be "open(Configuration cfg) {...}". You probably over
I think you overrode the open method with the wrong signature. The right
signature would be "open(Configuration cfg) {...}". You probably overlooked
this because you missed the "@Override" annotation.
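In other words, compared side by side (a sketch):

// Wrong: declares a brand-new method that Flink never calls,
// so objectMapper stays null at runtime.
public void open() {
    objectMapper = new ObjectMapper();
}

// Right: matches the RichFunction signature, so Flink calls it once per
// task before any elements arrive. With @Override, the compiler would
// have flagged the mismatch above.
@Override
public void open(Configuration parameters) throws Exception {
    objectMapper = new ObjectMapper();
}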
On Fri, Jul 22, 2016 at 4:49 PM, Yassin Marzouki wrote:
> Hi everyone,
>
> I want to convert a
Hi everyone,
I want to convert a stream of JSON strings to POJOs using Jackson, so I did
the following:
.map(new RichMapFunction<String, Request>() {
    private ObjectMapper objectMapper;

    public void open() {
        objectMapper = new ObjectMapper();
    }

    @Override
    public Request map(String value) throws Exception {
        return objectMapper.readValue(value, Request.class);
    }
})
Hi Chesnay, hi Robert
Thank you for your explanations :-)
(And sorry for the late reply).
Regards,
Robert
From: Robert Metzger [mailto:rmetz...@apache.org]
Sent: Tuesday, June 21, 2016 12:12
To: user@flink.apache.org
Subject: Re: Getting the NumberOfParallelSubtask
Hi Robert,
the number
Hi all,
> (1) Only write to the DB upon a checkpoint, at which point it is known
> that no replay of that data will occur any more. Values from partially
> successful writes will be overwritten with the correct value. I assume that
> is what you thought of when referring to the State Backend, because in so
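A sketch of option (1) as a checkpoint-aware sink (writeBatchToDb is a
hypothetical helper, and a real implementation would also have to checkpoint
the buffer itself to be fault tolerant):

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.runtime.state.CheckpointListener;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

// Buffers values and writes them to the DB only after a checkpoint
// completes, when it is known that no replay of that data will occur.
public class CheckpointAwareDbSink extends RichSinkFunction<String>
        implements CheckpointListener {

    private final List<String> buffer = new ArrayList<>();

    @Override
    public void invoke(String value) {
        buffer.add(value); // hold the write back until the next checkpoint
    }

    @Override
    public void notifyCheckpointComplete(long checkpointId) {
        writeBatchToDb(buffer); // overwrites fix partially successful writes
        buffer.clear();
    }

    private void writeBatchToDb(List<String> batch) {
        // hypothetical DB access, not shown in the thread
    }
}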
@Sameer, yes, if one source stops emitting watermarks then downstream
operations will buffer data until the source starts updating the watermark
again. If you can live with some data being late, you could change the
watermark logic in the source to start advancing the watermark if no new
data is arr
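A sketch of that watermark change as a periodic assigner (MyEvent and its
getTimestamp() accessor are hypothetical, and data arriving after the
advanced watermark will be treated as late):

import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

// Advances the watermark from the wall clock when no element has arrived
// for IDLE_TIMEOUT_MS, so downstream operators stop buffering indefinitely.
public class IdleAwareAssigner implements AssignerWithPeriodicWatermarks<MyEvent> {

    private static final long IDLE_TIMEOUT_MS = 10_000L; // hypothetical threshold

    private long maxTimestamp = Long.MIN_VALUE;
    private long lastElementAt = System.currentTimeMillis();

    @Override
    public long extractTimestamp(MyEvent element, long previousTimestamp) {
        maxTimestamp = Math.max(maxTimestamp, element.getTimestamp());
        lastElementAt = System.currentTimeMillis();
        return element.getTimestamp();
    }

    @Override
    public Watermark getCurrentWatermark() {
        if (System.currentTimeMillis() - lastElementAt > IDLE_TIMEOUT_MS) {
            // source looks idle: advance on processing time, accepting that
            // anything that still arrives with older timestamps is now late
            return new Watermark(System.currentTimeMillis() - IDLE_TIMEOUT_MS);
        }
        return new Watermark(maxTimestamp);
    }
}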