Hi Ken,
Never mind. I added a map of Tuple2.of(C, 1) -> Tuple2.of(F, 1) before
keyBy, then replaced the map after sum with Tuple2.of(F, ) -> d, and I
got the expected result. So it's very likely something is wrong with how C
determines equality. Thank you all the same.
- Luke
On Sun, Apr 30, 2023, Ken wrote:
Hi Luke,
Without more details, I don’t have a good explanation for why you’re seeing
duplicate results.
— Ken
> On Apr 29, 2023, at 7:38 PM, Luke Xiong wrote:
>
> Hi Ken,
>
> My workflow looks like this:
>
> dataStream
> .map(A -> B)
> .flatMap(B -> some Tuple2.of(C, 1))
> .keyBy(...)
Hi Luke,
What’s your workflow look like?
The one I included in my email generates what I’d expect.
— Ken
> On Apr 29, 2023, at 7:22 PM, Luke Xiong wrote:
>
> Hi Marco and Ken,
>
> Thanks for the tips. I tried setting runtime mode to BATCH and it seems to
> work. However, I notice there are duplicate results in the output.
Hi Luke,
A batch has a beginning and an end. Although a stream has a beginning, it has
no end.
Thus, you have three choices:
1. You can collect your aggregated sum at a periodic interval.
2. You can aggregate the sum in a time window; if your window is large enough,
you'll get the same result (see the sketch after this list).
3. You can run the job in BATCH execution mode, since a text file is a bounded input.
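A minimal sketch of option 2, assuming a tumbling 10-second processing-time window over an unbounded text source; the socket source, window size, and tokenization are placeholders rather than anything from the original job:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class WindowedWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)        // placeholder unbounded text source
            .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
                for (String word : line.toLowerCase().split("\\W+")) {
                    if (!word.isEmpty()) {
                        out.collect(Tuple2.of(word, 1));
                    }
                }
            })
            .returns(Types.TUPLE(Types.STRING, Types.INT))
            .keyBy(t -> t.f0)
            // Emit one (word, count) per key for every 10-second slice of the stream.
            .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
            .sum(1)
            .print();

        env.execute("windowed word count");
    }
}
```

Each window emits its own partial counts, so this only matches the batch result if a single window covers all of the input.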
Dear experts,
Is it possible to write a WordCount job that uses the DataStream API, but
have it behave like the batch version of the WordCount example?
More specifically, I hope the job can produce a DataStream of the final
(word, count) records when fed a text file.
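For reference, a minimal sketch of the approach Luke reports trying in his reply further up the thread: run the DataStream job with the runtime mode set to BATCH, so the bounded file input yields exactly one final (word, count) record per word. The file path is a placeholder, and readTextFile is used for brevity (newer Flink releases point to the FileSource connector instead):

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class BatchStyleWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // A file is a bounded input, so the job can run in BATCH execution mode;
        // each key's sum is then emitted exactly once, after the input is exhausted.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        env.readTextFile("/path/to/input.txt")         // placeholder path
            .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
                for (String word : line.toLowerCase().split("\\W+")) {
                    if (!word.isEmpty()) {
                        out.collect(Tuple2.of(word, 1));
                    }
                }
            })
            .returns(Types.TUPLE(Types.STRING, Types.INT))
            .keyBy(t -> t.f0)
            .sum(1)                                    // final (word, count) per word
            .print();

        env.execute("batch-style word count");
    }
}
```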
For example, given a text file:
```inpu