I'm not able to use asyncIO because the file will not fit in memory. I
thought that flatmap will allow me to enrich/process records while
downloading instead of waiting for the whole file to get downloaded. The
solution works but its not scalable because i'm not able to use
AsynFunction in Flatmap.
You can use flatMap to flatten and have an asyncIO after it.
On Wed, Mar 9, 2022 at 8:08 AM Diwakar Jha wrote:
> Thanks Gen, I will look into customized Source and SpiltEnumerator.
>
> On Mon, Mar 7, 2022 at 10:20 PM Gen Luo wrote:
>
>> Hi Diwakar,
>>
>> An asynchronous flatmap function without
Thanks Gen, I will look into customized Source and SpiltEnumerator.
On Mon, Mar 7, 2022 at 10:20 PM Gen Luo wrote:
> Hi Diwakar,
>
> An asynchronous flatmap function without the support of the framework can
> be problematic. You should not call collector.collect outside the main
> thread of the
Hi Diwakar,
An asynchronous flatmap function without the support of the framework can
be problematic. You should not call collector.collect outside the main
thread of the task, i.e. outside the flatMap method.
I'd suggest using a customized Source instead to process the files, which
uses a SplitE
Hello Everyone,
I'm running a streaming application using Flink 1.11 and EMR 6.01. My use
case is reading files from a s3 bucket, filter file contents ( say record)
and enrich each record. Filter records and output to a sink.
I'm reading 6k files per 15mints and the total number of records is 3
bi