Re: Batch Flink Job S3 write performance vs Spark

sri hari kali charan Tummala Wed, 26 Feb 2020 07:35:53 -0800

Ok, thanks for the clarification.

On Wed, Feb 26, 2020 at 9:22 AM Arvid Heise <ar...@ververica.com> wrote:


> Exactly. We use the hadoop-fs as an indirection on top of that, but Spark
> probably does the same.
>
> On Wed, Feb 26, 2020 at 3:52 PM sri hari kali charan Tummala <
> kali.tumm...@gmail.com> wrote:
>
>> Thank you. Since the two systems run on Java and use the same set of
>> libraries, my understanding is that Flink uses the AWS SDK behind the
>> scenes, the same as Spark.
>>
>> On Wed, Feb 26, 2020 at 8:49 AM Arvid Heise <ar...@ververica.com> wrote:
>>
>>> Fair benchmarks are notoriously difficult to set up.
>>>
>>> Usually, it's easy to find a workload where one system shines, and as its
>>> vendor you report that. Then the competitor benchmarks a different use
>>> case where their system outperforms ours. In the end, customers are more
>>> confused than before.
>>>
>>> You should do your own benchmarks for your own workloads. That is the
>>> only reliable way.
>>>
>>> In the end, both systems use similar setups and improvements in one
>>> system are often also incorporated into the other system with some delay,
>>> such that there should be no ground-breaking differences between the two
>>> systems running on Java and using the same set of libraries.
>>> Of course, if one system has a very specific optimization for your use
>>> case, that could be much faster.
>>>
>>>
>>> On Mon, Feb 24, 2020 at 11:26 PM sri hari kali charan Tummala <
>>> kali.tumm...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I have a question: has anyone compared the performance of a Flink batch
>>>> job writing to S3 with Spark writing to S3?
>>>>
>>>> --
>>>> Thanks & Regards
>>>> Sri Tummala
>>>>
>>>>
>>
>> --
>> Thanks & Regards
>> Sri Tummala
>>
>>

-- 
Thanks & Regards
Sri Tummala
