Re: Batch Flink Job S3 write performance vs Spark

Arvid Heise Wed, 26 Feb 2020 07:22:52 -0800

Exactly. We use the hadoop-fs as an indirection on top of that, but Spark
probably does the same.


On Wed, Feb 26, 2020 at 3:52 PM sri hari kali charan Tummala <
kali.tumm...@gmail.com> wrote:

> Thank you. So, from my understanding (the two systems running on Java and
> using the same set of libraries), Flink uses the AWS SDK behind the scenes,
> the same as Spark.
>
> On Wed, Feb 26, 2020 at 8:49 AM Arvid Heise <ar...@ververica.com> wrote:
>
>> Fair benchmarks are notoriously difficult to set up.
>>
>> Usually, it's easy to find a workload where one system shines, and as its
>> vendor you report that. Then the competitor benchmarks a different use
>> case where their system outperforms ours. In the end, customers are more
>> confused than before.
>>
>> You should do your own benchmarks for your own workloads. That is the
>> only reliable way.
>>
>> In the end, both systems use similar setups and improvements in one
>> system are often also incorporated into the other system with some delay,
>> such that there should be no ground-breaking differences between the two
>> systems running on Java and using the same set of libraries.
>> Of course, if one system has a very specific optimization for your use
>> case, it could be much faster.
>>
>>
>> On Mon, Feb 24, 2020 at 11:26 PM sri hari kali charan Tummala <
>> kali.tumm...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I have a question: has anyone compared the performance of a Flink batch
>>> job writing to S3 vs. Spark writing to S3?
>>>
>>> --
>>> Thanks & Regards
>>> Sri Tummala
>>>
>>>
>
> --
> Thanks & Regards
> Sri Tummala
>
>
