I have shared only observations based on my load test, I am happy to discuss 
any specifics if anyone is interested

Jigar

> On Oct 14, 2024, at 12:51 PM, Jarek Potiuk <ja...@potiuk.com> wrote:
> 
> Hello Jigar,
> 
> I think if someone wants to do a fair comparison of performance and cost, a
> detailed analysis of multiple cases, users, usage patterns and statistics
> if needed - rather than anecdotal evidence and single installation done not
> following the best practices (i.e. pgbouncer).
> 
> The statement "cost is higher" is an interesting anecdotal evidence of a
> single case, and there are multiple options of running Postgres and MysQL,
> so comparing costs is rather difficult. It might or might not be true in
> concrete cases.
> 
> I generally try to avoid making judgement based on very limited data and
> generalise based on a single case - it's often misleading and dangerous to
> draw such conclusions. And publicly stating them as "facts" is a bit of
> misinformation.  That's why we avoid making a "definite" statement like
> "Postgres is slower" and "Postgres is not good at all". We try to judge
> stats and signals we have and draw conclusions when we have
> statistically significant data - that makes it more likely we make better
> judgment.
> 
> I find it quite amusing how someone could reliably draw such conclusions
> based on a single example. Sounds a bit early and skewed and I'd recommend
> avoiding it.
> 
> On the other hand the locking problems are based on statistical
> comparison of our issues that are reported and surveys we run every year.
> Last one -
> https://docs.google.com/forms/d/1wYm6c5Gn379zkg7zD7vcWB-1fCjnOocT0oZm-tjft_Q/viewanalytics
> - where 75% of our users use Postgres. If you look at a number of locking
> issues in Postgres compared to MySQL (similar to what Damian saw) - you can
> get your numbers and it's about 10:1 for MySQL - so if you combine the two,
> the average ratio of Mysql vs. Postgres is 40:1 - quite high to be
> statistically significant, so the statement "MySQL has problems with
> locking" is pretty justified.
> 
> But If you would like to lead that effort and do it Jigar - if there is
> no-one who has done such statistics, it would be great if you lead it and
> get such statistics to back-up your statements. That would be an awesome
> decision point for the community if someone had done such reliable
> statistically significant analysis.
> 
> J.
> 
>> On Mon, Oct 14, 2024 at 9:19 PM Jigar Parekh <ji...@vizeit.com> wrote:
>> 
>> I have observed that allocating right amount of memory-cpu helps to avoid
>> issues with MySQL
>> 
>> Jigar
>> 
>>> On Oct 14, 2024, at 11:51 AM, Damian Shaw <ds...@striketechnologies.com>
>> wrote:
>>> 
>>> I can't speak to general performance between MySQL and PostgreSQL, but
>> I can tell you a real-world specific issue I we have faced being on MySQL
>> for Airflow.
>>> 
>>> MySQL locking and Airflow's rendering of task fields do not play nice,
>> and we see many errors (the relevant GitHub issue:
>> https://github.com/apache/airflow/issues/32342). It has led us to avoid
>> using Dynamic Task Mapping, as well as nesting of Task Groups, while we are
>> still on MySQL, which is a shame because these features would be a natural
>> fit to solve some of our problems.
>>> 
>>> Damian
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: Jigar Parekh <ji...@vizeit.com>
>>> Sent: Monday, October 14, 2024 2:35 PM
>>> To: dev@airflow.apache.org
>>> Subject: Re: [DISCUSSION] Plans for Database backend
>>> 
>>> In my understanding, the slower performance of PostgreSQL is a known
>> behavior for write intensive applications. PGBouncer used for connection
>> pooling cannot change/improve that. And Airflow with multiple DAGs and/or
>> dynamic tasks with heavy workload will be write intensive. I have done
>> extensive tests and have collected database statistics that show the
>> bottleneck. I can share the details if you would like to see. I am
>> wondering if there are any load tests or comparisons done so far by anyone
>> in the community. Unless I am doing something totally wrong, PostgreSQL
>> actually does not seem to be a right choice at all; slower performance and
>> higher cost (due to PGBouncer process) compared to MySQL
>>> 
>>> Jigar
>>> 
>>>> On Oct 14, 2024, at 2:41 AM, Jarek Potiuk <ja...@potiuk.com> wrote:
>>>> 
>>>> MySQL is not going away. You can use it if you want. We have no plans
>>>> to remove it.
>>>> 
>>>> The advice did not change. Postgres is generally more stable that's
>>>> why we recommend it. MySQL has much worse locking behaviour that is
>>>> somewhat unpredictable and - especially when you use backfills - it is
>>>> known to generate occasional deadlocks. This is likely why you got the
>>>> advice (quite likely by me).
>>>> 
>>>> But in general it's fully supported and works and we have no plans to
>>>> get rid of it.
>>>> 
>>>> Re: cost/price - it's up to you to choose and pay for services you use
>>>> - we do not compare pricing of all possible available services.
>>>> 
>>>> Also if you follow the best practices for Postgres (use pgbouncer:
>>>> https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-dat
>>>> abase.html#setting-up-a-postgresql-database),
>>>> it's unlikely you will have slower postgres with similar machine/setup.
>>>> 
>>>> J.
>>>> 
>>>> 
>>>>>> On Mon, Oct 14, 2024 at 7:38 AM Jigar Parekh <ji...@vizeit.com>
>> wrote:
>>>>> 
>>>>> Back in June’24, I had a discussion on Slack about a database issue.
>>>>> My database backend for the Airflow instance is MySQL. It was
>>>>> recommended to migrate to PostgreSQL to resolve such issues. I was
>>>>> also told that MySQL may not be supported in the future versions. I
>>>>> configured PostgreSQL and performed few tests to compare both the DBs
>>>>> with the type of heavy workload I was expecting for my airflow
>>>>> instance in the production environment. The test results did not show
>>>>> a reason to switch from MySQL to PostgreSQL. In fact, PostgreSQL
>>>>> performed slower and airflow configuration cost more compared to
>>>>> MySQL. I wanted to start this discussion to find out if others have
>>>>> any similar observations about Postgres and what is Airflow community
>> planning to do about MySQL support in the future versions?
>>>>> 
>>>>> Jigar
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>>>> 
>>>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>> 
>>> ________________________________
>>> Strike Technologies, LLC (“Strike†) is part of the GTS family of
>> companies. Strike is a technology solutions provider, and is not a broker
>> or dealer and does not transact any securities related business directly
>> whatsoever. This communication is the property of Strike and its
>> affiliates, and does not constitute an offer to sell or the solicitation of
>> an offer to buy any security in any jurisdiction. It is intended only for
>> the person to whom it is addressed and may contain information that is
>> privileged, confidential, or otherwise protected from disclosure.
>> Distribution or copying of this communication, or the information contained
>> herein, by anyone other than the intended recipient is prohibited. If you
>> have received this communication in error, please immediately notify Strike
>> at i...@striketechnologies.com, and delete and destroy any copies hereof.
>>> ________________________________
>>> 
>>> CONFIDENTIALITY / PRIVILEGE NOTICE: This transmission and any
>> attachments are intended solely for the addressee. This transmission is
>> covered by the Electronic Communications Privacy Act, 18 U.S.C ''2510-2521.
>> The information contained in this transmission is confidential in nature
>> and protected from further use or disclosure under U.S. Pub. L. 106-102,
>> 113 U.S. Stat. 1338 (1999), and may be subject to attorney-client or other
>> legal privilege. Your use or disclosure of this information for any purpose
>> other than that intended by its transmittal is strictly prohibited, and may
>> subject you to fines and/or penalties under federal and state law. If you
>> are not the intended recipient of this transmission, please DESTROY ALL
>> COPIES RECEIVED and confirm destruction to the sender via return
>> transmittal.
>>> 
>> B‹KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKCB• È
>> [œÝXœØÜšX™K  K[XZ[ ˆ  ]‹][œÝXœØÜšX™P Z\™› ݢ\ XÚ K›Ü™ÃB‘›Üˆ Y  ] [Û˜[
>> ÛÛ[X[™ Ë  K[XZ[ ˆ  ]‹Z [   Z\™› ݢ\ XÚ K›Ü™ÃB
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> For additional commands, e-mail: dev-h...@airflow.apache.org
>> 
>> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org

Reply via email to