Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-10 Thread Bryan Davis
On Mon, Jun 3, 2019 at 4:30 AM Bryan Davis wrote: > > One "fix" I could imagine for this would be for the Wiki Replicas to > also provide per-entity views of the comment table that only return > rows related to a single entity table. For example a > "comment_ipblocks" view could use a where clause

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-04 Thread Daniel Kinzler
Am 03.06.19 um 22:50 schrieb John: > With the introduction of the actor and comment tables, I have seen a notable > spike in query times. Take a look at Jessie’s query. A query that used to > take a > couple of seconds now takes almost 5 minutes. It also spanned what 8-9 > different > joins now?

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-04 Thread Jaime Crespo
There is another factor that contributes both to the thinking of too much normalization and performance issues: There is now multiple "*_temp" tables that are required to join things like comment and revision, or actor and revision. These tables are temporary and only necessary during the migratio

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-04 Thread Jaime Crespo
I think there is a misunderstanding that normalization reduces performance (even for analytic queries)- all the opposite, due to a lower amount of actual data stored, the memory to disk ratio increases and while joins may increase cpu usage, the usage, even on labs, is normally negligible. In parti

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Huji Lee
It may help to put things in context and remind everyone that among the most visited websites in the world (of which, Wikipedia is #5 per Alexa [0]) the vast majority do not even use a relational database, making real-time large-scale analytic queries something that is not easily available to the u

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Bryan Davis
On Mon, Jun 3, 2019 at 2:56 PM John wrote: > > Please don’t be patronizing insulting. What did read. What used to be one > fairly quick query will now be thousands of decoupled queries. I know a > number of cases where the decoupled query count can get into the hundreds of > thousands. So what

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
Please don’t be patronizing insulting. What did read. What used to be one fairly quick query will now be thousands of decoupled queries. I know a number of cases where the decoupled query count can get into the hundreds of thousands. So what used to take 1 query and <2 minutes is now going to hamme

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Alex Monk
I believe John is trying to make the implication that the comment table was created purely for the sake of normalisation, but I believe it came out of the concerns around the size of the revision table: https://www.mediawiki.org/wiki/User:Brion_VIBBER/Compacting_the_revision_table_round_2 On Mon,

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
With the introduction of the actor and comment tables, I have seen a notable spike in query times. Take a look at Jessie’s query. A query that used to take a couple of seconds now takes almost 5 minutes. It also spanned what 8-9 different joins now? In database design you should almost never have p

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Amir Sarabadani
I guarantee you, mediawiki core is everything but "over-normalized". We haven't done anything yet. In WMCS it's slower for reasons mentioned above, in production it's fast. Also, as I mentioned about "join decomposition", please read. Best regards On Mon, Jun 3, 2019, 21:58 John wrote: > This

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Chico Venancio
Hey John, What is the specific normalization you believe is unneeded? Chico Venancio Em seg, 3 de jun de 2019 16:58, John escreveu: > This is a prime example of over normalization. We are normalizing and > making things slower. > > On Mon, Jun 3, 2019 at 3:38 PM Jesse Plamondon-Willard < > path

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
This is a prime example of over normalization. We are normalizing and making things slower. On Mon, Jun 3, 2019 at 3:38 PM Jesse Plamondon-Willard < pathosch...@gmail.com> wrote: > Sure! I just fetch the comment ID in the original query ([1]), and fetch > the comment text with a separate query ([

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Jesse Plamondon-Willard
Sure! I just fetch the comment ID in the original query ([1]), and fetch the comment text with a separate query ([2]). You need *n*+1 queries based on the number of records, but the performance is adequate for my use cases. [1]: main query

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Huji Lee
Out of curiosity, Jesse, can you share the query after that change, so it is all stored in this thread for future reference? On Sun, Jun 2, 2019 at 11:09 PM Jesse Plamondon-Willard < pathosch...@gmail.com> wrote: > Thanks! Performance is much better with that change. > > > On Sun, 2 Jun 2019 at 2

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Brad Jorsch (Anomie)
On Sun, Jun 2, 2019 at 11:30 PM Bryan Davis wrote: > One "fix" I could imagine for this would be for the Wiki Replicas to > also provide per-entity views of the comment table that only return > rows related to a single entity table. For example a > "comment_ipblocks" view could use a where clause

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-02 Thread Bryan Davis
On Sun, Jun 2, 2019 at 7:43 PM Jesse Plamondon-Willard wrote: > > Hello, > > Some Toolforge SQL queries have much worse performance when updated for the > new comment table. > > For example, see one previous SQL query and its updated version which just > adds a join on the comment table. The cha

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-02 Thread Jesse Plamondon-Willard
Thanks! Performance is much better with that change. On Sun, 2 Jun 2019 at 21:54, Amir Sarabadani wrote: > Hey, > One important optimization you can use and it's often missed out (and it's > going to be needed more as we normalize more tables) is join decomposition. > It basically means you don

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-02 Thread Amir Sarabadani
Hey, One important optimization you can use and it's often missed out (and it's going to be needed more as we normalize more tables) is join decomposition. It basically means you don't join and query but do two (or several) queries separately in your code. This might seem counter intuitive but it's

[Cloud] Toolforge SQL performance issue with new comment table

2019-06-02 Thread Jesse Plamondon-Willard
Hello, Some Toolforge SQL queries have much worse performance when updated for the new comment table. For example, see one previous SQL query