Re: [Cloud] cloudservices1003 rebuild on 2019-06-03

2019-06-03 Thread Arturo Borrero Gonzalez
On 5/28/19 8:11 PM, Arturo Borrero Gonzalez wrote: > Hi! > > On 2019-06-03 UTC+2 14:00 (next monday) we will be rebuilding the > cloudservices1003 server, > that holds the designate service which serves DNS request for CloudVPS and > Toolforge. > > We have a backup server -cloudservices1004-, so

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Brad Jorsch (Anomie)
On Sun, Jun 2, 2019 at 11:30 PM Bryan Davis wrote: > One "fix" I could imagine for this would be for the Wiki Replicas to > also provide per-entity views of the comment table that only return > rows related to a single entity table. For example a > "comment_ipblocks" view could use a where clause

Re: [Cloud] [Cloud-announce] Dropping user text columns from replica views 2019-05-27 now 2019-06-03

2019-06-03 Thread Brooke Storm
Work on this is beginning. Brooke Storm Operations Engineer Wikimedia Cloud Services bst...@wikimedia.org IRC: bstorm_ > On May 20, 2019, at 12:38 PM, Brooke Storm wrote: > > I’d like to announce that after taking considerable user feedback about it, > and fixing

[Cloud] Optimalizing a SQL query

2019-06-03 Thread Martin Urbanec
Hello, I'm trying to run this query more fastly. Can you help me how to optimalize it? MariaDB [commonswiki_p]> select count(*) from logging_logindex where log_type="thanks" and log_title="Martin_Urbanec"; +--+ | count(*) | +--+ |6 | +--+ 1 row in set (8 min 42.87

Re: [Cloud] Optimalizing a SQL query

2019-06-03 Thread Jaime Crespo
Hello, On Mon, Jun 3, 2019 at 7:31 PM Martin Urbanec wrote: > > Hello, > > I'm trying to run this query more fastly. Can you help me how to optimalize > it? > > MariaDB [commonswiki_p]> select count(*) from logging_logindex where > log_type="thanks" and log_title="Martin_Urbanec"; > +--

Re: [Cloud] Optimalizing a SQL query

2019-06-03 Thread Alex Monk
To use those indexes on title you also need to filter by log_namespace: MariaDB [commonswiki_p]> select count(*) from logging_logindex where log_type="thanks" and log_title="Martin_Urbanec" and log_namespace = 2; +--+ | count(*) | +--+ |6 | +--+ 1 row in set (0.03 se

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Huji Lee
Out of curiosity, Jesse, can you share the query after that change, so it is all stored in this thread for future reference? On Sun, Jun 2, 2019 at 11:09 PM Jesse Plamondon-Willard < pathosch...@gmail.com> wrote: > Thanks! Performance is much better with that change. > > > On Sun, 2 Jun 2019 at 2

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Jesse Plamondon-Willard
Sure! I just fetch the comment ID in the original query ([1]), and fetch the comment text with a separate query ([2]). You need *n*+1 queries based on the number of records, but the performance is adequate for my use cases. [1]: main query

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
This is a prime example of over normalization. We are normalizing and making things slower. On Mon, Jun 3, 2019 at 3:38 PM Jesse Plamondon-Willard < pathosch...@gmail.com> wrote: > Sure! I just fetch the comment ID in the original query ([1]), and fetch > the comment text with a separate query ([

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Chico Venancio
Hey John, What is the specific normalization you believe is unneeded? Chico Venancio Em seg, 3 de jun de 2019 16:58, John escreveu: > This is a prime example of over normalization. We are normalizing and > making things slower. > > On Mon, Jun 3, 2019 at 3:38 PM Jesse Plamondon-Willard < > path

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Amir Sarabadani
I guarantee you, mediawiki core is everything but "over-normalized". We haven't done anything yet. In WMCS it's slower for reasons mentioned above, in production it's fast. Also, as I mentioned about "join decomposition", please read. Best regards On Mon, Jun 3, 2019, 21:58 John wrote: > This

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
With the introduction of the actor and comment tables, I have seen a notable spike in query times. Take a look at Jessie’s query. A query that used to take a couple of seconds now takes almost 5 minutes. It also spanned what 8-9 different joins now? In database design you should almost never have p

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Alex Monk
I believe John is trying to make the implication that the comment table was created purely for the sake of normalisation, but I believe it came out of the concerns around the size of the revision table: https://www.mediawiki.org/wiki/User:Brion_VIBBER/Compacting_the_revision_table_round_2 On Mon,

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread John
Please don’t be patronizing insulting. What did read. What used to be one fairly quick query will now be thousands of decoupled queries. I know a number of cases where the decoupled query count can get into the hundreds of thousands. So what used to take 1 query and <2 minutes is now going to hamme

Re: [Cloud] Optimalizing a SQL query

2019-06-03 Thread Yetkin Sakal via Cloud
Hello Martin, Adding "and log_namespace = 2" to the where clause solves the performance problem. Regards. On Monday, June 3, 2019, 8:31:44 PM GMT+3, Martin Urbanec wrote: Hello,  I'm trying to run this query more fastly. Can you help me how to optimalize it? MariaDB [commonswiki_p]> s

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Bryan Davis
On Mon, Jun 3, 2019 at 2:56 PM John wrote: > > Please don’t be patronizing insulting. What did read. What used to be one > fairly quick query will now be thousands of decoupled queries. I know a > number of cases where the decoupled query count can get into the hundreds of > thousands. So what

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Huji Lee
It may help to put things in context and remind everyone that among the most visited websites in the world (of which, Wikipedia is #5 per Alexa [0]) the vast majority do not even use a relational database, making real-time large-scale analytic queries something that is not easily available to the u