Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-02 Thread Amir Sarabadani
Hey, One important optimization you can use, which is often missed (and will be needed more as we normalize more tables), is join decomposition. It basically means that instead of joining in a single query, you run two (or several) queries separately in your code. This might seem counterintuitive but it's
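A minimal sketch of the join decomposition idea described in the message above, not code from the thread. It assumes a hypothetical, heavily simplified schema (`revision` with a `rev_comment_id` column pointing at `comment.comment_id`) and uses an in-memory SQLite database so it runs anywhere; on the replicas the same pattern would go through whatever MySQL/MariaDB client a tool already uses.

```python
import sqlite3


def fetch_revisions_with_comments(conn, limit=3):
    """Join decomposition: two simple queries stitched together in code."""
    # Query 1: fetch the rows of interest without joining anything.
    revs = conn.execute(
        "SELECT rev_id, rev_comment_id FROM revision ORDER BY rev_id LIMIT ?",
        (limit,),
    ).fetchall()
    comment_ids = sorted({comment_id for _, comment_id in revs})
    if not comment_ids:
        return []

    # Query 2: fetch only the comments those rows reference.
    placeholders = ",".join("?" * len(comment_ids))
    comments = dict(
        conn.execute(
            "SELECT comment_id, comment_text FROM comment "
            f"WHERE comment_id IN ({placeholders})",
            comment_ids,
        ).fetchall()
    )

    # Combine the two result sets in application code instead of in SQL.
    return [(rev_id, comments.get(comment_id)) for rev_id, comment_id in revs]


def demo():
    # Hypothetical toy data; table and column names are illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE comment (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_comment_id INTEGER);
        INSERT INTO comment VALUES (1, 'fix typo'), (2, 'revert');
        INSERT INTO revision VALUES (10, 1), (11, 2), (12, 1);
        """
    )
    return fetch_revisions_with_comments(conn)
```

Calling `demo()` returns each revision paired with its comment text. The second query only touches the comment rows that the first query actually referenced, which is the point of the technique.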

Re: [Cloud] Toolforge SQL performance issue with new comment table

2019-06-03 Thread Amir Sarabadani
I guarantee you, MediaWiki core is anything but "over-normalized". We haven't done anything yet. In WMCS it's slower for reasons mentioned above, in production it's fast. Also, as I mentioned about "join decomposition", please read. Best regards On Mon, Jun 3, 2019, 21:58 John wrote: > This

Re: [Cloud] [Cloud-announce] Wiki Replicas 2020 Redesign

2020-11-14 Thread Amir Sarabadani
Hello, I actually welcome the change and am quite happy about it. It might break several tools (including some of mine) but as a database nerd, I can see the benefits outweighing the problems (and I wish the benefits had been communicated in the announcement). The short version is that this cha

Re: [Cloud] [Cloud-announce] Wiki Replicas 2020 Redesign

2020-11-17 Thread Amir Sarabadani
Hello, Actually Jaime's email gave me an idea. Why not have a separate actual data lake? Like a Hadoop cluster, it can even take the data from the analytics cluster (after being sanitized of course). I remember there were some discussions about having a Hadoop or Presto cluster in WM Cloud. Has this

Re: [Cloud] [Ops] Change to how Cloud VPS and Toolforge contact Wikis

2021-01-29 Thread Amir Sarabadani
This is sorta (under-)documented in https://www.mediawiki.org/wiki/Manual:$wgRateLimits I made a patch for it but I'm not sure if I did it correctly. On Fri, Jan 29, 2021 at 10:21 AM Arturo Borrero Gonzalez < aborr...@wikimedia.org> wrote: > On 1/28/21 9:50 PM, Martin Urbanec wrote: > > Hi Artur

Re: [Cloud] [Cloud-announce] new Cloud VPS feature: attachable block storage

2021-02-05 Thread Amir Sarabadani
That's amazing. Thank you so much. On Fri, Feb 5, 2021 at 5:02 PM Andrew Bogott wrote: > Wikimedia Cloud Services now supports attachable block storage via the > OpenStack Cinder project. Attachable block storage is a flexible storage > option that allows you to create volumes local to your p

[Cloud] Code of conduct committee call for new members

2021-02-26 Thread Amir Sarabadani
Hello all, It's coming close to time for annual appointments of community members to serve on the Code of Conduct (CoC) committee. The Code of Conduct Committee is a team of five trusted individuals plus five auxiliary members with diverse affiliations responsible for general enforcement of the Co

[Cloud] Issue with jitsi - Anyone knows why?

2021-02-26 Thread Amir Sarabadani
Hello, I usually wouldn't bother people with my issues but I'm sorta desperate here. This is about https://meet.wmcloud.org, the WM Cloud Jitsi instance. It's on a bigram VM and runs on Docker. Users report that when using this "a ses

[Cloud] Changes to storage of files metadata

2021-06-26 Thread Amir Sarabadani
Hello, If you don't do anything with the metadata fields of file tables (the image table, for example) in replicas, you can ignore this email. The "image" table in Wikimedia Commons is extremely big (more than 380GB compressed) and has been causing multiple major issues (including an incident recently). Deep i

[Cloud] Re: [Cloud-announce] Database as a Service in Cloud VPS

2021-07-22 Thread Amir Sarabadani
This is awesome! Thank you for making it happen!🎉🎉🎉 On Mon, Jul 19, 2021 at 10:53 PM Andrew Bogott wrote: > A few weeks ago we rolled out a new service for Cloud VPS users: > OpenStack Trove, aka 'Database as a Service.' > > Trove provides automatic orchestration of stand-alone database > instan

[Cloud] Re: [Cloud-announce] [IMPORTANT] Announcing Toolforge Debian Stretch Grid Engine deprecation

2022-03-10 Thread Amir Sarabadani
I recently migrated most of my pywikibot jobs from grid engine to k8s and to my surprise, it was actually quite easy. I have lots of tasks (been running bots since 2008). So many that it showed a reduction in the total number of jobs in SGE. One thing that helped me was that I collapsed everything into a b

[Cloud] Changes in schema of MediaWiki links tables

2022-03-29 Thread Amir Sarabadani
://phabricator.wikimedia.org/T24. Thanks -- *Amir Sarabadani (he/him)* Staff Database Architect Wikimedia Foundation <https://wikimediafoundation.org/> ___ Cloud mailing list -- cloud@lists.wikimedia.org List information: https://lists.wikimedia.org/postorius

[Cloud] Removal of revision_actor_temp table

2022-04-20 Thread Amir Sarabadani
hange it ASAP. Hopefully this should make your queries much simpler. Keep in mind that similar work will happen on the revision_comment_temp table in the future. You can follow the work in https://phabricator.wikimedia.org/T275246 Best

[Cloud] Re: Changes in schema of MediaWiki links tables

2022-05-31 Thread Amir Sarabadani
use the new schema as we will slowly stop writing to the tl_namespace and tl_title fields and drop them in the next month. Best On Tue, Mar 29, 2022 at 3:37 PM Amir Sarabadani < asarabad...@wikimedia.org> wrote: > (If you don’t work with links tables such as templatelinks, pagelinks an

[Cloud] Proposal on redesigning externallinks table: Request for feedback

2022-08-11 Thread Amir Sarabadani
and tech support to 740+ Wikimedia projects, starting with Wikimedia Commons and Wikidata." Best

[Cloud] Re: Changes in schema of MediaWiki links tables

2022-08-11 Thread Amir Sarabadani
red the impact of this normalization on the health of our databases. Here are the reports for s5 <https://phabricator.wikimedia.org/T299417#8143953> and s2 <https://phabricator.wikimedia.org/T314041#8146798>. Best On Tue, May 31, 2022 at 3:42 PM Amir Sarabadani < asarabad...

[Cloud] Re: WMCS team work transparency (was: New servers don't show up on grafana-labs.wikimedia.org)

2022-10-07 Thread Amir Sarabadani
Andrew, I think one general issue is the lack of transparency (and community involvement) in the decision making discussions. Not the documentation of the decisions once done. That's good to have but if we have transparency as a guiding principle, we should include that in the discussions too. My

[Cloud] Reminder about the externallinks migration

2023-06-07 Thread Amir Sarabadani
l_to_domain_index, count(*) from externallinks group by el_to_domain_index order by count(*) desc limit 50;` Thank you and sorry for the inconvenience.

[Cloud] Re: Reminder about the externallinks migration

2023-08-09 Thread Amir Sarabadani
June 2023 at 4:38 PM Amir Sarabadani < asarabad...@wikimedia.org> wrote: > Hello, > > We have communicated > <https://lists.wikimedia.org/hyperkitty/list/wikitec...@lists.wikimedia.org/message/JFMU43374T64BTJWI6WLZKLOJ4FL4PRP/> > this change in August 2022 but

[Cloud] Changes in schema of pagelinks tables

2023-10-18 Thread Amir Sarabadani
itec...@lists.wikimedia.org/message/U2U6TXIBABU3KDCVUOITIGI5OJ4COBSW/>. Thank you,

[Cloud] Call for maintainers: Wikimedia Chat and Wikimedia Meet

2023-11-14 Thread Amir Sarabadani
Hi, Wikimedia Chat is a Mattermost instance hosted at chat.wmcloud.org and Wikimedia Meet is a Jitsi instance hosted at meet.wmcloud.org (both are shut down now). I made these two services back in 2020 when we were all in lock-down and stuck at our homes and virtual social tools were scarce. I wanted it to

[Cloud] Re: Reminder about the externallinks migration

2023-12-11 Thread Amir Sarabadani
Hi Tilman, Sorry for the late reply. Regarding finding the actual link from the row: the recommended way is to do the processing in code afterwards. That's what MediaWiki does (in https://gerrit.wikimedia.org/g/mediawiki/core/+/80790ffc21a49fbe7709eaf5ce634b645798cf47/includes/ExternalLinks/LinkFilt
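A rough sketch of what "processing in code afterwards" could look like for an externallinks row. This is not MediaWiki's LinkFilter code. It assumes that `el_to_domain_index` holds the scheme followed by the host with its labels reversed and a trailing dot (for example `https://org.wikipedia.en.`) and that `el_to_path` holds the rest of the URL. The function name `rebuild_url` is hypothetical, and ports, IP addresses and `mailto:` links are deliberately not handled.

```python
def rebuild_url(domain_index: str, path: str = "") -> str:
    """Rebuild a URL from an (assumed) reversed-domain index and a path."""
    scheme, sep, reversed_host = domain_index.partition("://")
    if not sep:
        # Not in the assumed "scheme://reversed.host." layout; leave untouched.
        return domain_index + (path or "")

    # Drop empty labels (the trailing dot yields one) and restore label order.
    labels = [label for label in reversed_host.split(".") if label]
    host = ".".join(reversed(labels))
    return f"{scheme}://{host}{path or ''}"
```

For example, `rebuild_url("https://org.wikipedia.en.", "/wiki/Foo")` gives `https://en.wikipedia.org/wiki/Foo` under the assumptions above.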

[Cloud] Re: Changes in schema of pagelinks tables

2024-01-16 Thread Amir Sarabadani
2023 at 1:46 PM Amir Sarabadani < asarabad...@wikimedia.org> wrote: > (If you don’t work with the pagelinks table, feel free to ignore this message) > > Hello, > > Here is an update and reminder on the previous announcement > <https://lists.wikimedia.org/hyperkitt

[Cloud] Re: Changes in schema of pagelinks tables

2024-01-17 Thread Amir Sarabadani
Hi, Yes, that is correct, but given the size of these tables and the databases (for s4, see https://phabricator.wikimedia.org/T343131) we don't really have a choice in this specific case; Commons has grown to 1.8TB already. My apologies for the inconvenience. One thing that helps here is that s3 and

[Cloud] Re: Changes in schema of pagelinks tables

2024-01-17 Thread Amir Sarabadani
Hi! On Wed, Jan 17, 2024 at 7:37 PM Ben Kurtovic < wikipedia.ear...@gmail.com> wrote: > Hi Amir & others, > > I’m glad we are making changes to improve DB storage/query efficiency. I > wanted to express my agreement with Tacsipacsi that dropping the data > before the migration has completed

[Cloud] Re: Changes in schema of pagelinks tables

2024-01-22 Thread Amir Sarabadani
imedia Foundation >> >> On Wed, Jan 17, 2024 at 9:57 PM Ben Kurtovic >> wrote: >> >>> Thanks for the clear explanation, this gives more context for the >>> urgency. >>> >>> > On Jan 17, 2024, at 3:04 PM, Amir Sarabadani < >>&

[Cloud] Re: Changes in schema of pagelinks tables

2024-04-15 Thread Amir Sarabadani
ASAP. We will start dropping the old columns gradually in two weeks. Best On Tue, Jan 16, 2024 at 11:56 AM Amir Sarabadani < asarabad...@wikimedia.org> wrote: > Hello, > The data migration for several sections has been completed. We will start > dropping the old co

[Cloud] Re: Update your tool's code to prepare for temporary accounts

2024-05-21 Thread Amir Sarabadani
That's extremely unlikely. Roy Smith wrote on Tue, May 21, 2024, 7:52 PM: > Is it possible this is what's been causing > https://phabricator.wikimedia.org/T318479? > > > > On May 21, 2024, at 8:55 AM, Szymon Grabarczuk > wrote: > > Hello everyone, > > I am reaching out on behalf of the Wikimedi

[Cloud] Code of Conduct Committee -- call for new members

2024-06-11 Thread Amir Sarabadani
deadline for applications is *the end of day on June 25, 2024*. Please feel free to pass this invitation along to any users who you think may be qualified and interested. Best, Amir Sarabadani, on behalf of the Code of Conduct Committee

[Cloud] Re: [IMPORTANT] Action Required for Your Wikitech Account Migration

2024-07-22 Thread Amir Sarabadani
The domain is wrong. It's idm.wikimedia.org (not wikipedia.org) On Mon, Jul 22, 2024 at 5:02 PM Travis Briggs < audiod...@gmail.com> wrote: > Hello, > > Just got this email, but the referenced site doesn't exist ( > idm.wikipedia.org). Wondering if this was sent by mistake? > > Thanks, > -Tr

[Cloud] New Code of Conduct Committee members

2024-08-27 Thread Amir Sarabadani
Hello everyone, The Committee has finished selecting its new members. The new committee candidates are (in alphabetical order): - Amir Sarabadani - Egbe Eugene Agbor (Eugene233) - Greg Grossmeier - Jayprakash12345 - Kamila Součková Auxiliary members will be (alphabetically

[Cloud] Re: New Code of Conduct Committee members

2024-10-13 Thread Amir Sarabadani
Hi, Since there have been no objections raised to any of the candidates, this list is now final. Welcome Greg and Kamila, and thank you Nuria and Tony for your years of service. Best On Wed, Aug 28, 2024 at 1:19 AM Amir Sarabadani < ladsgr...@gmail.com> wrote: > Hello everyone

[Cloud] Re: What do y'all use Dumps for?

2024-10-08 Thread Amir Sarabadani
Oh I don't know where to even start: AI/ML done by me: - https://www.mediawiki.org/wiki/User:Ladsgroup/masz this uses dumps to help checkusers of 11 wikis do their work more efficiently. - Tool that finds "bad words" of each wiki automatically using the history of edits dumps which la

[Cloud] Several database changes happening in the coming months

2025-01-30 Thread Amir Sarabadani
e, we will start working on the imagelinks table. Thank you