Hello,

Over the coming months, the following changes will happen to the databases and the infrastructure. They might affect you if you rely on them in your tools or queries. The list is ordered by how soon each change will happen.

We understand that updating your tools and systems can be time-consuming, hence this advance notice. I truly apologize for the inconvenience, but many of these changes are needed to keep the site running smoothly.

Image table redesign

Around fourteen years after the creation of T28741 <https://phabricator.wikimedia.org/T28741>, we are implementing the changes described therein. Currently, the current version of every image has a row in the image table, and if there are older versions of that file, their rows are in the oldimage table. These two tables (image and oldimage) will be dropped in around two months.

The replacement will be two main tables: file and filerevision. Every file will have a row in the file table describing its name and type. Every version of the file (current and old) will have a row in filerevision describing the information specific to that version, such as its size or the hash of the file, similar to the existing distinction between pages and revisions. Another improvement is that every file and file revision will get a unique auto-increment ID, simplifying many operations and queries. You can check T28741 <https://phabricator.wikimedia.org/T28741> for more information. The new tables are already accessible in wikireplicas, but the data hasn’t been fully migrated yet.

Term store split out of Wikidata’s database

Wikidata’s database has been growing too fast, and we need to move the term store (tables starting with wbt_) to a dedicated cluster to allow growth and improve Wikidata’s performance by utilizing cache locality. The new section will be called x3 and you will be able to access it in wikireplicas. However, this also means you won’t be able to join these tables with the rest of Wikidata’s database (such as the page table), since they will reside on two physically separate servers. On the other hand, most of your queries to Wikidata’s database (and the term store) will become faster.
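
For tools that currently join term store tables with the rest of Wikidata's database, one possible workaround is to run two separate queries and combine the results in application code. The following is only a minimal sketch of that pattern, not the real schema: it uses two in-memory SQLite databases as stand-ins for the two servers, and the demo_terms table, its columns, and the function names are hypothetical and invented for illustration. It also assumes item page titles of the form "Q<number>".

```python
import sqlite3


def make_demo_connections():
    """Create two unrelated in-memory databases standing in for the two servers."""
    # Stand-in for the main Wikidata section (simplified page table).
    core = sqlite3.connect(":memory:")
    core.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_len INTEGER)"
    )
    core.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(1, "Q42", 1200), (2, "Q64", 800), (3, "Q99", 50)],
    )

    # Stand-in for the term store section. demo_terms is a hypothetical,
    # flattened table; the real wbt_ tables are structured differently.
    terms = sqlite3.connect(":memory:")
    terms.execute("CREATE TABLE demo_terms (item_id INTEGER, language TEXT, label TEXT)")
    terms.executemany(
        "INSERT INTO demo_terms VALUES (?, ?, ?)",
        [(42, "en", "Douglas Adams"), (64, "en", "Berlin"), (64, "de", "Berlin")],
    )
    return core, terms


def pages_with_labels(core, terms, min_len, language):
    """Emulate a LEFT JOIN across two servers with two queries and a merge."""
    # Step 1: query the first server on its own.
    rows = core.execute(
        "SELECT page_title, page_len FROM page WHERE page_len >= ? ORDER BY page_id",
        (min_len,),
    ).fetchall()
    item_ids = [int(title[1:]) for title, _ in rows]  # "Q42" -> 42
    if not item_ids:
        return []

    # Step 2: look up the collected ids on the second server.
    placeholders = ",".join("?" for _ in item_ids)
    label_rows = terms.execute(
        "SELECT item_id, label FROM demo_terms "
        f"WHERE language = ? AND item_id IN ({placeholders})",
        [language, *item_ids],
    ).fetchall()
    labels = dict(label_rows)

    # Step 3: merge in application code; a missing label becomes None.
    return [(title, length, labels.get(int(title[1:]))) for title, length in rows]


if __name__ == "__main__":
    core_conn, terms_conn = make_demo_connections()
    for row in pages_with_labels(core_conn, terms_conn, 0, "en"):
        print(row)
```

With large result sets, the id list in the second query would likely need to be sent in batches rather than in a single IN (...) clause.
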
We are aiming for the switch to happen in three months’ time. You can follow the work in T351820 <https://phabricator.wikimedia.org/T351820>.

Additionally, the wbt_type table will be dropped and the mapping will be hard-coded instead. See gerrit:1110810 <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/1110810> for more details. This helped us simplify a lot of Wikibase code (example <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/1110720>).

Categorylinks normalization

Categorylinks is the next table in the series of links tables being normalized via the linktarget table (parent ticket <https://phabricator.wikimedia.org/T300222>, RFC <https://phabricator.wikimedia.org/T222224>). As with the templatelinks and pagelinks tables, cl_to will be dropped, and the new field cl_target_id will instead point to lt_id in the linktarget table. We will also drop the cl_collation field and replace it with cl_collation_id, which will point to the collation_id field in a new table we are introducing called collation.

We are aiming to get this fully done by the end of the next quarter (end of June 2025), but that depends on how fast the migration script can run, which is outside of our control. You can follow the work in T299951 <https://phabricator.wikimedia.org/T299951>. It’s worth noting that once this migration is done, we will start working on the imagelinks table.

Thank you
--
*Amir Sarabadani (he/him)*
Staff Database Architect
Wikimedia Foundation <https://wikimediafoundation.org/>
_______________________________________________ Cloud mailing list -- cloud@lists.wikimedia.org List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/