Hello,
An update regarding the term store split: we are slowly moving towards
finalizing it.

Now you can connect to the term store databases via these endpoints:
 - termstore.wikidatawiki.analytics.db.svc.wikimedia.cloud
 - termstore.wikidatawiki.web.db.svc.wikimedia.cloud

They currently point to Wikidata's database, but after the split they
will only hold the term store tables (the wbt_* ones). Please migrate your
queries to use the term store endpoints; otherwise, in a couple of weeks,
the term store data on the old endpoints will start to drift and your
queries will eventually error out.
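
To illustrate the migration, here is a minimal sketch of a helper that picks
the endpoint by table name. It is not an official tool. The function name
replica_host is hypothetical, and the non-term-store host name pattern is an
assumption based on the usual wikireplicas naming; only the two termstore.*
host names come from this email.

```python
# Sketch: route term store tables (wbt_*) to the new endpoints.
TERM_STORE_HOST = "termstore.wikidatawiki.{cluster}.db.svc.wikimedia.cloud"
# Assumption: the usual wikireplicas host for the rest of Wikidata.
WIKIDATA_HOST = "wikidatawiki.{cluster}.db.svc.wikimedia.cloud"


def replica_host(table: str, cluster: str = "analytics") -> str:
    """Return the host to connect to for a query against `table`."""
    if cluster not in ("analytics", "web"):
        raise ValueError(f"unknown cluster: {cluster!r}")
    # Term store tables are the ones whose names start with wbt_.
    template = TERM_STORE_HOST if table.startswith("wbt_") else WIKIDATA_HOST
    return template.format(cluster=cluster)


if __name__ == "__main__":
    print(replica_host("wbt_item_terms"))
    print(replica_host("page", cluster="web"))
```

The returned host can then be handed to whichever MySQL client your tool
already uses.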

To emphasize: it will no longer be possible to join term store tables with
other Wikidata tables, as they will reside on physically separate servers.
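
A query that used to join across the two sets of tables can usually be split
into two steps, with the join done in your own code. The sketch below shows
the idea using two in-memory SQLite databases as stand-ins for the two
servers, so it runs anywhere. The toy schema (only a few columns), the sample
rows, the function names, and the mapping from wbit_item_id to a "Q<id>" page
title are all assumptions made for illustration.

```python
import sqlite3


def make_toy_databases():
    """Build two separate in-memory databases standing in for the two servers."""
    termstore = sqlite3.connect(":memory:")
    termstore.execute(
        "CREATE TABLE wbt_item_terms ("
        "wbit_id INTEGER, wbit_item_id INTEGER, wbit_term_in_lang_id INTEGER)"
    )
    termstore.executemany(
        "INSERT INTO wbt_item_terms VALUES (?, ?, ?)",
        [(1, 42, 7), (2, 64, 7), (3, 5, 8)],
    )
    wikidata = sqlite3.connect(":memory:")
    wikidata.execute(
        "CREATE TABLE page (page_id INTEGER, page_namespace INTEGER, page_title TEXT)"
    )
    wikidata.executemany(
        "INSERT INTO page VALUES (?, ?, ?)",
        [(100, 0, "Q42"), (101, 0, "Q64"), (102, 0, "Q5")],
    )
    return termstore, wikidata


def fetch_pages_for_term(termstore, wikidata, term_in_lang_id):
    """Two-step replacement for a cross-server JOIN."""
    # Step 1: ask the term store which items use the given term.
    rows = termstore.execute(
        "SELECT wbit_item_id FROM wbt_item_terms WHERE wbit_term_in_lang_id = ?",
        (term_in_lang_id,),
    ).fetchall()
    # Assumption: item N lives on the page titled "QN" in namespace 0.
    titles = [f"Q{item_id}" for (item_id,) in rows]
    if not titles:
        return []
    # Step 2: look those pages up on the other server with an IN list.
    # (SQLite uses "?" placeholders; MySQL clients typically use "%s".)
    placeholders = ", ".join("?" for _ in titles)
    return wikidata.execute(
        "SELECT page_id, page_title FROM page "
        f"WHERE page_namespace = 0 AND page_title IN ({placeholders}) "
        "ORDER BY page_id",
        titles,
    ).fetchall()


if __name__ == "__main__":
    ts, wd = make_toy_databases()
    print(fetch_pages_for_term(ts, wd, 7))
```

For large result sets, the IN list would need to be sent in batches rather
than in one query.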

You can follow the work in https://phabricator.wikimedia.org/T351820

Thank you, and sorry for the inconvenience.

On Thu, 30 Jan 2025 at 14:47, Amir Sarabadani <
asarabad...@wikimedia.org> wrote:

> Hello, over the coming months, the following changes will happen in the
> databases and the infrastructure. They might affect you if you rely on
> them in your tools or queries. This list is ordered by how soon each
> change will happen.
>
> We understand that updating your tools and systems can be time-consuming,
> hence we are giving advance notice. I truly apologize for the
> inconvenience, but many of these changes are needed to keep the site running
> smoothly.
>
> Image table redesign
>
> Around fourteen years after the creation of T28741
> <https://phabricator.wikimedia.org/T28741>, we are implementing the
> changes described therein. Currently, the latest version of each image has
> a row in the image table, and any older versions of that file have rows in
> the oldimage table. These two tables (image and oldimage) will be dropped
> in around two months. The replacement will
> be two main tables: file and filerevision. Every file will have a row in
> the file table describing the name and the type. Every version of the file
> (current and old) will have a row in filerevision describing the
> file-specific information such as its size or the hash of the file, similar
> to the existing distinction between pages and revisions. Another
> improvement is that every file and file revision will get a unique
> auto-increment ID, simplifying many operations and queries. You can check T28741
> <https://phabricator.wikimedia.org/T28741> for more information. The new
> tables are already accessible in wikireplicas but the data hasn’t been
> fully migrated yet.
>
> Term store split out of Wikidata’s database
>
> Wikidata’s database has been growing too fast and we need to move the term
> store (tables starting with wbt_) to a dedicated cluster to allow growth
> and improve Wikidata’s performance by utilizing cache locality. The new
> section will be called x3, and you will be able to access it in wikireplicas.
> However, this also means you won’t be able to join these tables with the
> rest of Wikidata’s database (such as the page table), since they will reside
> on two physically separate servers. It also means most of your queries to
> Wikidata’s database (and the term store) will become faster. We are aiming
> for the switch to happen in three months’ time. You can follow the work
> in T351820 <https://phabricator.wikimedia.org/T351820>.
>
> Additionally, the wbt_type table will be dropped and the mapping will be
> hard-coded instead. See gerrit:1110810
> <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/1110810>
> for more details. This helped us simplify a lot of Wikibase code (example
> <https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/1110720>
> ).
>
> Categorylinks normalization
>
> Categorylinks is the next table in the series of links tables being
> normalized via the linktarget table (parent ticket
> <https://phabricator.wikimedia.org/T300222>, RFC
> <https://phabricator.wikimedia.org/T222224>). Similar to templatelinks
> and pagelinks tables, cl_to will be dropped and instead the new field
> cl_target_id will point to lt_id in the linktarget table. We will also drop
> the cl_collation field and replace it with cl_collation_id which will point
> to the collation_id field on the new table we are introducing called
> collation. We are aiming to get this fully done by the end of the next
> quarter (end of June 2025) but it depends on how fast the migration
> script can operate and that’s outside of our control. You can follow the
> work in T299951 <https://phabricator.wikimedia.org/T299951>. It’s worth
> noting that after this migration is done, we will start working on the
> imagelinks table.
>
> Thank you
> --
> *Amir Sarabadani (he/him)*
> Staff Database Architect
> Wikimedia Foundation <https://wikimediafoundation.org/>
>
_______________________________________________
Cloud mailing list -- cloud@lists.wikimedia.org
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
