
Single Version MediaWiki/Rationale

This page documents an RFC that was run in a Google Doc, trying to document Multiversion in one place and to evaluate whether we could do Single Version HTTP routing of requests in an effort to make deployments faster. It's posted in its entirety for transparency and historical purposes. Parts of it will make it into other docs on Wikitech, either for Multiversion or for Singleversion.

This is a work-in-progress RFC trying to document MediaWiki Multiversion and the ways it is used across the production realm of the Wikimedia infrastructure, and to figure out whether it is possible to move away from it (and why).

The intent behind this document is to best identify how we could move away from MultiVersion and into a solution that includes single-version MediaWiki containers. It’s part of the FY 2024-2025 APP, as the Q1 hypothesis for WE6.2.5: "If we move MultiVersion routing out of MediaWiki, we'll be able to ship single version MediaWiki containers, largely cutting down the size of containers, allowing for faster deployments, as measured by the deployment tool".

One important thing to note is that the author of this document DOES NOT intend to change end user expectations. WP:Thursday is part of the various communities' culture by now. Altering that, without sufficient prior research and surveys, is not prudent.

Current situation

As of July 2024, the situation is best described visually by the diagram from Timo Tijhof at

https://wikitech.wikimedia.org/wiki/MediaWiki_at_WMF#/media/File:MediaWiki_infrastructure_2022.png

The situation has arguably evolved since then due to the MediaWiki on Kubernetes project, which has recently reached 100% of global and internal traffic for almost all wikis. However, the diagram is still correct, as the core concepts have not changed.^1

What we want to focus on, for the purposes of this section, is the MediaWiki clusters and the way traffic reaches them.

We can discern 2 distinct ways requests can reach MediaWiki:

  • Traffic coming from the CDN
  • Traffic coming from the rest of the internal infrastructure

Traffic coming from the CDN

From MediaWiki's point of view, it never sees end-users from the world talking directly to it; rather, it sees what is called the CDN (Content Distribution Network, aka Edge) talking to it, proxying the requests from users. While there is quite a bit of business logic in the CDN layer and many components in it, only one of them interfaces with MediaWiki: ATS, aka Apache Traffic Server, aka the backend cache. The other layers never interface with MediaWiki directly and as such won’t be discussed further here. ATS, after some basic pattern matching on the URL path and host, routes requests to a specific MediaWiki cluster. The rough outline is in the linked diagram, but overall it falls into 2 big buckets:

  • Web (/w/index.php, /w/load.php)
  • API (/w/api.php, /w/rest.php)^2

And that’s about it. There is some extra logic for things like wikifunctions, mwdebug et al, but the 2 buckets above are > 99.9% of external use cases currently.

Operationally, those 2 large buckets have been split in the infrastructure for separation-of-concerns reasons. Simply put, since API users tend to be automated and thus have a higher tolerance for latency than humans, it makes sense to split the two and have them on dedicated infrastructure, making the user experience of humans better and more stable.

While the CDN targets the above-mentioned clusters by name, it is unaware of the actual endpoints and uses LVS to target the backends, which abstracts them, from ATS's point of view, behind a set of IPv4/IPv6 addresses. Unlike ATS, LVS, as a layer 4 load balancer, is completely unaware of the content of the request.

Traffic coming from the rest of the internal infrastructure

MediaWiki does see a lot of traffic generated internally in the infrastructure. Some examples are the JobQueue, RESTBase, MCS/PCS, Content Translation, CirrusSearch, research workloads and so on.

Furthermore, there are some cases where MediaWiki ends up needing to execute medium- or long-running tasks.

For the above use cases, the “Web” bucket is never a valid target. The API definitely is, and is actively used^3 by internal workloads. Furthermore, one extra bucket has been created, named “Jobrunner”, which runs very specific entrypoint code meant to serve MediaWiki asynchronous Jobs (/rpc/RunJobs.php). For efficiency reasons, this bucket has been in cohabitation with one more bucket, named “videoscaler”, which is used to re-encode videos uploaded to commons.wikimedia.org.

We have one more bucket, comprising a single instance, called “mwmaint”. That one runs arbitrary and scheduled maintenance tasks for all wikis. It has its own specific configuration.

Load Balancing towards a MediaWiki instance in the internal infrastructure case happens again via LVS in the exact same way that is used for the CDN case. Routing decisions regarding destination buckets happen at the originating application.

What does each single MediaWiki instance contain

Without going into details that are unrelated to the scope of this document, each MediaWiki instance contains:

  • MediaWiki core and extensions code for 3 different versions
  • i18n translations for above said different versions of MediaWiki
  • Apache, with VirtualHost configuration containing some rewrite rules and passing various configuration and information to PHP and MediaWiki
  • Envoy acting both as TLS demarcation and service mesh
  • PHP-FPM, the PHP process manager used to execute MediaWiki
  • Mcrouter for sharding keys to memcached
  • A number of metrics exporters for the above components

Multiversion

Why 3 different MediaWiki versions? That’s because of Multiversion, a WMF-specific script, living outside the core MediaWiki repos, that allows wikis to run different versions of MediaWiki at the same time while sharing the same instance. It has grown organically to solve the specific problem of deploying changes gradually. It currently supports 3 groups of wiki cohorts, named, unsurprisingly, groups. They are better described in Deployments/Train, but the general idea is that some very low traffic wikis see changes on Tuesdays, some higher traffic ones on Wednesdays, and finally the highest traffic ones, all Wikipedias, see changes on Thursdays. This was introduced in 2011.

This has been happening for long enough that at least the editor community of en.wikipedia.org has a page, originally called WP:Thursday, that describes the effects of the Train.

Problem statement

Multiversion was invented in an era (2011) when the overall IT industry didn’t offer any kind of similar functionality. Containerization technologies hadn’t even been invented, virtualization was still in its early years of being adopted in production workloads, the cloud wasn’t a big thing like today, and the solutions it offers today either didn’t yet exist or weren’t featureful enough. Multiversion’s invention solved the issue of gradually deploying changes to wikis in a very WMF-specific way, saving developers, deployers and end users in the process from experiencing bugs in high traffic and high pressure environments. For more than a decade it has served us well. It’s in 2024, with a new platform and new capabilities, that we are able to revisit it and see where we can improve on it.

One important thing to note is that the author of this document DOES NOT intend to change end user expectations. WP:Thursday is part of the various communities' culture by now. Altering that, without sufficient prior research and surveys, is not prudent. However, altering the implementation while keeping the behavior similar should be possible. This means that the experience of developers and deployers might have to change. The "how" will be key to the success of this project.

Multiversion in 2024 is starting to show signs of becoming a hindrance to better development practices. A standard multiversion image today, uncompressed, is just shy of 8GB.

  • Building it from scratch takes time.
  • Pushing it to our artifact registry takes time and might at times fail due to size constraints.
  • Deploying it takes time.
  • Debugging it can be complicated.
  • It’s hard coded to 3 versions.
  • Per-wiki routing happens internally in MediaWiki, making it impossible to utilize the rest of the infrastructure to perform easier testing.

While we’ve found various workarounds (e.g. Dragonfly, building the image unattended on Monday morning, mwdebug and related extension) to alleviate these issues, the general sentiment remains that, if the decision of which version to route to could be made outside of MediaWiki, we could shrink the image size by a factor of 3, leading to faster deployments, less network traffic on our links, the ability to have more than 3 versions, the ability to route specific domains, paths, etc. to specific versions making testing easier, and so on.

How is Multiversion implemented?^4

Multiversion isn’t shipped with MediaWiki core (or any extension for that matter). It’s a WMF-specific script (maintained in the mediawiki-config repo). What it does is find out the appropriate MediaWiki installation for a specific hostname during an execution context. Usually that’s an HTTP request, and the method is to inspect the HTTP Host header, but there are other use cases. The listing below is an effort to provide an exhaustive list of direct users of Multiversion. In some cases, prominent indirect users are also documented.

Users

In mediawiki-config/multiversion

  • buildLogoHTML.php - used by composer via composer.json. It includes MWWikiversions.php and delegates to it by usage of MWWikiversions::readWikiVersionsFile
  • getMWVersion - simple PHP script to just return the version in the CLI. Relies on wikiversions.php and wikiversions-labs.php
  • bin/expanddblist - Relies on MWWikiversions.php, just expands the list
  • bin/generate-php-wikiversions.php - Relies on MWWikiversions.php
  • bin/list-versions.php - Lists a unique set of versions in use. Which means on Fridays it returns just 1 result, confusingly prefixed with the very first wiki, aawiki. Relies on MWWikiversions.php
  • bin/manage-dblist - Tool to work with dblists, includes MWWikiversions.php and MWMultiVersion.php. Called from composer

In mediawiki-config/

  • rpc/RunSingleJob.php - The standard entrypoint for Jobrunners. It differs from the others in that it doesn't rely on the HTTP Host header, but on the database stanza of the event pushed to the jobrunners.
  • docroot/noc/db.php - https://noc.wikimedia.org/db.php. Includes MWConfigCacheGenerator.php
  • docroot/noc/wiki.php - https://noc.wikimedia.org/wiki.php. Same as above

In mediawiki-config/w

Those are the actual HTTP endpoints we expose to the public, so they account for roughly 99.9% of the use cases. A sketch of the common wrapper pattern they all follow is shown after the list.

  • api.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • fatal-error.php: Used to intentionally cause fatal errors in production.
  • favicon.php: Used to stream favicon.ico requests
  • img_auth.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • index.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • load.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • opensearch_desc.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • rest.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • robots.php: This one concatenates the main robots.txt file with the contents of MediaWiki:Robots.txt of the wiki in question.
  • static.php: Serves static files in a Multiversion-friendly way. Wraps includes/WebStart.php with MWMultiVersion::getMediaWiki() and then serves .js files etc.
  • thumb_handler.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • thumb.php: Wraps equivalent MediaWiki file using MWMultiVersion::getMediaWiki()
  • touch.php: Implementation to cater to Apple's touch icon functionality. Similar to static.php above
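
All of the entry points above follow the same thin-wrapper pattern. The sketch below is a rough, hedged illustration of that pattern, not the literal contents of any of these files, which differ in details such as error handling and extra parameters.

w/index.php (illustrative sketch)
<?php
// Hedged sketch of the w/*.php wrapper pattern described above, not the literal
// file contents. The wrapper asks Multiversion which installation serves the
// requested wiki and hands execution over to the correspondingly versioned
// MediaWiki entry point.
require_once __DIR__ . '/../multiversion/MWMultiVersion.php';

// getMediaWiki() inspects the HTTP Host header, maps the wiki to a version and
// returns the path to the versioned file, e.g. /srv/mediawiki/php-<version>/index.php
require MWMultiVersion::getMediaWiki( 'index.php' );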

In operations/puppet and the mediawiki image

  • foreachwiki - Bash script. Exists on deployment hosts, maintenance hosts, wikitech host, canaries via the scap::scripts puppet class. It just feeds foreachwikiindblist all.dblist and delegates execution to it (via exec)
  • foreachwikiindblist - Bash script. Same hosts as above. Relies on expanddblist to loop over the wikis in a dblist, running MWScript.php on each one. Neither this one nor the one above needs to be altered; they are indirect users.

In other places

  • MediaWikiPingback - See beacon/event/index.php. It uses MWMultiVersion directly. Per comments in that file it could be deprecated and removed.
  • MWScript.php is used in a ton of places. Deprecation and removal shouldn't be impossible, but it will take quite some time. For now it's best to not mess with callers, but just alter the underlying way the script works.
  • scap/train - Scap, implementing what is known as the Deployment train, prepares the ground for Multiversion to work. It does so by creating a fresh branch from the latest master of MediaWiki (including any extensions we deploy), then cloning it to the deployment server in a directory named like /srv/mediawiki/php-<version>. E.g. for a wiki on version 1.43.0-wmf.16, index.php will be mapped to /srv/mediawiki/php-1.43.0-wmf.16/index.php. Then, using the group0, group1 and group2 definitions that exist in mediawiki-config, it populates a file named wikiversions.json (and a .php equivalent); an illustrative sketch of its contents follows below. It is important to point out that this “API” between scap and MediaWiki is the only way they interact, at least as far as Multiversion is concerned.
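
For illustration, the wiki-to-version mapping that scap writes out is conceptually just a dictionary keyed by wiki database name. The snippet below is a hedged sketch of what the generated .php equivalent might look like; the exact generated format, wiki names and versions are illustrative only.

wikiversions.php (illustrative sketch)
<?php
// Hedged sketch of the scap-generated wiki -> version mapping (the .php
// equivalent of wikiversions.json). Wiki names and versions are examples only.
return [
    'testwiki'     => 'php-1.44.0-wmf.3', // group0
    'wikidatawiki' => 'php-1.44.0-wmf.3', // group1
    'enwiki'       => 'php-1.44.0-wmf.2', // group2
];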

Implementation

Multiversion's core implementation logic is stored under mediawiki-config/multiversion. Below is a rundown of "main" vs "helpers".

Helpers

  • MWConfigCacheGenerator.php: It uses Multiversion to collect dblist tags associated with a wiki. Used by CommonSettings.php
  • missing.php: Handles a “missing wiki” response. Only caller: MWMultiVersion.php
  • update-all-localsettings.php: No detected users. Relies on MWWikiversions.php
  • defines.php: Not much to see here, basic definitions
  • MWMultiVersionException.php - Just an exception definition
  • MWRealm.php - Figures out whether we are in labs or production, and which datacenter. Not directly related to Multiversion; it found a home here and is included by MWMultiVersion.php. It’s important for figuring out the active datacenter.

Main

Scap has already been covered above, but it’s important to reiterate that it’s the creator of the wikiversions.json file (and the corresponding .php one) that will be discussed later.

  • MWMultiVersion.php: This file contains various functions, but the 2 main entry points are getMediaWiki($file, $wiki=null) and getMediaWikiCli($file, $useOld=false). Internally, a function called getVersion() from the same class includes wikiversions.php and figures out the wiki-to-version mapping. Both entry points then return the properly versioned file referenced by $file (see the sketch after this list). Unfortunately for wannabe refactor-ers, business logic has crept into this file, in functions like setSiteInfoForWiki() or initializeFromServerData(), where project-specific if clauses have been added.
  • MWWikiversions.php: This file contains functions that read/write wikiversions.json and wikiversions.php and perform dblist expression evaluations (dblists support a .expr file that uses a small DSL to allow basic set operations like diff, intersect, and union). It isn’t used in the hot path of requests though.
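
To make the decision point concrete, the lookup performed by getVersion() and getMediaWiki() can be sketched as below. This is a simplified, hedged illustration; the real code additionally handles CLI invocations, wikis missing from the mapping (via missing.php), the business logic mentioned above, and more.

Multiversion lookup (illustrative sketch)
<?php
// Simplified, hedged sketch of the wiki -> version -> file resolution that
// MWMultiVersion performs per request. Paths, names and versions are illustrative.
$wikiversions = require '/srv/mediawiki/wikiversions.php'; // dbname => 'php-<version>'

$dbname = 'enwiki'; // in reality derived from the HTTP Host header (or CLI arguments)

if ( !isset( $wikiversions[$dbname] ) ) {
    // The real code delegates to missing.php for a "missing wiki" response
    exit( 1 );
}

$version = $wikiversions[$dbname];             // e.g. 'php-1.44.0-wmf.2'
$file = "/srv/mediawiki/{$version}/index.php"; // the versioned entry point to require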

Conclusions

Given the above, the following conclusions can be drawn regarding the implementation of Multiversion

  • While the original implementation has been contained enough, 13 years of usage have allowed some business logic to creep into the code, probably for lack of a better alternative. This alone makes ripping Multiversion out of mediawiki-config altogether improbable in the near and medium term.
  • Multiversion added the /w prefix that many entrypoints still use to this day. While there exists a mechanism to switch requests for pages (i.e. to /w/index.php) to a canonical URL, that isn’t true for e.g. the API endpoints, which are ossified, due to client usage, to /w/api.php.
  • The actual implementation of Multiversion decision making is rather well hidden behind proper abstraction layers, limiting the decision to a single function. This means that even if we keep Multiversion code around for the foreseeable future, changing how routing happens, and thus what code versions need to be available in the file system, is possible.
  • There exists an intermediate source of truth for the wiki-to-version mapping, namely wikiversions.json (and .php). The original source of truth, namely the groups, remains within scap. The intermediate source of truth is generated by scap and consumed by Multiversion. We can switch that intermediate source of truth to other formats and mediums, with minimal modifications to scap and to MWMultiVersion.php and MWWikiversions.php. By formats, we mean JSON, YAML, PHP, etc., and by mediums, we mean files, HTTP endpoints, datastores, etc.
  • We have full control over the above and thus can make any decision that would best serve the WMF and the Movement at large.

Multiversion routing options

The following sections explore various options that can be used to facilitate SingleVersion MW OCI containers. The following aspects of them will be discussed with pros and cons: Scalability, Reliability, Latency, Centralization.

Services Proxy (aka Service Mesh)

This infrastructure has already existed for a while now. It’s documented at Services_Proxy and utilizes Envoy, a versatile software component implementing, among other things, an HTTP reverse proxy, to set up a configurable mesh between services. Every service instance, including MediaWiki, is directly collocated with an instance of it. It is responsible for TLS demarcation (initiation and termination), circuit breaking, retries and other functionality.

No URL routing decision-making capability has been added to it yet, but it is possible, and the Service Operations team is looking forward to experimenting with it while working on a control plane. In its current form, the Services Proxy has no concept of MediaWiki version. As such, by the time traffic reaches MediaWiki, it’s already too late to make a routing decision. For Multiversion routing to work, it would have to happen at the Envoy instance of the initiating service, and even then it would only function partially. The main problem in this case is that only requests that use the Services Proxy would be able to be routed per version. The external traffic, that is the traffic coming from the CDN, wouldn’t be able to be routed unless the CDN became part of the Services Proxy, which would go against the design goals of the Services Proxy: it was never designed to be addressed from the CDN. This also applies to maintenance scripts and one-off scripts.

Pros

  • Already existing and proven solution, with robust infrastructure and metrics. Scalability and Reliability concerns are well addressed. Latency concerns don’t exist as the Services Proxy is already in the path of internal requests.
  • Service Operations is already working on a control plane.

Cons

  • Can’t currently serve traffic from the CDN/Edge as it was never designed to be exposed to the edge. This means that unless it’s re-architected with that in mind, there needs to be duplication at the CDN of the functionality of routing, meaning double work and possibly 2 different sources of truth.
  • The possibility for N (out of M) instances of the Services Proxy failing to update their Multiversion related configuration should not be ruled out.
  • It doesn’t solve the maintenance and one off script issues

WikiKube Ingress Gateway

Kubernetes Ingress Gateway is an already existing component in our infrastructure. It was first introduced as a way to mitigate the risk of performing LVS/Pybal changes when new services were added to the infrastructure, reserving such changes just for very high traffic services (e.g. MediaWiki) as a precaution while the component proved itself battle-worthy. There are many low traffic services using it these days. The capacity of the component has also grown in the last fiscal year, allowing us to consider utilizing it for this task.

Shortly after being introduced, the component was imbued with extra functionality, like termination of TLS connections from clients using certificates auto-issued via cert-manager, and it fares quite well.

It is implemented using the Istio-specific Kubernetes resources, namely Gateway, VirtualService and DestinationRule, due to the fact that, when it was evaluated and implemented as a solution, the more generic upstream Kubernetes Gateway resource didn't exist yet. However, it should be noted that nowadays Istio does support the Gateway API and intends to make it the default API for traffic management in the future.

While the component has a lot of functionality, it is its set of Traffic Management capabilities that we are solely interested in for the purposes of this document. We'll focus here on Request Routing, which an entire section of the upstream documentation covers. This specific functionality allows us to route incoming requests based on a variety of predicates, like HTTP headers, URIs, schemes, ports, HTTP methods and so on. A full list exists at HTTPMatchRequest. At a high level, implementing it is as simple as something like the following simplified Kubernetes resource object.

VirtualService
apiVersion: networking.istio.io/v1
kind: VirtualService
metadata:
  name: mwmultiversion
spec:
  gateways:
  - "mw"
  hosts:
  # Internal cluster traffic, if any
  - "mw.<namespace>.svc.cluster.local"
  # Now our standard endpoints
  - "mw.svc.eqiad.wmnet"
  - "mw.svc.codfw.wmnet"
  - "mw.discovery.wmnet"
  # And now everything we receive traffic for 
  - '*.m.mediawiki.org'
  - '*.m.wikibooks.org'
  - '*.wikipedia.org'
  - '*.m.wikidata.org'
  - '*.m.wikimedia.org'
  - '*.m.wikimediafoundation.org'
  - '*.m.wikinews.org'
  - '*.m.wikipedia.org'
  - '*.m.wikiquote.org'
  - '*.m.wikisource.org'
  - '*.m.wikiversity.org'
  - '*.m.wikivoyage.org'
  - '*.m.wiktionary.org'
  - '*.mediawiki.org'
  - '*.planet.wikimedia.org'
  - '*.wikibooks.org'
  - '*.wikidata.org'
  - '*.wikimedia.org'
  - '*.wikimediafoundation.org'
  - '*.wikinews.org'
  - '*.wikiquote.org'
  - '*.wikisource.org'
  - '*.wikiversity.org'
  - '*.wikivoyage.org'
  - '*.wiktionary.org'
  - '*.wmfusercontent.org'
  - '*.zero.wikipedia.org'
  - 'mediawiki.org'
  - 'w.wiki'
  - 'wikibooks.org'
  - 'wikidata.org'
  - 'wikimedia.org'
  - 'wikimediafoundation.org'
  - 'wikinews.org'
  - 'wikiquote.org'
  - 'wikisource.org'
  - 'wikiversity.org'
  - 'wikivoyage.org'
  - 'wiktionary.org'
  - 'wmfusercontent.org'
  - 'wikipedia.org' 
  # TODO: We need to clarify whether we can use an IP address here. Docs say: IP addresses are allowed only for services defined via the Gateway.
  http:
  - match:
    - headers: # A better predicate is actually "authority" but keeping it simple for now
        host: 
          regex: (test|test2).wikipedia.org # example
    route:
    - destination:
        host: mw
        subset: group0
  - match:
    - headers:
        host:
          # we won't be using exact, but it's an example
          exact: wikidata.org
    route:
    - destination:
        host: mw
        subset: group1
  - route: # default to everything in group2
    - destination:
        host: mw
        subset: group2

A couple of explanations on the above:

gateways: We'll explain below what a Gateway is, but this stanza just binds this VirtualService to a specifically named gateway, in the same namespace in this example.

hosts is the stanza listing all possible endpoints that traffic might flow to and that this VirtualService should match.

http is a grouping stanza that includes, in sequential order, the various matches. The last entry is a single route entry that matches everything that isn't more selectively matched. It routes to group2.

match is where the logic happens. In the above, we match on the HTTP Host header (although technically authority would be better) as a regex, sending test.wikipedia.org and test2.wikipedia.org traffic to the group0 subset of a Kubernetes Service named mw. Similarly, another match matches wikidata.org and routes it to the same Kubernetes Service mw, but at the subset group1.

Note that nothing stops us from adding more groups, matches, etc. in the future. One interesting question that hasn't been answered yet is how unwieldy the match section above will become once we have all wikis in all groups defined and matched. The question of whether this will fit in the Kubernetes API has already been asked and answered though. The size of the resource is expected to be ~12KB (the size of all.dblist) and thus well below the 1MB limit of etcd (and thus of the Kubernetes API). However, should this become unwieldy in the future, we can have multiple VirtualService resources and they will be merged by the ingress gateway. There are, however, some gotchas, mentioned in split-virtual-services.

The subsets mentioned above are defined by the following DestinationRule

DestinationRule
apiVersion: networking.istio.io/v1 
kind: DestinationRule 
metadata: 
  name: mw 
spec: 
  host: mw.<namespace>.svc.cluster.local
  subsets:
  - name: "group0"
    labels:
      group: "group0"
  - name: "group1"
    labels:
      group: "group1"
  - name: "group2"
    labels:
      group: "group2"
  # We can also add whatever we want here. The above are kinda set in stone but we can add anything we want in the future, e.g. a MW version
  - name: "1.44-wmf.3"
    labels:
      group: "1.44-wmf.3"

The above is simple, but also functional and extendable. We can also add capabilities for traffic policy management like load balancing, connection pooling, outlier detection, TLS settings etc. We will likely only be using TLS settings to start with, since we already do for every other service that is behind the Ingress.

Note that the host entry, namely a Kubernetes Service, is the same for all subsets. They are differentiated by the labels and not much more.

The above will define the routing from wiki -> group (thus the equivalent of dblists), not the mapping to the exact version. The exact version will be defined in the corresponding helmfile.yaml values files. Thus, we are decoupling the explicit mapping between wikis and exact versions, which scap currently informs MediaWiki of, from both scap and MediaWiki, and moving it to this component.

There is one more object we will need in the Kubernetes API, namely the Gateway. An example is below

Gateway
- apiVersion: networking.istio.io/v1beta1
  kind: Gateway
  metadata:
    labels:
      app: "mediawiki"
    name: "mw"
  spec:
    selector:
      istio: ingressgateway # this is the same everywhere. But this is the place were we can separate MW from the rest of the services at the ingress level. 
    servers:
    - hosts:
      # TODO: Figure out whether this can be deduplicated
      # Internal cluster traffic, if any
      - "mw.<namespace>.svc.cluster.local"
      # Now our standard endpoints
      - "mw.svc.eqiad.wmnet"
      - "mw.svc.codfw.wmnet"
      - "mw.discovery.wmnet"
      # And now everything we receive traffic for
      - '*.m.mediawiki.org'
      - '*.m.wikibooks.org'
      - '*.wikipedia.org'
      - '*.m.wikidata.org'
      - '*.m.wikimedia.org'
      - '*.m.wikimediafoundation.org'
      - '*.m.wikinews.org'
      - '*.m.wikipedia.org'
      - '*.m.wikiquote.org'
      - '*.m.wikisource.org'
      - '*.m.wikiversity.org'
      - '*.m.wikivoyage.org'
      - '*.m.wiktionary.org'
      - '*.mediawiki.org'
      - '*.planet.wikimedia.org'
      - '*.wikibooks.org'
      - '*.wikidata.org'
      - '*.wikimedia.org'
      - '*.wikimediafoundation.org'
      - '*.wikinews.org'
      - '*.wikiquote.org'
      - '*.wikisource.org'
      - '*.wikiversity.org'
      - '*.wikivoyage.org'
      - '*.wiktionary.org'
      - '*.wmfusercontent.org'
      - '*.zero.wikipedia.org'
      - 'mediawiki.org'
      - 'w.wiki'
      - 'wikibooks.org'
      - 'wikidata.org'
      - 'wikimedia.org'
      - 'wikimediafoundation.org'
      - 'wikinews.org'
      - 'wikiquote.org'
      - 'wikisource.org'
      - 'wikiversity.org'
      - 'wikivoyage.org'
      - 'wiktionary.org'
      - 'wmfusercontent.org'
      - 'wikipedia.org'
      port:
        name: https
        number: 443
        protocol: HTTPS
      tls:
        credentialName: mw-tls-certificate
        mode: SIMPLE

One interesting consequence of the above is that eventually we'll be able to move a lot of the CDN's responsibility for routing to the proper MediaWiki Kubernetes deployment into this component. E.g., by having an HTTPMatchRequest like the one below in a VirtualService, we'll be able to decide at this layer whether a request is for mw-api-ext or mw-web.

HTTPMatchRequest
http:
  - match:
    - uri:
        prefix: "/w/api.php"
    route:
    - destination:
        host: "mw"
        subset: "group2-api-ext"
  - route: # default
    - destination:
        host: "group2-mw-web"

Thus, the CDN will become more agnostic to the implementation details of which traffic needs to go where. We will also be able to offer a single endpoint to every user internally, obviating the need for them to know which exact endpoint (mw-api-int-ro vs mw-web) to talk to, making mistakes and weird traffic patterns due to staff/operator errors a thing of the past. This exercise is left for after the approach has been proven overall.

Traffic patterns covered

The above would suffice to cover the following traffic patterns:

  • Traffic from CDN to WikiKube cluster
  • Traffic from WMF infrastructure (but not in WikiKube) to WikiKube
  • Traffic from WikiKube workloads to WikiKube workloads

And thus would serve the following purposes:

  • mw-web
  • mw-api-int
  • mw-api-ext
  • mw-jobrunners
  • mw-parsoid
  • mw-wikifunctions

For changeprop-jobqueue, which is the only workload submitting to the jobrunners, we’ll need to alter it slightly to inject an HTTP Host header in the payloads. This is already covered under T246371, which admittedly is a larger task.

Scap

How would scap update such an environment? Scap would be relegated to handling the 3 subsets of the mw Kubernetes Service in the exact same way it currently handles all the mw-* helm releases (e.g. mw-api-int-main, mw-api-ext-canary, etc.). As an example:

/etc/helmfile-defaults/mediawiki/release/mw-group1.yaml would have contents like these

scap YAML
docker:
  registry: docker-registry.discovery.wmnet
main_app:
  image: restricted/mediawiki:2024-10-22-220415-publish
mw:
  httpd:
    image_tag: restricted/mediawiki-webserver:2024-10-22-181107-webserver

Note that the only difference from the current situation is the removal of -multiversion from the mediawiki image name.

In helmfile.yaml we would have stanzas like the ones below

helmfile.yaml
releases:
  - name: group0
    <<: *default
  - name: group1
    <<: *default
  - name: group2
    <<: *default
  # Note the MW version release. We can have scap augment this in the future to have a specific version mentioned, probably in a complementary fashion to the group abstraction in the medium term.
  - name: "1.44-wmf.3"
    <<: *default
environments:
  eqiad:
    values:
      # Same comment as above for MW version release.
      - releases: [group0, group1, group2, "1.44-wmf.3"]

One interesting repercussion of the above design is that scap is relegated to being the OCI image builder and a thin wrapper over helmfile. This will allow us in the future to work on replacing helmfile with different deployment tooling that exists in the Kubernetes ecosystem. Some of these systems, e.g. Argo Rollouts, even integrate (optionally) with the Kubernetes Ingress API, allowing us to utilize the Kubernetes resources outlined above for progressive deployments.

It should be noted that this approach doesn't differ from the others in that it doesn't specifically handle maintenance scripts and one-off scripts. There is an entire section about that further down this document.

Pros

  • Already existing and proven solution, albeit not battle tested at MediaWiki traffic levels
  • Off the shelf component and architecture, not suffering from NIH syndrome
  • Centralizes the point of the URL routing decision, allowing both applications using the service mesh as well as the CDN to defer all knowledge of servergroups and wikis to it
  • Utilizes envoy under the hood, a component Service Operations is comfortable with
  • Kubernetes model ensures that eventual consistency will not take a long time to happen.

Cons

  • It doesn’t solve the maintenance and one off script issues
  • One more reverse proxy in the chain of all requests, adding latency

Apache Traffic Server (aka ATS)

This infrastructure has existed for a long time now. It is the last layer of the CDN/Edge and is what connects directly to the application layer from the CDN. Aside from HTTP caching, it also has some routing logic itself, e.g. it knows to route calls to MediaWiki APIs to the proper MediaWiki “servergroup”. The current routing logic relies solely on URL patterns (and Cookies) and static destinations. For the CDN layer to become MediaWiki version aware, we would need to:

  • Expose wikiversions.json (in whatever format) somehow to all instances of the CDN
  • Create multiple addressable deployments, one for each of the 3 versions
  • Create functionality in ATS (Scripting in Lua mostly) that routes wikis to the corresponding addressable environments.

None of the above is prohibitive, but it would expose internal details of the deployments to the CDN, coupling the 2 more closely and thus slowing down the ability to perform alterations to either, which already depends on more than 2 different teams. However, that is not the biggest drawback of the solution. That would be the fact that the internal traffic doesn’t go via the CDN, and architecturally it would be very suboptimal if it did. A few reasons for that are:

  • Artificially polluting the organically populated (that is, by human user traffic) cache of the reverse proxies
  • Obfuscating logs with IPs that do not belong to the internal endpoint that actually talks to the MW API (or other endpoints)
  • Having to go through another 3 intermediaries (3 reverse proxies)
  • Added latency
  • Failing to utilize the service mesh, largely neutralizing the benefits we obtain from it

This also applies to maintenance scripts and one-off scripts.

Pros

  • Already existing and proven solution, with robust infrastructure and metrics. Scalability and Reliability concerns are well addressed.
  • Some routing logic already exists and extra logic has been implemented in the past to facilitate projects like PHP version upgrades or the transition to mw-on-k8s.

Cons

  • Can’t currently serve internal traffic from applications and MediaWiki and it would be architecturally suboptimal to alter it to do so.
  • The possibility for N (out of M) instances of the CDN failing to update their Multiversion related configuration should not be ruled out.
  • ATS specific Lua scripting code will need to be written and maintained.
  • It doesn’t solve the maintenance and one off script issues

A new, written from scratch, component?

We considered crafting a new component that would do just MediaWiki URL routing. For this to work, it would need to be a form of reverse proxy that sits right in front of the MediaWiki deployments, having all the information needed to perform routing (effectively wikiversions.php|json). However, implementing a reverse proxy properly, with respect to all HTTP versions and the differences between them, is a very large task that would require a lot of time. Furthermore, the greenfield approach would also take time to mature, which would probably also mean a variety of security and confusion issues showing up.

Pros

  • We can pretty much achieve what we want by writing code.

Cons

  • A lot of effort and investment
  • A long time to mature
  • Increased probability for incidents of every nature.

Proposal

Given the above, we consider the WikiKube Ingress Gateway solution the best possible one to pursue further in the effort to route wiki traffic to group-specific containers.

Maintenance and one-off scripts

The astute reader is probably asking by now: "what about maintenance and one-off scripts? Shouldn't they be running on the same version as the wiki they target?". And the answer is that they should, but this was out of scope for the hypothesis. However, a proposal that would fit all of the above scenarios has been thought out and is explained below. mwscript-k8s already allows, via the --mediawiki_image parameter, specifying which exact image will be used. It currently defaults to figuring out which multiversion image powers the mw-web deployment and uses that. This works out fine. The typical current invocation looks like this:

mwscript-k8s
mwscript-k8s --comment='backfill for T123456' -- Filename.php --wiki=aawiki --script-specific-arg

The decision about which wiki to target is there, but it is delegated to the PHP script in the image. Moving away from this pattern will require the following:

  1. Amend mwscript-k8s to scan arguments for the database of the wiki passed as a parameter and thus choose the proper image
  2. Based on a file that scap populates, default to figuring out the proper value for --mediawiki_image

The above should happen by reading the usual dblist files (group0.dblist, group1.dblist, group2.dblist) and figuring out which /etc/helmfile-defaults/mediawiki/release/mw-<group>.yaml file to read to obtain the image. Since mwscript-k8s is meant to be run on the deployment hosts, it is expected that it will have access to this information, since that is the point where the information is generated, before being disseminated to other places (e.g. noc.wikimedia.org).
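
In the proposed setup, and until the automatic lookup described above is implemented, the same effect could be obtained by pinning the image explicitly. Below is a hedged illustration; the image tag is taken from the scap example above, and the exact flag syntax may differ.

mwscript-k8s (illustrative)
mwscript-k8s --comment='backfill for T123456' --mediawiki_image=restricted/mediawiki:2024-10-22-220415-publish -- Filename.php --wiki=aawiki --script-specific-arg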

Notes

^1 What has indeed changed is that every item called “webserver” in that diagram is no longer a bare-metal hardware server but rather an OCI (aka Docker) container running on our WikiKube platform. We have also now split the API cluster into external and internal API clusters for separation-of-concerns reasons.

^2 Purposefully not naming the clusters themselves, as names have recently changed due to MediaWiki-on-k8s and it is possible they'll change further. Historically the 2 buckets were named appservers and api_appservers.

^3 Since the move to mw-on-k8s, there’s been a further separation of concerns, splitting the API bucket into 2 different sub-buckets, targeted at external and internal usage respectively.

^4 This is the one section of this document that the author wants to be exhaustive and as technically correct as possible. The rest are purposely left high level; this one and the one below will contribute heavily to the decision.