doc.wikimedia.org
doc.wikimedia.org is the publication target for auto-generated software documentation.
It contains the "published" version of the inline documentation from many of our code repositories. For example, instead of reading raw PHP comments in the source, you can view them nicely formatted (compare the rendered doc with the code).
Service
Currently hosted on doc1003 and doc2002. Web traffic is routed via doc.discovery.wmnet. The same DNS discovery name is also used as the destination of the "publish-to-doc" Jenkins job.
Prior to 2018, the doc site was hosted on gallium.
Architecture
Landing pages
The navigational structure is generated server-side by a PHP script, which provides the navigation header, footer, and page layout. The main content of each page comes from an HTML file, a YAML file, or a PHP function that produces a directory index.
The root page lists projects for which documentation is available; it can be seen as a catalogue of some of our open source products. Items can be added by editing the file org/wikimedia/doc/opensource.yaml. See the repository README.md for additional details and for how to test your change locally.
Examples of URLs in scope:
- https://doc.wikimedia.org/
- https://doc.wikimedia.org/cover/ (domain-specific directory index).
- https://doc.wikimedia.org/unicodejs/ (generic directory index for software versions).
- https://doc.wikimedia.org/mediawiki-core/1.34.0/ (generic directory index for different outputs for a given software version).
- https://doc.wikimedia.org/mediawiki-libs-Timestamp/ (to produce a redirect to the only subdirectory).
Microsites
The landing pages provide discovery of the microsites for API documentation, demos, and coverage reports. Each microsite is standalone, statically generated, and individually deployed. They are published by post-merge Jenkins CI jobs into the /srv/doc/ directory, outside the landing pages' document root.
Examples of URLs in scope:
- https://doc.wikimedia.org/mediawiki-core/1.33.0/php/ (Doxygen output).
- https://doc.wikimedia.org/mediawiki-core/1.33.0/php/search.php?query=load.php (Doxygen output supports search via PHP script).
- https://doc.wikimedia.org/unicodejs/v13.0.3/ (JSDuck output).
- https://doc.wikimedia.org/oojs/v6.0.0/ (JSDoc3 output).
- https://doc.wikimedia.org/scap/ (Sphinx for Python output).
- https://doc.wikimedia.org/cover/mediawiki-libs-Minify/ (PHPUnit test coverage).
- https://doc.wikimedia.org/cover/unicodejs/ (Istanbul.JS test coverage).
Publishing
Each microsite is generated by a post-merge Jenkins CI job, which executes on a Jenkins CI agent. Each Jenkins CI agent is a WMCS instance in the integration project. These agents execute untrusted containers, packages, and user-submitted code, and are not allowed to communicate with production hosts (such as the doc.wikimedia.org host).
After a microsite is generated, the publish-to-doc job executes on the primary Jenkins host in production and pulls the microsite from the job workspace on the CI agent where the post-merge job ran (using rsync and ssh). The content is then relayed using rsync to doc.discovery.wmnet (rsync://doc.discovery.wmnet/doc/).
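The two-hop transfer described above can be sketched as a dry run that only prints the commands. This is an illustration under stated assumptions, not the actual publish-to-doc job: the parameters agent_host, workspace_dir, project_path, and staging_dir are hypothetical, as are the staging path and the choice of rsync flags. Only the rsync://doc.discovery.wmnet/doc/ destination comes from this page.

```shell
#!/bin/sh
# Sketch only: prints the two rsync hops instead of running them.
# agent_host, workspace_dir, project_path and staging_dir are hypothetical
# parameters; the real publish-to-doc job may use different paths and flags.
publish_plan() {
    agent_host=$1
    workspace_dir=$2
    project_path=$3
    staging_dir=${4:-/tmp/doc-staging}

    # Hop 1: the production Jenkins host pulls the generated microsite
    # from the CI agent's job workspace over ssh.
    printf 'rsync --archive --rsh=ssh %s:%s/ %s/\n' \
        "$agent_host" "$workspace_dir" "$staging_dir"

    # Hop 2: relay the staged copy to the rsync module on the doc host.
    printf 'rsync --archive %s/ rsync://doc.discovery.wmnet/doc/%s/\n' \
        "$staging_dir" "$project_path"
}
```

Calling `publish_plan agent.example /workspace/out unicodejs/master` prints one line per hop. The point of the design is that the CI agent never connects to production itself; the production side initiates both transfers.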
Caching
doc.wikimedia.org sits behind ATS/Varnish, which caches pages with a one-hour TTL. Nothing purges the cache when content is published, so you may see outdated content until the cached copy expires.
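To estimate how long a stale page may persist, one can read the response headers and subtract the object's age from the TTL. The helper below is a minimal sketch: the function name remaining_ttl is hypothetical, and it assumes the cache layer reports a standard Age response header, which this page does not confirm.

```shell
#!/bin/sh
# One-hour TTL, as described above.
CACHE_TTL=3600

# Read HTTP response headers on stdin and print the number of seconds left
# before the cached copy expires. Prints 0 if the age already exceeds the
# TTL, and "unknown" if no Age header is present (assumption: the cache
# exposes object age via the standard Age header).
remaining_ttl() {
    tr -d '\r' | awk -v ttl="$CACHE_TTL" '
        tolower($1) == "age:" {
            left = ttl - $2
            if (left < 0) left = 0
            print left
            found = 1
            exit
        }
        END { if (!found) print "unknown" }
    '
}
```

A possible usage is `curl -sI https://doc.wikimedia.org/ | remaining_ttl`, run shortly after a publish job finishes.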
Runbook
Switch primary host
Pre-switch checklist:
- The new host must be provisioned with the doc role.
- For the landing site:
  - The new host must be a scap/dsh target for ci-docroot, and initialised after the last deployment.
  - Otherwise, do at least one dummy "scap deploy" after adding it to the targets to ensure it has the latest version.
- For the microsites content:
  - The new host must be included in Puppet profile::doc::all_hosts, which ensures the current primary rsyncs its content to the new host first.
Actual switch:
- Prepare a DNS change to set doc.discovery.wmnet to the new host. This controls where Jenkins syncs new content to, and which backend external HTTP traffic is routed to.
- Prepare a Puppet change to set profile::doc::active_host to the new host. This toggles the hourly cronjob that rsyncs microsites content from the primary to the secondary hosts. It is important to run the Puppet agent on all hosts after merging this, so as to avoid a 30-minute window during which Jenkins may be syncing to a read-only server.
- Deploy the DNS change, quickly followed by the Puppet merge and run.
If a post-merge Jenkins job happens to be running around this time, or if there are reports of lost content, a CI admin can re-run the last post-merge job for the affected repository (in Jenkins, or via Zuul) to generate and sync its documentation a second time after the switch.
Deploy landing page
The "landing page" part of this service comes from the integration/docroot Git repository, and is deployed from a deployment server using Scap:
    ssh deployment.eqiad.wmnet
    cd /srv/deployment/integration/docroot
    git fetch origin
    # review the diff
    git rebase
    scap deploy 'My Change-Id and/or other explanation'