Services/Deployment
This page describes the legacy system for deploying services to the Wikimedia infrastructure. Currently, services are deployed to Kubernetes via the deployment pipeline.
Regular Deployment
Our production stack has many moving parts: MediaWiki, its extensions, various back-end services, HTTPS handlers, and caches, to name a few. It is therefore important that you announce your deployment schedule on the Deployments page.
Preparing the Deploy Repository
The deployment process starts with updating the deploy repository. Go into your source repository and update it with:
$ ./server.js build --deploy-repo --force --review
The build script updates the pointer of the deploy repository's submodule, creates a Docker container in which it installs the module dependencies, and sends the resulting changes to Gerrit. Review them and merge.
BetaCluster Deployment
Before deploying to production, remember to update and test the deployment in BetaCluster. Log in to deployment-deploy01.deployment-prep.eqiad.wmflabs and update the repo there:
$ cd /srv/deployment/<service-name>/deploy
$ git pull && git submodule update --init
Time to deploy to BetaCluster:
$ scap deploy '<a-message-here-to-describe-the-changes-being-deployed>'
After the deploy, check the output of your service in BetaCluster.
Deploying to Production
Next, log in to deployment.eqiad.wmnet and update the repo there:
$ cd /srv/deployment/<service-name>/deploy
$ git pull && git submodule update --init
In the #wikimedia-operations IRC channel, announce the deployment by logging it to the Server Admin Log with !log <service-name> deploying <deploy-repo-sha1>. Then proceed with the deployment from deployment.eqiad.wmnet:
$ scap deploy '<a-message-here-to-describe-the-changes-being-deployed>'
Scap3 will deploy the code, restart the service, and check its port and health. If it detects a problem on the canary node, it will suggest rolling back. Otherwise it will go on to deploy to the rest of the nodes, which completes the deployment process.
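The preparation steps above can be sketched as a pair of shell functions. This is a minimal illustration, not part of Scap3: the names sal_message and prepare_deploy are hypothetical, and the sketch assumes the full SHA1 from git rev-parse HEAD is what should be announced.

```shell
# Sketch only: both function names are hypothetical helpers, not scap commands.

# Print the Server Admin Log line to paste into #wikimedia-operations.
# $1 = service name, $2 = SHA1 of the deploy repository commit being deployed
sal_message() {
    printf '!log %s deploying %s\n' "$1" "$2"
}

# Update the deploy repository and its submodule, then print the SAL line.
# The body runs in a subshell, so the caller's working directory is unchanged.
# $1 = service name
prepare_deploy() (
    cd "/srv/deployment/$1/deploy" || exit 1
    git pull && git submodule update --init || exit 1
    sal_message "$1" "$(git rev-parse HEAD)"
)
```

Running prepare_deploy with the service name on the deployment host would print the line to announce; scap deploy is still run by hand afterwards, as described above.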
Dealing with Problems
Deployment Debugging
Scap3 includes a utility which can be used to monitor the output of the commands executed on the target nodes. Fire up a second terminal, connect to deployment.eqiad.wmnet and execute the scap deploy-log command from /srv/deployment/<service-name>/deploy before starting the deployment. The output should help you figure out what went wrong.
If you did not start an instance of scap deploy-log before the deploy and it went badly, you can still recover the logs by running scap deploy-log --latest.
Reverting a Deployment
Sometimes the deployment process goes well, but the code that was deployed isn't functioning properly. To revert a deployment and bring the code on the target nodes back to a previous state, find the SHA1 of the deploy repository commit that contained the good code and then deploy it with:
$ scap deploy --rev <sha1>
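Finding the revision to roll back to can be sketched as follows. Both helpers are hypothetical and not part of Scap3; the 7-character minimum is an assumption based on git's usual default abbreviation length, and whether scap deploy --rev accepts abbreviated SHA1s is not confirmed here.

```shell
# Sketch only: hypothetical helpers, not part of scap.

# Show the most recent commits of the deploy repository, newest first.
# Run from /srv/deployment/<service-name>/deploy; $1 = how many (default 10).
recent_revisions() {
    git log --oneline -n "${1:-10}"
}

# Succeed if $1 looks like a full or abbreviated SHA1 (7 to 40 hex digits).
# This checks the form only; it does not verify that the commit exists.
looks_like_sha1() {
    case "$1" in
        '' | *[!0-9a-f]*) return 1 ;;
    esac
    [ "${#1}" -ge 7 ] && [ "${#1}" -le 40 ]
}
```

A revision picked from the recent_revisions listing can be checked with looks_like_sha1 before it is passed to scap deploy --rev, which guards against pasting a branch name or a truncated value by mistake.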
Service Status Inspection
Each service that provides a monitoring specification (via spec.yaml) can be directly checked for health on each of the target nodes by issuing:
$ check-<service-name>
where <service-name> is the name of the service in ops/puppet. If all is well, you should receive the response:
All endpoints are healthy
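After a restart, a service may need a moment before its endpoints respond, so the check can be retried. The sketch below assumes only what is stated above, namely that a healthy service prints "All endpoints are healthy"; the function names are hypothetical, and the one-second pause between attempts is an arbitrary choice.

```shell
# Sketch only: hypothetical helpers built around the documented response text.

# Succeed when the given text contains the healthy response line.
is_healthy_output() {
    case "$1" in
        *'All endpoints are healthy'*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run a check command up to N times, one second apart, until it reports healthy.
# $1 = maximum number of attempts; remaining arguments = the command to run
wait_until_healthy() {
    attempts="$1"
    shift
    while [ "$attempts" -gt 0 ]; do
        if is_healthy_output "$("$@" 2>&1)"; then
            return 0
        fi
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}
```

On a target node this could be invoked as wait_until_healthy 5 check-<service-name>, which returns success as soon as one attempt prints the healthy response.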
For services that write their logs locally, there is an additional command that lets you view the logs on a target node in a human-readable format:
$ tail-<service-name>
This command accepts all of the arguments that tail does, so to follow the logs as they arrive, use:
$ tail-<service-name> -f