MariaDB/Decommissioning a DB Host

Prerequisites:

SSH access to one of the cluster management hosts (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet) to depool the host and run the decommissioning script
SSH access to puppetmaster1001.eqiad.wmnet to merge puppet changes
Access to Pwstore
Git repositories cloned to your host:
- git clone ssh://gerrit.wikimedia.org:29418/operations/puppet

Create a decommission ticket with the following template: https://phabricator.wikimedia.org/maniphest/task/edit/form/52/
If there are hardware problems, mention them in the ticket so that DC-Ops can label the host and broken parts are not reused.

SSH to one of the cluster management hosts (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet)
dbctl instance HOSTNAME depool && dbctl config commit -m "Depool HOSTNAME TASKNUMBER"
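To keep the hostname and task number consistent between the depool and the commit message, the command can be generated by a small helper. This is only a sketch: the function name depool_cmd and the task number T123456 are hypothetical, and the helper prints the command for review instead of running dbctl.

```shell
#!/bin/sh
# Print (not run) the depool command for a host so it can be reviewed first.
# depool_cmd is a hypothetical helper name; dbctl itself is not invoked here.
depool_cmd() {
    host="$1"   # short hostname, e.g. db1091
    task="$2"   # Phabricator task number (T123456 below is a placeholder)
    if [ -z "$host" ] || [ -z "$task" ]; then
        echo "usage: depool_cmd HOSTNAME TASKNUMBER" >&2
        return 1
    fi
    printf 'dbctl instance %s depool && dbctl config commit -m "Depool %s %s"\n' \
        "$host" "$host" "$task"
}

# Example with placeholder values:
depool_cmd db1091 T123456
```

Copy the printed line to the cluster management host once it looks right.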

Create a puppet patch (example: https://gerrit.wikimedia.org/r/c/operations/puppet/+/638343)
SSH to puppetmaster1001
sudo puppet-merge (if you see changes other than your own, contact their owners to confirm they are OK to merge)
SSH to one of the cluster management hosts (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet)
sudo dbctl config commit -m "Remove HOSTNAME from dbctl TASKNUMBER"

Create a puppet patch (example: https://gerrit.wikimedia.org/r/c/operations/puppet/+/638352)
1. Changes to DHCP are no longer needed, so there is no need to edit linux-host-entries.ttyS1-115200
DO NOT merge the patch yet

SSH to one of the cluster management hosts (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet)
Start a screen or tmux session
sudo cookbook sre.hosts.decommission -t TASKNUMBER HOSTNAME.DC.wmnet
Enter console password from Pwstore
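The decommission step takes the host's fully qualified name, which is easy to mistype. The sketch below builds it from the short hostname and datacenter and prints the two commands from the step above. The function name decom_cmds, the tmux session name, and the example values are hypothetical; nothing is executed.

```shell
#!/bin/sh
# Print the commands for the decommission step; nothing is executed.
# decom_cmds and the "decom-HOST" session name are illustrative choices.
decom_cmds() {
    host="$1"   # short hostname, e.g. db1091
    dc="$2"     # datacenter, e.g. eqiad or codfw
    task="$3"   # Phabricator task number (placeholder below)
    if [ -z "$host" ] || [ -z "$dc" ] || [ -z "$task" ]; then
        echo "usage: decom_cmds HOSTNAME DC TASKNUMBER" >&2
        return 1
    fi
    # A named tmux session keeps the cookbook running if the SSH session drops.
    printf 'tmux new-session -s decom-%s\n' "$host"
    printf 'sudo cookbook sre.hosts.decommission -t %s %s.%s.wmnet\n' \
        "$task" "$host" "$dc"
}

# Example with placeholder values:
decom_cmds db1091 eqiad T123456
```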

SSH to puppetmaster1001
sudo puppet-merge (if you see changes other than your own, contact their owners to confirm they are OK to merge)

Log the action in IRC (#wikimedia-operations) - !log Removing HOSTNAME from zarcillo TASKNUMBER
SSH to one of the cluster management hosts (cumin1002.eqiad.wmnet, cumin2002.codfw.wmnet)
sudo -i
Zarcillo
1. db-mysql db1215 -A zarcillo
2. Execute the following queries in the MySQL prompt (remember the trailing semicolon on each):
  1. set binlog_format='ROW';
  2. delete from servers where hostname like 'HOSTNAME%';
  3. delete from instances where name like 'HOSTNAME%'; (instance names are normally HOSTNAME or HOSTNAME:PORT)
  4. delete from section_instances where instance like 'HOSTNAME%';
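Because the deletes use a LIKE prefix match, it can help to count the matching rows before removing them. The sketch below prints the cleanup statements for one hostname, preceded by count queries against the same tables and columns. The function name zarcillo_sql and the preview queries are additions for illustration, not part of the documented procedure; the output is meant to be pasted into the MySQL prompt by hand.

```shell
#!/bin/sh
# Print preview and cleanup SQL for one host; nothing is sent to the database.
# zarcillo_sql is a hypothetical helper name.
zarcillo_sql() {
    host="$1"   # short hostname, e.g. db1091
    # Accept only lowercase letters, digits and hyphens, so a stray quote
    # or wildcard cannot end up inside the generated statements.
    case "$host" in
        ""|*[!a-z0-9-]*)
            echo "usage: zarcillo_sql HOSTNAME" >&2
            return 1
            ;;
    esac
    printf '%s\n' '-- Preview: count the rows each delete would remove.'
    printf "select count(*) from servers where hostname like '%s%%';\n" "$host"
    printf "select count(*) from instances where name like '%s%%';\n" "$host"
    printf "select count(*) from section_instances where instance like '%s%%';\n" "$host"
    printf '%s\n' '-- Cleanup: the statements from the steps above.'
    printf "set binlog_format='ROW';\n"
    printf "delete from servers where hostname like '%s%%';\n" "$host"
    printf "delete from instances where name like '%s%%';\n" "$host"
    printf "delete from section_instances where instance like '%s%%';\n" "$host"
}

# Example with a placeholder hostname:
zarcillo_sql db1091
```

If a count is higher than expected, the prefix is probably matching another host as well, and the pattern should be narrowed before any delete is run.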

Orchestrator will purge the host automatically within 1-2 weeks, but to avoid that delay it should be removed manually:

From the GUI (admin users only)
From the CLI:
1. Log the action in IRC (#wikimedia-operations) - !log Removing HOSTNAME from orchestrator TASKNUMBER
2. SSH to dborch1001.wikimedia.org
3. Single-instance host: sudo orchestrator -c forget -i HOSTNAME:3306 (use the FQDN for the HOSTNAME)
4. Multi-instance host: sudo orchestrator -c forget -i HOSTNAME:PORT for each HOSTNAME:PORT combination (use the FQDN for the HOSTNAME)
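For multi-instance hosts the forget command has to be repeated once per port. The sketch below prints one command per HOSTNAME:PORT combination and falls back to 3306 when no port is given. The function name forget_cmds and the example hostnames and ports are hypothetical; the commands are printed, not run.

```shell
#!/bin/sh
# Print one "orchestrator -c forget" command per instance; nothing is executed.
# forget_cmds is a hypothetical helper name.
forget_cmds() {
    if [ "$#" -lt 1 ] || [ -z "$1" ]; then
        echo "usage: forget_cmds FQDN [PORT...]" >&2
        return 1
    fi
    fqdn="$1"   # fully qualified hostname, as the steps above require
    shift
    # No ports given: treat it as a single-instance host on 3306.
    [ "$#" -gt 0 ] || set -- 3306
    for port in "$@"; do
        printf 'sudo orchestrator -c forget -i %s:%s\n' "$fqdn" "$port"
    done
}

# Single-instance example (placeholder hostname):
forget_cmds db1091.eqiad.wmnet
# Multi-instance example (placeholder hostname and ports):
forget_cmds db1099.eqiad.wmnet 3311 3318
```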

Mark all the steps under "step for service owners" on: https://phabricator.wikimedia.org/T267088
Reassign:
- for eqiad to wiki_willy
- for codfw to wiki_willy
Remove the #DBA tag and add #dc-ops plus either #ops-eqiad or #ops-codfw, depending on the datacenter.
Add the following comment: "This host is ready for DC-Ops to decommission".