RESTBase
RESTBase is an API proxy serving the REST API at /api/rest_v1/. It uses Cassandra as a storage backend.
It currently runs on hosts with the profile::restbase class.
Deployment and config changes
RESTBase is deployed by Scap.
What to check after a deploy
Deploys do not always go according to plan, and regressions are not always obvious. Here is a list of things you should check after each deploy:
- Does the API documentation still load? Consider exercising some of the endpoints from the UI (perhaps by requesting an html render).
- Check error logs in logstash.
- Have a look at the metrics in Grafana. Have latencies increased, or error rates jumped? Is memory utilization consistent with expectations? What about storage (op rates, exceptions, etc)?
- Consider making an edit to a page using Visual Editor.
- Take a look at some recent Visual Editor-performed changes (French Wikipedia works great for this, as they use VE by default). Do the diffs look reasonable?
- Keep a close eye on #wikimedia-operations; if someone spots a problem, they're likely to raise the issue there.
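The first item on this checklist can be partly automated. The following is a minimal sketch rather than an official tool: it assumes the public English Wikipedia endpoint https://en.wikipedia.org/api/rest_v1/ and treats the page/html route as the "html render". The function names, the User-Agent string and the title "Dog" are illustrative choices.

```python
import urllib.error
import urllib.request

# Assumption: the public REST API root for English Wikipedia. Adjust this for
# the wiki or internal service address you actually want to test.
API_ROOT = "https://en.wikipedia.org/api/rest_v1/"


def http_status(url, timeout=10):
    """Return the HTTP status code of a GET request to url."""
    req = urllib.request.Request(url, headers={"User-Agent": "restbase-smoke-check"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their status code.
        return err.code


def smoke_check(fetch=http_status, title="Dog"):
    """Check the documentation root and one HTML render; return {url: ok}."""
    urls = [API_ROOT, f"{API_ROOT}page/html/{title}"]
    return {url: fetch(url) == 200 for url in urls}
```

Calling smoke_check() performs the two requests and returns a dictionary mapping each URL to True or False. It only shows that the endpoints answer with HTTP 200, so it complements the manual checks rather than replacing them.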
Other considerations
Be sure to log all actions ahead of time in #wikimedia-operations. Don't be shy about including details.
Administration
Adding a new RESTBase host
Before following these instructions, ensure you follow the provisioning documentation for a new Cassandra node.
- Add hosts to the deployment list in the Restbase deploy repo
- If there have been changes to the restbase service since you applied the correct roles to the host (the latest deployed version should be pulled via Puppet during the first puppet runs), deploy restbase to the hosts. From deployment.eqiad.wmnet, run cd /srv/deployment/restbase/deploy/, then git pull, and then scap deploy -f -l restbaseNNNN.DC.wmnet "First deploy to restbaseNNNN"
- Add the hosts to conftool-data
- If the hosts are healthy in Icinga at this point and if you feel it is safe as regards deployment timing and so on, pool the hosts:
sudo confctl select name=restbaseNNNN.DC.wmnet set/pooled=yes:weight=10
- Verify that the hosts have been added and are healthy via the pybal API
Renewing expired certificates
Every now and again Cassandra certificates will come close to expiry (for example: SSL WARNING - Certificate restbase2016-a valid until 2020-11-29 09:26:14 +0000 (expires in 53 days)). Certificates need to be deleted and recreated in the Puppet secrets directory. See the Cassandra documentation for details.
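To see how the "expires in N days" figure in such an alert relates to the "valid until" timestamp, the calculation can be sketched as below. This is an illustration only: the timestamp format is taken from the example alert above, and the function name days_until_expiry is hypothetical, not part of any monitoring tool.

```python
from datetime import datetime, timezone


def days_until_expiry(valid_until, now=None):
    """Return the number of whole days from now until the expiry timestamp.

    valid_until uses the format seen in the alert text,
    e.g. "2020-11-29 09:26:14 +0000".
    """
    expiry = datetime.strptime(valid_until, "%Y-%m-%d %H:%M:%S %z")
    if now is None:
        now = datetime.now(timezone.utc)
    # timedelta.days rounds down, so an expired certificate gives a negative value.
    return (expiry - now).days
```

For instance, evaluated at midnight UTC on 2020-10-07, the timestamp from the example alert is 53 days away.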
Monitoring
instance-data
In production, the instance-data path is usually a RAID array. It is used for hints, commitlogs and caches, all vital to the stable operation of the Cassandra instances. Under unusual circumstances (a large rebalancing, an instance behaving erroneously, etc.) this mount can fill up quickly and space will sometimes be required to back out of this condition. For this reason, we set a lower threshold for disk free on this path than for other disks.
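For an ad hoc look at how much headroom a mount has, a check along these lines can help. It is a rough sketch, not the production alert: the real thresholds live in the monitoring configuration and are not given here, so the min_free default of 0.2 is a made-up placeholder, and the function names are hypothetical.

```python
import shutil


def free_fraction(path):
    """Return the fraction of the filesystem containing path that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def needs_attention(path, min_free=0.2, usage_fn=free_fraction):
    """Return True when the free fraction is below min_free.

    min_free is a placeholder value, not the production threshold.
    """
    return usage_fn(path) < min_free
```

Pass the instance-data mount point of the host as path. The usage_fn parameter exists only so the logic can be exercised without touching a real disk.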
Debugging
To temporarily switch to local logging for debugging, you can change the config.yaml log stanza like this:
logging:
  name: restbase
  streams:
    # level can be trace, debug, info, warn, error
    - level: info
      path: /tmp/debug.log
Alternatively, you can log to stdout by commenting out the streams sub-object. This is useful for debugging startup failures like this:
cd /srv/deployment/restbase/deploy/
sudo -u restbase node restbase/server.js -c /etc/restbase/config.yaml -n 0
The -n 0 parameter avoids forking off any workers, which reduces log noise. Instead, a single worker is started up right in the master process.
Analytics and metrics
Hive query for action API & rest API traffic:
use wmf;
SELECT
SUM(IF (uri_path LIKE '/api/rest_v1/%', 1, 0)) as count_rest,
SUM(IF (uri_path LIKE '/w/api.php%', 1, 0)) as count_action
FROM wmf.webrequest
WHERE webrequest_source = 'text'
AND year = 2017
AND month = 9
AND (uri_path LIKE '/api/rest_v1/%' OR uri_path LIKE '/w/api.php%');
Notes on purging
RESTBase uses cache-control headers to control caching on the request/response cycle. To purge a URL, try one of the following:
- Run a purge on wiki
- Changeprop will eventually pregenerate the content on restbase
- Remove the stored entry from the Cassandra table
- Run a manual HTTP request to purge
To purge the Cassandra content for a given URL, e.g. /en.wikipedia.org/v1/page/mobile-html/Dog, run the following request:
curl restbase.svc.codfw.wmnet:7233/en.wikipedia.org/v1/page/mobile-html/Dog -H "cache-control: no-cache"
Script
import pandas
import requests
import mpire

# Local RESTBase address that the purge requests are sent to
BASE_PATH = "http://127.0.0.1:7231"
WORKER_POOL_SIZE = 6


def read_input(path, column):
    # Load the CSV and keep only the de-duplicated URL column, renamed to "path"
    df = pandas.read_csv(path)
    df = df[[column]]
    df = df.drop_duplicates(subset=[column])
    df.rename(columns={column: "path"}, inplace=True)
    return df


def purge_url(url):
    # Request the URL with "cache-control: no-cache" to purge it
    rest_url = f"{BASE_PATH}{url}"
    res = requests.get(rest_url, headers={"cache-control": "no-cache"})
    return res


def purge_urls(urls):
    # Send the purge requests in parallel and return the responses
    with mpire.WorkerPool(n_jobs=WORKER_POOL_SIZE) as pool:
        results = pool.map(purge_url, urls, progress_bar=True)
    return results


if __name__ == "__main__":
    df = read_input("<CSV_PATH>", "<COLUMN>")
    purge_urls(df.path)
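The script sends the purge requests but does not inspect the responses. If the list produced by pool.map is kept (for instance by returning it from purge_urls), the outcome can be tallied per status code. A minimal sketch follows; the helper name summarize_statuses is hypothetical, and it relies only on the status_code attribute that requests responses provide.

```python
from collections import Counter


def summarize_statuses(responses):
    """Count responses per HTTP status code."""
    return Counter(res.status_code for res in responses)
```

A summary made up mostly of 200s suggests the purge requests were accepted. Any other codes point at URLs worth retrying or checking by hand.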