Requestctl/Tutorials

From Wikitech

This page contains tutorials for using the requestctl tool.

Tutorial: Add a new action in Varnish

In this tutorial, you will learn how to use requestctl to create an action that throttles per-IP requests coming from Azure that don't have an accept-encoding header, have Connect: keep-alive as a header, and go to a special page.

Get ipblocks

Use the get command to get existing ipblocks in the datastore:

:~$ requestctl get ipblock -o json | jq -r 'keys[]'
abuse/blocked_nets
abuse/bot_blocked_nets
abuse/bot_posts_blocked_nets
abuse/phabricator_abusers
abuse/text_abuse_nets
cloud/akamai
cloud/aws
cloud/azure
cloud/digitalocean
cloud/gcp
cloud/linode
cloud/oci
cloud/public_cloud_nets
known-clients/googlebot

We already have the ipblocks for Azure (cloud/azure). They originate from a cronjob running on the puppetservers and can be found in the file /srv/git/conftool/auditlog/request-ipblocks/cloud/azure.yaml.
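The same listing can be scripted instead of piped through jq. The following is a minimal sketch, assuming the JSON emitted by requestctl get ipblock -o json is an object keyed by scope/name (which is what the jq keys[] filter implies). The sample payload, including its cidrs values, is made up for illustration, and ipblocks_by_scope is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Hypothetical sample of `requestctl get ipblock -o json` output.
# Only the top-level keys matter here; the values are placeholders.
SAMPLE = """
{
  "cloud/azure": {"cidrs": ["192.0.2.0/24"]},
  "cloud/aws": {"cidrs": ["198.51.100.0/24"]},
  "abuse/blocked_nets": {"cidrs": []}
}
"""


def ipblocks_by_scope(raw_json):
    """Group ipblock names by scope, e.g. {'cloud': ['aws', 'azure']}."""
    grouped = defaultdict(list)
    for key in sorted(json.loads(raw_json)):
        scope, name = key.split("/", 1)
        grouped[scope].append(name)
    return dict(grouped)


print(ipblocks_by_scope(SAMPLE))
```

Grouping by scope makes it easy to see at a glance whether a cloud provider such as Azure already has an entry.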

Look for request pattern

Next, check if we have a request pattern that corresponds to not having an accept-encoding header:

:~$ requestctl get pattern
name                pattern
------------------  --------------------------------
req/cache_buster
req/cache_buster_q  ?q=\w{12}
req/specific_page
ua/urllib3          User-Agent: ^python-urllib3/.*$
ua/requests         User-Agent: ^python-requests/.*$
ua/curl             User-Agent: ^curl/.*$
ua/MediaWiki        User-Agent: ^MediaWiki/.*$
sites/commonswiki   Host: commons.wikimedia.org
sites/wikidata      Host: www.wikidata.org
sites/enwiki        Host: en.wikipedia.org
url/api             url:^/w/(api|rest).php
url/docroot         url:^/[?$]
url/page            url:^/wiki/
url/semicolon_page  url:^/wiki/.+:+

It doesn't look like we have a pattern for that, so let's add one!
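To get a feel for what the URL patterns in the listing match, you can try their regular expressions against a few sample paths. This is only a sketch: it uses Python's re module, which behaves like the regex engine used by Varnish for simple expressions such as these but is not identical to it. The sample URLs and the matching_patterns helper are made up for illustration.

```python
import re

# Regexes copied from the `requestctl get pattern` listing above.
URL_PATTERNS = {
    "url/api": r"^/w/(api|rest).php",
    "url/page": r"^/wiki/",
    "url/semicolon_page": r"^/wiki/.+:+",
}


def matching_patterns(url):
    """Return the sorted names of the URL patterns whose regex matches `url`."""
    return sorted(name for name, rx in URL_PATTERNS.items() if re.search(rx, url))


for url in ("/w/api.php?action=query", "/wiki/Main_Page", "/wiki/Special:Random"):
    print(url, "->", matching_patterns(url))
```

None of these patterns inspects the Accept-Encoding header, which is why a new pattern is needed.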

Add a pattern

To add a new pattern that corresponds to not having an accept-encoding header:

  1. Create a file named no_accept_encoding.yaml on a puppetserver
  2. Populate the file with the following content:
header: 'Accept-Encoding'

Omitting any header_value translates to "no header present" (see requestctl#Add_a_new_pattern).
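The "no header_value means no header present" rule can be modelled in a few lines. The following is a simplified sketch of that semantics, not requestctl's actual implementation: pattern_matches is a hypothetical helper, and the request headers are made-up examples.

```python
import re


def pattern_matches(pattern, headers):
    """Evaluate a header pattern against a dict of request headers."""
    # HTTP header names are case-insensitive.
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get(pattern["header"].lower())
    if not pattern.get("header_value"):
        # No header_value: the pattern matches when the header is absent.
        return value is None
    # Otherwise the header must be present and match the regex.
    return value is not None and re.search(pattern["header_value"], value) is not None


no_accept_encoding = {"header": "Accept-Encoding"}
print(pattern_matches(no_accept_encoding, {"User-Agent": "curl/8.0"}))   # header absent
print(pattern_matches(no_accept_encoding, {"accept-encoding": "gzip"}))  # header present
```

The first call matches because the header is missing; the second does not because the client sent one.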

Sync to etcd

Next, sync the pattern object you created to etcd:

puppetserver1001:~$ sudo requestctl apply pattern req/no_accept_encoding -f no_accept_encoding.yaml
2022-03-28 14:56:24,094 - reqctl (cli:_write:362) - INFO - Creating pattern req/no_accept_encoding

This command creates an object in the datastore based on your YAML file. To confirm that your object has been created, you can use the get command:

$ requestctl get pattern req/no_accept_encoding -o yaml

Add another pattern

Now you can do the same with Connect: keep-alive. Create the file keepalive.yaml containing:

header: Connect
header_value: keep-alive

...and sync again with the apply command:

puppetserver1001:~$ sudo requestctl apply pattern req/keepalive -f keepalive.yaml

Write the action

Now that you have all the ingredients, you can create a new action using the patterns you just created. Make a new file called bot_from_azure.yaml, and populate its contents (see the requestctl user guide for field details):

# This should tell anyone what this rule does
comment: "Throttle requests with keepalive but no accept-encoding, coming from azure."
# Please note: enabled will NOT be considered when syncing, as the enabled state
# of actions is controlled by the `enable` command, see below
enabled: false
# each pattern and ipblock is referred to using {pattern,ipblock}@<scope>/<name>
expression: pattern@req/keepalive AND pattern@req/no_accept_encoding AND ipblock@cloud/azure
# Only bother with cache misses
cache_miss_only: true
# We want to throttle individual ips
do_throttle: true
throttle_per_ip: true
# Allow 100 requests per 10 seconds, and if exceeded, ban for 1 minute
throttle_requests: 100
throttle_interval: 10
throttle_duration: 60
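To build intuition for how the three throttle_* values interact, here is a simplified token-bucket model with a ban period. It is a sketch under stated assumptions, not the code Varnish runs: the Throttle class is hypothetical, time is passed in explicitly, and the client IP is a documentation address.

```python
class Throttle:
    """Token bucket with a ban period, keyed by an arbitrary string."""

    def __init__(self, limit, period, block):
        self.limit = limit      # throttle_requests
        self.period = period    # throttle_interval, in seconds
        self.block = block      # throttle_duration, in seconds
        self._state = {}        # key -> (tokens, last_seen, blocked_until)

    def is_denied(self, key, now):
        tokens, last_seen, blocked_until = self._state.get(
            key, (float(self.limit), now, 0.0)
        )
        if now < blocked_until:
            return True  # still banned
        # Refill tokens in proportion to the time elapsed since the last request.
        refill = (now - last_seen) * self.limit / self.period
        tokens = min(float(self.limit), tokens + refill)
        if tokens < 1:
            # Budget exhausted: start the ban.
            self._state[key] = (tokens, now, now + self.block)
            return True
        self._state[key] = (tokens - 1, now, blocked_until)
        return False


throttle = Throttle(limit=100, period=10, block=60)
allowed = sum(not throttle.is_denied("203.0.113.7", now=0.0) for _ in range(150))
print(allowed)                                      # requests let through in the burst
print(throttle.is_denied("203.0.113.7", now=30.0))  # during the ban
print(throttle.is_denied("203.0.113.7", now=60.0))  # after the ban expires
```

Because throttle_per_ip is true, each client IP gets its own bucket; in this model that corresponds to using the IP as the key.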

Sync to etcd

To sync your action object to the datastore, run sudo requestctl apply action cache-text/bot_from_azure -f bot_from_azure.yaml. You can then confirm the stored object with the get command:

:~$ sudo requestctl get action cache-text/bot_from_azure -o yaml
cache-text/bot_from_azure:
  cache_miss_only: true
  comment: Throttle requests with keepalive but no accept-encoding, coming from azure
  do_throttle: true
  enabled: false
  expression: pattern@req/keepalive AND pattern@req/no_accept_encoding AND ipblock@cloud/azure
  resp_reason: ''
  resp_status: 429
  sites: []
  throttle_duration: 60
  throttle_interval: 10
  throttle_per_ip: true
  throttle_requests: 100
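Before applying an action, it can be handy to list the objects its expression refers to, so you can check that each one exists. The sketch below only handles the AND-joined form used in this tutorial; it is not requestctl's own parser, and parse_references is a hypothetical helper.

```python
def parse_references(expression):
    """Split an AND-joined expression into (kind, scope, name) tuples."""
    refs = []
    for token in expression.split(" AND "):
        kind, _, path = token.strip().partition("@")
        scope, _, name = path.partition("/")
        if kind not in ("pattern", "ipblock") or not scope or not name:
            raise ValueError(f"malformed reference: {token!r}")
        refs.append((kind, scope, name))
    return refs


EXPRESSION = (
    "pattern@req/keepalive AND pattern@req/no_accept_encoding "
    "AND ipblock@cloud/azure"
)
for kind, scope, name in parse_references(EXPRESSION):
    print(f"{kind}: {scope}/{name}")
```

Each printed line can then be checked with the corresponding requestctl get pattern or requestctl get ipblock command.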

Check your new action

Your new action won't show up in Varnish yet, but you can already use the requestctl vcl command to check the VCL it will generate:

puppetserver2001:~$ requestctl vcl 'cache-text/bot_from_azure'

// FILTER bot_from_azure
// Throttle requests with keepalive but no accept-encoding, coming from azure
// This filter is generated from data in etcd. To disable it, run the following command:
// sudo requestctl disable 'cache-text/bot_from_azure'
if (req.http.Connect ~ "keep-alive" && !req.http.Accept-Encoding && req.http.X-Public-Cloud ~ "azure" && vsthrottle.is_denied("requestctl:bot_from_azure:" + req.http.X-Client-IP, 100, 10s, 60s)) {
    return (synth(429, ""));
}

This gives you a first check of the action before it goes live. For an additional layer of safety in your rollout, you can also obtain a VSL query that matches the same condition in the logs of a cache server, ready to use with varnishncsa:

puppetserver2001:~$ requestctl log 'cache-text/bot_from_azure'

Monitor requests matching this action using the following command:
sudo varnishncsa -n frontend -g request \
  -F '"%{X-Client-IP}i" %l %u %t "%r" %s %b "%{Referer}i" "%{User-agent}i" "%{X-Public-Cloud}i"' \
  -q 'ReqHeader:Connect ~ "keep-alive" and not ReqHeader:Accept-Encoding and ReqHeader:X-Public-Cloud ~ "azure" and VCL_ACL eq "NO_MATCH wikimedia_nets"'

Inject to Varnish and commit

To actually get the action injected into the Varnish configuration, run:

puppetserver1001:~$ sudo requestctl enable cache-text/bot_from_azure

And finally commit all of your changes to the injected vcl:

puppetserver1001:~$ sudo requestctl commit
--- cache-text/global.old

+++ cache-text/global.new

@@ -1,3 +1,12 @@

+
+// FILTER bot_from_azure
+// Throttle requests with keepalive but no accept-encoding, coming from azure
+// This filter is generated from data in etcd. To disable it, run the following command:
+// sudo requestctl disable 'cache-text/bot_from_azure'
+if (req.http.Connect ~ "keep-alive" && !req.http.Accept-Encoding && req.http.X-Public-Cloud ~ "azure" && vsthrottle.is_denied("requestctl:bot_from_azure:" + req.http.X-Client-IP, 100, 10s, 60s)) {
+    return (synth(429, ""));
+}
+
 
 // FILTER parameter_1
 // Common cache-busting attack that is recurring

==> Ok to commit these changes?
Type "go" to proceed or "abort" to interrupt the execution
> abort

At this point, if you type "go" instead of "abort" at the input, the action will appear on all cache-text nodes. That is because you didn't define the sites property for your new action object.
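The effect of the sites property can be summarised as a small predicate. This is a sketch of the rule as described here (an empty sites list means every node in the cluster), not requestctl's implementation; applies_to is a hypothetical helper and example-dc is a made-up datacenter name.

```python
def applies_to(action, datacenter):
    """An empty or missing `sites` list means the action applies everywhere."""
    sites = action.get("sites") or []
    return not sites or datacenter in sites


everywhere = {"sites": []}          # like cache-text/bot_from_azure above
restricted = {"sites": ["eqsin"]}   # limited to a single datacenter

print(applies_to(everywhere, "eqsin"))
print(applies_to(restricted, "eqsin"))
print(applies_to(restricted, "example-dc"))
```

If you want a gradual rollout, listing one datacenter in sites first keeps the rule away from the other nodes.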

BEFORE YOU GO: Remove the files you created earlier. If you want to see your objects in file form, you'll find them under /srv/git/conftool/auditlog on the puppetservers shortly after they are added to etcd.

Tutorial: Add a new action in haproxy

In this tutorial, you will learn how to use requestctl to limit bandwidth usage and/or concurrency for specific users or request patterns, by injecting rules in the TLS termination layer at the edge (haproxy).

Imagine there is a set-top-box or streaming device for a specific provider in southeast Asia that is used by millions of people. Every set-top-box downloads the English Wikipedia image of the day more or less at the same time. We want to limit the bandwidth that users of this ISP can globally consume on each TLS terminator in the eqsin datacenter. The rule will be restricted to the upload cluster, and specifically to upload.wikimedia.org.

Add an ipblock

Let's assume we don't already have an ipblock for this ISP, so we need to create one. The new ipblock should capture all the IPv4/IPv6 CIDRs belonging to the provider's AS. It must be added to one of the three existing categories of ipblocks: abuse, known-clients, or cloud. Let's pick known-clients, as it fits best.

First, on any puppetserver, create a file named acme_corp.yaml. Populate the file content as follows:

comment: Acme corp ISP
cidrs:
- 115.243.116.0/22
- 115.243.120.0/21
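Before syncing, you may want to sanity-check that the CIDRs are well formed and cover the addresses you expect. A minimal sketch using Python's standard ipaddress module follows; the two test addresses are arbitrary examples and in_ipblock is a hypothetical helper.

```python
import ipaddress

# CIDRs from acme_corp.yaml above. ip_network() raises ValueError if a
# CIDR has host bits set, which catches common typos.
CIDRS = ["115.243.116.0/22", "115.243.120.0/21"]
NETWORKS = [ipaddress.ip_network(cidr) for cidr in CIDRS]


def in_ipblock(ip):
    """Return True if `ip` falls inside any of the ipblock's networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)


print(in_ipblock("115.243.117.5"))   # inside 115.243.116.0/22
print(in_ipblock("115.243.128.1"))   # just past 115.243.120.0/21
print(sum(net.num_addresses for net in NETWORKS))
```

The last line prints the total number of addresses covered, a quick way to confirm that the block is about the size you expect.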

Sync to etcd

To sync your new ipblock object in the datastore, run:

$ sudo requestctl apply ipblock known-clients/acmecorp -f acme_corp.yaml

Add an haproxy action

Now that you have an ipblock object for this ISP, you can create an action to limit their bandwidth. Create another file named acmecorp_throttle.yaml. In this file, you define the action to be applied to the ipblock object you just created (see the requestctl user guide for field explanations):

comment: Limit bandwidth for acmecorp clients on upload.wikimedia.org
expression: ipblock@known-clients/acmecorp AND pattern@sites/upload
bw_throttle: true
bw_throttle_duration: 1000
bw_throttle_rate: 100000
log_matching: true
per_ip_concurrency: false
per_ip_concurrency_counter_index: -1
per_ip_concurrency_limit: 50
sites:
 - eqsin
resp_status: 429
resp_reason: Bandwidth exceeded
silent_drop: false
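To get a sense of how tight this limit is, you can convert the two bw_throttle values into a rate. The arithmetic below assumes that bw_throttle_rate is a byte count and bw_throttle_duration a period in milliseconds. Those units are an assumption, not something this page states; check the requestctl user guide before relying on them. The 2 MB image size is likewise a made-up example.

```python
# Values from acmecorp_throttle.yaml above.
bw_throttle_rate = 100000      # ASSUMED unit: bytes allowed per period
bw_throttle_duration = 1000    # ASSUMED unit: period length in milliseconds


def bytes_per_second(rate, period_ms):
    """Convert 'rate bytes every period_ms milliseconds' to bytes per second."""
    return rate * 1000 / period_ms


bps = bytes_per_second(bw_throttle_rate, bw_throttle_duration)
print(f"{bps:.0f} B/s = {bps * 8 / 1_000_000:.1f} Mbit/s")

# Time to serve a hypothetical 2 MB image at that rate.
print(f"{2_000_000 / bps:.0f} s")
```

Under these assumptions the limit is low enough that a single large image would take many seconds to transfer, which is the point of the rule.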

Sync to etcd

Sync your new haproxy action object to the datastore by running:

$ sudo requestctl apply haproxy_action cache-upload/acmecorp_bw_trhottle -f acmecorp_throttle.yaml

At this point, you could already commit your changes to etcd, and the rule would be created only to log the requests that match. But that doesn't make sense in this scenario: we are under duress, so we want to enable the rule right away.

Enable and sync the change to the TLS terminators

To enable your new haproxy_action, run:

$ sudo requestctl enable -s haproxy cache-upload/acmecorp_bw_trhottle

Then, to commit the change:

$ sudo requestctl commit
### Varnish VCL changes ###
### HAProxy DSL changes ###
--- null

+++ cache-upload/eqsin.new

@@ -0,0 +1,11 @@

+# ACLs generated for requestctl actions
+acl ipblock_cloud_acmecorp src,map_ip(/etc/haproxy/ipblocks.d/cloud.map) -m str "acmecorp"
+acl sites_upload req.fhdr(Host) -m reg -i upload.wikimedia.org
+
+# requestctl filter cache-upload/acmecorp_bw_trhottle
+# Throttle requests from acmecorp
+# This filter is generated from data in etcd. To disable it, run the following command:
+# sudo requestctl disable -s haproxy 'cache-upload/acmecorp_bw_trhottle'
+filter bwlim-out acmecorp_bw_trhottle limit 100000 period 1000
+http-request set-header x-requestctl "%[req.fhdr(x-requestctl),add_item(',',,' hap:acmecorp_bw_trhottle')]" if ipblock_cloud_acmecorp sites_upload
+http-request set-bandwidth-limit acmecorp_bw_trhottle if ipblock_cloud_acmecorp sites_upload

==> Ok to commit these changes?
Type "go" to proceed or "abort" to interrupt the execution
>

Note how the rules are only being added to cache-upload in the eqsin datacenter.

Next steps