BounceHandler

From Wikitech

The BounceHandler extension is currently installed everywhere in the production and beta clusters. Here are a few notes on where the knobs are and where to look to see what's happening.

How it works

MediaWiki sends outgoing email to account holders, such as for notifications from their Watchlist, Recent changes, Echo events, account management, and more. MediaWiki sends these via php-sendmail (T325131) and msmtp. The BounceHandler extension adds VERP headers to each email, which advertises support for the Variable Envelope Return Path feature to recipient mail servers out in the world.

If someone's inbox is unreachable or bounces for any other reason, the recipient mail server may send us a bounce mail. Upon receipt of such a bounce, Postfix is configured to send a POST request to the MediaWiki API (to the BounceHandler API specifically).

The BounceHandler API validates the request ($wgBounceHandlerInternalIPs) and queues a job to the JobQueue to handle the bounce.
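As an illustration of the validation step, the sketch below checks a request's source address against an allowlist of addresses and CIDR ranges. It is written in Python rather than the extension's PHP, the name `is_internal_ip` and the example ranges are hypothetical, and it should not be read as the extension's actual implementation.

```python
import ipaddress

# Hypothetical stand-in for $wgBounceHandlerInternalIPs; these ranges are
# illustrative only, not the production values.
INTERNAL_IPS = ["127.0.0.1", "::1", "10.0.0.0/8"]


def is_internal_ip(remote_addr, allowed=INTERNAL_IPS):
    """Return True if remote_addr falls inside any allowed address or range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are never trusted.
        return False
    for entry in allowed:
        # A bare address such as "127.0.0.1" becomes a single-host network.
        network = ipaddress.ip_network(entry, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

A request from an address outside the list would be rejected before any job is queued.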

The job then records the bounce event for this address in its database (MariaDB#x1) and periodically prunes old bounces from the database for any address (wgBounceRecordMaxAge). If the bounce count for a given email address reaches a predefined threshold (wgBounceRecordLimit), the email address is marked as "unconfirmed" (the same state it was in right after account creation). This effectively unsubscribes the address from all future email, and prevents our servers from sending more undeliverable emails for this address to the recipient's mail server. The account in question is also notified of this via the Echo extension, and the user will see the notification the next time they log in or otherwise browse the wiki.
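The record, prune and threshold steps described above can be sketched roughly as follows. This is a simplified Python model, not the extension's PHP code: the in-memory list stands in for the bounce_records table, `record_bounce` is a hypothetical name, the 60-day maximum age is an assumed value, and "reaches the threshold" is interpreted as greater than or equal to the limit.

```python
from datetime import datetime, timedelta

# Hypothetical values standing in for wgBounceRecordMaxAge and
# wgBounceRecordLimit. The 60-day age is an assumption for illustration.
BOUNCE_RECORD_MAX_AGE = timedelta(days=60)
BOUNCE_RECORD_LIMIT = 5


def record_bounce(records, email, now):
    """Record a bounce, prune expired rows, and report whether to unconfirm.

    records is a list of (email, timestamp) tuples standing in for the
    bounce_records table; it is modified in place.
    """
    records.append((email, now))
    cutoff = now - BOUNCE_RECORD_MAX_AGE
    # Prune old bounces for every address, not only the one that just bounced.
    records[:] = [(e, ts) for (e, ts) in records if ts >= cutoff]
    count = sum(1 for (e, _) in records if e == email)
    # True means the address should be marked as unconfirmed.
    return count >= BOUNCE_RECORD_LIMIT
```

Because expired rows are pruned before counting, only bounces inside the age window count toward the limit.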

Production

enwiki --> sends email to someuser@somedomain.com ( with 'return-path' => 'wiki-someuser.somedomain,com-{hash}@wikimedia.org' )
--> routes through polonium.wikimedia.org --> rejected midway, or rejected by mx.somedomain.org ( bounce created )
--> bounce ( 'To' => 'wiki-someuser.somedomain,com-{hash}@wikimedia.org' ) reaches polonium.wikimedia.org
--> bounce is HTTP POSTed to test2.wikipedia.org, which looks up the CA user table and adds an entry to the 'bounce_records' table
--> if bounces > threshold, the user is unsubscribed.

Beta Cluster

enwiki --> sends email to someuser@somedomain.com ( with 'return-path' => 'wiki-someuser.somedomain,com-{hash}@beta.wmflabs.org' )
--> routes through mx.beta.wmflabs.org --> rejected midway, or rejected by mx.somedomain.org ( bounce created )
--> bounce ( 'To' => 'wiki-someuser.somedomain,com-{hash}@beta.wmflabs.org' ) reaches mx.beta.wmflabs.org
--> bounce is HTTP POSTed to meta.wikimedia.beta.wmflabs.org, which looks up the CA user table and adds an entry to the 'bounce_records' table
--> if bounces > threshold, the user is unsubscribed.

Main configuration

These settings are kept in wmf-config/InitialiseSettings.php.
To toggle the unsubscribe action:

$wgBounceHandlerUnconfirmUsers = true;

The threshold for the number of allowed bounces is:

$wgBounceRecordLimit = 5;

Deployment information

Database used: wikishared
Database cluster: extension1

View logs/records

To query the bounce_records table in production:

$ mwscript sql.php --wiki=mediawikiwiki --cluster extension1 --wikidb 'wikishared' --replicadb any
mysql> SELECT * FROM bounce_records;

That returns a very long list (possibly more than a few tens of thousands of rows). To print the bounces recorded in the past 24 hours, run:

SELECT * from bounce_records where br_timestamp > date_format((now() - interval 1 day),'%Y%m%d%H%i%s');

and for a count of the records from the past month, run:

SELECT count(*) from bounce_records where br_timestamp > date_format((now() - interval 1 month),'%Y%m%d%H%i%s');
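The date_format pattern in these queries produces a 14-digit YYYYMMDDHHMMSS string, which suggests br_timestamp is stored in that form. As a rough sketch, the same cutoff can be computed client-side in Python; the helper names `mw_timestamp` and `cutoff_for_last` are hypothetical, and the use of UTC is an assumption that may not match the database server's now().

```python
from datetime import datetime, timedelta, timezone


def mw_timestamp(moment):
    """Format a datetime as a 14-digit YYYYMMDDHHMMSS string."""
    return moment.strftime("%Y%m%d%H%M%S")


def cutoff_for_last(days, now=None):
    """Return the timestamp string for `days` days before `now`.

    When `now` is omitted, the current time in UTC is used (an assumption).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return mw_timestamp(now - timedelta(days=days))


# Example: one day before 2015-03-01 12:00:00.
print(cutoff_for_last(1, datetime(2015, 3, 1, 12, 0, 0)))  # 20150228120000
```

Because every timestamp has the same width and runs from most to least significant field, comparing the strings gives the same order as comparing the dates, which is why the plain `>` in the queries works.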

To check the BounceHandler logs, log into mwlog1001 and run:

 
jgreen@mwlog1001:/a/mw-log$ cat BounceHandler.log

To get the number of unsubscribes for the past 80 days:

 
jgreen@mwlog1001:/a/mw-log$ gunzip -c BounceHandler*gz | grep -i un-sub | wc -l

To get the number of unsubscribes/day for the past 80 days:

 
mwlog1001:~$ gunzip -c /a/mw-log/archive/BounceHandler.log-2015* | grep Un-sub|awk '{print $1}'|sort|uniq -c|awk '{print $2 " " $1}'|sort -n
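For readers who prefer a script to the pipeline above, the same per-day count can be sketched in Python. Like the awk step, it assumes the first whitespace-separated field of each log line is the date. The function names are hypothetical and the actual log line format is not shown on this page, so treat this as an outline.

```python
import gzip
from collections import Counter


def unsubscribes_per_day(lines):
    """Count lines containing 'Un-sub', keyed by their first field.

    Mirrors: grep Un-sub | awk '{print $1}' | sort | uniq -c
    """
    counts = Counter()
    for line in lines:
        if "Un-sub" in line:
            fields = line.split()
            if fields:
                counts[fields[0]] += 1
    # Sorted list of (day, count) pairs.
    return sorted(counts.items())


def read_gzip_lines(path):
    """Yield text lines from one gzip-compressed log file."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from handle
```

To use it, pass the lines of each archived file through `unsubscribes_per_day`, for example `unsubscribes_per_day(read_gzip_lines(path))` for a single file.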

See also