Swift/Open Issues Aug - Sept 2012/Cruft on ms7
Here's the todo list of things left on ms7 so we can knock 'em off one at a time.
You can see what things the web server is still referencing by:
- ssh over to ms7 as root
- cd into the /opt/local/share/ directory
- run the command
dtrace -qs ./access_log.d
You can see what files are being accessed via nfs but it's more work and there's a lot more traffic to sift through:
- ssh over to ms7 as root
- run the command (for maybe 20 seconds, half a minute)
snoop -o /root/snoop-out.txt ms7
- check the contents for nfs creations via
snoop -i /root/snoop-out.txt | grep NFS | grep CREATE | grep ' -> ms7' | more
- check for nfs lookups via
snoop -i /root/snoop-out.txt | grep NFS | grep LOOKUP | grep ' -> ms7' | more
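The two grep pipelines above can be folded into one pass over the text dump. This is only a rough sketch in Python: it assumes, as the greps do, that a relevant line contains "NFS", the operation name, and " -> ms7". The function name and the sample lines are made up for illustration, and real snoop output may be laid out differently.

```python
from collections import Counter


def count_nfs_ops(lines, host="ms7", ops=("CREATE", "LOOKUP")):
    """Count NFS operations sent to `host` in snoop's text output.

    Mirrors the grep pipeline: a line counts only if it contains "NFS",
    one of the operation names, and " -> <host>".
    """
    counts = Counter()
    marker = f" -> {host}"
    for line in lines:
        if "NFS" not in line or marker not in line:
            continue
        for op in ops:
            if op in line:
                counts[op] += 1
    return counts


# Hypothetical sample lines, not captured from ms7.
sample = [
    "  1   0.00000 srv193 -> ms7 NFS C CREATE3 FH=0A1B (UNCHECKED) foo.png",
    "  2   0.00012 ms7 -> srv193 NFS R CREATE3 OK FH=77C2",
    "  3   0.00100 srv201 -> ms7 NFS C LOOKUP3 FH=0A1B bar.png",
    "  4   0.00200 srv201 -> ms7 TCP D=80 S=40112 Ack=1 Seq=2",
]
print(count_nfs_ops(sample))
```

To use it on a real capture, pipe the output of snoop -i into a file and pass the open file object as `lines`.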
You can also see the HTTP requests that hit the server (much easier):
- ssh over to ms7 as root
snoop -c 1000 -d aggr1 port 80 |grep GET
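To see at a glance which top-level directories are still being hit, the GET lines can be tallied by their first path segment. A minimal sketch, assuming each line of interest contains "GET" followed by a path; the function name is hypothetical and the exact snoop line layout is not relied on.

```python
import re
from collections import Counter

# Matches "GET" followed by whitespace and a path that starts with "/".
GET_PATH = re.compile(r"GET\s+(/\S*)")


def tally_top_level(lines):
    """Count GET requests per first path segment (e.g. 'skins', 'wikipedia')."""
    counts = Counter()
    for line in lines:
        match = GET_PATH.search(line)
        if not match:
            continue
        path = match.group(1).split("?", 1)[0]  # drop any query string
        segments = [s for s in path.split("/") if s]
        counts[segments[0] if segments else "/"] += 1
    return counts


requests = [
    "GET /skins/monobook/main.css",
    "GET /skins/common/commonPrint.css",
    "GET /wikipedia/en/0/",
    "GET /portal/wikipedia/en/fundraiserportal.js 200",
]
for name, hits in tally_top_level(requests).most_common():
    print(name, hits)
```

Directories that stop showing up in the tally over a long enough capture are candidates for moving out of the way.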
And finally you can get a raw count of mkdirs, creates etc for the various nfs versions (3,4, etc) by:
- ssh to ms7 as root and run (for 6 repetitions every 10 seconds)
nfsstat -s 10 6
Note that as of Nov 21 2012, a half minute of snooping shows no nfs create and no nfs lookup traffic, yay!
NEED TO BE MOVED
docroot
There are a few files, like index.html, robots.txt, favicon.ico and perhaps crossdomain.xml (see the discussion on jars below), that would be nice to keep. One idea is to put them in the docroot repository as another site and serve them from a misc webserver.
- index.html http://upload.wikimedia.org/ gives a 403, so index.html can probably be removed safely ( https://upload.wikimedia.org/index.html ). Hashar
- robots.txt instructs crawlers to NOT browse upload.wikimedia.org. Maybe it can be served directly by the caches? Hashar
- Seems reasonable. Aaron 17:44, 7 September 2012 (UTC)
- favicon.ico Probably just used for https://upload.wikimedia.org/. Need a solution similar to the robots.txt one. Hashar
ALREADY MOVED
ext-dist
MediaWiki currently writes these tarballs to a specified (nfs-mounted) filesystem and they are then served by a web server living on the host which houses the filesystem. It creates the tarballs after first svn updating the contents of another nfs-mounted filesystem. Currently both of these nfs-mounted filesystems live on ms7 (which has a webserver running on it).
Status:
- ExtensionDistributor is rewritten and deployed. fenari has had the xinetd service and extdist crons removed. The extension distributor stuff on ms7 can probably be safely deleted now.
Questions:
- Can we move the rest of the extensions into git, or package up the ones in svn, put them somewhere for download and be done with it? Or do we expect folks to continue to develop in svn?
- We're dropping support from the new ExtensionDistributor for old SVN extensions. If you want something old & unmaintained, you can get it from SVN yourself.
- If we really need to save this extension, how much of a pain would it be to avoid nfs? This entails, as Aaron says below, running a separate MediaWiki installation on a box that has local storage just for this. That amounts to under 5 GB now (/export/upload/ext-dist and /export/upload/private/ExtensionDistributor/mw-snapshot on ms7). The web server to serve the tarballs would live on the same host as the MW installation. This would be a hack like the old test.wp, which is to say only one host would be the real backend for whatever mw project name we gave it.
- The extension makes it *much* easier for people to download extensions for their wiki. We want to keep it--just kill the NFS requirement
- If we want a short term fix, this means finding a misc box with web server and some space that can nfs export some directory. download.wikimedia.org is on kaulen now, it has 200gb of room, but it's had some performance issues lately related to bugzilla. Do we have a better location?
- Found a short-term fix that's also a long-term fix, see below.
Final solution:
- Rewrote ExtensionDistributor so it no longer needs clones or disk space in which to build archives
- Instead, fetch the archive urls from Github's API (configurable, if we decide Github is a bad choice later) and provide those to users to download directly.
- This needed a small change to the url downloading proxy on linne.
- No more NFS, no more shelling out to git/tar, way more reliable!
Earlier discussion: Used by ExtensionDistributor. Where should source tarballs live anyways? MW tarballs live on dataset2, extensions on ms7, there's talk about nightly mw tarballs. Where do we want them all?
- This can be moved to another NFS server if $wgExtDistTarDir/$wgExtDistTarUrl/$wgExtDistWorkingCopy are updated, along with the $extdist_working_dir cron variable (http://wikitech.wikimedia.org/view/Cron#misc::extension-distributor). Aaron 01:37, 7 September 2012 (UTC)
- To avoid NFS, this can also just go on its own mediawiki install and server and be linked to from mw.org. Aaron 22:00, 17 September 2012 (UTC)
ConfirmEdit
FancyCaptcha's code has been updated so that it can use the swift or multi-write backends. I think what's left is updating filebackend.php with an appropriate backend stanza (or will one of the existing ones work?), creating the container(s?), and then adding a value for $wgCaptchaFileBackend in CommonSettings.php.
Earlier discussion:
There are also subdirectories captcha and captcha2. In CommonSettings.php:
- $wgCaptchaDirectory = '/mnt/upload6/private/captcha';
That's used by ConfirmEdit's FancyCaptcha module. I don't see references to captcha2 anywhere. I hope we can find someone who knows about it.
- This could be moved to another NFS server if $wgCaptchaDirectory is changed. The files are streamed via MW special page, so there is no url path global. Aaron 17:35, 7 September 2012 (UTC)
- Ideally, this would go elsewhere. My NFS criteria are (1) files can be regenerated, (2) amount/size of files is very finite, (3) brief downtime is tolerable. Captcha doesn't quite meet (3). The way it picks a random file is not very efficient though and would be annoying to move to swift. Aaron 17:42, 7 September 2012 (UTC)
- This will be moved to Swift and be changed to use memcached. Aaron 08:07, 16 September 2012 (UTC)
I see that change here: [1]
math
Update: we still see GETs for these on ms7, with paths upload.wikimedia.org/project/lang/math/.... There was nothing that would cause the parser cache to reparse these pages with the correct path. On Oct 5 $wgCacheEpoch was changed to Sept 08 00:00 2012, and deployed. Articles revalidated between those dates by the squids (and not edited recently) will have these paths in them. We can expect them to gradually expire until Dec 5th, at which point they should be gone.
Still seeing some hits for these on ms7, expected as it's not yet Dec 5th. -- ArielGlenn (talk) 16:54, 21 November 2012 (UTC)
Used by Extension:Math for rendering of mathematical formulas in articles. Aaron has committed code to move these files into Swift; see [2]. Like thumbs, they can be regenerated at any time from the original content. Do we want to consider periodic cleanups of them?
NOTE ALSO that the subdirectories /wikipedia/(langcode)/math are still in use; I see GETs to these urls and files being returned. These are obsolete and will gradually disappear as cached content referencing them expires.
math.tmp
Scratch area for Extension:Math, still used. We don't want this cruft in Swift, where can it go? Can't temporary files live on the apaches that generate them, until they are cleaned out by cron or a reboot?
Last directory modification was Aug 22, last file created was
-rw-r--r-- 1 apache apache 306 Aug 19 00:27 c475af0fc6a341d865339933e251aba7.png
So this is now dead.
wikipedia/(langcode)/timeline
Update: these should now be written to swift and to NFS, and very few GET requests for these seem to show up on ms7.
This is still being referenced by Extension:EasyTimeline. Aaron and a little bit of Faidon are working on moving them to Swift, along with math (see above).
jars
This has the Cortado video player, used by Extension:OggHandler and Extension:TimedMediaHandler.
We reference it in CommonSettings.php:
$wgCortadoJarFile = "$urlprotocol//upload.wikimedia.org/jars/cortado.jar";
and /extensions/TimedMediaHandler/MwEmbedModules/EmbedPlayer/resources/mw.EmbedPlayerJava.js (warning: hardcoded!):
( function( mw, $ ) {
"use strict";
/**
 * List of domains and hosted location of cortado. Lets clients avoid the security warning
 * for cross domain java applet loading.
 */
window.cortadoDomainLocations = {
	'upload.wikimedia.org' : 'http://upload.wikimedia.org/jars/cortado.jar'
};
Cortado is hosted in upload.wikimedia.org because of Java cross-domain restrictions that could perhaps be worked around using crossdomain.xml (Java >= 6 upd. 10, needs to be checked if that's sufficient and/or works). If that works, what would be an appropriate place for it? Bits?
- From tests with a cortado.jar on my own webserver and a commons video, crossdomain.xml works, so we now have one on ms7. I also grepped through the squid logs to check for Java versions. Requests for cortado.jar are very few: 29 hits in all of sampled-1000.log; of those only 2 were running an older Java. As for a new location, Tim said that bits would indeed be the proper place to move it to. Tim is fixing OggHandler and I opened a bug for TMH to make the path configurable. Faidon 06:57, 7 September 2012 (UTC)
favicons
Besides favicon.ico (see above), there's also /{{project}}/favicon-piece.ico used by pretty much all the sites and /{{project}}/favicon-si.ico. Antoine was working on a BZ ticket for a request for a new favicon and Faidon noticed; Antoine is working on moving them to bits.
- Changes 23074 and 23077. Files will need to be removed from /mnt/upload6 when the change has been merged AND verified. Hashar 14:56, 7 September 2012 (UTC)
- I have merged and deployed both changes. Hashar 23:17, 12 September 2012 (UTC)
Test favicon was migrated today with 23675 Hashar 07:50, 25 September 2012 (UTC)
private
We need to verify that deleted/oversighted images and images on private wikis are stored and served from Swift and handled correctly. When that's done most things under here can go.
wikipedia/commons/scan
The ScanSet extension appears to use a subdirectory under wikipedia/commons. From CommonSettings.php:
$wgScanSetSettings = array(
	'baseDirectory' => '/mnt/upload6/wikipedia/commons/scans',
	'basePath' => "$urlprotocol//upload.wikimedia.org/wikipedia/commons/scans",
);
But everything in there is from 2005 with Tim's name on it.
So the scans directory is not actually written to by this extension. And there have been no updates for many years. The short term fix would be to create a scans container in swift, make sure that things like opendir, readdir, @file(...) are replaced by abstractions and be done with it. This will take some work.
Or if it turns out that this extension is not likely to be more broadly used in its present form we could save the contents of the directory elsewhere for future use. Need to check with Tim.
ScanSet really serves a few static (= that have to be put up there by someone with SSH access) files that Tim uploaded a while back. Tim originally planned for the extension to be able to receive user content but this never happened. As such, it seems quite an exception to me and we really should find a way to not keep it that way. Could we perhaps move the Encyclopedia Britannica files to Commons and let ScanSet reference them from there? Faidon 03:49, 6 September 2012 (UTC)
Wikisource has a more current scan of Britannica, so this copy can be tossed. I believe an email will be going around about this later from Tim. -- ArielGlenn 21:30, 12 September 2012 (UTC)
Fixed, in revision 736661fc6a38caa703cdd650419d94d76565d8ba by Tim, who disabled the extension, archived the data and notified wikisource folks.
NEED TO BE FIXED
originals??
A few originals are still being served by ms7. How is this possible? Maybe this is the "extra dot in the url" trick mentioned in an email. Needs fixed.
I see one image with a referer from an article. Here's that info:
GET /wikipedia/commons/d/d5/Apollo_11_Lunar_Module_Eagle_in_landing_configuration_in_lunar_orbit_from_the_Command_and_Service_Module_Columbia.jpg 200 (http://en.wikipedia.org/wiki/Apollo_11)
How did it get to ms7?
I haven't seen such hits at all and double-checked squid configs. Let's move everything else and then leave a running tcpdump^Wsnoop for a longer period to see if we get any more hits. Faidon 03:49, 6 September 2012 (UTC)
Here's one from just now: GET /wikipedia/commons/9/9d/Monarch.a321-200.g-ozbi.arp.jpg 200 (-) on ms7. I see about one of these every second or two. -- ArielGlenn 01:15, 7 September 2012 (UTC)
Still seeing a few. Example: GET /wikipedia/commons/3/38/Chocolate_coated_citrus_peel_01.jpg 200 (-) -- ArielGlenn (talk) 16:53, 21 November 2012 (UTC)
Thumb example: GET /wikipedia/en/shared/thumb/d/d0/Bilingual_signage-Gallo.jpg/180px-Bilingual_signage-Gallo.jpg 200 (-)
broken urls
We occasionally get things like
GET /wikipedia/en/0/
which should be rejected (by the squids I guess) so they never make it here.
- This might be requests for files in the DB but that are missing (they generate links like this). Aaron 16:55, 5 September 2012 (UTC)
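As an illustration of the kind of check the front end could apply, here is a hedged sketch that flags paths which stop at a hash directory and never name a file. The layout /project/lang/h/hh/filename is an assumption inferred from the paths quoted on this page, and the function name is made up; this is not how the squids are configured.

```python
import re

# Assumed layout: /<project>/<lang>/<h>/<hh>/<filename>, with <h> one hex
# digit and <hh> two. A path that ends at <h>/ or <hh>/ names no file.
TRUNCATED = re.compile(r"^/[^/]+/[^/]+/[0-9a-f]/(?:[0-9a-f]{2}/?)?$")


def is_truncated_hash_path(path):
    """Return True if the path ends at a hash directory with no filename."""
    return TRUNCATED.match(path) is not None


for candidate in (
    "/wikipedia/en/0/",
    "/wikipedia/commons/3/38/Chocolate_coated_citrus_peel_01.jpg",
):
    print(candidate, is_truncated_hash_path(candidate))
```

Thumb and archive paths have extra segments and are deliberately not covered here.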
NEED EQUIV ON SWIFT
pybaltestfile.txt
Once everything's off of here, this can go. Note from Ben or Faidon on the etherpad:
Swift does have this at monitoring/pybal.txt; see lvs.pp for the full URL. rewrite.py will probably have to be modified to serve this file.
404
While not blocking ms7 in any way, it'd be nice to continue having 404 (and other error?) handlers in Swift. Currently we reply with some cryptic default Swift error page that even has the rewritten URL in it. This could be implemented using rewrite.py or perhaps in Varnish, if/when we switch upload traffic to it.
DON'T NEED ANYMORE
lost-image-thumb-backup
We don't use this for anything; it looks like a copy of some files that was made when trying to restore media after fixing some bug in 2008. Tim would know whether it's worth preserving a copy of this directory someplace on a backup or offline, in case we need to dig through it. We're talking about 13500 files for a total of around 320 MB.
created by this: fenari:~tstarling/thumbDirScrewup/pass2.php
Tim says we can toss.
mime.php
Can't find any references to this anywhere, neither on ms7 nor in the mw configs. It just contains a single (old) copy of the wfMimeType function. I hope this means we can move it out of the way.
portal
This used to be used by the fundraiser with settings like these in CommonSettings.php:
- $wgFundraiserPortalDirectory = "/mnt/upload6/portal";
- $wgFundraiserPortalPath = "$urlprotocol//upload.wikimedia.org/portal";
We should check to see if it's still needed. Jeff?
I see references:
GET /portal/wikipedia/en/fundraiserportal.js 200 (http://test.prototype.wikimedia.org/wiki/Main_Page)
Yet another reason to get rid of prototype :-P
These are from FundraiserPortal, extension by Trevor, a donate button in the SideBar, no longer in use. Must track down the people on prototype and get that crap turned off.
Roan is turning it off. yay.
scripts
symlink to sync-from-home.
skins
These used to be served from ms7 (see old CommonSettings.php):
$wgStyleSheetPath = 'http://upload.wikimedia.org/skins';
but for quite some time now they have been served from the bits cluster:
$wgStyleSheetPath = "$urlprotocol//bits.wikimedia.org/static-$wmfVersionNumber/skins";
so it should be ok to move this out of the way.
And yet.... I see some requests like
GET /skins/monobook/main.css
GET /skins/common/commonPrint.css
Not very many, and they all have 301's. I guess we oughta think about those. I have referers for those too.
- Serving skins from upload.wm.o should no longer be necessary, they have been on bits.wm.o for years. There may be some legacy custom stuff that's still using upload.wm.o URLs, but depending on what those are (you say you have referers, can I see them?) we can fix them, or decide to send 301s to bits, or decide to just screw them and stop serving the skins URLs. --Catrope 19:17, 29 August 2012 (UTC)
- Ariel gave me the referer and it looks like it's just a mirror with very outdated code, displaying Commons images and adding ads. If that's the only referer, I don't feel bad 404ing some CSS files on them. --Catrope 19:29, 29 August 2012 (UTC)
sync-from-home
Some pretty old scripts live here. One copies things to ms4 from the days it was a thumbs server; another is a very old copy of thumb-handler.php. Most are no longer used since ms7 doesn't serve thumbs. However /export/upload/scripts/404.php is still in use, which handles 404 errors for everything else. (See /opt/webserver7/https-ms7/config/obj.conf) When all ms7 service is turned off, this can go away.
x1
A one line shell script of JeLuf's which appears to test timezone and date formatting. I expect it can go.