Swift/Logging and Metrics
This page may be outdated or contain incorrect details. Please update it if you can.
The "Swift" project is Current as of 2012-04-01. Owner: Bhartshorne. See also RT:1384
Trending graphs
Ganglia graphs several metrics (count, average duration, 90th-percentile duration, and maximum duration) for HTTP queries against Swift. These are broken out by HTTP method (GET, HEAD, PUT, etc.) and by the status code of the response (200, 204, 404, etc.).
Ganglia graphs for:
- proxy nodes: http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=&tab=v&vn=swift+fronted+proxies
- storage nodes: http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&m=&tab=v&vn=swift+backend+storage
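As a rough illustration of how per-method, per-status aggregates like these can be derived, the sketch below groups request durations by (method, status) and computes the four metrics. The names `summarize` and `percentile` and the nearest-rank percentile definition are assumptions made for this example; this page does not say how the values sent to Ganglia are actually computed.

```python
import math
from collections import defaultdict


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(requests):
    """Group (method, status, duration) tuples and compute the four metrics."""
    groups = defaultdict(list)
    for method, status, duration in requests:
        groups[(method, status)].append(duration)

    summary = {}
    for key, durations in groups.items():
        durations.sort()
        summary[key] = {
            "count": len(durations),
            "average": sum(durations) / len(durations),
            "p90": percentile(durations, 90),
            "max": durations[-1],
        }
    return summary


if __name__ == "__main__":
    # Durations taken from the example log lines on this page.
    sample = [
        ("GET", 200, 0.1088),
        ("GET", 404, 0.3171),
        ("GET", 404, 0.2171),
        ("PUT", 201, 0.1305),
    ]
    for (method, status), stats in sorted(summarize(sample).items()):
        print(method, status, stats)
```

Keying on the (method, status) pair mirrors the breakdown described above, so each graph series corresponds to one dictionary entry.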
Monitoring
Nagios currently attempts to connect to each Swift proxy server on port 80 to verify that the proxy is running. It does not currently check any other Swift processes, although it should.
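The kind of check described here amounts to a plain TCP connection attempt. The following is a minimal sketch of such a check, not the plugin Nagios actually runs; the function name `check_tcp` and the timeout value are assumptions. The exit codes 0 (OK) and 2 (CRITICAL) follow the standard Nagios plugin convention.

```python
import socket

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_tcp(host, port=80, timeout=5.0):
    """Return (exit_code, message) after attempting a TCP connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, "OK - connected to %s:%d" % (host, port)
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return CRITICAL, "CRITICAL - %s:%d unreachable (%s)" % (host, port, exc)
```

A wrapper script would print the returned message and exit with the returned code. A check like this only proves that something is listening on the port; it says nothing about whether the proxy can reach the storage nodes.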
Log lines
proxy examples
A few example log lines:
- The object doesn't exist yet: the 404 handler falls through, finds the object, and returns it successfully.
- note: the PUT is logged before the GET (the PUT happens entirely within the GET, so it finishes first, and each log line is written at the end of its query, not the beginning)
- the GET is logged as 404 even though the object was successfully returned (contrast with the third example below, where the object is completely missing)
- the HTTP response from the client's perspective was 200
Jan 27 01:39:08 copper proxy-server 127.0.0.1 127.0.0.1 27/Jan/2012/01/39/08 PUT /v1/AUTH_ade95207-9bcc-4bc9-bb67-06b417895b49/wikipedia-commons-local-thumb.97/9/97/Subaru_XV.jpg/800px-Subaru_XV.jpg HTTP/1.0 201 - - test%3Atester%2CAUTH_tka3106db61d6d47d9801162a4d9c3d174 75403 - - - - 0.1305
Jan 27 01:39:08 copper proxy-server 208.80.152.165 208.80.152.165 27/Jan/2012/01/39/08 GET /v1/AUTH_ade95207-9bcc-4bc9-bb67-06b417895b49/wikipedia-commons-local-thumb.97/9/97/Subaru_XV.jpg/800px-Subaru_XV.jpg HTTP/1.0 404 - curl/7.19.7%20%28x86_64-pc-linux-gnu%29%20libcurl/7.19.7%20OpenSSL/0.9.8k%20zlib/1.2.3.3%20libidn/1.15 - - - - - - 0.3171
- The object exists, so it is simply returned.
- note the shorter response time (0.1088 vs. 0.3171 seconds)
- note the 200 HTTP result
Jan 27 01:40:18 copper proxy-server 208.80.152.165 208.80.152.165 27/Jan/2012/01/40/18 GET /v1/AUTH_ade95207-9bcc-4bc9-bb67-06b417895b49/wikipedia-commons-local-thumb.97/9/97/Subaru_XV.jpg/800px-Subaru_XV.jpg HTTP/1.0 200 - curl/7.19.7%20%28x86_64-pc-linux-gnu%29%20libcurl/7.19.7%20OpenSSL/0.9.8k%20zlib/1.2.3.3%20libidn/1.15 - - 75403 - - - 0.1088
- The object doesn't exist at all.
- note: there is no PUT
- note: the GET is logged as 404, exactly as in the first example above, where the object was successfully returned
- the HTTP response from the client's perspective was 404
Jan 27 01:42:24 copper proxy-server 208.80.152.165 208.80.152.165 27/Jan/2012/01/42/24 GET /v1/AUTH_ade95207-9bcc-4bc9-bb67-06b417895b49/wikipedia-commons-local-thumb.97/9/97/Subaru_XaoeuaoeuaeoV.jpg/800px-Subaaoeuaoeuaeoru_XV.jpg HTTP/1.0 404 - curl/7.19.7%20%28x86_64-pc-linux-gnu%29%20libcurl/7.19.7%20OpenSSL/0.9.8k%20zlib/1.2.3.3%20libidn/1.15 - - - - - - 0.2171
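Because the proxy logs 404 in both the first and the third example, the status code alone cannot distinguish a regenerated object from a truly missing one. One possible way to tell them apart, sketched below, is to pair each GET 404 with a successful PUT for the same path logged immediately before it. The function name `classify_gets`, the tuple input, the label strings, and the shortened paths are all hypothetical, chosen for this example.

```python
def classify_gets(entries):
    """Label each GET in a list of (method, path, status) tuples in log order.

    Returns a list of (path, label), where label is one of
    "hit", "miss-regenerated", "miss" or "other".
    """
    results = []
    previous = None
    for method, path, status in entries:
        if method == "GET":
            if status == 200:
                label = "hit"
            elif status == 404:
                # The PUT issued while handling the 404 is logged just before
                # the GET that triggered it, so inspect the preceding entry.
                regenerated = (
                    previous is not None
                    and previous[0] == "PUT"
                    and previous[1] == path
                    and 200 <= previous[2] < 300
                )
                label = "miss-regenerated" if regenerated else "miss"
            else:
                label = "other"
            results.append((path, label))
        previous = (method, path, status)
    return results


if __name__ == "__main__":
    # Shortened, hypothetical paths standing in for the three cases above.
    log = [
        ("PUT", "/thumb/800px-Subaru_XV.jpg", 201),
        ("GET", "/thumb/800px-Subaru_XV.jpg", 404),
        ("GET", "/thumb/800px-Subaru_XV.jpg", 200),
        ("GET", "/thumb/800px-missing.jpg", 404),
    ]
    for path, label in classify_gets(log):
        print(label, path)
```

Strict adjacency is the simplest rule that matches the examples. On a busy proxy, where lines from different requests interleave, matching within a short time window would probably be more robust.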
full format description for proxy logs
The format is in /usr/lib/pymodules/python2.6/swift/proxy/server.py, lines 1697-1713:
self.access_logger.info(' '.join(quote(str(x)) for x in (
    client or '-',
    req.remote_addr or '-',
    time.strftime('%d/%b/%Y/%H/%M/%S', time.gmtime()),
    req.method,
    the_request,
    req.environ['SERVER_PROTOCOL'],
    status_int,
    req.referer or '-',
    req.user_agent or '-',
    req.headers.get('x-auth-token', '-'),
    getattr(req, 'bytes_transferred', 0) or '-',
    getattr(response, 'bytes_transferred', 0) or '-',
    req.headers.get('etag', '-'),
    req.headers.get('x-trans-id', '-'),
    logged_headers or '-',
    trans_time,
)))
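Since every field is URL-quoted before being joined with spaces, a log line can be split on whitespace and mapped back to the tuple in the logging call above. The sketch below, written for Python 3, does that. The field names in `PROXY_FIELDS`, the function name `parse_proxy_line`, and the five-token syslog prefix (month, day, time, host, program) are assumptions based on the example lines on this page, and the request path in the sample is a shortened placeholder.

```python
from urllib.parse import unquote

# Names chosen for illustration; the order follows the logging call above.
PROXY_FIELDS = (
    "client", "remote_addr", "timestamp", "method", "request", "protocol",
    "status", "referer", "user_agent", "auth_token", "bytes_in", "bytes_out",
    "etag", "trans_id", "logged_headers", "trans_time",
)
SYSLOG_PREFIX_TOKENS = 5  # month, day, time, host, program name


def parse_proxy_line(line):
    """Split one syslog-prefixed proxy log line into a dict of fields."""
    fields = line.split()[SYSLOG_PREFIX_TOKENS:]
    if len(fields) != len(PROXY_FIELDS):
        raise ValueError(
            "expected %d fields, got %d" % (len(PROXY_FIELDS), len(fields))
        )
    # Each field was URL-quoted when written, so undo that here.
    record = {name: unquote(value) for name, value in zip(PROXY_FIELDS, fields)}
    record["status"] = int(record["status"])
    record["trans_time"] = float(record["trans_time"])
    return record


if __name__ == "__main__":
    sample = (
        "Jan 27 01:40:18 copper proxy-server 208.80.152.165 208.80.152.165 "
        "27/Jan/2012/01/40/18 GET /v1/AUTH_example/container/object.jpg "
        "HTTP/1.0 200 - curl/7.19.7%20%28x86_64-pc-linux-gnu%29 "
        "- - 75403 - - - 0.1088"
    )
    for name, value in parse_proxy_line(sample).items():
        print("%-15s %s" % (name, value))
```

Fields logged as `-` are left as the string `-` rather than converted to `None`, which keeps the output close to the raw line.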
Additional monitoring we need
Please add suggestions here of additional monitoring that would be useful.