Obsolete:Google storage
Google has donated some storage space to us for XML dumps and other items.
How to get set up
We like gsutil, a Python client built on the boto library. You can get it here. Unpack it wherever is convenient. Each user who runs it will end up with a ~/.boto config file, and each user will also need to add lines equivalent to the following to their .bashrc:
export PATH=${PATH}:full-path-to-where-you-unpacked-it/gsutil
export PYTHONPATH=${PYTHONPATH}:full-path-to-where-you-unpacked-it/gsutil/boto
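If .bashrc gets sourced more than once, the plain exports above add the same entries again each time. The following is a minimal sketch of one way to avoid that, not a required setup. GSUTIL_HOME and append_path are hypothetical names introduced here for illustration, and the default location under $HOME is an assumption; point GSUTIL_HOME at the gsutil directory you actually unpacked.

```shell
# Hypothetical variable: the gsutil directory you unpacked.
# The default of $HOME/gsutil is an assumption, not a required location.
GSUTIL_HOME="${GSUTIL_HOME:-$HOME/gsutil}"

# Hypothetical helper: print a colon-separated path list with a directory
# appended, unless the directory is already in the list.
#   $1 = current value (may be empty), $2 = directory to add
append_path() {
    case ":$1:" in
        *":$2:"*)
            # Already present: return the list unchanged.
            printf '%s\n' "$1"
            ;;
        *)
            if [ -n "$1" ]; then
                printf '%s\n' "$1:$2"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

PATH="$(append_path "$PATH" "$GSUTIL_HOME")"
PYTHONPATH="$(append_path "${PYTHONPATH:-}" "$GSUTIL_HOME/boto")"
export PATH PYTHONPATH
```

The effect is the same as the two export lines, except that sourcing the file repeatedly leaves PATH and PYTHONPATH unchanged after the first time.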
If you run gsutil without arguments, it gives a usage message.
For the first run, use a command with an argument, such as gsutil ls. It will prompt you for your dev keys and then exit. After that you can run real commands.
Getting keys
Ah yes, you need to get a set. We're in the process of working out a procedure for that.
Basics
- gsutil ls
- lists all buckets
There are no directories or subdirectories, only "buckets". Filenames can contain forward slashes.
- gsutil ls -L -b gs://my-bucket-name
- gives additional detail about a specific bucket
- gsutil ls gs://my-bucket-name
- lists the files in the specified bucket
- gsutil cp reallygreatfile gs://my-bucket-name
- copies a file from local system to the bucket
Copies can be done from one bucket to another as well.
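The copy commands above can be wrapped in small shell functions, sketched below. The function names (run, upload_file, copy_between_buckets) and the DRY_RUN switch are hypothetical conveniences made up for this example, not part of gsutil. The only gsutil call used is cp with local paths and gs:// URLs.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Copy a local file into a bucket.
#   $1 = local file, $2 = bucket name
upload_file() {
    run gsutil cp "$1" "gs://$2"
}

# Copy one file from a bucket to another bucket, keeping its name.
# The name may contain forward slashes; it is still a single filename.
#   $1 = source bucket, $2 = destination bucket, $3 = filename
copy_between_buckets() {
    run gsutil cp "gs://$1/$3" "gs://$2/$3"
}
```

For example, with DRY_RUN=1 set, upload_file reallygreatfile my-bucket-name prints the gsutil cp command it would run, which is a cheap way to check the bucket name before sending anything.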
Debugging
gsutil takes the -d option, which prints some HTTP headers, and the -D option, which prints more. When neither is enough, edit your ~/.boto config file, uncomment the line "#is_secure False" so that requests go over plain HTTP, and then use tcpdump to capture the HTTP packets as the command runs.
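A small wrapper can make it easier to step through those debug levels in order. This is only a sketch: gsutil_debug, its numeric levels, and the DRY_RUN switch are hypothetical names invented for this example. The -d and -D options are the ones described above.

```shell
# Hypothetical wrapper: run a gsutil command at a chosen debug level.
#   level 0 = no debug option
#   level 1 = -d (some HTTP headers)
#   level 2 = -D (more headers)
# Set DRY_RUN=1 to print the command instead of running it.
gsutil_debug() {
    level="$1"
    shift
    case "$level" in
        0) flag="" ;;
        1) flag="-d" ;;
        2) flag="-D" ;;
        *)
            echo "unknown debug level: $level" >&2
            return 2
            ;;
    esac
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "gsutil${flag:+ $flag} $*"
    else
        # $flag is left unquoted on purpose so an empty flag adds no argument.
        gsutil $flag "$@"
    fi
}
```

For example, gsutil_debug 1 ls gs://my-bucket-name runs the listing with -d. If you go as far as the packet capture, run tcpdump in a second terminal while the command executes; the interface and filter to use depend on your host.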