Portal:Data Services/Admin/Runbooks/Enable NFS for a project
Overview
NFS is the primary shared storage system for projects in Cloud VPS and is the main platform for users to place code on the Toolforge execution environment. When a Cloud VPS project would like to use shared storage for one reason or another, we provide a fairly simple path for them to do so. Generally all this will be done in response to a ticket.
We use NFS for several purposes within Cloud VPS:
- dumps -> Uses an existing shared NFS server; you'll need to update the yaml.
- scratch -> Uses an existing shared NFS server; you'll need to update the yaml.
- $home -> Uses a project-internal NFS server; you'll need to create it and update the yaml.
- /data/project -> Uses a project-internal NFS server; you'll need to create it and update the yaml.
In all cases, you'll also have to enable NFS on each VM where you want it mounted.
Create a project internal NFS server
There are ready-made cookbooks for creating a new project-local NFS server; those cookbooks are documented at Portal:Data_Services/Admin/Runbooks/Create_an_NFS_server. You may also want to create a second, failover server if users of the project are sensitive to downtime.
Typically for a project named 'foo' you would create a server with the prefix 'foo-nfs' and the volume name 'foo'.
Scaling up to a larger VM (e.g. with more cores or RAM) is fairly easy, so feel free to start with a small-sized server flavor. Scaling up the storage size of an NFS server should also be fairly straightforward but it's best to leave some slack in the first place.
Once the new NFS server is built, note the path to the NFS volume. It will be something like /srv/foo. If you are planning to host multiple shares from this server (e.g. both $home and /data/project) create subdirs for those shares: /srv/foo/home and /srv/foo/project.
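The subdirectory step above can be sketched as a small shell helper. The function name make_share_dirs is hypothetical, and the sketch assumes the volume is already mounted at the path you pass in; ownership and permissions are not covered here.

```shell
# Hypothetical helper: create one subdirectory per share under the volume path.
# Usage: make_share_dirs <volume path> <share> [<share> ...]
make_share_dirs() {
    base="$1"
    shift
    for share in "$@"; do
        # -p makes this safe to re-run if the directory already exists
        mkdir -p "$base/$share"
    done
}

# On the NFS server (as root), for the example project 'foo':
#   make_share_dirs /srv/foo home project
```

Passing the base path as an argument keeps the helper usable for any volume name, not just /srv/foo.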
Note that the new VM will not run an NFS service or export any shares until the yaml configuration step, below.
Get the new service fqdn
Any server created using the wmcs.nfs.add_server cookbook with the --service-ip flag will already have a service address associated with it which you can find via the horizon dns->zones interface.
We will need this fqdn later to update the yaml file.
Update yaml config in Puppet
First we'll need to find out the project gid.
Find out the GID for the project
The NFS server will want to know the project GID. On any cloud VM, you can run:
$ getent group project-$project_name
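getent prints group entries as name:password:gid:members, so the GID is the third colon-separated field. A minimal sketch for extracting it follows; the function names are illustrative and not part of any existing tooling.

```shell
# Print the numeric GID (third colon-separated field) of a getent group line.
gid_from_group_line() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Look up the GID for a project group; returns non-zero if the group is unknown.
project_gid() {
    line="$(getent group "project-$1")" || return 1
    gid_from_group_line "$line"
}

# Example (run on a Cloud VPS VM):
#   project_gid testlabs
```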
Add entry to the yaml
Add a section for the new project to modules/labstore/templates/nfs-mounts.yaml.erb with the project gid and whichever mounts are required. An entry for a volume that mounts everything we've got would look like this:
testlabs:
  gid: 50302
  mounts:
    dumps: true
    home: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/home
    project: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/project
    scratch: scratch.svc.cloudinfra-nfs.eqiad1.wikimedia.cloud:/srv/scratch
Dumps is a special case and only needs to be set to 'true' to work. Any other mount must specify the address of the NFS server followed by the path to the share on that server, separated by a colon. Rather than using a specific VM's fqdn you can use the service name, which will ease any future failovers. Any server created using the wmcs.nfs.add_server cookbook with the --service-ip flag will already have a service address associated with it, which you can find via the horizon dns->zones interface.
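To show how the server:path mount values fit together, here is a rough sketch that prints a candidate entry for a project with home and project shares. The function name nfs_mounts_entry is hypothetical, the /srv/<project>/home and /srv/<project>/project layout follows the naming convention described earlier, and the indentation is an assumption that should be matched to the surrounding entries in the real file.

```shell
# Hypothetical helper: print a candidate nfs-mounts.yaml entry.
# Arguments: project name, project GID, NFS service fqdn.
# Assumes shares live under /srv/<project>/home and /srv/<project>/project.
nfs_mounts_entry() {
    project="$1"
    gid="$2"
    server="$3"
    printf '%s:\n' "$project"
    printf '  gid: %s\n' "$gid"
    printf '  mounts:\n'
    printf '    home: %s:/srv/%s/home\n' "$server" "$project"
    printf '    project: %s:/srv/%s/project\n' "$server" "$project"
}

# Example:
#   nfs_mounts_entry testlabs 50302 testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud
```

Review the output before pasting it into the template; add dumps or scratch lines by hand if the project needs them.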
Once puppet is patched, run run-puppet-agent on labstore1004. This will update nfs-exportd's configuration and restart it. That should create a new file for the project under /etc/exports.d on labstore1004, configured with the project's IPs. There should also be a lot of 'nfsd' processes running on the server.
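A quick way to check both outcomes after the puppet run might look like the sketch below. The helper name has_export_file is hypothetical, and it assumes the export file's name contains the project name, which the text above does not spell out; adjust the match if the real naming differs.

```shell
# Succeeds if the exports directory contains a file whose name includes the
# project name. The naming pattern is an assumption; adjust as needed.
# Usage: has_export_file <project> [<exports dir>]
has_export_file() {
    project="$1"
    dir="${2:-/etc/exports.d}"
    ls "$dir" 2>/dev/null | grep -qF -- "$project"
}

# On the server, after the puppet run:
#   has_export_file testlabs && echo "export file present"
#   pgrep -c nfsd    # prints the number of running nfsd processes
```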
Note that this same yaml file is also consumed by the NFS client hosts: it tells them what to mount.
Enabling on the VMs
Use the hiera key mount_nfs to opt in or out (e.g. mount_nfs: true). The default is currently false. Once the work above is complete, a puppet run on a VM with this key set to true will mount the NFS shares as specified.
Users can be instructed to do this step themselves. Enabling it also sets up tc traffic shaping on the client VM, which will not remove itself if NFS is later removed. Setting mount_nfs: false will not remove NFS mounts; you must unmount them by hand after changing hiera.
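For the manual cleanup, it helps to list what is actually mounted over NFS first. The sketch below reads the standard mounts table (device, mount point, filesystem type, ...) and prints NFS mount points; the function name nfs_mount_points is illustrative, and the unmount loop is left as a comment because it should only be run deliberately.

```shell
# Print the mount points of all NFS filesystems listed in a mounts table.
# Usage: nfs_mount_points [<mounts file>]   (defaults to /proc/mounts)
nfs_mount_points() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# On the client VM (as root), after setting mount_nfs: false in hiera:
#   nfs_mount_points                 # review the list first
#   for m in $(nfs_mount_points); do umount "$m"; done
```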
Historical notes for bare-metal pre-NFS servers
As of 2022-02-15 we are rapidly moving cloud-vps projects onto project-specific NFS servers. Straggler projects may still use the bare-metal servers but any new projects should follow the newer docs, above.
Old $home and /data/project mounts were stored on labstore1004; the old maps and scratch mounts were on cloudstore1009 in /srv/maps and /srv/scratch.
Support contacts
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud or the bridged Telegram group
- Discuss via email after you have subscribed to the cloud@ mailing list
- Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
- Read the News wiki page
- Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
- Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)