Help:Toolforge/ToolsDB
Toolforge tools have access to a shared MariaDB server at tools.db.svc.wikimedia.cloud
for tool-created databases. Database names must start with the name of the credential user followed by two underscores and then the name of the database: <credentialUser>__<DBName> (e.g. "s51234__mydb").
The credential user is not your user name. It can be found in your $HOME/replica.my.cnf file. The name of the credential user looks something like 'u1234' for a user and 's51234' for a tool account. You can also find the name of the credential user using a live database connection:
SELECT SUBSTRING_INDEX(CURRENT_USER(), '@', 1);
Privileges on the database
Users have all privileges and have access to all grant options on their databases. Database names ending with _p
are granted read access for everyone. Please create a ticket if you need more fine-grained permissions, like sharing a database only between 2 users, or other special permissions.
_p
) can also be accessed from Quarry and Superset.Steps to create a user database
To create a database on tools.db.svc.wikimedia.cloud
:
- Become your tool account.
maintainer@tools-login:~$ become toolaccount
- Connect to tools.db.svc.wikimedia.cloud with the replica.my.cnf credentials:
mariadb --defaults-file=$HOME/replica.my.cnf -h tools.db.svc.wikimedia.cloud
You could also just type:sql tools
- In the MariaDB console, create a new database (where CREDENTIALUSER is your credentials user, which can be found in your ~/replica.my.cnf file, and DBNAME the name you want to give to your database. Note that there are 2 underscores between CREDENTIALUSER and DBNAME):
MariaDB [(none)]> CREATE DATABASE CREDENTIALUSER__DBNAME;
You can then connect to your database using:
$ mariadb --defaults-file=$HOME/replica.my.cnf -h tools.db.svc.wikimedia.cloud CREDENTIALUSER__DBNAME
Or:
$ sql tools
MariaDB [(none)]> USE CREDENTIALUSER__DBNAME;
Example
Assuming that your tool account is called "mytool", this is what it would look like:
$ maintainer@tools-login:~$ become mytool
$ tools.mytool@tools-login:~$ mariadb --defaults-file=$HOME/replica.my.cnf -h tools.db.svc.wikimedia.cloud
MariaDB [(none)]> select substring_index(current_user(), '@', 1) as uname;
+---------------+
| uname |
+---------------+
| u123something |
+---------------+
1 row in set (0.00 sec)
MariaDB [(none)]> create database u123something__wiki;
tools-db
service name was deprecated in September 2017 and removed in May 2019. Use tools.db.svc.wikimedia.cloud instead.Note: Some projects like python-Django
can throw an exception like MySQLdb._exceptions.OperationalError: (1709, 'Index column size too large. The maximum column size is 767 bytes.')
when migrated using the setup above. This can be fixed by altering the database charset to utf-8
in most cases. To avoid this, create the database using the following command instead to specify the charset:
MariaDB [(none)]> CREATE DATABASE CREDENTIALUSER__DBNAME CHARACTER SET utf8;
Read-only replica host
We maintain two copies of the ToolsDB database, using a MariaDB primary-replica setup.
The read-only replica host can be accessed using the same credentials and the following hostname: tools-readonly.db.svc.wikimedia.cloud
Using the read-only replica host is recommended if you have to run queries that take a long time to complete, as in this way you will reduce the load on the primary host.
Please note that the replica host can sometimes lag behind the primary host, but we are doing our best to keep this lag at a minimum.
Backups
We don't do offline backups of any of the databases in ToolsDB. ToolsDB users can backup their data using mariadb-dump (included in the mariadb
image) if necessary:
:# use umask to make the dump private (use unless the database is public)
$ toolforge jobs run --command "umask o-r; ( mariadb-dump --defaults-file=~/replica.my.cnf --host=tools-readonly.db.svc.wikimedia.cloud credentialUser__DBName > ~/DBname-$(date -I).sql )" --image mariadb backup
Note that we don't recommend storing backups permanently on NFS (/data/project
, /home
, or /data/scratch
on Toolforge) or on any other Cloud VPS hosted drive. True backups should be kept offsite.
Caveats
The Toolsforge team tries to keep ToolsDB configurations as close to MariaDB defaults as possible. This can lead to surprising behaviors, such as:
- Transactions not rolled back on query timeouts, which can be common during high load on a shared database (see this issue)
If you encounter an issue, feel free to add it above.
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud connect or the bridged Telegram group
- Discuss via email after you have subscribed to the cloud@ mailing list
- Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
- Read the News wiki page
Use a subproject of the #Cloud-Services Phabricator project to track confirmed bug reports and feature requests about the Cloud Services infrastructure itself
Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)