Data Platform/Systems/System users
The Analytics Cluster is more multi-tenant than any other system in Wikimedia production. Individual users often use it to do analysis and run jobs, but productionized jobs frequently need to run as a user that is not tied to a real person's account. To accomplish this, we create POSIX system users and groups, and then allow real users in a certain group to sudo as that system user.
There are a number of team-specific system users, plus a general analytics-privatedata user that can be used by any user with private data access.
Example
The Search team wants to productionize jobs to run in Hadoop. Members of the Search team need to be able to schedule and maintain these jobs as a POSIX user that is not a real human user account. They have:
- System user analytics-search: will run jobs and own files. This user's main group is also called analytics-search.
- Group analytics-search-users: the individual accounts of Search team members belong to this group and are allowed to sudo as the analytics-search user.
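Whether a given account is in the group that grants these sudo rights can be checked from its group list. The following is a minimal sketch that relies only on the standard id utility; the function name in_group is hypothetical and is not part of any existing tooling.

```shell
#!/bin/bash
# in_group USER GROUP: succeed if USER belongs to GROUP.
# Hypothetical helper for illustration; it only uses the standard `id` utility.
in_group() {
    local user="$1" group="$2"
    # `id -nG` prints the names of all groups of the user, separated by spaces.
    # Put one name per line and look for an exact, whole-line match.
    id -nG "$user" | tr ' ' '\n' | grep -qxF -- "$group"
}

# Example: may the current account sudo as analytics-search?
if in_group "$(id -un)" analytics-search-users; then
    echo "member of analytics-search-users"
else
    echo "not a member of analytics-search-users"
fi
```

If the check fails for an account that should have access, the group membership has probably not been deployed to that host yet.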
Running a command as a system user
Basic commands can be run simply with sudo.
However, most commands we're interested in require Kerberos authentication. This requires a Kerberos keytab: a file whose permissions allow only its owner to read it, and which holds the secret used to authenticate to Kerberos.
Every member of the group allowed to sudo as the system user (for example analytics-search-users) can run commands as that system user, which in turn can read its keytab and authenticate to Kerberos without a password. To hide this complexity, we have the kerberos-run-command tool.
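To use it, run sudo -u {system user} kerberos-run-command {system user} {your command}. The system user appears twice: once for sudo, and once to tell kerberos-run-command which keytab to use.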
NOTE: currently kerberos-run-command supports executables, but not scripts. The workaround is to make the user you're trying to sudo as obtain a Kerberos ticket (kinit) via a simple kerberos-run-command call, for example:
sudo -u analytics kerberos-run-command analytics hdfs dfs -ls
After that, you can run commands as that user, relying on the cached Kerberos credentials:
sudo -u analytics spark2-sql ...
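The two-step workaround from the note above can be wrapped in a small shell function. This is only a sketch: run_script_as and the DRY_RUN switch are hypothetical names, not part of kerberos-run-command, and /path/to/my_job.sh is a placeholder.

```shell
#!/bin/bash
# run_script_as SYSTEM_USER COMMAND [ARGS...]
# Hypothetical helper: primes the system user's Kerberos credential cache with a
# simple kerberos-run-command call, then runs COMMAND using the cached ticket.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_script_as() {
    local sys_user="$1"
    shift
    local run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"

    # Step 1: any simple Kerberos-authenticated command populates the cache.
    $run sudo -u "$sys_user" kerberos-run-command "$sys_user" hdfs dfs -ls / || return 1
    # Step 2: run the real command as the same user, relying on the cached ticket.
    $run sudo -u "$sys_user" "$@"
}

# Preview the commands without running anything.
DRY_RUN=1 run_script_as analytics /path/to/my_job.sh --date 2019-11-14
```

Running with DRY_RUN=1 first is a cheap way to confirm which user and which command will actually be used before touching the cluster.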
Detailed example
# No kinit done, hence no credentials for my user
elukey@stat1007:~$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_13926)
# As expected, a simple ls will fail
elukey@stat1007:~$ hdfs dfs -ls /
[..cut..]
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "stat1007.eqiad.wmnet/10.64.5.32"; destination host is: "analytics1029.eqiad.wmnet":8020;
[..cut..]
# First attempt to use the kerberos-run-command
elukey@stat1007:~$ kerberos-run-command analytics-privatedata hdfs dfs -ls /
The user keytab that you are trying to use (/etc/security/keytabs/analytics/analytics-privatedata.keytab) doesn't exist or it isn't readable from your user, aborting...
# The problem is that the keytab is not readable by all
# analytics-privatedata-users members directly, but they
# have to sudo first:
elukey@stat1007:~$ sudo -u analytics-privatedata kerberos-run-command analytics-privatedata hdfs dfs -ls /
Found 5 items
drwxr-xr-x - hdfs hadoop 0 2019-06-20 06:00 /system
drwxrwxrwt - hdfs hdfs 0 2019-11-14 15:03 /tmp
drwxr-xr-x - hdfs hadoop 0 2019-10-25 16:24 /user
drwxr-xr-x - hdfs hdfs 0 2019-01-17 13:42 /var
drwxr-xr-x - hdfs hadoop 0 2019-06-25 13:46 /wmf
# It worked! Interesting question: does my user have credentials now? Let's check...
elukey@stat1007:~$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_13926)
# Why not? Because only the analytics-privatedata user has them:
elukey@stat1007:~$ sudo -u analytics-privatedata klist
Ticket cache: FILE:/tmp/krb5cc_498
Default principal: analytics-privatedata/stat1007.eqiad.wmnet@WIKIMEDIA
Valid starting       Expires              Service principal
11/14/2019 15:03:33  11/15/2019 01:03:33  krbtgt/WIKIMEDIA@WIKIMEDIA
        renew until 11/15/2019 15:03:33
# This may seem confusing at first, but it makes sense, since we had to sudo
# to be able to read the keytab.
# Corollary: the analytics-privatedata user is not a replacement for your
# kerberos authentication, only a convenient way to run recurrent jobs via
# cron or similar.
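The session above shows that each account has its own credential cache. A quick way to compare them is klist -s, which prints nothing and reports through its exit status whether a valid cache exists. The sketch below is illustrative only: ticket_status is a hypothetical helper name, and sudo -n is used so that the check fails instead of prompting for a password.

```shell
#!/bin/bash
# ticket_status [COMMAND PREFIX...]
# Hypothetical helper: prints "valid" if `klist -s` succeeds when run behind the
# optional command prefix, and "missing" otherwise.
ticket_status() {
    if "$@" klist -s 2>/dev/null; then
        echo "valid"
    else
        echo "missing"
    fi
}

# Your own credential cache:
echo "my user: $(ticket_status)"
# The system user's credential cache (-n: never prompt for a password):
echo "analytics-privatedata: $(ticket_status sudo -n -u analytics-privatedata)"
```

Note that "missing" is also reported when klist or sudo itself fails, so the helper answers "can this account use a ticket right now", not why it cannot.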
Administration
[WIP] Creating a new system user
Let's walk through creating a system user and associated groups for a hypothetical "sandwich engineering team".
Edit modules/admin/data/data.yaml to do the following:
- Declare the analytics-sandwich user and group.
- Declare the analytics-sandwich-users group and its members, including analytics-sandwich in its system_members.
- Add the analytics-sandwich user to the system_members of analytics-privatedata-users so it can access Hadoop.
groups:
  analytics-privatedata-users:
    # ...
    system_members: [..., analytics-sandwich]
  # ...
  analytics-sandwich:
    gid: 920 # pick the next gid in the list
    system: true
    members: []
  analytics-sandwich-users:
    gid: 921 # next gid
    description: Group of users for managing sandwich engineering related analytics jobs
    members: [userA, userB, ...etc]
    privileges: ['ALL = (analytics-sandwich) NOPASSWD: ALL']
    system_members: [analytics-sandwich]
  # ...
users:
  # ...
  analytics-sandwich:
    ensure: present
    system: true
    uid: 920 # pick the next uid in the list
    gid: 920 # gid of the analytics-sandwich group declared above
    shell: '/bin/false'
More to do: Hiera, Kerberos keytabs, etc.
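Once the data.yaml change has been applied to a host, standard tools can confirm that the account and its groups exist there. The following is a minimal sketch using getent; account_exists is a hypothetical helper name, and it only checks that the names resolve, not the sudo privileges or the keytab.

```shell
#!/bin/bash
# account_exists USER [GROUP...]
# Hypothetical helper: succeed only if USER and every listed GROUP are known to
# this host. On failure, report the first missing name on stderr.
account_exists() {
    local user="$1"
    shift
    if ! getent passwd "$user" > /dev/null; then
        echo "missing user: $user" >&2
        return 1
    fi
    local group
    for group in "$@"; do
        if ! getent group "$group" > /dev/null; then
            echo "missing group: $group" >&2
            return 1
        fi
    done
    return 0
}

# Check the hypothetical sandwich engineering account and both of its groups.
if account_exists analytics-sandwich analytics-sandwich analytics-sandwich-users; then
    echo "analytics-sandwich is set up on this host"
else
    echo "analytics-sandwich is not (fully) set up on this host"
fi
```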