Data Platform/Systems/System users

From Wikitech

The Analytics Cluster is more multi-tenant than any other system in Wikimedia production. Individual users often use it to do analysis and run jobs, but productionized jobs frequently need to run as a user that is not tied to a real person's account. To accomplish this, we create POSIX system users and groups, and then allow real users in a certain group to sudo as that system user.

There are a number of team-specific system users, plus a general analytics-privatedata user that can be used by anyone with private data access.

Example

The Search team wants to productionize jobs to run in Hadoop. Members of the Search team need to be able to schedule and maintain these jobs as a POSIX user that is not a real human user account. They have:

  • System user analytics-search: will run jobs and own files. This user's main group is also called analytics-search.
  • Group analytics-search-users: the individual accounts of Search team members belong to this group and are allowed to sudo as the analytics-search user.
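
Before trying to sudo, it can help to confirm that your account is in the right -users group. The following is only a sketch: the in_group helper is hypothetical (not an existing tool), and it assumes group membership is visible through the standard id command on the host.

```shell
# Hypothetical helper: succeed if ACCOUNT ($1) belongs to GROUP ($2).
# It compares whole group names, so "analytics-search" does not
# accidentally match "analytics-search-users".
in_group() {
  id -nG "$1" | tr ' ' '\n' | grep -qxF -- "$2"
}

if in_group "$(id -un)" analytics-search-users; then
  echo "member of analytics-search-users"
else
  echo "not a member of analytics-search-users"
fi
```

Membership only tells you that the sudo rule described above should apply to you; it does not by itself prove that the rule is deployed on a given host.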

Running a command as a system user

Basic commands can be run simply with sudo.

However, most commands we're interested in require Kerberos authentication. This relies on a Kerberos keytab: a file, readable only by its owner, that holds the secret used to authenticate to Kerberos in place of a typed password.

Every member of the system user's -users group (analytics-search-users in the example above) can sudo as the system user, which in turn can read its keytab and authenticate to Kerberos without a password. To hide this complexity, we provide the kerberos-run-command tool.

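To use it, run sudo -u {system user} kerberos-run-command {system user} {your command}. Note that the system user name appears twice: once for sudo, and once as the first argument of kerberos-run-command.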

NOTE: kerberos-run-command currently supports executables, but not scripts. The workaround is to have the system user kinit first by running any simple executable through kerberos-run-command, for example: sudo -u analytics kerberos-run-command analytics hdfs dfs -ls. After that, you can run commands as that user relying on the cached Kerberos credentials: sudo -u analytics spark2-sql ...
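
The two-step workaround for scripts can be sketched as a small wrapper. This is an illustration, not an existing tool: the run and run_script_as helpers, the DRY_RUN variable and the ./my_job.sh script are all hypothetical names. With DRY_RUN=1 the commands are only printed, so the sketch can be previewed on any machine.

```shell
DRY_RUN=1  # preview only; set to 0 on a host where you hold the sudo rights

# Hypothetical helper: run a command line, or just print it in dry-run mode.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical helper: run a script as a system user.
# Step 1 authenticates by running an executable through kerberos-run-command.
# Step 2 runs the script itself, relying on the cached credentials.
run_script_as() {
  user="$1"; shift
  run sudo -u "$user" kerberos-run-command "$user" hdfs dfs -ls / &&
    run sudo -u "$user" "$@"
}

run_script_as analytics ./my_job.sh
```

Passing the user name once to the wrapper avoids forgetting the second occurrence that kerberos-run-command expects.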

Detailed example

# No kinit done, hence no credentials for my user
elukey@stat1007:~$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_13926)

# As expected, a simple ls will fail
elukey@stat1007:~$ hdfs dfs -ls /
[..cut..]
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "stat1007.eqiad.wmnet/10.64.5.32"; destination host is: "analytics1029.eqiad.wmnet":8020;
[..cut..]

# First attempt to use the kerberos-run-command 
elukey@stat1007:~$ kerberos-run-command analytics-privatedata hdfs dfs -ls /
The user keytab that you are trying to use (/etc/security/keytabs/analytics/analytics-privatedata.keytab) doesn't exist or it isn't readable from your user, aborting...

# The problem is that the keytab is not readable by all
# analytics-privatedata-users members directly, but they
# have to sudo first:
elukey@stat1007:~$ sudo -u analytics-privatedata kerberos-run-command analytics-privatedata hdfs dfs -ls /
Found 5 items
drwxr-xr-x   - hdfs hadoop          0 2019-06-20 06:00 /system
drwxrwxrwt   - hdfs hdfs            0 2019-11-14 15:03 /tmp
drwxr-xr-x   - hdfs hadoop          0 2019-10-25 16:24 /user
drwxr-xr-x   - hdfs hdfs            0 2019-01-17 13:42 /var
drwxr-xr-x   - hdfs hadoop          0 2019-06-25 13:46 /wmf

# It worked! Interesting question: does my user have credentials now? Let's check...
elukey@stat1007:~$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_13926)

# Why not? Because only the analytics-privatedata user has them:
elukey@stat1007:~$ sudo -u analytics-privatedata klist
Ticket cache: FILE:/tmp/krb5cc_498
Default principal: analytics-privatedata/stat1007.eqiad.wmnet@WIKIMEDIA

Valid starting       Expires              Service principal
11/14/2019 15:03:33  11/15/2019 01:03:33  krbtgt/WIKIMEDIA@WIKIMEDIA
	renew until 11/15/2019 15:03:33

# This may seem confusing at first, but it makes sense, since we had to sudo
# to be able to read the keytab.
# Corollary: the analytics-privatedata user is not a replacement for your
# kerberos authentication, only a convenient way to run recurrent jobs via
# cron or similar.
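
The difference between your own ticket cache and the system user's can also be checked without reading klist output. The sketch below assumes MIT Kerberos klist, whose -s flag prints nothing and sets the exit status according to whether a valid cache exists; the has_ticket helper is a hypothetical name.

```shell
# Hypothetical helper: succeed if a valid Kerberos ticket cache exists.
# With no argument it checks your own cache; with a system user name
# it checks that user's cache via sudo.
has_ticket() {
  if [ $# -eq 0 ]; then
    klist -s
  else
    sudo -u "$1" klist -s
  fi
}

# Usage (on a host where you can sudo as the system user):
#   has_ticket && echo "my user has a ticket"
#   has_ticket analytics-privatedata && echo "analytics-privatedata has a ticket"
```

After the session shown above, the first check would be expected to fail and the second to succeed.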

Administration

[WIP] Creating a new system user

Let's walk through creating a system user and associated groups for a hypothetical "sandwich engineering team".

Edit modules/admin/data/data.yaml to do the following:

  • Declare the analytics-sandwich user and group
  • Declare the analytics-sandwich-users group and its members, including analytics-sandwich in its system_members.
  • Add the analytics-sandwich user to the system_members of analytics-privatedata-users so it can access Hadoop.
groups:
  analytics-privatedata-users:
    # ...
    system_members: [..., analytics-sandwich]
  # ...
  analytics-sandwich:
    gid: 920 # pick the next gid in the list
    system: true
    members: []
  analytics-sandwich-users:
    gid: 921 # next gid
    description: Group of users for managing sandwich engineering related analytics jobs
    members: [userA, userB, ...etc]
    privileges: ['ALL = (analytics-sandwich) NOPASSWD: ALL']
    system_members: [analytics-sandwich]
# ...
users:
  # ...
  analytics-sandwich:
    ensure: present
    system: true
    uid: 920 # pick the next uid in the list
    gid: 920 # the gid of the analytics-sandwich group declared above
    shell: '/bin/false'
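
Picking "the next" uid or gid by eye is error-prone, so a small helper can find the first unused value. This is only a sketch: next_free_id is a hypothetical function, it assumes ids are written as "uid: N" or "gid: N" on their own lines as in the snippet above, and the starting value you pass (900 in the usage comment) is a guess based on the 920 used in this example rather than a documented range.

```shell
# Hypothetical helper: print the lowest id >= START ($3) that is not
# already used for KEY ($1, "uid" or "gid") in FILE ($2).
next_free_id() {
  key="$1"; file="$2"; candidate="$3"
  # Collect every number that follows "KEY:" at the start of a line.
  used=$(sed -n "s/^[[:space:]]*$key:[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$file")
  while printf '%s\n' "$used" | grep -qx -- "$candidate"; do
    candidate=$((candidate + 1))
  done
  echo "$candidate"
}

# Usage, from the root of the repository:
#   next_free_id gid modules/admin/data/data.yaml 900
#   next_free_id uid modules/admin/data/data.yaml 900
```

Still check the result against the conventions documented in data.yaml itself before using it.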


More to do: Hiera, Kerberos keytabs, etc.