Torque and Auks
This page describes how to get torque and auks working together so that a batch job run by torque has access to a kerberos ticket of the user who submitted it.
Overview
The idea is to run an auks server that provides kerberos tickets to batch nodes when a job starts to run. For security the batch nodes do not access the auks server directly. If they did, a compromise of a single batch node would compromise every kerberos ticket, which, depending on what the tickets are used for, could well compromise the entire site and file system. Instead the batch headnode pulls the kerberos tickets from auks and pushes them to the batch nodes just before a job starts. For the same reason the headnode should only run the torque and maui servers and should not run any batch jobs.
Install Torque and Maui
To push the kerberos ticket from the batch headnode to the batch worker node, the torque server requires a patch. The patch adds the ability to run a script on the torque server before a job is sent to a batch node. Maui can be installed without any changes.
The server_prologue.patch for torque was written for torque 2.5.9. It works, but it is not production quality and it has not been tested with other versions of torque.
I also use a second patch, acl_fix.patch, which fixes some compilation issues. To make the torque rpms I use this torque.spec file. The acl patch comes from searching the web; the spec file is included in the torque tarball.
It is probably best to install torque and maui without any patches first and get test batch jobs to run. After that, install the patched version of torque and create the file /var/spool/torque/server_priv/prologue (this path is hard coded in the patch; if you want to install torque somewhere else, change the patch). The script is run just before a job is sent to a batch node, with the job id as its first argument.
To test the server prologue script, try the following for /var/spool/torque/server_priv/prologue:
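#!/bin/bash
# ${1} is the job id; ${2} is taken to be the submitting user's name.
job_id=${1}
user_id=$(id -u "${2}")
# Write "<uid> <job id>" to a per-job file on the torque server.
echo "${user_id} ${job_id}" > "/tmp/test_${user_id}_${job_id}"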
If everything is working, this will create a file on the torque server for each batch job run, containing the submitting user's uid and the batch job id.
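Before relying on the prologue in production it can help to make the test script fail loudly when it is called with unexpected arguments. The following is only a sketch: it assumes, as the test script above does, that the first argument is the job id and the second is the submitting user's name. The function name write_job_marker and the optional output directory argument are hypothetical and exist only to make the script easy to try by hand.

```shell
#!/bin/bash
# Sketch of a more defensive test prologue.
# Assumption: $1 = job id, $2 = submitting user's name.
# write_job_marker and its third argument are hypothetical helpers.

write_job_marker() {
    local job_id="$1"
    local user_name="$2"
    local out_dir="${3:-/tmp}"
    local user_id

    if [ -z "$job_id" ] || [ -z "$user_name" ]; then
        echo "usage: prologue <job_id> <user_name>" >&2
        return 1
    fi

    # id prints its own error message if the user does not exist.
    user_id=$(id -u "$user_name") || return 1

    echo "${user_id} ${job_id}" > "${out_dir}/test_${user_id}_${job_id}"
}

# Only act when called with arguments, as the torque server would.
if [ "$#" -gt 0 ]; then
    write_job_marker "$@"
fi
```

Running the script by hand with a made-up job id and your own user name should leave a file in /tmp whose name and contents match what a real batch job would produce.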
Install AUKS
AUKS needs to be installed on the kerberos KDC so that kerberos tickets can be extracted by a cron job. For auks I use a patch, aukspriv_init.patch, to change the order of the aukspriv startup and shutdown. To make the auks rpms I use this auks.spec file, which is adapted from the spec file that comes with the auks tarball.
After installing auks, the KDC and the batch headnode will each require a kerberos host principal, usually of the form host/machine.name@KERBEROS.REALM. Then export each principal into the /etc/krb5.keytab file on the corresponding host.
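As a rough illustration of the principal and keytab step, the sketch below prints (but does not run) the commands that would typically be used. It assumes an MIT Kerberos KDC with the standard kadmin.local, kadmin and klist tools. The admin principal admin/admin and the helper names host_principal and print_host_keytab_steps are hypothetical; adapt them to the site.

```shell
#!/bin/bash
# Dry-run sketch: print the commands for creating a host principal and
# exporting it to a keytab. Assumes MIT Kerberos tooling.
# host_principal and print_host_keytab_steps are hypothetical helpers,
# and admin/admin is a placeholder admin principal.

host_principal() {
    # $1 = fully qualified host name, $2 = kerberos realm
    printf 'host/%s@%s\n' "$1" "$2"
}

print_host_keytab_steps() {
    local princ
    princ=$(host_principal "$1" "$2")
    # On the KDC: create the principal with a random key.
    echo "kadmin.local -q \"addprinc -randkey ${princ}\""
    # On the host itself: write the key into the local keytab.
    echo "kadmin -p admin/admin -q \"ktadd -k /etc/krb5.keytab ${princ}\""
    # On the host: list the keytab entries to confirm.
    echo "klist -k /etc/krb5.keytab"
}

print_host_keytab_steps machine.name KERBEROS.REALM
```

Note that ktadd generates a new random key for the principal, so it should be run once per host rather than repeated on several machines for the same principal.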
aukspriv
On the KDC, first configure the aukspriv daemon. For the rpm installation the configuration is held in the /etc/sysconfig/aukspriv file.
The default setup for /etc/sysconfig/aukspriv is mostly fine; however, setting AUKS_PRIV_CCACHE_APPEND helps stop tickets being accidentally overwritten. I use:
# The user to get credentials for. The default value is root.
# export AUKS_PRIV_USER="root"
# The keytab file to use. The default value is /etc/krb5.keytab
# export AUKS_PRIV_KEYTAB="/etc/krb5.keytab"
# The krb5 principal name to use. The default value is the first principal found in the keytab.
# export AUKS_PRIV_PRINC=
# The string to append to ccache pattern /tmp/krb5cc_%uid. The default value is no string to append.
export AUKS_PRIV_CCACHE_APPEND=".auks"
# The credential lifetime in seconds. The default value is 36000 (10h)
export AUKS_PRIV_LIFETIME="36000"
# The time to wait between two build stages. The default value is 35000 (seconds).
export AUKS_PRIV_RENEW_INT="18000"
# The syslog facility to use for log information. The default value is local3.notice . A none value means logs on stdout.
# export AUKS_PRIV_SYSLOG_PRIO="local3.notice"
This adds ".auks" to the file name of the credentials cache that auks uses and changes the credential renewal interval to 5 hours.
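The two timing values are given in seconds, so it is easy to mistype one. As a small sanity check, the sketch below sources a sysconfig file in a subshell, prints both values in hours, and fails if the renewal interval is not shorter than the credential lifetime. The function name check_aukspriv_intervals is hypothetical and not part of auks; the fallback values are the defaults quoted in the comments of the file above.

```shell
#!/bin/bash
# Sketch: sanity-check the timing values in an aukspriv sysconfig file.
# check_aukspriv_intervals is a hypothetical helper, not part of auks.

check_aukspriv_intervals() {
    local conf="${1:-/etc/sysconfig/aukspriv}"
    (
        # Defaults as quoted in the comments of the sysconfig file.
        AUKS_PRIV_LIFETIME=36000
        AUKS_PRIV_RENEW_INT=35000

        [ -r "$conf" ] || { echo "cannot read $conf" >&2; exit 2; }
        # Sourced inside a subshell so the caller's environment is untouched.
        . "$conf"

        echo "lifetime: $((AUKS_PRIV_LIFETIME / 3600))h, renew every: $((AUKS_PRIV_RENEW_INT / 3600))h"

        # Succeed only if renewal happens before the credential expires.
        [ "$AUKS_PRIV_RENEW_INT" -lt "$AUKS_PRIV_LIFETIME" ]
    )
}
```

Calling check_aukspriv_intervals /etc/sysconfig/aukspriv with the settings above should report a lifetime of 10h and a renewal every 5h.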
If things are set up correctly, starting the aukspriv daemon on the KDC server will create a file /tmp/krb5cc_0.auks containing a kerberos ticket for the KDC server's machine principal found in /etc/krb5.keytab. This ticket will then be maintained by the aukspriv daemon.
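One way to confirm that the daemon has produced a usable ticket is to inspect the credentials cache with klist. The sketch below assumes the standard klist tool is installed; check_auks_ccache is a hypothetical helper name. It first checks that the file exists, then uses klist -s, which prints nothing and exits non-zero if the cache cannot be read or has expired.

```shell
#!/bin/bash
# Sketch: check that the aukspriv credentials cache exists and is valid.
# check_auks_ccache is a hypothetical helper; klist is the standard
# kerberos tool for listing credentials caches.

check_auks_ccache() {
    local ccache="${1:-/tmp/krb5cc_0.auks}"

    if [ ! -f "$ccache" ]; then
        echo "missing credentials cache: $ccache" >&2
        return 1
    fi

    # -s: silent, exit status only; -c: treat the name as a credentials cache.
    if ! klist -s -c "$ccache"; then
        echo "credentials cache unreadable or expired: $ccache" >&2
        return 2
    fi

    # Show the ticket details for a human to inspect.
    klist -c "$ccache"
}
```

Run as root on the KDC, check_auks_ccache with no argument should list a ticket for the host principal taken from /etc/krb5.keytab.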
--
AndrewPickford - 2013-06-05