gLExec Operating System Interoperability
Batch systems in the presence of gLExec
gLExec tries hard to be neutral to its OS environment. In
particular, gLExec does not break the process tree, and it accumulates
the CPU and system usage times of the child processes it spawns.
We recognise that this is particularly important in the gLExec-on-WN
scenario, where the entire process tree (pilot job and target user
processes) should be managed as a whole by the node-local batch system.
We have verified that, on the Torque batch system, forking a
process with a different uid does not impair the ability of pbs_mom
to kill any and all of the child processes. We tested this
with Torque version 2.1.6; to verify this with your own batch system you
can use the steps below.
The (simple) sUTest program performs exactly the same uid change that
gLExec does, but does not require you to install anything grid-like on
your site. It is a completely stand-alone program that does a uid change,
so you can test how your batch system reacts.
If you notice any anomalies after testing, e.g. the job does not die,
please notify the developers at grid dash mw dash security at nikhef dot nl.
If you run this on Torque 2.1.6 and notice any issues, please repeat your
- Download the code for the sUTest program, sutest.c, to your
system. We deliberately do not provide
a pre-compiled binary, as you first need to set three pre-defined
constants in the source that are site-specific:
#define UNOBODY 99
#define GNOBODY 99
#define SRCUID 502
These specify the uid and gid numbers to switch to (UNOBODY and GNOBODY),
as well as the numeric uid (SRCUID) of the user account you will use for
testing (i.e. the account that will do the batch system qsub). This must
be a trusted uid, as that user will effectively have super-user privileges
at any time.
- Compile this program:
cc -o sutest sutest.c
- Copy this program to a worker node, and make it setuid root:
cp sutest /usr/local/bin/
chown root:root /usr/local/bin/sutest
chmod u+s /usr/local/bin/sutest
(Your batch job must later run on this worker node; please refer
to your batch system manual for details.)
- Submit a batch job that takes a while:
echo "/usr/local/bin/sutest sleep 600" | qsub
(Make sure your batch job goes to the worker node with the setuid
sutest program; please refer to your batch system manual for details.)
Check on the worker node whether the sleep process is running, and under
which account (usually the nobody user).
- As soon as the job starts, kill it (in PBS/Torque that would be a "qdel"),
and look on the worker node to see if the sleep process goes away. It
- Remove the setuid bit from the sutest executable on the WN:
chmod 0755 /usr/local/bin/sutest
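
The checks on the worker node in the steps above can also be scripted. The following is only a minimal sketch in POSIX shell, not part of gLExec or sUTest: the function names list_procs, wait_gone and is_setuid_root are hypothetical helpers introduced here for illustration, and the sketch assumes a ps that accepts the "-eo user=,pid=,args=" output format (as procps on Linux does).

```sh
#!/bin/sh
# Hypothetical helpers for the worker-node checks (illustration only).

# Print "owner pid command" for every process whose command line
# starts with the string given in $1, e.g. "sleep 600".
list_procs() {
    ps -eo user=,pid=,args= | awk -v pat="$1" '
        {
            cmd = $0
            # Strip the owner and pid columns, leaving the command line.
            sub(/^[ \t]*[^ \t]+[ \t]+[0-9]+[ \t]+/, "", cmd)
        }
        index(cmd, pat) == 1 { print $1, $2, cmd }'
}

# Poll once per second until no process matches $1.
# Gives up after $2 seconds (default 60).
# Returns 0 if the processes went away, 1 if any survived.
wait_gone() {
    pattern=$1
    remaining=${2:-60}
    while [ -n "$(list_procs "$pattern")" ]; do
        [ "$remaining" -le 0 ] && return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Succeed only if the file in $1 is owned by root and has the setuid bit set.
is_setuid_root() {
    [ -n "$(find "$1" -prune -user root -perm -4000 2>/dev/null)" ]
}
```

With these definitions loaded on the worker node, list_procs "sleep 600" shows which account the sleep process runs under once the job has started, wait_gone "sleep 600" 60 reports through its exit status whether the process disappeared within a minute of the qdel, and is_setuid_root /usr/local/bin/sutest should succeed before the test and fail again after the final chmod.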