gLExec Operating System Interoperability

Batch systems in the presence of gLExec

gLExec is designed to be neutral to its OS environment. In particular, gLExec does not break the process tree, and it accumulates the CPU and system usage times of the child processes it spawns. This matters most in the gLExec-on-WN (worker node) scenario, where the whole process tree (pilot job and target user processes) should be managed as a unit by the node-local batch system daemon.
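To illustrate what accumulating usage times from child processes means at the OS level, here is a minimal Python sketch. It is not gLExec code, and the function names are hypothetical. It assumes a POSIX system: a parent forks a CPU-bound child that stays in the same process tree, waits for it, and then sees the child's CPU time in its own children counters, which is the kind of figure a batch daemon further up the tree can collect in turn.

```python
import os
import resource
import time


def burn_cpu(seconds):
    """Spin until this process has consumed roughly `seconds` of CPU time."""
    start = time.process_time()
    while time.process_time() - start < seconds:
        pass


def child_cpu_after_wait(seconds=0.2):
    """Fork a CPU-bound child, reap it, and return the CPU time (user + system)
    newly charged to the parent's RUSAGE_CHILDREN counters."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    pid = os.fork()
    if pid == 0:
        # Child: remains in the parent's process tree (no setsid, no double fork).
        burn_cpu(seconds)
        os._exit(0)
    # Reaping the child is what folds its usage into the parent's counters.
    os.waitpid(pid, 0)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (after.ru_utime + after.ru_stime) - (before.ru_utime + before.ru_stime)


if __name__ == "__main__":
    print("CPU seconds accounted to children:", child_cpu_after_wait())
```

If the child were detached from the tree or never waited for, its usage would not show up in the parent's counters, and the batch system's accounting for the job would be incomplete.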

We have verified that, on the Torque batch system, forking a process that runs under a different uid does not prevent pbs_mom from killing all of the child processes. We tested this with Torque version 2.1.6; to verify this with your own batch system you can use the steps below.
The (simple) test program performs exactly the same uid change as gLExec, but does not require you to install any grid software at your site. It is completely stand-alone, so you can use it to test how your batch system reacts to the uid change.
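For readers who want to see the shape of such a test, the following Python sketch shows a fork followed by a uid change in the child. It is not the stand-alone program referred to above and not gLExec itself; the function name run_as and the choice of "sleep 600" as the payload are assumptions made for illustration. Switching to a different uid requires running it as root.

```python
import os
import sys


def run_as(uid, gid, argv):
    """Fork, switch the child to uid/gid, exec argv, and return the child's
    exit status (or the negated signal number if it was killed)."""
    pid = os.fork()
    if pid == 0:
        try:
            if os.geteuid() == 0:
                # Only root may reset the supplementary group list.
                os.setgroups([gid])
            # Change the group first: after setuid the process may no longer
            # be allowed to change its gid.
            os.setgid(gid)
            os.setuid(uid)
            os.execvp(argv[0], argv)
        except OSError as exc:
            print("uid switch or exec failed: %s" % exc, file=sys.stderr)
        os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    return os.WEXITSTATUS(status)


# Hypothetical command line: <script> TARGET_UID TARGET_GID
if __name__ == "__main__" and len(sys.argv) == 3:
    # A long-running payload gives you time to delete the job while it runs.
    sys.exit(run_as(int(sys.argv[1]), int(sys.argv[2]), ["sleep", "600"]))
```

One way to use a sketch like this is to run it inside a batch job, delete the job through the batch system while the payload is still sleeping, and then check the worker node for leftover processes owned by the target uid.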

If you notice any anomalies after testing, for example a job that will not die, please notify the developers at grid dash mw dash security at nikhef dot nl. If you run this on Torque 2.1.6 and notice any issues, please repeat your tests...