Generic Components of the eScience Infrastructure Ecosystem — 14th IEEE eScience Conference Amsterdam, Monday 2018-10-29
Large-scale common science infrastructure for high-throughput batch computing.
CVMFS is great for large organisations. But for small teams it can be a real challenge:
I imagined dozens of small e-science groups knocking on my door to get their repositories mounted.
Nikhef and SURFSara have jointly set up /cvmfs/softdrive.nl to offer a single CVMFS repository for all e-science users in the Netherlands.
The system consists of
/cvmfs/softdrive.nl/zwamborn/
/cvmfs/softdrive.nl/ceitan/
/cvmfs/softdrive.nl/phop/
/cvmfs/softdrive.nl/svdaele/
/cvmfs/softdrive.nl/tseker/
/cvmfs/softdrive.nl/ajones/
/cvmfs/softdrive.nl/rbyrne/
/cvmfs/softdrive.nl/fsweijen/
/cvmfs/softdrive.nl/kooyman/
The catalog size exploded once monitoring was put in place: the monitoring triggered an update every five minutes, and each update produced a completely new, full catalog of all files.
The cause was eventually understood and remedied by giving each user a separate subcatalog, so that an update no longer rewrites a single catalog covering every file.
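The per-user subcatalog fix can be sketched as follows. CVMFS starts a nested catalog at any directory that contains a file named .cvmfscatalog, so one way to get a catalog per user is to place that marker in every top-level user directory before publishing. This is a minimal sketch under that assumption, not the actual softdrive tooling; the function name and the idea of running it as a script are hypothetical.

```python
from pathlib import Path


def mark_user_catalogs(repo_root):
    """Create an empty .cvmfscatalog marker in each top-level user directory.

    Hypothetical helper: returns the names of the directories that were
    newly marked, so a second run on the same tree returns an empty list.
    """
    root = Path(repo_root)
    newly_marked = []
    for entry in sorted(root.iterdir()):
        # Only real directories get a catalog; skip files and hidden entries.
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        marker = entry / ".cvmfscatalog"
        if not marker.exists():
            marker.touch()
            newly_marked.append(entry.name)
    return newly_marked
```

Run against the writable view of the repository inside a publish transaction, this would leave each user's tree with its own catalog. A .cvmfsdirtab file at the repository root is an alternative way to have the markers generated automatically.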
To complement the technical implementation, the overall user experience was addressed through proper documentation, monitoring and guidance.
The user documentation is right there when logging on to the system. The message of the day, printed for login shells, gives a summary of the workings of the system and how to publish data.
More extensive documentation was written and placed online.
End-to-end monitoring of the system is done by automatically triggering a change to the system every hour and measuring the time it takes for the data to reach a client machine. Alerts are raised if the delay reaches a certain threshold, prompting the technicians to investigate what went wrong.
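The end-to-end check described above could look roughly like the sketch below: publish a unique token, poll the client side until the token becomes visible, and compare the elapsed time with a threshold. The callables publish and read are placeholders for whatever writes to the repository and reads from the client mount; they, the function names, and the default polling interval are assumptions for illustration, not part of the softdrive system.

```python
import time
import uuid


def measure_propagation(publish, read, timeout_s, poll_s=5.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Return the seconds until a freshly published token is visible, or None.

    publish(token) is assumed to write the token on the server side;
    read() is assumed to return the token currently visible on the client.
    clock and sleep are injectable so the logic can be tested without waiting.
    """
    token = uuid.uuid4().hex
    start = clock()
    publish(token)
    while True:
        if read() == token:
            return clock() - start
        if clock() - start >= timeout_s:
            return None  # the change never arrived within the timeout
        sleep(poll_s)


def needs_alert(delay, threshold_s):
    """Alert when the change never arrived or the delay reached the threshold."""
    return delay is None or delay >= threshold_s
```

A scheduler such as cron could call measure_propagation once per hour and notify the administrators whenever needs_alert returns True.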
The softdrive model has proven to be successful: it is easy for users to maintain their own software, the software is lightweight, and the maintenance burden on the administrators is very light.
There is no plan at this point to add more bells and whistles to the system.
Even as the PaaS (platform as a service) infrastructure dwindles in favour of IaaS (infrastructure as a service), the CVMFS system could still be a viable component for delivering software.
Some other national grid infrastructures offer something similar to softdrive, but I've not heard of anyone interested in cloning our setup. If you have plans to provide CVMFS to your users, and would perhaps like to use (parts of) the softdrive system, don't hesitate to contact me.