TL;DR : It’s Friday !
It’s been a rough week - what with #bashpocalypse aka #shellshock - but a productive one. Here’s a brief update of how things are going on the Ops side, and what’s in Dev at the moment. Last week we wrote about how we’re leveraging #DevOps to build reproducible configuration of sites in order to improve sustainability of the infrastructure as a whole. This week, we’ll talk a but about how we’re doing on that front, and what else we’re doing with the magic.
Gridops - still getting better
We provided a situation report a few days ago, detailing the state of the services at the sites. While there are many sites registered in the GOCDB, only 3 are actually configured “well” enough to be able to pass all the operations tests and publish an ‘Availability/Reliability’ (A/R). Since then things have gotten a little better; MA-01-CNRST has fixed a configuration problem with their site which prevented them from finding message servers, and are now passing all tests, publishing nonzero A/R, ZA-WITS-CORE is following up on a configuration problem regarding the SE and MPI configuration. ZA-UCT-ICTS is dependent on ZA-CHPC for the SE; but the CHPC is still undergoing an extensive downtime due to reconfiguration of the grid services.
We’ve made some headway with Perun which is the identity consolidation mechanism for the infrastructure - adding a new user interface and a testing instance of the VOMS. Perun does some pretty magic stuff - by signing up at the registration page, a user can join one or more VO’s (which are not strictly-speaking grid VOs, but can be other groups at e.g. the site level) and consolidate their various identities from certificates, external identity providers, etc. These identities are then propagated to facilities which support the group or VO, all managed centrally. So, if you want a new user account on a user interface, just sign up, and your identity will automatically get pushed to the various user interfaces from which you can submit jobs. Currently these are the default and dev ui’s at Meraka and we’re working on getting the CHPC’s connected as well.
We previously discussed work we started on an executable configuration of LDAP and Shibboleth Identity Providers. The work is done mostly in the context of ei4Africa, which has a final conference coming up, but it’s also a way to better undertand what the services do and how to maintain them (more on that later). The LDAP role in particular has seen a lot of development and interest.
We’ve got the LDAP role done and committed - work is now ongoing in that role to tweak it in order to be more idempotent. One of the issues we had to face was that there is no ldap module in LDAP, so we need to make lower-level changes, which often trigger ‘changed’ statuses. @fmarco76 is working on addressing these issues. The role works best on virgin machines, and only a few variables need to be defined at the start. The IdP role on the other hand is pretty much done - we’re working on making it somewhat OS-independent, in order to support Debian/Ubuntu as well as CentOS6. If you want to stay up to date with what’s going on, just watch the repo !
We announced some campaigns during July and it’s time to close them and move on. The accounting campaign has been a good success, apart from the regional APEL server unfortunately dying and missing much of August; sites have re-run their parsers and publishers and (strangeness at UJ aside) are publishing regularly. We gained a lot of insighed during the campaign on how the APEL regional instance actually works, so thanks a lot to Uli and Bouchra for making it a success.
The ARGUS campaign is also done with a success - although it was quite simple. We now have a central user banning instance at
argus.c4.csir.co.za, which sites can subscribe to.
The only remaining campaign is CVMFS, which in all honesty we just haven’t had time to finish. It stays open and we’ll be dedicating some time to it this month.
If you have suggestions for a new campaign, please let us know at
That’s all, folks !
Right, off to do the weekend ! TGIF everyone !
Tagged with AAROC • operations • security • updates • blog