Göttingen 2012 – scientific programme
T: Fachverband Teilchenphysik
T 76: Computing 1
T 76.5: Talk
Wednesday, February 29, 2012, 17:45–18:00, VG 0.111
Validation of ATLAS distributed analysis resources using HammerCloud — •Federica Legger, Philippe Calfayan, Guenter Duckeck, Johannes Ebke, Johannes Elmsheuser, Christoph Anton Mitterer, Dorothee Schaile, Cedric Serfon, and Rodney Walker — Ludwig-Maximilians-Universität München
Data from the LHC (Large Hadron Collider) are routinely analysed on the grid. More than 100 sites worldwide are used daily for ATLAS data reconstruction and simulation (centrally managed by the production system) and for distributed user analysis. Frequent validation of the network, storage and CPU resources is necessary to ensure high performance and reliability of such a complex infrastructure. We report on the development, optimization and results of an automated functional testing suite built on the HammerCloud framework. Functional tests are short, lightweight applications covering typical user analysis and production schemes, which are periodically submitted to all ATLAS grid sites. Sites that fail the tests or are unable to run them are automatically excluded from the PanDA brokerage system, which prevents user and production jobs from being sent to problematic sites. We show that stricter exclusion policies help to increase grid reliability, and that the percentage of user and production jobs aborted due to network or storage failures can be appreciably reduced with such a system.
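The exclusion step described in the abstract can be illustrated with a minimal sketch. This is not HammerCloud or PanDA code: the function name `sites_to_exclude`, the `failure_threshold` parameter, the placeholder site names and the rule itself (exclude a site once its failed tests reach the threshold, or if it reported no result at all) are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple


def sites_to_exclude(
    all_sites: Iterable[str],
    recent_results: Iterable[Tuple[str, bool]],
    failure_threshold: int = 2,
) -> Set[str]:
    """Return the sites a brokerage system should skip (hypothetical policy).

    recent_results holds (site, passed) pairs from the latest test round.
    A site is excluded if its number of failed tests reaches
    failure_threshold, or if it reported no test result at all
    (treated here as "unable to run the tests").
    """
    failures = defaultdict(int)
    reported = set()
    for site, passed in recent_results:
        reported.add(site)
        if not passed:
            failures[site] += 1

    excluded = set()
    for site in all_sites:
        if site not in reported or failures[site] >= failure_threshold:
            excluded.add(site)
    return excluded


# Placeholder site names and results, invented for this example.
sites = ["SITE_A", "SITE_B", "SITE_C", "SITE_D"]
results = [
    ("SITE_A", True), ("SITE_A", True),
    ("SITE_B", False), ("SITE_B", True),
    ("SITE_C", False), ("SITE_C", False),
]

lenient = sites_to_exclude(sites, results, failure_threshold=2)
strict = sites_to_exclude(sites, results, failure_threshold=1)
print(sorted(lenient))
print(sorted(strict))
```

In this toy model, lowering `failure_threshold` corresponds to a stricter exclusion policy: a site with a single failed test is already removed from the list of eligible sites.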