HK: Physics of Hadrons and Nuclei Division (Fachverband Physik der Hadronen und Kerne)

HK 70: Instrumentation XIV

HK 70.6: Talk

Friday, March 19, 2010, 15:15–15:30, HG ÜR 6

Operation & Control Interfaces based upon Distributed Agent Networks — •Pierre Zelnicek¹, Udo Kebschull¹, and Volker Lindenstruth² — ¹Kirchhoff Institute of Physics, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany — ²Frankfurt Institute for Advanced Studies, Frankfurt, Germany

Most of today's large-scale compute clusters, and the software systems running on them, use operation and control interfaces (OCIs) for monitoring and control. Most of these OCIs are still single-node applications, limited by the physical system they run on. Where a hundred thousand or more statistical values have to be analyzed and taken into account for visualization and decision making, such OCIs are not a viable option. Furthermore, they do not enable whole collaborations to operate and control a cluster simultaneously from around the world. Distributed agent networks (DANs) have the potential to overcome these limitations. A distributed agent network is by design a multi-node approach. Combined with a web-based OCI, automatic data propagation, and distributed locking algorithms, it provides simultaneous operation and control, distributed state tracking, and visualization to worldwide collaborations. The first compute cluster in the scientific world to use this combination of technologies is the ALICE HLT at CERN.