Introduction to SA1: Grid Operations, Support and Management.
The European Grid Operations, Support and Management activity (SA1)
will create, operate, support and manage a production quality European
grid infrastructure which will provide computing, storage, instrumentation
and informational resources at many resource centres (RCs) across Europe.
The resources will be accessible to user communities and virtual organisations according to
agreed access management policies and service level agreements. The terms of engagement
of the resource centres and users will be driven by policies determined by the European e-Infrastructure
Reflection Group.
This activity will build on current national and international grid
initiatives. The key aim in assembling this infrastructure is to incorporate
and exploit existing expertise and experience in deploying, supporting
and operating prototype grids. The LCG
project will play a central role in providing an operational infrastructure
from the earliest stage of the EGEE grid.
The key objectives of the European Grid Operations, Support and Management team include:
- Core infrastructure services: to operate a set of
essential services, such as the information services, resource brokers,
data management services and administration of the virtual organisations
that bind distributed resources into a coherent infrastructure.
- Grid monitoring and control: to actively monitor the
operational state of the grid and its performance, initiating corrective
action when problems arise with either core infrastructure
or grid resources.
- Middleware deployment and resource induction: to validate
middleware releases and then to deploy them to resource centres throughout
the grid. Strict criteria will be placed on validating new middleware
before production deployment. This will involve close interaction
and feedback with the Middleware Re-engineering and Integration activity
(JRA1) and the Application Identification and Support activity (NA4).
Where new resource centres are to join the grid, assistance must be
provided both with middleware installation and with the introduction
of operational procedures at resource centres. Extra effort will be
devoted to resource centres that provide resources such as parallel and
vector supercomputers, which play strategic roles for a number of scientific
applications.
- Resource and user support: to receive, respond to
and coordinate the resolution of problems with grid operations from
both resource centres and users; this role will filter and aggregate
problems, providing solutions where known, and engaging core infrastructure
or middleware engineering or other appropriate experts to resolve
new problems.
- Grid management: to coordinate the fulfilment of
the above objectives by Regional Operations Centres (ROCs) and Core
Infrastructure Centres (CICs), together with managing the relationships
with resource providers, through negotiation of service-level agreements,
and the wider grid community, through participation in liaison and
standards bodies.
- International collaboration: to drive collaboration
with peer organisations in the Americas and in Asia-Pacific; to ensure
the interoperability of grid infrastructures and services in order
that the project can seamlessly access resources both within and outside
those provided through EGEE.
The first of these two objectives is
the responsibility of the Core Infrastructure Centres; the second objective
is the responsibility of the Regional Operations Centres, which bring
new resources into the grid. Both the ROCs and the CICs will be overseen
by an Operations Management Centre (OMC), which will be responsible
for their coordination.