DATA CENTRE OPERATIONS

OFFSHORE SERVICE OFFERING – Data Centre Operations Activities


Event Monitoring and Management, and logging of all Critical Incidents outside Service Desk hours


Daily Activities

  • Monitoring and reporting all Events that occur throughout the in-scope IT environment and the infrastructure that supports it, in order to detect and escalate exception conditions
  • Providing an Event Management System that is a single consolidated repository of all Events arising from the in-scope IT components that comprise the environment
  • Correlating and filtering Events so that, for each issue, only the Event that records that issue is presented for resolution
  • Monitoring the status, availability, capacity, performance and utilization of all in-scope IT infrastructure components that comprise the environment
  • At any time, in the event that an in-scope IT component becomes unavailable, an exception Event shall automatically be forwarded to the Event Management System and the DC-Ops Team shall initiate appropriate recovery actions without the need for any End User to log a call
  • When any Interface Process or Batch Job fails unexpectedly, a warning Event or exception Event shall automatically be forwarded to the Event Management System and the DC-Ops Team shall initiate appropriate recovery actions without the need for any End User to log a call
  • All known warning Events or exception Events shall have a defined Event action plan to expedite their Resolution, which shall be available to the Resolver Group at all times
  • Acknowledge all alerts received and create a ticket with the appropriate priority for each, within the SLA
  • Whenever the DC-Ops Team receives an alert from the infrastructure about an issue that has occurred or is about to occur, it shall immediately inform the Service Desk of the incident
  • Resolve first-level issues based on the information captured in the KEDB (as agreed in the SOPs defined for DC-Ops). The support teams can update the KEDB with resolutions, which the DC-Ops Team can then apply after appropriate training (e.g. disk-full alerts for Windows servers)
  • If the DC-Ops Team is unable to resolve an issue, pass it on to the respective next-level support team with an appropriate work log, as per the SOP defined for DC-Ops
  • Obtain regular updates from the support teams involved in the issue and record them in the alert tracker
  • Ensure all issues are logged in the issue tracker with the appropriate resolution
  • Maintain the shift hand-over process within the team
  • Manage and maintain physical access control requests for permanent and temporary entry to the Data Centres
  • Act as a back-up for the Service Desk during the Major Incident Management process by initiating the conference bridge between the relevant stakeholders (Support, Business and the Major Incident Manager)
  • Performing daily DC-Ops health checks
  • Maintain a daily Operations Trouble Report of Critical Incidents and alerts
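
The correlation, ticketing and first-level resolution activities listed above can be illustrated with a minimal sketch. It is not a description of any particular Event Management System: the Event fields, the KEDB contents, the severity-to-priority mapping and the function names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    component: str  # in-scope IT component that raised the Event
    issue: str      # short issue key, e.g. "disk_full" (hypothetical naming)
    severity: str   # "warning" or "exception"

# Hypothetical KEDB: issue key -> documented first-level fix.
KEDB = {"disk_full": "Clear temporary files as per SOP"}

# Hypothetical mapping from Event severity to ticket priority.
PRIORITY = {"exception": "P1", "warning": "P3"}

def correlate(events):
    """Keep one Event per (component, issue), preferring the most severe."""
    rank = {"warning": 0, "exception": 1}
    best = {}
    for ev in events:
        key = (ev.component, ev.issue)
        if key not in best or rank[ev.severity] > rank[best[key].severity]:
            best[key] = ev
    return list(best.values())

def triage(event):
    """Build a ticket: resolve at first level if the KEDB has a fix, else escalate."""
    fix = KEDB.get(event.issue)
    return {
        "component": event.component,
        "issue": event.issue,
        "priority": PRIORITY[event.severity],
        "action": "resolve_l1" if fix else "escalate",
        "work_log": fix or "No KEDB entry; passed to next-level support",
    }

if __name__ == "__main__":
    raw = [
        Event("srv1", "disk_full", "warning"),
        Event("srv1", "disk_full", "exception"),
        Event("db1", "job_failed", "exception"),
    ]
    for ticket in map(triage, correlate(raw)):
        print(ticket)
```

In this sketch, duplicate Events for the same component and issue collapse into a single ticket, and an issue without a KEDB entry is escalated with a work log rather than handled by the DC-Ops Team.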

Outside Service Desk Hours


  • Log all Critical Incidents
  • Assign Critical Incidents to the appropriate Resolver Group and update the Major Incident Manager
  • Initiate the conference bridge between the relevant stakeholders (Support, Business and the Major Incident Manager), wherever necessary
  • Close Critical Incidents once the End User is satisfied with the solution
  • Coordinate with OEMs and third parties on hardware issues, software upgrades and regular maintenance
  • Maintain the daily hand-over process with the Service Desk

Weekly


  • Documenting and updating the DC-Ops operational procedures and related documents
  • Prepare weekly ticket reports for DC-Ops

Monthly


  • Prepare monthly ticket reports for DC-Ops
  • Perform quality checks on the tickets opened by team members
  • Identify training gaps where issues recur, and training needs arising from new processes and updates
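
As a rough illustration of the monthly reporting and gap-identification activities above, the sketch below counts tickets by priority and flags issues that recur often enough to suggest a training need. The ticket fields and the recurrence threshold of 3 are assumptions made for the example, not values defined in this offering.

```python
from collections import Counter

def monthly_summary(tickets, repeat_threshold=3):
    """Summarise a month's tickets and flag recurring issues.

    `tickets` is a list of dicts with hypothetical keys "issue" and "priority".
    `repeat_threshold` is an assumed cut-off for treating an issue as recurring.
    """
    by_priority = Counter(t["priority"] for t in tickets)
    by_issue = Counter(t["issue"] for t in tickets)
    recurring = {issue: n for issue, n in by_issue.items() if n >= repeat_threshold}
    return {
        "total": len(tickets),
        "by_priority": dict(by_priority),
        "recurring_issues": recurring,  # candidates for training or KEDB updates
    }

if __name__ == "__main__":
    sample = [
        {"issue": "disk_full", "priority": "P3"},
        {"issue": "disk_full", "priority": "P3"},
        {"issue": "disk_full", "priority": "P1"},
        {"issue": "job_failed", "priority": "P1"},
    ]
    print(monthly_summary(sample))
```

The same function could be run on a weekly extract for the weekly ticket report; only the input set of tickets changes.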