Job Description:
Senior Delivery Manager (Production Support and DevOps)
The person is responsible for ensuring the smooth and reliable operation of production systems and applications, acting as a point of contact for incidents and ensuring the efficient resolution of issues. They also play a key role in incident management, root cause analysis, and continuous improvement efforts. The person should have excellent leadership and people management skills and should be able to lead a large team of Production Support and DevOps Engineers.
Key roles and responsibilities:
Incident Management and Support:
Monitoring and Troubleshooting: Continuously monitor systems and applications for performance issues, incidents, and alerts, and proactively respond to incidents.
Issue Resolution: Diagnose and resolve production issues using advanced troubleshooting techniques.
Root Cause Analysis: Perform in-depth analysis to identify the root causes of incidents and prevent recurrence.
Documentation and Communication: Create and maintain documentation related to production issues and resolutions, and effectively communicate with stakeholders, including development and operations teams.
Incident Management: Oversee the incident management process, including prioritization, escalation, and resolution, ensuring timely and effective incident resolution.
System Performance and Optimization:
Performance Monitoring: Monitor system performance metrics, identify bottlenecks, and recommend solutions for performance optimization.
Process Improvement: Implement and maintain processes and procedures to improve production support efficiency and reduce downtime.
Automation: Identify and implement automation opportunities to streamline repetitive tasks and reduce manual effort.
Data Analysis: Analyze data related to production performance, incident trends, and support requests to identify areas for improvement and optimization.
Cross-Functional Collaboration:
Collaboration with Development and Operations: Work closely with development, operations, and other relevant teams to ensure seamless software deployment and integration.
Communication and Reporting: Provide regular reports on system performance, incident status, and support metrics to senior management and stakeholders.
On-Call Support: Participate in on-call rotations and respond to production issues after business hours.
Other Responsibilities:
Training and Documentation: Develop and deliver training materials and documentation to support production support teams.
Process Improvement: Identify and implement improvements to production support processes and procedures.
Knowledge Management: Maintain and update knowledge databases and documentation to support troubleshooting and incident resolution.
Continuous Improvement: Drive continuous improvement initiatives to enhance the overall efficiency and reliability of production support.
Technical Skills:
Excellent knowledge of ServiceNow, New Relic, and AWS Cloud, along with application, system, network, cloud, and DevOps technologies.
Experience:
12+ years
Certification:
ITIL and AWS certifications are desired.
We offer you a competitive total rewards package, continuing education & training, and tremendous potential with a growing worldwide organization.
DISCLAIMER:
Nothing in this job description restricts management's right to assign or reassign duties and responsibilities of this job to other entities, including but not limited to subsidiaries, partners, or purchasers of Alight business units.