The DevX (Developer Experience) group is the central team responsible for building critical software that enables over 10,000 engineers around the globe to deliver Cisco’s flagship products. Through continuous innovation and delivery of solutions that support all phases of the product life cycle, we strive to accelerate the pace at which Cisco delivers high-quality value to our customers. We do this at a scale that involves billions of lines of code across all our products. By bringing development and customer context together to derive valuable insights, we believe we can fundamentally transform how Cisco builds, tests, and releases software.
We’re a group of engineers, technologists, designers, and more, building products and capabilities that are simple to use and easy to manage. We seek forward-thinking, committed team members who are passionate about new, innovative ways to deliver value to customers. You will be surrounded by intelligent and creative people who are determined to build innovative products and services.
We are looking for an inspiring innovator, mentor, and coach who is conversant with and passionate about emerging opportunities to join a software development team building the Next Generation Service for our fellow Cisco engineers. Help us design, build, and deploy a microservices-based private cloud that delivers state-of-the-art open-source, third-party, and custom-designed applications to our users. We are at an exciting stage of this journey and are looking for a passionate, innovative, and action-oriented engineering leader who wants to be part of the team building a Next Generation Developer Experience at Cisco.
Your Impact
We are seeking an experienced and highly motivated Production Engineering Technical Lead to oversee the development, optimization, and maintenance of our production systems. The ideal candidate will play a key role in ensuring system reliability, scalability, and performance while collaborating with cross-functional teams to drive technical excellence.
Minimum Requirements
Design, implement, and maintain scalable, highly available, and fault-tolerant systems in cloud environments.
Optimize the performance, cost, and efficiency of cloud infrastructure by leveraging cloud-native tools and services.
Supervise infrastructure and applications to ensure optimal performance, availability, and security.
Troubleshoot production issues in infrastructure platforms, applications, and services, including root cause analysis and resolution.
Implement automated monitoring and alerting to identify performance bottlenecks and downtime before they impact users.
Collaborate with DevX application teams to automate and streamline the deployment of applications and updates to production environments using CI/CD pipelines.
Ensure smooth and efficient release management, including managing environment configurations and ensuring minimal downtime during production releases.
Maintain version control and manage rollback strategies for production releases.
Participate in on-call rotations to provide 24/7 production support for critical incidents in the cloud platform.
Lead incident management processes, including troubleshooting, escalation, and resolution of production issues.
Document incidents and solutions for future reference and continuous improvement.
Minimum Qualifications
BS/MS in Computer Science
At least 12 years of experience, including experience in production engineering, site reliability, or a similar role.
In-depth experience with container platforms such as Google Anthos.
Strong understanding of networking, containers (e.g., Docker, Kubernetes), microservices architectures, and distributed systems. Proficient in CI/CD tools (e.g., Jenkins, Argo CD) and version control systems (e.g., GitHub).
Strong understanding of CI/CD pipelines, observability (monitoring, logging, tracing), and incident management frameworks.
Excellent problem-solving skills, with the ability to diagnose and resolve complex production issues.
Strong communication and leadership skills, with a track record of driving technical initiatives.
#WeAreCisco, where every individual brings their unique skills and perspectives together to pursue our purpose of powering an inclusive future for all.
Our passion is connection—we celebrate our employees’ diverse set of backgrounds and focus on unlocking potential. Cisconians often experience one company, many careers where learning and development are encouraged and supported at every stage. Our technology, tools, and culture pioneered hybrid work trends, allowing all to not only give their best, but be their best.
We understand our outstanding opportunity to bring communities together, and at the heart of that is our people. One-third of Cisconians collaborate in our 30 employee resource organizations, called Inclusive Communities, to connect, cultivate belonging, learn to be informed allies, and make a difference. Dedicated paid time off to volunteer (80 hours each year) allows us to give back to causes we are passionate about, and nearly 86% do!
Our purpose, driven by our people, is what makes us the worldwide leader in technology that powers the internet. Helping our customers reinvent their applications, secure their enterprise, transform their infrastructure, and meet their sustainability goals is what we do best. We ensure that every step we take is a step towards a more inclusive future for all. Take your next step and be you, with us!