Lead DevOps Strategy: Define and drive the DevOps roadmap, aligning with business and engineering goals.
Infrastructure as Code (IaC): Design and implement scalable, secure, and resilient infrastructure using tools like Terraform.
Build Tools: Design, implement, and maintain build automation using Apache Maven and SBT.
CI/CD Pipelines: Architect and maintain robust CI/CD pipelines using tools such as Jenkins, GitHub Actions, or GitLab CI.
Cloud Operations: Manage and optimize cloud environments (AWS, Azure, GCP), ensuring high availability and cost efficiency.
Monitoring & Observability: Implement and maintain monitoring, logging, and alerting systems (e.g., Prometheus, Grafana, ELK, Datadog).
Security & Compliance: Enforce security best practices and ensure compliance with industry standards (e.g., SOC 2).
Mentorship: Provide technical leadership and mentorship to DevOps engineers and other team members.
Incident Management: Lead incident response efforts and post-mortem analysis to improve system reliability.

Bachelor's or Master's degree in Computer Science, Engineering, or related field.
10+ years in DevOps, Site Reliability Engineering, or related roles, with at least 3 years in a senior or principal capacity.
Proficiency in scripting languages (Python, Bash, Go, etc.)
Deep understanding of containerization and orchestration (Docker, Kubernetes)
Expertise in cloud platforms (AWS, Azure, GCP)
Experience with configuration management tools (Ansible, Chef, Puppet)

Strong leadership and communication skills
Ability to work cross-functionally and influence without authority
Strategic thinking with a hands-on approach

Certifications in AWS, Azure, or Kubernetes
Experience with service mesh technologies (Istio, Linkerd)
Background in software development or SRE