OPENTEXT

OpenText is a global leader in information management, where innovation, creativity, and collaboration are the key components of our corporate culture. As a member of our team, you will have the opportunity to partner with the most highly regarded companies in the world, tackle complex issues, and contribute to projects that shape the future of digital transformation.

YOUR IMPACT:

We are seeking a Lead Site Reliability Engineer (SRE) with advanced expertise in AWS, infrastructure automation, and modern deployment practices to join our Cloud Operations team. As a Lead SRE, you will own the design, implementation, and operational excellence of production systems while leading a team of engineers and driving a culture of reliability, performance, and innovation.

This role combines deep, hands-on technical work with strategic leadership across serverless computing, container orchestration, and Terraform-driven automation. The ideal candidate is proactive, collaborative, and passionate about building scalable, secure, and resilient cloud-native systems.

WHAT THE ROLE OFFERS:

☁️ Cloud Infrastructure Architecture & Engineering

  • Architect, build, and maintain secure, scalable, and highly available AWS infrastructure.
  • Lead infrastructure design using services such as VPC, EC2, S3, RDS, Route 53, IAM, ACM, and Security Hub.
  • Design and implement multi-region, fault-tolerant, and cost-optimized cloud architectures.

Infrastructure as Code & Automation

  • Define and enforce standards for Infrastructure as Code using Terraform, Terragrunt, and CloudFormation.
  • Automate full-stack provisioning, environment setup, and teardown processes.
  • Integrate automated security controls, compliance checks, and audit logging into deployment workflows.

Containerization & Orchestration

  • Architect and operate containerized workloads using Amazon EKS (Kubernetes), ECS, and Fargate.
  • Implement Helm charts and Kubernetes-native deployment strategies.
  • Optimize service scaling, auto-healing, and cluster resource management.

⚡ Serverless & Event-Driven Systems

  • Design and deploy serverless architectures using AWS Lambda, API Gateway, Step Functions, and DynamoDB.
  • Build event-driven automations using SNS, SQS, EventBridge, and custom Lambda integrations.
  • Champion lightweight and efficient operational patterns using serverless-first principles.

Monitoring, Observability & Incident Response

  • Build observability pipelines using CloudWatch, Prometheus, Grafana, and ELK.
  • Define and measure SLIs, SLOs, and error budgets to drive service reliability.
  • Lead production incident management, postmortems, and continuous reliability improvement.

CI/CD and DevOps Leadership

  • Lead CI/CD pipeline design using GitLab CI/CD, Jenkins, or AWS CodePipeline.
  • Implement blue/green, canary, and rolling deployments with rollback strategies.
  • Integrate security scans and quality gates into CI/CD workflows for production readiness.

Team Leadership & Strategic Impact

  • Mentor and guide junior and mid-level SREs, fostering a learning and high-performance culture.
  • Collaborate with product, engineering, and security teams to align infrastructure with business needs.
  • Lead operational reviews, cloud architecture assessments, and reliability-focused design reviews.
  • Set goals and KPIs for availability, deployment velocity, automation coverage, and incident reduction.

WHAT YOU NEED TO SUCCEED:

  • 8–14 years in SRE, CloudOps, or Infrastructure Engineering roles, including at least 3 years in a technical leadership capacity.
  • Strong hands-on expertise with AWS services, especially in compute, networking, and serverless.
  • Expert-level skills in Terraform, including module design, CI/CD integration, and state management.
  • Experience in deploying and managing production workloads on ECS, EKS, and Lambda.
  • Deep scripting knowledge in Python, Bash, or PowerShell for automation.
  • Proven experience with CI/CD pipelines, Git workflows, and release automation.
  • Solid understanding of security principles, compliance frameworks (ISO 27001, SOC 2), and cloud governance.
  • Experience with performance tuning, cost optimization, and cloud-native resilience design.

PREFERRED QUALIFICATIONS:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or equivalent practical experience.
  • AWS certifications such as AWS DevOps Engineer, Solutions Architect, or SysOps Administrator.
  • Experience in chaos engineering, incident simulation, or SRE maturity assessments.
  • Familiarity with GitOps tools (e.g., ArgoCD, FluxCD), secrets management (e.g., Vault, AWS Secrets Manager), and cloud cost tooling.

OpenText's efforts to build an inclusive work environment go beyond simply complying with applicable laws. Our Employment Equity and Diversity Policy provides direction on maintaining a working environment that is inclusive of everyone, regardless of culture, national origin, race, color, gender, gender identification, sexual orientation, family status, age, veteran status, disability, religion, or other basis protected by applicable laws.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please contact us at hr@opentext.com. Our proactive approach fosters collaboration, innovation, and personal growth, enriching OpenText's vibrant workplace.

Confirmed 5 hours ago. Posted 30+ days ago.
