What you will do
As a member of our Reliability Engineering Product SRE team, you will be responsible for building production applications with the highest level of MVRs and SMMs, ensuring customer satisfaction through your expertise in SRE domain skills. We are seeking an individual who is passionate about automation, efficiency, and operational excellence. You will be using a bundled tech stack to provide deep visibility into customer, product, and infrastructure interactions. You will have a keen eye for SLOs, SLIs, SLAs, and the golden metrics that drive reliability. You will programmatically approach MVRs using coding and scripting languages, while also leveraging AI/ML-driven insights where applicable.
What your responsibilities will be
- Build products with MVRs and reliability standards, ensuring system resilience and scalability.
- Set up and operate observability tools across multiple cloud providers, incorporating AI-powered anomaly detection to enhance monitoring.
- Assist development teams in defining SLO/SLI dashboards and alerts, optimizing alerting signals with ML-based noise reduction techniques.
- Use Go, Python, or Terraform to automate operational tasks and build self-healing mechanisms.
- Manage and administer Grafana, Prometheus, Loki, and other observability tools, integrating predictive analytics where beneficial.
- Troubleshoot and support production environments, using AI-assisted diagnostics where applicable for faster root cause identification.
- Automate incident response workflows, leveraging AIOps to reduce manual toil and improve MTTR.
What you'll need to be successful
Experience
- Minimum 8 years of experience in a SaaS environment
- Bachelor's degree in computer science or equivalent
- Ability to participate in an on-call rotation
Qualifications
- AI-powered: Interest in AI-powered automation, including AIOps tools, ML-based alert tuning, and predictive maintenance.
- Networking: Strong understanding of the OSI model, TCP/IP, and DNS, particularly as they relate to cloud environments.
- Linux Fundamentals: Solid experience with the administration, security hardening, and performance tuning of one or more distributions of Linux.
- Troubleshooting: A passion for tracking down the technical root causes of problems in distributed systems and software.
- Observability: Experience with developing service level indicators and objectives, instrumenting software, and building alerts. ML-based anomaly detection is a plus.
- Software Engineering: An understanding of software engineering fundamentals with experience developing software with a team of engineers.
- Automation: A strong desire to automate everything and eliminate toil.
- Containers: A solid understanding of the underpinnings of container technology, such as cgroups and namespaces.
- Container Orchestration Systems: Experience with the operations, administration, and development of orchestration systems such as Kubernetes, ECS, Mesos, and Nomad.
- IaC: Experience deploying and maintaining infrastructure as code with tools such as Terraform and Pulumi.
- Technical Writing: Most of the services we develop are greenfield, and you will need to build documentation and diagrams for other engineering teams.
- Customer Satisfaction: Keen eye for customer satisfaction (our customers are other engineering teams and Avalara customers).
- Passion for Learning: Interest in the broader technology space with a constant desire to expand your understanding.
- Adaptability: Experience working on a variety of projects. In short, we want people with T-shaped skills.
- Tools & Technologies we are looking for as part of the skillset: Terraform, Grafana, Prometheus, Loki, Alertmanager, Pushgateway, Prometheus exporters & client libraries, PromQL, LogQL, Fluentd, Fluent Bit, Sumo Logic, Splunk, Tempo, Jaeger, OpenTelemetry, Cortex, etc.
- Other Common Tools & Technologies expected: AWS, GCP, Oracle Cloud, Azure, Terraform, Pulumi, GitLab, Artifactory, Atlassian suite, Git, Kubernetes, Go, C#, Python, Bash, PowerShell, Docker, Windows, Linux, etc.
Preferred Qualifications
- Programming Languages: Go and Python
- Distributed Computing: Experience architecting, developing, and deploying distributed services across regions and clouds.
- GitLab: Experience working with, managing, and deploying GitLab.
- Artifactory: Experience working with, managing, and deploying Artifactory.
- Open Source: Experience building side projects or contributing to open-source projects.
How we'll take care of you
Total Rewards
In addition to a great compensation package, paid time off, and paid parental leave, many Avalara employees are eligible for bonuses.
Health & Wellness
Benefits vary by location but generally include private medical, life, and disability insurance.
Inclusive culture and diversity
Avalara strongly supports diversity, equity, and inclusion, and is committed to integrating them into our business practices and our organizational culture. We also have 8 employee-run resource groups, each with senior leadership and executive sponsorship.
What you need to know about Avalara
We’re Avalara. We’re defining the relationship between tax and tech.
We’ve already built an industry-leading cloud compliance platform, processing nearly 40 billion customer API calls and over 5 million tax returns a year, and this year we became a billion-dollar business. Our growth is real, and we’re not slowing down until we’ve achieved our mission: to be part of every transaction in the world.
We’re bright, innovative, and disruptive, like the orange we love to wear. It captures our quirky spirit and optimistic mindset. It shows off the culture we’ve designed, that empowers our people to win. Ownership and achievement go hand in hand here. We instill passion in our people through the trust we place in them.
We’ve been different from day one. Join us, and your career will be too.