We are seeking a Senior Cloud Operations Engineer to join our dynamic team. The ideal candidate will have a strong background in cloud technologies, with a focus on designing, implementing, and managing cloud-based solutions. As a Senior Cloud Operations Engineer, you will play a key role in ensuring the availability, performance, and security of our cloud infrastructure.
In this role, you will...
- Lead day-to-day technical operations, ensuring the highest levels of availability, reliability, and scalability of the Services.
- Implement best practices for cloud security, including identity and access management, encryption, and network security.
- Provide technical expertise to handle customer escalations and maintain the stability of customer environments.
- Conduct performance analysis and lead monitoring initiatives across multiple hosted products and platforms.
- Maintain operational runbook procedures for all production systems and document the knowledge base.
- Administer Incident Management activities (detect, record, classify, and close) and provide timely escalations and notifications as required by procedure.
- Design, build, and maintain scalable, reliable, and secure infrastructure.
- Participate in on-call rotation to respond to cloud-related incidents and emergencies.
- Troubleshoot and resolve complex technical issues in a timely manner.
- Monitor and optimize cloud infrastructure for performance, cost, and security.
- Collaborate with cross-functional teams to troubleshoot and resolve complex cloud-related issues.
- Mentor junior team members and provide technical guidance and support.
You’ve Got What It Takes If You Have...
- A bachelor’s degree or higher in computer science, engineering, or a related field.
- 5+ years of experience in cloud operations.
- Strong communication and collaboration skills.
- Excellent troubleshooting and problem-solving skills.
- Comprehensive understanding of cloud computing principles and architectures.
- Extensive experience in Linux/Unix environments.
- Proficiency in containerization technologies like Docker and Kubernetes.
- Strong scripting skills in Python or Bash.
- Proficiency in debugging and optimizing Java-based applications.
- Hands-on experience in deploying, optimizing, and troubleshooting applications on Tomcat and JBoss application servers.
- Hands-on experience in managing and optimizing Memcached, Nginx, ActiveMQ, Elasticsearch, and Redis applications.
- Experience with monitoring and logging tools such as New Relic and the ELK stack.
- Sound knowledge of networking concepts, including TCP/IP, DNS, and VPN.
- Proficiency in automation, CI/CD, and source control tools such as Ansible, Jenkins, and Bitbucket.
- Thorough understanding of monitoring and alerting tools such as Nagios, New Relic, Grafana, and Checkmk.
- Experience with distributed storage technologies such as NFS, NetApp, and Amazon S3, as well as dynamic resource management frameworks (e.g., Kubernetes).
- Experience working in data center and AWS cloud environments.
#LI-Onsite