About Trafilea
Trafilea is a global company that builds communities and transformative brands. We own the brands and take care of the entire customer journey, to deliver wow-worthy experiences that influence and empower millions of people globally.
Our culture is fast-paced and dynamic. We are data-driven enthusiasts, passionate about marketing, exponential technologies, and innovation.
We have over 300 employees working around the world, connected by the same purpose and core values. Our support for this new way of working has led to being featured in Forbes and FlexJobs as one of the Top 25 Companies for Remote Workers.
We are looking for dynamic, dedicated, and committed individuals with a strong desire to grow, who can drive the brand forward on its truly exciting journey.
Want to know more about our brands? Shapermint, Truekind & Empetua
We are looking for Cloud Infrastructure Engineers, responsible for designing, building, and maintaining Trafilea’s cloud infrastructure that supports a variety of mission-critical services. They plan, coordinate and execute projects for new implementations or integrations between systems or services. They collaborate effectively with developers and a committed team to understand their challenges, work through their issues and provide solutions that can be adopted widely.
Expected Outcomes & Responsibilities
- Manage and maintain infrastructure systems.
- Maintain and develop a highly automated services landscape and open source services.
- Take ownership of integral technology components and make sure they grow in line with the company's success.
- Scale systems and ensure the availability of services, working with developers on infrastructure changes required by new features and/or products.
- Build, engineer, and support cloud platform IaaS services.
- Partner with application teams to provision scalable workloads reliably across distributed compute resources.
- Provide engineering and operational support for infrastructure systems including configuration management, troubleshooting, and provisioning.
- Implement and maintain security controls.
- Work closely with development teams to understand application performance and behavior patterns to proactively monitor, tune, and correct issues.
- Identify opportunities to improve reliability, performance, and security.
- Develop tools and automation to eliminate manual and repetitive efforts.
- Continuously optimize secure, scalable, and performant tools and services.
- Drive fault detection and correction, performance, and uptime at a global scale.
- Instrument systems to gain visibility and understanding of how they are performing at any time.
- Automate and orchestrate systems to enable accelerated software configuration deployment.
- Build and maintain CI/CD infrastructure.
- Cloud Infrastructure Design and Architecture:
  - Design scalable and reliable cloud architectures that meet business requirements.
  - Develop and document cloud infrastructure design patterns and best practices.
  - Ensure security, compliance, and resilience in cloud infrastructure designs.
- Cloud Deployment and Management:
  - Deploy and configure cloud resources using infrastructure-as-code (IaC) tools.
  - Manage and maintain cloud environments, including virtual machines, containers, and serverless platforms.
  - Implement automation and orchestration to streamline cloud deployment processes.
- Cloud Security and Compliance:
  - Implement robust security measures within the cloud environment, including access controls, encryption, and monitoring.
  - Collaborate with security teams to ensure compliance with industry regulations and internal policies.
  - Conduct regular security assessments and vulnerability scanning of cloud infrastructure.
- Cloud Optimization and Performance:
  - Optimize cloud infrastructure for cost-efficiency, scalability, and performance.
  - Monitor and analyze cloud resource utilization, identify bottlenecks, and recommend optimization strategies.
  - Implement auto-scaling, load balancing, and caching mechanisms to enhance application performance.
- Incident Response and Troubleshooting:
  - Respond to and resolve cloud-related incidents, such as infrastructure failures, performance issues, and security breaches.
  - Conduct root cause analysis for incidents and implement preventive measures.
  - Collaborate with cross-functional teams to troubleshoot and resolve cloud-related issues.
- Collaboration and Communication:
  - Collaborate with development, operations, and other teams to integrate cloud resources into the overall architecture.
  - Communicate effectively with stakeholders to gather requirements, provide updates, and address concerns.
  - Share knowledge, best practices, and documentation with the team to foster continuous learning.
- Research and Innovation:
  - Stay updated on emerging cloud technologies, trends, and best practices.
  - Evaluate new cloud services and tools to enhance the organization's cloud infrastructure.
  - Proactively propose innovative solutions to improve cloud processes, security, and performance.
Role-Related Skills
(The skills that demonstrate the behavior a person must have to achieve the above results.)
- BS/MS in Computer Science, Engineering or related field.
- Knowledge of network protocols (IP, TCP, DNS, TLS, HTTP).
- Extensive experience with Unix-based (Linux) operating systems.
- Experience with containers and cluster management tools (Kubernetes, OpenShift, Docker Swarm, Docker).
- Excellent documentation skills and verbal and written communication skills in English.
- Experience with IaaS cloud providers (AWS, GCP, Azure); AWS is a must.
- Experience with Terraform as an IaC language.
- Experience with monitoring platforms such as Datadog, Prometheus, New Relic.
- Experience automating repetitive operational tasks (Python, shell, Node).
- Expertise in at least one development language, such as Python, is a must.
- Experience managing database engines such as PostgreSQL, MySQL, MariaDB.
- Deep, hands-on experience with foundational infrastructure capabilities, including HA, DR, Network, Routing, Firewall, DNS, Storage, DB, Linux Administration and configuration management.
- Experience developing with Ansible, Puppet, or Salt.
What We Have to Offer:
- Proximity doesn’t influence productivity. As a globally distributed team, you can live and work wherever you want.
- A rich experience, including the opportunity to collaborate with world-class talent, in a culture that encourages transparency and open communication for all.
- A data-driven, dynamic, energetic work environment, full of talented, goal-oriented, and empathetic people working together to grow and develop both as professionals and human beings.
- A safe space to be who you truly are. We embrace and support diversity and equity, and work hard every day to become more inclusive.
- Openness to new ideas and initiatives: You can always join a squad, tribe, or committee, or start new ones. Bring your hobbies and passions and transform them into projects!
For more benefits, please visit the Trafilea website.
Are you ready? Apply for this position today and join the fastest-growing startup in the world!