DevOps Engineer - Shanghai

Thatgamecompany

thatgamecompany designs and develops artistically crafted, broadly accessible video games that push the boundaries of interactive entertainment. We respect our players and want to contribute meaningful, enriching experiences that touch and inspire them.

We seek talent that values integrity and personal growth within an environment of intense collaboration and experimentation.

Our mission: create timeless entertainment that makes a positive change to the human psyche worldwide. If you’ve been wanting to have a positive impact on people's lives and to take on the creative challenges that come with doing something different and unique, then come help us shine brighter together.

thatgamecompany is building an R&D team in China. The new team will help the company build its long-term tech stack, including backend infrastructure, multi-tenant platforms and microservices, a data warehouse, and data pipelines.

Location: Shanghai

On any given day at thatgamecompany, you might:

  • Write code to describe the backend infrastructure, and make the deployment and configurations visible, readable and maintainable.
  • Build tools for rapid iteration, CI/CD, monitoring, diagnosis, and easy access to the backend systems.
  • Embrace modern container and cluster management technology to make our backend stack more elastic and robust.
  • Improve and maintain an agile and reliable development environment for the backend stack, so that people with different skillsets in the company can do social experiments easily, and new hires can ramp up quickly.
  • Monitor backend health and respond to any failures or glitches to deliver a smooth online experience to players worldwide; keep improving DevOps tools to make the job more automated and error-proof.

We expect you to:

  • Have a deep passion for video games and well-considered thoughts about them; be a gamer and think on behalf of players.
  • Be comfortable taking risks and pursuing engineering achievements that no one else has accomplished.
  • Enjoy working with fast-moving and rapidly growing small teams.
  • Be comfortable with periodic on-call duty.

Required Skills

  • Have one year or more of DevOps experience in a production environment.
  • Be comfortable working with the Linux ecosystem. Be fluent in Linux or macOS bash CLI tools and Python scripting.
  • Have basic knowledge of operating systems and low-level network protocols.
  • Be able to extract useful information from different sources of logs, find correlations between multiple layers of systems and diagnose failures, suspicious behaviors, and performance bottlenecks from bottom to top.
  • Be eager to learn new technologies and always open to stepping out of your comfort zone.
  • Be able to understand English documentation. Be fluent in written English for technical communication in chat tools. Be able to speak English in everyday situations.

Preferred Skills

Any of the following would be highly preferred, but most of all, we value engineers who are eager to learn new ways to deliver value to players: 

  • Have experience with production deployments of Docker and Kubernetes.
  • Have managed and maintained a production environment on AWS or GCP.
  • Have deployed services in Kubernetes with CI/CD tools.
  • Have deep knowledge of Terraform or Ansible.
  • Have deep knowledge of one SQL or NoSQL database and be aware of how its storage engine works under the hood.
  • Have experience using and configuring monitoring tools such as DataDog or Grafana.
  • Be familiar with ElasticSearch and Kibana.
  • Be fluent in spoken English for professional communication.

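thatgamecompany is building its backend R&D team in China. The team will help the company build up its long-term technology base, including server-side infrastructure, platforms and microservices that can serve multiple games, a large-scale data warehouse, and the supporting data pipelines.

Responsibilities

Your day-to-day work will involve:

  • Describing backend deployments in code to achieve Infrastructure as Code. Improving the deployment and configuration process for server components to make it more visible, more readable, and easier to maintain.
  • Developing internal tools that support and improve operations tasks such as rapid backend iteration, automated CI/CD deployment, real-time monitoring, and troubleshooting.
  • Using cutting-edge container and cluster management technology to make our backend systems more stable and easier to scale.
  • Improving and maintaining the day-to-day backend development environment for engineers, so that developers in different roles can take part in backend work (for example, running online social experiments quickly and easily) and new hires can get up to speed faster.
  • Monitoring server health daily and responding quickly to any system glitches and incidents, so that players around the world have a smooth and comfortable game experience. At the same time, continuously improving our internal operations tools to make this routine maintenance work safer and more automated.

We hope you:

  • Love video games and think deeply about them. Are a gamer who looks at problems from the player's perspective.
  • Dare to take on difficult challenges and risks, and to accomplish engineering achievements no one has accomplished before.
  • Can adapt to a fast-paced, rapidly expanding small team.
  • Take part in the on-call rotation.

Basic Skill Requirements

  • At least one year of operations experience in a production environment.
  • Familiarity with the Linux ecosystem. Proficiency with common Linux or macOS command-line tools and the ability to write Python scripts.
  • A basic understanding of operating systems and low-level computer networking principles.
  • Skill at extracting useful information from large volumes of system logs and at finding correlations in data and metrics across the layers of a backend system, in order to accurately diagnose failures, identify suspicious behavior, and locate performance bottlenecks.
  • Enthusiasm for learning new technologies, openness to different viewpoints, and willingness to step out of your comfort zone.
  • The ability to read and understand English documentation and to communicate about technical matters in written English.

Advanced Skill Requirements

We value the following skills highly. Most important of all, though, is that you are willing to learn new things and keep delivering value to players.

  • Experience deploying and maintaining Docker and Kubernetes in a production environment.
  • Experience managing AWS or GCP public cloud in a production environment.
  • Experience using automated operations (CI/CD) tools to deploy Kubernetes services.
  • A fairly deep understanding of, or hands-on experience with, Terraform or Ansible.
  • Familiarity with at least one SQL or NoSQL database and an understanding of how its underlying storage engine works.
  • Experience configuring monitoring tools such as DataDog or Grafana.
  • Familiarity with ElasticSearch and Kibana.
  • Proficiency in spoken English.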
Confirmed 6 hours ago. Posted 30+ days ago.
