Joining Razer will place you on a global mission to revolutionize the way the world games. Razer is a place to do great work, offering you the opportunity to make a global impact while working with a team located across five continents. Razer is also a great place to work, providing the unique, gamer-centric #LifeAtRazer experience that will accelerate your growth, both personally and professionally.
Job Description
Responsibilities:
- Design, develop, and deploy web scraping solutions to collect specific datasets for AI training purposes.
- Build robust and scalable web crawlers to extract structured and unstructured data from various online sources.
- Ensure data accuracy, integrity, and compliance with relevant laws and regulations.
- Clean, preprocess, and organize scraped data for use in machine learning models.
- Monitor and optimize crawling performance to ensure efficiency and reliability.
- Collaborate with AI teams to define data requirements and ensure the relevance of collected data.
- Document crawling workflows, tools, and results for future reference.
Requirements:
- Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field.
- Strong experience with web scraping tools and frameworks (e.g., Scrapy, Selenium, BeautifulSoup).
- Proficiency in programming languages such as Python, Java, or JavaScript (Node.js).
- Familiarity with HTTP protocols, HTML parsing, and JSON data formats.
- Knowledge of database systems (SQL, NoSQL) for data storage and management.
- Experience with cloud platforms (e.g., AWS, GCP) and containerization tools (e.g., Docker).
- Strong understanding of web crawling ethics, regulations, and best practices.
- Excellent analytical skills and attention to detail.
Preferred Qualifications:
- Experience with large-scale data scraping and handling distributed crawlers.
- Familiarity with AI and machine learning concepts, especially data preprocessing for AI models.
- Knowledge of browser automation and tools for rendering dynamic content.
- Ability to handle multilingual data and diverse data formats.
Are you game?