Overview
The Data Science team develops Machine Learning (ML) and Artificial Intelligence (AI) projects. The specific scope of this role is to develop ML solutions in support of ML/AI projects using big data analytics toolsets in a CI/CD environment. Analytics toolsets may include DS tools/Spark/Databricks and other technologies offered by Microsoft Azure or open-source toolsets. This role will also help automate the end-to-end cycle with Azure Pipelines.
You will be part of a collaborative, interdisciplinary team working around data, where you will be responsible for the continuous delivery of our statistical/ML models. You will work closely with process owners, product owners and end business users. This will give you the right visibility into your developments and an understanding of how critical they are.
Responsibilities
- Deliver key Advanced Analytics/Data Science projects within time and budget, particularly around DevOps/MLOps and the Machine Learning models in scope.
- Actively contribute to code and development in projects and services.
- Partner with data engineers to ensure data access for discovery and that proper data is prepared for model consumption.
- Partner with ML engineers working on industrialization.
- Communicate with business stakeholders in the process of service design, training and knowledge transfer.
- Support large-scale experimentation and build data-driven models.
- Refine requirements into modelling problems.
- Influence product teams through data-based recommendations.
- Research state-of-the-art methodologies.
- Create documentation for learnings and knowledge transfer.
- Create reusable packages or libraries.
- Ensure on-time and on-budget delivery that satisfies project requirements while adhering to enterprise architecture standards
- Leverage big data technologies to help process data and build scaled data pipelines (batch to real time)
- Implement end-to-end ML lifecycle with Azure Databricks and Azure Pipelines
- Automate ML model deployments
Qualifications
- BE/B.Tech in Computer Science, Maths, or related technical fields.
- Overall 2-4 years of experience working as a Data Scientist.
- 2+ years’ experience building solutions in the commercial or supply chain space.
- 2+ years working in a team to deliver production-level analytic solutions. Fluent in Git (version control). Understanding of Jenkins and Docker is a plus.
- Fluent in SQL syntax.
- 2+ years’ experience in Statistical/ML techniques to solve supervised (regression, classification) and unsupervised problems.
- 2+ years’ experience developing statistical/ML models for business problems using industry tools, with a primary focus on Python or PySpark development.
- Data Science – Hands-on experience and strong knowledge of building machine learning models, both supervised and unsupervised. Knowledge of time series/demand forecasting models is a plus
- Programming Skills – Hands-on experience in statistical programming languages like Python and PySpark, and database query languages like SQL
- Statistics – Good applied statistical skills, including knowledge of statistical tests, distributions, regression, and maximum likelihood estimators
- Cloud (Azure) – Experience in Databricks and ADF is desirable
- Familiarity with Spark, Hive, and Pig is an added advantage
- Business storytelling and communicating data insights in a business-consumable format. Fluent in one visualization tool.
- Strong communications and organizational skills with the ability to deal with ambiguity while juggling multiple priorities
- Experience with Agile methodology for teamwork and analytics ‘product’ creation.
- Experience in Reinforcement Learning is a plus.
- Experience in Simulation and Optimization problems in any space is a plus.
- Experience with Bayesian methods is a plus.
- Experience with causal inference is a plus.
- Experience with NLP is a plus.
- Experience with Responsible AI is a plus.
- Experience with distributed machine learning is a plus
- Experience in DevOps, including hands-on experience with one or more cloud service providers: AWS, GCP, Azure (preferred)
- Model deployment experience is a plus
- Experience with version control systems like GitHub and CI/CD tools
- Experience in exploratory data analysis
- Knowledge of MLOps/DevOps and deploying ML models is preferred
- Experience using MLflow, Kubeflow, etc. is preferred
- Experience executing and contributing to MLOps automation infrastructure is good to have
- Exceptional analytical and problem-solving skills
- Stakeholder engagement: BU, vendors.
- Experience building statistical models in the Retail or Supply chain space is a plus