The Data Engineer Lead will be responsible for delivering data pipelines and associated data service components. The data engineer will need to design pipelines that are highly efficient, performant and reliable, and will develop custom components to ensure the privacy and integrity of the data. As the data ocean is established, the data engineer will also expose data through a common interface, allowing the monetization of data across the whole of Vodafone in a controlled and secure fashion. Key focus areas for the engineer will be:
Work with the Scrum Master, Product Owner and Test Engineer in an Agile squad to create data pipelines for each data source.
Integrate diverse, distributed data from multiple sources, transforming it into a common central format and replacing existing data sources with a central solution on Google Cloud Platform.
Build data pipelines that process large volumes of data and generate outputs that enable commercial and business actions delivering incremental value.
Create semantic layers for Business Intelligence users and feature engineering for data science.
Deliver and implement core capabilities (frameworks, platform, development infrastructure, documentation, guidelines and support) to speed up delivery.
Coach and mentor other data engineers in the team to ensure industry best practices are followed and the implementation is aligned with the guidelines and principles published by the Analytics Center of Excellence team.
Expert-level experience designing, building and managing data pipelines that process large volumes of data in a Big Data ecosystem.
Experience building data pipelines and data layers on Google Cloud Platform.
Experience building data pipelines for real-time and batch data processing using Cloud Data Fusion, CDAP, Pub/Sub and Dataflow.
Experience with the common SDLC, including SCM, build tools, unit testing, continuous delivery and Agile practices.
Experience working with SFTP, file-based data processing, REST APIs and Oracle.
Experience creating ETL pipelines covering extraction and transformation.
Experience with Google Tink and encryption.
Very good technical and analytical knowledge.
Experience with Google Cloud Platform.
Google Data Engineering certification.
Expert-level experience with the Hadoop ecosystem (Spark, Hive/Impala, HBase, YARN).
Strong software development experience in the Java, Scala and Python programming languages.
Experience with Java plugin development.
Experience with Unix-based systems, including Bash scripting.
Experience with large-volume data movement.
Experience with BigQuery datasets and Google Cloud Storage.
Experience with SQL, Oracle, REST APIs and CSV.
Experience working in an Agile environment.