ML Data Engineer #978695

Dexian

Job Title: Data Engineer – AI/ML Pipelines

Location: Seffner, FL

Work Model: Hybrid

Duration: Contract-to-hire (CTH)

Position Summary

The Data Engineer – AI/ML Pipelines plays a key role in designing, building, and maintaining scalable data infrastructure that powers analytics and machine learning initiatives. This position focuses on developing production-grade data pipelines that support end-to-end ML workflows, from data ingestion and transformation to feature engineering, model deployment, and monitoring.

The ideal candidate has hands-on experience working with operational systems such as Warehouse Management Systems (WMS) or ERP platforms, and is comfortable partnering closely with data scientists, ML engineers, and operational stakeholders to deliver high-quality, ML-ready datasets.

Key Responsibilities

ML-Focused Data Engineering

  • Build, optimize, and maintain data pipelines specifically designed for machine learning workflows.
  • Collaborate with data scientists to develop feature sets, implement data versioning, and support model training, evaluation, and retraining cycles.
  • Participate in initiatives involving feature stores, model input validation, and monitoring of data quality feeding ML systems.

Data Integration from Operational Systems

  • Ingest, normalize, and transform data from WMS, ERP, telemetry, and other operational data sources.
  • Model and enhance operational datasets to support real-time analytics and predictive modeling use cases.

Pipeline Automation & Orchestration

  • Build automated, reliable, and scalable pipelines using tools such as Azure Data Factory, Airflow, or Databricks Workflows.
  • Ensure data availability, accuracy, and timeliness across both batch and streaming systems.

Data Governance & Quality

  • Implement validation frameworks, anomaly detection, and reconciliation processes to ensure high-quality ML inputs.
  • Support metadata management, lineage tracking, and documentation of governed, auditable data flows.

Cross-Functional Collaboration

  • Work closely with data scientists, ML engineers, software engineers, and business teams to gather requirements and deliver ML-ready datasets.
  • Translate modeling and analytics needs into efficient, scalable data architecture solutions.

Documentation & Mentorship

  • Document data flows, data mappings, and pipeline logic in a clear, reproducible format.
  • Provide guidance and mentorship to junior engineers and analysts on ML-focused data engineering best practices.

Required Qualifications

Technical Skills

  • Strong experience building ML-focused data pipelines, including feature engineering and model lifecycle support.
  • Proficiency in Python, SQL, and modern data transformation tools (dbt, Spark, Delta Lake, or similar).
  • Solid understanding of orchestrators and cloud data platforms (Azure, Databricks, etc.).
  • Familiarity with ML operations tools such as MLflow, TFX, or equivalent frameworks.
  • Hands-on experience working with WMS or operational/logistics data.

Experience

  • 5+ years in data engineering, with at least 2 years directly supporting AI/ML applications or teams.
  • Experience designing and maintaining production-grade pipelines in cloud environments.
  • Proven ability to collaborate with data scientists and translate ML requirements into scalable data solutions.

Education & Credentials

  • Bachelor’s degree in Computer Science, Data Engineering, Data Science, or a related field (Master’s preferred).
  • Relevant certifications are a plus (e.g., Azure AI Engineer, Databricks ML, Google Professional Data Engineer).

Preferred Qualifications

  • Experience with real-time ingestion using Kafka, Kinesis, Azure Event Hubs, or similar.
  • Exposure to MLOps practices and CI/CD for data pipelines.
  • Background in logistics, warehousing, fulfillment, or similar operational domains.
Confirmed 9 hours ago. Posted 7 days ago.
