Senior / Lead / Principal Data Engineer (Experienced)

Salesforce

Job Title: Principal / Lead / Senior Data Engineer

Location: San Francisco, CA

  • If you are currently in college or have less than a year of experience, please check out the Future Force job opportunities at Salesforce:

https://www.salesforce.com/company/careers/university-recruiting/

The team is made up of data scientists, engineers, growth analysts, and information management experts who are dedicated to driving product strategy with data-driven insights. The team works with executives, product managers, designers, developers, user researchers, marketers, and sales strategy team members across all Cloud businesses to discover new opportunities for growth and optimization, experiment with data, drive adoption, and provide actionable insights that impact product strategy.

Role Description:

The Data Engineer will be responsible for designing, developing, and maintaining all parts of the data pipeline to build the interactive and curated data needed to drive insights through data science, reporting, and analytics. The role requires partnering with Data Scientists, Analysts, and Visualization and Information Management experts within Salesforce. It involves making an impact by driving continuous improvements in moving, aggregating, profiling, sampling, testing, and analyzing terabytes of data.

Responsibilities:

  • Own the technical solution design, lead the technical architecture and implementation of data acquisition and integration projects, both batch and real time. Define the overall solution architecture needed to implement a layered data stack that ensures a high level of data quality and timely insights.
  • Communicate with product owners and analysts to clarify requirements. Craft technical solutions and assemble design artifacts (functional design documents, data flow diagrams, data models, etc.).
  • Build data pipelines and data processing tools using open source and proprietary technologies such as Oracle, Hadoop, Pig, Hive, HBase, Spark, Salesforce API, and Talend big data fabric.
  • Serve the team as a subject matter expert & mentor for Hadoop ecosystem products, ETL design, and other related big data and programming technologies.
  • Identify incomplete data, improve quality of data, and integrate data from several data sources.
  • Proactively identify performance & data quality problems and drive the team to remediate them. Advocate architectural and code improvements to the team to improve execution speed and reliability.
  • Design and develop tailored data structures in database and Hadoop.
  • Quickly create functioning ETL prototypes to address rapidly changing business needs.
  • Revamp prototypes to create production-ready data flows.
  • Support Data Science research by designing, developing, and maintaining all parts of the Big Data pipeline for reporting, statistical and machine learning, and computational requirements.
  • Perform data profiling, complex sampling, statistical testing, and testing of reliability on data.
  • Clearly articulate pros and cons of various technologies and platforms in open source and proprietary products. Execute proof of concept on new technology and tools to help the organization pick the best tools and solutions.
  • Drive operational excellence and continuous improvement with a can-do leadership attitude.

Job Requirements:

  • BS/MS degree in Computer Science, Engineering, Statistics, Mathematics, Physics, Operations Research, Econometrics, or equivalent/related degree.
  • 4+ years of intensive experience with large-scale data delivery platforms, solutioning and designing modern data systems to support exponential data growth, and mentoring technical team members.
  • Previous projects should display technical leadership with an emphasis on data lake and data warehouse solutions, business intelligence, big data analytics, and enterprise-scale custom data products.
  • Familiarity with new big data management techniques of schema on read, search analytics, graph analytics, semantic data lakes, linked data, etc.
  • Knowledge of data modeling techniques and high-volume ETL/ELT design.
  • Strong SQL optimization and performance tuning experience in a high volume data environment that utilizes parallel processing. Hadoop, Spark, Teradata platform experience a plus.
  • Experience with version control systems (GitHub, Subversion) and deployment tools (e.g. continuous integration) required.
  • Proven hands-on experience with big data technologies like Hadoop MapReduce, Spark, Hive, Pig, HBase, Oozie, Elasticsearch, Talend big data fabric, and others.
  • Experience with programming languages like Java, Scala, and C++, and with scripting in Python, Perl, and Bash.
  • Experience working with public cloud platforms like GCP, AWS, or Azure.
  • Familiarity with agile project management methodologies and SDLC stages required.
  • TDWI/Hadoop/Spark/ML certification is a plus.
  • Hands-on knowledge of Salesforce.com products and functionality is a plus.
  • Ability to work effectively in an unstructured and fast-paced environment, both independently and in a team setting, with a high degree of self-management, clear communication, and commitment to delivery timelines.
  • Strong problem-solving skills with acute attention to detail and the ability to meet tight deadlines and project plans.
  • Ability to research, analyze, interpret, and produce accurate results within reasonable turnaround times, using an iterative mindset and rapid prototyping.

Salesforce, the Customer Success Platform and world's #1 CRM, empowers companies to connect with their customers in a whole new way. The company was founded on three disruptive ideas: a new technology model in cloud computing, a pay-as-you-go business model, and a new integrated corporate philanthropy model. These founding principles have taken our company to great heights, including being named one of Forbes’s “World’s Most Innovative Company” five years in a row and one of Fortune’s “100 Best Companies to Work For” eight years in a row. We are the fastest growing of the top 10 enterprise software companies, and this level of growth equals incredible opportunities to grow a career at Salesforce. Together, with our whole Ohana (Hawaiian for "family") made up of our employees, customers, partners and communities, we are working to improve the state of the world.

Salesforce.com and Salesforce.org are Equal Employment Opportunity and Affirmative Action Employers. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender perception or identity, national origin, age, marital status, protected veteran status, or disability status. Headhunters and recruitment agencies may not submit resumes/CVs through this Web site or directly to managers. Salesforce.com and Salesforce.org do not accept unsolicited headhunter and agency resumes. Salesforce.com and Salesforce.org will not pay fees to any third-party agency or company that does not have a signed agreement with Salesforce.com or Salesforce.org.
