Overview

About the Role & Team

IDC is seeking a full-time Data Analytics Web Scraping Engineer for our Webscraping and Data Harvesting Team based in Ostrava, Czech Republic. In this role you will support an established team that focuses on web crawling and gathering data from the Internet. The primary responsibilities include deploying web crawling technology to collect structured and unstructured data from various sites on a defined schedule, then cleaning, classifying, validating, and unifying that data according to business rules and taxonomy. The role also involves enriching the data with other information and integrating it into existing products and internal business processes.

What You’ll Do

  • Assist in web crawling and data gathering for our largest data product line.
  • Support the evaluation, creation, and deployment of web crawling technology.
  • Help develop machine learning algorithms with a focus on Natural Language Processing to clean, classify, and match gathered data to existing taxonomy.
  • Collaborate with internal business stakeholders to integrate scraped data into existing research processes and proprietary systems.
  • Work cross-departmentally to define metrics, guidelines, and strategies for measuring data coverage and quality.
  • Contribute to a global team in designing and building new products that aggregate and visualize scraped data from various sources.

What You Bring

  • Bachelor's Degree or equivalent in Mathematics, Computer Science, Statistics, Information Management, or IT in Economics.
  • Experience in data engineering or roles related to data engineering.
  • Demonstrated knowledge of object-oriented programming in Python.
  • Strong analytic skills related to working with unstructured datasets.
  • SQL knowledge and experience working with relational databases.
  • Proven ability to work independently and ensure completion of tasks accurately and on time.
  • Strong English communication skills in both verbal and written form.
  • Openness to learning new technologies and tools.

Preferred Qualifications

  • 1+ years of experience in machine learning or natural language processing.
  • Experience using technologies and tools such as Browse.ai, Python Scrapy, Octoparse, Beautiful Soup, Mozenda, Pandas, NLTK, PostgreSQL/Snowflake, and JavaScript (jQuery).

Why This Role Stands Out

At IDC, your work helps shape how the world understands technology and where it goes next. You collaborate with curious, high-caliber colleagues who value rigor, integrity, and shared success. As the premier global provider of trusted technology intelligence, IDC equips business and technology leaders with the evidence they need to make confident decisions. Our insights inform strategy, investment, and innovation across industries and regions.

Recognized by IIAR as Analyst Firm of the Year for five consecutive years, IDC sets the standard for credibility and impact. With more than 1,000 analysts worldwide and a truly global perspective, we combine deep expertise with practical relevance. Here, your ideas matter, your voice is heard, and your contributions provide the insights leaders rely on every day. It is meaningful work, backed by a culture that supports growth, collaboration, and long-term career development with a globally respected brand.

What We Offer

  • 5 weeks of holidays + extra corporate days off
  • Sick days
  • Flexibility to work from home most of the week
  • Some flexibility in scheduling your working hours
  • Cafeteria system (use points on Flexipasses, pension/life insurance, or Multisport card)

Equal Opportunity Employer

IDC is committed to providing equal employment opportunities for all qualified persons. Employment eligibility verification is required. We participate in E-Verify.

Confirmed 8 hours ago. Posted 8 days ago.
