Senior Java Developer (Web Scraping)

Equifax

We are seeking an experienced and strategic Senior Java Developer specializing in web scraping to lead our data acquisition efforts. This role is pivotal in designing, developing, and optimizing the sophisticated, large-scale scraping solutions that fuel our core business intelligence and analytics platforms.

As a senior member of the team, you will not only tackle the most complex technical challenges but also contribute to our overall data strategy, mentor junior developers, and set the standard for quality and reliability. The ideal candidate is a master of the Java ecosystem, an expert in navigating advanced anti-bot measures, and a proactive leader passionate about building resilient, high-throughput data systems.

What you'll do

  • Lead Scraper Development: Spearhead the development, maintenance, and continuous optimization of our Java-based web scraping solutions, ensuring they are scalable, resilient, and efficient.
  • Ensure Data Quality and Integrity: Implement and enforce robust data validation and quality assurance frameworks to guarantee the utmost accuracy and completeness of all extracted information, meeting stringent business standards.
  • Architect for Scale: Design and optimize web scrapers for high-throughput performance, capable of handling millions of records daily while maintaining stability and efficiency.
  • Strategic Contribution: Actively contribute to defining and implementing the organization's data acquisition strategy, ensuring tight alignment with overarching business goals and objectives.
  • Advanced Problem-Solving: Tackle and solve the most difficult scraping challenges, including navigating dynamic content, evading sophisticated anti-bot countermeasures, and reverse-engineering data sources.
  • Documentation and Best Practices: Create and maintain comprehensive documentation for scraper logic, configurations, and data extraction processes to ensure maintainability and facilitate knowledge sharing across the team.
  • Mentorship and Guidance: Provide mentorship and technical guidance to junior and mid-level team members, fostering their growth and elevating the team's overall technical capabilities.
  • Compliance and Ethics: Uphold a solid understanding of the legal and ethical considerations related to web scraping, including data privacy regulations (e.g., GDPR, CCPA) and website terms of service, ensuring all operations are compliant.
  • Cross-Functional Collaboration: Collaborate effectively with data analysts, product managers, and other key stakeholders to translate complex data requirements into high-quality, reliable technical solutions.

What experience do you need

  • A Bachelor's or Master's degree in Computer Science, Software Engineering, Information Technology, or a related field.
  • 5+ years of demonstrated expertise in Java development with a significant focus on data extraction and processing, including at least 3 years of hands-on experience with the Java web scraping ecosystem: deep knowledge of parsing libraries such as Jsoup, and extensive experience with browser automation frameworks such as Selenium, HtmlUnit, or Jauntium.
  • 5+ years of experience with large-scale crawling frameworks such as Crawler4j or WebMagic for managing complex, multi-domain data acquisition projects, plus demonstrated experience with advanced anti-bot evasion techniques, including proxy management solutions (e.g., rotating residential proxies) and CAPTCHA-solving strategies.
  • Proven history of managing and processing large, complex datasets, including robust data validation and quality control, with at least one year of experience in a related complex domain such as reverse engineering, distributed systems, or data analysis.
  • English proficiency of B2 minimum.
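
To give a flavor of the custom request logic the role calls for, here is a minimal, hedged sketch using the native Java 11+ HttpClient mentioned among the qualifications. The target URL and User-Agent string are placeholders invented for illustration, and the request is built and inspected without being sent:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class RequestSketch {
    public static void main(String[] args) {
        // Client configured with an explicit connect timeout and HTTP/2 preference.
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(10))
                .build();

        // Custom headers (e.g., a realistic User-Agent) are typical in scraping;
        // the URL here is a hypothetical placeholder, not a real data source.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://example.com/listings?page=1"))
                .timeout(Duration.ofSeconds(15))
                .header("User-Agent", "Mozilla/5.0 (compatible; DemoBot/1.0)")
                .header("Accept", "text/html")
                .GET()
                .build();

        // Inspect the built request without sending it.
        System.out.println(request.method() + " " + request.uri());
        System.out.println("UA: " + request.headers().firstValue("User-Agent").orElse(""));
    }
}
```

In production code the same builder pattern would be extended with per-request timeouts, proxy selectors, and retry logic around `client.send(...)`.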

What could set you apart

  • Knowledge of asynchronous and multi-threaded programming in Java to build highly concurrent applications.
  • Experience with Java HTTP clients like Apache HttpClient or the native Java 11+ HttpClient for building custom, high-performance request logic.
  • Familiarity with CI/CD pipelines and tools like Airflow or Jenkins for automated testing and deployment of scraping solutions.
  • Practical experience implementing data quality frameworks and monitoring systems within a cloud environment.
  • Familiarity with deploying and managing scraping infrastructure on cloud platforms (e.g., AWS, Google Cloud, Azure).
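
The asynchronous and multi-threaded programming item above can be sketched with a small, self-contained example. This is only an illustration under stated assumptions: the `fetch` method simulates an HTTP call rather than issuing one, and the URLs are placeholders, not data sources referenced by this posting:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ConcurrentFetchSketch {
    // Simulated fetch: a real scraper would issue an HTTP request here.
    static String fetch(String url) {
        return "<html>" + url + "</html>";
    }

    public static void main(String[] args) {
        List<String> urls = List.of("https://example.com/a",
                                    "https://example.com/b",
                                    "https://example.com/c");

        // A bounded thread pool caps how many requests run at once,
        // which is how a polite scraper limits load on any single site.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<CompletableFuture<String>> futures = urls.stream()
                    .map(u -> CompletableFuture.supplyAsync(() -> fetch(u), pool))
                    .collect(Collectors.toList());

            // Join the futures in submission order once all fetches complete.
            for (CompletableFuture<String> f : futures) {
                System.out.println(f.join());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The same structure scales up by swapping the simulated `fetch` for real HttpClient calls and adding per-domain rate limiting.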

Primary Location:

CRI-Sabana

Function:

Function - Tech Dev and Client Services

Schedule:

Full time

Confirmed 6 hours ago. Posted 3 days ago.