Equifax is looking for a motivated and enthusiastic Junior Python Developer to join our team. In this role, you will learn to develop, maintain, and optimize Python-based web scraping solutions. You will gain experience using libraries like Beautiful Soup, Scrapy, requests, Playwright, and Selenium to collect, clean, and structure data from various sources, ensuring its accuracy for analysis or automation.
This position involves learning to extract, refine, and process data from online sources, while understanding challenges such as dynamic content and anti-bot measures. The ideal candidate will have a foundational proficiency in Python, some exposure to web scraping concepts, and a basic understanding of HTML, CSS, and how browsers work.
What you'll do
- Write clean Python code to extract data from websites, ensuring efficiency, accuracy, and adherence to best practices.
- Handle structured and unstructured data from various sources, cleaning and transforming it into usable formats.
- Identify and resolve issues related to website changes, access restrictions, and performance bottlenecks.
- Work with other developers, data scientists, and stakeholders to understand data requirements and ensure proper documentation of scraping processes.
- Continuously learn new web scraping tools and techniques, and adapt to changes in website structures.
- Implement testing procedures for Python-based web scraping scripts.
- Adhere to web scraping best practices and legal standards, avoiding issues such as CAPTCHA challenges and IP blocking.
What experience do you need
- Bachelor’s degree in Computer Science, Software Engineering, Information Technology, or a related subject.
- 1-2 years of commercial experience in Python coding and scripting.
- 1+ years of experience developing with JavaScript, HTML, and CSS, including knowledge of HTML structure for entity extraction.
- Any experience with web scraping, or a conceptual understanding of it, is a plus.
- English proficiency B1+ or above.
What could set you apart
- Web crawling/scraping experience.
- Proficiency in Google Cloud Platform (GCP) services or equivalent cloud platforms.
- Network traffic understanding or experience.
- Proficiency in Git and experience with both relational and non-relational databases.
- Experience working with diverse data sources and formats.
- Familiarity with CI/CD pipelines.
Primary Location: CRI-Sabana
Function: Tech Dev and Client Services
Schedule: Full time