About Us
We are a fast-growing startup seeking a talented Backend Engineer to join our team in a fully remote capacity. Our company operates a large-scale web scraping and crawling system, and we're looking for someone to take on a crucial role in overseeing and improving it.
Position Overview
We are seeking a full-time Backend Python/Node.js Engineer with extensive web scraping experience to oversee our entire web scraping system. This system currently covers 20 different domains and processes millions of requests weekly. The ideal candidate will have a strong background in backend engineering and be comfortable working with very large datasets.
Key Responsibilities
Oversee and manage our extensive web scraping system
Optimize and scale our web scraping operations to handle millions of requests weekly
Develop and maintain web scraping scripts using Puppeteer (Node.js) and Scrapy (Python)
Perform data cleansing, validation, and processing on large datasets
Implement best practices for data storage, retrieval, and analysis
Collaborate with cross-functional teams to integrate scraped data into our products and services
Continuously improve the efficiency and reliability of our web scraping infrastructure
Stay up to date with the latest web scraping techniques and technologies
Required Qualifications
Bachelor's degree in Computer Science, Software Engineering, or a related field
Proven experience as a Backend Engineer, with a strong focus on web scraping
Extensive experience with Puppeteer (Node.js) and Scrapy (Python)
Proficiency in working with large datasets, including data cleansing and validation
Strong understanding of web technologies, including HTTP and HTML/CSS
Experience with distributed systems and high-volume data processing
Excellent problem-solving skills and attention to detail
Strong communication skills and ability to work in a remote team environment
Preferred Qualifications
Experience with cloud computing platforms (e.g., AWS, Google Cloud, Azure)
Knowledge of anti-bot detection techniques and how to overcome them ethically
Familiarity with data storage solutions (e.g., SQL and NoSQL databases)
Experience with data pipeline tools and ETL processes
Understanding of legal and ethical considerations in web scraping
If you're passionate about web scraping, enjoy working with large datasets, and are looking for a challenging role in a dynamic startup environment, we want to hear from you!