Job Description
Primary Responsibility: Build a resilient web scraping system that discovers, extracts, and regularly refreshes scholarship data from a wide range of sources, including lesser-known but safe sites.
Skills Required:
Python (Scrapy, BeautifulSoup, Selenium)
Async scraping architecture (e.g., asyncio, aiohttp)
CAPTCHA bypassing & rate-limiting techniques
Link safety & verification checks
URL normalization and deduplication
Key Deliverables:
A scraping engine that can scale to thousands of sources
Scheduler for regular refresh & dead-link detection
Normalized scholarship data ingestion system (CSV/JSON ready)
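Illustrative example: to give candidates a sense of the URL normalization and deduplication skill listed above, here is a minimal sketch using only the Python standard library. The names normalize_url, deduplicate, and TRACKING_PARAMS, the list of tracking parameters, and the example.org URLs are assumptions made for illustration, not part of any existing codebase; the actual rules would be agreed during the project.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of tracking parameters to drop; adjust per project.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content", "fbclid", "gclid"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` so equivalent links compare equal.

    Simplified on purpose: userinfo is dropped and IPv6 hosts are not handled.
    """
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it is not the default for the scheme.
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"
    # Treat an empty path as "/" and strip trailing slashes elsewhere.
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest for a stable order.
    pairs = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k.lower() not in TRACKING_PARAMS]
    query = urlencode(sorted(pairs))
    # The fragment is discarded because it does not change the resource.
    return urlunsplit((scheme, netloc, path, query, ""))


def deduplicate(urls):
    """Return normalized URLs in first-seen order, without duplicates."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

For instance, deduplicate(["http://example.org", "http://EXAMPLE.org/", "http://example.org:80/#x"]) collapses the three inputs into a single entry, since they differ only in case, default port, trailing slash, and fragment.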