Exciting opportunity to join a $100M+ home services company building the next generation of AI-driven solutions. As the Founding Data Engineer, you will design and implement the entire data infrastructure that powers AI across the business. You’ll work with high-volume, messy data from multiple sources, clean and transform it, and make it accessible for advanced analytics and machine learning.
Key Responsibilities
- Architect and maintain scalable data infrastructure from scratch.
- Collect and integrate large datasets from APIs, relational databases, and external sources.
- Build ETL pipelines to clean, transform, and prepare data for AI models.
- Implement data storage solutions (data lakes, warehouses) for accessibility and performance.
- Ensure data quality, governance, and reliability across all pipelines.
- Collaborate with AI/ML teams to deliver production-ready datasets.
Technologies & Skills
Experience with a combination of the following is expected (not all are required):
- Programming: Python (Pandas, NumPy), SQL
- Databases: Relational (PostgreSQL, MySQL)
- Data Warehousing / Lakehouse: Snowflake or Databricks (Delta Lake)
- ETL & Workflow Orchestration: Apache Airflow, dbt
- API Integration: Strong experience consuming REST and GraphQL APIs for data ingestion
- Cloud Platforms: AWS, Azure, or GCP
- Data Quality: Great Expectations or similar
- Version Control & CI/CD: Git, GitHub Actions
Ideal Candidate
- 4+ years in data engineering or similar roles.
- Proven experience with high-volume data ingestion and cleaning.
- Strong understanding of data architecture and pipeline design.
- Comfortable in a fast-paced startup environment.
- Passionate about building systems that power AI.