Lead Data Engineer - AI-Powered Solutions

Strategic Employment

Lancaster, PA
Full Time
Paid
  • Responsibilities

    Exciting opportunity to join a $100M+ home services company building the next generation of AI-driven solutions. As the Founding Data Engineer, you will design and implement the entire data infrastructure that powers AI across the business. You’ll work with high-volume, messy data from multiple sources, clean and transform it, and make it accessible for advanced analytics and machine learning.

    Key Responsibilities

    • Architect and maintain scalable data infrastructure from scratch.
    • Collect and integrate large datasets from APIs, relational databases, and external sources.
    • Build ETL pipelines to clean, transform, and prepare data for AI models.
    • Implement data storage solutions (data lakes, warehouses) for accessibility and performance.
    • Ensure data quality, governance, and reliability across all pipelines.
    • Collaborate with AI/ML teams to deliver production-ready datasets.

    Technologies & Skills - A combination of the following is expected (not all are required):

    • Programming: Python (Pandas, NumPy), SQL
    • Databases: Relational (PostgreSQL, MySQL)
    • Data Warehousing / Lakehouse: Snowflake or Databricks (Delta Lake)
    • ETL & Workflow Orchestration: Apache Airflow, dbt
    • API Integration: Strong experience consuming APIs for data ingestion (REST, GraphQL)
    • Cloud Platforms: AWS, Azure, or GCP
    • Data Quality: Great Expectations or similar
    • Version Control & CI/CD: Git, GitHub Actions

    Ideal Candidate

    • 4+ years of experience in data engineering or a similar role.
    • Proven experience with high-volume data ingestion and cleaning.
    • Strong understanding of data architecture and pipeline design.
    • Comfortable in a fast-paced startup environment.
    • Passionate about building systems that power AI.