Databricks Engineer

AlxTel, Inc.

Silver Spring, MD
Full Time
Paid
    Benefits:

    401(k)

    Dental insurance

    Health insurance

    Paid time off

    Summary

    You will be responsible for the design, construction, and operation of a Data & AI platform centered around the Medallion Architecture (raw/bronze, curated/silver, and mart/gold layers). This role is critical in orchestrating complex data workflows and scalable ELT pipelines to integrate data from enterprise systems like PeopleSoft, D2L, and Salesforce, ensuring the delivery of high-quality, governed data suitable for machine learning, AI/BI, and analytics at scale.

    Responsibilities

    1. Data & AI Platform Engineering (Databricks-Centric):

    ◦ Design, implement, and optimize end-to-end data pipelines on Databricks, adhering to the Medallion Architecture principles.

    ◦ Build robust and scalable ETL/ELT pipelines using Apache Spark and Delta Lake to transform raw data (bronze) into trusted curated (silver) and analytics-ready (gold) data layers.

    ◦ Operationalize Databricks Workflows for orchestration, dependency management, and pipeline automation.

    ◦ Apply schema evolution and data versioning to support agile data development.
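    The bronze-to-silver-to-gold flow described above can be sketched in plain Python. This is an illustrative simplification under assumed record shapes (`student_id`, `course`, `grade` are hypothetical fields); in practice these steps would run as Apache Spark transformations writing Delta Lake tables.

```python
def to_silver(bronze_records):
    """Bronze -> silver: drop malformed rows, normalize values, deduplicate."""
    seen = set()
    silver = []
    for rec in bronze_records:
        if not rec.get("student_id") or rec.get("course") is None:
            continue  # discard malformed raw rows
        key = (rec["student_id"], rec["course"])
        if key in seen:
            continue  # deduplicate on the natural key
        seen.add(key)
        silver.append({
            "student_id": rec["student_id"],
            "course": rec["course"].strip().upper(),
            "grade": float(rec.get("grade", 0)),
        })
    return silver

def to_gold(silver_records):
    """Silver -> gold: aggregate to an analytics-ready mart (avg grade per course)."""
    totals = {}
    for rec in silver_records:
        n, s = totals.get(rec["course"], (0, 0.0))
        totals[rec["course"]] = (n + 1, s + rec["grade"])
    return {course: s / n for course, (n, s) in totals.items()}

bronze = [
    {"student_id": "s1", "course": " math101 ", "grade": "90"},
    {"student_id": "s1", "course": " math101 ", "grade": "90"},  # duplicate raw row
    {"student_id": "",   "course": "math101",   "grade": "70"},  # malformed raw row
    {"student_id": "s2", "course": "MATH101",   "grade": "80"},
]
silver = to_silver(bronze)
gold = to_gold(silver)
```

    With Delta Lake, the same cleaning and aggregation logic would additionally benefit from ACID writes, time travel, and schema enforcement on each layer.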

    2. Platform Integration & Data Ingestion:

    ◦ Connect and ingest data from enterprise systems such as PeopleSoft, D2L, and Salesforce using APIs, JDBC, or other integration frameworks.

    ◦ Design standardized data ingestion processes with automated error handling, retries, and alerting.
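    A standardized ingestion wrapper with automated retries and alerting, as described above, might look like the following minimal sketch. The `fetch` callable and alert channel are placeholders; a real pipeline would call a PeopleSoft, D2L, or Salesforce API (or a JDBC source) and route alerts to a monitoring system.

```python
import time

def ingest_with_retries(fetch, max_attempts=3, base_delay=1.0, alert=print):
    """Call fetch() with exponential backoff; alert and re-raise on final failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            if attempt == max_attempts:
                alert(f"ingestion failed after {attempt} attempts: {exc}")
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # backoff: 1s, 2s, 4s, ...

# Hypothetical flaky source that succeeds on the third call.
calls = {"n": 0}
def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient source error")
    return [{"id": 1}]

rows = ingest_with_retries(flaky_fetch, max_attempts=5, base_delay=0)
```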

    3. Data Quality, Monitoring, and Governance:

    ◦ Implement data masking, tokenization, and anonymization for compliance with privacy regulations (e.g., GDPR, FERPA).

    ◦ Work with security teams to audit and certify compliance controls.
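    Masking and tokenization of the kind mentioned above can be sketched with the standard library. The key below is a placeholder for illustration only; production keys would be managed by a secrets service (e.g. Azure Key Vault or AWS KMS), and the masking rules would follow the applicable GDPR/FERPA policy.

```python
import hmac
import hashlib

SECRET_KEY = b"placeholder-key"  # hypothetical; never hard-code real keys

def tokenize(value: str) -> str:
    """Deterministic pseudonym: same input -> same token, keyed so it
    cannot be recomputed without access to the secret."""
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

def mask_email(email: str) -> str:
    """Partial masking for display: keep first character and the domain."""
    local, _, domain = email.partition("@")
    return f"{local[:1]}***@{domain}"

token = tokenize("123-45-6789")
masked = mask_email("jdoe@example.edu")
```

    Deterministic tokenization preserves joinability across tables (the same identifier always maps to the same token), which is why it is often preferred over random masking for analytics layers.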

    4. AI/ML-Ready Data Foundation:

    ◦ Support AIOps/MLOps lifecycle workflows using MLflow for experiment tracking, model registry, and deployment within Databricks.

    ◦ Create reusable feature stores and training pipelines in collaboration with AI/ML teams.
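    A reusable feature-building step of the kind a feature store pipeline would materialize can be illustrated in plain Python. Entity and feature names here are hypothetical; on Databricks this logic would typically run over silver tables and be registered alongside MLflow-tracked training runs.

```python
from datetime import date

def build_student_features(enrollments, as_of):
    """Compute per-student features from curated (silver) enrollment rows."""
    acc = {}
    for row in enrollments:
        f = acc.setdefault(row["student_id"], {"n_courses": 0, "grade_sum": 0.0})
        f["n_courses"] += 1
        f["grade_sum"] += row["grade"]
    # Stamp each feature row with its as-of date for point-in-time correctness.
    return {
        sid: {
            "as_of": as_of.isoformat(),
            "n_courses": f["n_courses"],
            "avg_grade": f["grade_sum"] / f["n_courses"],
        }
        for sid, f in acc.items()
    }

feats = build_student_features(
    [{"student_id": "s1", "grade": 90.0}, {"student_id": "s1", "grade": 80.0}],
    as_of=date(2024, 1, 1),
)
```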

    5. Cloud Data Architecture and Storage:

    ◦ Architect and manage data lakes on Azure Data Lake Storage (ADLS) or Amazon S3.

    ◦ Optimize data storage and access patterns for performance and cost-efficiency.

    Required Qualifications

    • Hands-on experience with Databricks, Delta Lake, and Apache Spark for large-scale data engineering.

    • Deep understanding of ELT pipeline development, orchestration, and monitoring in cloud-native environments.

    • Experience implementing Medallion Architecture (Bronze/Silver/Gold) and working with data versioning and schema enforcement in enterprise-grade environments.

    • Strong proficiency in SQL, Python, or Scala for data transformations and workflow logic.

    • Proven experience integrating enterprise platforms (e.g., PeopleSoft, Salesforce, D2L) into centralized data platforms.

    • Familiarity with data governance, lineage tracking, and metadata management tools.

    Preferred Qualifications

    • Experience with Databricks Unity Catalog for metadata management and access control.

    • Experience deploying ML models at scale using MLflow or similar MLOps tools.

    • Familiarity with cloud platforms like Azure or AWS, including storage, security, and networking aspects.

    • Knowledge of data warehouse design and star/snowflake schema modeling.