ETL/Data Engineer

Vergence

Indianapolis, IN
Full Time
Paid
  • Responsibilities

    Vergence is seeking a Senior Azure Data Engineer to help design, build, and operate our next-generation
    enterprise data platform on Microsoft Azure. You will own end-to-end delivery of data pipelines and
    data products that power analytics, regulatory reporting, operational dashboards, and emerging AI/ML
    use cases. You will partner closely with data architects, analytics engineers, data scientists, business
    stakeholders, and platform engineering teams to deliver reliable, performant, secure, and cost-efficient data solutions.
    This role is ideal for an engineer with strong hands-on depth in Azure Data Factory, Azure Synapse
    Analytics and/or Databricks, and modern Lakehouse patterns, who is comfortable leading migration
    programs (e.g., Informatica-to-ADF, on-prem warehouse-to-cloud), mentoring mid-level engineers, and
    shaping engineering standards across the team.

    Key Responsibilities:
    Pipeline Design & Development
    • Design and build robust, reusable, parameter-driven ingestion and transformation pipelines
    using Azure Data Factory, Synapse Pipelines, Databricks, and/or Microsoft Fabric Data Factory.
    • Implement medallion architecture (Bronze / Silver / Gold) on Azure Data Lake Storage Gen2
    using Delta Lake, Parquet, and structured streaming patterns.
    • Build performant ELT workflows that leverage pushdown to source systems (Synapse Dedicated
    SQL Pool, Azure SQL, Teradata) where appropriate.
    • Develop and optimize PySpark notebooks and jobs on Azure Databricks or Synapse Spark.
    Data Modeling & Warehousing
    • Design dimensional models (Kimball star/snowflake) and data vault patterns for analytics
    consumption.
    • Implement Slowly Changing Dimensions (Type 1/2/3), Change Data Capture, and late-arriving
    data patterns.
    • Tune distributed SQL workloads in Synapse Dedicated SQL Pool / Fabric Warehouse, including
    distribution keys, partitioning, and clustered columnstore indexes.
    Platform Engineering & DevOps
    • Implement CI/CD for data pipelines using Azure DevOps (YAML pipelines,
    ARM/Bicep/Terraform) across Dev / SIT / UAT / Prod environments.
    • Instrument pipelines with robust logging, auditing, and monitoring using Azure Monitor, Log
    Analytics, and KQL.
    • Define and enforce coding standards, code review practices, branching strategies, and release
    management.
    Migration & Modernization
    • Lead or contribute to legacy-to-cloud migrations — e.g., Informatica PowerCenter to Azure Data
    Factory, on-premises Teradata / Oracle / SQL Server to Synapse or Fabric.
    • Perform workload assessment, capacity planning, and cost modeling for target-state
    architectures.
    • Provide production incident response for critical pipelines.
    Required Qualifications:
    • Deep hands-on expertise with Azure Data Factory: pipelines, datasets, linked services, triggers,
    parameterization, mapping data flows, and all three Integration Runtime types (Azure, Self-hosted, Azure-SSIS).
    • Strong experience with Databricks and PySpark.
    • Production experience with one or more of: Azure Synapse Analytics (Dedicated and Serverless
    SQL Pools, Spark Pools) OR Azure Databricks (Delta Lake, Unity Catalog) OR Microsoft Fabric
    (Warehouse, Lakehouse, OneLake).
    • Strong working knowledge of Azure Data Lake Storage Gen2 (hierarchical namespace, RBAC +
    ACLs, lifecycle management, security).
    • Experience with Azure Key Vault, Azure AD / Entra ID (including managed identities and service
    principals), and private networking (VNet integration, private endpoints).
    • Monitoring and troubleshooting with Azure Monitor, Log Analytics, and KQL.
    • Advanced SQL — window functions, CTEs, query optimization, execution plan analysis,
    performance tuning.
    • Strong Python for data engineering — pandas, PySpark, REST API integration, unit testing
    (pytest).
    • Proficient in T-SQL; familiarity with Spark SQL, KQL, PowerShell, and Bash shell scripting.

    Preferred Qualifications:
    • 5+ years of data warehouse development experience.
    • 5+ years of data modeling experience using ERWIN or similar tools.
    • 2+ years of experience with Azure Data Factory and Snowflake.
    • Medicaid domain knowledge is a plus.