Lead Site Reliability Engineer

duvari group

Saint Louis, MO
Full Time
Paid
  • Responsibilities

    Lead the charge in building secure, high-performing cloud systems in Azure while shaping how reliability and automation are done at scale!

    This is a rare opportunity to step into a true leadership role where you influence architecture, elevate engineering standards, and drive real impact across modern cloud environments.

    Lead Site Reliability Engineer (Hybrid | St. Louis Area)

    Are you energized by building and scaling secure, cloud-first platforms? We are seeking a Lead Site Reliability Engineer to take on a key leadership role in shaping resilient, observable, and secure cloud environments within Microsoft Azure. This is a hands-on position where you will partner closely with platform and application engineering teams to elevate standards around automation, infrastructure, and secure delivery practices.

    In this role, you will help drive improvements across CI/CD pipelines, Infrastructure-as-Code with Terraform, and overall system reliability. You will guide teams in adopting reliability metrics, promote a security-first mindset, and actively participate in architecture discussions and incident reviews. If you enjoy working at the intersection of automation, cloud technologies, and system reliability, this is an opportunity to make a meaningful impact at scale.

    Work Authorization Requirement
    Candidates must be authorized to work in the United States without current or future sponsorship.

    Work Environment
    This position follows a hybrid model, with a minimum of three days per week onsite in the St. Louis area. Remote work requests are subject to company policy and approval.

    Required Qualifications

    • 10+ years of experience in Information Technology
    • 5+ years in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
    • Strong experience designing and securing CI/CD pipelines using tools such as Azure DevOps or GitHub Actions
    • Advanced proficiency with Terraform for consistent and secure infrastructure deployments
    • Hands-on experience with Microsoft Azure services, particularly monitoring and security solutions like Application Insights, Azure Monitor, Log Analytics, and Defender for Cloud
    • Experience working with observability tools such as Datadog, Prometheus with Grafana, or ELK/EFK stacks
    • Solid scripting and automation skills in languages like Python, PowerShell, Bash, or Go
    • Experience with containerization and orchestration technologies such as Docker, Kubernetes, or Azure Kubernetes Service (AKS)
    • Strong understanding of secure software delivery practices, including secrets management, compliance automation, and artifact validation
    • Proven ability to collaborate across teams, mentor engineers, and influence technical direction

    Preferred Qualifications

    • Experience working with microservices and distributed system architectures
    • Familiarity with incident response and management tools such as PagerDuty or ServiceNow
    • Exposure to chaos engineering or reliability testing methodologies
    • Experience with GitOps approaches and tools like Argo CD for declarative deployments