Site Reliability Engineer

Brickred

National
Full Time
Paid
  • Responsibilities

    Roles and Responsibilities:

    • Influence and design cloud infrastructure, architecture, standards and methods for large-scale systems
    • Support services prior to production via infrastructure design, software platform development, load testing, capacity planning and launch reviews
    • Maintain services during deployment and in production by measuring and monitoring key performance and service level indicators including availability, latency, and overall system health
    • Automate system scalability and continually work to improve system resiliency, performance and efficiency
    • Practice sustainable incident response as part of an on-call rotation and through blameless postmortems
    • Remediate tasks within the corrective action plan via sustainable, preventative, and automated measures whenever possible
    • Provision and manage GCP infrastructure, including deploying and implementing Google Compute Engine (GCE) resources
    • Automate infrastructure builds and configurations
    • Build and manage CI/CD pipelines using Jenkins
    • Define, implement and assign ownership for stability and reliability targets (SLIs, SLOs, error budgets)
    • Collaborate with tribes/dev teams on reliability development (fixes, logging, delivery metrics)

    Key Skillsets:

    • 3+ years of experience developing and/or administering software in public cloud, including 6+ months of hands-on experience in GCP
    • Experience in monitoring infrastructure and application uptime and availability to ensure functional and performance objectives are met
    • Experience in languages such as Python, Ruby, Bash, Java, Go, Perl, JavaScript and/or Node.js
    • Demonstrable cross-functional knowledge of systems, storage, networking, security and databases
    • System administration skills, including automation and orchestration of Linux/Windows using Chef, Puppet, Ansible, SaltStack and/or containers (Docker, Kubernetes, etc.)
    • Proficiency with continuous integration and continuous delivery tooling and practices
    • Experience managing infrastructure as code via tools such as Terraform or CloudFormation
    • Experience in setting up and managing/modifying CI/CD pipelines using Jenkins
    • Significant experience in configuring industry-leading infrastructure/application monitoring tools (Stackdriver, Kibana, Grafana, Datadog, Splunk, Dynatrace, AppDynamics, etc.)

    Location:

    St. Louis, MO, USA

    Experience:

    8 to 15 Years

    Skills Required:

    Jenkins, GCP, Docker, Terraform, Python, CI/CD

    Job Description:

    Screening Highlights:

    • 3+ years of developing and/or administering software in public cloud (AWS, Azure or GCP).
    • 6+ months of hands-on experience in GCP.
    • Hands-on experience with Informix/Red Hat workloads running in GCE.
    • Hands-on experience managing infrastructure as code using Terraform.
    • Hands-on experience in setting up and managing/modifying CI/CD pipelines, preferably using Jenkins.
    • Hands-on experience with scripting languages such as Python and Bash.