Site Reliability Engineer, Distribution Engineering

NBCUniversal

Stamford, CT
Full Time
Paid
  • Responsibilities

    Job Description

    NBCUniversal is seeking creative and driven Site Reliability Engineers to join our Distribution Engineering team. This team supports the infrastructure and systems that power NBCU’s broadcast, streaming, and monitoring platforms. Within Distribution Engineering, we’re hiring SREs across three closely integrated focus areas: Video Streaming, Monitoring & Control, and Playout. As an SRE, you will be responsible for the engineering, operations, support, deployment, and maintenance of critical systems across on-premises and cloud environments. You will work in a fast-paced, agile environment where innovation and reliability are key.

    • Develop automation to deploy, maintain, and monitor infrastructure and applications.
    • Troubleshoot and resolve issues in live, on-air environments.
    • Participate in CI/CD pipelines, including code deployment, testing, and monitoring.
    • Create and maintain system metrics, dashboards, and alerting to ensure high availability.
    • Collaborate with engineering, operations, and vendor teams to support system health and performance.
    • Act as a Level 2 support resource for broadcast-related incidents, including root cause analysis and documentation.
    • Participate in on-call rotation for 24/7 support coverage.
    • Evaluate new technologies and contribute to proof-of-concept deployments.
    • Document system configurations, incident resolutions, and operational procedures.
  • Qualifications

    REQUIREMENTS:

    • Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).
    • 3+ years of SRE experience in the technology sector, supporting and maintaining production-quality software or software-defined infrastructure in high-traffic cloud environments (AWS preferred).
    • Experience with IP video and broadcast technologies.
    • Proficiency in Linux system administration.
    • Experience with Infrastructure as Code (Terraform or CloudFormation) and configuration management technologies (Ansible).
    • Familiarity with CI/CD tools (e.g., GitHub Actions, Jenkins, ArgoCD).
    • Experience with containerization and orchestration (Docker, Kubernetes, EKS).
    • Scripting experience (Python, Bash, or similar).
    • Strong understanding of networking fundamentals and troubleshooting.
    • Experience with monitoring/logging tools (e.g., Grafana, Splunk, ELK, CloudWatch).
    • Comfortable working in agile, fast-paced environments.

    Hybrid: This position has been designated as hybrid, generally contributing from the Stamford, CT office a minimum of 3 days per week.

    Additional Information:

    This position is eligible for company-sponsored benefits, including medical, dental, and vision insurance, 401(k), paid leave, tuition reimbursement, and various other discounts and perks. For a comprehensive overview of the benefits offered by NBCUniversal, please visit the Benefits page on the Careers website.

    Salary Range: $110,000 - $145,000

    PREFERRED QUALIFICATIONS:

    • Experience maintaining both Linux and Windows environments.
    • Familiarity with broadcast and monitoring tools such as Dataminer, TAG systems, and/or MediaProxy.
    • Strong hands-on experience debugging and troubleshooting distributed microservices in Kubernetes, including analyzing pod logs.
    • Solid understanding of networking concepts relevant to video streaming, including multicast, unicast, RTP/RTMP, and CDN workflows.
    • Ability to take ownership of problems and drive solutions through automation where applicable (automation-first mentality).

    Depending on the specific team, candidates will benefit from experience in one or more of the following areas:

    Video Streaming

    • Experience with live TV broadcasting, OTT streaming, and video/audio codecs.
    • Familiarity with ARQ technologies and cloud-based video distribution.
    • Experience supporting 24x7 production environments and customer-facing systems.
    • Experience using AI/ML for data analysis or operational insights.

    Monitoring & Control

    • Deep experience with monitoring and alerting tools (Grafana, Splunk, ELK Stack).
    • Ability to build end-to-end dashboards and alerts for enterprise systems.
    • Experience with frontend technologies (React, Node.js, TypeScript) and UI design.
    • Familiarity with SMPTE standards and PTP implementation.

    Playout

    • Experience deploying and supporting playout systems in cloud and hybrid environments.
    • Experience with monitoring tools such as Grafana (Loki) and Splunk.
    • Familiarity with broadcast automation and IP video distribution workflows.
    • Experience evaluating software releases for reliability and integration.
    • Strong design and problem-solving skills in broadcast infrastructure.

    Additional Information

    As part of our selection process, external candidates may be required to attend an in-person interview with an NBCUniversal employee at one of our locations prior to a hiring decision. NBCUniversal's policy is to provide equal employment opportunities to all applicants and employees without regard to race, color, religion, creed, gender, gender identity or expression, age, national origin or ancestry, citizenship, disability, sexual orientation, marital status, pregnancy, veteran status, membership in the uniformed services, genetic information, or any other basis protected by applicable law.

    If you are a qualified individual with a disability or a disabled veteran and require support throughout the application and/or recruitment process as a result of your disability, you have the right to request a reasonable accommodation. You can submit your request to AccessibilitySupport@nbcuni.com.