This listing is no longer accepting applications.

SRE Software Engineer

TEKREQS, Inc.

New York, NY
Full Time
Paid
  • Responsibilities

    Job Description

    SRE SOFTWARE ENGINEER - TICKER PLANT
    LOCATION: NEW YORK, NY

    Work as an SRE Software Engineer at a software product firm in New York City that is the premier provider of real-time market data to the financial world. The Ticker Plant group is at the core of both the firm's Professional Service and Enterprise Solutions products that process market data from around the globe. These systems process over 80 billion events a day and publish them in real time while servicing millions of client queries from the firm's cutting-edge time-series database. As SREs, the team applies software engineering principles to the problems of owning large, expanding market data systems while maintaining resiliency, efficiency, availability and visibility at any scale. Within this group, the SRE Team comprises experts in specialties such as software engineering, platform performance and automation ... and this team is growing!

    What’s in it for you: On this team, you'll design and develop scalable services that enhance the stability and reliability of the firm's market data infrastructure. You’ll be depended on not only to help set standards but also to partner closely with the application engineers to ensure that all products meet those standards. You'll be trusted to create engineering solutions to operations problems, build systems that detect issues early through metrics and signals, and develop automated correction and remediation strategies.

    We’ll trust you to (Responsibilities):
    • Create solutions to monitor the health, availability, latency and reliability of our services, with a focus on fault-tolerant approaches
    • Proactively scale the group's services to stay ahead of ever-increasing market data demands by driving capacity planning, instrumentation and performance analysis
    • Ensure service issues do not recur by architecting automation and remediation strategies that employ signal detection and orchestration frameworks
    • Define service level objectives and drive measurable service improvement

    You'll need to have (Required Skills):
    • 3+ years of professional work experience
    • Proficiency in one or more high-level languages like C++, Python or Java
    • Good understanding of data structures and algorithms
    • Strong understanding of large-scale systems architecture
    • Working knowledge of UNIX/Linux
    • Strong communication skills
    • Excellent problem-solving skills, experience with triaging and solving production outages, and a strong sense of ownership

    We'd love to see (Desired Skills):
    • Versatility in one or more scripting languages (Perl, Shell)
    • Experience building orchestration systems (Ansible, Salt, etc.)
    • Config management (Chef, Puppet, CFEngine)
    • Knowledge of test frameworks (GTest, etc.)
    • Familiarity with industry-standard tools for collecting data across distributed systems, such as Splunk, Grafana, Elasticsearch, Nagios, Zabbix, etc.
    • Experience with incident response and blameless postmortems