Senior Site Reliability Engineer

Keller Williams

Austin, TX
Full Time
Paid
  • Responsibilities

    What you’ll be called: Senior Site Reliability Engineer

    Where you’ll work: KWRI Headquarters—Austin, TX

    Named a Happiest Company to Work for in 2019; one of the Best Places to Work in Austin, TX; and featured on the Training Magazine Training 125 list seven times, Keller Williams Realty International (KWRI) thrives within a creative and collaborative culture where transforming the real estate industry through technology is our primary goal.

    KW Technology is the foremost provider of real estate solutions, offering the most comprehensive end-to-end portfolio of products, services and training in the industry. Our team converts agent and consumer challenges into intuitive, insight-enhanced technology and consumer experiences using tools such as GCP, Docker, Kubernetes and Terraform.

    What you’ll do:

    Design and deploy a highly automated infrastructure, applying software development practices to improve platform reliability and services. Obtain and analyze data to identify trouble spots and opportunities for optimization. Since SRE is a recent adoption at Keller Williams, you will not only join the team but also help shape the strategy and implementation of this practice as our organization expands.

    Essential Duties and Responsibilities:

    Work with development partners to shape the architecture, design, and implementation of new and existing systems to enhance their reliability, performance, efficiency, and scalability

    Work with engineering and product teams to identify and refine Service Level Indicators and Service Level Objectives

    Ensure all key services are measured and monitored, and that they raise alerts when needed

    Automate deployment and configuration processes

    Develop reliability tools and frameworks for use by all engineers

    Drive efficiencies in systems and processes, including capacity planning, configuration management, performance tuning, monitoring, and root cause analysis.

    Be a subject matter expert in infrastructure and best practices

    Help development teams use infrastructure more effectively.

    Drive capacity planning and help teams anticipate and prepare for growth.

    Play a part in incident management and emergency response (some after-hours on-call may be required).

    Qualifications:

    Preferred technologies are marked with an asterisk. Not all areas of knowledge are required.

    Experience with Cloud orchestration using Terraform*

    Experience with cloud hosting providers such as GCP*, AWS, Azure.

    Familiarity with SRE practices as defined by Google.*

    Experience with microservice orchestration using tools such as Kubernetes* or DC/OS

    Experience in creating and implementing containerization strategies using Docker.

    Experience in networking configuration associated with Cloud infrastructure such as load balancing, NAT, firewalls, reverse proxying, subnetting, OSI model.

    Experience establishing and implementing monitoring and alerting best practices using tools such as Stackdriver*, Prometheus, DataDog, etc.

    Experience working with log indexing and aggregation tools such as Splunk

    Experience with a high-level programming language, such as Python, Ruby, Perl

    Experience configuring and implementing a CI/CD platform using tools such as CircleCI*, Jenkins, Travis CI, Spinnaker.

    Experience in configuration management using tools such as Ansible, Chef, Puppet, etc.

    Experience with documentation and knowledge sharing across engineering organizations.

    Proven experience in designing, analyzing and solving problems of large-scale distributed systems.

    Understanding of, and experience working with, Linux operating system internals.

    Experience with workflow tools and methodologies including Jira*, Trello, Agile, Sprint, Kanban.

    Experience troubleshooting web and mobile applications.

    Enjoy working in a team and a highly collaborative environment.

    Excellent verbal and written communication skills.