Sr Site Reliability Engineer (DevOps)

Palo Alto Networks

Santa Clara, CA

Paid

Responsibilities
PALO ALTO NETWORKS® is the fastest-growing security company in history. We offer the chance to be part of an important mission: ending breaches and protecting our way of digital life. If you are a motivated, intelligent, creative, and hardworking individual, then this job is for you!

Palo Alto Networks reinvented the enterprise firewall, growing from a start-up to a multi-billion-dollar company. PAN's Threat Prevention Cloud Services leverage the latest developments in big data, machine learning, virtualization, high-density storage, and distributed systems to enable threat analysis at a scale never seen before. Our cloud services, in conjunction with the reach of our firewalls deployed in tens of thousands of networks across the globe, create a data platform that no other enterprise security company has ever built. The size and complexity of the data we are dealing with are in line with those of some of the largest Internet companies.

RESPONSIBILITIES:
- You will be responsible for designing, building, maintaining, and scaling production services and server farms across multiple data centers for complex and data-intensive cloud services.
- You will design and enhance software architecture to improve scalability, service reliability, capacity, and performance.
- You will write automation code for provisioning and operating infrastructure at massive scale. You are not an operator; you’re an experienced software engineer focused on operations.
- You will work with development teams to make sure the applications fit nicely within the infrastructure and that scalability and reliability are designed and implemented from the ground up. You will work with QA on building pipelines and automation for delivering and deploying applications to production.
- You will participate in the occasional on-call rotation supporting the infrastructure.
- You will roll up your sleeves to troubleshoot incidents, formulate theories and test your hypotheses, and narrow down possibilities to find the root cause.
- You will write postmortem reviews and remediation recommendations.

QUALIFICATIONS:
- Strong sense of architecture and design for fault tolerance, scale, and stability.
- Strong development/automation skills. Must be very comfortable with reading and writing Python code. Java is a plus.
- 10+ years of Unix/Linux experience (shell/tools/kernel/networking).
- Tools-first mindset. You build tools for yourself and others to increase efficiency and to make hard or repetitive tasks easy and quick.
- Subject matter expert in one of these areas: Big Data (Hadoop 2.x, Kafka, Spark, HBase, Elasticsearch) or Data Center Virtualization (containers, Mesos, OpenStack, SDN).
- Experience with Configuration Management and CI/CD. Salt and Jenkins preferred.
- Familiar with middleware software such as Nginx, HAProxy, RabbitMQ, and typical AWS components as building blocks for implementing services.
- Knowledgeable about collecting metrics, measuring systems, and interpreting data to make decisions.
- Organized and focused on building, improving, resolving, and delivering. Good communicator within and across teams who takes the lead.