Job Description
Role: Senior DevOps Engineer / Platform Reliability Lead
OleOle is looking for a senior DevOps leader to own the reliability, scalability, and operational integrity of a global, real-time football platform.
This is a hands-on leadership role. The core architecture, technologies, and product direction are already defined. The focus now is execution: building infrastructure that scales cleanly, fails gracefully, and supports millions of users during the world’s biggest sporting moments.
You will be responsible for ensuring that a complex, multi-system platform operates as one reliable, observable, and secure system.
What you will own
- End-to-end ownership of cloud infrastructure and platform reliability
- Design and operation of high-availability, fault-tolerant systems
- Kubernetes-based environments supporting real-time social, messaging, AI, and financial services
- CI/CD pipelines that are safe, repeatable, and trusted by engineers
- Monitoring, logging, alerting, and incident response across the entire platform
- Security, access control, secrets management, and operational best practices
- Production readiness for traffic spikes tied to live matches and global tournaments
This role exists to prevent problems before users ever see them and to restore systems quickly and calmly when issues occur.
What you’ll work on
- Operating and scaling real-time systems for live scores, messaging, and in-match activity
- Supporting AI translation workloads without impacting core platform performance
- Ensuring that wallet, rewards, and financial infrastructure remains secure, auditable, and always available
- Managing production-grade MediaWiki infrastructure used for large-scale football history content
- Designing failover strategies so no single system can take down the platform
- Creating clear separation between development, staging, and production environments
What we’re looking for
Required
- 7+ years of experience in DevOps, SRE, or platform engineering roles
- Deep experience with AWS and cloud-native architectures
- Strong Kubernetes and container orchestration experience
- Proven track record running high-traffic, real-time production systems
- Infrastructure-as-Code experience (Terraform preferred)
- Strong understanding of Linux, networking, and system debugging
- Experience designing systems for reliability, not just deployment
Strong plus
- Experience supporting crypto platforms, wallets, or exchanges
- Experience with Rust or high-performance backend systems
- Experience with live data feeds, sports, trading, or messaging platforms
- Prior ownership of incident response and on-call operations
How you work
- You think in systems, not tickets
- You anticipate failure modes instead of reacting to them
- You communicate clearly and directly when something is unsafe or broken
- You are comfortable making decisions and taking ownership
- You focus on stability, clarity, and long-term maintainability
This is not a role for someone who wants to debate architecture endlessly. The decisions are made. This role is about making them work in the real world.