Principal Site Reliability Engineer
London (Hybrid) / Contract / Negotiable
Principal Site Reliability Engineer - Contract 6 Months+
Are you a highly experienced Principal Site Reliability Engineer with a passion for optimising systems and leading technical initiatives?
You'll be instrumental in shaping the reliability and performance of our infrastructure, driving innovation, and mentoring talent.
Role
- Lead multiple delivery projects while managing day-to-day BAU and support tasks.
- Proactively develop and implement infrastructure as code to enhance availability, scalability, latency, and efficiency.
- Take the lead on troubleshooting and problem-solving complex technical issues.
- Contribute to and create new designs, architectures, and standards for projects and solutions.
- Help ensure the stability and security of our environment.
- Support capacity planning, demand forecasting, software performance analysis, and system tuning.
- Define and champion processes and best practices for reliable and timely delivery.
- Communicate work status and design choices clearly and regularly.
Experience
- A proven track record in technical and mentoring roles within an SRE department.
- Extensive experience in solving complex technical issues, managing projects, and mentoring engineers.
- Ability to collaborate effectively with stakeholders and communicate clearly.
- A strong ability to excel in a fast-paced, constantly evolving digital environment.
You'll have deep experience in many of the following areas:
- Linux System Administration (Ubuntu)
- Cloud Platforms: Deployment and automation (AWS/GCP)
- Networking: Load Balancing, Routing, Switching
- Kubernetes: (EKS/GKE/K8s)
- Databases: SQL/NoSQL (Cassandra is a strong bonus)
- Scripting/System-Programming: Python, Bash, Go, Java
- Configuration Management / Infrastructure as Code: Ansible, Terraform
- CI/CD: Jenkins, Concourse, GitLab CI, GitHub Actions
- Virtualisation: VMware
- Monitoring: Prometheus, Grafana, Datadog, Conviva, New Relic
- Logging Systems: ELK Stack
As a Principal SRE, DevOps or Cloud expert, you will have the technical expertise to independently create and implement robust solutions. Strong proficiency in AWS, EKS, and security best practices is essential.
Hybrid working, with 2 mid-week days onsite in London.