Senior Site Reliability Engineer

Reed Business Information Limited

Salary Not Specified

Reed Business Information Limited, City of Westminster

  • Full time
  • Permanent
  • Onsite working

Posted 2 weeks ago, 18 Apr

Closing date: Not specified

Job Ref: e58875ebee6d48478f30047e1b6e6c2c

Full Job Description

As a Senior Site Reliability Engineer, your purpose is to ensure that the company's systems and applications are available, reliable, and performant at all times. We are seeking a skilled Senior Site Reliability Engineer (SRE) to join our dynamic Product Team. As an SRE, you will play a crucial role in ensuring the reliability, scalability, and performance of our products, working closely with the product team and aligning with the overall SRE team roadmap.

  • Leading SRE Efforts: Lead the Site Reliability Engineering efforts within the assigned product squad, acting as the primary point of contact for all SRE-related activities.

  • Collaborating: Work closely with Core SRE engineers and teams to understand business requirements and translate them into scalable and reliable infrastructure solutions for the assigned squad/product.

  • Ownership: Take ownership of all SRE responsibilities for the project, including actively monitoring incidents, security vulnerabilities, and infrastructure upgrades.

  • Tool Utilization: Utilize observability tools such as Datadog, Grafana, and Elastic for monitoring, analytics, and proactive issue identification.

  • Incident Management: Demonstrate proficiency in incident management, promptly and effectively responding to unforeseen challenges to minimize downtime and impact on operations.

  • Cloud Infrastructure: Implement and maintain cloud infrastructure solutions, particularly in AWS (Amazon Web Services), ensuring high availability, scalability, and security.

  • Automation: Employ containerization technologies and infrastructure-as-code tools such as Terraform, CloudFormation, or Ansible to automate deployment and management processes.

  • CI/CD Optimization: Design, build, and optimize Continuous Integration and Continuous Deployment (CI/CD) pipelines using tools like GitHub Actions, Jenkins, or TeamCity.

  • Expertise in Ubuntu and Linux: Possess expert-level knowledge of Ubuntu and other Linux distributions, with the ability to troubleshoot and optimize system performance.

  • Culture Development: Foster a culture of reliability, automation, and innovation within the product squad, driving continuous improvement in processes and technologies.

  • Experience in a Site Reliability Engineering/DevOps role, with a track record of successfully managing and improving the reliability of complex systems.

  • Experience with observability tools such as Datadog, Grafana, and Elastic for monitoring, analytics, and proactive issue identification.

  • Experience in incident management, demonstrating the ability to respond promptly and effectively to unforeseen challenges.

  • Experience with cloud platforms, particularly AWS (Amazon Web Services), or proficiency in other leading cloud providers.

  • Expertise in containerization technologies and infrastructure-as-code tools such as Terraform, CloudFormation, or Ansible.

  • Experience in designing and implementing Continuous Integration and Continuous Deployment (CI/CD) pipelines using tools like GitHub Actions, Jenkins, or TeamCity.

  • In-depth knowledge of Ubuntu and other Linux distributions, with the ability to troubleshoot and optimize system performance.

  • Excellent communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.

At Cirium, our goal is to keep the world connected. We are the industry leader in aviation analytics, helping our customers understand the past and present and predict what will happen tomorrow. Our mission is to transform the aviation industry by enabling airlines, airports, travel companies, tech giants, aircraft manufacturers, financial institutions and many more to accelerate their own digital transformation. You can learn more about Cirium at the link below. https://www.cirium.com


About our Team

You will be joining a collaborative, curious team of Site Reliability Engineers at all levels. By joining us you will have the opportunity to share ownership in solving this problem end to end, from exploring new data sources for building features to designing predictive models, putting them into production, and making sure they perform consistently over time.

LexisNexis Risk Solutions is very supportive of women in Technology and has been a founding signatory of the Tech Talent Charter. Currently 27% of our Technology workforce are women, which is much higher than the UK average of 17%. We have the following initiatives in place to support women in technology