Senior Specialist, Site Reliability Engineering

London Stock Exchange Group | Taguig | Full-time

We are seeking a highly motivated and experienced Senior Associate to join the Shared Site Reliability Engineering (SRE) team supporting Risk Intelligence Services within the Markets and Risk Intelligence division. This role is essential to maintaining uninterrupted business operations across multiple applications while adhering to defined SLAs.

The successful candidate will be responsible for day-to-day operational support, collaborating closely with product SMEs to ensure smooth change deliveries, and contributing to observability improvements across the environment. As a subject matter expert for Risk Intelligence applications, the candidate will play a key role in incident resolution, system reliability, and mentoring junior team members in domain and process expertise.

Flexibility is required, including availability outside office hours and on public holidays, to support critical production services and on-call responsibilities. The ideal candidate will demonstrate strong ownership, technical depth, and a proactive approach to problem-solving in high-pressure situations.

Key Responsibilities
  • Ensure uninterrupted business operations by managing production support activities in alignment with defined SLAs
  • Take ownership of incident calls and lead resolution efforts until SMEs are engaged
  • Collaborate with product SMEs to ensure smooth and timely change deliveries
  • Participate in out-of-hours and on-call support, including overnight monitoring and weekend release activities
  • Contribute to observability analysis and drive improvements in monitoring, alerting, and telemetry
  • Maintain and enhance support documentation and runbooks for supported applications
  • Act as a subject matter expert for Risk Intelligence applications, providing deep technical and domain knowledge
  • Mentor junior team members to build domain and process expertise
  • Support continuous improvement initiatives across the SRE function
  • Communicate effectively with stakeholders and provide timely updates on incidents and operational status
Education
  • Bachelor’s degree or equivalent, preferably in a technical discipline
Required Skills and Experience
  • Experience with Linux (Amazon Linux AMI) and Windows Server 2019 in cloud environments
  • Proficient in MySQL, PostgreSQL, MongoDB, and Amazon Aurora (RDS)
  • Familiarity with Amazon DocumentDB, DynamoDB, and SQLite
  • Knowledge of MS SQL Always On Availability Groups and migration to Azure SQL Managed Instance
  • Hands-on experience with Amazon SQS and Amazon SES
  • Exposure to Amazon MSK, Coviant, and Cerberus
  • Strong understanding of AWS S3 and EFS, including frontend integration
  • Experience with Synapse Analytics and D365
  • Skilled in development using Spring Boot, Node.js, Python (Django, Flask, Apache Airflow), Java (Java 11, Lambdas), React, Angular, JavaScript, C# (.NET Framework), and PHP
  • Proficient in containerization and orchestration using Docker, Amazon ECS, EKS, and EC2
Additional Attributes
  • 5+ years in production operations, SRE, or DevOps roles
  • Strong understanding of incident management and operational support in complex environments
  • Experience working in investment banking or financial services is preferred
  • Excellent analytical and problem-solving skills
  • Effective communicator with technical and business stakeholders
  • Self-motivated with strong prioritization and ownership skills
  • Collaborative mindset with a focus on mentoring and knowledge sharing