Senior Service Reliability Engineer
Purpose of the role
The Senior Service Reliability Engineer for Stratos is responsible for ensuring the reliability, performance, and scalability of our mission-critical platforms. In this role, you will safeguard operational excellence in the product under Stratos, influence reliability strategies, play an integral part in production incident response, and help improve operational metrics.
The role collaborates closely with other teams, such as the Development and Amadeus Production Support teams, to make sure target SLOs are met, making adjustments where needed and developing code to help meet those targets. You are also expected to work on toil reduction projects, handle capacity tuning activities, revisit existing SOPs, and develop code for performance improvements. This is a hybrid position that requires you to be in the local office 2-3 days a week.
In this role you'll:
- Define and track Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets in partnership with engineering and product leads
- Collaborate with Operations and Development teams to drive service reliability, availability, and scalability
- Participate in toil reduction projects to minimize, if not eliminate, recurring manual activities performed by the team
- Respond to production incidents, perform root cause analysis, and drive continuous improvement
- Develop operational improvement items with development teams, working closely with them to prioritize these improvements
- Provide input on process improvements to Change, Release, and Incident Management
- Create and implement support playbooks that team members can use as part of emergency response to production issues
About the ideal candidate:
- Knowledgeable and experienced in using Azure resources such as Storage, Network, Functions, Logic Apps, App Services, and AKS
- Have technical expertise in Azure DevOps, including developing in Git and working with GitOps repositories and build/release pipelines
- Have hands-on experience in developing Azure PowerShell scripts, Azure Runbooks, or other infrastructure automation tools
- Knowledgeable in cloud platforms and AI technologies
- Experienced with monitoring and logging tools (Grafana, Dynatrace, Splunk)
- Proven ability to adapt to emerging cloud technologies and industry-leading DevOps tools such as Terraform, Docker containers, and Kubernetes
- Knowledgeable in cloud implementation of Navitaire products across different cloud infrastructure models
- Understands production environments and processes, and how they can be further optimized through Azure features and other cloud technologies and services
- Proven ability to drive problem-solving efforts through effective issue analysis
- Works effectively in a team and in a dynamic, fast-paced, multicultural environment
- Proficient in C#
- Willing to work shifting schedules
Diversity & Inclusion
Amadeus aspires to be a leader in Diversity, Equity and Inclusion in the tech industry, enabling every employee to reach their full potential by fostering a culture of belonging and fair treatment, attracting the best talent from all backgrounds, and serving as a role model for an inclusive employee experience.
Amadeus is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to gender, race, ethnicity, sexual orientation, age, beliefs, disability or any other characteristics protected by law.