Site Reliability Engineering Specialist (Senior) - Quezon City

Electronic Transfer And Advance Processing Inc. | Quezon City | Full-time

Job Description

We are seeking a Senior Site Reliability Engineer (SRE) to lead the design, deployment, and management of highly available and scalable AWS cloud infrastructure. This role will focus on building automation solutions, optimizing system performance, and strengthening the reliability and security of cloud services.

As a senior member of the team, you will mentor junior engineers, partner with development and operations teams, and drive continuous improvement aligned with industry best practices in site reliability engineering.

Key Responsibilities
  • Architect, implement, and maintain scalable AWS infrastructure components (VPC, EC2, EKS, RDS, S3, IAM).
  • Build and manage automated solutions for infrastructure provisioning, monitoring, and incident response using Terraform, Ansible, and CloudFormation.
  • Lead incident management processes, including troubleshooting, root cause analysis, and long-term remediation strategies.
  • Optimize AWS environments for performance, security, and cost efficiency.
  • Mentor junior engineers, fostering a culture of reliability, automation, and continuous improvement.
  • Collaborate with stakeholders to establish and enforce SRE standards, security practices, and governance policies.
  • Stay current with emerging AWS services and recommend improvements to support long-term cloud strategies.

Qualifications
  • Bachelor’s degree in Computer Science, Engineering, or related discipline.
  • 5+ years of hands-on experience with AWS services (EC2, RDS, S3, EKS, VPC, Lambda, CloudFront) and cloud-native architectures.
  • Strong knowledge of networking concepts such as VPNs, routing, load balancing, and AWS security groups.
  • Advanced proficiency in scripting/automation (Python, Bash) and infrastructure-as-code (Terraform, Ansible, CloudFormation).
  • Experience with monitoring and observability tools (CloudWatch, Grafana, Prometheus).
  • Solid understanding of cloud security, IAM, and compliance standards.
  • Excellent troubleshooting, problem-solving, and multitasking abilities.
  • Leadership skills with proven experience mentoring and guiding technical teams.

Professional Experience
  • 5+ years in Site Reliability Engineering or related roles with deep AWS expertise.
  • Demonstrated success in designing and managing scalable AWS environments (EC2, EKS, RDS, S3, CloudFront, IAM).
  • Proven experience implementing infrastructure-as-code solutions using Terraform, Ansible, or CloudFormation.
  • Strong background in incident response, disaster recovery, and post-mortem analysis.
  • Hands-on experience with VPNs, secure network routing, and traffic management.
  • Track record of mentoring engineers and leading cross-functional technical initiatives.
  • Participation in on-call rotations and driving operational excellence through best practices.