Senior Site Reliability Engineer (SRE)
Modulus Labs Inc., Pasig
Role Overview
As a Senior Site Reliability Engineer, you will be a key player in ensuring the reliability, security, and scalability of our cloud-native payment systems. You will work closely with engineering teams to build resilient infrastructure, automate operations, and maintain high availability while adhering to stringent compliance requirements.
Key Responsibilities
- Design, build, and maintain scalable, secure, and highly available cloud infrastructure on AWS supporting mission-critical payment systems.
- Develop and automate CI/CD pipelines enabling safe and rapid software delivery.
- Manage and optimize Kubernetes clusters, container orchestration, and service mesh architectures with a strong focus on reliability and security.
- Implement and enforce infrastructure as code (IaC) practices using Terraform and related tools.
- Drive incident response processes, including detection, mitigation, root cause analysis, and post-mortem reviews.
- Collaborate with developers and product teams to enhance system observability, performance, and resilience.
- Architect and enforce network, security, and compliance controls aligned with PCI DSS and fintech industry standards.
- Build and maintain monitoring, alerting, and logging frameworks using tools like Datadog, Grafana, and New Relic.
- Continuously evaluate new tools, technologies, and methodologies to improve operational excellence.
Qualifications
- Extensive experience working in SRE, DevOps, or Platform Engineering roles within the fintech or financial services sector.
- Strong expertise with AWS services (EC2, S3, IAM, RDS, Lambda, VPC) and security best practices.
- Proficiency with Terraform for managing infrastructure as code.
- Deep knowledge of Kubernetes, container orchestration, and service meshes (e.g., Istio, Cilium).
- Strong skills in scripting and automation (Python, Bash, or similar).
- Experience designing and operating CI/CD pipelines (GitHub Actions, GitLab CI/CD).
- Solid understanding of networking (DNS, load balancing, firewalls, VPNs).
- Proven track record in incident management, including root cause analysis and post-mortem processes.
- Familiarity with monitoring and observability tools (Datadog, Grafana, New Relic).
- Knowledge of database systems (PostgreSQL, MySQL, MongoDB) and their cloud deployment best practices.
- Strong grasp of version control and Git workflows.
Preferred Qualifications
- AWS Certifications (Solutions Architect, DevOps Engineer, or similar).
- Hands-on experience with cluster autoscaling solutions like Karpenter.
- Familiarity with PCI DSS and other regulatory compliance frameworks in fintech.
- Experience with cloud-native security practices and secrets management.