Incident and Observability Engineer (SRE) - Permanent WFH Set-up

iScale Solutions · Makati · Full-time

Key Responsibilities:

  • Monitor and support application and system infrastructure to ensure high availability and uptime.
  • Troubleshoot recurring system and application alerts based on defined SOPs and coordinate with internal teams for resolution.
  • Perform regular system health checks and proactively identify and resolve issues.
  • Initiate and facilitate major incident response activities, ensuring timely service restoration and communication with stakeholders.
  • Evaluate and determine incident impact, triggering critical escalation protocols when necessary.
  • Communicate incident progress and resolution clearly to internal teams, executives, and customers as needed.
  • Continuously improve observability practices and provide feedback on monitoring tools and procedures.
  • Collaborate with Technical Support, Developers, SAs, DBAs, and Systems Administrators to ensure seamless operations.

Qualifications:

  • 4+ years of experience in application infrastructure support, monitoring, and troubleshooting.
  • 1+ year of hands-on experience working in production environments (Linux, Windows, or Unix) using command line tools.
  • Solid understanding of networking and security concepts.
  • Strong background in working with relational databases such as MSSQL and/or Oracle.
  • Experience in cross-functional team environments.
  • Basic scripting experience (e.g., Python, PowerShell, Bash, or Perl) to automate tasks and extract data.

Technical Skills:

  • Databases: SQL Server, Oracle
  • Operating Systems: Linux, Windows
  • Monitoring Tools: Nagios, SolarWinds, Shinken, New Relic, Splunk, Grafana
  • Scripting Languages: Python, Perl, PowerShell, Bash
  • Version Control: Git

Soft Skills & Personal Traits:

  • Exceptional written and verbal communication skills — able to clearly convey complex technical information to non-technical audiences.
  • Strong coordination and leadership skills, with the ability to lead without direct authority over resources.
  • Calm and focused under pressure, especially during live incident events.
  • Adaptable, resilient, and capable of thinking quickly to adjust strategies as incidents unfold.
  • Business and end-user focused, with an emphasis on proactive service restoration and customer communication.