Reliability Engineer – Cloud Infrastructure Associate Manager

Accenture | Quezon City | Full-time

Job Description

We are looking for a highly skilled and motivated Network Site Reliability Engineer (SRE) to join our team. The Network SRE will be responsible for ensuring the reliability, availability, and performance of our network infrastructure. This role involves working closely with various departments to implement best practices, automate processes, and enhance system resilience.

Role Responsibilities
  • Design, implement, and maintain scalable and reliable on-premise and cloud network infrastructure solutions.
  • Continuously monitor network performance and traffic patterns to ensure optimal performance, high availability, and low latency.
  • Utilize monitoring tools and dashboards to detect, analyze, and resolve network anomalies.
  • Proactively identify and mitigate risks affecting network reliability.
  • Develop and maintain automation scripts and tools to streamline network provisioning, configuration, and incident resolution whilst reducing manual intervention.
  • Collaborate with development and operations teams to ensure seamless integration and deployment of network services.
  • Implement and manage monitoring, logging, and alerting systems to proactively identify and address potential network issues.
  • Continuously improve network performance, security, and scalability through regular assessments and optimizations.
  • Troubleshoot complex network issues, working cross-functionally to identify root causes and provide lasting solutions.
  • Analyze and forecast network growth, collaborating with the team to scale infrastructure in line with company needs.
  • Maintain comprehensive documentation for network architecture, processes, and troubleshooting guides.

Experience / Competences

Essential
  • 7-10 years of relevant experience.
  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • Strong understanding of network protocols (TCP/IP, BGP, OSPF) and network security practices.
  • Strong knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and networking technologies (e.g., VPN, DNS, load balancing).
  • Proficiency in scripting languages (e.g., Python, Bash) and automation tools (e.g., Ansible, Terraform).
  • Experience with network monitoring and logging tools (e.g., ThousandEyes, Prometheus, Grafana, ELK stack).
  • Excellent troubleshooting skills, with the ability to resolve complex network-related issues quickly and to work under pressure.
  • Strong collaboration skills, with the ability to communicate effectively across teams.
Desired
  • Cisco, Fortinet, or F5 certifications, or equivalent.
  • Cloud certifications such as AWS Certified Advanced Networking, Azure Network Engineer Associate, or equivalent.
  • Experience with DevOps practices and CI/CD pipelines.
  • Knowledge of SRE principles.