Site Reliability Engineer
Tasq Staffing Solutions, Inc. · Makati City, Metro Manila, Philippines
About The Role
- At least 5 years of experience in a similar role, with a deep understanding of monitoring and application performance management
- Bachelor's degree in any specialization
- Knowledge of SLO platforms (e.g., Nobl9) and experience contributing to standards/governance artifacts.
- Knowledge of proactive monitoring using Azure Monitor services, telemetry, and synthetic transactions.
- Understanding of network architecture and security: WAN/LAN, TCP/IP, PKI
- Familiarity with ITSM processes and tools (e.g., ServiceNow), and compliance processes
- Awareness of AIOps and a vision for applying it
- Excellent communication skills to drive continuous improvement by reducing alert noise, shortening MTTR, and improving change success by embedding postmortem learnings into patterns, rules, and pipelines.
- Cloud Observability: Azure Monitor/App Insights/Log Analytics (KQL)
- Knowledge of Grafana, Prometheus, AppDynamics, ThousandEyes
- Ability to use SLIs/SLOs, postmortems, CMDB data, and other context to reduce noise, drive self-healing, and measurably improve MTTR and KPIs.
Key Responsibilities
- You will design and define standards, patterns, and automation opportunities that elevate monitoring and reliability across platforms and applications, with a strong focus on Azure Monitor, ServiceNow ITOM Event Management, Grafana, and APM/Synthetics tooling.
- You’ll partner with product teams to implement SLO/SLI-driven operations, reduce alert noise, accelerate incident response, and embed self-healing where it matters most.
- Engineer enterprise monitoring and event patterns by authoring and maintaining reference architectures, runbooks, and event management models (alert → event → incident) with actionable alerts and incident routing.
- Contribute to the Monitoring, Observability, and Event Management strategy and to tooling intake/governance checkpoints, and coach product teams.
- Communicate clearly to drive continuous improvement: reduce alert noise, shorten MTTR, and improve change success by embedding postmortem learnings into patterns, rules, and pipelines.
Technologies and Tools
- SRE Practices: Observability and Monitoring
- Cloud Observability: Azure Monitor/App Insights/Log Analytics (KQL)
- Grafana/Prometheus for metrics visualization where applicable
- ServiceNow ITOM Event Management
- Azure Fundamentals, Azure Monitor
- DevOps and Automation Tools
- Grafana, Prometheus, AppDynamics, ThousandEyes
- Application Performance Monitoring and Digital User Experience tools
Additional Details
- Work set-up: Hybrid 3x / RTO 2x per week
- Work shift: Night shift
- Location: Eton Centris or Ayala, Makati
This listing was posted by a verified recruiter at Tasq Staffing Solutions, Inc.
JobSpring