Our client is looking for an Observability Engineer with deep expertise in monitoring, logging, alerting, and incident management in Google Cloud Platform (GCP) environments. This role combines strong problem-solving skills, hands-on GCP experience, and a focus on ensuring system health, reliability, and compliance.
Must-Haves:
- Bachelor’s degree in CS/IT or equivalent experience
- Strong programming skills (Python)
- Proven experience with GCP services (Compute Engine, Cloud Storage, Cloud Functions)
- Expertise in logging/monitoring tools (Google Cloud Logging, Google Cloud Monitoring, Prometheus-Grafana, OTEL, Splunk)
- Experience setting up alerting systems and incident management processes
- Understanding of cloud security best practices and compliance requirements
- Excellent verbal and written communication skills
Nice-to-Haves:
- GCP certifications (Professional Cloud Architect, Professional Cloud DevOps Engineer)
- Experience with Infrastructure as Code tools (Terraform, Cloud Deployment Manager)
- Knowledge of containerization (Docker, Kubernetes) and integration with GCP
Key Responsibilities:
- Design, implement, and maintain observability solutions for GCP-based infrastructure
- Build and optimize monitoring dashboards, alerting systems, and incident response processes
- Ensure systems meet security, compliance, and reliability standards
- Collaborate with engineering and operations teams to troubleshoot, resolve, and prevent issues
- Automate observability processes and integrate with CI/CD pipelines where applicable