Required Qualifications
- Strong experience with observability stacks, including AppDynamics, Dynatrace, Prometheus, OpenTelemetry (OTel), and Grafana.
- Experience designing and developing automation for routine tasks using GitOps and Python.
- Expertise in AppDynamics and Dynatrace OneAgent deployment, configuration, customization, and troubleshooting.
- Hands-on experience setting up synthetic monitoring.
- Subject matter expert in full-stack observability concepts, including APM tools, metrics collection, and trace analysis, with a passion for automation.
- Experience with Google Cloud Platform and a solid understanding of Kubernetes.
- This role is 70% hands-on, focusing on administration and automation tasks for AppDynamics and Dynatrace. The remaining 30% involves Site Reliability Engineering (SRE) tasks related to Prometheus, OTel, Grafana, and similar tools.
- A strong monitoring subject matter expert (SME) capable of collaborating with various teams to gather monitoring requirements and to design, develop, and implement effective monitoring solutions and alerting strategies.
- Proficient in Java and Python.
- Experience with Datadog is a plus.