
The objective is to design a monitoring and observability system that collects logs, metrics, and traces from distributed cloud applications, in order to keep performance reliable and detect issues proactively.
Deploy a distributed application on the cloud
Integrate Prometheus for metrics collection
Configure Grafana dashboards
Implement centralized logging
Set up alerting policies
Analyze system performance trends
Conduct failure simulation tests
Document the monitoring strategy
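
The "set up alerting policies" task can be illustrated with a small sketch. The example below is not Prometheus or Grafana code; it is a standard-library Python model of one common policy shape, where an alert fires only after a condition has held for several consecutive evaluations (similar in spirit to the `for` clause of a Prometheus alerting rule). The class name `ErrorRateAlert`, the 5% threshold, and the three-sample window are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ErrorRateAlert:
    """Hypothetical alert policy: fire when the error rate stays above
    `threshold` for `for_samples` consecutive evaluations."""

    threshold: float = 0.05  # assumed value: 5% of requests failing
    for_samples: int = 3     # assumed value: three evaluations in a row
    _recent: deque = field(default_factory=deque, init=False, repr=False)

    def observe(self, errors: int, requests: int) -> bool:
        """Record one evaluation interval; return True if the alert is firing."""
        # Treat an interval with no traffic as healthy rather than dividing by zero.
        rate = errors / requests if requests else 0.0
        self._recent.append(rate > self.threshold)
        # Keep only the most recent `for_samples` evaluations.
        while len(self._recent) > self.for_samples:
            self._recent.popleft()
        return len(self._recent) == self.for_samples and all(self._recent)


# Simulated (errors, requests) pairs, one per evaluation interval.
samples = [(1, 100), (8, 100), (9, 100), (12, 100), (2, 100)]
alert = ErrorRateAlert()
states = [alert.observe(errors, requests) for errors, requests in samples]
print(states)
```

Requiring several consecutive breaches before firing is one way to keep short spikes from paging anyone. The same simulated input can double as a simple failure simulation test: feed in a burst of errors and check that the policy fires and then clears.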