AWS CloudWatch

  • Monitoring and management service that provides a unified view of operational health

Why CloudWatch?

  • Gain visibility across your distributed stack (servers, network, databases, etc.) in one place

What does CloudWatch collect?

  • Collects monitoring and operational data in the form of
    • logs
      1. Vended logs – Natively published by AWS services on behalf of customers, e.g. Amazon VPC Flow Logs and Amazon Route 53 logs
      2. Logs published by AWS services – for example Amazon API Gateway, AWS Lambda, AWS CloudTrail, and many others
      3. Custom logs – Logs from applications and on-prem resources. Use AWS Systems Manager to install the CloudWatch Agent, or use the PutLogEvents API action to publish logs
    • metrics
      1. Built-in metrics – Basic metrics are enabled by default. More detailed monitoring can be opted into per resource
      2. Custom metrics – Application metrics etc. Use the CloudWatch Agent or the PutMetricData API action to publish these metrics to CloudWatch
    • events
  • CloudWatch provides up to 1-second visibility of metrics and logs data, 15 months of data retention (metrics), and the ability to perform calculations on metrics
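A minimal sketch of publishing a custom metric with the PutMetricData action, assuming the boto3 SDK and configured AWS credentials. The helper names, the "MyApp" namespace, and the "CheckoutLatency" metric are hypothetical and used only for illustration. The request is built by a pure function so it can be inspected without calling AWS.

```python
from datetime import datetime, timezone


def build_put_metric_data(namespace, name, value, unit="Count",
                          dimensions=None, high_resolution=False):
    """Build keyword arguments for the CloudWatch PutMetricData action."""
    datum = {
        "MetricName": name,
        "Timestamp": datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        # StorageResolution 1 requests a high-resolution (1-second) metric,
        # 60 is the standard one-minute resolution.
        "StorageResolution": 1 if high_resolution else 60,
    }
    if dimensions:
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return {"Namespace": namespace, "MetricData": [datum]}


def publish_metric(kwargs):
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("cloudwatch").put_metric_data(**kwargs)


if __name__ == "__main__":
    # Hypothetical namespace, metric, and dimension names.
    request = build_put_metric_data(
        "MyApp", "CheckoutLatency", 123, unit="Milliseconds",
        dimensions={"Service": "checkout"}, high_resolution=True,
    )
    print(request)
    # publish_metric(request)  # uncomment to actually send to CloudWatch
```

Keeping the network call in its own function means the payload can be checked locally before anything is sent.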

CloudWatch Integration

  • Integration with CloudWatch
    • natively integrated with AWS services
    • can use AWS Systems Manager to install a CloudWatch Agent
    • can use the CloudWatch API to collect, publish, and store monitoring data in CloudWatch
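For logs, the API route could look roughly like the sketch below, which assumes the boto3 SDK, the CloudWatch Logs PutLogEvents action, and a log group and log stream that already exist. The group and stream names and the helper functions are hypothetical.

```python
import time


def build_put_log_events(group, stream, messages, timestamp_ms=None):
    """Build keyword arguments for the CloudWatch Logs PutLogEvents action."""
    # CloudWatch Logs expects timestamps in milliseconds since the Unix epoch.
    ts = int(time.time() * 1000) if timestamp_ms is None else int(timestamp_ms)
    events = [{"timestamp": ts, "message": str(msg)} for msg in messages]
    return {"logGroupName": group, "logStreamName": stream, "logEvents": events}


def publish_logs(kwargs):
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("logs").put_log_events(**kwargs)


if __name__ == "__main__":
    # Hypothetical log group and stream names.
    request = build_put_log_events(
        "/myapp/onprem", "host-01", ["service started", "cache warmed"]
    )
    print(request)
    # publish_logs(request)  # uncomment to actually send to CloudWatch Logs
```

For anything beyond occasional events, the CloudWatch Agent is usually the simpler choice, since it handles batching and retries for you.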

Some CloudWatch use cases

Unified operational view with dashboards

  • dashboards enable you to create re-usable graphs and visualize your cloud resources and applications
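Dashboards can also be created through the API. The sketch below assumes the boto3 SDK and the PutDashboard action, and builds a dashboard body with a single metric widget graphing EC2 CPU utilization. The instance ID, dashboard name, and helper functions are hypothetical.

```python
import json


def build_dashboard_body(instance_id, region="us-east-1"):
    """Return a JSON dashboard body with one CPU utilization graph."""
    widget = {
        "type": "metric",
        "x": 0, "y": 0, "width": 12, "height": 6,
        "properties": {
            "title": f"CPU for {instance_id}",
            # Each metric entry is: namespace, metric name, then dimension pairs.
            "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", instance_id]],
            "stat": "Average",
            "period": 300,
            "region": region,
        },
    }
    return json.dumps({"widgets": [widget]})


def put_dashboard(name, body):
    """Create or update the dashboard. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("cloudwatch").put_dashboard(DashboardName=name, DashboardBody=body)


if __name__ == "__main__":
    # Hypothetical instance ID and dashboard name.
    body = build_dashboard_body("i-0123456789abcdef0")
    print(body)
    # put_dashboard("ops-overview", body)  # uncomment to create the dashboard
```

Because the body is plain JSON, the same definition can be kept in version control and reused across accounts or regions.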

Automated Actions – alarms – auto scaling

  • Amazon CloudWatch enables you to set high-resolution alarms and take automated actions, freeing up resources to focus on adding business value.
  • Example 1: You can alarm on Amazon EC2 instance metrics and set up Auto Scaling to add or remove instances.
  • Example 2: You can also execute automated responses to detect and shut down unused EC2 resources, reducing billing overages and improving resource optimization.
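Example 2 could be sketched as an alarm that stops an instance after a sustained period of low CPU, assuming the boto3 SDK, the PutMetricAlarm action, and the built-in EC2 stop action ARN. The 5% threshold, the one-hour window, the alarm name, and the helper functions are illustrative assumptions, not recommended values.

```python
def build_idle_stop_alarm(instance_id, region="us-east-1",
                          cpu_threshold=5.0, hours=1):
    """Build keyword arguments for an alarm that stops an idle EC2 instance."""
    return {
        "AlarmName": f"stop-idle-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        # Basic EC2 monitoring reports every 5 minutes, so 12 periods = 1 hour.
        "Period": 300,
        "EvaluationPeriods": 12 * hours,
        "Threshold": cpu_threshold,
        "ComparisonOperator": "LessThanThreshold",
        # Built-in alarm action that stops the instance the alarm watches.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }


def create_alarm(kwargs):
    """Create or update the alarm. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("cloudwatch").put_metric_alarm(**kwargs)


if __name__ == "__main__":
    # Hypothetical instance ID.
    request = build_idle_stop_alarm("i-0123456789abcdef0")
    print(request)
    # create_alarm(request)  # uncomment to create the alarm
```

For Example 1, the same kind of alarm would instead be attached to an Auto Scaling policy, so that the alarm state drives adding or removing instances.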