AWS Ops

@webframp/aws-ops · v2026.05.24.1 · 7d ago · WORKFLOWS · REPORTS

01 README

AWS Operations Toolkit - Unified incident investigation and daily operational visibility.

This extension provides workflows for investigating AWS outages and running daily infrastructure pulse checks. It gathers data from CloudWatch Logs, Metrics, Alarms, X-Ray Traces, resource inventory, networking, Cost Explorer, and GitHub.

Quick Start

# Install the extension (auto-resolves dependencies)
swamp extension pull @webframp/aws-ops

# Create model instances for your region
swamp model create @webframp/aws/logs aws-logs --global-arg region=us-east-1
swamp model create @webframp/aws/metrics aws-metrics --global-arg region=us-east-1
swamp model create @webframp/aws/alarms aws-alarms --global-arg region=us-east-1
swamp model create @webframp/aws/traces aws-traces --global-arg region=us-east-1
swamp model create @webframp/aws/inventory aws-inventory --global-arg region=us-east-1
swamp model create @webframp/aws/networking aws-networking --global-arg region=us-east-1

# Run the investigate-outage workflow
swamp workflow run @webframp/investigate-outage
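
To target a region other than us-east-1, the six create commands above can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. The `print_model_creates` helper is hypothetical; the model types, instance names, and the `--global-arg region=` flag are taken from the commands above.

```shell
# Hypothetical helper: print (do not run) the model-create commands for one region.
# Model types and instance names mirror the Quick Start commands above.
print_model_creates() {
  region="$1"
  for model in logs metrics alarms traces inventory networking; do
    echo "swamp model create @webframp/aws/${model} aws-${model} --global-arg region=${region}"
  done
}

# Review the output, then pipe it to `sh` to actually create the instances.
print_model_creates us-west-2
```

Printing instead of executing keeps the sketch safe to run anywhere, including on machines where swamp is not installed.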

Required IAM Permissions

logs:DescribeLogGroups
logs:StartQuery
logs:GetQueryResults
logs:FilterLogEvents
cloudwatch:ListMetrics
cloudwatch:GetMetricStatistics
cloudwatch:GetMetricData
cloudwatch:DescribeAlarms
cloudwatch:DescribeAlarmHistory
xray:GetServiceGraph
xray:GetTraceSummaries
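
The actions above can be granted with a single IAM policy statement. The following is a minimal sketch, not an official policy for this extension: the `print_policy` helper is hypothetical, and `"Resource": "*"` is an assumption that you may want to narrow where an action supports resource-level permissions. It covers only the actions listed in this README.

```shell
# Hypothetical helper: emit an IAM policy document granting the read-only
# actions listed above. "Resource": "*" is an assumption, not from the README.
print_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogGroups",
        "logs:StartQuery",
        "logs:GetQueryResults",
        "logs:FilterLogEvents",
        "cloudwatch:ListMetrics",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:GetMetricData",
        "cloudwatch:DescribeAlarms",
        "cloudwatch:DescribeAlarmHistory",
        "xray:GetServiceGraph",
        "xray:GetTraceSummaries"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Print the policy; redirect to a file to use it with your IAM tooling.
print_policy
```

The output can be saved to a file and attached to the role or user whose credentials swamp runs with.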

Included Components

Workflows

@webframp/investigate-outage - Unified incident investigation workflow that:
- Gathers alarm summary and active alarms
- Analyzes Lambda Duration/Errors and ELB 5XX/latency metrics for anomalies
- Gets X-Ray service dependency graph
- Finds error traces and analyzes error patterns
- Lists CloudWatch log groups and searches for error patterns
- Inventories EC2 instances and Lambda functions
- Lists load balancers and NAT gateways with health status
- Gets alarm state change history
- Generates an incident report summarizing all findings

Reports

@webframp/incident-report - Workflow-scope report that aggregates findings into:
- Alarm status and recent state changes
- Metric anomaly highlights (Lambda + ELB)
- Trace error analysis with top faulty services
- Infrastructure inventory (EC2, Lambda)
- Networking status (load balancers, NAT gateways)
- Actionable recommendations

Model Dependencies

The workflow expects these model instances (create them before running):

aws-logs - @webframp/aws/logs
aws-metrics - @webframp/aws/metrics
aws-alarms - @webframp/aws/alarms
aws-traces - @webframp/aws/traces
aws-inventory - @webframp/aws/inventory
aws-networking - @webframp/aws/networking

02 Workflows (2)

@webframp/investigate-outage (c3866eb0-6190-4154-b8e1-304624aba93e)

Unified AWS outage investigation workflow. Gathers data from CloudWatch Logs, Metrics, Alarms, X-Ray Traces, resource inventory, and networking to provide a comprehensive view of system health during an incident.

gather-observability-data: Collect data from all observability sources in parallel

1. check-alarms (aws-alarms.get_summary): Get alarm summary and active alarms
2. get-active-alarms (aws-alarms.get_active): Get all currently active alarms
3. analyze-metrics (aws-metrics.analyze): Analyze Lambda Duration metrics for anomalies
4. analyze-errors-metric (aws-metrics.analyze): Analyze Lambda Errors metrics
5. analyze-elb-5xx (aws-metrics.analyze): Analyze ALB 5XX error count
6. analyze-elb-latency (aws-metrics.analyze): Analyze ALB target response time
7. get-service-graph (aws-traces.get_service_graph): Get X-Ray service dependency graph
8. get-error-traces (aws-traces.get_errors): Get traces with errors or faults
9. analyze-trace-errors (aws-traces.analyze_errors): Analyze error patterns in traces

gather-logs: Search logs for errors (runs in parallel with observability)

1. list-log-groups (aws-logs.list_log_groups): Discover log groups
2. find-lambda-errors (aws-logs.find_errors): Search Lambda log groups for error patterns

gather-infrastructure: Collect resource inventory and networking state (runs in parallel)

1. list-ec2-instances (aws-inventory.list_ec2): List EC2 instances across all states
2. list-lambda-functions (aws-inventory.list_lambda): List Lambda functions
3. list-load-balancers (aws-networking.list_load_balancers): List ALBs and NLBs with target health
4. list-nat-gateways (aws-networking.list_nat_gateways): List NAT gateway status

deep-dive: Perform deeper analysis based on initial findings

1. get-alarm-history (aws-alarms.get_history): Get alarm state change history

@webframp/morning-pulse (460c619c-c59a-44bd-a2ad-27c8b819e8f6)

Daily morning infrastructure pulse check. Gathers alarm state, alarm health verdicts, cost trend, and open PRs across user-specified regions, then generates a concise morning-pulse report you can skim in two minutes. The workflow is region-flexible via forEach: pass any combination of regions as input. Model instances must follow the naming convention aws-alarms-{region} and alarm-investigation-{region}.
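
The naming convention can be checked per region before a run. The sketch below prints the instance names the workflow would look for. `expected_instances` is a hypothetical helper and the regions passed to it are only examples. The costs and github jobs are not region-specific and reference the fixed instance names aws-costs and github.

```shell
# Hypothetical helper: list the model instance names morning-pulse expects
# for each region passed as an argument.
expected_instances() {
  for region in "$@"; do
    echo "aws-alarms-${region}"
    echo "alarm-investigation-${region}"
  done
}

# Example: names required for a two-region pulse check.
expected_instances us-east-1 eu-west-1
```

Comparing this list against your existing model instances is a quick way to spot a missing or misnamed instance before the forEach steps try to resolve it.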

alarms: Alarm summary and active alarms across all regions

1. summary-${{ self.region }} (aws-alarms-${{ self.region }}.get_summary): Get alarm state counts and recent changes
2. active-${{ self.region }} (aws-alarms-${{ self.region }}.get_active): Get alarms currently in ALARM state

alarm-triage: Triage active alarms with verdicts

1. triage-${{ self.region }} (alarm-investigation-${{ self.region }}.triage): Enrich alarms with health verdicts

costs: Cost trend for the last N days

1. trend (aws-costs.get_cost_trend): Daily cost trend and direction
2. by-service (aws-costs.get_cost_by_service): Cost breakdown by service

github: Check for open pull requests

1. open-prs (github.list_prs): List open PRs on the target repo

03 Reports (2)

@webframp/incident-report (workflow)

incident_report.ts

Summarizes findings from the investigate-outage workflow into an actionable incident report.

aws · incident-response · ops · observability

@webframp/morning-pulse-report (workflow)

morning_pulse_report.ts

Daily infrastructure pulse: alarms, alarm health, costs, and open PRs.

ops · daily · aws

04 Previous Versions (12)

2026.05.21.1 · May 22, 2026

2026.05.13.1 · May 13, 2026

2026.05.08.1 · May 9, 2026

Added 1 workflow and 1 report. Updated dependencies and labels.

2026.04.22.1 · Apr 22, 2026

Modified 1 workflow. Updated dependencies and platforms.

2026.04.14.2 · Apr 14, 2026

2026.04.14.1 · Apr 14, 2026

2026.04.13.1 · Apr 13, 2026

2026.03.31.1 · Mar 31, 2026

2026.03.30.4 · Mar 31, 2026

2026.03.30.3 · Mar 30, 2026

2026.03.30.2 · Mar 30, 2026

Added 1 report.

2026.03.30.1 · Mar 30, 2026

05 Stats

100 / 100

Downloads

Archive size: 26.4 KB

Has README or module doc: 2/2 earned
README has a code example: 1/1 earned
README is substantive: 1/1 earned
Most symbols documented: 1/1 earned
No slow types: 1/1 earned
Dependencies pass trust audit: 2/2 earned
Has description: 1/1 earned
Platform support declared (or universal): 2/2 earned
License declared: 1/1 earned
Verified public repository: 2/2 earned

Repository

https://github.com/webframp/swamp-extensions

06 Platforms

linux · x86_64, linux · aarch64, macOS · x86_64, macOS · aarch64

07 Labels

#aws #cloudwatch #xray #observability #ops #incident-response #daily #workflow