A service within [[AWS CloudWatch]] for monitoring and troubleshooting serverless applications running on [[AWS Lambda]].

> The solution collects, aggregates, and summarizes system-level metrics including CPU time, memory, disk, and network. It also collects, aggregates, and summarizes diagnostic information such as cold starts and Lambda worker shutdowns to help you isolate issues with your Lambda functions and resolve them quickly. [^fn1]

[^fn1]: [Using Lambda Insights (AWS docs)](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Lambda-Insights.html)

## How it works

Lambda Insights uses a new CloudWatch Lambda extension, which is provided as a Lambda layer. When you install this extension on a Lambda function, it collects system-level metrics and emits a single performance log event for every invocation of that Lambda function. CloudWatch uses embedded metric formatting to extract metrics from the log events.

**Important note:** You can still use CloudWatch Logs Insights (which is a distinct service) even if you haven't enabled Lambda Insights for a function. For example, you can still use the rich glob and regex filtering query syntax; you just won't be able to use the new metrics that the Lambda Insights layer records.

## How to turn it on

Using [[Serverless Framework]], add [this plugin](https://www.npmjs.com/package/serverless-plugin-lambda-insights) to turn it on at a service or per-function level.

## Pricing

Pricing is based on the number of metrics tracked, plus the volume of performance log events ingested. Worked example from the docs: [^fn3]

[^fn3]: [CloudWatch Pricing (AWS docs)](https://aws.amazon.com/cloudwatch/pricing/#Example_14_.E2.80.93_Monitoring_with_Lambda_Insights)

If you monitor 1 Lambda function that is invoked 1M times per month, your costs would be as follows:

**CloudWatch metrics**

There is a predefined number of metrics reported for every function. Every function reports 8 metrics. CloudWatch metrics are aggregated by function using their name.
All CloudWatch metrics are prorated on an hourly basis. If your function is invoked less than once per hour, your function will only be billed for the hours that it is invoked.

Monthly number of CloudWatch metrics per function = 8 metrics \* 1 function = 8 CloudWatch metrics

**Monthly CloudWatch metrics costs = $0.30 per metric for first 10,000 metrics \* 8 metrics = $2.40**

Once you exceed 10,000 total metrics in your account then volume pricing tiers will apply. See metrics pricing table for details.

**CloudWatch Logs**

A single log event is generated for each function invoke. The size of each log event is approximately 1.1 KB.

Monthly GB of CloudWatch Logs ingested = (1.1 KB/1024/1024) GB \* 1,000,000 invokes per month = 1.05 GB per month

Monthly ingested logs costs = $0.50 per GB of ingested logs \* 1.05 GB of performance events as CloudWatch Logs = $0.52 per month

**Monthly CloudWatch costs = $2.40 + $0.52 = $2.92 per month**

There are no minimum fees or mandatory service usage. If the function is not invoked, you do not pay. Pricing values displayed here are based on the US East (N. Virginia) AWS Region. Please refer to the pricing information for your Region.

## #OpenQuestions

- How is this better than the existing log searching that you can do with Lambda logs inside CloudWatch/CloudWatch Insights?
  - Allows you to search across multiple Log Streams and **Groups** (i.e. logs from different Lambda functions)
  - Extra metrics are tracked, full list [here](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Lambda-Insights-metrics.html)
- Is it wise to enable it by default across all functions? (in terms of cost impact?)
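
On the cost-impact question, the arithmetic from the worked pricing example can be turned into a rough estimator. This is only a sketch: the function name `estimate_monthly_cost` and the constant names are made up for illustration, and the prices are the US East (N. Virginia) figures quoted in the example, so they are assumptions to check against the current pricing page. It ignores hourly proration, the volume tiers beyond 10,000 metrics, and any free tier.

```python
# Rough Lambda Insights cost estimate, mirroring the worked example above.
# All constants are assumptions copied from that example (US East, N. Virginia).
METRICS_PER_FUNCTION = 8          # metrics reported per function
PRICE_PER_METRIC_USD = 0.30       # first-tier price (first 10,000 metrics)
PRICE_PER_GB_INGESTED_USD = 0.50  # CloudWatch Logs ingestion price
LOG_EVENT_SIZE_KB = 1.1           # approximate size of one performance log event


def estimate_monthly_cost(functions: int, invocations_per_function: int) -> dict:
    """Estimate monthly cost in USD for `functions` Lambda functions,
    each invoked `invocations_per_function` times per month."""
    # Metrics: a fixed number per function, billed at the first-tier price.
    metrics_cost = METRICS_PER_FUNCTION * functions * PRICE_PER_METRIC_USD

    # Logs: one performance log event per invocation, converted from KB to GB.
    total_invocations = functions * invocations_per_function
    ingested_gb = LOG_EVENT_SIZE_KB / 1024 / 1024 * total_invocations
    logs_cost = ingested_gb * PRICE_PER_GB_INGESTED_USD

    return {
        "metrics_cost_usd": metrics_cost,
        "ingested_gb": ingested_gb,
        "logs_cost_usd": logs_cost,
        "total_usd": metrics_cost + logs_cost,
    }


if __name__ == "__main__":
    # The docs' scenario: 1 function invoked 1M times per month.
    estimate = estimate_monthly_cost(functions=1, invocations_per_function=1_000_000)
    for key, value in estimate.items():
        print(f"{key}: {value:.2f}")
```

For the docs' scenario the total rounds to $2.92, matching the worked example. The metrics charge is flat per function, so for rarely invoked functions it dominates; for busy functions the log ingestion term grows linearly with invocations.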
## Useful queries [^fn2]

[^fn2]: [Operating Lambda: Using CloudWatch Logs Insights](https://aws.amazon.com/blogs/compute/operating-lambda-using-cloudwatch-logs-insights/) by [[@James Beswick]]

These are CloudWatch Logs Insights queries. The curly quotes and Markdown escapes picked up from the source have been replaced with plain characters so the queries can be pasted directly into the console.

**The last 100 errors**

```
fields Timestamp, LogLevel, Message
| filter LogLevel = "ERR"
| sort @timestamp desc
| limit 100
```

**The top 100 highest billed invocations**

```
filter @type = "REPORT"
| fields @requestId, @billedDuration
| sort @billedDuration desc
| limit 100
```

**Percentage of cold starts in total invocations**

```
filter @type = "REPORT"
| stats sum(strcontains(@message, "Init Duration")) / count(*) * 100 as coldStartPct,
        avg(@duration)
  by bin(5m)
```

**Percentile report of Lambda duration**

```
filter @type = "REPORT"
| stats avg(@billedDuration) as Average,
        percentile(@billedDuration, 99) as NinetyNinth,
        percentile(@billedDuration, 95) as NinetyFifth,
        percentile(@billedDuration, 90) as Ninetieth
  by bin(30m)
```

**Percentile report of Lambda memory usage**

```
filter @type = "REPORT"
| stats avg(@maxMemoryUsed / 1024 / 1024) as mean_MemoryUsed,
        min(@maxMemoryUsed / 1024 / 1024) as min_MemoryUsed,
        max(@maxMemoryUsed / 1024 / 1024) as max_MemoryUsed,
        percentile(@maxMemoryUsed / 1024 / 1024, 95) as Percentile95
```

**Invocations using 100% of assigned memory**

```
filter @type = "REPORT" and @maxMemoryUsed = @memorySize
| stats count_distinct(@requestId) by bin(30m)
```

**Average memory used across invocations**

The start of this query was cut off in the source. The `filter` and the first `stats` term below are a reconstruction modelled on the visualization query that follows, so treat them as an assumption.

```
filter @type = "REPORT"
| stats (avg(@maxMemoryUsed) / max(@memorySize)) * 100 as avgMemoryUsedPERC,
        avg(@billedDuration) as avgDurationMS
  by bin(5m)
```

**Visualization of memory statistics**

```
filter @type = "REPORT"
| stats max(@maxMemoryUsed / 1024 / 1024) as maxMemMB,
        avg(@maxMemoryUsed / 1024 / 1024) as avgMemMB,
        min(@maxMemoryUsed / 1024 / 1024) as minMemMB,
        (avg(@maxMemoryUsed / 1024 / 1024) / max(@memorySize / 1024 / 1024)) * 100 as avgMemUsedPct,
        avg(@billedDuration) as avgDurationMS
  by bin(30m)
```

**Invocations where Lambda exited**

```
filter @message like /Process exited/
| stats count() by bin(30m)
```

**Invocations that timed out**

```
filter @message like /Task timed out/
| stats count() by bin(30m)
```

**Latency report**

```
filter @type = "REPORT"
| stats avg(@duration), max(@duration), min(@duration) by bin(5m)
```

**Over-provisioned memory**

```
filter @type = "REPORT"
| stats max(@memorySize / 1024 / 1024) as provisionedMemMB,
        min(@maxMemoryUsed / 1024 / 1024) as smallestMemReqMB,
        avg(@maxMemoryUsed / 1024 / 1024) as avgMemUsedMB,
        max(@maxMemoryUsed / 1024 / 1024) as maxMemUsedMB,
        provisionedMemMB - maxMemUsedMB as overProvisionedMB
```

---
tags: