As mentioned in last week's Sunday Reboot, I've recovered a few of my old posts from the Wayback Machine, all from my time at the now shut-down Fixate.io (via their blog, Sweetcode). These are all freelance pieces, so the style of writing is a bit different than normal (for me, at least), and the topics are a little wider-ranging than I would normally write about here. This particular piece was written for Twistlock (now owned by Palo Alto Networks) in about 2018 and subsequently cancelled.
When it comes to software development, log storage has historically been what you might call a "known known." Log data is typically written to standard output, to standard error, or directly to a file using whatever mechanism the application's operating system or programming language chooses to use. While different technologies can follow slightly different standards, "finding" logs generally requires you to look in only a few places.
But as new technologies hit the scene that allow us to develop and scale applications in new and exciting ways, the patterns that we've taken for granted in the past have started to change. Program execution, data management, log storage: all of these things can feel unintuitive as a result of the new paradigms established by the technology running them.
One such change developers are just now learning to deal with is finding, accessing, and analyzing the log data coming out of serverless platforms. Most commonly implemented as "Functions as a Service," these serverless platforms execute encapsulated business logic in fully managed, stateless, event-based virtualized environments. Because these environments are ephemeral (only alive for as long as a request takes), traditional methods of accessing log data aren't very practical.
So, how do we extract and analyze log data generated from within a serverless environment?
For the most part, the answer to this question depends entirely on the serverless provider. While diving into the logging methods for every serverless vendor isn't all that practical, let's take a look at how the three largest cloud providers handle serverless log data.
AWS Lambda
By default, all native logs within a Lambda function are stored in the function execution result within Lambda. Additionally, if you would like to review log information immediately after executing a function, invoking the Lambda function with the LogType parameter set to Tail will retrieve the last 4KB of log data generated by the function. This information is returned, base64-encoded, in the X-Amz-Log-Result header in the HTTP response.
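Because that log tail arrives base64-encoded, it has to be decoded before it can be read. The following is a minimal sketch of that step only; the helper name decodeLogResult and the sample value are hypothetical, and the sample stands in for a real header rather than reproducing actual Lambda output.

```javascript
// Decode the base64-encoded log tail that Lambda returns when a function
// is invoked with LogType set to "Tail".
// decodeLogResult is a hypothetical helper name, not part of any AWS SDK.
function decodeLogResult(base64LogResult) {
  return Buffer.from(base64LogResult, 'base64').toString('utf8');
}

// Hypothetical sample standing in for a real X-Amz-Log-Result header value.
const sampleHeader = Buffer.from(
  'START RequestId: example\nhello from Lambda\nEND RequestId: example\n',
  'utf8'
).toString('base64');

// Split the decoded text into individual, non-empty log lines.
const logLines = decodeLogResult(sampleHeader)
  .split('\n')
  .filter((line) => line.length > 0);

console.log(logLines);
```

When the function is invoked through the AWS SDK for JavaScript rather than raw HTTP, the same base64 value should be available as the LogResult field of the invoke response, so the same decoding applies.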
While these methods are great ways to test and debug issues associated with individual function calls, they do not do much by way of analysis or alerting. Thankfully, the log data that is stored in the Lambda function result is also stored in CloudWatch, Amazon's log aggregation service. To access the CloudWatch logs for a given function, you will need to know the log group and log stream names, which can be retrieved by writing them to the function's logs and reading them back from the X-Amz-Log-Result response header as mentioned above. As an example, this context can be retrieved and logged in Node.js like so:
console.log('logGroupName =', context.logGroupName);
console.log('logStreamName =', context.logStreamName);
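For context, here is a rough sketch of how those two lines might sit inside a complete Node.js handler, along with a local call using a stand-in context object. The fakeContext values are made up for illustration; in a real invocation Lambda supplies the context object itself.

```javascript
// Minimal Lambda handler sketch: logs where its own output lands in CloudWatch
// and returns the same values so a caller can inspect them.
async function handler(event, context) {
  console.log('logGroupName =', context.logGroupName);
  console.log('logStreamName =', context.logStreamName);
  return {
    logGroupName: context.logGroupName,
    logStreamName: context.logStreamName,
  };
}

// Lambda looks up the handler by its exported name.
exports.handler = handler;

// Local smoke test with a hypothetical context object (values are invented).
const fakeContext = {
  logGroupName: '/aws/lambda/example-function',
  logStreamName: 'example-stream',
};

handler({}, fakeContext).then((result) => {
  console.log('handler returned', result);
});
```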
Azure Functions
Similar to AWS Lambda, logs written within Azure Functions are stored within the function execution results in the Azure Portal. One thing worth pointing out is that Azure Functions offers custom logging mechanisms, rather than utilizing the built-in methods offered by each language. For example, the Node.js implementation requires using a context.log() method, rather than the standard console.log(). As above, this is valuable when debugging functions during development; however, it hardly helps with monitoring and analysis. For that, Azure Functions offers a built-in integration with Azure Application Insights, Microsoft Azure's Application Performance Monitoring platform.
It is important to note that, while AWS Lambda automatically writes logs directly to CloudWatch, some assembly is required to store Azure Functions logs in Azure Application Insights for an existing function (although the option to store logs in Azure Application Insights is made available upon function creation). Configuring Azure Application Insights for a given Azure Function is outside the scope of this article, so I highly recommend reading about the process here.
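Returning to the context.log() method mentioned earlier, the sketch below shows roughly what logging looks like inside an HTTP-triggered Node.js function. It assumes the programming model in which the runtime passes a context object whose log function also carries level-specific variants such as context.log.warn; the function name httpTrigger and the stand-in context used for the local call are hypothetical.

```javascript
// Sketch of an HTTP-triggered Azure Function (Node.js) that logs through
// the context object instead of console.log.
async function httpTrigger(context, req) {
  context.log('Processing request for', req.url);

  if (!req.query.name) {
    // Level-specific logging; assumes the runtime provides context.log.warn.
    context.log.warn('No name supplied; falling back to a default');
  }

  context.res = {
    status: 200,
    body: `Hello, ${req.query.name || 'world'}`,
  };
}

exports.httpTrigger = httpTrigger;

// Local stand-in for the runtime-provided context (hypothetical).
const captured = [];
const log = (...args) => captured.push(['info', args.join(' ')]);
log.warn = (...args) => captured.push(['warn', args.join(' ')]);
const fakeContext = { log, res: null };

httpTrigger(fakeContext, { url: '/api/hello', query: {} }).then(() => {
  console.log(captured);
  console.log(fakeContext.res);
});
```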
Google Cloud Functions
While the built-in logging methods are the proper way of writing log data in Google Cloud Functions, the platform also logs everything written to standard output at the INFO log level, everything written to standard error at the ERROR level, and all internal system messages at the DEBUG level. To access these logs, the gcloud command can be used at the global level, on an individual function, or even for a specific execution, like so:
$ gcloud functions logs read
$ gcloud functions logs read FUNCTION_NAME
$ gcloud functions logs read FUNCTION_NAME --execution-id EXECUTION_ID
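To make the log-level mapping described above concrete, here is a minimal sketch of an HTTP function that writes to both standard output and standard error. The function name helloLogs and the stand-in response object are hypothetical; in a deployed function the platform supplies the real request and response objects.

```javascript
// Sketch of an HTTP-triggered Cloud Function (Node.js).
function helloLogs(req, res) {
  // Written to standard output, so it is recorded at the INFO level.
  console.log('Handling request');

  // Written to standard error, so it is recorded at the ERROR level.
  console.error('Something looks wrong');

  res.status(200).send('done');
}

exports.helloLogs = helloLogs;

// Local stand-in for the response object (hypothetical), just enough to
// let the function run outside the platform.
const fakeRes = {
  statusCode: null,
  body: null,
  status(code) {
    this.statusCode = code;
    return this;
  },
  send(body) {
    this.body = body;
    return this;
  },
};

helloLogs({}, fakeRes);
console.log(fakeRes.statusCode, fakeRes.body);
```

Once deployed, both messages should then show up in the output of the gcloud functions logs read commands shown earlier, each tagged with its level.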
Additionally, just as CloudWatch does for AWS Lambda and Application Insights does for Azure Functions, Google Stackdriver can be used to analyze and monitor logs from Google Cloud Functions. Stackdriver offers both a dashboard interface and a REST API for querying, analyzing, and alerting on log metrics. Interestingly, Stackdriver also offers second-party triggers for reacting to specific logs.
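Queries against Stackdriver are expressed as filter strings. As a rough illustration, the sketch below assembles a filter that narrows results to one function and a minimum severity. The helper name buildFunctionLogFilter is hypothetical, and the sketch assumes the function's logs are recorded under the cloud_function resource type with a function_name label.

```javascript
// Build a Stackdriver log filter for a single Cloud Function.
// buildFunctionLogFilter is a hypothetical helper, not part of any Google SDK.
// Assumes the logs use the "cloud_function" resource type.
function buildFunctionLogFilter(functionName, minSeverity = 'ERROR') {
  return [
    'resource.type="cloud_function"',
    `resource.labels.function_name="${functionName}"`,
    `severity>=${minSeverity}`,
  ].join(' AND ');
}

console.log(buildFunctionLogFilter('FUNCTION_NAME'));
```

The resulting string can be pasted into the Stackdriver log viewer or passed as the filter argument to gcloud logging read.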
Taking Things Further
Serverless platforms, as with all new technologies, have introduced new paradigms into the software development lifecycle. But with the proper integration of the provider's built-in log aggregation and application performance monitoring tools, the log data generated by these serverless functions can be analyzed just as easily as with other tools (if not more so). And with a little work, this data can be pulled into third-party performance monitoring platforms for even more end-to-end analytics and monitoring.