So, no servers?
Yeah, I checked and there are definitely no servers.
Well… the cloud service providers do need servers to host and run the code, but we don’t have to worry about them. Which operating system to use, how and when to run the instances, scalability, and the rest of the architecture are all managed by the provider.
But that doesn’t mean there’s no management at all. It’s a common misconception that in a serverless paradigm, we don’t have to care about monitoring, testing, securing, and other details that we are used to managing in other paradigms. So let’s explore the main characteristics that we need to take into consideration when building a serverless solution.
First, why serverless?
One of the great advantages of serverless is that you only pay for what you use. This is commonly known as “scale to zero”: when a function is not in use, it can be reduced to zero replicas so it stops consuming resources (not only network I/O, but also CPU and RAM), and then brought back to the required number of replicas when it is needed.
An AWS Lambda function can be triggered by an API Gateway event, a modification to a DynamoDB table, or even a modification to an S3 object, as described in What Are AWS Lambda Triggers? But to really save money on serverless, you need to take into consideration all of the services that a Lambda needs to work. Serverless architecture provides many advantages, but it also introduces new challenges. In this article, we’ll provide best practices for building a serverless solution.
Costs
Storage
Even though it is not a direct Lambda cost, a common architectural design is to store some of the assets a Lambda uses in an S3 bucket, so we need to add the S3 cost to the total cost.
Network
If you’re sending or receiving large amounts of data on each request, you need to review this cost carefully, because during peak hours it can climb quickly.
API calls
This is another hidden cost, since it’s not charged to the Lambda resources. You may make a lot of API calls to read database information or reach other services, so these calls are still an important part of the total cost.
Cold starts
A cold start is the first execution of the Lambda after it has been scaled down to zero replicas (40 to 60 minutes after the last execution). During a cold start, the Lambda may take longer to get everything ready and respond. So even though it is not an actual extra charge, you might want to avoid cold starts by increasing your memory limits or by creating a script that “warms up” the Lambda by calling it every few minutes. Either solution adds cost to the Lambda.
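As a rough illustration of the warm-up approach, the sketch below shows a Python handler that returns early when it receives a keep-warm ping, so the scheduled calls do as little billable work as possible. The `warmup` event key and the `process` helper are hypothetical names chosen for this example, not part of any AWS API, and the script that sends the ping is assumed to exist separately.

```python
def process(event):
    """Hypothetical business logic; stands in for the real work of the function."""
    return {"statusCode": 200, "body": f"processed {event.get('name', 'anonymous')}"}


def handler(event, context):
    """Lambda-style entry point that short-circuits on warm-up pings."""
    # Assumption: the warm-up script sends an event shaped like {"warmup": True}.
    if isinstance(event, dict) and event.get("warmup"):
        # Return immediately so the keep-warm call stays as short as possible.
        return {"statusCode": 200, "body": "warm"}
    return process(event)
```

Keeping the warm-up branch at the very top of the handler means the ping never touches databases or other services, which limits the extra cost to the invocation itself.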
The actual execution time
The execution time is billed in 100ms increments. So an invocation that runs for less than 100ms, let’s say 25ms, ends up costing the same as one that runs for the full 100ms. That’s why we sometimes spend more money than we should. Even if the execution time exceeds an increment by only 5 milliseconds (105ms), we still have to pay for the whole next increment.
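The rounding described here can be sketched in a few lines of Python. The function name is made up for this example, and the 100ms increment is taken from the text above and exposed as a parameter, so treat it as an assumption rather than a fixed rule.

```python
def billed_duration_ms(duration_ms: int, increment_ms: int = 100) -> int:
    """Round an execution time up to the next billing increment."""
    # Ceiling division on integers avoids floating-point rounding surprises.
    increments = -(-duration_ms // increment_ms)
    return increments * increment_ms


for actual in (25, 100, 105):
    print(f"{actual}ms is billed as {billed_duration_ms(actual)}ms")
# 25ms is billed as 100ms
# 100ms is billed as 100ms
# 105ms is billed as 200ms
```

Under this assumption, trimming a function from 105ms to 95ms would halve its billed duration, while trimming it from 95ms to 25ms would change nothing.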
To get all of this information about how much we are really spending, we need to monitor the Lambda.
Monitoring the Lambda
A common mistake is to confuse zero administration with zero monitoring. In a serverless environment, we still need to pay attention to the metrics, and these will be a bit different from the traditional ones like CPU, memory, disk size, etc. Lambda CloudWatch Metrics provides very useful metrics for every deployed function. According to the AWS documentation, these metrics include:
- Invocation Count: Measures the number of times a function is invoked in response to an event or invocation API call.
- Invocation Duration: Measures the elapsed time from when the function code starts executing to when it stops executing.
- Error Count: Measures the number of invocations that failed due to errors in the function (response code 4XX).
- Throttled Count: Measures the number of Lambda function invocation attempts that were throttled due to invocation rates exceeding the customer’s concurrent limits (error code 429).
- Iterator Age: Measures the age of the last record for each batch of records processed. Age is the difference between the time the Lambda received the batch, and the time the last record in the batch was written to the stream. This is present only if you use Amazon DynamoDB stream or Kinesis stream.
- DLQ Errors: Counts the times Lambda was unable to write a failed event to the configured dead-letter queue (DLQ). Events that do reach the DLQ can be sent again to the Lambda function, generate a notification, or just be removed from the queue.
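To make the list above more concrete, here is a small Python sketch that builds the request parameters for reading one of these metrics from CloudWatch. The metric names and the `AWS/Lambda` namespace are the ones CloudWatch uses; the helper name, the default time window, and the choice of statistic are assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names (namespace AWS/Lambda) for the metrics listed above.
LAMBDA_METRICS = {
    "Invocation Count": "Invocations",
    "Invocation Duration": "Duration",
    "Error Count": "Errors",
    "Throttled Count": "Throttles",
    "Iterator Age": "IteratorAge",
    "DLQ Errors": "DeadLetterErrors",
}


def metric_request(function_name, metric_name, hours=1, period_seconds=300):
    """Build keyword arguments for CloudWatch's GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    # Durations and ages are averaged; the remaining metrics are counts, so they are summed.
    statistic = "Average" if metric_name in ("Duration", "IteratorAge") else "Sum"
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [statistic],
    }
```

With boto3 installed and AWS credentials configured, a call such as `boto3.client("cloudwatch").get_metric_statistics(**metric_request("my-function", LAMBDA_METRICS["Error Count"]))` should return the datapoints, where `my-function` is a placeholder function name.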
Besides the default metrics, there are plenty of monitoring services, like Dashbird, Datadog, and Logz.io, that can be integrated to provide additional metrics and better log visualization.
Right now, everything seems very clear and straightforward, right? We have some new metrics and configurations to learn, but it is pretty similar to our traditional structures.
But what about tests? Can we even make local tests for serverless?
Tests
Local testing
Since we don’t manage the infrastructure anymore, can we run it locally? If so, how can we do that?
We do have some options to simulate the serverless environment locally, like LocalStack and Docker-Lambda. They can simulate serverless functions and a few other services, such as an API Gateway. But most of these tools differ from the real environment in areas like permissions, the authentication layer, and other services.
The best way to check if everything is working as intended is writing the actual tests!
Unit testing
Unit tests are always a must, whether or not your app is serverless. They are the cheapest tests (the fastest to write and execute). We can use mocked functions to test the business logic in isolation.
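As a minimal sketch of testing business logic in isolation, the Python example below mocks the function that would normally call an external service. The `apply_discount` function, the `get_customer` dependency, and the 10% discount rule are all invented for illustration.

```python
import unittest
from unittest.mock import Mock


def apply_discount(order, get_customer):
    """Business logic: loyal customers get 10% off. `get_customer` is injected."""
    customer = get_customer(order["customer_id"])
    rate = 0.10 if customer["loyal"] else 0.0
    return round(order["total"] * (1 - rate), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_loyal_customer_gets_discount(self):
        # The mock replaces the database or API lookup the real function would make.
        get_customer = Mock(return_value={"loyal": True})
        order = {"customer_id": "c1", "total": 50.0}
        self.assertEqual(apply_discount(order, get_customer), 45.0)
        get_customer.assert_called_once_with("c1")

    def test_regular_customer_pays_full_price(self):
        get_customer = Mock(return_value={"loyal": False})
        order = {"customer_id": "c2", "total": 50.0}
        self.assertEqual(apply_discount(order, get_customer), 50.0)
```

Saved as a test module, this can be run with `python -m unittest`. Because the dependency is passed in as an argument, the tests need no AWS account, network access, or local emulator.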
Integration testing
Integration testing will allow you to catch errors when your function interacts with external services. These tests become very important, since serverless apps usually rely on a combination of external services that communicate with each other constantly.
GUI testing
GUI tests are usually expensive and slow because we have to run them in a manual, human-like environment. But serverless makes them cheaper, because the tests can be parallelized quickly and at low cost.
To make the app easier to test, a good approach is to divide the function into many smaller functions that work together to accomplish the same task. One of the best ways to do this is to apply a hexagonal architecture.
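A hexagonal (ports and adapters) layout can be sketched in Python as follows. This is only one possible arrangement, and every name in it (`GreetingStore`, `greet`, `InMemoryGreetingStore`, `make_handler`) is hypothetical: the core logic depends on a small interface, and the Lambda-style handler and the storage are adapters plugged in from outside.

```python
from typing import Protocol


class GreetingStore(Protocol):
    """Port: what the core needs from storage, with no provider details."""

    def load_greeting(self, language: str) -> str: ...


def greet(name: str, language: str, store: GreetingStore) -> str:
    """Core logic: depends only on the port, so it is easy to unit test."""
    return f"{store.load_greeting(language)}, {name}!"


class InMemoryGreetingStore:
    """Adapter for tests and local runs; a cloud-backed adapter would expose the same method."""

    def __init__(self, greetings):
        self._greetings = greetings

    def load_greeting(self, language):
        return self._greetings.get(language, "Hello")


def make_handler(store):
    """Inbound adapter: translates a Lambda-style event into a call to the core."""

    def handler(event, context):
        body = greet(event["name"], event.get("language", "en"), store)
        return {"statusCode": 200, "body": body}

    return handler


handler = make_handler(InMemoryGreetingStore({"es": "Hola"}))
print(handler({"name": "Ana", "language": "es"}, None)["body"])  # Hola, Ana!
```

Swapping the in-memory store for one backed by a real service would not require any change to `greet`, which is what keeps the small functions testable on their own.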
Conclusion
Serverless architecture can be a big paradigm change, providing us with a whole new bag of useful tools and advantages. But it also introduces new challenges for developers, who need to make decisions about the new options they have. Learning the best practices before you start developing is always the easiest and shortest path to adopting any new paradigm. Hopefully, these tips will help you decide which approaches are best for your new project.