Getting Started with AWS Lambda: Coding Session
This is an on-demand webinar; for your convenience, we have added the transcript below.
Hello, everyone.
Welcome to this new AWS Lambda coding session webinar.
I am Alex Casalboni from Cloud Academy. Today, we’ll go through a very brief introduction of the main AWS Lambda and serverless concepts. Then I will walk you through five different demos, trying to cover the main AWS Lambda use cases: API Gateway, S3, DynamoDB, and so on.
First of all, a very brief introduction about myself. I’m a software engineer. I’ve been working in the development field for quite a few years now. I was lucky enough to join Cloud Academy a bit more than three years ago, when I started coding in Python, which is also the language I will be using today for our demos.
First of all, very quickly, what is serverless, and why is it even a word?
Let’s say serverless was born about two years ago. It was introduced by AWS, and I personally believe that the main advantage of this new approach to application development is that you get almost zero administration, meaning that you do not have to manage servers. That’s why it’s called serverless.
There are still servers, of course, but they need no management. It’s much simpler.
Basically, the unit of scale is not the system, not a container, not an application, but a single function. This means that if you follow a microservices-oriented design, you can really speed up your development and simplify your workflow, just by removing servers from your daily tasks.
What about Lambda?
Lambda comes with all the benefits of serverless. I think the main point here is that you can achieve scalability and very high availability without managing any infrastructure, because it’s all managed for you. Of course, you still have to deal with operations, but those operations are pretty developer-friendly, meaning that you can do everything from a browser console or with a few API calls.
You can update your code, version your code, and test your code in a very easy and user-friendly way.
Of course, it is AWS, so you also get everything that comes with the ecosystem: every possible integration with all the other services. We will see some of them later in the demos. One of the other mantras is that you never pay for idle, because there is no idle. You do not pay for servers.
You only pay for invocations of your functions, so you don’t have to think about idle at all.
Also, one of the main use cases for Lambda is that you can attach a RESTful interface to it through Amazon API Gateway. That’s the first thing you’re going to see today. Almost for free, you can have a RESTful interface on top of any Lambda function you can develop and think about.
We’ll go through five different demos, and I would like to show you how to integrate Lambda with the most important AWS services today. During the last webinar with Austen Collins, we talked a lot about best practices and coding patterns, with a special reference to the Serverless Framework (LINK).
If you don’t know about it, please check it out on our blog, or on serverless.com.
We went through, let’s say, the three main approaches that people are taking when they develop AWS Lambda systems.
The first one is the nano-services approach or pattern, where every single one of the functions only deals with one single job.
This means you have a lot of Lambda functions, which are simpler individually, but in a way harder to maintain. Also, they share a lot of dependencies and a lot of code. It’s just more complicated to handle.
If you look at the diagram here, every color in each little section is a logical unit, and each one of them will have multiple Lambda functions. For example, if you have CRUD logic, you will need one Lambda function for GET, one for POST, one for PUT.
This adds a lot of overhead to your operations and code maintenance.
The second pattern, which is more suitable in the case of HTTP interfaces, is to have one function handling different jobs. For example, it will handle all the GET, POST, and PUT operations over your resources with one single Lambda function. This allows you to have much fewer functions overall, and it’s also much easier to share and organize your code. It will also make your Lambda functions faster, in a way, because if you don’t call a function for long enough, the underlying container will be dismissed. This means that if you don’t call the POST operation for one hour, the next call will be slower.
Instead, if you have all the GET, POST, and PUT logic in one single Lambda function, this issue will not be as important, because your Lambda will remain warm and will not go cold. This is the cold-start issue that we talked about during the last webinar.
Then there’s the new monolithic approach, where you manage all the interaction between the client and the server with only one function and GraphQL, which is a great new tool that lets the client specify what kind of structure it expects as the output of your API.
It’s a new approach where everything is managed by one single function that aggregates and returns the right structure to your clients. It will make your API versioning and maintenance much easier. We might have another webinar just about this new approach; I’m very curious to explore it myself.
Okay, let’s start with the first demo.
I think most people are using Lambda to build RESTful HTTP interfaces, and I think this is one of the main use cases.
Let’s focus on this one for a while. Let’s start from scratch. We’ll go through all the steps, how you build a RESTful interface, how you design mapping templates, and how you manage stages, aliases, and versions between Lambda and API Gateway, which can be a little confusing at the beginning.
Let’s go into a real AWS console.
I’m already logged into my account, and we can start the AWS Lambda creation wizard.
You can select a blueprint, which is a ready-to-use function that you can customize, but in this case let’s just build it from scratch.
I’m really happy to be able to show you this new interface, which was released a few days or weeks ago.
This will make it much easier to create your Lambda function from scratch whenever you already have an idea of what you will build.
In our case, we’ll just select the trigger:
And you’re shown all of the possibilities.
Until a couple of weeks ago, you were forced to build your Lambda function first, and then go to Amazon S3, Amazon DynamoDB, or Amazon API Gateway and configure everything manually across all these different services. Now, you have this beautiful interface where I can just select API Gateway and create a new API. Let’s call it “Hello World”. Very simple.
Then we have a new resource. Let’s call it Hello. We want the GET method, and we can select a deployment stage; this is how API Gateway manages staging.
You can have a dev stage, a prod stage, a testing stage. In this case, let’s start with a simple dev stage, which is a convention, and let’s make the API open so we don’t have to mess with authentication and authorization, although that would be a very interesting demo.
Okay, great. Of course, we get a warning because the API we defined will be publicly available. This is a useful warning, but we’re not doing this in a real production environment.
Great, let’s call our function Hello, just Hello. As I mentioned, we’re going to use Python 2.7, and let’s just build a new function.
As you can imagine, we want to start very simple with just a Hello World. Maybe we also want to print the event. This will generate a log, which comes for free with AWS Lambda.
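A minimal sketch of what this first function might look like (the return value is arbitrary; the print statement is what ends up in the logs):

```python
def lambda_handler(event, context):
    # Print the incoming event; Lambda ships stdout to CloudWatch Logs for free
    print(event)
    return 'Hello World!'
```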
Now we’ll slightly complicate this example step by step, so that I can show you how to build, how to create a new version, how to work with aliases, and so on.
We don’t need any particular role, so we can just select the basic Lambda execution role, which allows Lambda to just write logs.
Nothing special. Okay, three seconds is fine. We don’t need any VPC configuration, so we can just go. This is the review: it says API Gateway production stage, but I think we wanted a dev stage, as we said. Right? Done.
Of course, if you have any additional dependencies or some particular requirements, you can upload your own code package, or upload it to S3. You can version it and do whatever you want. For today, we’re just going to edit the code here in the browser, which is simpler to show you.
We just created a new function. There is only one version, which AWS calls $LATEST. This is the default, most recent version. We also bound this version to our development API Gateway stage.
We didn’t do anything yet, but we already have a Hello World HTTP API. Okay, this is great. Pretty simple. We didn’t do much, but it works!
Let me tell you about versions and aliases:
You can create a new version each time you make, let’s say, a meaningful change to your Lambda function code. This allows you to keep different versions, to roll back, and to have different stages, so that, for example, your development stage can point to a specific version, and your production stage to another one.
Therefore, you can test new functionalities and new improvements to your code without breaking the production. This is a very good practice.
Aliases, as I was saying: you can create an alias, as we did for API Gateway, and bind it to, for example, the latest version.
In this case, whenever you update your code, your development stage will use the latest version of it. Meanwhile, if you also created a production stage bound to a previous, stable version of your code, you will not break anything.
Let’s just do this. Let’s publish a new version, because we have no versions yet.
Working with $LATEST directly is not recommended. Let’s call this the initial version, and publish. As you see, we don’t have any aliases yet.
Lambda doesn’t create an alias automatically, so we can create a new alias.
We can call it dev, just dev, and we can bind it to the latest version. I like to do it this way. Maybe it’s not a great practice, but I like to have the dev environment always updated with the latest version of my code.
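For reference, the same steps can also be scripted with the AWS CLI (a hedged equivalent of what we just clicked through; the function name Hello matches our demo):

```bash
# Publish an immutable version from $LATEST
aws lambda publish-version --function-name Hello --description "initial version"
# Create a dev alias that always tracks $LATEST
aws lambda create-alias --function-name Hello --name dev --function-version '$LATEST'
```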
Having dev always on the latest version makes my flow much faster. What I’m going to do now is go back to our latest version, and, for example, update the log.
We just made a little, simple change to our code. We need to save it. Now our latest version is updated, our dev alias is still bound to the latest version, and our API should already be up to date.
We can check our logs now and see. Let’s do it very quickly: you can open the CloudWatch logs and see exactly what happened.
Here, you can see the latest event.
Now, we have a development API Gateway stage bound to our latest version. Let’s go and improve this.
Let’s go to API Gateway, and as you see, a Hello World API was created.
We have the Hello resource and a GET method. This is what you used to do manually until a few weeks ago. Now it just happened automatically.
As you see, the Lambda function integration is just pointing to Hello. This means that API Gateway references the latest version of our Lambda function.
What we can do is say: please point our GET Hello method to the Hello function’s dev alias. We authorize it, and we will be asked, “Do you really want to grant API Gateway permission to invoke this alias?” Great. Let’s do it. We can deploy our API to the dev stage, and done.
We didn’t change much, but you’ll see this becomes very useful when we start having more than one stage.
Okay, what we want to do now is we want to create a new stage. We can call it production.
First we go to AWS Lambda. We create a new alias.
We call it prod, just prod. Let’s say we bind it to a stable version of our function, not to the latest one.
Now we have, as you can see, two different aliases, dev and prod: dev always on the latest version, and prod on the stable one.
How do we also point an API Gateway production stage to the prod alias of our function?
You just configure it here with the Lambda function integration.
You say prod.
You will be asked again to grant API Gateway the permission; the console makes those API requests for you, otherwise you would need to make a few API calls manually. Then we can redeploy our API, but this time into a new stage. Let’s call it production: prod. No particular description.
Now, you can see here we have two different, independent stages. They should both work. Now we have dev and prod.
They look the same, although they reference two different Lambda aliases. They are basically the same code so far.
Let’s go to our latest Lambda function version and make it a little bit better.
Let’s make it more dynamic. Let’s take an input parameter.
Lambda provides us with an event object, which in Python is just a very simple dictionary. We can fetch the name attribute from the event and put it into a name variable.
How do we update our code? We just say Hello name. This is how we do it in Python. Very simple. But what if name is not provided? Really, if I keep calling my API without it, what’s going to happen?
Let’s say we want to still say Hello World. If no name is provided, Hello World.
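Put together, the updated handler might look roughly like this (a sketch of what was typed on screen):

```python
def lambda_handler(event, context):
    print(event)  # logged to CloudWatch
    # Fetch 'name' from the event dictionary, falling back to 'World'
    name = event.get('name', 'World')
    return 'Hello %s!' % name
```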
We can save our function, and we can already test it here in the console by providing a sample event. Let’s say the event contains a name: Cloud Academy. What’s going to happen? It should be Hello Cloud Academy. Let’s see.
Great, so it works in Lambda.
Now, how do we make it work through the API Gateway endpoint?
If I call my dev and my prod APIs, they still work.
If you remember, our dev alias is pointing to the latest version, so our API Gateway dev stage is already up to date. What we would like to do is pass something like name equals Cloud Academy in the query string, and this should work, right? But it doesn’t. It doesn’t, because you have to tell API Gateway where to fetch the name parameter.
This is much more automatic if you make a POST request, because it is handled automatically by API Gateway. But we said we want to do a GET request in this case.
We have to tell it how to deal with the GET request.
Let’s go into the integration request. We want to deal with the dev alias again.
What we do is add a new body mapping template. For each content type, for example application/json, you can specify a mapping template using the Velocity template language.
In our case, we just want to add a name parameter. You use $input.params('name'). In this way, we are telling API Gateway to take the name parameter from wherever it appears, in the path, the query string, or the headers, and map it into a name attribute of the Lambda event. Let’s see: it works. We just save.
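For reference, the body mapping template for application/json might look like this (a minimal sketch; it builds the event dictionary that Lambda receives):

```
{
  "name": "$input.params('name')"
}
```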
It’s not enough. We also need to redeploy the API. If you only change the Lambda function code, it’s not a big deal, because everything is already configured; but if you change the API configuration itself, you need to redeploy your API. Then you’re done. What happens now?
I would expect this call to render Hello Cloud Academy.
Let’s see if it’s correct. It does.
What about our production environment? Should it print the same? Let’s see if it does. It does not. Why is this happening?
Because our production alias is pointing to a previous version.
Let’s publish a new version of our Lambda function.
The dev alias is always on the latest version, as usual. Now we want to point our production alias to this new version too, so that the production API will also have a dynamic name. Let’s save it. Do you expect this to work now? The dev one works. The prod one doesn’t, because we didn’t redeploy.
As I told you, every time you change something in your resource configuration or your integration request, you need to redeploy. Never forget that.
Let’s also add the body mapping template to our production.
It should be there already. I’ll just save it and redeploy. By the way, you should check out the previous webinar.
We talked about the serverless framework.
All these operations, all these best practices, are already managed by the framework. It will save you a lot of time. Just give it a try.
We redeploy the production environment. I would expect this to work now. It takes a few seconds. Great. Production works. Awesome.
This is how you’re going to maintain your versions and update your code. You want to work on the latest version. I like to have the dev API always updated so that I can test it very quickly, and then I can just update the production alias to the newest stable version whenever I have a new one.
This is a very simple example, but it gives you an idea of how to deal with aliases and stages. Another recommendation I’ll give you is to configure stage variables for your stages.
In this case, you could make your resource configuration dynamic by using a stage variable: the Lambda function alias could be a variable, and each stage could give it a different value. In the dev stage, the Lambda function alias variable would have the value dev. All the alias mapping would then be more automatic. What I did just now was simpler, but it’s highly recommended to use stage variables for this use case.
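For example, instead of hard-coding the alias, the Lambda function in the integration request could be referenced like this (a sketch; lambdaAlias would be a stage variable set to dev in the dev stage and prod in the production stage):

```
Hello:${stageVariables.lambdaAlias}
```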
Let’s go on. Second demo.
How do you use Lambda with S3 events? You can more or less find a Lambda function to do everything in AWS.
Every time you need a little computation, you don’t need to launch a new instance or anything.
You can, in most cases, just provide a Lambda function to do the job.
Let’s see how we bind a Lambda function to an S3 event, for example, a new object uploaded to one of your buckets.
As a simple example, we’re going to listen to S3, wait for a new file to be uploaded, compress it, and upload it back to S3 in another location.
We’ll have automatic S3 file compression. I don’t know if it sounds useful to you, but let’s see how it might work.
I already created an S3 bucket here in my account. I called it Cloud Lambda Trigger.
It’s still here, and there is already an images folder in it.
We’re going to use this as a starting point. Let’s go back to Lambda and create a new Lambda function. We could try to use a blueprint; let’s filter for Python.
You can choose a blueprint. This one looks good: s3-get-object-python. Let’s use it.
We bind our Lambda to an S3 event. You just select the right bucket. We want to listen for new files on it; let’s say the put event. We want to listen only to this images folder, so we can use a prefix. And let’s say we want to listen only to .png images, so we set a suffix too. We just enable the trigger, and we call the function S3-zipper.
We use Python 2.7, and here we have the ready-to-use S3 Lambda function from the blueprint. It’s not exactly what we need, but the main structure is very similar to what we’re going to build.
In particular, as you see, we are retrieving the bucket and the key from the event, which is given by S3. What the blueprint does is retrieve the object, print its content type, and return the content type. It’s a very simple example, just to give you a little skeleton to start from.
In our case, I already have a few lines of code to get started.
This will be our main logic. Let’s quickly go to line 17. The first lines are really the same; I basically started from the same blueprint. I want to upload my file into a new folder, so I’m taking the file name and creating a new key under the compressed/ folder.
That’s it. If I upload a Hello.png file, I will end up with a compressed/Hello.png.zip file. Very simple.
We are going to get the object referenced by the S3 event, compress its contents, and put a new object into S3.
The new key will be the compressed one, and the bucket will be the same. I wrote a very simple compress utility just by using the Python core libraries. I think we also need the os module. Okay, great.
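The function I’m using is roughly equivalent to this sketch (my actual code may differ in details; an in-memory zipfile is one way to implement the compress utility):

```python
import io
import urllib
import zipfile

import boto3

s3 = boto3.client('s3')

def compress(data, filename):
    # Build an in-memory zip archive containing the original file
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(filename, data)
    return buf.getvalue()

def lambda_handler(event, context):
    record = event['Records'][0]['s3']
    bucket = record['bucket']['name']
    key = urllib.unquote_plus(record['object']['key'].encode('utf8'))
    filename = key.split('/')[-1]
    new_key = 'compressed/%s.zip' % filename
    # Read the original object, compress it, and write it back under a new key
    body = s3.get_object(Bucket=bucket, Key=key)['Body'].read()
    s3.put_object(Bucket=bucket, Key=new_key, Body=compress(body, filename))
    return new_key
```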
The code should be fairly simple. You just retrieve the object, take its body, compress it, and push a new file to S3. Sounds pretty simple; it’s not weird magic. Let’s go on to the AWS Lambda configuration. This is a bit tricky. Never forget what you’re doing in Lambda and what permissions you’re going to need.
In this case, we are both reading from and writing to S3. If you don’t say anything, if you just use a basic role, this will not work. Let’s try it anyway. We’re going to use the same basic Lambda execution role.
We will see that it doesn’t work. Everything else looks good. There was some error with the trigger; we’ll check that later. Let’s see. Okay, the function was created.
You can configure a test event. You can select a sample event from S3, for example, S3 Put. It’s already there. We can very quickly change the bucket name so the test event is up to date. There we go.
As you can see, if we save and test, we get a very ugly error saying that we do not have the permission: access denied.
We have to set up a new IAM role. Let’s create a new custom one.
Let’s call it lambda_S3_full_access. Here is the default policy that Lambda needs.
We can edit it inline. We’re going to need a role very similar to this one. In this case, I’m just granting full access to S3, although we might constrain it to our specific bucket. Our Lambda will be able to get and put objects in S3. Great, allowed! We save our Lambda function.
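The resulting inline policy might look roughly like this (a sketch; in a real setup you would scope the S3 resource down to the specific bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:*:*:*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::*"
    }
  ]
}
```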
If everything works right, I expect to go to my bucket and upload a new image, for example, our logo. We upload it, and magically, I should go back to my bucket, refresh, and see the compressed file. Nothing happens, so something went really wrong. Let’s see. Oh yeah, because the trigger creation failed earlier. Let’s create the trigger.
Cloud Lambda trigger bucket, put event, images prefix, png suffix. Enable trigger. This is very useful in the new console: you can simply enable or disable a trigger whenever you want.
It seems the trigger was correctly bound this time, so let’s quickly delete the file and re-upload it. If we’re lucky, this time we’re going to see a new zipped version of it automatically created. Let’s refresh. It’s not there.
Great use case: how do you debug Lambda? You want to go into the CloudWatch logs and see what just happened. Let’s give it a try.
Latest log stream. The problem is still there, and it’s related to IAM.
Sometimes it just takes a while to update the permissions. Meanwhile, we can use the test event to make things run a bit faster. I should have a test event here.
Okay. In this test event, I’m using the very same file; I’m just simulating the very same event that was previously generated. It said okay. Let’s refresh the display. Great, so it worked. I think it was related to IAM: sometimes it can take up to 30 seconds to update the permissions.
We have the file. Our code worked great. Let’s see if it’s really the same file, but compressed. Let’s download it: Save Link As, let’s say here. It’s a zip file. Let’s open it, and inside, we have the same image. Great. This was the second demo: we found out how to bind Lambda to S3 events, and we also managed to create a new S3 file with a different content, of course.
You might use it even to keep a backup without compression, or to generate alternative versions of your files if you want to serve them via CloudFront. You can do any kind of magic with this if you use S3 for your storage. This is really powerful. You don’t have to add all this custom logic to your application; you can do everything inside the AWS environment.
Let’s go on. The next demo is about the Simple Notification Service (SNS). The binding works the same way as what we just did: you can configure SNS to trigger a Lambda function, and in our simple use case here, we are going to log all the SNS messages into DynamoDB. We will see how to, again, create a custom role to write new records into DynamoDB.
First of all, we’ll of course need an SNS topic. Let’s do it very quickly: create a topic.
Let’s call it CloudAcademyWebinarTest. Sounds good to me. Display name: CA webinar. Create topic. Done.
Now we have a new topic. We can publish messages on it, and subscribers will be able to subscribe to this topic and listen to messages, Lambda included. We could do this from here and create a new Lambda subscription, but let’s see how we do it directly from Lambda.
We create a new function. Again, we could use a blueprint: sns-message-python.
Let’s select it. Let’s select the SNS topic that we just created, and enable the trigger. Next, let’s call the function SNS-to-DynamoDB. This is what we’re going to do.
The blueprint gives you a very basic structure. It extracts the message from the event, and it just prints it.
This is not everything we’re going to do, but let’s say it’s enough for now. Let’s create the Lambda function, and we’ll make it more complicated later.
Again, we don’t need any custom role yet. We can choose a very basic role for this, and we will update it later in order to be able to write into DynamoDB, of course.
Great. Looks good. Create function. Everything is there; the integration, the trigger, is already set up.
We’re not doing much here, so we can just test it with a sample SNS event whose message is only “Hello from SNS”. If our function works fine, we should see “Hello from SNS” printed. Great.
What do we do now? We want to create a new DynamoDB record every time an SNS message is published to the topic. This is what we’re going to do. As we already did, we are going to extract some information from our event: the message itself, its ID, the topic ARN, and the subject. This is what you might want to log, for example. It’s a very simple example.
Then we’re going to build the item that we will write into DynamoDB. We will have a primary key called messageID, which is unique, plus the message, topic, and subject, exactly what we extracted. We can add some logging and the write operation itself: we log the item, write it, and, why not, return it as well.
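A sketch of the full function (the attribute names, like the messageID primary key, follow what we just described; the 'N/A' fallback is an assumption, since DynamoDB rejects empty values):

```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('notifications')

def lambda_handler(event, context):
    # SNS delivers one record per invocation
    sns = event['Records'][0]['Sns']
    item = {
        'messageID': sns['MessageId'],
        'message': sns['Message'],
        'topic': sns['TopicArn'],
        'subject': sns['Subject'] or 'N/A',  # Subject can be empty
    }
    print(item)  # some logging
    table.put_item(Item=item)
    return item
```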
Great. If we try to test it now, what do you expect to happen? We don’t have the DynamoDB write permission, so this shouldn’t work at all. Let’s see if we are right. Also, I forgot: we need to reference our DynamoDB table up here, a notifications table. Let’s create it right now so it already exists. Okay, this is DynamoDB. We’ll quickly create a notifications table. The primary key will just be the messageID, and it’s going to be a string. The default settings are fine. We have a new DynamoDB table. Awesome.
As I mentioned, if you run this right now, it’s not going to work because we don’t have the permission.
We might need Boto. Let’s not forget about that one: import boto3. Save and test again.
Great. As we expected, we have a credentials problem. We don’t have the put-item permission; we’re not authorized to do this.
So, as we did previously, we create a new custom role for our Lambda function.
This time we’ll be a bit more selective, and we will create a policy that only allows this one table to be written. Let’s call the role lambda-dynamodb-notifications. This Lambda function will only be able to access this specific table.
This is the statement that you need to insert. We are permitting any DynamoDB action, just in case, but only on the notifications DynamoDB table, identified by its exact ARN (Amazon Resource Name).
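The statement might look roughly like this (a sketch; the region and account ID are placeholders for your own values):

```json
{
  "Effect": "Allow",
  "Action": "dynamodb:*",
  "Resource": "arn:aws:dynamodb:us-east-1:<account-id>:table/notifications"
}
```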
There we go. Let’s use the right ARN; it was written there. Et voilà: with this role, our Lambda function will be able to write, read, or do anything with our DynamoDB table. It propagated quickly enough this time. We should be able to test, although our test event is not exactly what our function expects.
Let’s test it. It worked. Succeeded.
This is the item that we wrote into DynamoDB, with an example topic and the “Hello from SNS” message.
It’s just sample data from the Lambda test. Now, if we go into our notifications table, we can scan it, and we already see a message. This is the message that we logged; we wrote this record when we ran the test. What we want now is to log every message that is published to our SNS topic. Let’s see if this actually works.
Our trigger is enabled.
The CloudAcademyWebinarTest topic will trigger this Lambda function every time a new message is published. Let’s verify this. First, we need the ARN of our topic. We can publish a new message. Oops, I copied the wrong one. Let’s do it again. Copy. This time it’s right. We can publish a new message. The subject will be: “Hello from my webinar”.
The message body can be anything simple.
We can just publish the message. Done.
What do you expect to see? We should have a new record in our DynamoDB table. Let’s scan again, and there it is.
Let’s make it bigger. This is “Hello from my webinar”. All the data that we logged is successfully stored in DynamoDB. Now, what if I re-execute my test? What do you expect to happen? It will be the same message with the same ID, and nothing happens: if you put a new item into DynamoDB with the same primary key, it will just overwrite the existing record. This is pretty interesting: even if I execute my Lambda function 10 times, I will not add duplicated events, because of the primary key.
Okay, so we saw how you can react to new events coming through SNS. You can do anything: you could call a third-party API, or you could write to DynamoDB as we did.
You could write a file to S3. You can integrate any kind of custom logic with this mechanism.
What about the next example? I think a very interesting scenario covered by Lambda is that you can not only bind it to any event source, but you can also schedule it to run periodically.
Kind of like what you’d do with a cron job. You want to execute the function every hour, every day, every Friday at 5 PM? Great. You can do this with Lambda as well.
We’ll create a very simple function and execute it every minute, because that’s the smallest period you can configure. Again, we’re going to write a very simple record into DynamoDB, just to see a very simple use case.
We’ll create a new DynamoDB table again. This time, let’s call it, I think I chose a name already, let’s call it log-time, and we’ll have a primary key called timestamp. Our timestamp will be a unique number, and our function will simply log the time every minute. You could do anything every minute, every hour, every day; we want to do something quick and simple now.
We’re just going to log the timestamp every minute and save it into DynamoDB. It could be anything, really. We have the new table. Now we can very quickly create a new function. We don’t need any blueprint this time. Let’s just select the CloudWatch Events schedule trigger. This is how you create a cron job.
You can just give it a name. Let’s say every-minute-log: log the time every minute. Great. This is where you select the schedule expression: one minute, five, 15, one hour, one day, or any custom cron expression. Let’s choose one minute, so we can actually see it working very quickly.
Let’s call the function time-logger, just time-logger. I’ll use Python again; I hope you like Python as much as I do. Our function will be very simple this time. We will again reference our DynamoDB table outside the handler, before the execution starts, and we will log the current timestamp. This is how you do it in Python: you just cast the datetime into an integer with some magic.
We’re going to write a new item, with the timestamp as primary key and the datetime string in ISO format, which is human-readable, just to see it changing every minute. Some logging, and just a put-item on DynamoDB.
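The whole function might look roughly like this (a sketch; I’m assuming the table is named log-time with a numeric timestamp key, as we just created it):

```python
import time
import datetime

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('log-time')

def lambda_handler(event, context):
    item = {
        'timestamp': int(time.time()),                       # UNIX timestamp, the numeric primary key
        'datetime': datetime.datetime.utcnow().isoformat(),  # human-readable ISO format
    }
    print(item)  # some logging
    table.put_item(Item=item)
    return item
```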
You already know that we will need the right permissions on DynamoDB, but we cannot reuse the previous role, because we scoped that role to the notifications table. We create a new custom role; it will be very similar, just with a different table name. Create a new one, let’s call it lambda-dynamodb-time, add the inline policy, and add the same statement as before.
We need the ARN of our table, which you can find here in the DynamoDB console. This is the new role; select it.
I think three seconds will be fine; we don’t need much runtime.
Next. Review. Every minute. Looks great. Create function.
Everything worked.
We don’t need any particular test event in this case. I would just expect to invoke the function and get a new record, because every record will be different anyway. The test event can simply be an empty JSON object, and we can just execute it. I just did, and it worked. This is the object that we pushed into DynamoDB, and now if we go into the table, I can scan it. We have one new item. Looks good. There is the timestamp, in UNIX timestamp format, and the ISO format datetime.
Looks good. What happens if I call it again? There will be a new record each time, I guess. Let’s try a couple of times: two new records, three new records, great. And don’t forget, CloudWatch will trigger the function every minute. If I now remove all of these records, a new one should appear every minute because of our CloudWatch Events schedule trigger, which you can disable at any time; otherwise you will just keep seeing records coming in every minute. We’ll come back to this later. Very simple example, very simple table, but you can design any kind of cron job with this very same mechanism.
And now the very last example. I want to show you how to actually use DynamoDB as an event source, not only as a data store for your Lambda functions. We want to execute a Lambda function every time a DynamoDB record is inserted or updated. This gives you some real power, because you can design any complex system. You can, I like to say, augment your DynamoDB table with, for example, custom fields, aggregated values, or external data integrations, with just a very simple Lambda function. Let’s see how you would do this with Lambda.
We go back. We can use the very same table we just created. Let’s make it smarter; let’s make every record richer, for example. We create a new Lambda function. Again, we could start from a blueprint: the DynamoDB one in Python. We want to use the log-time DynamoDB table as the source, and let’s say only one item at a time. Normally, the events come in batches, so you might be processing 100 or 50 records at a time if you have very high throughput. In this case, we want to process each single record as it arrives, just to avoid any delays.
The logic will be very similar to what we just did. The only difference is that we will need to check that the event is an insertion, an INSERT event, not an update. I will tell you what the point is here. The event that DynamoDB gives us will contain many records. First of all, we retrieve the records and iterate over them. I like to log everything, so please use logging in your Lambda functions.
Here, for each record, we retrieve the event name. It could be an INSERT, it could be a MODIFY, it could be a REMOVE event. In our case, just to keep it simple, we want to do something only whenever a new record is inserted into the table. What we’re going to do is add a new field to our record. We will call it new_field, and we will give it a value; in this demo, it will just be “hello”. Very simple. DynamoDB is schemaless, meaning that you can add any field you want, any time you want; it will not break anything.
How do we do it? First of all, we retrieve the timestamp, the primary key, because we will need to update the same item that was just written. Then we call the update-item operation using the same primary key, the timestamp, and we set new_field to the “hello” value.
This is a very simple, almost silly use case: we just want to add a new field every time an object is created in DynamoDB. Then we print how many records we just processed.
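A sketch of the whole handler (assuming the log-time table and its numeric timestamp key from the previous demo):

```python
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('log-time')

def lambda_handler(event, context):
    records = event['Records']
    for record in records:
        print(record)  # log everything
        if record['eventName'] != 'INSERT':
            continue  # only augment newly inserted items (skip MODIFY/REMOVE)
        # Stream records carry keys in DynamoDB JSON format
        timestamp = int(record['dynamodb']['Keys']['timestamp']['N'])
        table.update_item(
            Key={'timestamp': timestamp},
            UpdateExpression='SET new_field = :v',
            ExpressionAttributeValues={':v': 'hello'},
        )
    return 'Processed %d records' % len(records)
```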
We can reuse the same IAM role.
We have full access to our table, which is what we need to update items. Next, we give the function a name. Let’s say augmented-dynamodb. Next, the configuration looks good: batch size of one, only one record at a time, and that’s the right table. Let’s create it.
It’s taking some time. Okay, it worked. What is going to happen here? Every time a DynamoDB record is created, we will add a new field to it. A record was created every minute, as you can see here; five or six minutes have passed. These records don’t have any additional field; they have only the timestamp and the datetime that we already configured. Our previous Lambda function will keep being executed every minute, but we can also invoke it manually.
We invoke it manually, so we just created a new DynamoDB record. If everything worked correctly, I would expect the new field to be added. Does it happen? It doesn’t. Something went wrong. Another broken example! Let’s debug it.
augmented-dynamodb, the Monitoring tab. We want to see what happened in the CloudWatch logs. Here. Well, nothing happened?
Let’s go back to our code and see what happened. This is the right table. We are taking the records and updating the right one. Looks good to me. Maybe we can try to test it.
Let’s configure a test event. It’s always a good practice.
You should test your function before wiring it up live. This is the test event; I think we need a slightly different schema, with the timestamp as a number. Oh, now it’s working. This is interesting. I think it was a timing problem: the function was working all along. That’s why there was no error log; DynamoDB was simply not showing the new record yet.
Every time we create a new record in this table, the function runs. Let’s do it again.
Every time I test, a new record is created. As you see, this just happened: a new field called new_field was added to the record, with the “hello” value. In this case it’s static, but you can add any kind of custom logic.
If you want to call a third-party service, fetch some data, and add it to the record, for example for data integration, or, I don’t know, you want to geolocate an IP, or do any other kind of enrichment of your records, you can augment your DynamoDB table with Lambda, with any logic you want to implement. Okay. We did it.
Questions & Answers:
We are a little bit short on time, but we can go through the Q&A now. A lot of you have been asking questions during the last hour, and I will try to answer some of those. Great.
1. Does AWS Lambda support only Python? No. Actually, Python was one of the latest languages to be supported.
JavaScript was the first one, and then Java; everything running on the JVM is supported. You can run Ruby, and executable binaries can be run inside Lambda, so you can execute Go code or anything else.
I’m using Python because I think it’s simpler to understand even if you’re not a Python coder.
2. What if I want to import an external module in my Lambda function?
These are great questions. I really want to cover this one, because we talked about it during the last webinar, and it’s a very tricky point: you cannot always just add your code inline in the browser.
You need to upload a deployment package containing both your code, your function, and all of your dependencies.
Let’s say you’re in Python, and you have a dependency on the requests module.
You need to install it locally, compile it if needed, and upload a zip file containing both the requests folder and your Lambda function. This is how you do it right now. There are a few frameworks that simplify this work, and the same applies to Node or Java: you have to build your deployment package. Of course, you should automate the process, because, for example, if you are using a Mac or Windows, you can’t always compile this code on your machine: you have to compile it on a Lambda-compatible operating system, which is Amazon Linux (CentOS-like). This can get really tricky if you have very long compile times.
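For a pure-Python dependency like requests, the deployment package can be built with a few shell commands (a sketch; the function and file names are just examples):

```bash
pip install requests -t build/          # install the dependency into a local folder
cp my_function.py build/                # add your handler code
cd build && zip -r ../deployment.zip .  # zip the contents (not the folder itself)
aws lambda update-function-code --function-name Hello --zip-file fileb://../deployment.zip
```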
3. Do you have scripting options to change the stage name from dev to production? This refers to the API Gateway configuration. Of course, if you have a lot of stages, you want to automate this.
Today, I wanted to show you how to do it in the console. A lot of operations are already performed by the console for you, for example, granting the right permissions to API Gateway to invoke the correct version or alias of the Lambda function.
You remember all those little confirmation messages? They were just asking me: do you want me to add the right role, the right permission? Yes. Okay. Otherwise, with the AWS API or the CLI or whatever you prefer, you would need to do all that provisioning yourself, or just use a framework that automates all these best practices for you.
4. How do we do the mapping for URL parameters?
Okay, this is basically about how you design your routing. You will need to play a little bit more with API Gateway if you want to add dynamic routes with parameters. You can accept URL parameters: you add a variable to your endpoint configuration, and API Gateway will pass it to Lambda if you use a mapping template like the one I used. I was using $input.params; you can use it to take all the path arguments and path parameters. You can use the same Velocity template syntax to configure your dynamic parameters.
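For example, if you define a resource as /hello/{name}, the same kind of mapping template picks the path parameter up (a sketch):

```
{
  "name": "$input.params('name')"
}
```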
5. About the S3 compression example, will it go in recursive mode if we try to put a compressed file?
Yes, if you remember, I was limiting the compression Lambda execution to a prefix and a suffix: only the images folder and the .png extension would trigger it. With that, I constrained my scope. Of course, if you don’t do that, every put to your bucket will execute the Lambda function, and it could go recursive. In my case, I managed to avoid it with both the prefix and the suffix: even if I upload a jpg or a zip or, I don’t know, a doc file into the same subfolder, it will not match the .png suffix, and the function’s own output goes under the compressed/ prefix with a .zip suffix, so it can never re-trigger itself. I was pretty safe. No recursion there.
6. Do we have to define the memory required for the execution of our code, or does Lambda compute it on its own? Yes, you have to define it yourself. If your code needs a lot of RAM, you’ll have to tell Lambda. I didn’t show you, I always used the default value, but you can choose: there is a single knob that increases the power, the memory, and the networking quality of your Lambda function.
If you give it, I don’t know, a gigabyte of RAM, it will also run faster. Of course, it will be more expensive, but the ratio between the memory and the execution-time cost is constant: if you give it double the memory and it executes in half the time, you will pay the same. I think it’s a good deal. You need to find the best trade-off for your specific use case.
7. Can Lambda be used to start an EC2 instance? Yes, you can use any AWS API inside Lambda, as long as you provide the right IAM roles. If you have the permission to do something, you can do it in Lambda. If you want to do something inside your VPC, for example, access some resource, some Redis instance or some EC2 instance, and you need to call it, you will also need to add the VPC and security group configuration in order to do it. You can do it: you have to tell AWS Lambda which VPC you want to use, and you will be good to go.
8. We’re thinking of using Lambda for batch jobs, but our batches exceed the five minutes. How do you do it?
This is another good question. If you need more than five minutes, which right now is the maximum execution time of Lambda, this is what some users are doing: you can execute for, say, four minutes and 50 seconds, running your worker for as long as you can, and then trigger the same Lambda function again at the end. It can work recursively: if you keep track of the status of your batch computation, you can re-trigger the same Lambda function at the end.
This is possible because your handler receives two arguments, the event and the context. You can ask the context how much time is left for the current execution. If you know that you don’t have enough time to process the next item in your batch, you can just stop, invoke yourself, and be recursive. This is not ideal; I think Lambda is not a perfect fit for this use case, but if you really like Lambda and want to go for it, I think this is the only workaround at the moment.
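A sketch of this recursive pattern (process is a hypothetical function standing in for your per-item work, and the 10-second safety margin is an arbitrary assumption):

```python
import json

import boto3

lambda_client = boto3.client('lambda')

def process(item):
    pass  # hypothetical: your per-item batch work goes here

def lambda_handler(event, context):
    items = event.get('items', [])
    while items:
        # Stop before hitting the 5-minute limit and re-invoke ourselves
        if context.get_remaining_time_in_millis() < 10000:
            lambda_client.invoke(
                FunctionName=context.function_name,
                InvocationType='Event',  # asynchronous fire-and-forget
                Payload=json.dumps({'items': items}),
            )
            return 'Re-invoked with %d items left' % len(items)
        process(items.pop(0))
    return 'Done'
```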
9. Is using the AWS console encouraged or suggested to configure all these operations?
Of course, you can also use the CLI, the terminal, the APIs. For most of the operations that I’ve shown you today, I would actually recommend automating the process, because doing them by hand takes some time and some practice.
Using the API will require a deeper knowledge of what you’re doing, because you can’t just click here and there; you will need to know a lot of lower-level details that the browser console handles for you automatically. So yes, I think it’s highly recommended to use some kind of automation tool, or your own scripts, to automate these tasks.
10. Would it be possible to use a Docker image with all the libraries and stuff you need?
You would like to use Docker as a deployment method. This is not currently supported. I know that there are other serverless platforms that allow you to use Docker, but I think if you use Docker, there are better solutions to deploy your code and your applications. In Docker you can run anything, but if you just need a function in Python or JavaScript, Docker is overkill.
This is why Lambda is simplifying software engineers’ and developers’ workflows: I don’t even want to configure Docker or the whole infrastructure and operating system layer just to run a function in the cloud. So it’s not possible, and honestly, I wouldn’t do it anyway.
Okay. Thank you very much for attending. We are already out of time.
We will have three more webinars this month, and I recommend you to register for them, and also to visit CloudAcademy.com/webinars, and have a look at the previous ones.
I found the last webinar about serverless computing and the serverless framework really interesting.
You should give it a watch if you didn’t attend, and I’m looking forward to the next one.
Thank you very much. I hope you learned something useful today.