AWS Lambda
AWS Lambda is huge for me at work. Our app architecture is primarily based on Lambda and the Serverless model.
There are so many options to choose from when setting up your sevices with AWS. Until now in my Udemy course, a lot of what we have seen involves some sort of servers. A lot of time was spent talking about EC2s, which is basically a computer/server in the cloud that runs your apps.
Apps created with Lambda don’t run on EC2. This is where “serverless” comes into play. Lambda functions are just functions, small pieces of code that have some purpose. You essentially deploy your lambda functions into AWS and everything is taken care of for you. Your functions can be “connected” by SNS, Eventbridge, SQS, etc (rather than having lambdas calling lambdas).
We’ll go into more detail below. Once again I am taking notes from my Udemy course - “Ultimate AWS Certified Developer Associate 2021”.
Serverless
You only deploy your code/functions - you no longer have to manage and think about your servers.
A typical serverless architecture in AWS could include:
- users log in with Cognito
- static site comes from S3
- API Gateway accesses your Lambda functions, which access your DynamoDB database
Lambda
There is a time limit for your functions, up to 15 minutes. Lambda functions are run on demand, only invoked when you need them. They are scaled up automatically by AWS, so more functions are created when you need them. You only pay for invocations and compute time.
This is different from EC2, which need to be running at all times, and provisioned by you to scale up or down in response to changes in traffic.
API Gateway works well with Lambda - you create your REST API and the endpoints call your lambdas.
When you invoke your function, you will get a new log stream in Cloudwatch. You can use Cloudwatch to help debug your functions.
Synchronous invocation
You invoke the function, and you wait for the result. This can happen when you use API Gateway, the AWS SDK, or other services.
Example: client calls API Gateway, which sends the request to Lambda. Lambda will run and return a response to the client through API Gateway. You wait for the response from the Lambda function.
Lambdas and ALBs (Application Load Balancers)
Another way, aside from API Gateway, to expose a lambda function via an HTTP endpoint. I first learned about ALBs through their use with EC2, but they can be used with lambdas as well.
Lambda is registered to a target group. The HTTP request that is sent to the load balancer is converted to a JSON format. Once the lambda code runs, it should return a JSON object, which is converted to HTTP and sent back to the client to indicate the response.
In a severless.yml file, you could set up ALBs with Lambdas like below, using an ALB event. This code is from serverless.com:
functions:
albEventConsumer:
handler: handler.hello
events:
- alb:
listenerArn: arn:aws:elasticloadbalancing:us-east-1:12345:listener/app/my-load-balancer/50dc6c495c0c9188/
priority: 1
conditions:
path: /hello
Asynchronous invocations
Events placed into event queue, lambda processes events. If there are problems processing, there are up to 3 retries. You can end up with your events processed multiple times. If processing continues to fail you can send the messages to a dead letter queue in SNS or SQS.
S3, SNS, EventBridge, SES, CloudFormation all use asynchronous invocations.
This can also be useful if you need to increase processing speed, and order/duplicates aren’t a big deal.
EventBridge and Lambda
2 ways to trigger lambda from EventBridge rule:
- Set up CRON or Rate EventBridge rule (scheduled) - trigger lambda on a schedule, like every hour
- Set up CodePipeline EventBridge Rule - trigger lambda whenever the state of the CodePipeline changes
Note: it wasn’t mentioned in the course, but at work we have services emitting EventBridge events that trigger our lambdas.
IAM Roles / permissions
When lambda functions are invoking other services:
- Lambda functions need IAM roles attached to them so they have access to the AWS services they require. This is called an execution role. You should have one execution role per lambda function.
When lambdas are being accessed by another service:
- give other services access to your lambdas by using a resource based policy
This link to serverless.com goes over how to work with IAM roles in serverless.
Environment variables
You can add and change environment variables in the Lambda console, but I tend to work with them only in the code.
In serverless.yml files, variables look like this:
variableA: ${variableSource}
#also can have a default
variableB: ${variableSource, default}
#example, with a "dev" stage being the default
STAGE: ${opt:stage, 'dev'}
Networking
By default, lambdas are launched in the AWS VPC, not your VPC, so they can’t access resources in your VPC. They can access other AWS services and external websites.
You can launch them inside your VPC, though, and you need to set up the VPC ID, subnets, security groups, and an IAM role. You can deploy a lambda inside a private subnet with a NAT gateway if you want it to access the internet. Interestingly, putting it inside a public subnet does not give it internet access, so you must use a private subnet.
Cloudformation
What I didn’t realize until now was that lambda function code is actually stored in S3. This isn’t something that you have to tell serverless to do explicitly - it is generated by serverless behind the scenes. If you look at the cloudformation templates that are generated by serverless, you will find references to S3 buckets in the code for your lambda functions.
It looks something like this:
"Properties": {
"Code": {
"S3Bucket": {
"Ref": "ServerlessDeploymentBucket"
},
Under “Code” you could also write your code directly, but this is only good for very simple functions, and I honestly wouldn’t consider doing it. Instead you should just reference the S3 bucket where your code is stored.
Here I can see just how useful serverless is. I don’t have to think about any of these things. In my serverless.yml file all I need to do is reference the handler that my lambda should correspond to with a file path.
# here, my code is being kept in a folder called "src"
# this is all that's needed - serverless creates all other resources for me
createTeam:
handler: src/teams/teams.createTeam
In my course, in the Lambda/Cloudformation demo, the instructor is having to go into S3 in the AWS console and create a bucket. Then he needs to reference it in the cloudformation file.
If you were to do this, you should enable versioning on your bucket (so new versions of your code are updated properly when deployed).
Once your bucket is created, you upload your code, then go to the cloudformation console and upload your template. There is some config that needs to happen at this step.
Definitely prefer serverless.yml files, uploaded from my machine, over this process.
Other notes
- Lambda can have up to 1000 concurrent executions at once. Lambdas scale up or down automatically. Important: this applies to all lambda functions across your account, not just one application, so you may want to set limits on your functions so one application doesn’t take up all your lambda resources.
- when connecting to databases / creating other clients like the AWS SDK, it is best practice to initialize them outside of your handler functions at the top of your file. This makes them available for other functions, not just the current one.
- Lambda layers: split up your function code and put reusable things like large libraries into separate layers - don’t have to be reuploaded every time you change your code, and other functions could use them as well.
- can create different versions of your lambdas as well - your latest version is mutable, what you’re currently working on. Each version is immutable and can’t be changed once deployed
- an alias is mutable and points to the latest version of your lambda