Description
A couple of weeks back, I came across an old decision to use Docker to bundle up some Python code and then execute it on Lambda functions.
In practice this means building a Docker image, tagging it, and pushing it to a Docker repository so that the Lambda can pull and run it. Should we want to develop another function that uses the same dependency set, we would need to build the application (including dependencies) all over again and deploy a new Lambda with that container image.
Docker image deploys are a means of quickly getting already-developed code into a Lambda so it can run in a “serverless” manner. But the purpose of Lambdas is closer to a Service Function Chaining idea: code should be small and quick to execute, hence the 15-minute limit on a Lambda’s execution time.
I’ll use running cost as the lens to validate some of my reasoning, because in AWS we leverage economies of scale to pay the least possible for computing power. This means the best approach is, more often than not, the one that costs less.
Let’s talk cost
One aspect we might want to consider is cost. To deploy a Lambda via Docker image we have to push said image to ECR (see here), which charges roughly 10 cents per GB per month for storage (ignoring the initial free tier). This may seem like a small amount, but remember that old Docker images are not deleted automatically when a new one is pushed. Even in our sandbox account, a single repo has 63 images averaging 200 MB each (as of today). That is ~12.6 GB for a single Lambda function. Imagine everyone adopting Docker images to deploy Lambda functions (for now we only have a very limited number of 7 functions).
Given the following time cost series for this example:
We can easily calculate that the total we would spend over 1 year, assuming no more deploys and no new Lambdas (which is very unrealistic), comes to around 109,20€. The cost scales linearly with the size of the images, the number of releases, and the number of Lambdas in the account.
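To make the arithmetic explicit, here is a minimal TypeScript sketch using the figures quoted above (63 images of ~200 MB at ~10 cents per GB-month, across 7 functions); the numbers are illustrative and should be checked against current ECR pricing for your region:

```ts
// Rough ECR storage cost estimate, using the figures quoted above.
const imagesPerRepo = 63;     // images accumulated in the sandbox repo
const avgImageSizeGb = 0.2;   // ~200 MB per image
const pricePerGbMonth = 0.10; // ECR storage price, ignoring the free tier
const repos = 7;              // number of lambdas currently deployed this way

const monthlyCost = imagesPerRepo * avgImageSizeGb * pricePerGbMonth * repos;
const yearlyCost = monthlyCost * 12;

console.log(`~${monthlyCost.toFixed(2)} per month, ~${yearlyCost.toFixed(2)} per year`);
// ~8.82 per month, ~105.84 per year if every repo looks like the sandbox one
```

That lands in the same ballpark as the yearly figure above, and the linear dependence on image size, release count, and number of Lambdas is visible directly in the formula.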
Another thing to keep in mind is Lambda execution time, which is priced per GB-second. Can we assume that Lambdas loaded from a Docker image will take more time to execute? Not necessarily, but they will tend to use more memory, because they carry the container’s base OS layers along with all the dependencies needed to run the program.
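As a rough illustration (assuming the commonly quoted on-demand price of about $0.0000167 per GB-second, which varies by region and architecture): a function configured with 1 GB that runs for 10 seconds costs roughly $0.000167 per invocation. If the container-based version of the same workload needs 2 GB of memory to be comfortable, the compute cost per invocation roughly doubles.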
Is there any other way of using Docker?
Is there a reason for us to use Docker images and not deploy to other services like EKS or even ECS? My suggestion is that if we want to keep using Docker images, we should move away from Lambda and into ECS. That way the orchestration service itself is free of charge; we would still pay for the underlying resources / EC2 instances, which could be Spot Instances (less costly) for low-priority workloads.
For EKS we would pay 10 cents per hour of cluster uptime, which could end up cheaper than the Lambdas, but I won’t do the maths for this one.
Both of these options can be set up behind target groups with elasticity ranging from 0 instances up to 1000 (ECS) / 750 (EKS) running at any given time. This should be plenty, given that Lambdas are also limited to 1000 concurrent executions across the entire account by default, as shown in the sketch below.
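To make the ECS-with-Spot idea concrete, here is a minimal CDK (TypeScript) sketch; construct names, the instance type, capacity limits, and the Spot bid are placeholders, and the task definition / service wiring is omitted:

```ts
import { Stack, StackProps } from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as autoscaling from 'aws-cdk-lib/aws-autoscaling';
import { Construct } from 'constructs';

export class WorkerClusterStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 });
    const cluster = new ecs.Cluster(this, 'Cluster', { vpc });

    // Spot-backed capacity for low-priority workloads; instances can scale to zero
    // when nothing is running, and may be reclaimed by AWS at short notice.
    const spotAsg = new autoscaling.AutoScalingGroup(this, 'SpotAsg', {
      vpc,
      instanceType: new ec2.InstanceType('t3.medium'),
      machineImage: ecs.EcsOptimizedImage.amazonLinux2(),
      minCapacity: 0,
      maxCapacity: 10,
      spotPrice: '0.02', // placeholder bid below the on-demand price
    });

    cluster.addAsgCapacityProvider(
      new ecs.AsgCapacityProvider(this, 'SpotCapacity', {
        autoScalingGroup: spotAsg,
      }),
    );
  }
}
```

The same container images that would have gone to Lambda can then run as ECS tasks on this cluster, with the cluster itself costing nothing beyond the EC2 capacity it spins up.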
Then Why Lambda Layers?
Besides being the optimal solution for service function chaining (we could argue that Step Functions could also do the trick, but that’s another discussion), Lambda layers have NO COST for usage. This means we get up to 75 GB (see the Lambda layers dashboard for current usage) of FREE dependency storage to make use of.
A Lambda can have up to 5 layers; we can bundle dependencies however we see fit and reuse them across different Lambdas. This means that if I create a new Lambda with the same dependencies as other Lambdas (that already use layers), I can simply reference those layers in the CDK code with a single line like:
LayerVersion.fromLayerVersionArn(this, 'SharedDepsLayer', 'magic:layer:immutable:arn')
and that’s it. No more creating Dockerfiles, managing Docker ARGs, or fiddling with pip packaging.
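To show the reuse pattern end to end, here is a minimal CDK (TypeScript) sketch; all stack, construct, and path names are made up. The layer is published once and then attached to, or imported by ARN from, any function that needs the same dependencies:

```ts
import * as path from 'path';
import { Stack, StackProps } from 'aws-cdk-lib';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Construct } from 'constructs';

export class SharedLayerStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Publish the shared dependencies once as a layer (the asset directory is
    // assumed to follow the python/... layout required by the Python runtime).
    const depsLayer = new lambda.LayerVersion(this, 'SharedDepsLayer', {
      code: lambda.Code.fromAsset(path.join(__dirname, '../layers/deps')),
      compatibleRuntimes: [lambda.Runtime.PYTHON_3_11],
    });

    // Attach the layer to a function that needs those packages.
    new lambda.Function(this, 'FirstFunction', {
      runtime: lambda.Runtime.PYTHON_3_11,
      handler: 'app.handler',
      code: lambda.Code.fromAsset(path.join(__dirname, '../src/first_function')),
      layers: [depsLayer],
    });

    // A later Lambda (even in another stack or app) can reuse the published
    // layer simply by referencing its ARN.
    const importedLayer = lambda.LayerVersion.fromLayerVersionArn(
      this,
      'ImportedDepsLayer',
      'arn:aws:lambda:<region>:<account>:layer:shared-deps:1',
    );

    new lambda.Function(this, 'SecondFunction', {
      runtime: lambda.Runtime.PYTHON_3_11,
      handler: 'app.handler',
      code: lambda.Code.fromAsset(path.join(__dirname, '../src/second_function')),
      layers: [importedLayer],
    });
  }
}
```

Each function’s own deployment package then only contains its handler code, while the heavyweight dependencies live in the layer and are shared.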
Should someone be interested in this subject, I can recommend both the certified-cloud-practitioner course and the certified-developer-associate course as introductions to AWS.
These courses explain in more detail why some simple design choices can save you money in the long run, while keeping your cloud architecture as efficiently designed as it can be.