AWS Lambda vs EC2: Comparison of AWS Compute Resources
Cloud Foundry Foundation, a non-profit organization that oversees open-source cloud computing projects, recently conducted a global survey of 550 users. The results showed that 31% of respondents (169 of them) are already using a serverless architecture.
When asked which platform they use, 77% said AWS Lambda. AWS Lambda is becoming popular for serverless application development because it lets an organization build software that scales more easily than server-based applications running on, for example, EC2.
EC2 requires management and provisioning of the environment. Each EC2 instance runs not just a full copy of an operating system, but a virtual copy of all the hardware that the operating system needs to run. In contrast, what AWS Lambda requires is enough system resources and dependencies to run a specific program.
Also, AWS Lambda enables you to create portable code blocks for easy development, testing and deployment. That’s a winning trifecta!
If that’s all there was to AWS Lambda vs EC2, I’d be writing an obituary for EC2. But there’s a lot more to it than just serverless versus server-based.
What is AWS Lambda?
AWS Lambda is an on-demand cloud computing resource offered as function-as-a-service by AWS. Over time, AWS Lambda has changed how we create, architect, and run our applications.
The main differences between AWS Lambda and other computing resources lie in who is responsible for provisioning, the use cases, and pricing. Before the emergence of agile solutions, operations teams allocated resources based on forecasting. They had to make sure that computation and memory demands didn’t exceed the limits their system could handle.
With AWS Lambda, the computing resources scale up and back down automatically based on real-time demands. At present, AWS Lambda supports multiple languages and is used within applications in different ways or as a back-end as a service.
As a popular example of serverless architecture, AWS Lambda shows how the overhead of an operations team may soon become a distant memory.
What is EC2?
Amazon Elastic Compute Cloud (EC2) is a virtual cloud infrastructure service offered by AWS that provides on-demand computing resources through which you can create powerful servers in the cloud.
The hardware behind EC2 is divided into multiple resources that are offered as scalable instances providing memory and processing power.
It also provides flexible options to host your application on more than one platform with tight security for multi-model, multi-tenant architecture. These instances are accessed by HTTP or HTTPS (API), enabling developers to create applications just like an on-premise infrastructure.
With Amazon EC2, you have the facility of provisioning virtual machines as per your applications’ requirements. Such facility is provided on a utility-based subscription model where the user is billed as per their consumption of resources.
An Evolution: From EC2 to AWS Lambda
By introducing EC2 as IaaS, AWS removed the overhead of managing infrastructure. Consequently, the time to allocate a server dropped to a fraction of what it had been, and features such as automatic scaling, scheduled provisioning, monitoring, and alerting powered by CloudWatch became available.
When EC2 was launched, it operated in a far more volatile environment than it does now. Some of the initial issues were sudden outages, multi-tenant models per machine, failures in scheduled provisioning, and the disappearance of virtual machines. As a result, Reddit, Foursquare, Rapportive, and Heroku were among the many sites affected by these glitches.
Then came Elastic Beanstalk (EB) which provided all these facilities in a pretty package. EB worked with multiple languages and frameworks that seamlessly let developers upload their code to virtual machines through the AWS console. Additionally, it directly spun EC2 instances while automatically balancing the load and giving users a direct endpoint. Despite this, DevOps could still log in to the AWS console and manually configure or tweak the allocation of instances.
With the launch of IaaS, organizations were freed from the baggage of infrastructure management. However, it failed to fully deliver promised advantages such as hands-off provisioning and capacity planning, which led to the creation of FaaS.
Like EB, AWS Lambda lets users work with an array of languages and frameworks and upload code packages directly to functions. Lambda functions run in execution environments managed by AWS that cannot be configured manually. Lambdas are typically exposed through API Gateway, which acts as a URL router to your Lambdas.
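To make the API Gateway routing concrete, here is a minimal sketch of a Python function sitting behind an API Gateway proxy integration. The handler name, the "name" query parameter, and the greeting are illustrative assumptions and not part of any particular application.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls for each request routed by API Gateway."""
    # With the proxy integration, query parameters arrive under this key.
    # The value is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    # API Gateway expects a dict with a status code and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is a plain function, you can call it locally with a hand-built event before deploying it.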
AWS Lambda vs EC2: Infrastructure Management
Setup & Management Environment
AWS Lambda: Whether you need to set up a single environment or several, you do not need to do much work. You are not required to spin up or provision containers or make them available for your applications, and scaling is fully automated.
AWS Lambda might not appeal to someone already working with an on-demand development environment with containers and orchestration in place.
Amazon EC2: With EC2, setting up includes logging in via SSH and manually installing Apache, and doing a git clone. Along with that, you need to install and configure all the required software in a manner that is automated and reproducible.
For EC2, storage comes in two options: standard volumes, which serve data at roughly the speed of a desktop hard drive, and provisioned volumes, which serve data much faster. Comparatively, this is a lot of work.
AWS Lambda: Serverless architecture abstracts away the manual work of patching and OS updates. With Lambda, you have more flexible workflows, but at the same time the attack surface increases. What you need to consider is how to secure the communication happening inside and outside your application.
Another thing is that functions are very fine-grained. As the number of functions grows, monitoring becomes hard, which in turn raises the risk of neglected, decaying functions.
However, breaches are less likely because functions are stateless. Malicious agents need time to establish themselves, which statelessness makes difficult. Moreover, functions scale automatically, which gives good protection against DDoS attacks. That protection comes from auto-scaling, but it also increases your bill.
Amazon EC2: With EC2, you’ll have to take care of the security layer at the instance level. The security layer decides and controls the traffic allowed to communicate with each instance. Each instance can have multiple security layers that dictate the allowed inbound traffic through certain protocols like TCP, UDP, ICMP, etc.
Creating policies with the correct permissions is tiresome and a matter of trial and error. This is especially true when you’re working with a growing team. Handling permissions for every specific business need means changing policies often, and you end up with unwanted granularity.
Meanwhile, with Lambda, this is entirely taken care of by AWS along with OS patches and system maintenance.
If we consider DDoS attacks, you’ll either have to opt for another service such as AWS Shield, or handle them manually by using ELB to scale under the attack or by limiting the rate of requests per minute from a particular ID. EC2 does have security groups and firewalls, but unfortunately those are not enough for resource-based traffic monitoring.
AWS Shield automatically detects the type of AWS resource behind the Elastic IP address and applies the relevant DDoS protections.
Whether it is EC2 or Lambda, we are here to help you build your application.
AWS Lambda vs EC2: Performance Comparison
AWS Lambda: As per the official documentation, AWS Lambda has a maximum timeout of 900 seconds (15 minutes). This limits the type of tasks Lambda can deal with, so long-running functions and complex tasks aren’t a good fit.
Furthermore, API Gateway imposes its own time limit of 30 seconds on a function invocation, which poses another potential challenge. Even if you manage to architect a solution that handles timeouts sufficiently, you’ll have to constantly monitor and debug the issues when they happen.
Not all timeouts occur because of the 900-second limit. Some are caused by bugs, or by communication with external servers that takes too long or returns the wrong data to the function.
Amazon EC2: In comparison with Lambda, EC2 has flexible options. You can definitely work with long-running tasks since instances are available for different types of requirements with different configurations. This makes EC2 a better option.
However, the EC2 service is not error-free. You may encounter connection timeouts as a result of overlapping security group rules or a user key that the server does not recognize. Another common error occurs when you connect over SSH through an insecure network.
This applies especially to newly created instances. For example, if you SSH into a new instance without waiting for its status checks to complete, you will get an SSH timeout error. Even so, Lambda clearly lags behind EC2 when it comes to complex processing.
AWS Lambda: External libraries are inevitable for any project and it is true for AWS Lambda too. When you’re dealing with heavy processing functions like image processing, video conversion, you’re bound to have dependencies.
However, Lambda comes with a deployment package size limit of 50 MB. Uploading your function dependencies with the package seems to be the logical solution, although you also have the option of downloading the dependencies into the “/tmp” file storage when your function is executed.
Moreover, “/tmp” file storage has a limit of 512 MB, and the more dependencies you download, the more time is required to execute the function successfully.
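The “/tmp” approach can be sketched as a small cache helper: download a dependency on the first (cold) invocation and reuse it while the execution environment stays warm. This is a minimal sketch under stated assumptions; `ensure_dependency` and the `fetch` callable are hypothetical names, and the actual download (for example from S3) is left to the caller.

```python
import os
import tempfile

TMP_DIR = tempfile.gettempdir()      # resolves to /tmp inside Lambda
TMP_LIMIT_BYTES = 512 * 1024 * 1024  # the 512 MB limit quoted above


def ensure_dependency(name, fetch, size_bytes):
    """Return a local path for `name`, fetching it only on the first call.

    `fetch(path)` is a hypothetical callable that writes the file to `path`,
    for example by downloading it from S3.
    """
    path = os.path.join(TMP_DIR, name)
    if os.path.exists(path):
        return path  # warm invocation: reuse the earlier download
    if size_bytes > TMP_LIMIT_BYTES:
        raise ValueError(f"{name} does not fit in temporary storage")
    fetch(path)  # cold invocation: pay the download cost once
    return path
```

The size check is deliberately simple: it compares one file against the whole limit and does not track space already used by other files.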
Managing dependencies in EC2 isn’t a big problem since it doesn’t have the same constraints on temporary storage. What you should consider, though, is the size of the software packages and the CPU of the corresponding instance, because an instance that is not sized for the workload may be overburdened.
For example, the base of Amazon Linux Containers is already preloaded with many software packages that are required for executing the basic functionalities.
AWS Lambda: Scalability is one of the major benefits of Lambda as most of it is automatic and handled by AWS. It scales dynamically in response to the increased traffic and increases the number of concurrently executing functions by a predetermined amount. If this isn’t sufficient to accommodate the traffic, it will continue increasing the number of concurrent function executions by 500 per minute.
But scaling isn’t as effortless as it’s portrayed. In fact, we’ve faced errors during the creation of new Lambda functions. Although it is great to have an automatic scaling system, it isn’t much fun when you can’t address these errors and work upon them.
For instance, the API Gateway timeout to invoke a Lambda function is 30 seconds. And when you violate this limit, you’ll see a 5XX error being returned from API Gateway. The only option you have is to retry the request until you succeed.
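The retry approach mentioned above is usually paired with exponential backoff so that repeated requests do not pile onto an already struggling endpoint. The following is a minimal sketch; `send_request` is a hypothetical callable that returns a `(status_code, body)` tuple, and the attempt count and delays are arbitrary defaults.

```python
import random
import time


def call_with_retries(send_request, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call `send_request()` until it returns a non-5XX status or attempts run out.

    `send_request` is a hypothetical callable returning (status_code, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send_request()
        if status < 500:
            return status, body
        if attempt < max_attempts:
            # Exponential backoff with jitter so retries do not arrive in lockstep.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    # Every attempt failed: hand the last 5XX response back to the caller.
    return status, body
```

Note that retrying only helps when the underlying function can finish within the limit on a later attempt; a function that always needs more than 30 seconds has to be restructured instead.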
Another example is: Lambda depends on Amazon EC2 to provide Elastic Network Interface for your VPC-enabled Lambda function. Consequently, your function’s scalability depends on EC2’s rate limits as they scale. You need to check whether your VPC-enabled functions are allowed 500 concurrent invocations per minute.
Amazon EC2: With EC2, everything is in your control. This means that, unlike with Lambda, you have to configure scalability features manually, which might not sound appealing. However, the positive side is that you can make scaling a dependable process with EC2 Auto Scaling groups.
These groups ensure that you have a sufficient amount of instances available to handle the load. Simply create the Auto Scaling groups, add the instances, and specify the number of minimum and maximum instances in each group. What’s more, AWS ensures your group never goes above the mentioned number of instances by automatically launching and terminating them.
For example, an Auto Scaling group might have a minimum of 1 instance (active all the time), a desired capacity of 2 instances (to meet the usual demand), and a maximum of 4 instances (to absorb unusual spikes in traffic). What’s more, there are no additional costs attached to Auto Scaling groups.
Conclusion: Fully automated scaling is a dream come true, but not at the cost of being unable to mitigate errors. On the other hand, manually configuring a scaling environment takes work initially, but once you have gained control through setup and tuning, it becomes automatic.
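The minimum, desired, and maximum values in the example above behave like a clamp on capacity. The toy model below illustrates only that idea; it is not the AWS API, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    """Toy model of an Auto Scaling group's size limits (not the AWS API)."""
    min_size: int = 1
    desired_capacity: int = 2
    max_size: int = 4

    def set_desired(self, requested: int) -> int:
        # Clamp the request so capacity never leaves the [min, max] range.
        self.desired_capacity = max(self.min_size, min(requested, self.max_size))
        return self.desired_capacity
```

With the defaults above, a spike that asks for 10 instances is capped at 4, and a quiet period that asks for 0 still leaves 1 instance running.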
Availability (On-demand vs Always available)
AWS Lambda: As discussed, Lambdas are available all the time. These are brought up and spun down automatically depending upon the requirements of the event triggers. Since you’re not paying for idle time, you can save a lot of money.
However, the debate over the function’s availability is still a burning topic. Here’s an example: “I’m running US-East-1, and I am getting 500 errors no matter what I do. It was working a few hours ago, and I haven’t changed any code. In fact, even a new ‘hello world’ from the templates returns this response, but the service shows “healthy” on dashboards.”
Such outcries are common as we let go of the ability to manage infrastructure for other added benefits. In such cases, if your Lambda isn’t available, a potential solution would be to deploy your functions into multiple regions. Since you only pay for the executed functions, your subscription covers fallback deploys if there is a region outage.
Amazon EC2: With EC2 instances, you don’t get the benefit of on-demand availability, although Auto Scaling groups help with scalability. In the end, you will have to keep at least one instance up and running.
However, EC2 has also faced many outages. The network shutdown in the US-East-1 region, for instance, resulted in temporary unavailability at startups like Foursquare, Rapportive, Heroku, and Reddit. A similar outage before this one lasted for 48 hours. Such examples raise questions about the reliability of cloud platforms and consumer technology.
Furthermore, replicating EC2 instances for multi-region availability is quite difficult. The most practical solution is to deploy a blue/green strategy, but that will double the number of running EC2 instances and increase the cost.
AWS Lambda vs EC2 Latency
AWS Lambda: Cold start is considered one of the biggest issues with event-driven serverless functions. It occurs when you trigger a function which has been inactive for a long time. The delay occurs at the cloud provider’s end, as it takes time to provision your selected runtime container and then execute your function.
This may take more than 5 seconds, which makes it impossible to guarantee that functions triggered from API Gateway, DynamoDB, CloudWatch, etc. respond in under 1 second. A potential solution is to ping your function from time to time to keep it warm.
However, this isn’t a recommended solution, as it comes at the cost of an increased bill. For example, if you have a scheduled function which consumes a lot of memory, CPU, and time, Lambda’s cost might be higher than EC2’s.
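The keep-warm technique described above can be sketched as a handler that recognizes a scheduled ping and returns immediately, so the ping costs as little execution time as possible. This is a minimal sketch assuming the ping comes from a scheduled CloudWatch Events (EventBridge) rule; `handle_request` is a hypothetical placeholder for the real work.

```python
def lambda_handler(event, context):
    """Return early for scheduled warm-up pings, do real work otherwise."""
    # Scheduled CloudWatch Events / EventBridge rules set "source" to "aws.events".
    if event.get("source") == "aws.events":
        return {"warmed": True}

    return {"statusCode": 200, "body": handle_request(event)}


def handle_request(event):
    # Hypothetical placeholder for the function's real logic.
    return "processed"
```

Keeping the warm-up branch at the very top of the handler avoids running expensive initialization logic on every ping.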
Amazon EC2: Cold starts typically do not occur with EC2 instances unless you launch a new instance. New instances take longer to be ready than existing instances because they typically run code on the first startup. For example, if you stop an instance and start it again, the instance will be available quickly. However, the startup time invariably depends on factors like local filesystem mounts, remote filesystem mounts, regular OS initialization scripts, user data and cloud-init scripts, instance type, and EBS volume type. Moreover, the only reliable way to determine, and then reduce, the time taken to launch a new instance is to run benchmarks for your particular use case.
Comparatively, this is a tiring task, and Lambda seems to be the right choice. Also, if you just have a couple of requests to process, running an entire EC2 instance would not be a wise choice.
AWS Lambda vs EC2: Cost Comparison
Both AWS Lambda and EC2 offer pay-per-use pricing. With EC2 instances, the cost is based on how long the instance is running.
The cost per hour may depend on a CPU’s efficiency, memory footprint, and even storage capacity. So the usage of AWS EC2 instances is more feasible for applications or software that needs to handle many user requests.
Conversely, AWS Lambda may be an ideal choice for businesses with less traffic, as they pay only for the number of requests and the execution time. The price of execution depends on the memory provisioned per second, which is around $0.00001667 per GB-second. In the next section, there are examples showing how AWS calculates the cost based on memory provisioning.
Execution time is counted from when the function starts until it returns or times out. It is rounded up to the nearest multiple of 100 ms for the calculation, so that on-demand businesses can benefit from it.
Let’s discuss some examples of AWS Lambda vs. EC2 pricing:
Say an application receives over 8,000,000 hits per month, with each taking about 200 ms with 512 MB of memory. The Lambda cost will be around $8.07. By comparison, when using a t3.nano EC2 instance, the price for the instance will be $5.82.
Here, you can see that the Amazon EC2 instance is more cost-effective than AWS Lambda when dealing with high traffic.
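The $8.07 figure in the first example can be reproduced if the 8,000,000 hits are treated as one month of traffic and the monthly free tier is applied. The sketch below is an estimate under stated assumptions, not an official calculator: the per-GB-second price is the one quoted in this article, while the $0.20 per million requests, the 1,000,000 free requests, and the 400,000 free GB-seconds are assumed pricing inputs that you should check against the current AWS price list.

```python
import math

PRICE_PER_GB_SECOND = 0.00001667      # figure quoted in this article
PRICE_PER_REQUEST = 0.20 / 1_000_000  # assumed: $0.20 per million requests
FREE_GB_SECONDS = 400_000             # assumed: monthly free tier
FREE_REQUESTS = 1_000_000             # assumed: monthly free requests
BILLING_STEP_MS = 100                 # rounding step described in this article


def lambda_monthly_cost(requests, duration_ms, memory_mb):
    """Estimate one month's Lambda bill in dollars for a single function."""
    # Round each invocation up to the next billing step.
    billed_ms = math.ceil(duration_ms / BILLING_STEP_MS) * BILLING_STEP_MS
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    compute = max(0.0, gb_seconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND
    request_fee = max(0, requests - FREE_REQUESTS) * PRICE_PER_REQUEST
    return compute + request_fee


print(round(lambda_monthly_cost(8_000_000, 200, 512), 2))  # 8.07
```

Here 8,000,000 requests at 0.2 s and 0.5 GB come to 800,000 GB-seconds; after the free tier, 400,000 GB-seconds cost about $6.67 and the 7,000,000 billable requests cost $1.40.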
Let’s take another example where the traffic of an application is moderate or low. Here, there are 60,000 requests per day, each taking 200 ms with 512 MB of memory. AWS Lambda offers a free tier of 400,000 GB-seconds, so low or medium traffic incurs minimal cost, as shown in this image.
The same configuration on a t2.nano EC2 instance costs around $4.36. So this is the difference between AWS Lambda and AWS EC2 pricing based on the amount of computing power, storage, and time.
Amazon EC2 vs Lambda: Versions/Snapshots
EBS Snapshots for EC2 Instances
AWS offers a sophisticated system of snapshots for EC2 instances backed by Elastic Block Store (EBS). Here, a snapshot is an incremental copy of the data in the EC2 instance at a certain point in time. It stores only the data that has changed since the previous snapshot was created. What’s more, snapshots retain all the information needed since their creation and can be restored easily.
All the snapshots are chained to each other and let you restore EBS volumes on demand. The best thing about snapshots is that even if you delete an old snapshot, AWS passes the data to the next snapshot, so there is no risk of data loss. EBS volumes and snapshots remain independent of each other: even if a snapshot is in a pending state, you can still write data to the EBS volume.
Version Control in AWS Lambda
AWS Lambda offers a versioning system for functions. So you can create a beta version of the Lambda function and test it without affecting the stable version. AWS Lambda creates a new version of your function each time you publish a function.
It is like a whiteboard that you can use to change the code in case of discrepancies without hindering the stable app. AWS Lambda keeps an unpublished copy of your function that you can edit. The copy contains the code, dependencies, the Lambda runtime used to invoke the function, environment variables, function settings, and an Amazon Resource Name (ARN).
The code and settings of a published version are locked to provide an uninterrupted experience to customers. So you can change the code and function settings only on the unpublished version of the Lambda function.
Use Cases: When to use AWS Lambda?
With a conventional setup, managing a physical server or even a virtual server is quite difficult, but AWS Lambda takes care of activities like updating the operating system and provisioning instances. There are different services that you can use with AWS Lambda, like Amazon API Gateway, RDS for relational databases, DynamoDB for non-relational databases, Amazon S3, and Amazon Cognito User Pools.
AWS Lambda handles application logic and reads and writes data through database services like RDS or DynamoDB. You can even host Lambda functions within a virtual private cloud to isolate them. As for pricing, AWS Lambda, API Gateway, and S3 are billed based on traffic, whereas the only fixed cost is the database service.
AWS Lambda handles the entire website without the need for server management, especially for an eCommerce website where different aspects like cart management, payment integration, and Artificial Intelligence engine for recommendations are easy to execute.
Real-time data Transformations
You will often need to convert data into different formats instantly. For example, a smart home device converts digital codes into infrared signals and then into electrical signals instantaneously. Restructuring raw data becomes essential before writing it to the desired destination. Some common transformations are:
- Normalizing data from multiple resources
- Metadata addition to the information
- Converting data as per destination requirements
- Testing of extract, transform, and load between databases
- Merging data from different sources
With AWS Kinesis Firehose, you can write real-time streaming data and scale data transformations with every clickstream data. Firehose works as a mediator and invokes asynchronous Lambda functions once the data buffering is complete.
The Lambda function converts the data from the source into the format required for the destination database as per custom logic and sends it back to Firehose. Next, Firehose sends the data to the destination database.
Along with the transformation of data, a concurrent backup is created in S3. So you have a backup of your original data along with the transformations, all happening at the same time.
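The Firehose flow described above can be sketched as a transformation handler. Firehose passes a batch of base64-encoded records, and the function returns each one with the same `recordId`, a `result` status, and the re-encoded data. The example assumes the incoming records are JSON objects, and the `processed` flag is a hypothetical stand-in for your custom logic.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each buffered Firehose record and hand it back."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))

        # Hypothetical custom logic: tag the record before delivery.
        payload["processed"] = True

        output.append({
            "recordId": record["recordId"],  # must match the incoming ID
            "result": "Ok",                  # or "Dropped" / "ProcessingFailed"
            "data": base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```

The trailing newline keeps records separated when Firehose concatenates them into a single object at the destination.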
In a hyperconnected world, real-time notifications are crucial. From third-party collaborative technologies like Slack to the constant need for ChatOps, real-time notifications are more relevant than ever.
With AWS Lambda and SNS, you can create a topic that controls the access for publishers and subscribers. SNS couples with AWS Lambda function to manipulate the data in messages and send it to different SNS topics or system endpoints.
Take the example of receiving system alerts as Slack notifications. Here, CloudWatch alarms trigger a message to the SNS topic. CloudWatch is a service in the AWS cloud that collects performance and metrics data in a single platform. So, each time there is a glitch in your system infrastructure, it will trigger a message through an SNS topic.
Next, the SNS topic will invoke a Lambda function that calls the Slack API to send a message across the channel. It is a great use case for larger enterprises where system performance determines business operations and helps avoid catastrophic downtimes like the one that happened with Google recently.
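The last step of that flow, the Lambda function that forwards the SNS message to Slack, can be sketched as follows. This is a minimal sketch assuming a Slack incoming webhook; `SLACK_WEBHOOK_URL` is a hypothetical environment variable, and the message formatting is only an illustration.

```python
import json
import os
import urllib.request


def build_slack_payload(event):
    """Turn the first SNS record of a Lambda event into a Slack message body."""
    sns = event["Records"][0]["Sns"]
    text = f"*{sns.get('Subject') or 'Alert'}*\n{sns['Message']}"
    return json.dumps({"text": text}).encode("utf-8")


def lambda_handler(event, context):
    # SLACK_WEBHOOK_URL is a hypothetical environment variable holding a
    # Slack incoming-webhook URL.
    request = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],
        data=build_slack_payload(event),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return {"statusCode": response.status}
```

Splitting the payload construction from the network call keeps the formatting logic easy to check without contacting Slack.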
Predictive page rendering is all about data transformations based on user selection with an algorithm or predictive model. It guides the web page to the next likely step with a phase-wise transformation that follows this pattern:
User selection, then data visualization, then updating the data
If you are to use predictive page rendering for your webpages, AWS Lambda is your best bet. It can be used to retrieve important files and data from the next page requested by the user. It executes the initial phase of page rendering.
Sometimes an external source that offers data or media to your webpage may have issues in rendering. In a case like this, Lambda looks for an alternative source to serve the user so that the user experience is not hindered.
Real-time Data Connectivity
In the hyperconnected world of smart devices, real-time data is processed in bulk. For applications and websites, handling such a large amount of data in real time is overwhelming. Data from peripheral devices, IoT devices, and even user interfaces arrives as bytes that may not seem significant individually, but together they pose a problem, especially when you want to store the data temporarily and process it later.
With AWS Lambda, real-time data processing becomes easy. All you need is to stream the data in a Lambda application specifically designed for faster processing in real-time. With it, you can process real-time data without affecting the system’s performance.
Let’s take an example of IoT devices like smart lights, smart speakers, and fitness bands. Say you want to register each device on your system; for that, you will have to use a rule trigger. An IoT rule trigger invokes a Lambda function that applies custom registration logic and writes the device to DynamoDB.
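The registration step could look roughly like the sketch below. It assumes the IoT rule forwards a JSON payload containing a device ID; the field names (`deviceId`, `deviceType`, `registeredAt`) are hypothetical, and `table` stands for any object with a DynamoDB-style `put_item(Item=...)` method, such as a boto3 Table resource.

```python
import time


def register_device(event, table):
    """Store a device record built from an IoT rule payload.

    `table` is anything with a DynamoDB-style put_item(Item=...) method;
    the field names below are hypothetical.
    """
    item = {
        "deviceId": event["deviceId"],
        "deviceType": event.get("deviceType", "unknown"),
        "registeredAt": int(time.time()),
    }
    table.put_item(Item=item)
    return item
```

Passing the table in as an argument keeps the registration logic independent of how the DynamoDB connection is created.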
Use Cases: When to use EC2?
Instances for IDE
Application developers leverage modern IDEs (Integrated Development Environment) deployed on elastic computers with an on-demand instance. EC2 lets you deploy several IDEs for multiple projects and developers.
The best thing about this feature is that developers are assigned individual IDEs. You don’t need to pay for an instance after the task is complete, as these instances are created individually for each IDE and terminated once they are no longer in use.
Data Recovery with EC2
AWS offers three types of data recovery options: pilot light, warm standby, and multi-site. They are categorized based on RTO/RPO, where RTO is the Recovery Time Objective and RPO is the Recovery Point Objective.
RTO is the time you need to restore your data, while RPO indicates the point in time from which you want the data to be restored. For example, if you want to restore your data from an hour ago, the RPO will be one hour. Similarly, if the system is down for, say, two hours while your data is recovered, the RTO is two hours.
Here are the three data recovery options for EC2 with their RTO/RPOs:
- Pilot Light (RTO/RPO: 24 hours)
- Warm Standby (RTO/RPO: tens of minutes)
- Multi-site (RTO/RPO: minutes)
The best thing about using EC2 for your instances is that it creates an instant backup in the cloud. It is retrievable and does not carry the risks associated with a physical drive. For example, suppose you are sharing important code documentation over a shared drive with a colleague and, for some reason, the data gets lost. With EC2 configured to create a backup, that data is directly retrievable.
High-Performance Computing with EC2
If you are a firm with large volumes of data, EC2 is the perfect choice for your organization. Take the example of Airbnb, an online marketplace for property owners and rentals, which needs to process about 50 GB of data per day.
To analyze and process such a huge amount of data, it uses about 200 EC2 instances for applications, Memcache, and search servers. Along with Amazon EC2, Airbnb uses Amazon EMR (Elastic MapReduce) to manage enormous amounts of data. Amazon EMR supports tools such as:
- Apache Spark
- Apache Hive
- Apache HBase
- Apache Flink
- Apache Hudi
With this service, you can easily scale BigData environments to reduce the effort on provisioning capacity and tuning clusters.
Secure Environment with EC2
EC2 comes with a secure environment that has two firewalls. The first firewall that secures your data is Network Access Control Lists (NACLs). It controls the communication between VPC (Virtual Private Cloud) and your system. These lists control access to all the subnets in your cloud environment. So if your organization is looking for a secure VPC environment, Amazon EC2 is the right choice for you.
Another aspect of cloud security is the EC2 instances themselves. To secure communication with each EC2 instance, AWS provides security groups. You don’t need to assign a security group if you are using the EC2 API, as AWS automatically assigns one. But if you use the EC2 Console, you need to set different rules for inbound and outbound traffic to the EC2 instances.
Here we come to the final question: which one of these services is best suited for my application? Well, there is no direct answer. There are many factors which you need to analyze and one of the major ones is your specific use case.
If you’re wasting your compute resources due to unpredictable traffic for your application but still want a scalable and cost-friendly solution, AWS Lambda is for you. When not to use AWS Lambda? When you want to do complex processing and your process can’t be executed in the limited execution time.
Or maybe you want to run a complex application that has consistent traffic and want to operate in a tried and tested deployment environment. In that case, EC2 is for you. The only drawbacks are a complex setup environment and the provisioning of servers.
The result: AWS Lambda and EC2 each serve highly specific use cases. One did not meet every need, which necessitated the invention of the other.
Until then, let’s make the most of each of the services. If you’ve hands-on experience with either EC2 or Lambda or both, I’d love to hear from you. Hit me up on Twitter @RohitAkiwatkar.