Serverless Databases: The Future of Event-Driven Architecture
In this blog, we’re going to dig deep into serverless databases: their functionality, features, benefits, limitations, use cases, and vendor options, illustrated with Amazon Aurora Serverless (explained in a minute). It will help you understand how you can use different databases in custom serverless applications.
Before I explain what AWS Aurora Serverless is and why I’m using it to illustrate this blog, I’d like to draw your attention to a common business problem:
I am sure most of us are aware of the importance of testing environments for software and applications, especially test databases. They are used infrequently, for short periods, and with unpredictable loads.
However, many of us end up spending a major chunk of the budget on test databases. I have personally taken over many projects that burnt thousands a month in database costs because they replicated the environment for every testing branch.
Anyway, AWS Aurora Serverless, announced at AWS re:Invent 2017 and launched in August 2018, turned out to be a game-changer for this problem.
What is a Serverless Database?
A serverless database is a prerequisite for serverless computing. It is designed for workloads that are unpredictable and can change rapidly. What’s more, it lets you pay only for the database resources you use, on a second-by-second basis.
We are all aware of cloud databases such as AWS Aurora, which is compatible with MySQL and PostgreSQL, is fully managed, and automatically scales up to 64TB of database storage.
When creating such a database, you choose the desired instance size. This works really well in an environment with a predictable workload, request rate, and processing requirements.
However, when the workload isn’t predictable and there is a burst of requests for only a few minutes a day or a week, provisioning the right amount of capacity can be a lot of work. At the same time, paying for that capacity on a continuous basis might not be the best solution.
And here’s where the serverless database comes into the picture.
What is AWS Aurora Serverless?
“Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora (MySQL-compatible edition), where the database will automatically start up, shut down, and scale capacity up or down based on your application’s needs. It enables you to run your database in the cloud without managing any database instances. It’s a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.”
With this release, we no longer need to provision large instance types to support test runs that happen at 3 AM and then sit idle for the rest of the day. And, with separate pricing for storage and processing, switching to a serverless database can reduce costs significantly.
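To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. The rates are made-up placeholder numbers, not actual AWS prices, and the function names are hypothetical. Storage is left out on both sides because it is billed separately. The sketch only illustrates how per-second billing compares with an always-on instance for a test database that runs two hours a day.

```python
SECONDS_PER_HOUR = 3600
HOURS_PER_MONTH = 730  # common approximation for one month


def provisioned_monthly_cost(hourly_rate: float) -> float:
    """An always-on instance is billed for every hour of the month."""
    return hourly_rate * HOURS_PER_MONTH


def serverless_monthly_cost(
    rate_per_capacity_unit_hour: float,
    capacity_units: int,
    active_seconds_per_day: int,
    days: int = 30,
) -> float:
    """Per-second billing: compute is paid for only while the database runs."""
    active_hours = active_seconds_per_day * days / SECONDS_PER_HOUR
    return rate_per_capacity_unit_hour * capacity_units * active_hours


# Placeholder rates (assumptions for illustration only, not real prices).
always_on = provisioned_monthly_cost(hourly_rate=0.30)
on_demand = serverless_monthly_cost(
    rate_per_capacity_unit_hour=0.06,
    capacity_units=4,
    active_seconds_per_day=2 * SECONDS_PER_HOUR,  # tests run 2 hours a day
)
print(f"always-on: ${always_on:.2f}/month, serverless: ${on_demand:.2f}/month")
```

With these assumed numbers the always-on instance is billed for 730 hours while the serverless one is billed for 60, which is where the saving comes from.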
Now, let’s get started!
How Does a Serverless Database Work?
AWS Aurora Serverless comes with an on-demand, auto-scaling configuration. This means the database starts up, scales capacity according to your application’s demand, and shuts down when not in use.
What’s more, you run your database in the cloud without managing instances or clusters.
The Serverless Database model is built on the separation of storage and processing.
You create an endpoint, optionally set the minimum and maximum capacity, and issue queries to the endpoint. This endpoint works as a proxy to a rapidly scaled fleet of database resources. This lets your connections remain intact while scaling operations occur behind the scenes.
The separation of storage and processing brings another benefit as well: you can scale down to zero processing and pay only for storage. Whenever your application demands it, scaling happens in roughly 5 seconds, drawing on a pool of “warm” resources that are ready to serve your requests.
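As a hedged illustration, the sketch below builds the request parameters for an Aurora Serverless cluster with a minimum and maximum capacity and automatic pausing. It assumes the boto3 `create_db_cluster` call with `EngineMode="serverless"` and a `ScalingConfiguration` block, as used by the original (v1) Aurora Serverless; the cluster identifier, credentials, and capacity values are placeholders, the helper names are hypothetical, and the current AWS documentation should be checked before relying on any of it.

```python
from typing import Any, Dict


def serverless_cluster_params(
    cluster_id: str, min_capacity: int, max_capacity: int, pause_after_seconds: int
) -> Dict[str, Any]:
    """Build the keyword arguments for an Aurora Serverless cluster request."""
    if min_capacity > max_capacity:
        raise ValueError("min_capacity must not exceed max_capacity")
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-mysql",
        "EngineMode": "serverless",
        "MasterUsername": "admin",          # placeholder
        "MasterUserPassword": "change-me",  # placeholder, never hard-code secrets
        "ScalingConfiguration": {
            "MinCapacity": min_capacity,
            "MaxCapacity": max_capacity,
            "AutoPause": True,  # scale compute to zero when idle
            "SecondsUntilAutoPause": pause_after_seconds,
        },
    }


def create_cluster(params: Dict[str, Any]) -> Dict[str, Any]:
    """Send the request to AWS. Needs boto3 and credentials; not called here."""
    import boto3  # imported lazily so the sketch runs without boto3 installed

    return boto3.client("rds").create_db_cluster(**params)


params = serverless_cluster_params("test-db", 1, 8, 300)
print(params["ScalingConfiguration"])
```

Keeping the parameter building separate from the network call makes the configuration easy to inspect and test without touching an AWS account.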
Features of Serverless Database
Serverless databases come with some exciting features, such as:
#1. Multi-tenant Architecture: One of the bonuses of serverless databases is that they can function as a single pool of resources that can be used by multiple projects within your organization.
This is a huge plus for the development team as they are not required to build application-specific siloed data sources.
This is possible due to its multi-tenant architecture. This enables developers to set up, configure and deploy multiple applications within the same database cluster.
#2. Dynamic Quality of Service: Multi-tenancy opens up the possibility of assigning a priority to each tenant, so that system resources are consumed in accordance with those priorities.
This lets operations teams make maximum use of resources across their organization’s projects. However, keep in mind that it demands a high degree of transparency across the whole database cluster.
With this practice, your resources can run at 60-80% utilization. If a spike pushes utilization towards 90% or above, lower-priority workloads are throttled.
For example, a high-priority workload might be a customer making a purchase, while a low-priority workload might be a batch reporting process.
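The throttling idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how any particular database implements it: the `Workload` class, the `allocate` function, and the 90% threshold parameter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Workload:
    name: str
    priority: int  # lower number = more important
    demand: float  # requested capacity units


def allocate(
    workloads: List[Workload], capacity: float, threshold: float = 0.9
) -> Dict[str, float]:
    """Grant every request in full until total demand crosses the threshold;
    beyond that, serve workloads in priority order and throttle the rest."""
    total = sum(w.demand for w in workloads)
    budget = threshold * capacity
    if total <= budget:
        return {w.name: w.demand for w in workloads}
    grants: Dict[str, float] = {}
    remaining = budget
    for w in sorted(workloads, key=lambda w: w.priority):
        granted = min(w.demand, remaining)
        grants[w.name] = granted
        remaining -= granted
    return grants


spike = [Workload("checkout", 0, 60.0), Workload("batch-report", 1, 50.0)]
print(allocate(spike, capacity=100.0))
```

In the spike above, the purchase workload keeps its full allocation while the batch report is cut back to whatever is left under the threshold.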
#3. Geo Distribution: Given that most businesses operate on a global scale, organizations need their data to be available around the world.
With data centres in close proximity to users, the real-time experience improves considerably. Moreover, an outage is highly unlikely as there is no single point of failure.
A serverless database lets you replicate datasets across the world without any additional tooling or custom development. Protocols embedded in its networking layer make sure that it responds correctly to failures and performance degradation.
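A minimal sketch of the routing idea, assuming a hypothetical set of three replicas: pick the replica geographically closest to the user, and fall back to the next closest when one is marked unhealthy. The region names, coordinates, and function names are illustrative assumptions, and real systems route on measured latency and health checks rather than raw distance.

```python
import math
from typing import Dict, Iterable, Tuple

Location = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical replica locations.
REPLICAS: Dict[str, Location] = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-south": (19.1, 72.9),
}


def haversine_km(a: Location, b: Location) -> float:
    """Great-circle distance between two points on Earth."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_replica(user: Location, unhealthy: Iterable[str] = ()) -> str:
    """Return the nearest replica that is not marked unhealthy."""
    down = set(unhealthy)
    candidates = {name: loc for name, loc in REPLICAS.items() if name not in down}
    if not candidates:
        raise RuntimeError("no healthy replica available")
    return min(candidates, key=lambda name: haversine_km(user, candidates[name]))


paris = (48.9, 2.35)
print(pick_replica(paris))
print(pick_replica(paris, unhealthy=["eu-west"]))
```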
#4. ACID Consistency: We cannot compromise data accuracy for the sake of real-time availability. In other words, we can’t afford for the serverless database to sacrifice transactional consistency for speed and scale.
Serverless databases support ACID transactions by placing sequencing and scheduling functionality in a single layer above the storage system. Thus, queries coming from the end-user application are managed before the DBMS interacts with the data storage system.
This gives the nodes in a cluster enough time to decide which node will handle a transaction before the write begins, while making sure that two simultaneous processes don’t modify the same data. In this manner, it is possible to scale while still making room for ACID transactions.
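The sequencing idea can be illustrated with a toy, single-process stand-in. The `Sequencer` class below is hypothetical: it gives each transaction a sequence number and an owning node, and applies it before admitting the next one, so concurrent writers never interleave on the same key. Real distributed databases use far more elaborate protocols; this only shows the principle.

```python
import threading
from typing import Callable, Dict, List, Tuple


class Sequencer:
    """Orders incoming transactions before they touch storage."""

    def __init__(self, nodes: List[str]) -> None:
        self._nodes = nodes
        self._lock = threading.Lock()
        self._next_seq = 0
        self.store: Dict[str, int] = {}
        self.log: List[Tuple[int, str, str]] = []  # (sequence, node, key)

    def submit(self, key: str, update: Callable[[int], int]) -> int:
        # The lock serialises transactions: each one receives a sequence
        # number and an owning node, and is applied before the next starts.
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            node = self._nodes[sum(key.encode()) % len(self._nodes)]
            self.store[key] = update(self.store.get(key, 0))
            self.log.append((seq, node, key))
            return seq


sequencer = Sequencer(["node-a", "node-b", "node-c"])


def worker() -> None:
    for _ in range(100):
        sequencer.submit("stock", lambda value: value + 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sequencer.store["stock"])
```

Because every update passes through the sequencer, eight concurrent writers still produce one consistent total and a gap-free, ordered log.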
#5. Single Transactional Query Language: Serverless databases offer a single pool of resources to multiple applications. However, imposing a particular data structure or a single model on all the applications consuming that data could cause serious problems for the development team. It would also make it impossible to store data in the way it was originally meant to be stored.
Moreover, some applications might need a strong schema while others need a schema-less model. To address this, an ideal serverless database supports structured as well as unstructured data.
Schema optionality lets development teams support use cases for both structured and unstructured data. It offers the advantages of having a schema without forcing its constraints on applications that don’t need one.
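Schema optionality can be sketched as a collection that validates documents only when a schema is supplied. The `Collection` class and its field-to-type schema format are hypothetical, chosen for brevity; they do not mirror any specific product’s API.

```python
from typing import Any, Dict, List, Optional

Schema = Dict[str, type]  # field name -> required Python type


class Collection:
    """Stores documents; enforces a schema only if one was provided."""

    def __init__(self, schema: Optional[Schema] = None) -> None:
        self.schema = schema
        self.documents: List[Dict[str, Any]] = []

    def insert(self, doc: Dict[str, Any]) -> None:
        if self.schema is not None:
            for field, expected in self.schema.items():
                if field not in doc:
                    raise ValueError(f"missing field: {field}")
                if not isinstance(doc[field], expected):
                    raise TypeError(f"{field} must be {expected.__name__}")
        self.documents.append(doc)


# Structured use case: orders must have an integer id and a float total.
orders = Collection(schema={"id": int, "total": float})
orders.insert({"id": 1, "total": 19.99})

# Unstructured use case: events accept any shape.
events = Collection()
events.insert({"type": "click", "payload": {"x": 10, "y": 20}})
events.insert({"anything": ["goes", "here"]})
print(len(orders.documents), len(events.documents))
```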
Benefits of Serverless Database
#1. Real-time Access: You have access to your data at a granular level. Data is indexed by default, and those indexes are available immediately.
This means you’ll be able to constantly query, read, write, update and add new items to your serverless database. What’s more, the data is easily and instantly accessible from functions.
#2. Infinite Scalability: Serverless databases scale up or down as the application needs, starting up or shutting down on demand.
If your functions are querying, reading or writing data on the same database cluster, it scales the compute units (ACUs, Aurora Capacity Units, in the case of Aurora Serverless) to handle the load.
Thanks to this automation, all your functions can work in parallel while your data stays consistent.
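A simplified sketch of how a capacity target might be chosen: pick the smallest allowed step that covers the current load, clamped between the configured minimum and maximum. The power-of-two step list is an assumption modelled on the capacity values Aurora Serverless has offered, and the function name and the notion of "required units" are hypothetical; the actual scaling logic is internal to the service.

```python
from typing import Sequence

# Assumed capacity steps, in capacity units (illustrative).
CAPACITY_STEPS = (1, 2, 4, 8, 16, 32, 64, 128, 256)


def target_capacity(
    required_units: float,
    min_capacity: int,
    max_capacity: int,
    steps: Sequence[int] = CAPACITY_STEPS,
) -> int:
    """Smallest allowed step that covers the load, within the min/max bounds."""
    allowed = [s for s in steps if min_capacity <= s <= max_capacity]
    if not allowed:
        raise ValueError("no capacity step within the configured bounds")
    for step in allowed:
        if step >= required_units:
            return step
    return allowed[-1]  # load exceeds the maximum: cap at the top step


for load in (0, 5, 40, 500):
    print(load, "->", target_capacity(load, min_capacity=2, max_capacity=64))
```

Note that the maximum acts as a hard cap, so scalability is only as "infinite" as the configured upper bound allows.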
#3. High Security: Modern applications are exposed to an untrusted and potentially malicious audience at a global level.
A serverless database addresses this by making sure that all applications interacting with the same dataset pass through the same access-control protocol. As a result, it reduces the attack surface, a critical risk for businesses.
#4. Low Latency: A serverless database also helps you reduce latency. With this approach, data from event-driven functions is read from the location closest to the user.
#5. Schemaless: With a schemaless model, you can handle any data output from your functions. This ‘handle anything’ approach makes it extremely easy to integrate serverless databases with your functions, though it is not a feature every serverless database offers.
Limitations of Traditional Databases
Traditional databases show significant limitations as new technologies step in. Here are some limitations that impact businesses:
#1. Overspending on Resources
Companies spend an immense amount of money managing huge data infrastructure. With traditional database infrastructure, they benefit very little from resource sharing and keep wasting money on isolated teams with dedicated resources.
#2. Locality of Data
To maintain data availability and low latency, the database is replicated across various data centres. However, due to the networking infrastructure, it is hard to guarantee that requested data is returned from the same geographic location as the request.
#3. Higher Fulfillment Time
Even after years of investment, large organizations often run into what we call the ‘database diversity’ problem. With many different databases in use, it is hard for the development team to add functionality across all of them. The result is a longer fulfilment time.
Use Cases of Serverless Database
When to Use?
If you’re in any of these situations, a serverless database is your knight in shining armour:
#1. Infrequently Used Applications: An application that is used for only a few minutes over the course of a day or week.
For instance, if you have a low-volume blog site and want to pay only for the time users are actually accessing it, this works for you, since you pay for the database resources you consume on a per-second basis.
#2. New Applications: An application where you’re unsure how your user base is going to scale.
For instance, if you’ve built an app, have no idea how popular it will become, and don’t want to take a chance on sizing, this works for you. Just create an endpoint and let the serverless database auto-scale as per the requirements of your application.
#3. Variable Workloads: An application that sees peaks of, say, 30 minutes to several hours a few times each day or several times per year.
For instance, applications for HR, budgeting, and operational reporting have highly unpredictable peak times, so reserving instances is impractical. With a serverless database, you don’t have to provision for either peak or average capacity.
#4. Test Databases: A test database is typically used only during your organisation’s working hours, so why pay for it when it is not in use? Serverless is the best fit here, as it shuts down automatically when not in use.
Where to Use?
#1. Telecommunications & IoT: Let us assume the scenario of connected cars. You can use events to trigger functions related to car alerts. Let’s take the example of the check-engine light turning on in a connected car.
A function is invoked as soon as there is any change to the sensor data collection. If the reading crosses the threshold, another function is triggered, which sends a notification to the warranty department.
Another function sends maintenance mailers as soon as the threshold conditions on the sensor data collection are met. Since all these events need to be handled in real time, a serverless database is the most feasible solution here.
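The connected-car flow can be sketched as follows. Everything here is a hypothetical in-memory stand-in: `SensorCollection` plays the role of a database collection that emits change events, the temperature threshold is an invented number, and the notification functions just append to a list instead of contacting anyone.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
ENGINE_TEMP_LIMIT_C = 110.0  # hypothetical threshold


class SensorCollection:
    """In-memory stand-in for a collection that emits change events."""

    def __init__(self) -> None:
        self.readings: Dict[str, float] = {}
        self._listeners: List[Callable[[Event], None]] = []

    def on_change(self, listener: Callable[[Event], None]) -> None:
        self._listeners.append(listener)

    def write(self, car_id: str, engine_temp_c: float) -> None:
        self.readings[car_id] = engine_temp_c
        event = {"car_id": car_id, "engine_temp_c": engine_temp_c}
        for listener in self._listeners:
            listener(event)


notifications: List[str] = []


def notify_warranty(event: Event) -> None:
    notifications.append(f"warranty: check engine on {event['car_id']}")


def send_maintenance_mailer(event: Event) -> None:
    notifications.append(f"mailer: schedule maintenance for {event['car_id']}")


def check_engine(event: Event) -> None:
    # Runs on every data change; triggers follow-up functions past the limit.
    if event["engine_temp_c"] > ENGINE_TEMP_LIMIT_C:
        notify_warranty(event)
        send_maintenance_mailer(event)


sensors = SensorCollection()
sensors.on_change(check_engine)
sensors.write("car-1", 92.0)   # below the limit: nothing happens
sensors.write("car-2", 118.5)  # above the limit: both functions fire
print(notifications)
```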
#2. Finance: Let us take the case of financial applications. With these applications, you can use events to trigger functions related to the bank account of a particular user. Such a function can be triggered whenever the account balance falls below a certain limit.
With an API call, you can access the data stored in the serverless database. The checks can be scheduled weekly or monthly, and you can configure what counts as a low balance. Since you can’t predict when these events will occur, a serverless database is a good fit.
#3. Marketing & Retail: Let us take the case of the retail business. Whenever a user adds a particular item to their cart, you can set up functions that trigger further components of the business pipeline, such as updating inventory totals or sending customer information for certain products to the marketing department, which then sends the customer a promotional mailer.
In this case, since the function is decoupled from the app and the database itself, you don’t need to keep them running all the time.
#4. Ecommerce: Let us take the case of e-commerce apps, where third-party payments are inevitable. Serverless databases streamline the user experience by letting you call these payment APIs securely, without setting up your own servers.
When a user reaches checkout, you collect their payment information, process the payment by making a request to a third-party payment processing API over HTTPS, write the order to the serverless database, and return a confirmation message to the user once it is finished. All securely, without managing servers.
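The checkout steps above can be sketched as a single function. This is an illustration under stated assumptions: `fake_gateway` stands in for a real third-party payment API that would be called over HTTPS, the `orders` list stands in for the serverless database, and all names and the token format are hypothetical.

```python
from typing import Any, Callable, Dict, List

Charge = Callable[[str, int], bool]


def fake_gateway(card_token: str, amount_cents: int) -> bool:
    """Stand-in for a third-party payment API reached over HTTPS."""
    return card_token.startswith("tok_") and amount_cents > 0


def checkout(
    orders: List[Dict[str, Any]],
    charge: Charge,
    user_id: str,
    card_token: str,
    amount_cents: int,
) -> Dict[str, Any]:
    # 1. Process the payment through the third-party gateway.
    if not charge(card_token, amount_cents):
        return {"status": "declined"}
    # 2. Write the order to the database (only the order, not card details).
    order = {
        "order_id": len(orders) + 1,
        "user_id": user_id,
        "amount_cents": amount_cents,
    }
    orders.append(order)
    # 3. Return a confirmation to the user.
    return {"status": "confirmed", "order_id": order["order_id"]}


orders: List[Dict[str, Any]] = []
print(checkout(orders, fake_gateway, "user-42", "tok_abc123", 2599))
print(checkout(orders, fake_gateway, "user-42", "invalid", 2599))
```

Passing the gateway in as a parameter keeps the function easy to test and makes it simple to swap in a real payment client later.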
Serverless Database Vendors
There are plenty of serverless database options available in the market, and most of them can be used when building your serverless application. However, the following databases are worth preferring, as they are designed to be used with functions and stand out for two distinct features: pricing and real-time responsiveness.
#1. Firebase Realtime Database
Firebase Realtime Database lets you store and sync data between users and devices in real time using a cloud-hosted NoSQL database. Updated data syncs across connected devices in milliseconds. Moreover, data remains available even if your app goes offline, providing a great user experience irrespective of network connectivity.
#2. FaunaDB
FaunaDB positions itself as a cloud database built from the ground up to meet the needs of serverless apps. Its premise is that if you don’t manage your servers, you shouldn’t have to manage databases either. No provisioning, utility pricing and built-in security are some of the prominent features that make FaunaDB an ideal option for teams building serverless applications.
#3. AWS Aurora Serverless
Aurora Serverless is an on-demand, auto-scaling configuration for Aurora where the database starts up and shuts down as per the application’s needs. There is no complexity in managing database instances and capacity. It is built on the same fault-tolerant, distributed and self-healing Aurora storage, with 6-way replication to protect you against data loss.
#4. Azure CosmosDB
Azure CosmosDB is a multi-model database with global distribution and horizontal scalability as its core features. It offers turnkey global distribution across any number of Azure regions by transparently scaling and replicating your data according to the locality of your users. Moreover, CosmosDB guarantees single-digit-millisecond latencies at the 99th percentile anywhere in the world.
The future of serverless databases, especially AWS Aurora Serverless, looks promising. The features of this modern technology have enabled us to focus on essentials like real-time access, scalability, security, and availability.
Having said that, it is not feasible to ditch existing databases all at once. Infrastructure management is challenging, and you could end up losing focus on the real problem of managing databases.
I’d love to hear about your experiments with any of these serverless databases and what you are looking forward to. If you think I am missing something, kindly comment or get in touch with me on Twitter @Jignesh_Simform.