In the past few months, we have witnessed interesting developments in the serverless ecosystem. Many organizations are adopting serverless architectures to build modern event-driven applications. Today, I want to discuss one of the least talked-about topics: serverless databases.
I am sure most people are aware of the importance of testing environments for software and applications, especially test databases. They are used infrequently, for short periods, and under unpredictable loads.
However, many of us end up spending a major chunk of the project budget on test databases. I have personally seen many projects burn thousands a month in database costs because they replicate the environment to test branches.
You might say that this is the only option! Even I thought so until recently, when AWS Aurora Serverless was announced! It was the biggest news for all of us from AWS re:Invent 2017, a complete game changer!
No need to provision large ‘x’ instance types to support a test run that happens at 3 AM and then sits idle for the rest of the day. Since storage and processing are priced separately, switching to a serverless database can cut costs significantly. (We’ll discuss this in the pricing section ahead, stay tuned!)
However, it all depends on how well serverless databases meet the requirements of modern applications with rich user experiences.
If we look at the current picture, the massive investment in database infrastructure has left organizations less responsive to the demands of digital business.
The reason is, perhaps, the diversity of database workloads, which is not easy to support for modern applications. On this point, the traditional database has failed to keep up with IT demands.
In this post on serverless architecture, we will discuss the challenges of using traditional databases for modern event-driven applications. We will identify the limitations of present-day databases and introduce an alternative approach in the form of the ‘serverless database.’
Need of Serverless Database
Disruptive competition has forced organizations to innovate rapidly and to keep up the pace. It might look like the hard way, but it is the only way to stay aligned with business teams.
That isn’t all: management teams also expect lower resource costs and a higher return on investment from technology projects.
As we discussed earlier, modern applications are thriving because they enable rich user experiences. These range from lightweight interactions such as likes, follows, purchases and payroll entries to handling massive amounts of data, for example from IoT sensors embedded in devices.
To thrive in this race for rich user experiences, applications need data in real time. Look at your phone and tell me how many apps don’t use real-time data processing. Hardly a couple of them, right?
It is quite old-school to invest your time, money and effort in scaling a single, monolithic application system. What organizations need today is a system that scales without costing them the ability to continually adapt and innovate, rather than one that ties them down to managing database infrastructure.
In this varied landscape, serverless databases play an expanding and unique role by delivering secure, safe and scalable access, all the while reducing your operational costs.
What is Serverless Database?
We are all aware of cloud databases such as AWS Aurora, which is compatible with MySQL and PostgreSQL, is fully managed, and automatically scales up to 64TB of database storage.
While creating this database, you choose the desired instance size, and this works really well in an environment with a predictable workload, request rate and processing requirements.
However, when the workload isn’t predictable and requests arrive in bursts lasting a few minutes a day or a week, provisioning the right amount of capacity can be a lot of work. At the same time, paying for that capacity continuously might not be the best solution.
And here’s where the serverless database comes into the picture. It is designed specifically for workloads that are unpredictable and can change rapidly. What’s more? It allows you to pay only for the database resources you use, on a second-by-second basis.
How does it Work?
A serverless database such as AWS Aurora Serverless comes with an on-demand auto-scaling configuration. This means the database starts up, scales capacity to match your application’s demand, and shuts down when not in use.
What’s more? You run your database in the cloud without managing instances or clusters.
The serverless database model is built on the separation of storage and processing.
You create an endpoint, set the minimum and maximum capacity if you like, and issue queries to the endpoint. The endpoint works as a proxy to a rapidly scaled fleet of database resources. This allows your connections to remain intact while scaling operations happen behind the scenes.
The separation of storage and processing brings another benefit as well. You can scale processing down to zero and pay only for storage. Whenever your application demands it, scaling happens in about 5 seconds, drawing on a pool of “warm” resources that are ready to serve your requests.
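To make the minimum/maximum capacity idea concrete, here is a small toy model in Python. It is only a sketch: the class name `ServerlessEndpoint`, the `handle` method and the integer "demand" figure are invented for illustration and are not part of any AWS API. The real service decides when and how far to scale.

```python
from dataclasses import dataclass


@dataclass
class ServerlessEndpoint:
    """Toy model of an endpoint with capacity limits (hypothetical, not an AWS API)."""
    min_capacity: int
    max_capacity: int
    capacity: int = 0  # 0 means scaled to zero: only storage would be billed

    def handle(self, demand: int) -> int:
        """Pick a capacity for the current demand and return it."""
        if demand <= 0:
            # No traffic: shut processing down completely.
            self.capacity = 0
        else:
            # Clamp the requested capacity between the configured limits.
            self.capacity = max(self.min_capacity, min(demand, self.max_capacity))
        return self.capacity


endpoint = ServerlessEndpoint(min_capacity=2, max_capacity=8)
for demand in (0, 1, 5, 20, 0):
    print(f"demand={demand:>2} -> capacity={endpoint.handle(demand)}")
```

The point of the sketch is that the caller only ever talks to the endpoint; the capacity behind it moves between zero and the configured maximum without the caller changing anything.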
Pricing Comparison: Serverless vs Cloud
Let’s consider how AWS Aurora Serverless, one of the most talked-about serverless databases, bills you.
With it, billing is based on Aurora Capacity Units (ACUs), each a combination of compute power and memory, and you pay for the database capacity and I/O your database consumes while it is active. Presently, 1 ACU costs $0.06 per hour.
Each ACU has approximately 2GB of memory with corresponding CPU and networking. You pay a flat rate per second of ACU usage, with a minimum of 1 minute of usage every time the database is activated.
Let’s compare the pricing of Aurora Serverless and provisioned Aurora. Suppose you’re creating a test environment where you’ll be running two tests over the span of 24 hours.
#1. For the first test, Aurora runs at a capacity of 4 ACUs for 44 minutes and 6 seconds, performs 60,000 I/O requests, and scales down when the test is complete.
#2. For the second test, Aurora runs at a capacity of 8 ACUs for 30 minutes and 8 seconds, performs 80,000 I/O requests, and scales down when the test is complete.
Here’s what you’ll be billed for! Please note, you are not charged for capacity while the database is scaled to zero. Standard charges still apply for database storage and I/O over the period of 24 hours.
Aurora Serverless: (6.957 ACU-hours) + (200GB of storage for 24 hours) + (140,000 I/O) = $0.42 + $0.66 + $0.03 = $1.11*
Provisioned Aurora: standard 24-hour charges for storage + 140,000 I/O + db.t2.small instance = $3.25*
See the striking difference of $2.14 for just 24 hours? Serverless databases are particularly useful in event-driven applications where the load is intermittent with spikes.
* Prices have been calculated as per the AWS Simple Monthly Calculator. Prices may vary according to configuration and location.
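If you want to check the arithmetic yourself, the short Python sketch below reproduces the serverless figure from the numbers quoted in this section. The ACU price, test durations, storage charge, I/O charge and provisioned total are taken as given from the text; the variable and function names are just illustrative.

```python
ACU_PRICE_PER_HOUR = 0.06  # USD, as quoted above


def acu_hours(acus: int, minutes: int, seconds: int) -> float:
    """ACU-hours consumed by one test run, billed per second."""
    return acus * (minutes * 60 + seconds) / 3600


# (ACUs, minutes, seconds) for the two tests described above
tests = [(4, 44, 6), (8, 30, 8)]

total_acu_hours = sum(acu_hours(*t) for t in tests)
compute_charge = total_acu_hours * ACU_PRICE_PER_HOUR

STORAGE_CHARGE = 0.66     # 200GB for 24 hours, figure taken from the text
IO_CHARGE = 0.03          # 140,000 I/O requests, figure taken from the text
PROVISIONED_TOTAL = 3.25  # provisioned figure taken from the text

serverless_total = compute_charge + STORAGE_CHARGE + IO_CHARGE

print(f"ACU-hours:        {total_acu_hours:.3f}")
print(f"Compute charge:   ${compute_charge:.2f}")
print(f"Serverless total: ${serverless_total:.2f}")
print(f"Difference:       ${PROVISIONED_TOTAL - serverless_total:.2f}")
```

Both tests run longer than the 1-minute minimum, so the minimum billing period does not change the result here.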
Limitations of Traditional Databases
Traditional databases come with serious limitations when it’s time to reach new technical heights! Below are some of the business impacts your organization might be facing because of them. Check them out!
#1. Overspending on Resources
To manage a huge data infrastructure, companies spend an immense amount of money. What’s more, that infrastructure is thinly utilized. With traditional database infrastructure, teams benefit very little from resource sharing and keep wasting money on isolated groups with their own dedicated resources.
#2. Locality of Data
Who doesn’t have a global customer base? Almost nobody! To maintain data availability and low latency, the database is replicated across various data centers. However, because of the networking infrastructure, it is hard to guarantee that requested data is served from the same geographic location as the user.
#3. Higher Fulfillment Time
Even after years of investment, large organizations often run into what we call the ‘database diversity’ problem. Since there are so many databases in use, it is quite hard for the development team to add functionality to all of them at the same time. The result is a longer fulfillment time.
Features of Serverless Database
How you wish serverless databases could be the solution to all your problems! Here are some of the exciting features that you can look forward to.
#1. Multi-tenant Architecture: One of the main benefits of serverless databases is that they can function as a single pool of resources used by multiple projects within your organization.
This is a huge plus point for the development team as they are not required to build application-specific siloed data sources.
This is possible due to its multi-tenant architecture. This enables developers to set up, configure and deploy multiple applications within the same database cluster. No need to know where and how your data is stored!
#2. Dynamic Quality of Service: Multi-tenancy opens up the possibility of prioritizing tenants. You assign a priority to each tenant, and system resources are allocated accordingly.
This empowers operations teams to make maximum use of resources across their organization’s projects. However, keep in mind that this demands a high degree of transparency across the whole database cluster.
With this practice, your resources can run at 60-80% utilization. If utilization spikes towards 90% or above, lower-priority workloads are throttled.
For example, a high-priority workload might be a customer making a purchase, while a low-priority workload might be a batch reporting process.
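Priority-based throttling of this kind can be sketched in a few lines of Python. This is a simplified illustration, not how any particular database implements it: the `Workload` class, the numeric priority scale and the `allocate` function are all hypothetical, and only the 90% threshold comes from the text above.

```python
from dataclasses import dataclass

THROTTLE_THRESHOLD = 0.90  # from the text: spikes towards 90% trigger throttling


@dataclass
class Workload:
    name: str
    priority: int  # hypothetical scale: a higher number means higher priority
    demand: float  # share of cluster capacity requested, between 0.0 and 1.0


def allocate(workloads, threshold=THROTTLE_THRESHOLD):
    """Grant capacity in priority order; anything beyond the threshold is throttled."""
    remaining = threshold
    grants = {}
    for workload in sorted(workloads, key=lambda w: w.priority, reverse=True):
        granted = min(workload.demand, remaining)
        grants[workload.name] = granted
        remaining -= granted
    return grants


grants = allocate([
    Workload("batch-reporting", priority=1, demand=0.5),
    Workload("purchases", priority=10, demand=0.6),
])
for name, share in grants.items():
    print(f"{name}: granted {share:.0%} of cluster capacity")
```

In this run the purchase workload gets everything it asks for, while batch reporting is cut back to whatever is left under the threshold.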
#3. Geo Distribution: Given that most businesses now operate at a global scale, organizations need their data to be available around the world.
A query from an end user in India shouldn’t have to incur the latency of connecting to servers in the USA. With data centers in close proximity, the real-time experience improves considerably. Moreover, the risk of an outage is low, as there is no single point of failure.
A serverless database lets you replicate datasets across the world without any additional tooling or custom development. Various protocols embedded in the serverless database’s networking layer make sure that it responds correctly to failures and performance degradation.
#4. ACID Consistency: There is no way we can compromise data accuracy for real-time availability. In other words, we can’t afford a serverless database that sacrifices transactional consistency for speed and scale.
Admittedly, a modern application’s need for real-time data is easy to meet if you are willing to give up transactional guarantees. But if you do, you also give up the ability to run business-critical workloads.
Serverless databases support ACID transactions with a new approach: placing sequencing and scheduling functionality in a single layer above the data storage system. Queries coming in from the end-user application are thus ordered before the DBMS interacts with the storage system.
This gives the nodes in a cluster enough time to decide which node will handle a transaction before the write begins, while making sure that no two simultaneous processes modify the same data. In this manner, it is possible to scale while making room for ACID transactions as well.
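The idea of ordering transactions in one layer before they touch storage can be illustrated with a single-process toy in Python. This is a deliberately simplified sketch: the `Sequencer` class and `deposit` helper are hypothetical, and a plain lock stands in for the coordination that a real cluster performs across nodes.

```python
import threading


class Sequencer:
    """Toy sequencing layer: orders transactions before applying them to storage."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_seq = 0
        self.store = {}  # stand-in for the data storage system
        self.log = []    # (sequence number, key) in the order applied

    def submit(self, key, update):
        """Assign the next sequence number, then apply update(old_value) to key."""
        with self._lock:
            seq = self._next_seq
            self._next_seq += 1
            self.store[key] = update(self.store.get(key))
            self.log.append((seq, key))
            return seq


def deposit(amount):
    """Build an update function that adds amount to a balance."""
    return lambda balance: (balance or 0) + amount


sequencer = Sequencer()
threads = [
    threading.Thread(target=sequencer.submit, args=("account-1", deposit(10)))
    for _ in range(50)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(sequencer.store["account-1"])
```

Because every write passes through `submit`, concurrent callers never modify the same record at the same time, and the log records one agreed order for all transactions.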
#5. Single Transactional Query Language: As discussed earlier, serverless databases offer a single pool of resources to multiple applications. However, that shouldn’t mean imposing a particular data structure or a single model on all the applications consuming that data.
If we did that, the development team would be stuck working with a rigid model. It would also be impossible to store data the way it was originally meant to be stored.
Moreover, some applications need a strong schema while others need a schema-less model. To address this, an ideal serverless database supports structured as well as unstructured data and lets you choose whether or not to enforce a schema.
Schema optionality lets development teams support use cases for both structured and unstructured data. What’s more, it offers the advantages of a schema where you want one, without forcing its rigidity on applications that don’t need it.
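As a rough illustration of schema optionality, the Python sketch below shows a collection that validates documents only when a schema is supplied. The `Collection` class and its field-to-type schema format are invented for this example and do not correspond to any specific product.

```python
class Collection:
    """Toy collection with an optional schema (hypothetical, for illustration)."""

    def __init__(self, schema=None):
        # schema maps field names to expected Python types; None means schema-less
        self.schema = schema
        self.documents = []

    def insert(self, document: dict) -> None:
        if self.schema is not None:
            for field, expected_type in self.schema.items():
                if field not in document:
                    raise ValueError(f"missing field: {field}")
                if not isinstance(document[field], expected_type):
                    raise TypeError(f"{field} must be {expected_type.__name__}")
        self.documents.append(document)


# Structured data: the schema is enforced on every insert.
orders = Collection(schema={"order_id": int, "total": float})
orders.insert({"order_id": 1, "total": 49.99})

# Unstructured data: anything goes.
events = Collection()
events.insert({"type": "click", "payload": {"x": 10, "y": 20}})
events.insert({"free_text": "sensor rebooted"})
```

Both collections could live in the same cluster; the application decides per collection whether it wants the safety of a schema.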
Benefits of Serverless Database
#1. Real-time Access: You have access to your data at a granular level. Whatever data you store is indexed automatically by default, and those indexes are available immediately.
This means you’ll be able to constantly query, read, write, update and add new items to your serverless database. What’s more? Your functions get easy, instant access to it.
#2. Infinite Scalability: Serverless databases can be scaled up or down anytime you want. As discussed already, they start up or shut down as per the application’s needs.
If your functions are querying, reading or writing data to the same database cluster, it will scale the computing units (ACU in the case of Aurora Serverless) to handle the load.
Due to this automation, all your functions will be able to work in parallel and your data is guaranteed to be consistent.
#3. High Security: Most traditional databases implement only schema-level user authentication, because they were designed for a small number of internal business users. Modern applications, however, are exposed to an untrusted and sometimes malicious global audience.
A serverless database takes care of this by making sure that every application interacting with the same dataset passes through the same access control protocol. As a result, it reduces the attack surface, a critical risk for businesses.
#4. Availability: As businesses go global, it is imperative to have your data replicated across different geographic locations, that is, close to where your users are.
A serverless database gives you exactly that, which helps reduce latency. With this approach, event-driven functions read data from the location closest to the user.
#5. Schemaless: This feature is available only with certain serverless databases. It is quite unique in that it enables you to handle any data output from your functions. This ‘handle anything’ approach makes it extremely easy to integrate serverless databases with your functions.
Use Cases of Serverless Database
When to Use?
#1. Infrequently Used Applications: If you’re building an application that is used for only a few minutes over the course of a day or week, the serverless database is the best fit for you.
For instance, if you have a low-volume blog site and want to pay only for the time users are actually accessing it, this works for you, since you pay for the database resources you consume on a per-second basis.
#2. New Applications: If you’re building a new application and you’re not sure how your user base is going to scale, the serverless database is the best fit for you.
For instance, if you’ve built an app, have no idea how popular it will become, and don’t want to take a chance on capacity, this works for you. Just create an endpoint and let the serverless database auto-scale as per the requirements of your application.
#3. Variable Workloads: If you’re building an application that is likely to see peaks of 30 minutes to several hours, a few times each day or several times per year, the serverless database is the best fit for you.
For instance, applications for HR, budgeting, operational reporting, etc. have highly unpredictable peak timings, so sizing a reserved instance is impractical. With a serverless database, you don’t have to provision for either peak or average capacity.
#4. Test Databases: If you’re building a test database that is only going to be used during your organisation’s working hours, why pay for it when it is not in use? Serverless is the best fit here, as it shuts down automatically when not in use.
Where to Use?
#1. Telecommunications & IoT: Let us assume the scenario of connected cars. You can use events to trigger functions related to car alerts. Let’s take the example of an engine warning light turning on in a connected car.
A function is created which is invoked as soon as there is any change to the sensor data collection. If the data crosses the threshold limit, another function is triggered which sends a notification to the warranty department.
Another function is created which sends maintenance mailers as soon as the threshold conditions in the sensor data collection are met. Since all these events need to occur in real time, a serverless database is the most feasible solution here.
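The connected-car flow can be sketched as a tiny in-memory event system in Python. Everything here is illustrative: the `SensorCollection` class, the `on_change` decorator, the `engine_temp_c` field and the threshold value are assumptions made for the example, not the API of any database or function platform.

```python
class SensorCollection:
    """Toy data collection that calls registered functions on every change."""

    def __init__(self):
        self._readings = []
        self._handlers = []

    def on_change(self, handler):
        """Register a function to run whenever a reading is inserted."""
        self._handlers.append(handler)
        return handler

    def insert(self, reading: dict) -> None:
        self._readings.append(reading)
        for handler in self._handlers:
            handler(reading)


ENGINE_TEMP_THRESHOLD = 110  # hypothetical threshold, in degrees Celsius

notifications = []  # stand-in for real notification and mailing services
collection = SensorCollection()


@collection.on_change
def notify_warranty(reading):
    if reading["engine_temp_c"] > ENGINE_TEMP_THRESHOLD:
        notifications.append(("warranty", reading["car_id"]))


@collection.on_change
def send_maintenance_mailer(reading):
    if reading["engine_temp_c"] > ENGINE_TEMP_THRESHOLD:
        notifications.append(("mailer", reading["car_id"]))


collection.insert({"car_id": "car-7", "engine_temp_c": 95})   # below threshold
collection.insert({"car_id": "car-7", "engine_temp_c": 120})  # triggers both
print(notifications)
```

The functions only run when data arrives, which is why a database that also sits idle (and unbilled) between events pairs well with them.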
#2. Finance: Let us take the case of a financial application. With such applications, you can use events to trigger functions related to a particular user’s bank account. A function can be set to fire whenever the account balance falls below a certain limit.
With an API call, you can access the data stored in the serverless database. You can configure how often the check runs, weekly or monthly, and what counts as a low balance. Since you can’t predict when these events will occur, the serverless database is the best fit.
#3. Marketing & Retail: Let us take the case of a retail business. Whenever a user adds an item to their cart, you can set up functions which trigger further components of the business pipeline, such as updating inventory totals or sending customer information for certain products to the marketing department, which then sends the customer a promotional mailer.
In this case, since the function is decoupled from the app and the database itself, you don’t need to keep them running all the time.
#4. Ecommerce: Let us take the case of e-commerce, where third-party payments are inevitable. Serverless databases streamline the user experience by letting you call these payment APIs securely, without setting up your own servers.
When a user reaches the payment step, you collect their payment information and pass it to your database. You then process the payment by making requests to a third-party payment processing API over HTTPS, write the order to the serverless database, and return a confirmation message to the user once it is finished. All securely, without managing servers.
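The checkout steps above can be sketched as a single function in Python. This is a minimal outline under stated assumptions: `checkout`, `fake_charge` and the `token` field are hypothetical names, the payment call is injected as a plain function so the example runs offline, and a dictionary stands in for the serverless database. In production, `charge` would make an HTTPS request to your payment processor.

```python
from typing import Callable


def checkout(
    payment_info: dict,
    cart_total: float,
    charge: Callable[[dict, float], str],
    orders: dict,
) -> dict:
    """Charge the customer, record the order, and return a confirmation."""
    # Step 1: process the payment through the third-party processor.
    charge_id = charge(payment_info, cart_total)

    # Step 2: write the order to the database (a dict stands in for it here).
    order_id = f"order-{len(orders) + 1}"
    orders[order_id] = {"total": cart_total, "charge_id": charge_id}

    # Step 3: return a confirmation message to the user.
    return {"order_id": order_id, "status": "confirmed"}


def fake_charge(payment_info: dict, amount: float) -> str:
    """Offline stand-in for a real payment API call."""
    return f"charge-for-{payment_info['token']}"


orders_db = {}
confirmation = checkout({"token": "tok_123"}, 49.99, fake_charge, orders_db)
print(confirmation)
```

Injecting the payment call keeps the function easy to test and makes it clear which step leaves your own infrastructure.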
Serverless Database Vendors
A note before we move into this section: there are plenty of databases available in the market, and most of them can be used when building your serverless application. However, the following databases should be preferred, as they are designed to be used with functions and stand out for two distinct features: pricing and real-time responsiveness.
#1. Firebase Realtime Database
Firebase Realtime Database lets you store and sync data between users and devices in real time using a cloud-hosted NoSQL database. Updated data syncs across connected devices in milliseconds. What’s more, data remains available even if your app goes offline, providing a great user experience irrespective of network connectivity.
#2. FaunaDB
FaunaDB presents itself as a cloud database built from the ground up to meet the needs of serverless apps. Its premise is that if you don’t manage your servers, you shouldn’t have to manage databases either. No provisioning, utility pricing and built-in security are some of the prominent features that make FaunaDB an ideal option for teams building serverless applications.
#3. AWS Aurora Serverless
Aurora Serverless is an on-demand auto-scaling configuration for Aurora where the database starts up and shuts down as per the application’s needs. There is no complexity of managing database instances and capacity. It is built on the same fault-tolerant, distributed and self-healing Aurora storage, with 6-way replication to protect you against data loss.
#4. Azure CosmosDB
Azure CosmosDB is a multi-model database with global distribution and horizontal scalability as its core features. It offers turnkey global distribution across any number of Azure regions by transparently scaling and replicating your data according to where your users are. What’s more, CosmosDB guarantees single-digit-millisecond latencies at the 99th percentile anywhere in the world.
While AWS Aurora Serverless and similar serverless databases are in their infancy, the future looks promising. Not all of us can ditch our databases, at least not yet, but we seem to be moving in the right direction.
Infrastructure management is challenging and very painful. It shifts our focus away from the real problem to the undifferentiated heavy lifting of managing databases. Let’s just not talk about it!
What I’d love to talk about is your experiments with any of the serverless databases and the things you are looking forward to. If you think I am missing something, kindly comment or get in touch with me on Twitter @Jignesh_Simform.