10 Observability Success Stories: Organizations that Nailed Observability
Initially, IT teams relied on APM (Application Performance Monitoring) and NPM (Network Performance Monitoring) to monitor and troubleshoot application and infrastructure-level issues. But modern development practices introduced so many distributed components that APM and NPM solutions could no longer provide full-stack visibility.
Hence, observability, due to its ability to provide full-stack visibility inside a distributed IT system, became the natural successor of APM and NPM. With observability, businesses can troubleshoot production-level issues proactively.
In this article, we explore how that works in practice through the success stories of 10 companies that adopted observability.
10 Observability success stories: Learning from top companies
Observability solutions aim to address three significant challenges, i.e., enhancing digital experiences, ensuring high availability and scalability, and keeping the peak performance intact.
Based on these challenges, we have divided 10 companies into three categories. Each category represents a unique challenge that companies solve with end-to-end observability solutions.
Challenge 1: Enhancing digital experience
Exceptional digital experiences are now a necessity. 80% of customer interactions happen over a digital platform, and 70% of customers believe digital experience is critical when evaluating a product.
Observability gives you visibility into the entire IT ecosystem and helps you find the root cause of the issues impacting digital experience. Average response time, error rate, load time, usability, etc., are some KPIs to measure digital experience.
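As a rough illustration of the first two KPIs, the sketch below derives average response time and error rate from a handful of request records. The `Request` type, its field names, and the sample values are hypothetical, and counting only 5xx responses as errors is an assumption; a real observability platform would compute these from its own telemetry.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """One request record (hypothetical shape, for illustration only)."""
    path: str
    duration_ms: float
    status: int


def digital_experience_kpis(requests):
    """Return average response time (ms) and error rate for a batch of requests."""
    if not requests:
        raise ValueError("at least one request is required")
    durations = [r.duration_ms for r in requests]
    # Assumption: only server-side failures (HTTP 5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "avg_response_ms": mean(durations),
        "error_rate": errors / len(requests),
    }


# Made-up sample data
sample = [
    Request("/checkout", 120.0, 200),
    Request("/checkout", 480.0, 500),
    Request("/search", 90.0, 200),
    Request("/search", 110.0, 200),
]
kpis = digital_experience_kpis(sample)
print(kpis)
```

Averages hide slow outliers, so teams often track percentiles such as p95 alongside the mean.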
1. Dubai Customs reduced the number of tests by 90% and sped up the release cycle by 70%
Dubai Customs used Mirsal, a mission-critical app that processed the movement of goods in and out of the Emirates. Any failure in this app led to long delays and tailbacks at the border, causing chaos in trade. So, maximizing the uptime of the application was critical.
Moreover, Dubai Customs aspired to contact their customers proactively to help complete their transactions in case of a problem. It couldn’t fulfill these requirements with the legacy monitoring tool it used.
So, it opted for an end-to-end observability solution that continuously tracks Mirsal and ensures higher uptime and a good user experience.
The solution helped Dubai Customs identify the root cause of the issue impacting the user experience. It accelerated the time-to-market and release cycle by 70%.
Teams used a shift-left approach to eliminate bugs in pre-production, resulting in high-quality releases. They also ran all tests in a unified solution, reducing the number of tests from 30 to 3 for each release.
2. TSB bank accelerated digital innovation and enhanced customer experiences
TSB is the UK’s 7th largest retail bank.
To expand its digital footprint and drive innovation, it built a modern banking platform on AWS, IBM Cloud, and BT Cloud. However, its transition to multi-cloud architecture introduced many distributed components, making it hard to gain full-stack visibility into the components.
Too much of the teams’ time was consumed by reactive problem-chasing. As a result, TSB had to compromise on efficiency and customer experience.
So, TSB adopted an observability solution that tracked all the components and dependencies across its multi-cloud environment, giving it real-time insight into customer experience. Observability also helped engineers analyze the root causes of problems and resolve them before the changes went live in the production environment.
3. Channel 7 delivered an A-grade streaming experience with 100% uptime
Channel 7 is Australia’s #1 free-to-air commercial television network and home to the most loved news, sports, and entertainment programs. It had the rights to stream and televise mega sporting events such as the AFL Grand Final, the 2020 Tokyo Olympics, and the 2022 Winter Olympics.
To capture the imagination of millions of people, it had to ensure higher uptime and an A-grade streaming experience. For that purpose, understanding how its infrastructure and components would scale was paramount.
However, its homegrown tools for simple monitoring and troubleshooting were insufficient for the scale of mega sporting events. This led to the shift to observability.
Channel 7 is now able to capture rich and valuable data at the application level to understand customer journey and identify the issues affecting user experience. It can determine the number of users the infrastructure can support at a particular time.
Opting for observability was a record-breaking win for Channel 7!
It ensured 100% uptime and a flawless user experience. As a result, Channel 7 could stream 4.7 billion minutes during the Tokyo 2020 Olympics, 32.6 million minutes during the AFL Grand Final, and 376 million minutes during the 2022 Winter Olympics.
4. Swiggy served more than 30 million users and increased productivity by 10%
Swiggy is one of India’s leading food ordering and delivery platforms. Operating in over 500 cities with more than 30 million customers, it needed to deliver a comprehensive digital experience while maintaining sustainable user growth.
The major challenge for Swiggy was to optimize operational efficiency and ensure the platform’s reliability, scalability, and stability. With the expansion of business, the tech stack became complex. Simple monitoring/troubleshooting was insufficient to provide insights into the end-user experience.
However, observability empowered its engineers to get customer insights within 15 minutes, based on which they could identify and prevent outages. Observability also allowed Swiggy to identify the areas of UI where most users land, which helped them to optimize app design and enhance the app experience.
Challenge 2: Ensuring high availability and scalability
Observability helps detect infrastructure and application-level issues before they reach production. It captures telemetry data such as logs, traces, and metrics to reveal how much load the production environment can handle and to keep applications available even under tremendous load, thereby facilitating scalability.
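One simple way to read load capacity out of metrics is to look for the highest request rate at which latency still met a target. The sketch below assumes hypothetical samples of (requests per second, p95 latency in milliseconds) and a made-up 300 ms target; it is a naive illustration, not how any particular platform estimates capacity.

```python
def max_sustainable_load(samples, latency_target_ms):
    """Return the highest observed request rate whose p95 latency met the target.

    `samples` is an iterable of (requests_per_second, p95_latency_ms) pairs.
    Returns None if no sample met the target.
    """
    within_target = [rps for rps, p95 in samples if p95 <= latency_target_ms]
    return max(within_target) if within_target else None


# Made-up load-test samples: latency degrades sharply past 400 requests/second.
samples = [(100, 120.0), (200, 150.0), (400, 240.0), (800, 900.0)]
capacity = max_sustainable_load(samples, latency_target_ms=300.0)
print(capacity)
```

A figure like this is only as good as the samples behind it, so it is best treated as a starting point for capacity planning rather than a guarantee.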
5. Braze shortened their processing time by 90%
Braze is an American cloud-based organization that develops CRM software. The solution processes 8 billion API requests daily. With such massive operations and an expanding business, it had to overcome the challenges of scaling, debugging incidents, and rapidly resolving customer support tickets.
Moreover, customer support tickets were passed directly to the product and DevOps teams, which hampered their productivity and efficiency. There was no centralized platform through which the company could involve the customer success team, which works with customers and resolves support tickets rapidly.
That’s when they shifted to observability.
Observability helped identify potential problems along with user interactions and their pain points. The gathered customer data helped the customer success team resolve customer issues quickly.
After fixing the problems related to performance, Braze ensured higher availability for the application. The engineering team scaled up its infrastructure without worrying about production issues.
Lastly, one unified platform instead of multiple tools reduced processing time by 90%.
6. Lenovo ensured 100% uptime while reducing MTTR by 85%
Lenovo provides smart devices for consumers and businesses through offline and online stores. With most of its sales coming from e-commerce, Lenovo relied on monitoring solutions to get visibility into every process and ensure seamless shopping experiences. However, visibility became an issue as the business grew and infrastructure moved to the cloud from on-premises data centers. It led to unexpected traffic spikes.
Lenovo then opted for an observability solution to monitor infrastructure performance and identify performance bottlenecks.
As a result, MTTR fell by 85%, from 30 minutes to a mere five minutes, which means faster troubleshooting and increased developer productivity.
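For readers who want to track the same metric, here is a minimal sketch of how MTTR and a percentage reduction could be computed from incident timestamps. The incident data and function names are hypothetical, and the article does not say how Lenovo measured its figures. Note that the quoted endpoints (30 minutes to five) work out to roughly 83%, so the 85% figure presumably reflects rounding or unrounded underlying values.

```python
from datetime import datetime, timedelta


def mttr_minutes(incidents):
    """Mean time to resolution in minutes for (detected, resolved) timestamp pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total.total_seconds() / 60 / len(incidents)


def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


# Made-up incidents resolved in 4 and 6 minutes
start = datetime(2024, 1, 1, 9, 0)
incidents = [
    (start, start + timedelta(minutes=4)),
    (start + timedelta(hours=3), start + timedelta(hours=3, minutes=6)),
]
current_mttr = mttr_minutes(incidents)
reduction = percent_reduction(30, current_mttr)
print(current_mttr, round(reduction, 1))
```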
Observability also ensured 100% uptime, which translates to continuous availability. Lastly, predictive machine learning analytics provided real-time data and actionable insights, based on which Lenovo could reliably scale up the infrastructure.
7. 2xConnect reduced downtime by 60% and increased conversion by 20%
2xConnect, a B2B telemarketing startup, adopted observability to increase its customer base, improve productivity, and reduce costs.
To reach potential customers and increase revenue, 2xConnect aimed for agile services and higher uptime.
With an observability solution, the engineering team could detect bugs in real time through log alerts and security-related monitoring. They were able to reduce call downtime by 60%.
Observability also increased 2xConnect’s conversion rate by 20% while ensuring 100% uptime during its Heroku-to-AWS migration.
Challenge 3: Keeping peak performance intact
Performance is closely related to the user experience of your application. Keeping the peak performance intact while the user base grows is crucial.
Observability helps you detect, troubleshoot, and eliminate application and infrastructure-level errors. So, as the user base grows, you can capture the performance data and analyze whether the application sustains peak performance while serving more users.
8. Care.com reduced MTTR by 85% and increased deployment frequency by 10x
Care.com opted for a microservice-based architecture to address availability, scalability, and reliability challenges. However, to ensure a seamless transition, it needed a centralized monitoring repository and telemetry data that could provide granular details about various components.
That’s when Care.com opted for observability, which identified pages with poor views and load times. It helped Care.com optimize those pages to ensure a world-class user experience.
9. IGA Bahrain reduced downtime by 80% and optimized app performance
The Kingdom of Bahrain maintained all government data through the Information & eGovernment Authority (IGA). There were more than 35 apps for e-governance. With distributed infrastructure and millions of citizens accessing these applications, monitoring integrated services and apps was not a piece of cake!
They had to ensure high availability, troubleshoot faster, and prevent future service interruptions.
That’s when IGA Bahrain opted for an end-to-end observability solution to view all the dependencies across 35+ applications.
It helped IGA Bahrain conduct root cause analysis and reduced service interruption downtime by 80%. MTTR also fell significantly, which improved both the user experience and application performance.
10. HelloFresh improved organization-wide performance and reduced cognitive load
To enhance coordination across teams, HelloFresh, one of the world’s leading meal kit companies, developed the ‘Platform & Payment Alliance’ model. However, the platform relied on a series of troubleshooting and monitoring tools, each of which introduced its own dependencies and added to the maintenance burden and overall load. Most importantly, much of the developers’ productive time was consumed by repetitive maintenance and monitoring tasks.
HelloFresh, thus, shifted to observability, a unified tool that enabled faster incident resolution while reducing cognitive load and time spent on maintenance.
HelloFresh could set alerts for specific problems and conduct root cause analysis. Observability ensured minimal downtime and helped HelloFresh improve the app’s overall performance.
Observability enhances digital experience by speeding up the release cycle and accelerating digital innovation. It ensures high availability by decreasing processing time and MTTR and increasing customer conversions. It also maintains peak performance while increasing deployment frequency and reducing downtime and cognitive load.
We, at SIMFORM, also experienced these benefits of observability when we collaborated with an auto dealer to modernize the marketing and analytics platform. With 1000+ car dealers using the platform, the challenge was to monitor the performance and reliability of the software.
We used infrastructure observability and captured MTTR, system uptime, response time, the number of requests, etc., to maintain peak performance and ensure reliability. As a result, our client experienced a 75% increase in lead conversion rate with a 25% retention rate.
Similarly, we can also make your IT systems observable. Collaborate with our engineering team for further consultation and ensure digital transformation for your business.