Pixis: An AI-driven Data Analytics Platform

Category: Advertising and marketing
Services: DevOps, Migration, Cloud Architecture Design and Review, Managed Engineering Teams.

  • 25% reduction in ETL infrastructure costs
  • 30% improvement in data processing efficiency
  • 40% reduction in data latency for real-time processing

About Pixis

Pixis.ai helps brands scale marketing through data infrastructure and modeling, enabling data-driven decision-making in the face of complex consumer behavior. It offers a codeless AI platform that leverages machine learning and optimization algorithms to analyze audience, media platforms, budget allocation, and other marketing data.

Problem statement

  • The platform drew data from multiple sources, which made it difficult to make all the information available in centralized storage.
  • Pixis also faced slow development cycles and lacked a standardized process for tracking infrastructure costs.
  • High costs were affecting the company's profitability and revenue.
  • The data infrastructure and background tasks had scalability issues, and several services still needed to be migrated to the new infrastructure.

Proposed solution

  • Our team used Amazon MSK to consolidate data from different systems into a centralized streaming data pipeline.
  • Through MSK, the pipeline captures data in real time from different sources and loads it into a data lake in S3, reducing data silos.
  • We used Amazon Managed Service for Apache Flink, reading from Amazon MSK, to perform real-time and batch processing tasks. This enabled Pixis to transform and analyze data effectively before loading it into the data warehouse.
  • We used Amazon GuardDuty to help secure the data lake infrastructure by continuously monitoring for unauthorized data access.
  • We used Amazon EKS to run the data analytics platform in containers, simplifying its deployment, management, and scaling.
  • We used Amazon S3 to store processed data in the data lake. This cost-effective storage solution provided scalability and flexibility for handling the large volumes of data generated in the ETL process.
  • We set up an observability stack using Grafana, Prometheus, and CloudWatch for monitoring, alerting, and logs.
  • We used Amazon RDS to store various data types, including campaign data, ad accounts, cross-platform engagement scores, ML datasets, and tenant data.
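The consolidation step described in the list above can be illustrated with a small sketch. It normalizes records from two hypothetical sources into one shared envelope and derives a date-partitioned object key of the kind often used in S3 data lakes. The source names, field names, and key layout are assumptions made for illustration and are not details of the Pixis pipeline. A real deployment would publish events to Amazon MSK topics and write objects to S3 through a client library instead of building keys in memory.

```python
import json
from datetime import datetime, timezone

# Hypothetical per-source field mappings; not taken from the Pixis system.
FIELD_MAPS = {
    "ads_platform": {"id": "campaign_id", "ts": "event_time", "value": "spend"},
    "crm": {"id": "account", "ts": "created", "value": "amount"},
}


def normalize(source: str, record: dict) -> dict:
    """Map a source-specific record onto one shared event envelope."""
    fields = FIELD_MAPS[source]
    # Timestamps are assumed to be Unix epoch seconds.
    ts = datetime.fromtimestamp(record[fields["ts"]], tz=timezone.utc)
    return {
        "source": source,
        "entity_id": str(record[fields["id"]]),
        "event_time": ts.isoformat(),
        "value": float(record[fields["value"]]),
    }


def lake_key(event: dict, prefix: str = "raw") -> str:
    """Build a date-partitioned object key (assumed layout)."""
    day = event["event_time"][:10]  # YYYY-MM-DD portion of the ISO timestamp
    return f"{prefix}/source={event['source']}/dt={day}/{event['entity_id']}.json"


event = normalize("crm", {"account": 42, "created": 0, "amount": "19.5"})
print(lake_key(event))
print(json.dumps(event, sort_keys=True))
```

Partitioning keys by source and date is one common choice because it lets downstream batch jobs read only the slices they need. Other layouts would work equally well here.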


Results

  • Optimizing data storage and resource allocation led to a 25% reduction in infrastructure costs associated with ETL operations.
  • Containerizing the ETL infrastructure on Amazon EKS enabled Pixis to scale its data processing capacity dynamically and accommodate fluctuating workloads.
  • Streamlining the ETL process using AWS services resulted in a 30% improvement in data processing efficiency.
  • Using Amazon MSK for real-time data ingestion and processing reduced data latency by 40%, enabling Pixis to make faster, data-driven marketing decisions.

Architecture Diagram

Pixis Architecture Diagram

AWS Services

  • Amazon MSK – We used Amazon MSK to consolidate real-time data from multiple sources into a centralized streaming pipeline and store it in an S3 data lake, reducing siloed data concerns.
  • Amazon RDS – We leveraged Amazon RDS as our primary storage solution for various data types, including campaign data, ad accounts, cross-platform engagement scores, ML datasets, tenant data, and more.
  • Amazon S3 buckets – Our experts used Amazon S3 buckets as secure storage for various data types in the system, including configuration and customer data files.
  • Amazon EKS – We leveraged Amazon EKS to manage and scale microservices easily, enabling background jobs on a containerized infrastructure.
  • Amazon MQ – We utilized Amazon MQ to facilitate data ingestion from various sources into a centralized repository.
  • AWS Trusted Advisor – Our team used AWS Trusted Advisor to identify overprovisioned resources and improve our security posture.
  • AWS CloudTrail – We used AWS CloudTrail to track user activity for auditing, security monitoring, and operational troubleshooting.
  • Amazon ECR – We used Amazon ECR to securely manage and scan the Docker images of our microservices.
  • AWS Lambda – We used AWS Lambda to trigger and run machine learning pipelines and alerts.
  • Redis – We leveraged Redis to store user sessions, which allowed us to retrieve session data easily and provide a better user experience.
  • AWS Security Hub – We used Security Hub to get a comprehensive view of our security state in AWS and to check that our environment adhered to security industry standards and best practices.
  • AWS NLB – Our team used a Network Load Balancer (NLB) to distribute incoming traffic across multiple targets in different Availability Zones.
  • Amazon CloudWatch – We used CloudWatch to monitor AWS services like RDS and MQ. Customized metrics, dashboards, and alarms helped us respond to issues quickly.
  • AWS Secrets Manager – Our team used AWS Secrets Manager to securely store and manage sensitive information for our microservices, including API keys and database credentials.
  • Amazon GuardDuty – Our experts used Amazon GuardDuty to monitor and secure the entire data lake infrastructure against unauthorized access.
