The client is a leader in logistics that provides delivery services by connecting users with delivery drivers through its mobile and web apps. They were looking for an efficient, optimized solution to manage multiple teams, continuous releases, and remote collaboration while maintaining optimum performance. Though the process sounds simple, in practice it involved many complexities, as big data demands continuous refinement across the entire software development cycle.
The client wanted a seamless, automated DevOps solution to set up a big data environment on the cloud with real-time streaming capabilities and to implement frequent, quality-controlled releases using automated end-to-end pipelines for different applications. Nagarro's DevOps experts studied the challenges in detail and helped the client achieve an Enterprise DevOps solution for their big data platform. The solution powered business strategies and boosted overall growth.
Challenge
To establish an automated big data environment on Kubernetes from scratch and facilitate frequent releases through automated end-to-end pipelines for different applications. Additionally, there was no real-time tracking of delivery drivers; only archived data from the last 24 hours was available for visualization and analysis, and it was inconsistent. Moreover, the client frequently reported data loss and had no proper mechanism to derive filtered, meaningful insights from visualization dashboards such as Tableau. An automated solution was therefore needed to ensure data consistency and accuracy, handle a high throughput of data during real-time streaming, and enforce uniform standards throughout the system.
Solution
Nagarro studied the client’s ecosystem to identify the gaps during the DevOps Assessment & Transformation Workshop. We designed the end-to-end architecture of the entire platform and blueprinted a phase-wise roadmap to achieve the complete big data setup, leveraging DataOps (DevOps over Big Data) best practices and toolchains.
The solution also involved provisioning the big data platform on AWS with kOps and setting up the local development environment with Minikube. To stream delivery-driver data in real time, Kafka source and sink connectors were set up, enabling real-time data flow through multiple topics and queues. The data schema was compared at Kafka's receiver end against the sender's end to maintain the sanity and correctness of the data: any mismatch marked the data as invalid and triggered automated feedback asking the sender to resend it. Finally, Apache Spark was used to compute, filter, and format the relevant data, yielding meaningful insights on visualization dashboards such as Tableau. The sketches below illustrate each of these pieces.
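For illustration, connectors like these are typically registered through Kafka Connect's REST API. Below is a minimal Python sketch; the connector name, JDBC source class, database details, hostnames, and topic prefix are hypothetical stand-ins, not details from the engagement:

```python
import requests  # assumes the requests library is available

# Hypothetical source connector pulling driver-location rows into Kafka.
# All names, hosts, and credentials here are illustrative assumptions.
connector = {
    "name": "driver-location-source",
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:postgresql://db.example.com:5432/deliveries",
        "mode": "timestamp",                      # poll for newly updated rows
        "timestamp.column.name": "updated_at",
        "table.whitelist": "driver_locations",
        "topic.prefix": "stream.",                # rows land on stream.driver_locations
        "tasks.max": "2",
    },
}

# Kafka Connect exposes a REST API; POST /connectors registers a new connector.
resp = requests.post("http://connect.example.com:8083/connectors", json=connector)
resp.raise_for_status()
```

Once registered, Kafka Connect runs the connector tasks itself, so no custom ingestion code is needed.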
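The receiver-side schema check with automated resend feedback might look like the following sketch, using the kafka-python client; the expected fields, topic names, and broker address are assumptions for illustration:

```python
import json
from kafka import KafkaConsumer, KafkaProducer

# Assumed contract agreed with the sender side: field names and types.
EXPECTED_SCHEMA = {
    "driver_id": str,
    "lat": (int, float),
    "lon": (int, float),
    "ts": int,
}

consumer = KafkaConsumer(
    "stream.driver_locations",
    bootstrap_servers="broker.example.com:9092",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="broker.example.com:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def schema_matches(record: dict) -> bool:
    # Compare the received record against the sender-side contract.
    return all(
        field in record and isinstance(record[field], ftype)
        for field, ftype in EXPECTED_SCHEMA.items()
    )

for message in consumer:
    if not schema_matches(message.value):
        # Automated feedback loop: flag the record as invalid and request a resend.
        producer.send(
            "feedback.resend",
            {"topic": message.topic, "offset": message.offset, "reason": "schema mismatch"},
        )
```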
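A minimal Spark Structured Streaming sketch of the compute-filter-format step, assuming a simple driver-location event schema and hypothetical topic and storage paths; the Parquet output stands in for whatever format the Tableau dashboards consume:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, LongType

spark = SparkSession.builder.appName("driver-location-stream").getOrCreate()

# Assumed shape of the validated driver-location events.
schema = StructType([
    StructField("driver_id", StringType()),
    StructField("lat", DoubleType()),
    StructField("lon", DoubleType()),
    StructField("ts", LongType()),
])

# Read the validated Kafka topic, parse JSON payloads, and drop malformed rows.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker.example.com:9092")
    .option("subscribe", "stream.driver_locations")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
    .where(F.col("lat").isNotNull() & F.col("lon").isNotNull())
)

# Write filtered, formatted records to storage that a BI tool can query.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://analytics/driver-locations/")
    .option("checkpointLocation", "s3a://analytics/checkpoints/driver-locations/")
    .start()
)
query.awaitTermination()
```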
Quick Wins
With Nagarro's help, the client achieved a cost-effective, seamless DataOps solution that enabled:
- 30% faster rollouts via automated deployment pipelines across the entire platform.
- Fully automated workflows with no manual intervention. The solution raised awareness of data quality and consistency issues and made systems more responsive through a continuous feedback loop in the solution architecture.
- Multi-fold business growth through the DataOps solution, with increased availability, responsiveness, auto-scaling, and recovery.