As more businesses decide that they don't want to manage infrastructure and IT operations themselves, the ‘Cloud First’ approach is becoming the norm rather than a choice. With Amazon Web Services (AWS) and Azure leading this domain, Google has often been criticized for lacking the institutional will to build a robust cloud platform that enterprises would want to adopt.
However, this outlook is changing. Google is not just investing heavily in its cloud platform, but also bringing the enterprise mindset and ecosystem into the Google Cloud Platform (GCP).
A few services that make the Google Cloud Platform (GCP) unique are:
BigQuery
A fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics.
Simply put, BigQuery is the public interface to Dremel, one of Google’s most successful internal tools. It can scan millions of rows without an index in a matter of seconds and is an excellent alternative to Hive on Hadoop or Spark SQL.
However, limited integration with some industry-standard business intelligence tools and the lack of support for user-defined precision alternatives are among BigQuery’s major shortcomings.
Of course, as with any data warehouse, it is hard to predict how much data will be stored and processed in BigQuery in the future. Our solution experts at Nagarro, however, generally address this problem with calculation tools that optimize usage and predict costs, relying on long-term storage, query and usage optimization, data expiration policies, partitioning, and sharding.
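As a rough illustration of the kind of calculation such a tool performs, the Python sketch below estimates a monthly bill from bytes scanned and bytes stored, then compares daily full-table scans with queries restricted to a few date partitions. It is only a sketch under stated assumptions: the `Rates` values are placeholders rather than real prices, the function names are hypothetical, and data is assumed to be spread evenly across days.

```python
from dataclasses import dataclass

GB = 10**9
TB = 10**12


@dataclass
class Rates:
    """Hypothetical unit prices; replace with figures from the current price list."""
    query_per_tb: float           # price per TB scanned by queries
    active_storage_per_gb: float  # price per GB-month for recently modified data
    long_term_per_gb: float       # price per GB-month for data left unmodified


def monthly_cost(rates, scanned_bytes, active_bytes, long_term_bytes):
    """Estimate one month's bill from bytes scanned and bytes stored."""
    query = scanned_bytes / TB * rates.query_per_tb
    storage = (active_bytes / GB * rates.active_storage_per_gb
               + long_term_bytes / GB * rates.long_term_per_gb)
    return query + storage


def scanned_with_partitioning(table_bytes, days_in_table, days_queried):
    """Bytes scanned when a date-partitioned table is filtered to a date range.

    Assumes data is spread evenly across days, which real tables rarely are.
    """
    return table_bytes * min(days_queried, days_in_table) / days_in_table


# Placeholder rates, not real prices.
rates = Rates(query_per_tb=5.0, active_storage_per_gb=0.02, long_term_per_gb=0.01)

table = 2 * TB

# Scenario 1: one full-table scan per day, all data billed as active storage.
full_scan = monthly_cost(rates, scanned_bytes=30 * table,
                         active_bytes=table, long_term_bytes=0)

# Scenario 2: each daily query reads only the last 7 days of a one-year table,
# and three quarters of the data has aged into long-term storage.
pruned = monthly_cost(rates,
                      scanned_bytes=30 * scanned_with_partitioning(table, 365, 7),
                      active_bytes=table // 4, long_term_bytes=3 * table // 4)

print(f"full scans: {full_scan:.2f}, partitioned + long-term: {pruned:.2f}")
```

With these placeholder inputs the query charge dominates the bill, which is why partitioning and query optimization tend to be the first levers to examine.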
Cloud Dataflow
A managed service for developing and executing a wide range of data processing patterns using a unified programming model.
Big data applications often face challenges such as a lack of resource capacity, a wide range of data processing patterns, and the need to manage multiple services for various data processing pipelines. This leads to both performance and operational issues. Dataflow provides a simplified and developer-friendly way to approach this issue.
While the Cloud Dataflow runner service abstracts the application from low-level details, the Dataflow SDK allows developers to create data processing jobs and focus on logical composition rather than physical orchestration. Pipeline design matters, because flaws in the design can greatly increase the number of reads and writes, which eventually degrades performance.
At Nagarro, we recently recommended Cloud Dataflow together with BigQuery to a customer whose requirements centered on the IoT space, with an emphasis on real-time event processing and stream processing.
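To make the point about pipeline design concrete, here is a toy Python sketch. It is not the Dataflow SDK: `run`, `keep_hot`, and `enrich` are hypothetical names, and the "runner" simply applies steps in order while counting how many records each step reads. The same logical result is composed in two ways, and filtering first means the later step reads far fewer records.

```python
from typing import Callable, List

# A transform takes a list of records and returns a new list of records.
Transform = Callable[[List[dict]], List[dict]]


def keep_hot(records):
    """Filter: keep only readings above a temperature threshold."""
    return [r for r in records if r["temp"] > 30]


def enrich(records):
    """Stand-in for an expensive per-record step, such as a lookup."""
    return [{**r, "site": f"site-{r['device'] % 3}"} for r in records]


def run(steps, records):
    """Toy runner: apply steps in order and count the records each step reads."""
    reads = 0
    for step in steps:
        reads += len(records)
        records = step(records)
    return records, reads


# Synthetic sensor readings with temperatures cycling through 20..39.
readings = [{"device": i, "temp": 20 + i % 20} for i in range(1000)]

# Same logical result, two different compositions.
out_a, reads_a = run([enrich, keep_hot], readings)  # enrich everything, then filter
out_b, reads_b = run([keep_hot, enrich], readings)  # filter first, enrich the rest

print(reads_a, reads_b)
```

Both orderings return identical records, but the second one enriches only the readings that survive the filter. In a real pipeline the same kind of choice applies to joins, groupings, and writes to external storage.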
Cloud Dataproc
A fully managed Hadoop and Spark offering.
When an application is tightly coupled with big data tools such as Hadoop and Spark, Dataproc offers a convenient alternative by making the entire process easy to start, run, and manage. Dataproc also abstracts the application from hardware and software installations. Ease of use, together with quick cluster creation and scaling, makes Cloud Dataproc a good fit for many business scenarios.
Although built-in support for on-demand clusters is not available, these clusters can be managed using the software development kit (SDK) and REST API provided by Google Cloud. Also, a cluster is acquired as part of job creation in Google, unlike other MapReduce solutions.
Factors such as fine-grained (per-minute) billing, separation of storage and compute, and affordability in comparison with competitors set Dataproc and its service offerings apart.
Machine Learning & Artificial Intelligence
One of the biggest challenges the industry faces in the machine learning domain is training larger models, which often requires expensive special-purpose hardware. Combined with a lack of technical capability to build and train custom models, this becomes the biggest hindrance to the use of machine learning (ML) and artificial intelligence (AI) by enterprises.
Google addresses this problem by providing pre-trained models along with a long list of supported APIs such as Google Cloud Vision, Google Translate, Cloud Speech, and Cloud Natural Language.
Moreover, the Google Translate API and the Google Prediction API are simplified services that make learning and adoption easy. Many customers we have worked with at Nagarro are also able to make the best use of Google's AI/ML services based on TensorFlow for predictive analytics and deep learning.
Infrastructure as a Service (IaaS)
While the world is talking about “DevOps”, Google is already realizing the “NoOps” dream. Google says that the instance boot time of Google Compute Engine (GCE) is considerably better than that of most of its competitors. It also claims to be the first platform to deliver its cloud infrastructure on Skylake (Intel’s latest processor). Another interesting feature of GCE load balancers is that, unlike those on AWS, they don’t need any pre-warming.
GCE also brings in the concept of “pay per use” based on sub-hour billing. Here, Compute Engine instances are charged in one-minute increments, with a ten-minute minimum. Thus, beyond that minimum, users don't pay for compute minutes that their application does not consume.
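The billing rule described above comes down to simple arithmetic, sketched here in Python. The function names are hypothetical, the hourly scheme is included only for comparison, and treating a zero-length run as zero billed minutes is an assumption.

```python
import math


def billed_minutes(runtime_seconds, minimum_minutes=10):
    """Round runtime up to whole minutes, then apply the minimum charge."""
    if runtime_seconds <= 0:
        return 0  # assumption: an instance that never ran is not billed
    return max(minimum_minutes, math.ceil(runtime_seconds / 60))


def hourly_billed_minutes(runtime_seconds):
    """For comparison: a scheme that rounds runtime up to whole hours."""
    if runtime_seconds <= 0:
        return 0
    return 60 * math.ceil(runtime_seconds / 3600)


# A 90-second job, a 10-minute job, and a job just over 61 minutes.
for seconds in (90, 600, 61 * 60 + 1):
    print(seconds, billed_minutes(seconds), hourly_billed_minutes(seconds))
```

The gap between the two schemes is widest for short-lived or bursty workloads, where rounding up to a full hour would bill far more minutes than were actually used.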
Our observation is that Google has emerged as a strong and serious cloud player and has reinforced the belief that AI is the future of cloud. At Nagarro, we help our clients choose a computing platform based on an exhaustive checklist of functional and non-functional requirements. We also provide solutions keeping in mind the ever-changing market trends.
For more information on our offerings in this area, you can reach us at info@nagarro.com.