Six Ways of Looking at Data Warehousing as a Service


As the name implies, data warehousing collects disparate data from multiple sources into one place where it can be queried, analyzed and mined for intelligence. Keeping it on-premises, on a public cloud or going hybrid can be a fraught decision. Rohit Amarnath, CTO of Vertica, explains the tradeoffs and options afforded by the DWaaS (Data Warehousing as a Service) model.

With mobile applications, the Internet of Things (IoT) and digital consumer interactions exploding, data is being generated at unprecedented speed and scale. Obviously, businesses want to capture and consume this data because it can reveal consumption patterns, deliver greater personalization and engagement, and help provide tailored products and services to customers.

That said, storing, maintaining, and analyzing such massive volumes of data is no joke. Over the past few years, the general shift has been to collect this data in the cloud because of the cost and convenience gained with the cloud-elastic commercial model. The data is usually staged in cloud object stores, also known these days as data lakes, and analyzed using cloud data warehouses. 

These cloud data warehouses (and “data lakehouses”) have grown in sophistication with innovations that leverage cloud strengths, like the separation of compute and storage. Today, most cloud providers and data warehouse vendors offer public cloud-based data warehousing services (a.k.a. data warehouse-as-a-service or DWaaS) with consumption-based pricing. Organizations leave the undifferentiated heavy lifting like setting up, maintaining, securing, or upgrading a data warehouse and all associated software and hardware stacks to the cloud vendor. 

DWaaS offerings come with different options, and picking the right one depends on the business use case. It is therefore prudent for organizations to evaluate DWaaS carefully on the points below as part of the overall decision-making process:

Software Limitations

When software is developed to be delivered as a service, self-service and ease of use have to be balanced against the stability and security of the platform. Other factors, including which capabilities the various clouds offer, can affect what functionality the service can provide.

For example, cloud object stores can vary in performance and features, and that can mean the DWaaS may perform differently from one cloud to another or only support one flavor of cloud. Some providers are cloud-only, which means that one would lose the ability to handle on-premises or hybrid workloads should they need it (for compliance or security reasons). 

If the DWaaS is a complete black box hosted by the vendor, that could mean less transparency and flexibility in tuning and configuration. Ensure you understand the impact of these factors on your workload. Generally, most workloads can find a service that offers the right combination of features. In my experience, customers with particularly sophisticated use cases have the resources to build and self-manage, but that, too, is changing.

Concurrent Data Systems are Hard

Most DWaaS offerings are multi-node, cluster-based platforms. Concurrent systems are inherently sophisticated and impose rules on how the cloud infrastructure can be configured. These rules could cover the number of nodes, how the nodes can scale up or down, or how the inter-node communication is laid out.

This can be limiting, but the rules are usually there for valid reasons: typically, to keep communication and data transfer stable in a complex concurrent system while handling failures that can happen at any time. Remember the adage, “Everything fails in the cloud”; how failures are handled defines how well the system has been architected for the cloud. The vendors have their work cut out for them to make sure flexibility is maximized at the cloud infrastructure level to make data warehousing in the cloud easier.

Hybrid Limitations

Studies show that many businesses are also considering hybrid cloud adoption. Security, privacy and regulations are obviously major drivers, but beyond that, the total cost of ownership of a cloud-only approach can be higher than that of running on-premises, with a trade-off in flexibility.

Some DWaaS offerings do not have on-premises equivalent capabilities, and this can make it necessary for businesses to run multiple data warehouses. Such complexities can greatly increase the operation and orchestration costs of a major analytics program. 

Security Considerations 

No one will deny that the DWaaS model is convenient. What could be easier than letting someone else manage everything while you focus only on analytics, dashboards, and reports? But opting for DWaaS could result in some providers having more access to your data than they should. That said, tier-one DWaaS vendors implement encryption and cryptographic controls to separate the operations of the warehouse from access to the data. These vendors also implement security controls that qualify them for security standards like ISO 27001 and SOC 2, allowing you to trust their security practices, processes, and policies to maintain and protect your data.


Hidden Charges 

Budget overruns are a common problem when it comes to the cloud. Cloud charges can quickly spiral out of control due to unpredictable usage, increasing complexity and the inherently elastic nature of the cloud. Watch out for hidden charges that are tied to consumption-based usage and performance.

DWaaS business models have low entry pricing, but as soon as data grows unexpectedly or extra compute power is needed, be ready to monitor the resulting costs closely. Security features may only be available at higher tiers of service, with the obvious upcharge.
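To make the effect of consumption-based pricing concrete, here is a minimal sketch of a monthly cost estimate. The unit prices, usage figures, and the names `Rates`, `monthly_cost`, and `over_budget` are all hypothetical and chosen only for illustration; real DWaaS price lists differ by vendor and usually include more line items.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """Hypothetical unit prices; substitute your vendor's published rates."""
    compute_per_hour: float      # price of one compute-hour
    storage_per_tb_month: float  # price of storing 1 TB for one month


def monthly_cost(rates: Rates, compute_hours: float, stored_tb: float) -> float:
    """Estimate one month's bill from compute and storage consumption."""
    return (rates.compute_per_hour * compute_hours
            + rates.storage_per_tb_month * stored_tb)


def over_budget(rates: Rates, budget: float,
                compute_hours: float, stored_tb: float) -> bool:
    """Return True when the estimated bill exceeds the monthly budget."""
    return monthly_cost(rates, compute_hours, stored_tb) > budget


# Illustrative numbers only (not real vendor pricing).
rates = Rates(compute_per_hour=2.0, storage_per_tb_month=25.0)

# A quiet month versus a month where both data and compute triple.
baseline = monthly_cost(rates, compute_hours=200, stored_tb=10)
spike = monthly_cost(rates, compute_hours=600, stored_tb=30)

print(f"baseline: {baseline:.2f}")
print(f"spike:    {spike:.2f}")
print("spike over a 1000.00 budget:", over_budget(rates, 1000.0, 600, 30))
```

Under these assumed rates, tripling usage triples the bill, which is why a low entry price says little about what a workload will cost once it grows. Running an estimate like this against a vendor's actual price list, with a few growth scenarios, is one way to spot a likely overrun before it happens.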

Vendor Lock-ins 

Most people think that it’s easy to move from one DWaaS to another, but this is not the case. Every platform requires a significant amount of migration effort, especially if there is some customization involved. DWaaS vendors also sell a lot of add-on services that lock customers into complex deployments, and these are usually difficult to replicate on other platforms. Many cloud providers charge egress fees when businesses migrate to another cloud or move their workloads on-premises. Sticking with standard SQL-based relational data warehouses can make it easier to move workloads around.

To sum it up, DWaaS probably makes good sense for many organizations. Its benefits over on-premises data centers include reduced staffing needs, easy scalability, and lower IT costs. Organizations should consider their data workloads and look at data warehouse players that leverage cloud strengths like separation of compute and storage, and that offer consumption-based pricing, security certifications and controls, transparency in pricing and flexible deployment models.

Have you benefited from DWaaS as a model? Tell us about your experience on Facebook, Twitter, and LinkedIn.

Image Source: Shutterstock
