Can Oracle’s Cloud Data Warehouse Spark Self-Service Data Warehousing Trend?

Oracle recently announced its next-gen autonomous data warehouse, but is it simply marketing hype, or has Oracle succeeded in paving the way toward self-service? Lewis Cunningham, a certified cloud practitioner and Oracle ACE Director alumnus, answers the question.

In March, Oracle announced the next generation of Oracle Autonomous Data Warehouse. In this article, I will discuss how the autonomous data warehouse fits into the trend toward self-service, and whether it is marketing hype or whether the features in the Oracle Cloud are really paving the way toward self-service.

The first thing to discuss is what self-service truly means. Using a data warehouse involves more than just running queries and reports or getting value from analytics and insights. A data warehouse must be able to reliably ingest and integrate data from disparate sources. It needs to store that data in a format usable by analysts, scientists, and business users. Lastly, users need tools for visualization and analytics without having to resort to third-party products.

Self-Service Data Engineering

In this context, self-service means that a user, not a developer, can load data, transform it into a usable form, and store it in that form. Oracle’s new generation of autonomous databases moves much closer to the self-service model with a set of features, starting with what Oracle calls Autonomous Data Warehouse (ADW) Tools.

Along with standard tools such as a SQL worksheet and data modeling, ADW Tools let users load data, link to external data, and set up a regular data feed. You do not need to know SQL to use the data load tools, which feature a drag-and-drop interface that any internet user is familiar with. This allows regular users, whom Oracle refers to as data analysts, citizen data scientists, and business users, to ingest data as needed: the first step in self-service.

As part of the data load, the Data Load tool will recognize the type of file and, if possible, automatically identify and create the correct columns and data types (all of which are customizable in the GUI). 
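To make the idea of automatic column detection concrete, here is a minimal Python sketch that samples a CSV file and guesses a type for each column. It is an illustration only, not Oracle's Data Load implementation: the function names, the type labels (`integer`, `decimal`, `date`, `text`), and the single supported date format are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Return the narrowest type label that fits every non-empty value."""
    non_empty = [v for v in values if v.strip() != ""]
    if not non_empty:
        return "text"  # nothing to go on, so fall back to the widest type

    def all_parse(parser):
        # True only if the parser accepts every sampled value.
        for value in non_empty:
            try:
                parser(value)
            except ValueError:
                return False
        return True

    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    # Assumption: only ISO dates (YYYY-MM-DD) are recognized in this sketch.
    if all_parse(lambda v: datetime.strptime(v, "%Y-%m-%d")):
        return "date"
    return "text"


def infer_schema(csv_text, sample_rows=100):
    """Guess a type for each column from the first `sample_rows` rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for index, row in enumerate(reader):
        if index >= sample_rows:
            break
        for name in columns:
            columns[name].append(row[name] or "")
    return {name: infer_type(values) for name, values in columns.items()}


# Hypothetical sample file used only for this illustration.
SAMPLE = (
    "order_id,amount,order_date,region\n"
    "1,19.99,2021-03-01,EMEA\n"
    "2,5,2021-03-02,APAC\n"
)
```

Calling `infer_schema(SAMPLE)` labels `order_id` as an integer, `amount` as a decimal, `order_date` as a date, and `region` as text. A guess like this is only a starting point, which is why it matters that a user can override the detected columns and types in the GUI.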

The second step in self-service is transforming the data, if needed, into the form required by users. ADW includes a GUI-based Data Transforms tool, which provides transformation via a new interface on top of Oracle Data Integrator (ODI), a top-tier data integration tool. Like the Data Load tool, the Data Transforms tool is a drag-and-drop tool meant for any user, not just developers.

While the Data Transforms tool structures your data the way you need it, the Business Model tool gives you an interface for accessing it. Oracle automatically identifies the important parts of your data, such as facts, dimensions, and hierarchies, and creates a semantic model on top of your data. This model is available for querying, and Oracle will rewrite queries as needed.

Here’s more information on the Business Model tool and Oracle Autonomous Database tools.

Once a user has identified a useful data source and has a model they can query, it is time to gather insights and make use of the data.

Self-Service Insights and Machine Learning

The final step in self-service is making use of the data. For data analysts, Oracle is adding the Data Insights tool. It runs a process, possibly ML-based, that looks for what Oracle describes as patterns, anomalies, and outliers in the data. The results are then presented as a dashboard.
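As a rough illustration of what flagging outliers can look like, the sketch below applies a classic statistical rule, Tukey's interquartile-range fences, to a list of numbers. This technique is swapped in purely for illustration: how Data Insights actually finds anomalies is not described here, and the function name, the `k` parameter, and the sample data are assumptions.

```python
from statistics import quantiles


def find_outliers(values, k=1.5):
    """Return the values that fall outside Tukey's fences.

    A value is flagged when it lies more than k * IQR below the first
    quartile or above the third quartile.
    """
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical daily order counts with one obviously unusual day.
daily_orders = [10, 12, 11, 13, 12, 11, 10, 95]
```

Here `find_outliers(daily_orders)` flags only the value 95. A dashboard adds value on top of a rule like this by showing where such values occur and which dimensions they are associated with.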

The last item that I will mention is what Oracle calls the Citizen Data Scientist. The company defines a citizen data scientist as “a business user who has a deep understanding of the data and the business problems that need to be solved – but is not a professional data scientist with an advanced computer science degree.” 

AutoML UI 

Oracle has a set of 30 machine learning algorithms built into the autonomous database. These algorithms have been available for many years, but they require some fairly technical knowledge and experience to use. The intent of AutoML is to move that functionality out to more users and allow them to “self-serve” ML models.

A citizen data scientist still needs to know a bit about machine learning. Think of AutoML as a sort of guided ML wizard. The user still needs to understand what a feature is and what a metric might be in the data. But that user is not required to understand the ML models, how to run them, or how to interpret most of the results. AutoML works those items out by looking at the data, plugs the right data into the right parameters, and runs the best models for the input data. AutoML will even tune the models for predictive impact analysis.
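The core idea behind any AutoML-style wizard, trying several candidate models and keeping the one that scores best on held-out data, can be sketched in a few lines of plain Python. This is a toy illustration and not Oracle's AutoML: the two candidate models, the mean-absolute-error metric, the simple sequential train/holdout split, and all function names are assumptions chosen to keep the example self-contained.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline model: always predict the average of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_line(xs, ys):
    """Ordinary least-squares fit of a straight line y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


# Hypothetical candidate list; a real tool would try many more algorithms.
CANDIDATES = {"mean_baseline": fit_mean, "linear": fit_line}


def mean_abs_error(model, xs, ys):
    """The metric: average absolute gap between prediction and truth."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


def select_best(xs, ys, holdout=0.25):
    """Fit every candidate on the first rows and score it on the held-out rest."""
    split = int(len(xs) * (1 - holdout))
    train_x, train_y = xs[:split], ys[:split]
    test_x, test_y = xs[split:], ys[split:]
    scores = {}
    for name, fit in CANDIDATES.items():
        model = fit(train_x, train_y)
        scores[name] = mean_abs_error(model, test_x, test_y)
    best = min(scores, key=scores.get)  # lowest error wins
    return best, scores


# Hypothetical data where the target follows the feature exactly.
feature = list(range(8))
target = [2 * x + 1 for x in feature]
```

For this data, `select_best(feature, target)` picks the `linear` candidate, because the straight-line fit predicts the held-out rows far better than the baseline. The user's part is still what the paragraph above describes: knowing which column is the feature and which is the target, and having a sense of what a sensible metric is.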

The AutoML UI provides a no-code front end to the database-resident ML capabilities. It generates notebooks for additional analysis by utilizing the OML4Py API (a very cool Python API for working with Oracle ML). OML4Py is worth an article all on its own.

These are just a few of the newer features in the Oracle Autonomous Data Warehouse that move Oracle closer to a self-service model. 

Conclusion

In the old-school pre-cloud days, the above steps were handled by developers, DBAs, and ETL experts. With cloud computing and big data, many of those jobs merged into the data engineering profession. Data engineers get data, ingest it, transform it, and extract it when needed. 

In a complex enterprise environment, the autonomous data warehouse, even the next generation warehouse, is not going to replace data engineers any more than AutoML is going to replace data scientists. The goal is not to replace engineers or scientists but to simplify data ingestion to the point that users can start working with the data they have available as quickly as possible. Once the users have validated the data and the usage of the data, data engineers can take it over and automate the ingestion and other steps leading towards productizing the data.

We have reached a point of data overload. We have the computing and storage now to hold all of the data, but data does us no good if it is not in a usable format in the hands of people who need to make decisions. That is the true goal of self-service. Get the data into the hands of the people who need it when they need it. It can be streamlined, tuned, and automated over time but making use of it as quickly as possible is an important step.

Is the Oracle autonomous data warehouse hype or a step into the realm of self-service? I would have to say it is a step in the right direction. Is it complete? Not quite yet. Some of the tools seem a bit disconnected, and some technical knowledge, while not required, is still very helpful. A team using the new features would find it helpful to have access to a data engineer or data scientist for advice while working with the ADW tools.

But overall, I would answer the question with a yes. Oracle is paving a way forward toward self-service, and it is not alone in this: Google has made some significant inroads, and I expect Amazon to follow soon. Given Oracle's robust self-service model, combined with its database feature set, reliability, and very competitive pricing, I can see many companies giving serious consideration to this technology stack.
