In the Cloud, ETL is More Relevant Than Ever

With the convergence of data and cloud technology, businesses are on the cusp of a transformation that will affect everything we do – the way we operate, the experiences we create, and the products that we build. But to transform business, you need to transform data. So not only is ETL still relevant, it is as integral to this shift as data and cloud data warehouses. ETL is key to fulfilling the huge potential of both.

As we approach 2020, one thing is very clear: Every single company needs to compete using data. It’s not a matter of leading the market or being in second, third, or last place. It’s about staying in business at all. And companies need to innovate using data. We are at a point where the amount of data we collect is matched by the kind of technology that lets us do really cool things with it.

Like what? Well, for the past decade, we’ve gotten good at using data analytics software to help us measure business performance – where we’re strong, where we’re lagging, how revenue is tracking, and so on. More recently, as technology advanced and the amount of data grew, we gained the ability to ask and answer new analytical questions. We could anticipate outcomes and further improve the way we operate. We could really start to understand and serve the customer and spot new markets and opportunities.

The third data evolution

Now there is a third evolution. For companies on the forefront of innovation, data itself is part of the product and changes the nature of business. Offering insight is one way that data becomes the product. But the data also shapes products and evolves them. For example, sensors in Siemens gas turbines collect data to help them run more efficiently. Video on Demand platforms serve up recommendations based on customer behavior and preferences. Citrix uses customer data to improve features and performance of its ShareFile software, enabling customers to share information and work together more efficiently. Data will only continue to become a more integral part of the products we develop.

If we want to use all of the data at our disposal to create groundbreaking products and stay relevant, we need technology with speed and scale. In modern business, the only practical place to get that is the cloud, and more specifically a cloud data warehouse. Eventually every company will have a cloud data warehouse, or, more likely, several.

There’s simply no other sensible way to manage the enormous volume and velocity of data we are now collecting so that we can leverage its full power with analytics. A traditional, on-premises data warehouse isn’t scalable or cost-effective enough. As the market has shown, Hadoop is too fragile and too high-maintenance for all but the most web-scale of use cases.

So the cloud data warehouse (CDW) is the way to go. A CDW has the speed and sophistication, but also the resilience, stability and ease of use, to accommodate organizations of any size and budget. If you’re looking for a sign that the CDW is on the rise, consider this: in January 2018, Snowflake Computing was valued at $1.5 billion. Nine months later, its valuation was $3.5 billion, and it continues to rise.

Before you can use data, you need to transform it

Our data and our data warehouses are changing, but one thing isn’t changing – the need to make data valuable through transformation. It’s not a trendy thing to say. However, if you work with data, you know that it is inherently complex and messy. We will always need to transform it from a mess into something that’s organized, analytics-ready, and therefore useful.

You see, a data warehouse is not the same as a data warehouse. That may sound confusing, but hear me out. First, you have your data warehouse “engine”: that’s the database infrastructure that stores the data and processes the queries. But there’s also the “design” of the actual database that you build on that engine. That’s the data model, or the data warehouse itself!

To put it another way, simply putting your data onto a data warehouse engine (cloud-based or otherwise) does not create a fully functioning data warehouse. You also have to transform the data from its raw, probably messy, siloed and normalized source state into a joined, dimensionally modeled, de-normalized, analytics-ready state. When you’ve done that, you have a true data warehouse: a data warehouse model running on your cloud data warehouse engine.
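
The difference between the engine and the model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for a cloud data warehouse engine, and every table name, column name and sample row (raw_customers, raw_products, raw_orders, sales_fact) is hypothetical rather than taken from any real system. The raw tables are normalized, as they might arrive from a source application; the transformation step joins them into one de-normalized table that an analyst can query directly.

```python
import sqlite3


def load_raw_tables(conn: sqlite3.Connection) -> None:
    """Create small normalized source tables (hypothetical sample data)."""
    conn.executescript(
        """
        CREATE TABLE raw_customers (
            customer_id INTEGER PRIMARY KEY, customer_name TEXT, region TEXT);
        CREATE TABLE raw_products (
            product_id INTEGER PRIMARY KEY, product_name TEXT, unit_price REAL);
        CREATE TABLE raw_orders (
            order_id INTEGER PRIMARY KEY, order_date TEXT,
            customer_id INTEGER, product_id INTEGER, quantity INTEGER);

        INSERT INTO raw_customers VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'APAC');
        INSERT INTO raw_products VALUES (10, 'Widget', 2.5), (11, 'Gadget', 10.0);
        INSERT INTO raw_orders VALUES
            (100, '2019-01-05', 1, 10, 4),
            (101, '2019-01-06', 2, 11, 1),
            (102, '2019-01-07', 1, 11, 2);
        """
    )


def build_sales_fact(conn: sqlite3.Connection) -> None:
    """Join the raw tables into one de-normalized, analytics-ready table."""
    conn.executescript(
        """
        DROP TABLE IF EXISTS sales_fact;
        CREATE TABLE sales_fact AS
        SELECT o.order_id,
               o.order_date,
               c.customer_name,
               c.region,
               p.product_name,
               o.quantity,
               o.quantity * p.unit_price AS revenue
        FROM raw_orders AS o
        JOIN raw_customers AS c ON c.customer_id = o.customer_id
        JOIN raw_products AS p ON p.product_id = o.product_id;
        """
    )


def revenue_by_region(conn: sqlite3.Connection) -> list:
    """An analytical question that needs no joins once the model exists."""
    query = (
        "SELECT region, SUM(revenue) FROM sales_fact "
        "GROUP BY region ORDER BY region"
    )
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")  # stand-in for the warehouse engine
    load_raw_tables(connection)               # data merely loaded onto the engine
    build_sales_fact(connection)              # the transformation that builds the model
    print(revenue_by_region(connection))
```

Loading the raw tables alone corresponds to "putting your data onto the engine"; only after build_sales_fact runs is there something resembling a data warehouse model to query.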

That’s why transformation is so important – it’s the hard and time-consuming place where the data professional lives. It’s unavoidable and it’s what actually creates your data warehouse.

4 things you need for cloud data transformation

In the past, we used traditional ETL tools to move data into a traditional, on-premises data warehouse and transform it. Then we could use it for operations, and then for analytics. Just as data warehouses are moving to the cloud, that ETL function needs to move to the cloud, too.

How do we handle data in the cloud? There are 4 things your ETL tool needs to do.

1. It needs to be cloud-native

Traditional tools for traditional data warehouses don’t translate. They were built for a platform with limitations, and will not step up to take advantage of the speed and scalability of the cloud.

2. It needs to go deep

A cloud-based ETL tool needs to do all of the things that a traditional, full-scale, grown-up enterprise ETL tool does – and then some. It needs to be able to take all kinds of data – structured or semi-structured, cloud or on-prem – from all kinds of data sources and join them together.
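
As a rough illustration of what joining structured and semi-structured data involves, here is a minimal Python sketch that uses only the standard library. The two sources are assumptions made up for the example: a CSV export standing in for a structured system, and a JSON feed of nested event records standing in for a semi-structured one. The function names (flatten_event, join_sources) and all field names are hypothetical, and a real ETL tool would do this at far larger scale.

```python
import csv
import io
import json

# Hypothetical structured source: a CSV export of customer records.
CRM_CSV = """customer_id,customer_name
1,Acme
2,Globex
"""

# Hypothetical semi-structured source: nested JSON events with optional fields.
EVENTS_JSON = """[
  {"customer": {"id": 1}, "event": "login", "device": {"os": "ios"}},
  {"customer": {"id": 2}, "event": "purchase"},
  {"customer": {"id": 1}, "event": "purchase", "device": {"os": "web"}}
]"""


def flatten_event(event: dict) -> dict:
    """Turn one nested event into a flat row; missing fields become None."""
    return {
        "customer_id": event["customer"]["id"],
        "event": event["event"],
        "device_os": event.get("device", {}).get("os"),
    }


def join_sources(crm_csv: str, events_json: str) -> list:
    """Flatten the JSON events and attach the customer name from the CSV."""
    names = {
        int(row["customer_id"]): row["customer_name"]
        for row in csv.DictReader(io.StringIO(crm_csv))
    }
    joined = []
    for raw_event in json.loads(events_json):
        row = flatten_event(raw_event)
        row["customer_name"] = names.get(row["customer_id"])
        joined.append(row)
    return joined


if __name__ == "__main__":
    for joined_row in join_sources(CRM_CSV, EVENTS_JSON):
        print(joined_row)
```

The point of the sketch is the shape of the work: the nested, optional fields have to be flattened into predictable columns before the two sources can be joined on a shared key.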

3. It needs to be transformative

Again, just moving data from left to right isn’t the whole job; it isn’t even the most critical part of the job. You need a tool that was specifically created to transform that data and get it analytics-ready.

4. It needs to be flexible

It’s rare that a business has data and operations in only one cloud, or that it never needs to move information from one cloud to another. Choose a tool that can move with you. Also, the job of transforming is almost never done. You continually innovate at the transformation layer, as the business changes and as new questions require new answers and insights.