Uber Opens Up Machine Learning with Code-Free Open Source Toolbox


Ride-hailing giant Uber has revealed Ludwig, a code-free open source toolbox designed to make machine learning more accessible for non-experts as well as assist experienced developers and researchers by enabling faster model iteration cycles.

Built on Google’s TensorFlow open source library, Ludwig allows users to train and test artificial intelligence models without the need to write code.

“Ludwig is unique in its ability to help make deep learning easier to understand for non-experts and enable faster model improvement iteration cycles for experienced machine learning developers and researchers alike,” Uber scientists Piero Molino, Yaroslav Dudin and Sai Sumanth Miryala wrote in a blog post.

The trio’s expertise spans the toolbox’s uses: Molino is a senior machine learning and natural language processing research scientist, Dudin a senior software engineer and Miryala a speech and machine learning scientist. They added that “by using Ludwig, experts and researchers can simplify the prototyping process and streamline data processing so that they can focus on developing deep learning architectures rather than data wrangling.”

Ludwig has been in development for two years, striving to streamline and simplify the use of deep learning models in applied projects. With a mix of influences from machine learning software, including Weka, MLlib, Caffe and scikit-learn, it is quite different, says Uber, from deep learning libraries that provide tensor algebra primitives and few other utilities to code models.

Filling the Gaps in Open Source Led to the Build

With the building blocks for deep learning models available from open source libraries, the Uber AI unit realized there was no need to reinvent the wheel: They could develop packages built on these foundations.

In 2017 the company released Pyro, a deep probabilistic programming language built on Facebook’s PyTorch, improving it with the help of the open source community. It also created the Horovod AI tool, a framework hosted by the LF Deep Learning Foundation that permits distributed training of deep learning models over multiple graphics processing units (GPUs) and machines.

Ludwig’s core design principles are: no coding required, so that no programming skills are needed to train a model; generality, meaning the tool can be used across many different use cases; flexibility, for both experienced users and newcomers; extensibility, meaning it is easy to add new model architectures and data types; and understandability, with standard visualizations such as graphs to help users comprehend the performance of models.

Uber has already put the tool to use in its own business, extracting information from its drivers’ licenses, identifying points of interest during chats between driver-partners and riders, as well as predicting food delivery times.

Users can train deep learning models using just a tabular dataset file, for example a CSV, and a YAML configuration file “that specifies which columns of the tabular file are input features and which are output target variables.” Where more than one output target variable is specified, Ludwig learns to predict all the outputs simultaneously – an action that would usually call for custom-made coding.
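The article does not reproduce a configuration, but based on the description above, a minimal setup might look something like the following sketch. The column names `review` and `sentiment` and the file names are illustrative assumptions, not details from the source:

```yaml
# Hypothetical Ludwig model definition (model_definition.yaml).
# Maps CSV columns to input features and output target variables.
input_features:
  - name: review        # a text column in the CSV (illustrative name)
    type: text
output_features:
  - name: sentiment     # a category column to predict (illustrative name)
    type: category
```

Training could then be launched from the command line with something along the lines of `ludwig train --data_csv reviews.csv --model_definition_file model_definition.yaml`, though the exact flags depend on the Ludwig version in use.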

Additionally, every model trained in the system is saved and can be reloaded later to get predictions when new data is added.

Algorithmic Architecture

The toolbox’s central new idea is data type-specific encoders and decoders, resulting in a modularized and extensible architecture. Every sort of data it supports, including text, images and categories, has a specific preprocessing function; encoders then map the preprocessed raw data to tensors, while decoders map tensors back to data.

“By composing these data type-specific components, users can make Ludwig train models on a wide variety of tasks,” Uber says. “For example, by combining a text encoder and a category decoder, the user can obtain a text classifier, while combining an image encoder and a text decoder will enable the user to obtain an image captioning model.”
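The composition Uber describes can be illustrated with a deliberately simplified sketch. This is not Ludwig’s actual API; the toy bag-of-words encoder and threshold decoder stand in for real neural components purely to show how pairing a type-specific encoder with a type-specific decoder yields a task-specific model:

```python
# Illustrative sketch of encoder/decoder composition (not Ludwig's real API).

def text_encoder(text):
    """Map raw text to a fixed-size vector (here, a toy bag-of-words count)."""
    vocab = ["good", "bad"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def category_decoder(vector):
    """Map an encoded vector to a category label."""
    return "positive" if vector[0] >= vector[1] else "negative"

def compose(encoder, decoder):
    """Chain a data type-specific encoder and decoder into one model function."""
    return lambda raw: decoder(encoder(raw))

# Combining a text encoder with a category decoder yields a text classifier.
text_classifier = compose(text_encoder, category_decoder)
print(text_classifier("the ride was good"))  # → positive
```

Swapping in an image encoder or a text decoder would, in the same spirit, yield an object classifier or a captioning model, which is the modularity the quote describes.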

The versatile and flexible “encoder-decoder” architecture helps those less experienced in deep learning to train models for a variety of machine learning tasks, including text classification, object classification, image captioning, sequence tagging, regression, language modeling, machine translation, time series forecasting and question answering.

Future plans for Ludwig include adding a series of new encoders for each data type, such as Transformer, ELMo and BERT for text, and DenseNet and FractalNet for images. Uber also wants to support additional data types such as audio, point clouds and graphs, as well as integrate more scalable solutions for managing big datasets, like the Petastorm library.