Taming the Deep Learning Workflow / Distributed Deep Learning with Hopsworks

Date: Thursday, April 25, 2019 - 18:00
Source: SF Machine Learning
Attendees: 64
City: San Francisco

Title: Taming the Deep Learning Workflow

Abstract:

We are entering the golden age of artificial intelligence. Model-driven, statistical AI has already been responsible for breakthroughs in applications such as computer vision, speech recognition, and machine translation, with countless more use cases on the horizon. But if AI is the linchpin of a new era of innovation, why does the infrastructure it’s built upon feel trapped in the 20th century? Worse, why is advanced AI tooling locked within the walls of a handful of multi-billion-dollar tech companies, inaccessible to anyone else?

In this talk, we will describe how deep learning workflows are impeded by the following circumstances:
The necessary expertise is scarce;
Hardware requirements can be prohibitive;
Current software tools are immature and limited in scope.

Under such circumstances, there are promising opportunities to dramatically improve these workflows via novel algorithmic and software solutions, including resource-aware neural architecture search and fully automated GPU training-cluster orchestration. The talk draws on academic work at CMU, UC Berkeley, and UCLA, as well as our experiences at Determined AI, a startup that builds software to make deep learning engineers more productive.
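
To make "resource-aware" concrete, here is a minimal, hypothetical sketch of the cheapest form of resource-aware architecture search: sample candidate configurations at random and discard any whose estimated cost exceeds a compute budget before spending GPU time on them. This is not Determined AI's actual method; the budget, the cost model, and all names are illustrative.

import random

random.seed(0)

BUDGET_MFLOPS = 0.5  # hypothetical per-forward-pass budget

def sample_config():
    """Randomly sample a small MLP architecture and learning rate."""
    return {
        "depth": random.choice([2, 3, 4]),
        "width": random.choice([64, 128, 256, 512]),
        "lr": 10 ** random.uniform(-4, -1),
    }

def estimated_mflops(cfg, input_dim=784, n_classes=10):
    """Rough cost estimate: millions of multiply-adds per forward pass."""
    dims = [input_dim] + [cfg["width"]] * cfg["depth"] + [n_classes]
    return sum(a * b for a, b in zip(dims, dims[1:])) / 1e6

candidates = [sample_config() for _ in range(100)]
affordable = [c for c in candidates if estimated_mflops(c) <= BUDGET_MFLOPS]
print(f"{len(affordable)}/{len(candidates)} candidates fit the budget")
# Only the affordable candidates would then be trained (the expensive step),
# and the one with the best validation score wins.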

Bio:
Evan R. Sparks is co-founder and CEO of Determined AI, a software company that makes machine learning engineers and data scientists dramatically more productive. While earning his Ph.D. in computer science in Berkeley's AMPLab, he contributed to the design and implementation of much of the large-scale machine learning ecosystem around Apache Spark, including MLlib and KeystoneML. Prior to Berkeley, Evan worked in quantitative finance and web intelligence. He holds an A.B. in computer science from Dartmouth College.

Title: Distributed Deep Learning with Hopsworks

Abstract:

Distributed Deep Learning (DDL) can reduce the time it takes to train models and make your data scientists and AI researchers more productive. The inner loop in DDL involves utilizing a cluster of machines to train a model using gradient-averaging techniques, whereas the outer loop involves running parallel experiments to perform black-box optimization and find the best hyperparameters. Algorithms for DDL are becoming commoditized, with support in all of the main machine learning frameworks. However, managing the operations of DDL is still a challenge, as there are few simple abstractions for data scientists and engineers to use. In this talk, we will present Hopsworks, a platform for horizontally scalable deep learning, with support for parallel and reproducible experiments, distributed training with GPUs, a feature store, and pipeline orchestration with Airflow. We will go over the lessons learned in providing platform support for DDL and present ongoing systems research on a framework for hyperparameter optimization.
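
To make the inner/outer-loop distinction concrete, here is a minimal, self-contained sketch of the inner loop: synchronous data-parallel training with gradient averaging, simulating the workers in a single process. This is not Hopsworks code; the data, model, and all names are illustrative.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data, sharded across 4 simulated workers.
X = rng.normal(size=(400, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=400)
shards = list(zip(np.array_split(X, 4), np.array_split(y, 4)))

def local_gradient(w, X_shard, y_shard):
    """Gradient of mean-squared error on one worker's data shard."""
    residual = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ residual / len(y_shard)

w = np.zeros(3)
lr = 0.1
for step in range(100):
    # Each worker computes a gradient on its shard (in parallel on a real
    # cluster); an all-reduce then averages them before the shared update.
    grads = [local_gradient(w, Xs, ys) for Xs, ys in shards]
    w -= lr * np.mean(grads, axis=0)

print(w)  # converges toward true_w

The outer loop would wrap this entire training run in a black-box search, launching many such runs in parallel with different hyperparameters (here, for instance, the learning rate lr) and keeping the model with the best validation score.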

Bio:

Kim Hammar is a software engineer at Logical Clocks AB, the main developers of Hops Hadoop (http://www.hops.io). He received his MSc in Distributed Systems from KTH Royal Institute of Technology in 2018. He has previously worked as an engineer at Ericsson, as a researcher at KTH, and as a data scientist at Allstate.

Mesosphere, Inc.

225 Bush St #700