Project Cerebro


Artificial Neural Networks (ANNs) are revolutionizing many machine learning (ML) applications. Their success at major Web companies has created excitement among many enterprises and domain scientists to try ANNs for their applications. But training ANNs is a notoriously painful empirical process, since accuracy is tied to the ANN architecture and hyper-parameter settings. The common practice to choose these settings is to empirically compare as many training configurations as feasible for the application. This process is called model selection, and it is unavoidable because it is how one controls underfitting vs overfitting. Model selection is a major bottleneck for adoption of ANNs among enterprises and domain scientists due to both the time spent and resource costs.
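To make the size of the comparison concrete, here is a minimal sketch of how a set of training configurations might be enumerated as a grid. The search space below (architecture names, learning rates, batch sizes) and the helper `enumerate_configs` are hypothetical, chosen only for illustration; they are not Cerebro's API.

```python
from itertools import product

# Hypothetical search space; the names and values are illustrative only.
search_space = {
    "architecture": ["resnet18", "resnet50"],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [32, 128],
}


def enumerate_configs(space):
    """Return one dict per point in the Cartesian product of the search space."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]


configs = enumerate_configs(search_space)
print(len(configs))  # 2 * 3 * 2 = 12 configurations to train and compare
```

Even this small grid yields 12 separate training runs, and every added hyper-parameter multiplies that count, which is why the throughput of model selection matters.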

In this project, we propose a new system for ANN model selection that raises model selection throughput without raising resource costs. Our target setting is small clusters (say, tens of nodes), which covers the vast majority (almost 90%) of parallel ML workloads in practice. We have four key system desiderata: scalability, statistical convergence efficiency, reproducibility, and system generality. To satisfy all of these desiderata, we develop a novel parallel execution strategy that we call model hopper parallelism (MOP).
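As a rough illustration of the "hopping" idea, the toy sketch below assumes (this page does not spell it out) that the training data is partitioned across workers once, and that each model trains on one worker's partition per sub-epoch before moving to another worker. The function `hop_schedule` and its simple round-robin rule are assumptions made for this example, not Cerebro's actual scheduler.

```python
def hop_schedule(num_models, num_workers):
    """Toy round-robin schedule: in sub-epoch s, model m trains on worker (m + s) % num_workers.

    Assumes num_models <= num_workers so that no two models share a worker
    within the same sub-epoch.
    """
    if num_models > num_workers:
        raise ValueError("this toy schedule assumes num_models <= num_workers")
    schedule = []
    for s in range(num_workers):
        # Map each model to the worker (data partition) it visits in sub-epoch s.
        schedule.append({m: (m + s) % num_workers for m in range(num_models)})
    return schedule


schedule = hop_schedule(num_models=3, num_workers=3)
for s, assignment in enumerate(schedule):
    print(f"sub-epoch {s}: {assignment}")
```

Under this schedule, after `num_workers` sub-epochs every model has visited every partition exactly once, so each model completes one pass over the data while only model state moves between workers.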

Cerebro is open sourced under Apache License v2.0. Code and detailed documentation are available here: Cerebro System

Downloads (Paper, Code, Data, etc.)

  • Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning
    Side Li and Arun Kumar
    Under submission | TechReport

  • Distributed Deep Learning on Data Systems: A Comparative Analysis of Approaches
    Yuhao Zhang, Frank McQuillan, Nandish Jayaram, Nikhil Kak, Ekta Khanna, Orhan Kislal, Domino Valdano, and Arun Kumar
    VLDB 2021 (To appear) | Paper PDF | TechReport | Code release

  • Intermittent Human-in-the-Loop Model Selection using Cerebro: A Demonstration
    Liangde Li, Supun Nakandala, and Arun Kumar
    VLDB 2021 Demo (To appear) | Paper PDF coming soon | TechReport | Video

  • The CNN Hip Accelerometer Posture (CHAP) Method for Classifying Sitting Patterns from Hip Accelerometers: A Validation Study
    Mikael Anne Greenwood-Hickman, Supun Nakandala, Marta M. Jankowska, Fatima Tuz-Zahra, John Bellettiere, Jordan Carlson, Paul R. Hibbing, Jingjing Zou, Andrea Z. LaCroix, Arun Kumar, and Loki Natarajan
    Medicine and Science in Sports and Exercise Journal, 2021 (To appear) | Paper PDF coming soon | Code

  • Application of Convolutional Neural Network Algorithms for Advancing Sedentary and Activity Bout Classification
    Supun Nakandala, Marta Jankowska, Fatima Tuz-Zahra, John Bellettiere, Jordan Carlson, Andrea LaCroix, Sheri Hartman, Dori Rosenberg, Jingjing Zou, Arun Kumar, and Loki Natarajan
    Journal for the Measurement of Physical Behaviour, 2021 | Paper PDF and BibTeX | Code

  • Cerebro: A Layered Data Platform for Scalable Deep Learning
    Arun Kumar, Supun Nakandala, Yuhao Zhang, Side Li, Advitya Gemawat, and Kabir Nagrecha
    CIDR 2021 (Vision paper) | Paper PDF and BibTeX | Talk video

  • Cerebro: Efficient and Reproducible Model Selection on Deep Learning Systems
    Supun Nakandala, Yuhao Zhang, and Arun Kumar
    ACM SIGMOD 2019 DEEM Workshop | Paper PDF and BibTeX | Blog post

Student Contact

  • Supun Nakandala: snakanda [at] eng [dot] ucsd [dot] edu

  • Yuhao Zhang: yuz870 [at] eng [dot] ucsd [dot] edu


This project was/is supported in part by a Hellman Fellowship, the NIDDK of the NIH under award number R01DK114945, an NSF CAREER Award under award number 1942724, and a gift from VMware.