ADA Lab @ UCSD

Project Triptych

Overview

Triptych is an end-to-end model selection management system (MSMS) that aims to simplify and accelerate the process of sourcing data and features and selecting ML models. Our guiding principles are to exploit the semantics of the data and the ML task as far as possible, both to reduce work for the data scientist and to lower runtimes and costs. We apply these principles to remove or mitigate individual bottlenecks in this end-to-end process, with the eventual goal of unifying the components into an integrated “operating system” for ML analytics tasks. For more details on this vision, please refer to the ACM SIGMOD Record paper listed under Publications.

Component Project Webpages

Hamlet
Exploiting database schema information to simplify data sourcing.

Morpheus
Integrating linear algebra and relational algebra to simplify feature engineering for ML.

Nimbus
Enabling the first ML-aware cloud-based commodity market for the new black gold: training data.

SLAB
The first comprehensive benchmark comparison of scalable linear algebra systems.

Publications

  • A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics
    Anthony Thomas and Arun Kumar
    VLDB 2018/2019 (To appear) | Paper PDF (Coming soon) | TechReport | Code and Data

  • Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers?
    Vraj Shah, Arun Kumar, and Xiaojin Zhu
    VLDB 2018 (To appear) | Paper PDF | TechReport | Code and Data

  • Model-based Pricing: Do Not Pay for More than What You Learn!
    Lingjiao Chen, Paraschos Koutris, and Arun Kumar
    ACM SIGMOD 2017 DEEM Workshop | Paper PDF

  • Cerebro: A System to Manage Deep Learning for Relational Data Analytics
    Arun Kumar
    CIDR 2017 (Abstract) | Paper PDF

  • To Join or Not to Join? Thinking Twice about Joins before Feature Selection
    Arun Kumar, Jeffrey Naughton, Jignesh M. Patel, and Xiaojin Zhu
    ACM SIGMOD 2016 | Paper PDF | TechReport | Code and Data

  • Model Selection Management Systems: The Next Frontier of Advanced Analytics
    Arun Kumar, Robert McCann, Jeffrey Naughton, and Jignesh M. Patel
    ACM SIGMOD Record Dec 2015 (Vision Track) | Paper PDF