ADA Lab @ UCSD

Peer-reviewed Publications

  • Materialization Trade-offs for Feature Transfer from Deep CNNs for Multimodal Data Analytics
    Supun Nakandala and Arun Kumar
    Under submission | TechReport | Code and Data

  • Model-based Pricing for Machine Learning in a Data Marketplace
    Lingjiao Chen, Paraschos Koutris, and Arun Kumar
    ACM SIGMOD 2019 (To appear) | TechReport

  • Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent
    Fengan Li, Lingjiao Chen, Yijing Zeng, Arun Kumar, Jeffrey Naughton, Jignesh Patel, and Xi Wu
    ACM SIGMOD 2019 (To appear) | TechReport coming soon

  • A Comparative Evaluation of Systems for Scalable Linear Algebra-based Analytics
    Anthony Thomas and Arun Kumar
    VLDB 2018/2019 (To appear) | Paper PDF | TechReport | Code and Data

  • In-RDBMS Hardware Acceleration of Advanced Analytics
    Divya Mahajan, Joon Kyung Kim, Jacob Sacks, Adel Ardalan, Arun Kumar, and Hadi Esmaeilzadeh
    VLDB 2018 | Paper PDF | Addendum

  • Are Key-Foreign Key Joins Safe to Avoid when Learning High-Capacity Classifiers?
    Vraj Shah, Arun Kumar, and Xiaojin Zhu
    VLDB 2018 | Paper PDF | TechReport | Code and Data

  • Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics
    Xi Wu, Fengan Li, Arun Kumar, Kamalika Chaudhuri, Somesh Jha, and Jeffrey Naughton
    ACM SIGMOD 2017 | Paper PDF | TechReport

  • SpeakQL: Towards Speech-driven Multi-modal Querying
    Dharmil Chandarana, Vraj Shah, Arun Kumar, and Lawrence Saul
    ACM SIGMOD 2017 HILDA Workshop | Paper PDF

  • Model-based Pricing: Do Not Pay for More than What You Learn!
    Lingjiao Chen, Paraschos Koutris, and Arun Kumar
    ACM SIGMOD 2017 DEEM Workshop | Paper PDF

  • Cerebro: A System to Manage Deep Learning for Relational Data Analytics
    Arun Kumar
    CIDR 2017 (Abstract) | Paper PDF

  • To Join or Not to Join? Thinking Twice about Joins before Feature Selection
    Arun Kumar, Jeffrey Naughton, Jignesh M. Patel, and Xiaojin Zhu
    ACM SIGMOD 2016 | Paper PDF | TechReport | Code and Data

  • Model Selection Management Systems: The Next Frontier of Advanced Analytics
    Arun Kumar, Robert McCann, Jeffrey Naughton, and Jignesh M. Patel
    ACM SIGMOD Record Dec 2015 (Vision Track) | Paper PDF

  • Demonstration of Santoku: Optimizing Machine Learning over Normalized Data
    Arun Kumar, Mona Jalal, Boqun Yan, Jeffrey Naughton, and Jignesh M. Patel
    VLDB 2015 (Demo) | Paper PDF | Code and Data

  • Learning Generalized Linear Models Over Normalized Data
    Arun Kumar, Jeffrey Naughton, and Jignesh M. Patel
    ACM SIGMOD 2015 | Paper PDF | Code and Data

  • Materialization Optimizations for Feature Selection Workloads
    Ce Zhang, Arun Kumar, and Christopher Ré
    ACM SIGMOD 2014 | Paper PDF
    Best Paper Award; Invited to ACM TODS 2016

  • Distributed and Scalable PCA in the Cloud
    Arun Kumar, Nikos Karampatziakis, Paul Mineiro, Markus Weimer, and Vijay Narayanan
    NIPS BigLearn 2013 | Paper PDF

  • Feature Selection in Enterprise Analytics: A Demonstration using an R-based Data Analytics System
    Pradap Konda, Arun Kumar, Christopher Ré, and Vaishnavi Sashikanth
    VLDB 2013 (Demo) | Paper PDF

  • Hazy: Making it Easier to Build and Maintain Big-data Analytics
    Arun Kumar, Feng Niu, and Christopher Ré
    ACM Queue 2013 | Article
    Invited to the Communications of the ACM March 2013

  • Brainwash: A Data System for Feature Engineering
    Michael Anderson, Dolan Antenucci, Victor Bittorf, Matthew Burgess, Michael Cafarella, Arun Kumar, Feng Niu, Yongjoo Park, Christopher Ré, and Ce Zhang
    CIDR 2013 (Vision Track) | Paper PDF

  • Towards a Unified Architecture for in-RDBMS Analytics
    Xixuan Feng*, Arun Kumar*, Benjamin Recht, and Christopher Ré
    ACM SIGMOD 2012 | Paper PDF | TechReport | Code and Data

  • The MADlib Analytics Library or MAD Skills, the SQL
    Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, and Arun Kumar
    VLDB 2012 (Industrial Track) | Paper PDF

Manuscripts, Articles, and Dissertations

  • ML/AI Systems and Applications: Is the SIGMOD/VLDB Community Losing Relevance?
    Arun Kumar
    Blog post on the official ACM SIGMOD Blog, 2018 | Webpage

  • Courting ML: Witnessing the Marriage of Relational & Web Data Systems to Machine Learning
    Interviewed for ACM SIGMOD Blog, 2018 | Webpage

  • Advice from PhD to Early Career
    Arun Kumar
    ACM SIGMOD 2018 New Researcher Symposium Talk | Slides

  • Learning Over Joins
    Arun Kumar. PhD Dissertation. UW-Madison 2016 | PDF | Video of job talk at UCSD
    Wisconsin CS 2016 Graduate Student Research Award for best dissertation research

  • A Survey of the Existing Landscape of ML Systems
    Arun Kumar, Robert McCann, Jeffrey Naughton, and Jignesh M. Patel
    UW-Madison Technical Report TR1827 | PDF