Acknowledgement

This material is based upon work supported by the National Science Foundation under Grant No. 1942724, an NSF CAREER Award. It has also been supported in part by a Hellman Fellowship, the NIDDK of the National Institutes of Health under award number R01DK114945, and a gift from VMware. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF or NIH.

We used the following projects when building Cerebro.

  • Horovod: Cerebro’s Apache Spark implementation uses code from Horovod’s implementation for Apache Spark.

  • Petastorm: We use Petastorm to read Apache Parquet data from remote storage (e.g., HDFS).