A Framework for Extensible Science

Exploiting scientific containers, cloud computing, and cloud data services, we present a framework for performing and communicating scalable, reproducible, and extensible science in the cloud. We demonstrate the ability to process massive amounts of data in parallel in the cloud, and we run a web service that lets users interact directly with the tools and data presented. We hope this model will inspire the community to produce reproducible and, importantly, extensible results, which will enable us to collectively accelerate the rate at which scientific breakthroughs are discovered, replicated, and extended.

Read our Paper

Available in GigaScience and on arXiv.

Try our Demo!

Running as a persistent Jupyter notebook, our demonstration walks through the ndmg pipeline and quality control.

Fork our Code

Download either the frozen-for-publication version or the up-to-date version of our code and try SIC yourself!

Use the Cloud

Our pipeline is integrated with a variety of platforms, and has been used to process a variety of datasets in the cloud. We encourage you to pull on one of these threads.


Kiar, G; Gorgolewski, K J; Kleissas, D; Gray Roncal, W; Litt, B; Wandell, B; Poldrack, R A; Wiener, M; Vogelstein, R J; Burns, R; Vogelstein, J T
Corresponding Author: Joshua T. Vogelstein (jovo@jhu.edu)


  • Gregory Kiar, Krzysztof J. Gorgolewski, Dean Kleissas, William Gray Roncal, Brian Litt, Brian Wandell, Russell A. Poldrack, Martin Wiener, R. Jacob Vogelstein, Randal Burns, Joshua T. Vogelstein; Science In the Cloud (SIC): A use case in MRI Connectomics. GigaScience 2017, gix013. doi: 10.1093/gigascience/gix013
GigaDB Entry
  • Kiar, G; Gorgolewski, K J; Kleissas, D; Gray Roncal, W; Litt, B; Wandell, B; Poldrack, R A; Wiener, M; Vogelstein, R J; Burns, R; Vogelstein, J T (2017): Example use case of SIC with the ndmg pipeline (SIC:ndmg). GigaScience Database. http://dx.doi.org/10.5524/100285