Data from: Exploration and Explanation in Computational Notebooks

About this Collection

In July 2017, our team queried, downloaded, and analyzed approximately 1.25 million Jupyter Notebooks in public repositories on GitHub. By our calculation, this was about 95% of all Jupyter Notebooks publicly available on GitHub at the time.

This dataset includes:

- ~1.25 million Jupyter Notebooks
- Metadata about each notebook
- Metadata about each of the nearly 200,000 public repositories that contained a Jupyter Notebook
- Top-level README files for nearly 150,000 repositories containing a Jupyter Notebook

In addition to this core data, these data include:

- A smaller starter dataset with 1000 randomly selected repositories containing ~6000 notebooks
- CSV files summarizing and indexing the notebooks, repositories, and READMEs
- Log files documenting when each file was downloaded
- Scripts for our initial analysis of the dataset

View this collection on the contributor's website.
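For readers who want to inspect the notebooks themselves, the following is a minimal sketch of reading one notebook and counting its cells by type. It assumes the notebooks are stored as standard `.ipynb` files, which are JSON documents. The function name `count_cell_types` and the inline sample notebook are hypothetical and are not part of the dataset or of our analysis scripts.

```python
import json
from collections import Counter


def count_cell_types(notebook_json: str) -> Counter:
    """Count cells by type (code, markdown, ...) in one notebook's JSON text."""
    nb = json.loads(notebook_json)
    # nbformat 4 keeps cells at the top level; nbformat 3 and earlier
    # nest them inside a list of worksheets.
    cells = nb.get("cells")
    if cells is None:
        cells = [
            cell
            for worksheet in nb.get("worksheets", [])
            for cell in worksheet.get("cells", [])
        ]
    return Counter(cell.get("cell_type", "unknown") for cell in cells)


# Hypothetical in-memory notebook standing in for a file from the dataset.
sample = json.dumps(
    {
        "nbformat": 4,
        "nbformat_minor": 2,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
            {"cell_type": "code", "metadata": {}, "source": ["x = 1"],
             "outputs": [], "execution_count": 1},
            {"cell_type": "code", "metadata": {}, "source": ["x + 1"],
             "outputs": [], "execution_count": 2},
        ],
    }
)

counts = count_cell_types(sample)
print(dict(counts))
```

To apply this to a downloaded file, read its text with `open(path, encoding="utf-8").read()` and pass the result to the function. Notebooks that are empty or malformed may fail to parse, so a real pass over the collection would likely wrap the call in error handling.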

1 item found in this collection