Virtual Software Heritage filesystem (internship)

From Software Heritage Wiki
Revision as of 10:25, 29 January 2020 by StefanoZacchiroli (talk | contribs) (fill in internship details)

Context: Software Heritage is an ambitious research project whose goal is to collect, preserve in the very long term, and share all publicly accessible Free/Open Source Software (FOSS) in source code form.

Description: the Software Heritage data model is a giant Merkle DAG, not unlike the Git data model. The goal of this internship is to develop a FUSE virtual filesystem that allows users to mount parts of the Software Heritage graph on a Linux machine and navigate them as if they were locally available. The filesystem backend will use the Software Heritage storage and/or graph APIs to fetch the relevant data from the archive and cache it locally as needed for efficiency. The use cases to be explored are interactive archive navigation for retrieving archived source code and VCS analysis for research purposes.
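
As a rough illustration of the two ideas in the description (content-addressed objects as in a Merkle DAG, and a backend that fetches from the archive and caches locally), here is a minimal Python sketch. It is not the project's implementation. The identifier function follows the well-known Git blob convention; the ReadThroughCache class and its fetch callback are hypothetical stand-ins for whatever storage or graph API call the real backend would use, and a real filesystem would also need to handle directories, revisions, and on-disk persistence.

```python
import hashlib
from typing import Callable, Dict


def git_blob_id(data: bytes) -> str:
    """Return the Git-style SHA-1 identifier of a file's content.

    Git hashes the header "blob <size>\\0" followed by the raw bytes, so
    identical contents always get the same identifier.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


class ReadThroughCache:
    """Hypothetical local cache sitting in front of a remote archive.

    `fetch` is an assumed callback (object id -> bytes) standing in for a
    real archive API call; it is not an actual Software Heritage function.
    """

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}
        self.remote_calls = 0  # counts cache misses, for illustration

    def get(self, object_id: str) -> bytes:
        if object_id not in self._store:
            self.remote_calls += 1
            data = self._fetch(object_id)
            # Content addressing lets the client verify what it received.
            if git_blob_id(data) != object_id:
                raise ValueError(f"integrity check failed for {object_id}")
            self._store[object_id] = data
        return self._store[object_id]


if __name__ == "__main__":
    # A dictionary plays the role of the remote archive in this sketch.
    content = b"int main(void) { return 0; }\n"
    archive = {git_blob_id(content): content}

    cache = ReadThroughCache(archive.__getitem__)
    oid = git_blob_id(content)
    cache.get(oid)  # first access goes to the "archive"
    cache.get(oid)  # second access is served locally
    print(oid, "remote calls:", cache.remote_calls)
```

Because objects are immutable and named by their hash, cached entries in such a design never need invalidation, which is one reason a local cache fits this data model well.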

Desirable skills to obtain this internship:

Will be considered a plus:

  • working knowledge of filesystem architectures and/or implementations

Workplace: on site at Inria Paris (contact mentors for remote opportunities)

Environment: you will work shoulder to shoulder with all members of the Software Heritage team, and you will have a chance to witness from within the construction of the great library of source code.

Internship mentors:

  • Antoine Pietri <antoine.pietri1@gmail.com>
  • Stefano Zacchiroli <zack@upsilon.cc>