Virtual Software Heritage filesystem (internship)

Context: Software Heritage is an ambitious initiative whose goal is to collect, preserve forever, and make publicly available the entire body of software, in the preferred form for making modifications to it.

Description: the Software Heritage data model is a giant Merkle DAG, not unlike the Git data model. The goal of this internship is to develop a FUSE virtual filesystem that lets users mount parts of the Software Heritage graph on a Linux machine and navigate them as if they were locally available. The filesystem backend will use the Software Heritage storage and/or graph APIs to fetch the relevant data from the archive and cache it locally as needed, for efficiency. The use cases to be explored are interactive archive navigation, to retrieve archived source code, and version control system (VCS) analysis for research purposes.
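
To make the two ideas above more concrete (a Merkle DAG, and a backend that fetches objects and caches them locally), here is a minimal Python sketch. It is only an illustration under stated assumptions, not the design of the filesystem to be developed. git_blob_id follows the well-known Git convention for hashing file contents. toy_directory_id is a deliberately simplified stand-in for a directory node, not the real Git tree encoding. CachingFetcher and its backend callable are hypothetical names: the backend stands in for whichever storage or graph API call would actually fetch an object.

```python
import hashlib
from typing import Callable, Dict


def git_blob_id(data: bytes) -> str:
    """Return the Git-style SHA-1 identifier of a file's content."""
    # Git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def toy_directory_id(entries: Dict[str, str]) -> str:
    """Hash a directory from its (name -> child id) entries.

    Simplified on purpose: this is NOT the real Git tree format. It only
    shows that a parent's identifier depends on its children's identifiers,
    which is the defining property of a Merkle DAG.
    """
    h = hashlib.sha1()
    for name in sorted(entries):
        h.update(name.encode() + b"\x00" + entries[name].encode() + b"\n")
    return h.hexdigest()


class CachingFetcher:
    """Fetch objects by id through a hypothetical backend, with a local cache."""

    def __init__(self, backend: Callable[[str], bytes]) -> None:
        self._backend = backend            # assumed: returns the bytes of an object
        self._cache: Dict[str, bytes] = {}
        self.backend_calls = 0             # counts cache misses only

    def get(self, obj_id: str) -> bytes:
        if obj_id not in self._cache:
            self.backend_calls += 1
            self._cache[obj_id] = self._backend(obj_id)
        return self._cache[obj_id]
```

Because identifiers are derived from content, an object with a given identifier never changes, so a cache keyed by identifier never needs invalidation. That property is what makes local caching a natural fit for this kind of archive.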

Desirable skills to obtain this internship:

Will be considered a plus:

  • working knowledge of filesystem architectures and/or implementations

Workplace: on site at Inria Paris (contact mentors for remote opportunities)

Environment: you will work shoulder to shoulder with all members of the Software Heritage team, and you will have a chance to witness from within the construction of the great library of source code.

Internship mentors:

  • Antoine Pietri
  • Stefano Zacchiroli <zack@upsilon.cc>
