Darcs loader (internship)

From Software Heritage Wiki
Revision as of 12:41, 28 January 2020 by StefanoZacchiroli (talk | contribs)

Context: Software Heritage is an ambitious research project whose goal is to collect, preserve in the very long term, and share all publicly available Free/Open Source Software (FOSS) in source code form.

Description: The Software Heritage archive currently contains source code coming mostly from Git repositories publicly available on the Internet. We would like to extend the archive's coverage to source code hosted in other popular Distributed Version Control Systems (DVCSs), and in particular Darcs. The goal of this internship is to develop automated "loaders" that ingest source code from Darcs repositories into the archive.

Desirable skills to obtain this internship:

  • familiarity with Distributed Version Control Systems (DVCSs), Darcs in particular
  • graph data structures and algorithms
  • Python development
  • working knowledge of PostgreSQL would be a plus

Workplace: Inria Paris

Environment: you will work side by side with all members of the Software Heritage team, and you will have the chance to witness from within the construction of the ultimate source code archive.

Internship mentors:

  • Stefano Zacchiroli <zack@upsilon.cc>