Integrate Software Heritage and ClearlyDefined (internship)
Context: Software Heritage is an ambitious initiative whose goal is to collect, preserve forever, and make publicly available the entire body of software, in the preferred form for making modifications to it.
Description: ClearlyDefined is a project whose goal is to collaboratively and semi-automatically curate information about Free/Open Source Software (FOSS) projects, including licensing and vulnerability information. As one of its main outputs, ClearlyDefined maintains an open data knowledge base that cross-references FOSS source code artifacts found in version control systems, package repositories, etc. with curated information about their licenses and vulnerabilities. The same source code artifacts are archived by Software Heritage for long-term preservation. The goal of this internship is to integrate ClearlyDefined and Software Heritage, for mutual benefit. Software Heritage will benefit from mirroring ClearlyDefined data, making it possible to query that data both while navigating the archive and at scale; ClearlyDefined will benefit from learning about the existence of FOSS projects that have not yet been analyzed for "clarity".
Desirable skills to obtain this internship:
- JavaScript / NodeJS
- Python
- experience with database management systems (of any kind)
Workplace: on site at Inria Paris (contact mentors for remote opportunities)
Environment: you will work shoulder to shoulder with members of the Software Heritage and ClearlyDefined projects, with mentors from both projects
Internship mentors:
- Valentin Lorentz
- Stefano Zacchiroli <zack@upsilon.cc> (Zack on Matrix)
See also