Software Heritage periodically takes on motivated students who are interested in the project's mission and mentors them through short paid internships (usually 4 to 6 months).
Below you can find a list of currently available internship topics, as well as a list of internships completed in the past, which are still documented here for historical reasons.
If you are a student interested in one of the available internship topics, please reach out directly to the contact points given in the internship descriptions.
- Archive search query language (internship)
- Expand metadata search coverage (internship)
- Fine-grained tracking of source code provenance (internship)
- Graph query language for the archive (internship)
- Ingest all Debian derivatives (internship)
- Ingest Wikidata software origins (internship)
- Integrate Software Heritage and ClearlyDefined (internship)
- Integrate Software Heritage and GHTorrent (internship)
- Large-scale license text recognition (internship)
- Source code search engine prototype (internship)
- Crawling project metadata (internship)
- Distributed self-healing object storage (internship)
- Expand archive coverage to Debian-based distros (internship)
- Expand archive coverage to other popular code hosting platforms (internship)
- Graph compression on the development history of software (internship)
- Large-scale programming language detection (internship)
- The Vault (internship)