Difference between revisions of "Expand package metadata coverage (internship)"

From Software Heritage Wiki
Line 1: Line 1:
- '''Context''': {{Internship context}}
+ {{Internship
- '''Description''': [https://archive.softwareheritage.org/browse/search/ searching] projects in the Software Heritage archive is currently possible by either (parts of) URL or by [https://www.softwareheritage.org/2019/05/28/mining-software-metadata-for-80-m-projects-and-even-more/ package metadata].
+ |description=[https://archive.softwareheritage.org/browse/search/ searching] projects in the Software Heritage archive is currently possible by either (parts of) URL or by [https://www.softwareheritage.org/2019/05/28/mining-software-metadata-for-80-m-projects-and-even-more/ package metadata].
  Currently, only a limited number of package metadata are [https://docs.softwareheritage.org/devel/swh-indexer/metadata-workflow.html#supported-intrinsic-metadata supported], including Maven, NPM, PyPI, and Gems.
  The goal of this internship is to extend the coverage of supported metadata to additional package managers, the long-term goal being supporting all [https://libraries.io/ Libraries.io]-indexed package managers.
- '''Desirable skills''' to obtain this internship:
+ |skills=
  * Python development
Line 11: Line 10:
  * knowledge of linked data technologies and ontologies (e.g., RDFa, JSON-LD, OWL, etc.)
- '''Workplace''': {{Internship workplace}}
+ |mentors=
- '''Environment''': {{Internship environment}}
- '''Internship mentors''':
  * Morane Gruenpeter
  * Valentin Lorentz
  * Stefano Zacchiroli <zack@irif.fr>
+ }}
  [[Category:Available internship]]
  [[Category:Internship]]
  [[Category:Lang:English]]

Revision as of 13:49, 29 January 2020

Context: Software Heritage is an ambitious research project whose goal is to collect, preserve in the very long term, and share all publicly accessible Free/Open Source Software (FOSS) in source code form.

Description: projects in the Software Heritage archive can currently be searched either by (parts of) their URL or by package metadata. So far, only a limited number of package metadata formats are supported, including Maven, NPM, PyPI, and Gems. The goal of this internship is to extend metadata coverage to additional package managers, with the long-term goal of supporting all package managers indexed by Libraries.io.
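
As a rough illustration of what supporting one more package metadata format could involve, the sketch below translates a few fields of a parsed package manifest into schema.org-style terms in a JSON-LD-like document. The manifest field names, the `CROSSWALK` table, and the `translate` function are hypothetical and are not taken from the Software Heritage indexer; a real mapping would follow whatever vocabulary the indexer already uses.

```python
import json

# Hypothetical crosswalk: keys are fields of an imagined package manifest,
# values are schema.org property names. This table is an assumption made
# for illustration, not the mapping used by Software Heritage.
CROSSWALK = {
    "name": "name",
    "version": "version",
    "description": "description",
    "license": "license",
    "homepage": "url",
    "repository": "codeRepository",
}


def translate(manifest: dict) -> dict:
    """Translate a parsed manifest into a small JSON-LD-like document."""
    doc = {"@context": "https://schema.org", "@type": "SoftwareSourceCode"}
    for field, term in CROSSWALK.items():
        value = manifest.get(field)
        if value:  # skip missing or empty fields
            doc[term] = value
    return doc


# A made-up manifest; the "private" field has no entry in the crosswalk
# and is therefore dropped from the output.
example = json.loads(
    '{"name": "demo", "version": "1.0.0", '
    '"homepage": "https://example.org", "private": true}'
)
result = translate(example)
print(json.dumps(result, indent=2))
```

Keeping the mapping in a plain table means that adding a package manager mostly amounts to writing a new table, plus a parser for that manager's manifest format.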

Desirable skills to obtain this internship:

  • Python development

Will be considered a plus:

  • knowledge of linked data technologies and ontologies (e.g., RDFa, JSON-LD, OWL, etc.)

Workplace: on site at Inria Paris (contact mentors for remote opportunities)

Environment: you will work shoulder to shoulder with all members of the Software Heritage team, and you will have a chance to witness from within the construction of the great library of source code.

Internship mentors:

  • Morane Gruenpeter
  • Valentin Lorentz
  • Stefano Zacchiroli <zack@irif.fr>

See also