Difference between revisions of "TinkerPop Gremlin backend for WebGraph (internship)"

From Software Heritage Wiki
(Created page with "{{Internship |description=Software Heritage uses the [http://webgraph.di.unimi.it/ WebGraph] framework for graph compression. This allows to manipulate the huge archive Merkle...")
 
(mark internship as completed)
 
(2 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
{{Internship
 
{{Internship
 
|description=Software Heritage uses the [http://webgraph.di.unimi.it/ WebGraph] framework for graph compression. This allows to manipulate the huge archive Merkle DAG in RAM efficiently, via the [https://docs.softwareheritage.org/devel/swh-graph/ swh-graph component]. The [https://docs.softwareheritage.org/devel/swh-graph/api.html current RPC API] to navigate the graph is however very limited and ad hoc. We would like to exploit the current compressed graph representation using a standard graph traversal language such as the [https://en.wikipedia.org/wiki/Gremlin_(query_language) Gremlin graph traversal language].
 
|description=Software Heritage uses the [http://webgraph.di.unimi.it/ WebGraph] framework for graph compression. This allows to manipulate the huge archive Merkle DAG in RAM efficiently, via the [https://docs.softwareheritage.org/devel/swh-graph/ swh-graph component]. The [https://docs.softwareheritage.org/devel/swh-graph/api.html current RPC API] to navigate the graph is however very limited and ad hoc. We would like to exploit the current compressed graph representation using a standard graph traversal language such as the [https://en.wikipedia.org/wiki/Gremlin_(query_language) Gremlin graph traversal language].
The goal of this internship is to design, implement, and experiment with a backend for [http://tinkerpop.apache.org/ Apache Tinkerpop] (a popular open source implementation of Gremlin) that sits on top of WebGraph. If successful it will allow to traverse the huge Software Heritage graph with both the current efficiency and the convenience of a high-level and expressive graph traversal language.
+
The goal of this internship is to design, implement, and experiment with a backend for [http://tinkerpop.apache.org/ Apache TinkerPop] (a popular open source implementation of Gremlin) that sits on top of WebGraph. If successful it will allow to traverse the huge Software Heritage graph with both the current efficiency and the convenience of a high-level and expressive graph traversal language.
  
 
|skills=
 
|skills=
Line 16: Line 16:
 
}}
 
}}
  
[[Category:Available internship]]
+
[[Category:Completed internship]]

Latest revision as of 09:58, 15 July 2022

Context: Software Heritage is an ambitious initiative whose goal is to collect, preserve forever, and make publicly available the entire body of software, in the preferred form for making modifications to it.

Description: Software Heritage uses the WebGraph framework for graph compression. This makes it possible to manipulate the huge archive Merkle DAG efficiently in RAM, via the swh-graph component. The current RPC API for navigating the graph is, however, very limited and ad hoc. We would like to exploit the current compressed graph representation through a standard graph traversal language such as Gremlin. The goal of this internship is to design, implement, and experiment with a backend for Apache TinkerPop (a popular open source implementation of Gremlin) that sits on top of WebGraph. If successful, it will make it possible to traverse the huge Software Heritage graph with both the current efficiency and the convenience of a high-level, expressive graph traversal language.

Desirable skills to obtain this internship:

  • Java development
  • basic knowledge of graph theory

Considered a plus:

  • experience with the implementation of graph-based applications

Workplace: on site at Inria Paris (contact mentors for remote opportunities)

Environment: you will work shoulder to shoulder with all members of the Software Heritage team, and you will have a chance to witness from within the construction of the great library of source code.

Internship mentors:

See also