Difference between revisions of "Google Summer of Code 2019/Graph compression"
Revision as of 19:14, 21 August 2019
- Title: Graph compression
- Description: The Software Heritage data model is a large Merkle DAG made of nodes such as revisions, releases, directories, etc. With 12 billion nodes and 165 billion edges, the graph is hard to fit in memory using naive approaches. Graph compression techniques have been used successfully to compress the Web graph (which is slightly larger than the Software Heritage one) and make it fit in memory. The goal of this GSoC project is to review existing graph compression techniques and apply the most appropriate one to the Software Heritage case, enabling in-memory processing of its Merkle DAG.
- Student: Thibault 'haltode' Allançon
- forge activity: https://forge.softwareheritage.org/p/haltode/
- Code:
- Mentors:
- Stefano Zacchiroli
- Antoine Pietri
- Activity reports:
- March 2019
- April 2019
- May 2019
- June 2019
- July 2019
- August 2019
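
To give a rough sense of why the description calls naive approaches impractical, the sketch below estimates the memory needed for two uncompressed layouts at the stated graph size, and shows how storing a sorted successor list as variable-length gaps shrinks it. Everything beyond the node and edge counts is an assumption made for illustration: the 8-byte identifiers, the helper names, the sample successor list, and the choice of gap coding itself, which is only one family of graph compression techniques and not necessarily the one the project selected.

```python
def naive_edge_list_bytes(num_edges: int, id_bytes: int = 8) -> int:
    """Bytes to store every edge as a (source, target) pair of fixed-width ids."""
    return num_edges * 2 * id_bytes


def naive_csr_bytes(num_nodes: int, num_edges: int, id_bytes: int = 8) -> int:
    """Bytes for an uncompressed adjacency array:
    one offset per node (plus a final sentinel) and one target id per edge."""
    return (num_nodes + 1) * id_bytes + num_edges * id_bytes


def varint_len(value: int) -> int:
    """Bytes taken by a non-negative integer in a 7-bits-per-byte variable-length code."""
    length = 1
    while value >= 0x80:
        value >>= 7
        length += 1
    return length


def gap_encoded_bytes(sorted_successors: list[int]) -> int:
    """Bytes needed when a sorted successor list is stored as varint-coded gaps
    (each entry is the difference from the previous one, the first from 0)."""
    total = 0
    previous = 0
    for node in sorted_successors:
        total += varint_len(node - previous)
        previous = node
    return total


# Figures taken from the project description.
NODES = 12_000_000_000
EDGES = 165_000_000_000

if __name__ == "__main__":
    print(f"edge list: {naive_edge_list_bytes(EDGES) / 1e12:.2f} TB")
    print(f"adjacency array: {naive_csr_bytes(NODES, EDGES) / 1e12:.2f} TB")

    # Hypothetical successor list with nearby ids, chosen only for illustration.
    successors = [1_000_000, 1_000_003, 1_000_010, 1_000_500]
    print(f"fixed-width: {len(successors) * 8} bytes, "
          f"gap-coded: {gap_encoded_bytes(successors)} bytes")
```

Under these assumptions both uncompressed layouts land in the terabyte range, while the gap-coded sample list takes a fraction of its fixed-width size. The gain depends on successors having nearby identifiers, which is why node ordering matters for this style of compression.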