Difference between revisions of "WG/Distribution Replication and Query"

 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
= Working group on Distribution, Replication and Query =
+
= Charter =
== Charter ==
 
=== Mission ===
 
<blockquote><em>&#x2026;let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond  the reach of accident.</em><br/>Thomas Jefferson, February 18, 1791</blockquote>
 
  
 +
== Mission ==
 +
 +
<blockquote><em>let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond  the reach of accident.</em><br/>Thomas Jefferson, February 18, 1791</blockquote>
  
 
One of Software Heritage's main missions is to ensure that the source code
 
Line 32: Line 32:
 
metadata database, which may also be distributed.
  
=== Duration ===
+
== Duration ==
 
This working group is open ended.
 
  
=== Expected outcomes ===
+
== Expected outcomes ==
 
The main expected outcomes are listed below.
 
  
Line 55: Line 55:
 
raise awareness among all the interested parties.
 
  
=== Milestones ===
+
== Milestones ==
 +
 
 +
= Team contact(s) =
  
== Team contact(s) ==
 
 
* [http://www.dicosmo.org Roberto Di Cosmo]
 
 
 
* [https://upsilon.cc/~zack Stefano Zacchiroli]
 
  
== Documents ==
+
= Documents =
 
Documents produced by the working group will be listed in this section.
 
  
== Connections ==
+
= Connections =
 
Active or planned connections to other initiatives and activities will be listed in this section.
  
== Infrastructure ==
+
= Infrastructure =
=== Mailing list ===
+
 
 +
== Mailing list ==
 +
 
 
* https://sympa.inria.fr/sympa/info/direq-wg-swh
 
 +
 +
[[Category:Working group]]

Latest revision as of 13:45, 31 July 2016

Charter

Mission

let us save what remains: not by vaults and locks which fence them from the public eye and use in consigning them to the waste of time, but by such a multiplication of copies, as shall place them beyond the reach of accident.
Thomas Jefferson, February 18, 1791

One of Software Heritage's main missions is to ensure that the source code assets will be preserved in the long term. There are a variety of threats to digital information, ranging from operational failures to physical accidents, from human error to malicious attacks, from technical obsolescence to legal uncertainty.

The most promising approach to withstand all these challenges is to ensure we build the Software Heritage archive on an infrastructure that is distributed and replicated in all respects:

  • technically: using a variety of different technologies
  • geographically: on different continents
  • administratively: under different control structures
  • legally: under different legal systems

The main goal of the DIREQ working group is to monitor and evaluate existing and forthcoming approaches to distributed resilient archival, and to develop and evolve an API allowing the Software Heritage network of peers to abstract from the particular technologies used to implement the storage backends and the metadata database. This API will need to address the issues related to reading and writing information on the storage backend, and also updating and querying the metadata database, which may also be distributed.
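
To make the idea of a technology-neutral storage API more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the API of the working group or of Software Heritage: the names ObjectStorage, InMemoryStorage and copy_object are hypothetical, and so is the choice of a SHA-256 digest as the object identifier.

```python
import hashlib
from abc import ABC, abstractmethod


class ObjectStorage(ABC):
    """Hypothetical backend-neutral interface (an assumption, not the real API)."""

    @abstractmethod
    def add(self, content: bytes) -> str:
        """Store content and return its identifier."""

    @abstractmethod
    def get(self, obj_id: str) -> bytes:
        """Return the content stored under obj_id, or raise KeyError."""

    @abstractmethod
    def __contains__(self, obj_id: str) -> bool:
        """Report whether obj_id is present in this backend."""


class InMemoryStorage(ObjectStorage):
    """Toy backend standing in for a real storage technology."""

    def __init__(self) -> None:
        self._objects = {}  # maps identifier -> raw bytes

    def add(self, content: bytes) -> str:
        # Assumption: objects are content-addressed by their SHA-256 hex digest.
        obj_id = hashlib.sha256(content).hexdigest()
        self._objects[obj_id] = content
        return obj_id

    def get(self, obj_id: str) -> bytes:
        return self._objects[obj_id]

    def __contains__(self, obj_id: str) -> bool:
        return obj_id in self._objects


def copy_object(obj_id: str, source: ObjectStorage, target: ObjectStorage) -> str:
    """Replicate one object between backends using only the shared interface."""
    return target.add(source.get(obj_id))
```

Because copy_object relies only on the abstract interface, the source and the target could be implemented with entirely different technologies, which is the kind of abstraction the paragraph above calls for.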

Duration

This working group is open ended.

Expected outcomes

The main expected outcomes are listed below.

A common API for the distributed object storage that abstracts away the details of the different underlying technologies that may be adopted by the network of peers. The proposed solution must allow several distinct technologies to be in operation at the same time, and must monitor the degree of replication achieved.

State-of-the-art approaches to distributed metadata databases that support queries and updates on the database holding all the Software Heritage metadata. The append-only nature of this database may change how the well-known CAP issues for distributed databases apply to it.

Monitoring and evaluation of existing and forthcoming approaches to distributed resilient archival and databases.

Awareness: The DIREQ working group will establish the relevant connections in order to raise awareness among all the interested parties.
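
The first outcome above asks for monitoring of the degree of replication across the network of peers. As a rough illustration only, the sketch below assumes that each peer can report the set of object identifiers it holds; the function names and the shape of the peers mapping are hypothetical and not part of any agreed design.

```python
from typing import Dict, List, Set


def replication_degree(obj_id: str, peers: Dict[str, Set[str]]) -> int:
    """Count the peers whose inventory contains obj_id."""
    return sum(1 for inventory in peers.values() if obj_id in inventory)


def under_replicated(peers: Dict[str, Set[str]], minimum: int) -> List[str]:
    """List, in sorted order, the objects held by fewer than `minimum` peers."""
    # Union of all inventories: every object known to at least one peer.
    all_ids = set().union(*peers.values())
    return sorted(o for o in all_ids if replication_degree(o, peers) < minimum)
```

With three peers holding {"obj1", "obj2"}, {"obj1"} and {"obj1", "obj3"}, the degree of "obj1" is 3, and under_replicated with a minimum of 2 returns ["obj2", "obj3"]. A real deployment would presumably query the peers through the common API rather than compare complete inventories.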

Milestones

Team contact(s)

  • Roberto Di Cosmo (http://www.dicosmo.org)
  • Stefano Zacchiroli (https://upsilon.cc/~zack)

Documents

Documents produced by the working group will be listed in this section.

Connections

Active or planned connections to other initiatives and activities will be listed in this section.

Infrastructure

Mailing list

  • https://sympa.inria.fr/sympa/info/direq-wg-swh