<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.softwareheritage.org/index.php?action=history&amp;feed=atom&amp;title=Vault_Blueprint</id>
	<title>Vault Blueprint - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.softwareheritage.org/index.php?action=history&amp;feed=atom&amp;title=Vault_Blueprint"/>
	<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Vault_Blueprint&amp;action=history"/>
	<updated>2026-04-20T18:46:18Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.10</generator>
	<entry>
		<id>https://wiki.softwareheritage.org/index.php?title=Vault_Blueprint&amp;diff=778&amp;oldid=prev</id>
		<title>Seirl at 15:56, 16 January 2018</title>
		<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Vault_Blueprint&amp;diff=778&amp;oldid=prev"/>
		<updated>2018-01-16T15:56:44Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 15:56, 16 January 2018&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* '''WARNING: out-of-date draft blueprint'''&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* '''INFO: the Software Heritage implementation of the vault now lives in its own repository, with up-to-date documentation: [https://forge.softwareheritage.org/source/swh-vault/ swh-vault]'''&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;= Software Heritage Vault =&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;= Software Heritage Vault =&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Seirl</name></author>
	</entry>
	<entry>
		<id>https://wiki.softwareheritage.org/index.php?title=Vault_Blueprint&amp;diff=777&amp;oldid=prev</id>
		<title>Seirl: Created page with &quot;= Software Heritage Vault =  Software source code '''objects'''---e.g., individual source code files, tarballs, commits, tagged releases, etc.---are stored in the Software Her...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Vault_Blueprint&amp;diff=777&amp;oldid=prev"/>
		<updated>2018-01-16T15:39:34Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;= Software Heritage Vault =  Software source code &amp;#039;&amp;#039;&amp;#039;objects&amp;#039;&amp;#039;&amp;#039;---e.g., individual source code files, tarballs, commits, tagged releases, etc.---are stored in the Software Her...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;= Software Heritage Vault =&lt;br /&gt;
&lt;br /&gt;
Software source code '''objects'''---e.g., individual source code files, tarballs, commits, tagged releases, etc.---are stored in the Software Heritage (SWH) Archive in fully deduplicated form. This allows direct access to individual artifacts, but fast access to a set of related artifacts (e.g., the snapshot of a VCS repository, the archive corresponding to a Git commit, or a specific software release as a zip archive) requires some preparation, usually collecting and assembling multiple artifacts into a single '''bundle'''.&lt;br /&gt;
&lt;br /&gt;
The '''Software Heritage Vault''' is a cache of pre-built source code bundles that are assembled opportunistically by retrieving objects from the Software Heritage Archive, can be accessed efficiently, and might be garbage collected after a long period of non-use.&lt;br /&gt;
&lt;br /&gt;
== Requirements ==&lt;br /&gt;
&lt;br /&gt;
* '''Shared cache'''&lt;br /&gt;
&lt;br /&gt;
The vault is a cache shared among the various origins that the SWH archive tracks. If the same bundle, originally coming from different origins, is requested, a single entry for it in the cache shall exist.&lt;br /&gt;
&lt;br /&gt;
* '''Efficient retrieval'''&lt;br /&gt;
&lt;br /&gt;
Where supported by the desired access protocol (e.g., HTTP) it should be possible for the vault to serve bundles efficiently (e.g., as static files served via HTTP, possibly further proxied/cached at that level). In particular, this rules out building bundles on the fly from the archive DB.&lt;br /&gt;
&lt;br /&gt;
== API ==&lt;br /&gt;
&lt;br /&gt;
All URLs below are meant to be mounted at the API root, which is currently https://archive.softwareheritage.org/api/1/. Unless otherwise stated, all API endpoints respond to the HTTP GET method.&lt;br /&gt;
&lt;br /&gt;
== Object identification ==&lt;br /&gt;
&lt;br /&gt;
The vault stores bundles corresponding to different kinds of objects. The following object kinds are supported:&lt;br /&gt;
&lt;br /&gt;
* directories&lt;br /&gt;
* revisions&lt;br /&gt;
* repository snapshots&lt;br /&gt;
&lt;br /&gt;
The URL fragment &amp;lt;code&amp;gt;:objectkind/:objectid&amp;lt;/code&amp;gt; is used throughout the vault API to fully identify vault objects. The syntax and meaning of :objectid for each object kind are detailed below.&lt;br /&gt;
&lt;br /&gt;
=== Directories ===&lt;br /&gt;
&lt;br /&gt;
* object kind: directory&lt;br /&gt;
* URL fragment: directory/:sha1git&lt;br /&gt;
&lt;br /&gt;
where :sha1git is the directory ID in the SWH data model.&lt;br /&gt;
&lt;br /&gt;
=== Revisions ===&lt;br /&gt;
&lt;br /&gt;
* object kind: revision&lt;br /&gt;
* URL fragment: revision/:sha1git&lt;br /&gt;
&lt;br /&gt;
where :sha1git is the revision ID in the SWH data model.&lt;br /&gt;
&lt;br /&gt;
=== Repository snapshots ===&lt;br /&gt;
&lt;br /&gt;
* object kind: snapshot&lt;br /&gt;
* URL fragment: snapshot/:sha1git&lt;br /&gt;
&lt;br /&gt;
where :sha1git is the snapshot ID in the SWH data model. ('''TODO''' repository snapshots don't exist yet as first-class citizens in the SWH data model; see References below.)&lt;br /&gt;
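&lt;br /&gt;
The URL fragments above can be sketched as a small helper that joins the API root, the object kind, and the object ID. This is only an illustration: the name vault_url is hypothetical, and the check that :sha1git is 40 lowercase hexadecimal digits is an assumption, since this blueprint does not spell out the ID syntax.&lt;br /&gt;
&lt;br /&gt;
```python
import re

# API root as stated in the API section of this blueprint.
API_ROOT = "https://archive.softwareheritage.org/api/1/"

# Object kinds listed under "Object identification".
OBJECT_KINDS = ("directory", "revision", "snapshot")

# Assumption: a sha1git ID is written as 40 lowercase hex digits.
SHA1GIT_RE = re.compile(r"[0-9a-f]{40}")


def vault_url(object_kind, object_id):
    """Build the vault URL for the fragment :objectkind/:objectid."""
    if object_kind not in OBJECT_KINDS:
        raise ValueError("unsupported object kind: " + repr(object_kind))
    if SHA1GIT_RE.fullmatch(object_id) is None:
        raise ValueError("not a sha1git identifier: " + repr(object_id))
    return API_ROOT + "vault/" + object_kind + "/" + object_id


if __name__ == "__main__":
    # Placeholder ID, not a real archive object.
    print(vault_url("directory", "0" * 40))
```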
&lt;br /&gt;
== Cooking ==&lt;br /&gt;
&lt;br /&gt;
Bundles in the vault might be ready for retrieval or not. When they are not, they need to be '''cooked''' before they can be retrieved. A cooked bundle remains available until it expires; at that point it needs to be cooked again before it can be retrieved. Cooking is idempotent: requesting a bundle that has already been cooked and has not yet expired is a no-op.&lt;br /&gt;
&lt;br /&gt;
To cook a bundle:&lt;br /&gt;
&lt;br /&gt;
* POST /vault/:objectkind/:objectid&lt;br /&gt;
&lt;br /&gt;
Request body: '''TODO''' something here in a JSON payload that would allow notifying the user when the bundle is ready.&lt;br /&gt;
&lt;br /&gt;
Response: 201 Created&lt;br /&gt;
&lt;br /&gt;
== Retrieval ==&lt;br /&gt;
&lt;br /&gt;
* GET /vault/:objectkind&lt;br /&gt;
&lt;br /&gt;
Returns a (paginated) list of all bundles of a given kind available in the vault; see Pagination. Note that, due to cache expiration, objects might disappear between listing and subsequent actions on them.&lt;br /&gt;
&lt;br /&gt;
Examples:&lt;br /&gt;
&lt;br /&gt;
* GET /vault/directory&lt;br /&gt;
* GET /vault/revision&lt;br /&gt;
&lt;br /&gt;
* GET /vault/:objectkind/:objectid&lt;br /&gt;
&lt;br /&gt;
Retrieve a specific bundle from the vault.&lt;br /&gt;
&lt;br /&gt;
Response:&lt;br /&gt;
&lt;br /&gt;
* 200 OK: bundle available; response body is the bundle&lt;br /&gt;
* 404 Not Found: missing bundle; client should request its preparation (see Cooking)&lt;br /&gt;
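&lt;br /&gt;
The interplay between retrieval and cooking can be sketched from the client side as follows. This is a minimal sketch, not the actual client: fetch_or_cook, FakeVault, and the send(method, path) callable returning a (status, body) pair are all hypothetical names introduced for illustration, and the in-memory vault stands in for the real HTTP service.&lt;br /&gt;
&lt;br /&gt;
```python
def fetch_or_cook(send, object_kind, object_id):
    """Try to retrieve a bundle; on 404, request that it be cooked.

    `send` is a hypothetical callable taking (method, path) and
    returning a (status, body) pair.
    """
    path = "/vault/" + object_kind + "/" + object_id
    status, body = send("GET", path)
    if status == 200:
        # Bundle available: the response body is the bundle.
        return ("ready", body)
    if status == 404:
        # Missing bundle: ask the vault to prepare it.
        cook_status, _ = send("POST", path)
        if cook_status == 201:
            return ("cooking", None)
        raise RuntimeError("unexpected cooking status: " + str(cook_status))
    raise RuntimeError("unexpected retrieval status: " + str(status))


class FakeVault:
    """In-memory stand-in for the vault, for illustration only."""

    def __init__(self):
        self.bundles = {}       # path -> bundle bytes
        self.requested = set()  # paths whose cooking was requested

    def send(self, method, path):
        if method == "GET":
            if path in self.bundles:
                return 200, self.bundles[path]
            return 404, None
        if method == "POST":
            # Adding to a set makes repeated requests harmless.
            self.requested.add(path)
            return 201, None
        return 405, None


if __name__ == "__main__":
    vault = FakeVault()
    object_id = "0" * 40  # placeholder ID
    print(fetch_or_cook(vault.send, "revision", object_id))
    # Simulate the vault finishing the cooking step.
    vault.bundles["/vault/revision/" + object_id] = b"bundle bytes"
    print(fetch_or_cook(vault.send, "revision", object_id))
```
&lt;br /&gt;
Passing the transport in as a callable keeps the sketch independent of any particular HTTP library. A real client would also need to wait or be notified before retrying, which this blueprint leaves as a TODO.&lt;br /&gt;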
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* [https://wiki.softwareheritage.org/index.php?title=User:StefanoZacchiroli/Repository_snapshot_objects Repository snapshot objects]&lt;br /&gt;
* Amazon Web Services, [http://docs.aws.amazon.com/amazonglacier/latest/dev/amazon-glacier-api.html API Reference for Amazon Glacier]; specifically [http://docs.aws.amazon.com/amazonglacier/latest/dev/job-operations.html Job Operations]&lt;br /&gt;
&lt;br /&gt;
= TODO =&lt;br /&gt;
&lt;br /&gt;
* '''TODO''' pagination using HATEOAS&lt;br /&gt;
* '''TODO''' authorization: the cooking API should be somehow controlled to avoid obvious abuses (e.g., let's cache everything)&lt;br /&gt;
* '''TODO''' finalize repository snapshot proposal&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Blueprint]]&lt;/div&gt;</summary>
		<author><name>Seirl</name></author>
	</entry>
</feed>