Difference between revisions of "Repository snapshot objects"

From Software Heritage Wiki

Revision as of 15:32, 9 August 2016

Repository snapshot objects

A repository snapshot object is a Merkle DAG node used to capture the state of a VCS repository.

Conceptually, a snapshot object is a map from branch names to revision identifiers.
Practically, the map is serialized as an association list sorted by branch name.
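The serialization step can be sketched in shell. This is a minimal sketch, not the wiki's reference implementation: it assumes one `<revision ID> <branch name>` pair per line and that "sorted by branch name" means plain bytewise ordering. The function name `sort_manifest` is hypothetical.

```shell
# Hypothetical helper: read "<revision ID> <branch name>" lines on stdin
# and print them sorted by branch name.
# ASSUMPTION: the order is bytewise (C locale), independent of the
# caller's locale settings.
sort_manifest() {
    LC_ALL=C sort -k 2
}
```

Sorting explicitly makes the serialization independent of the order in which the pairs were produced.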

Object ID

Each snapshot object has as its snapshot object ID the cryptographic hash value of a textual serialization of the <branch name, revision ID> association list.

Git implementation

In the spirit of other Git objects, snapshot objects for Git repositories can be implemented as follows.

  # create repo with some commits, branches, and tags
$ git init test
$ cd test/
$ echo foo > foo.txt
$ git add foo.txt 
$ git commit -m 'checkin foo'
$ git branch foo
$ echo bar >> foo.txt 
$ git commit -a -m 'add bar'
$ git tag bar
$ echo baz >> foo.txt 
$ git commit -a -m 'add baz'

  # ASSUMPTION: the output of git show-ref is sorted by ref name using
  # the usual Git sort algorithm for textual object manifests. This is
  # currently the case as of Git 2.8.1, but it is not documented
  # behavior in git-show-ref(1).

  # snapshot object in full (the manifest)
$ git show-ref
585f6e27f540012af621a18d0155aae2a8ec0276 refs/heads/foo
6d976a397fe0b28a5bc59540e64f7f36a861af68 refs/heads/master
521cb6d728f9fa3d6c4d73ddd309c0796ddf6995 refs/tags/bar

  # snapshot object ID, as a Git SHA1
$ git show-ref | git hash-object -w --stdin --literally -t snapshot
7e78262ed2b2998244864284fbbc7b770bce2951

  # raw content of the snapshot object, including Git header
$ zlib-flate -uncompress < .git/objects/7e/78262ed2b2998244864284fbbc7b770bce2951 
snapshot 170585f6e27f540012af621a18d0155aae2a8ec0276 refs/heads/foo
6d976a397fe0b28a5bc59540e64f7f36a861af68 refs/heads/master
521cb6d728f9fa3d6c4d73ddd309c0796ddf6995 refs/tags/bar

  # i.e., a 170-byte long object of type "snapshot"; the NUL byte that
  # terminates the "snapshot 170" header is not visible in the output above
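
The same ID can in principle be computed without writing anything to the object store, because Git hashes the header (`<type> <size>` followed by a NUL byte) together with the content. The following is a rough sketch under stated assumptions: the helper name `git_object_id` is hypothetical, and `sha1sum` and `mktemp` are assumed to be available, as on a typical GNU/Linux system.

```shell
# Hypothetical helper: print the SHA1 that Git would assign to an object
# of the given type whose content is read from standard input.
# ASSUMPTION: sha1sum (GNU coreutils) and mktemp are installed.
git_object_id() {
    obj_type=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"                          # buffer stdin to learn its size
    size=$(wc -c < "$tmp" | tr -d ' ')    # content length in bytes
    # Git hashes "<type> <size>" + NUL byte + content
    { printf '%s %s\0' "$obj_type" "$size"; cat "$tmp"; } \
        | sha1sum | cut -d ' ' -f 1
    rm -f "$tmp"
}
```

For instance, the output of `git show-ref | git_object_id snapshot`, run in the test repository, can be compared against the ID reported by `git hash-object` above.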