Difference between revisions of "User:StefanoZacchiroli/Content deduplication"

From Software Heritage Wiki

Revision as of 08:43, 7 January 2018

Some experiments on deduplicating contents at sub-file granularity using Rabin fingerprints.

Linux kernel, Git repo

  • origin: git.kernel.org, on 2018-01-06
  • 1,653,941 content blobs, for a total of 19 GB (compressed)
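
To make "sub-file granularity" concrete, here is a minimal sketch of content-defined chunking, the usual way rolling fingerprints are applied to deduplication. It is not the code used for these experiments. It substitutes a simple polynomial rolling hash for a true Rabin fingerprint (which works over polynomials in GF(2)), and the window size, cut mask, chunk size limits, and function names are all assumptions chosen for illustration.

```python
import hashlib

# All parameters below are illustrative assumptions, not values from the experiments.
WINDOW = 48            # bytes covered by the rolling hash
BASE = 257             # multiplier of the polynomial rolling hash
MOD = (1 << 61) - 1    # large prime modulus
MASK = (1 << 11) - 1   # cut when the low 11 bits are zero (about 2 KiB between cuts)
MIN_CHUNK = 256        # never cut before this many bytes
MAX_CHUNK = 16384      # always cut at this many bytes


def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks of data."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    start = 0
    h = 0
    for i, b in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        length = i - start + 1
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield start, i + 1
            start = i + 1
            h = 0  # restart the window at the new chunk
    if start < len(data):
        yield start, len(data)  # trailing partial chunk


def dedup_stats(blobs):
    """Return (total_bytes, unique_bytes) after chunk-level deduplication."""
    seen = {}  # chunk digest -> chunk length
    total = 0
    for blob in blobs:
        for s, e in chunk_boundaries(blob):
            chunk = blob[s:e]
            total += len(chunk)
            seen.setdefault(hashlib.sha1(chunk).digest(), len(chunk))
    return total, sum(seen.values())
```

Because a cut depends only on the last WINDOW bytes, two blobs that share a long run of bytes tend to produce identical chunks inside that run even when the surrounding content differs. Calling `dedup_stats` on an iterable of blob contents gives the byte counts before and after deduplication, and their ratio is one way to measure how much sub-file deduplication would save.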