User:StefanoZacchiroli/Content deduplication: Difference between revisions

Revision as of 13:02, 7 January 2018

Some experiments on deduplicating file contents at sub-file (block-level) granularity.
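To make sub-file deduplication concrete, here is a minimal sketch of content-defined chunking followed by hash-based block deduplication. It is not the code of swh-dedup-blocks.py: as a simplifying assumption it uses a Rabin-Karp style polynomial rolling hash over integers rather than a true Rabin fingerprint over GF(2), and every name and parameter (`WINDOW`, `MASK`, `MIN_SIZE`, `MAX_SIZE`, `split_chunks`, `dedup_stats`) is hypothetical and chosen for illustration, not taken from the experiments.

```python
import hashlib
from typing import Iterable, List, Tuple

# Illustrative parameters (assumptions, not the values used in the experiments).
WINDOW = 48            # bytes covered by the rolling hash window
BASE = 257             # base of the polynomial rolling hash
MOD = (1 << 61) - 1    # modulus keeping the hash bounded
MASK = (1 << 11) - 1   # cut when the low 11 bits are zero (~2 KiB between cuts)
MIN_SIZE = 512         # never emit a non-final chunk smaller than this
MAX_SIZE = 8192        # force a cut once a chunk reaches this size


def split_chunks(data: bytes) -> List[bytes]:
    """Split data at positions chosen by a rolling hash of the last WINDOW bytes."""
    chunks = []
    start = 0
    h = 0
    # Weight of the oldest byte in a full window, used to remove it from the hash.
    out_factor = pow(BASE, WINDOW - 1, MOD)
    for i, byte in enumerate(data):
        size = i - start + 1  # current chunk length, including this byte
        if size > WINDOW:
            # Drop the byte that slides out of the window.
            h = (h - data[i - WINDOW] * out_factor) % MOD
        h = (h * BASE + byte) % MOD
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk, possibly shorter than MIN_SIZE
    return chunks


def dedup_stats(blobs: Iterable[bytes]) -> Tuple[int, int]:
    """Return (total bytes, bytes left after storing each distinct chunk once)."""
    seen = {}
    total = 0
    for blob in blobs:
        for chunk in split_chunks(blob):
            total += len(chunk)
            # Chunks are identified by their SHA1; identical chunks are stored once.
            seen.setdefault(hashlib.sha1(chunk).digest(), len(chunk))
    return total, sum(seen.values())
```

Because a cut depends only on the last `WINDOW` bytes, inserting a few bytes near the start of a file shifts the following chunk boundaries along with the content, so later chunks can still match those of the original file. Fixed-size blocks would instead all change after the insertion point.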

Rabin fingerprint parameters:

Results:

@@ Line 10: / Line 10: @@
 == Rabin fingerprints ==
-Use [https://en.wikipedia.org/wiki/Rabin_fingerprint Rabin fingerprints].
+* Approach: use [https://en.wikipedia.org/wiki/Rabin_fingerprint Rabin fingerprints]
+* Implementation: [https://forge.softwareheritage.org/rDSNIPe26a6cee53d0a0e748f3f3c9b0934477eaf25b5b swh-dedup-blocks.py]
 === test 1: linux git ===