<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://wiki.softwareheritage.org/index.php?action=history&amp;feed=atom&amp;title=Reverse_project_phylogenesis_%28internship%29</id>
	<title>Reverse project phylogenesis (internship) - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.softwareheritage.org/index.php?action=history&amp;feed=atom&amp;title=Reverse_project_phylogenesis_%28internship%29"/>
	<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;action=history"/>
	<updated>2026-04-22T23:12:36Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.10</generator>
	<entry>
		<id>https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1822&amp;oldid=prev</id>
		<title>StefanoZacchiroli at 15:11, 4 February 2024</title>
		<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1822&amp;oldid=prev"/>
		<updated>2024-02-04T15:11:37Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 15:11, 4 February 2024&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l15&quot;&gt;Line 15:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 15:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;|mentors=&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;|mentors=&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Roberto Di Cosmo &amp;lt;roberto@dicosmo.org&amp;gt; (rdicosmo on [[&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;IRC&lt;/del&gt;]])&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Roberto Di Cosmo &amp;lt;roberto@dicosmo.org&amp;gt; (rdicosmo on [[&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Matrix&lt;/ins&gt;]])&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;}}&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;}}&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Available internship]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Available internship]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>StefanoZacchiroli</name></author>
	</entry>
	<entry>
		<id>https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1727&amp;oldid=prev</id>
		<title>RobertoDiCosmo: /* Add Phylogenesis internship */</title>
		<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1727&amp;oldid=prev"/>
		<updated>2022-10-31T17:30:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;Add Phylogenesis internship&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 17:30, 31 October 2022&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l5&quot;&gt;Line 5:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This means that there are documents, research articles, blog posts, documentation, and many other sources out there that contain broken links: the Software Heritage archive provides a way to easily find the archived version of the project, but does not help identify the new repository to which development may have migrated.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;This means that there are documents, research articles, blog posts, documentation, and many other sources out there that contain broken links: the Software Heritage archive provides a way to easily find the archived version of the project, but does not help identify the new repository to which development may have migrated.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The goal of this internship is to explore heuristics that exploit the special feature of the Software Heritage Merkle graph to identify repositories that may be the new development strand of an old repository saved from a discontinued platform.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The goal of this internship is to explore heuristics that exploit the special feature of the Software Heritage Merkle graph to identify repositories that may be the new development strand of an old repository saved from a discontinued platform&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, and show these links in the relevant repositories: this corresponds to developing the [https://www.vocabulary.com/dictionary/phylogenesis phylogenesis] of a software project&lt;/ins&gt;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;One of the challenges will be to compare various heuristics and scale the approach up to the millions of repositories involved.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;One of the challenges will be to compare various heuristics and scale the approach up to the millions of repositories involved.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>RobertoDiCosmo</name></author>
	</entry>
	<entry>
		<id>https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1726&amp;oldid=prev</id>
		<title>RobertoDiCosmo: Created page with &quot;{{Internship |description=The [https://archive.softwareheritage.org/ Software Heritage Archive] contains a large number of projects (over one million!) salvaged from code host...&quot;</title>
		<link rel="alternate" type="text/html" href="https://wiki.softwareheritage.org/index.php?title=Reverse_project_phylogenesis_(internship)&amp;diff=1726&amp;oldid=prev"/>
		<updated>2022-10-31T17:26:13Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;{{Internship |description=The [https://archive.softwareheritage.org/ Software Heritage Archive] contains a large number of projects (over one million!) salvaged from code host...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{Internship&lt;br /&gt;
|description=The [https://archive.softwareheritage.org/ Software Heritage Archive] contains a large number of projects (over one million!) salvaged from code hosting platforms that have been closed down, ranging from large ones like Google Code or Gitorious.org, to small institutional ones, like the old Inria gForge, and from platforms that have phased out support for some version control systems, like Bitbucket.&lt;br /&gt;
Some of these projects migrated to other platforms, where they continued their development.&lt;br /&gt;
&lt;br /&gt;
This means that there are documents, research articles, blog posts, documentation, and many other sources out there that contain broken links: the Software Heritage archive provides a way to easily find the archived version of the project, but does not help identify the new repository to which development may have migrated.&lt;br /&gt;
&lt;br /&gt;
The goal of this internship is to explore heuristics that exploit the special feature of the Software Heritage Merkle graph to identify repositories that may be the new development strand of an old repository saved from a discontinued platform.&lt;br /&gt;
&lt;br /&gt;
One of the challenges will be to compare various heuristics and scale the approach up to the millions of repositories involved.&lt;br /&gt;
&lt;br /&gt;
|skills=&lt;br /&gt;
* Python development&lt;br /&gt;
* understanding of version control systems (git in particular) and familiarity with code hosting platforms&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
|mentors=&lt;br /&gt;
* Roberto Di Cosmo &amp;lt;roberto@dicosmo.org&amp;gt; (rdicosmo on [[IRC]])&lt;br /&gt;
}}&lt;br /&gt;
&lt;br /&gt;
[[Category:Available internship]]&lt;/div&gt;</summary>
		<author><name>RobertoDiCosmo</name></author>
	</entry>
</feed>