Difference between revisions of "Code search"

From Software Heritage Wiki
(Created page with "Searching through the vast amount of code available in Software Heritage is a real technical and scientific challenge. Indeed, when browsing code, one would like to go be...")
 
Line 1: Line 1:
 
Searching through the vast amount of code available in [[Software Heritage]] is a real technical and scientific challenge.
  
Indeed, when browsing code, one would like to go beyond classical full text search, for which efficient tools do exist.
+
Indeed, when browsing code, one would like to go beyond classical full text search, for example via regular expression search or AST search.
  
One possibility is performing search using regular expressions, for which we list here the relevant prior art:
+
== Searchcode ==
 +
 
 +
Besides search tools included in [[GitHub]] or [[OpenHub]], we know of:
 +
 
 +
* http://www.boyter.org/category/searchcode/ and https://searchcode.com/ provide full-text code search over more than 20 billion lines of code; built by Ben E. Boyter in Sydney as a side project, with the latest work dated October 2015.
 +
 
 +
== Sourcegraph ==
 +
 
 +
Performing search using regular expressions has also been explored, see:
  
 
* [https://text.sourcegraph.com/google-i-o-talk-building-sourcegraph-a-large-scale-code-search-cross-reference-engine-in-go-1f911b78a82e#.6lcu1k2kr Sourcegraph] is an implementation in Go of a regular expression search algorithm
 
* [https://codesearch.debian.net/ CodeSearch] is a Debian service that uses Google's code to search through the Debian source code base

Revision as of 08:17, 26 September 2016

Searching through the vast amount of code available in Software Heritage is a real technical and scientific challenge.

Indeed, when browsing code, one would like to go beyond classical full text search, for example via regular expression search or AST search.
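
To illustrate how AST search differs from full text search, here is a minimal sketch using Python's standard ast module. The helper name find_calls and the sample snippet are hypothetical and chosen only for illustration; this is not how Software Heritage or any of the tools listed on this page implement search.

```python
import ast
from typing import List, Tuple


def find_calls(source: str, func_name: str) -> List[Tuple[int, int]]:
    """Return (line, column) for every call to the plain name func_name.

    Hypothetical helper: it only matches calls such as open(...),
    not attribute calls such as os.open(...).
    """
    tree = ast.parse(source)
    positions = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
        ):
            positions.append((node.lineno, node.col_offset))
    return sorted(positions)


# Made-up sample: the word "open" appears in a comment, a call and a string.
sample = (
    "# open the file\n"
    "f = open('data.txt')\n"
    "msg = 'please open the door'\n"
)

# AST search: only the real function call is reported.
calls = find_calls(sample, "open")

# Full text search: every line containing the word is reported.
text_hits = [
    lineno
    for lineno, line in enumerate(sample.splitlines(), start=1)
    if "open" in line
]
```

On this sample the text search reports all three lines, while the AST search reports only the call on line 2, which is the kind of precision a syntax-aware query can offer.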

Searchcode

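Besides search tools included in GitHub or OpenHub, we know of:

  • http://www.boyter.org/category/searchcode/ and https://searchcode.com/ provide full-text code search over more than 20 billion lines of code; built by Ben E. Boyter in Sydney as a side project, with the latest work dated October 2015.

Sourcegraph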

Performing search using regular expressions has also been explored, see:

  • Sourcegraph is an implementation in Go of a regular expression search algorithm
  • CodeSearch is a Debian service that uses Google's code to search through the Debian source code base
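
As a rough illustration of the query model behind regular expression search, the sketch below scans a small in-memory corpus line by line with Python's re module. The function regex_search and the corpus contents are assumptions made for this example and do not reflect the internals of the services listed above; a linear scan like this would not scale to an archive the size of Software Heritage.

```python
import re
from typing import Dict, Iterator, Tuple


def regex_search(
    pattern: str, files: Dict[str, str]
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line_number, line) for every line matching pattern."""
    compiled = re.compile(pattern)
    # Sort by path so that results come out in a stable order.
    for path, text in sorted(files.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                yield path, lineno, line


# Hypothetical corpus standing in for archived source files.
corpus = {
    "a.c": "int main(void) {\n    return 0;\n}\n",
    "b.py": "def main():\n    pass\n",
}

# Find definitions or calls of something named "main" followed by "(".
hits = list(regex_search(r"\bmain\s*\(", corpus))
for path, lineno, line in hits:
    print(f"{path}:{lineno}: {line}")
```

The same query works across both languages in the corpus because it operates on text only; it knows nothing about syntax, which is where AST search comes in.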