The data collected by Software Heritage is harvested from a diverse set of origins and organised into a data model that provides a uniform view of all the source code and its development history, independently of the data models used by those origins.
This uniform access to all the contents turns the Software Heritage archive into an unprecedented research corpus, uniquely positioned to enable a variety of scientific applications.
The goal of the SAPI working group is to interface with the research community, elicit the expected functionalities, and help identify those that are general enough to be incorporated into the Software Heritage data access API, separating them from specific functionalities that are best implemented as part of applications built on top of Software Heritage.
How long the group is expected to stay in operation; includes date of creation
Monitor the needs of the research communities that want to use the Software Heritage corpus.
Contribute to the Software Heritage data access API, proposing extensions corresponding to the functionalities that are best implemented inside the Software Heritage services, as opposed to those that are best implemented in the client services.
Raise awareness of the relevance of the Software Heritage corpus for research, and encourage research centers interested in accessing it to become full nodes in the Software Heritage network and host a local copy of the data.
Related working groups
This working group is related to: Metadata and Linked Data (MELD), Modeling and Ingesting Version control systems (MIV), Distribution, Replication and Query (DIREQ).
Documents produced by the working group will be listed in this section.
Active or planned connections to other initiatives and activities will be listed in this section.