Difference between revisions of "Software ontologies"

From Software Heritage Wiki
== Definition of Software Ontology ==
In computer science, the term ontology refers to a structure of concepts or entities within a domain, organized by relationships. [https://en.wikipedia.org/wiki/Ontology_%28information_science%29]

The specification takes the form of the definitions of a representational vocabulary (classes, relations, and so forth), which provide meanings for the vocabulary and formal constraints on its coherent use. [http://tomgruber.org/writing/ontology-definition-2007.htm]

A software ontology is a classification of categories describing software, with explicit specifications of its entities and relationships.

We are working on a list of all ontologies, vocabularies and metadata formats describing software. Each entry is assigned one or more of the following contexts:
* software ontology: a well-defined ontology using XML/RDF, with a direct link to the ontology itself
* linked data: vocabularies used by search engines
* generic: metadata terms used in other contexts as well as in the software domain
* research: metadata terms used in a research context, in particular for software citation
* catalog: metadata terms used in a specific catalog
* dev: metadata terms used in the development process. They can be contained in the software source code package and usually depend on the programming language

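
The context categories above can be modelled as a small data structure. The following is only a minimal sketch in Python: the names <code>Context</code>, <code>Entry</code>, <code>parse_contexts</code> and <code>with_context</code> are hypothetical and not part of any Software Heritage code; the three sample entries reuse context values from the table on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    """The six contexts used on this page to categorize entries."""
    SOFTWARE_ONTOLOGY = "software ontology"
    LINKED_DATA = "linked data"
    GENERIC = "generic"
    RESEARCH = "research"
    CATALOG = "catalog"
    DEV = "dev"


@dataclass(frozen=True)
class Entry:
    """One catalogued vocabulary (hypothetical, illustrative fields only)."""
    name: str
    contexts: frozenset


def parse_contexts(cell: str) -> frozenset:
    """Parse a comma-separated context cell such as 'generic, research'."""
    return frozenset(Context(part.strip()) for part in cell.split(","))


def with_context(entries, context):
    """Return the names of the entries tagged with the given context."""
    return [entry.name for entry in entries if context in entry.contexts]


entries = [
    Entry("ADMS.SW", parse_contexts("software ontology, research")),
    Entry("TOTEM", parse_contexts("generic, catalog")),
    Entry("Maven", parse_contexts("dev")),
]
print(with_context(entries, Context.RESEARCH))
```

Using an enumeration means that a misspelled context (for example "reserch") raises a <code>ValueError</code> when the cell is parsed, instead of silently creating a seventh category.
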
== Ontologies, vocabularies and metadata terms ==
 
Pointers to relevant software ontologies and software metadata that might be used as inspiration for the upper-level/metadata part of the [[Software Heritage]] data model.
  
{| class="wikitable sortable"
|-
! name
! description
! context
! created
! last update
! version
! links
! in CodeMeta crosswalk table
! file name
|-
| ADMS.SW
| Asset Description Metadata Schema for Software, and specifically FOSS
| software ontology, research
| 2012
| 2015
| 1.00
| [https://joinup.ec.europa.eu/asset/adms_foss/asset_release/admssw-05 global description], [http://dropbox.ashlock.us/private/ADMS.SW_Specification_1.00.pdf specification], [https://joinup.ec.europa.eu/svn/adms_foss/adms_sw_v1.00/rdf2html.xsl rdf2html]
| no
| not found
|-
| DOAP
| Description of a Project
| software ontology
| 2010
| 2017
| no version
| [https://github.com/ewilderj/doap on github], [https://en.wikipedia.org/wiki/DOAP on wikipedia], [https://github.com/ewilderj/doap rdf]
| waiting pull request
| doap.xml, doap.json
|-
| [[Schema.org]]
| Vocabularies for structured data use on the internet and beyond. Code, SoftwareSourceCode and SoftwareApplication are the main classes describing software.
| linked data
| 2011
| 2017
| 3.2
| [https://schema.org/ homepage], [https://en.wikipedia.org/wiki/Schema.org wikipedia], [https://github.com/schemaorg/schemaorg github]
| no, but used by CodeMeta
|-
| SEON
| A family of Software Evolution ONtologies
| software ontology
| 2012
| unknown
| no version
| [http://www.se-on.org/ homepage], [http://se-on.org/ontologies/index.html documentation]
| no
|-
| SWOP
| The Software Ontology Project "is a resource for describing software tools, their types, tasks, versions, provenance and data associated." Funded by [https://www.jisc.ac.uk/ JISC].
| software ontology
| 2011
| 2016
| -
| [https://softwareontology.wordpress.com/ SWOP], [http://theswo.sourceforge.net/ sourceforge project], [https://softwareontology.wordpress.com/2011/02/23/an-overview-of-sword/ overview blog post]
| no
|-
| [[TOTEM]]
| Trustworthy Online Technical Environment Metadata Database, for digital objects in general
| generic, catalog
| 2008
| unknown
| unknown
| [http://www.keep-totem.co.uk/ homepage]
| no
|-
| Wikidata
| Provides data about software through item Q7397
| linked data, catalog
| 2012
| 2017
| no version
| [https://www.wikidata.org/wiki/Wikidata:WikiProject_Informatics/Software#Properties generic software properties], [https://www.wikidata.org/wiki/Wikidata:WikiProject_Informatics/FLOSS#Properties FOSS-specific properties], [https://github.com/Wikidata github], [https://www.wikidata.org/wiki/Q128751 source code page]
| no
|-
| DBpedia
| Multi-domain ontology.
Mappings:
* between DBpedia and schema.org, without Software entities
* between DBpedia and Wikidata, and a wikiparser
| linked data
| 2007
| 2015
| 3.11
| [http://dbpedia.org/page/Software software page], [http://dbpedia.org/page/Source_code source code page]
| no
|-
| DataCite
| The schema is not software-specific
| generic, research
| 2009
| 2016
| 4.0
| [https://schema.datacite.org/meta/kernel-4.0/metadata.xsd schema], [http://rrr.cs.st-andrews.ac.uk/wp-content/uploads/2015/10/guidelines-software-identification.pdf guidelines]
| yes
|-
| Dublin Core
| -
| generic
| -
| -
| -
| -
| yes
|-
| Zenodo
| Export formats: MARCXML, Dublin Core, and DataCite Metadata Schema
| generic, research
| -
| -
| -
| [https://guides.github.com/activities/citable-code/ github citable-code]
| yes
| no file in source code
|-
| Figshare
| Making research outputs available online
| generic
| 2011
| -
| -
| -
| yes
| no file in source code
|-
| code.jsonld
| Listed in the CodeMeta crosswalk table, but the source could not be found
| -
| -
| -
| -
| -
| yes
|-
| R Package Description
| DESCRIPTION file stored in an R package, containing important metadata
| dev
| -
| -
| -
| -
| yes
| TBD
|-
| Debian Package
| An effort to collect meta-information about projects, trying to use the DOAP vocabulary. The metadata is captured in YAML format in a file called debian/upstream/metadata. Another file using the EDAM ontology can be provided at debian/upstream/edam.
| dev
| -
| 2017
| -
| [https://wiki.debian.org/UpstreamMetadata wiki]
| yes
| debian/upstream/metadata
|-
| debtags
| Debtags are terms used to describe package content in an informal way.
| dev
| 2005
| 2017
| no version
| [https://anonscm.debian.org/cgit/debtags/vocabulary.git/tree/debian-packages vocabulary], [https://wiki.debian.org/Debtags/FAQ wiki]
| no
| not in source code
|-
| Python Distutils (PyPI)
| The Python Package Index (PyPI) stores metadata that describes packages. A setup.py file is used when a package is packaged and distributed with Distutils (the standard for distributing Python modules).
| dev
| 1999
| 2017
| Setuptools 35.0.1
| [https://setuptools.readthedocs.io/en/latest/setuptools.html setuptools doc], [https://en.wikipedia.org/wiki/Python_Package_Index wikipedia], [https://martin-thoma.com/analyzing-pypi-metadata/ analyzing pypi metadata 2015]
| yes
| setup.py
|-
| Trove Software Map
| Distutils Trove Classification
| dev
| 1998
| 2002
| -
| [http://www.catb.org/~esr/trove/ Trove project], [https://www.python.org/dev/peps/pep-0301/ usage with Python]
| yes
|-
| CPAN::Meta
| The Comprehensive Perl Archive Network (CPAN) is used somewhat like a package manager. The CPAN::Meta file, known as META.yml or META.json, is typically created by other tools such as Module::Build and ExtUtils::MakeMaker. The raw form of the metadata does not exist in the source code.
| dev
| 2003
| -
| 2.150010
| [https://en.wikipedia.org/wiki/CPAN wikipedia], [http://www.cpan.org/ homepage], [https://github.com/Perl-Toolchain-Gang/CPAN-Meta on github]
| yes
| META.json, META.yml, .spec
|-
| Ruby Gem
| A Ruby specification, called a gemspec, that can hold arbitrary metadata in a .gemspec file or a Rakefile
| dev
| 2006
| 2015
| 0.3.1
| [http://guides.rubygems.org/specification-reference/ guide], [https://github.com/pjump/gemspec on github]
| yes
| .gemspec, Rakefile
|-
| JavaScript - npm
| A package.json file containing the specification of an npm package. Together, the name and version of the package form a unique identifier; changes to the package should come with changes to the version.
| dev
| 2010
| 2017
| -
| [https://docs.npmjs.com/files/package.json documentation], [https://nodesource.com/blog/the-basics-of-package-json-in-node-js-and-npm/ guide]
| yes
| package.json
|-
| Maven
| pom.xml file in the project root. POM stands for "Project Object Model", an XML representation of a Maven project.
| dev
| -
| 2017
| 3.5.0
| [https://maven.apache.org/pom.html POM reference], [http://maven.apache.org/ref/3.5.0/maven-model/maven.html maven model]
| yes
| pom.xml
|-
| Octave
| DESCRIPTION file in a package, containing various information about the package
| dev
| -
| -
| -
| [https://www.gnu.org/software/octave/doc/interpreter/The-DESCRIPTION-File.html howto]
| yes
| DESCRIPTION
|-
| CodeMeta
| A minimal metadata schema for scientific software and code, in JSON and XML
| software ontology, research, linked data
| 2014
| 2017
| no version
| [http://codemeta.github.io/ homepage], [https://github.com/codemeta/codemeta on github], [https://raw.githubusercontent.com/codemeta/codemeta/master/data/codemeta-json-schema.json schema]
| yes
| code.json
|-
| MARC
| MAchine-Readable Cataloging
* MIT is using MARC records
| generic
| 1960s
| -
| -
| [http://www.loc.gov/marc/ homepage]
| no
<!--
|-
| name
| description
| category
| created
| last update
| version
| links
| in crosswalk table
-->
|}
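
The "file name" column of the table can be read as a lookup from metadata file names to formats. The sketch below, in Python, shows one possible way to detect which of these formats appear in a list of relative paths from a source tree. It is an illustration only: the function <code>detect_formats</code> and the three lookup tables are hypothetical names, and the mapping simply copies file names from the table. <code>DESCRIPTION</code> is deliberately left out because the table lists it for both R and Octave, and <code>Rakefile</code> is left out on the assumption that its presence alone does not imply a gem.

```python
from pathlib import PurePosixPath

# File names copied from the "file name" column of the table above.
# The mapping itself is an illustrative assumption, not an official registry.
EXACT_NAMES = {
    "doap.xml": "DOAP",
    "doap.json": "DOAP",
    "setup.py": "Python Distutils (PyPI)",
    "META.json": "CPAN::Meta",
    "META.yml": "CPAN::Meta",
    "package.json": "JavaScript - npm",
    "pom.xml": "Maven",
    "code.json": "CodeMeta",
}
# Entries identified by their full relative path rather than by file name.
EXACT_PATHS = {"debian/upstream/metadata": "Debian Package"}
# Entries identified by file extension.
SUFFIXES = {".gemspec": "Ruby Gem"}


def detect_formats(paths):
    """Return the sorted names of metadata formats found among relative paths."""
    found = set()
    for raw in paths:
        path = PurePosixPath(raw)
        if str(path) in EXACT_PATHS:
            found.add(EXACT_PATHS[str(path)])
        if path.name in EXACT_NAMES:
            found.add(EXACT_NAMES[path.name])
        if path.suffix in SUFFIXES:
            found.add(SUFFIXES[path.suffix])
    return sorted(found)


print(detect_formats(["README", "package.json", "lib/foo.gemspec"]))
```

Matching on the file name alone means that a nested <code>package.json</code> is counted as well; a stricter variant could restrict the check to files in the project root.
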

PRONOM - [https://www.wikidata.org/wiki/User:YULdigitalpreservation/Software]

DOLCE - Outdated?

CSO - Outdated?
  
 
[[Category:Related work]]
[[Category:Software ontology]]
[[Category:Software metadata]]

Revision as of 14:57, 25 April 2017
