PROV (Provenance)
The PROV standard defines a data model, serializations, and definitions to support the interchange of provenance information on the Web.[1] Here provenance includes all "information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness".
Abbreviation | PROV |
---|---|
Status | Published, W3C Recommendation |
Year started | 2013 |
Editors | Paul Groth, Luc Moreau |
Related standards | RDF, OWL, XML |
Domain | Semantic Web |
Website | www |
PROV is a set of recommended standards of the World Wide Web Consortium.[2] These include its data model,[3] an XML schema for that model, an OWL2 ontology mapping that model to RDF, and a mapping from that ontology to Dublin Core. It also includes a human-readable notation for provenance, methods for accessing and querying PROV provenance, and a few other subspecifications.[1]
PROV model overview
The core concepts defined by the PROV Model are Entity, Activity and Agent.[4] The remaining concepts are relationships between these (e.g. Derivation, Usage, Generation) or specializations (e.g. Person, Collection, Plan).
An Entity captures a thing in the world (in a particular state). An entity can be derived from another entity, and is generated by an Activity that used other entities.
An Agent (e.g. a person or software execution) is associated with the activity, and the entity generated by the activity is attributed to that agent.
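The relationships described above can be sketched in a few lines of Python. This is only an illustrative model, not part of any PROV specification or library: the class names, the `describe` helper, and the `ex:` identifiers are hypothetical, while the relation names (`wasDerivedFrom`, `wasGeneratedBy`, `used`, `wasAssociatedWith`, `wasAttributedTo`) are the ones PROV defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: str


@dataclass(frozen=True)
class Activity:
    id: str


@dataclass(frozen=True)
class Agent:
    id: str


@dataclass(frozen=True)
class Relation:
    kind: str     # PROV relation name, e.g. "wasGeneratedBy"
    subject: str  # identifier of the first argument
    object: str   # identifier of the second argument


def describe(entity, source, activity, agent):
    """Hypothetical helper: list the core relations linking one derived
    entity to its source entity, generating activity, and agent."""
    return [
        Relation("wasDerivedFrom", entity.id, source.id),
        Relation("wasGeneratedBy", entity.id, activity.id),
        Relation("used", activity.id, source.id),
        Relation("wasAssociatedWith", activity.id, agent.id),
        Relation("wasAttributedTo", entity.id, agent.id),
    ]


# Illustrative identifiers only; none of these come from the PROV documents.
relations = describe(
    Entity("ex:chart"),
    Entity("ex:dataset"),
    Activity("ex:plotting"),
    Agent("ex:alice"),
)
for r in relations:
    print(f"{r.kind}({r.subject}, {r.object})")
```

Keeping relations as separate records, rather than as fields on the entity, mirrors the way PROV treats each relationship as its own statement.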
PROV serializations
Provenance statements can be serialized in several PROV formats, all expressing the same PROV model. Some type and relationship names differ slightly from the PROV model concepts so that they are idiomatic in each format.
For example, PROV-N is a textual format that has a direct mapping to the PROV model:
document
  prefix ex <http://example.com/>
  entity(ex:e1)
  activity(ex:a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
  wasGeneratedBy(ex:e1, ex:a2, -)
endDocument
The above can be expressed as XML using the PROV-XML schema:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<prov:document xmlns:prov="http://www.w3.org/ns/prov#"
               xmlns:ex="http://example.com/">
  <prov:entity prov:id="ex:e1"/>
  <prov:activity prov:id="ex:a2">
    <prov:startTime>2011-11-16T16:00:00.000Z</prov:startTime>
    <prov:endTime>2011-11-16T16:00:01.000Z</prov:endTime>
  </prov:activity>
  <prov:wasGeneratedBy>
    <prov:entity prov:ref="ex:e1"/>
    <prov:activity prov:ref="ex:a2"/>
  </prov:wasGeneratedBy>
</prov:document>
The same statements can also be expressed with PROV-O, the mapping of the PROV model to the OWL2 ontology language, here serialized in the RDF format Turtle:
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/> .
ex:e1 a prov:Entity .
ex:a2 a prov:Activity ;
    prov:startedAtTime "2011-11-16T16:00:00.000Z"^^xsd:dateTime ;
    prov:endedAtTime "2011-11-16T16:00:01.000Z"^^xsd:dateTime .
ex:e1 prov:wasGeneratedBy ex:a2 .
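To illustrate that the serializations carry the same statements, the following sketch reads the PROV-XML example above with Python's standard `xml.etree.ElementTree` module and prints PROV-N-style statements. It is a minimal illustration, not a general converter: the function name `xml_to_statements` is hypothetical, only the three element types used in the example are handled, and timestamps are copied verbatim from the XML.

```python
import xml.etree.ElementTree as ET

PROV_NS = "http://www.w3.org/ns/prov#"

# The PROV-XML example from this section, reproduced as a string.
PROV_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<prov:document xmlns:prov="http://www.w3.org/ns/prov#"
               xmlns:ex="http://example.com/">
  <prov:entity prov:id="ex:e1"/>
  <prov:activity prov:id="ex:a2">
    <prov:startTime>2011-11-16T16:00:00.000Z</prov:startTime>
    <prov:endTime>2011-11-16T16:00:01.000Z</prov:endTime>
  </prov:activity>
  <prov:wasGeneratedBy>
    <prov:entity prov:ref="ex:e1"/>
    <prov:activity prov:ref="ex:a2"/>
  </prov:wasGeneratedBy>
</prov:document>"""


def q(name):
    """Return the ElementTree form of a name in the PROV namespace."""
    return f"{{{PROV_NS}}}{name}"


def xml_to_statements(xml_text):
    """Hypothetical helper: turn a small PROV-XML document into a list of
    PROV-N-style statement strings (entity, activity, wasGeneratedBy only)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    statements = []
    for child in root:
        if child.tag == q("entity"):
            statements.append(f"entity({child.get(q('id'))})")
        elif child.tag == q("activity"):
            # "-" is the PROV-N marker for an omitted optional value.
            start = child.findtext(q("startTime"), default="-")
            end = child.findtext(q("endTime"), default="-")
            statements.append(f"activity({child.get(q('id'))}, {start}, {end})")
        elif child.tag == q("wasGeneratedBy"):
            entity = child.find(q("entity")).get(q("ref"))
            activity = child.find(q("activity")).get(q("ref"))
            statements.append(f"wasGeneratedBy({entity}, {activity}, -)")
    return statements


statements = xml_to_statements(PROV_XML)
for line in statements:
    print(line)
```

For real documents, a dedicated tool such as those listed under Tooling is a better choice than hand-written parsing like this.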
Tooling
Software tools have been developed to convert between PROV formats and to generate and parse PROV documents in different programming languages:
- PROV Translator - web service
- PROV Toolbox - Java API and command line tool
- PROV Python library - Python API
References
1. "PROV-Overview". www.w3.org. Retrieved 2018-10-03.
2. Moreau, Luc; Groth, Paul; Cheney, James; Lebo, Timothy; Miles, Simon (2015-12-01). "The rationale of PROV". Web Semantics: Science, Services and Agents on the World Wide Web. 35: 235–257. doi:10.1016/j.websem.2015.04.001. ISSN 1570-8268.
3. "PROV-DM: The PROV Data Model". www.w3.org. Retrieved 2018-10-04.
4. "PROV Model Primer". www.w3.org. W3C. Retrieved 2018-10-17.