makeprov.prov

Functions

dataclass([cls, init, repr, eq, order, ...])

Add dunder methods based on the fields defined in the class.

deepcopy(x[, memo, _nil])

Deep copy operation on arbitrary Python objects.

field(*[, default, default_factory, init, ...])

Return an object to identify dataclass fields.

pep503_normalize(name)

Normalize a package name according to PEP 503 rules.

project_metadata([dist_name])

Extract package metadata for provenance enrichment.

Classes

ActivityNode(id, type[, startedAtTime, ...])

AgentNode(id, type[, label, hasVersion, source])

Any(*args, **kwargs)

Special type indicating an unconstrained type.

BaseNode(id, type)

DepNode(id, type[, label])

EnvNode(id, type[, label, title, ...])

FileEntity(id, type[, format, extent, ...])

GraphEntity(id, type[, wasGeneratedBy, ...])

Path(*args, **kwargs)

PurePath subclass that can make system calls.

Prov(base_iri, name, provenance, results[, ...])

ProvDoc([provenance])

RDFMixin()

Provide JSON-LD serialization helpers for dataclasses.

datetime(year, month, day[, hour[, minute[, ...)

The year, month and day arguments are required.

timezone

Fixed offset from UTC implementation of tzinfo.

class makeprov.prov.ActivityNode(id, type, startedAtTime=None, endedAtTime=None, wasAssociatedWith=None, used=None, comment=None)

Bases: BaseNode

comment: str | None = None
endedAtTime: datetime | None = None
startedAtTime: datetime | None = None
used: tuple[FileEntity | str | dict[str, Any]] | None = None
wasAssociatedWith: AgentNode | str | dict[str, Any] | None = None
class makeprov.prov.AgentNode(id, type, label=None, hasVersion=None, source=None)

Bases: BaseNode

hasVersion: str | None = None
label: str | None = None
source: str | None = None
class makeprov.prov.BaseNode(id, type)

Bases: RDFMixin

id: str
type: Any
class makeprov.prov.DepNode(id, type, label=None)

Bases: BaseNode

label: str | None = None
class makeprov.prov.EnvNode(id, type, label='Python environment', title=None, hasVersion=None, requires=None)

Bases: BaseNode

hasVersion: str | None = None
label: str = 'Python environment'
requires: tuple[DepNode] | None = None
title: str | None = None
class makeprov.prov.FileEntity(id, type, format=None, extent=None, modified=None, identifier=None, wasGeneratedBy=None)

Bases: BaseNode

extent: int | None = None
format: str | None = None
identifier: str | None = None
modified: datetime | None = None
wasGeneratedBy: ActivityNode | str | dict[str, Any] | None = None
class makeprov.prov.GraphEntity(id, type, wasGeneratedBy=None, wasAttributedTo=None, generatedAtTime=None)

Bases: BaseNode

generatedAtTime: datetime | None = None
wasAttributedTo: AgentNode | str | dict[str, Any] | None = None
wasGeneratedBy: ActivityNode | str | dict[str, Any] | None = None
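
The node classes above all derive from BaseNode, which in turn derives from RDFMixin, described in the summary as providing JSON-LD serialization helpers for dataclasses. As a rough illustration of that idea only, the sketch below maps a dataclass to a JSON-LD style dict. SketchNode, to_jsonld_dict, and the mapping of id/type to @id/@type are assumptions made for this example; they are not makeprov's implementation and may differ from what RDFMixin actually does.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Any


@dataclass
class SketchNode:
    """Hypothetical stand-in for a node class; not part of makeprov."""

    id: str
    type: Any
    label: str | None = None


def to_jsonld_dict(node: Any) -> dict[str, Any]:
    """Map a dataclass instance to a JSON-LD style dict, skipping unset fields."""
    out: dict[str, Any] = {}
    for f in fields(node):
        value = getattr(node, f.name)
        if value is None:
            continue  # optional fields left at None are omitted
        # Assumption: id/type correspond to the JSON-LD keywords @id/@type.
        key = {"id": "@id", "type": "@type"}.get(f.name, f.name)
        out[key] = value
    return out


doc = to_jsonld_dict(SketchNode(id="urn:example:dep", type="prov:Entity"))
```

Omitting None-valued fields keeps the output compact; whether the library does the same is not stated in this reference.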
class makeprov.prov.Prov(base_iri, name, provenance, results, context=<factory>)

Bases: object

base_iri: str
context: dict
classmethod create(base_iri, name, run_id, t0, t1, inputs, outputs, results, success=True)

Assemble a provenance graph from rule execution details.

Parameters:
  • base_iri (str | None) – Base IRI for generated identifiers.

  • name (str) – Logical rule name.

  • run_id (str) – Unique identifier for this run, typically timestamp-based.

  • t0 (datetime) – Start time of the rule execution.

  • t1 (datetime) – End time of the rule execution.

  • inputs (list[Path]) – Input files consumed by the rule.

  • outputs (list[Path]) – Output files produced by the rule.

  • results (list[RDFMixin]) – Optional result graphs to embed alongside provenance records.

  • success (bool) – Whether the rule completed successfully.

Returns:

A populated Prov instance ready for serialization.

Return type:

Prov

Examples

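from datetime import datetime, timezone
from pathlib import Path

from makeprov.prov import Prov

# Illustrative timestamps; in practice these bracket the rule execution.
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)

prov = Prov.create(
    base_iri=None,
    name="uppercase",
    run_id="20240101T120000",
    t0=start,
    t1=end,
    inputs=[Path("input.txt")],
    outputs=[Path("output.txt")],
    results=[],
)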
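
The run_id parameter is described as typically timestamp-based, and the example uses the compact form "20240101T120000". A minimal sketch of producing such an identifier with the standard library follows; the helper name make_run_id and the exact format string are assumptions chosen to match that example, not something makeprov is documented to provide.

```python
from datetime import datetime, timezone


def make_run_id(moment: datetime) -> str:
    """Format a datetime as a compact, sortable identifier (hypothetical helper)."""
    # Assumed layout, chosen to match the "20240101T120000" example.
    return moment.strftime("%Y%m%dT%H%M%S")


start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
run_id = make_run_id(start)
```

Deriving run_id from the same datetime passed as t0 keeps the identifier and the recorded start time consistent.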
classmethod merge(provs)

Combine multiple provenance documents into one.

Parameters:

provs (list[Prov]) – Provenance objects to merge.

Returns:

A new object containing combined provenance and results from all inputs.

Return type:

Prov

Examples

merged = Prov.merge([prov_a, prov_b])
name: str
provenance: list[RDFMixin]
results: tuple[GraphEntity, list[RDFMixin]]
to_graph(frame='provenance')

to_jsonld(frame='provenance', with_context=False)

Return type:

dict

write(prov_path, fmt='json', frame='provenance', context=False, context_url=None)

Serialize provenance to disk.

Parameters:
  • prov_path (str | Path) – Output path (without extension) where the provenance document should be written.

  • fmt (str) – Output format, "json" for JSON-LD or "trig" for RDF TriG.

  • frame (str) – Which structure to make the primary subject of the JSON-LD document or the TriG named graph: "provenance" or "results".

  • context (bool) – Whether to include the JSON-LD context inline when writing JSON.

Returns:

The path to the written provenance document with extension.

Return type:

Path

Raises:

Exception – If the requested format is unsupported.

Examples

output = prov.write("prov/uppercase", fmt="json", context=True)
class makeprov.prov.ProvDoc(provenance=<factory>)

Bases: RDFMixin

provenance: tuple[RDFMixin]
makeprov.prov.pep503_normalize(name)

Normalize a package name according to PEP 503 rules.

Parameters:

name (str) – The distribution name to normalize.

Returns:

Lowercase, normalized package name with punctuation collapsed.

Return type:

str
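
For reference, the normalization rule that PEP 503 specifies can be sketched in a few lines: runs of "-", "_" and "." collapse to a single "-", and the result is lowercased. The function name normalize_name is an assumption for this illustration; the sketch shows the PEP's rule and is not necessarily how pep503_normalize is written.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a distribution name following the PEP 503 rule."""
    # Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


normalized = normalize_name("Friendly._.Bard")
```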

makeprov.prov.project_metadata(dist_name=None)

Extract package metadata for provenance enrichment.

Parameters:

dist_name (str | None) – Distribution name; when None, the caller’s package name is inferred from the module context.

Returns:

Distribution name, version, and dependency specifications. Empty values are returned when metadata cannot be found.

Return type:

tuple[str | None, str | None, list[str]]

Examples

name, version, requires = project_metadata("makeprov")
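
A tuple of this shape can be obtained from the standard library's importlib.metadata. The sketch below is one way to do that under stated assumptions: lookup_metadata is a hypothetical helper, it does not infer the caller's package when no name is given, and it is not makeprov's implementation.

```python
from __future__ import annotations

from importlib import metadata


def lookup_metadata(dist_name: str) -> tuple[str | None, str | None, list[str]]:
    """Return (name, version, requirements) for an installed distribution.

    Hypothetical helper for illustration only.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        # Mirror the documented behaviour: empty values when nothing is found.
        return None, None, []
    # `requires` is None when the distribution declares no dependencies.
    return dist.metadata["Name"], dist.version, list(dist.requires or [])


missing = lookup_metadata("surely-not-an-installed-distribution-xyz")
```

Returning empty values instead of raising lets provenance recording continue when metadata is unavailable, for example when code runs from a source checkout that was never installed.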