makeprov
Track file provenance in Python workflows using PROV semantics
Functions
| build | Recursively build a target and its prerequisites. |
| build_all | Build all concrete targets that have no dependents. |
| dry_run_build | Log the steps required to build target without executing rules. |
| explain | Log the rule used for each target in build order. |
| list_rules | Return registered rule names in alphabetical order. |
| list_targets | Return concrete targets produced by non-pattern rules. |
| main | Entry point for running registered CLI subcommands. |
| needs_update | Determine whether outputs are stale relative to dependencies. |
| new_session | Create a fresh session with isolated registries and buffers. |
| plan | Return the execution order for building a target. |
| resolve_target | Resolve a target to its registered rule and parameters. |
| root_targets | Return concrete targets that are not dependencies of other rules. |
| rule | Decorate a function as a build rule with automatic provenance. |
| to_dot | Render the dependency graph for target in DOT format. |
Classes
| CachedDownload | Input wrapper that lazily downloads and records source metadata. |
| Config | Base configuration container with TOML application helpers. |
| InDir | Input directory that tracks files declared within it. |
| InPath | Marker for input paths where "-" maps to stdin. |
| OutDir | Output directory that tracks files declared within it. |
| OutPath | Marker for output paths where "-" maps to stdout. |
| ProvPath | Filesystem path with first-class support for stream placeholders. |
| ProvenanceConfig | Runtime configuration for provenance generation. |
| RDFMixin | Provide JSON-LD serialization helpers for dataclasses. |
| Session | In-memory registries and buffers for a makeprov run. |
| span | Scope provenance buffering to a context or decorator. |
- class makeprov.CachedDownload(url, cache_path, *, headers=None, transform='prov:wasDerivedFrom')
Bases: InPath
Input wrapper that lazily downloads and records source metadata.
- open(mode='r', *args, **kwargs)
Open the path for reading, honoring stdin streams.
- Parameters:
mode (str) – File mode; defaults to read.
*args – Additional positional arguments forwarded to Path.open.
**kwargs – Additional keyword arguments forwarded to Path.open.
- Returns:
Readable file-like object.
- Return type:
IOBase
Examples
InPath("example.txt").open().read()
- class makeprov.InDir(*paths: str | bytes | ProvPath)
Bases: InPath
Input directory that tracks files declared within it.
The file() helper produces InPath instances rooted in the directory while recording them for provenance collection.
- class makeprov.InPath(*paths: str | bytes | ProvPath)
Bases: ProvPath
Marker for input paths where "-" maps to stdin.
Examples
from makeprov.paths import InPath
src = InPath("data/input.txt")
with src.open() as handle:
    _ = handle.read()
- open(mode='r', *args, **kwargs)
Open the path for reading, honoring stdin streams.
- Parameters:
mode (str) – File mode; defaults to read.
*args – Additional positional arguments forwarded to Path.open.
**kwargs – Additional keyword arguments forwarded to Path.open.
- Returns:
Readable file-like object.
- Return type:
IOBase
Examples
InPath("example.txt").open().read()
- class makeprov.OutDir(*paths: str | bytes | ProvPath)
Bases: OutPath
Output directory that tracks files declared within it.
The file() helper produces OutPath instances rooted in the directory while recording them for provenance collection.
- class makeprov.OutPath(*paths: str | bytes | ProvPath)
Bases: ProvPath
Marker for output paths where "-" maps to stdout.
Examples
from makeprov.paths import OutPath
dest = OutPath("data/output.txt")
dest.write_text("generated")
- as_inpath()
Convert an output marker into an input marker.
- Returns:
A new instance pointing to the same filesystem location.
- Return type:
- Raises:
ValueError – If the current path represents a stream.
Examples
from makeprov.paths import OutPath
OutPath("data/output.txt").as_inpath()
- open(mode='w', *args, **kwargs)
Open the path for writing, creating parent directories when needed.
- Parameters:
mode (str) – File mode; defaults to write.
*args – Additional positional arguments forwarded to Path.open.
**kwargs – Additional keyword arguments forwarded to Path.open.
- Returns:
Writable file-like object.
- Return type:
IOBase
Examples
with OutPath("output.txt").open("w") as handle:
    handle.write("hello")
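The "creating parent directories when needed" behavior can be pictured with plain pathlib. This is a minimal sketch under that reading, not makeprov's implementation; `open_for_writing` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def open_for_writing(path, mode="w", **kwargs):
    """Hypothetical helper: create missing parent directories, then open."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    return path.open(mode, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    # Neither "nested" nor "dir" exists before the call.
    target = Path(tmp) / "nested" / "dir" / "output.txt"
    with open_for_writing(target) as handle:
        handle.write("hello")
    written = target.read_text()
```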
- class makeprov.ProvPath(*paths: str | bytes | 'ProvPath')
Bases: PosixPath
Filesystem path with first-class support for stream placeholders.
Hyphen ("-") paths are treated as stdin/stdout streams but still behave like pathlib.Path instances for all other operations.
Examples
from makeprov.paths import ProvPath
p = ProvPath("-")
assert p.is_stream
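The stream convention described above (a hyphen selects stdin or stdout, anything else is a regular file) can be illustrated without makeprov. The sketch below is only an assumption about how such a dispatch might look; `open_path_or_stream` is a hypothetical name, and choosing the stream from the mode string is a guess rather than documented behavior.

```python
import sys
import tempfile
from pathlib import Path


def open_path_or_stream(path, mode="r"):
    """Hypothetical helper: "-" selects stdin or stdout, anything else a file."""
    if str(path) == "-":
        # Assumption: read modes map to stdin, every other mode to stdout.
        return sys.stdin if "r" in mode else sys.stdout
    return Path(path).open(mode)


with tempfile.TemporaryDirectory() as tmp:
    file_path = Path(tmp) / "example.txt"
    with open_path_or_stream(file_path, "w") as handle:
        handle.write("hello")
    with open_path_or_stream(file_path) as handle:
        content = handle.read()

# The hyphen never touches the filesystem.
stdout_selected = open_path_or_stream("-", "w") is sys.stdout
```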
- open(mode='r', *args, **kwargs)
Open the path while respecting stream semantics.
- Parameters:
mode (str) – File open mode, passed through to Path.open when not operating on a stream.
*args – Additional positional arguments forwarded to Path.open.
**kwargs – Additional keyword arguments forwarded to Path.open.
- Returns:
A file-like object for the requested mode.
- Return type:
IOBase
Examples
from makeprov.paths import ProvPath
with ProvPath("output.txt").open("w") as handle:
    handle.write("hello")
- class makeprov.ProvenanceConfig(base_iri=None, prov_dir='prov', prov_path=None, force=False, merge=True, dry_run=False, out_fmt='json', frame='provenance', context=False, context_url='https://w3id.org/makeprov/context')
Bases: Config
Runtime configuration for provenance generation.
- frame: Frame = 'provenance'
- out_fmt: ProvFormat = 'json'
- class makeprov.RDFMixin
Bases: object
Provide JSON-LD serialization helpers for dataclasses.
The mixin preserves unknown fields when round-tripping JSON-LD documents and offers convenient conversion to rdflib graphs.
Examples
from dataclasses import dataclass
from makeprov import RDFMixin
@dataclass
class Person(RDFMixin):
    id: str
    type: str = "ex:Person"
    name: str | None = None
person = Person(id="ex:alice", name="Alice")
jsonld = person.to_jsonld()
- classmethod fields_subclass_first()
Return dataclass fields with subclass members ordered first.
Examples
from dataclasses import dataclass
@dataclass
class Thing(RDFMixin):
    id: str
Thing.fields_subclass_first()
- classmethod from_jsonld(data)
Deserialize a JSON-LD mapping into the dataclass instance.
- Parameters:
data (dict) – Parsed JSON-LD object including optional @context.
- Returns:
An instance of cls populated from data.
- Return type:
Examples
person = Person.from_jsonld({"id": "ex:alice", "name": "Alice"})
- to_graph()
Convert this object to an rdflib.Graph from JSON-LD.
- Returns:
Graph containing triples representing the instance.
- Return type:
rdflib.Graph
- Raises:
RuntimeError – If rdflib is not installed.
Examples
graph = Person(id="ex:alice").to_graph()
- to_jsonld(with_context=True, include_extra=True)
Serialize the object to a JSON-LD-compatible mapping.
- Parameters:
- Returns:
JSON-LD representation of the object.
- Return type:
Examples
person = Person(id="ex:alice", name="Alice")
payload = person.to_jsonld()
- class makeprov.Session(rules_by_target=<factory>, rules_by_name=<factory>, pattern_rules=<factory>, commands=<factory>, prov_buffers=<factory>)
Bases: object
In-memory registries and buffers for a makeprov run.
- makeprov.build(target, _seen=None, *, session=None, **kwargs)
Recursively build a target and its prerequisites.
- makeprov.build_all(*, session=None)
Build all concrete targets that have no dependents.
- makeprov.dry_run_build(target, *, session=None)
Log the steps required to build target without executing rules.
- Return type:
- makeprov.explain(target, *, session=None)
Log the rule used for each target in build order.
- Return type:
- makeprov.list_rules(*, session=None)
Return registered rule names in alphabetical order.
- makeprov.list_targets(*, session=None)
Return concrete targets produced by non-pattern rules.
- makeprov.main(subcommands=None, conf_obj=None, argparse_kwargs={}, *, session=None, **kwargs)
Entry point for running registered CLI subcommands.
- makeprov.needs_update(outputs, deps)
Determine whether outputs are stale relative to dependencies.
- Parameters:
- Returns:
True if any output is missing or older than a dependency; the absence of dependencies returns False to avoid unnecessary rebuilds.
- Return type:
Examples
from makeprov.core import needs_update
if needs_update(["data/output.txt"], ["data/input.txt"]):
    regenerate()  # placeholder for your own rebuild step
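To make the documented return values concrete, here is a minimal sketch of a modification-time staleness check. It is not makeprov's code: `is_stale` is a hypothetical name, and the handling of empty output lists and nonexistent dependencies is an assumption, since the text above does not specify those cases.

```python
import os
import tempfile
from pathlib import Path


def is_stale(outputs, deps):
    """Hypothetical mtime-based check mirroring the documented return values."""
    outputs = [Path(p) for p in outputs]
    deps = [Path(p) for p in deps]
    if not deps:
        return False  # documented: no dependencies means no rebuild
    if not outputs:
        return False  # assumption: nothing to build, nothing stale
    if any(not out.exists() for out in outputs):
        return True  # a missing output is always stale
    dep_times = [dep.stat().st_mtime for dep in deps if dep.exists()]
    if not dep_times:
        return False  # assumption: nonexistent dependencies are ignored
    oldest_output = min(out.stat().st_mtime for out in outputs)
    return oldest_output < max(dep_times)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "input.txt"
    dst = Path(tmp) / "output.txt"
    src.write_text("raw")
    missing_output = is_stale([dst], [src])

    dst.write_text("built")
    os.utime(src, (1_000, 1_000))  # dependency is older
    os.utime(dst, (2_000, 2_000))  # output is newer
    fresh = is_stale([dst], [src])

    os.utime(src, (3_000, 3_000))  # dependency touched after the build
    stale = is_stale([dst], [src])
    no_deps = is_stale([dst], [])
```

Setting timestamps explicitly with `os.utime` keeps the demonstration independent of filesystem timestamp resolution.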
- makeprov.new_session()
Create a fresh session with isolated registries and buffers.
- Return type:
- makeprov.plan(target, *, session=None)
Return the execution order for building a target.
The plan is derived using resolve_target() for each dependency, ensuring concrete and templated rules are treated uniformly.
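An execution order of this kind is commonly computed as a depth-first post-order walk, so that every prerequisite appears before the target that needs it. The sketch below illustrates that general technique on a plain dictionary; `execution_order` and `deps_of` are hypothetical names, and whether makeprov's plan includes source files that have no rule is not stated here.

```python
def execution_order(target, deps_of):
    """Return targets in build order via depth-first post-order traversal."""
    order = []
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return
        if state.get(node) == "visiting":
            raise ValueError(f"dependency cycle through {node!r}")
        state[node] = "visiting"
        for dep in deps_of.get(node, ()):
            visit(dep)
        state[node] = "done"
        order.append(node)  # appended only after all prerequisites

    visit(target)
    return order


# Hypothetical dependency table: target -> list of prerequisites.
deps_of = {
    "report.html": ["stats.json", "plot.png"],
    "plot.png": ["stats.json"],
    "stats.json": ["raw.csv"],
}
order = execution_order("report.html", deps_of)
```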
- makeprov.resolve_target(target, *, session=None)
Resolve a target to its registered rule and parameters.
Concrete targets are looked up directly in RULES_BY_TARGET. Pattern rules are attempted in registration order using parse templates.
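The two-stage lookup described above (exact match first, then templates in registration order) can be sketched with the standard library. Note the swap: this example converts templates to regular expressions instead of using the parse library, supports only `{name}` and `{name:d}` fields, and returns captured values as strings. All names (`resolve`, `template_to_regex`, the rule tables) are hypothetical.

```python
import re

_FIELD = re.compile(r"\{(\w+)(?::(\w+))?\}")


def template_to_regex(template):
    """Turn a format-style template into a regex with named groups."""
    pieces, pos = [], 0
    for field in _FIELD.finditer(template):
        pieces.append(re.escape(template[pos:field.start()]))
        name, spec = field.group(1), field.group(2)
        body = r"\d+" if spec == "d" else r"[^/]+"
        pieces.append(f"(?P<{name}>{body})")
        pos = field.end()
    pieces.append(re.escape(template[pos:]))
    return re.compile("".join(pieces))


def resolve(target, concrete_rules, pattern_rules):
    """Concrete lookup first, then patterns in registration order."""
    if target in concrete_rules:
        return concrete_rules[target], {}
    for template, rule_name in pattern_rules:
        match = template_to_regex(template).fullmatch(target)
        if match:
            return rule_name, match.groupdict()
    return None


concrete_rules = {"data/summary.txt": "summarize"}
pattern_rules = [
    ("data/{sample:d}.txt", "per_sample"),  # registered first, tried first
    ("data/{name}.txt", "generic"),
]

concrete = resolve("data/summary.txt", concrete_rules, pattern_rules)
numbered = resolve("data/7.txt", concrete_rules, pattern_rules)
named = resolve("data/notes.txt", concrete_rules, pattern_rules)
missing = resolve("other/file.bin", concrete_rules, pattern_rules)
```

Because the concrete table is consulted first, `data/summary.txt` never reaches the pattern rules even though the second template would also match it.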
- makeprov.root_targets(*, session=None)
Return concrete targets that are not dependencies of other rules.
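As an illustration of what "not dependencies of other rules" means, the following sketch computes such roots from a plain dependency table. `find_root_targets` and `deps_of` are hypothetical, and sorting the result is an assumption made only to keep the output deterministic.

```python
def find_root_targets(deps_of):
    """Targets that have a rule but are not listed as a dependency anywhere."""
    used_as_dependency = {dep for deps in deps_of.values() for dep in deps}
    return sorted(t for t in deps_of if t not in used_as_dependency)


# Hypothetical dependency table: target -> list of prerequisites.
deps_of = {
    "report.html": ["stats.json"],
    "stats.json": ["raw.csv"],
    "archive.zip": ["raw.csv"],
}
roots = find_root_targets(deps_of)
```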
- makeprov.rule(*, name=None, phony=False, base_iri=None, prov_dir=None, prov_path=None, force=None, dry_run=None, out_fmt=None, frame=None, config=None, context=None, merge=None, session=None)
Decorate a function as a build rule with automatic provenance.
- Parameters:
name (str | None) – Logical name for the rule; defaults to the function name.
phony (bool) – When True, do not require an OutPath parameter and always execute the wrapped function regardless of timestamps. Useful for meta-rules such as aggregators or reporting commands.
base_iri (str | None) – Base IRI for provenance identifiers; overrides global configuration when provided.
prov_dir (str | None) – Directory where provenance documents are saved.
prov_path (str | None) – Explicit path for the provenance file; overrides prov_dir when set.
force (bool | None) – When True, always run the rule regardless of timestamps.
dry_run (bool | None) – When True, log activity without executing the wrapped function.
out_fmt (Optional[Literal['json', 'trig']]) – Output format for provenance files ("json" or "trig").
frame (Optional[Literal['provenance', 'results']]) – Which structure becomes the primary subject of the JSON-LD document or TriG named graph: "provenance" or "results".
config (ProvenanceConfig | None) – Configuration object to use instead of the process-wide configuration returned by makeprov.ProvenanceConfig.
context (bool | None) – Whether to embed the JSON-LD context in the output when writing provenance.
merge (bool | None) – When True, buffer provenance for this rule and any nested rule calls, emitting a single merged document. Defaults to the configured merge behavior.
session (Session | None) – Registry and buffer container to use instead of the process-wide default session. Passing a dedicated session isolates rules, commands, and provenance buffers from other runs.
- Returns:
A decorator that wraps the target function and registers it as a rule when outputs are discoverable from annotations. Templated InPath or OutPath defaults using str.format style placeholders (e.g. "data/{sample:d}.txt") register as pattern rules and are resolved dynamically for matching targets.
- Return type:
Callable
Examples
Annotate parameters with InPath and OutPath to let the decorator infer dependencies:
from makeprov import InPath, OutPath, rule
@rule()
def uppercase(src: InPath, dst: OutPath):
    dst.write_text(src.read_text().upper())
uppercase("data/input.txt", "data/output.txt")
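How a decorator might discover inputs and outputs from annotations can be shown with the standard library alone. This is a sketch of the general idea, not makeprov's mechanism: the marker classes `In` and `Out` and the helper `classify_parameters` are hypothetical stand-ins for illustration.

```python
import inspect
from typing import get_type_hints


class In(str):
    """Hypothetical stand-in for an input marker type."""


class Out(str):
    """Hypothetical stand-in for an output marker type."""


def classify_parameters(func):
    """Split parameter names into inputs and outputs by their annotations."""
    hints = get_type_hints(func)
    names = list(inspect.signature(func).parameters)
    inputs = [n for n in names if hints.get(n) is In]
    outputs = [n for n in names if hints.get(n) is Out]
    return inputs, outputs


def uppercase(src: In, dst: Out, encoding: str = "utf-8"):
    """Example rule body; never called in this sketch."""


inputs, outputs = classify_parameters(uppercase)
```

Parameters with other annotations, such as `encoding` here, fall into neither group.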
- class makeprov.span(label, prov_path=None, *, frame=None, context=None, session=None)
Bases: ContextDecorator
Scope provenance buffering to a context or decorator.
Starting a span begins a provenance buffer; exiting flushes it to disk or merges it into the parent buffer. This avoids manual buffer start/flush orchestration around backend calls.
- makeprov.to_dot(target, *, session=None)
Render the dependency graph for target in DOT format.
- Return type:
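For readers unfamiliar with DOT, the sketch below renders a small dependency table as a Graphviz digraph. It is illustrative only: `render_dot` and `deps_of` are hypothetical, and the edge direction (dependency pointing to dependent) and graph name are assumptions, not makeprov's output format.

```python
def render_dot(target, deps_of):
    """Emit a DOT digraph with one edge per dependency reachable from target."""
    lines = ["digraph deps {"]
    seen = set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in deps_of.get(node, ()):
            lines.append(f'  "{dep}" -> "{node}";')
            visit(dep)

    visit(target)
    lines.append("}")
    return "\n".join(lines)


# Hypothetical dependency table: target -> list of prerequisites.
deps_of = {"report.html": ["stats.json"], "stats.json": ["raw.csv"]}
dot = render_dot("report.html", deps_of)
```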
Modules