semsynth

Toolkit to profile, describe and synthesize tabular datasets.

  • Unified model interface: run Metasyn, PyBNesian, and SynthCity models from a single config.

  • Uniform outputs: each model writes artifacts under dataset/models/<model-name>/.

  • Provider-aware metadata and UMAP visuals.

  • Static HTML report generation.
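
As an illustration of the uniform output layout described above, the following minimal sketch lists the model directories found under a dataset's models/ folder. It relies only on the dataset/models/<model-name>/ convention stated in the list; the helper name list_model_dirs is hypothetical and not part of SemSynth, and the files inside each model directory are not assumed.

```python
from pathlib import Path


def list_model_dirs(dataset_dir: Path) -> list[str]:
    """Return the sorted names of sub-directories of <dataset_dir>/models/.

    Hypothetical helper, not part of SemSynth. It only assumes the
    documented layout dataset/models/<model-name>/.
    """
    models_root = Path(dataset_dir) / "models"
    if not models_root.is_dir():
        # No models have been written for this dataset yet.
        return []
    return sorted(p.name for p in models_root.iterdir() if p.is_dir())
```

Calling list_model_dirs(Path("path/to/dataset")) returns one entry per model that has written artifacts, or an empty list if the models/ folder does not exist.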

Functions

semsynth.get_version() -> str

Return the installed SemSynth version string.
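
A short usage sketch for get_version() follows. The only SemSynth call used is semsynth.get_version(), as documented above. The wrapper semsynth_version_or_none is a hypothetical convenience added here so the snippet also runs in environments where the package is not installed.

```python
def semsynth_version_or_none():
    """Return semsynth.get_version() if the package is importable, else None.

    Hypothetical wrapper, not part of SemSynth.
    """
    try:
        import semsynth
    except ImportError:
        return None
    return semsynth.get_version()


version = semsynth_version_or_none()
print(version if version is not None else "semsynth is not installed")
```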

Modules

app

Minimal Flask application exposing SemSynth search and report actions.

backends

Backend implementations and shared interfaces.

catalog

Build DCAT catalogs for SemSynth reports and datasets.

datasets

downstream_fidelity

Downstream fidelity comparison between real and synthetic data.

jsonld_to_rdfa

Generate static HTML with RDFa from nested compact JSON-LD.

mappings

metadata

metrics

missingness

Missingness modeling utilities for backend generators.

models

Model configuration loading and run discovery utilities.

pipeline

Pipeline orchestration for dataset processing and reporting.

privacy_metrics

reporting

Utilities for rendering dataset reports.

reports_index

Stub module to satisfy autosummary; full report index generation lives elsewhere.

runtime

Runtime helpers for resolving SemSynth dependencies.

semmap

specs

Shared data structures for SemSynth dataset handling.

templates

Packaged report templates.

torch_compat

Compatibility helpers for optional PyTorch features.

umap_utils

Utilities for building and rendering UMAP embeddings lazily.

utils