Building blocks you can inspect and evolve.
Versioneer
Versioneer helps organizations turn domain expertise, existing
storage, and pipelines into governed data products people can
trust and reuse. We implement the platform model around how your
users really work, from Earth Observation (EO) profiles for EO data
products to lifecycle automation for other data-intensive
domains.
Data platforms should be operated by facilitators close to the
end users: data stewards, domain leads, and platform teams who
understand the work. Versioneer supports them with open-source
building blocks, hands-on engineering, and a control plane
operated in the customer's cloud.
Domain Platform Engineering
A useful data platform is not a generic place where every file is copied. It is a set of domain-aware workflows that let teams register, check, version, approve, and publish data products while keeping ownership close to the people who understand them.
Versioneer gives those workflows a technical foundation: storage connections, snapshots, access rules, lifecycle states, metadata profiles, workflow definitions, and live status. In Earth Observation this becomes an EO profile with STAC, product states, publication gates, and long-term archival handover. For other domains, the same pattern captures their own product rules. Publication becomes a governed transition, not a late file move.
What You Get
We do not sell a distant black-box platform. Versioneer combines open components, implementation work, and an operated control plane so your platform team can run a domain data platform inside its own cloud.
DataLab, storage access, catalogs, credentials, policies, snapshots, and controllers can be adopted as composable infrastructure instead of a closed platform.
We help platform teams and data stewards define the domain profile: product model, lifecycle states, metadata, validation, access, publication gates, and the interfaces that make those rules usable by humans, pipelines, and agents.
Your data plane stays in your environment. Versioneer operates the control plane with your team, handling lifecycle automation, policy rollout, monitoring, updates, and support.
What We Believe
We have run cloud sandboxes for years. The lesson is simple and more important now: people, scripts, and AI agents all touch the same infrastructure. Domain rules should not live only in meetings and memory. Important changes should be written down, reviewed, applied automatically, and limited by policy.
Workspaces, data products, access, lifecycle states, credentials, storage, and services should be defined in one clear form, not scattered across tickets and manual scripts.
Kubernetes gives teams a shared way to run namespaces, policies, services, storage, sandboxes, agents, and the custom resources that describe them.
The desired state should be versioned, reviewed, and easy to trace. Controllers can then keep applying it without relying on hand-run commands.
Identity, permissions, quotas, security, approvals, and data lifecycle changes should be checked before work starts, especially when agents can act faster than humans can review.
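Checking a request before work starts can be sketched as a small admission-style function. This is a minimal illustration in Python, not Versioneer's implementation: the Policy and Request types, their fields, the role names, and the three rules are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical workspace policy; field names are illustrative only."""
    allowed_actions: dict[str, set[str]]  # role -> actions that role may perform
    storage_quota_gi: int                 # total storage the workspace may claim
    needs_approval: set[str] = field(default_factory=set)  # actions gated on review


@dataclass(frozen=True)
class Request:
    """Hypothetical request from a person, a script, or an agent."""
    actor: str
    role: str
    action: str
    storage_gi: int = 0
    approved: bool = False


def admit(request: Request, policy: Policy, used_gi: int) -> list[str]:
    """Return the reasons to reject the request; an empty list means admit."""
    reasons = []
    if request.action not in policy.allowed_actions.get(request.role, set()):
        reasons.append(f"role {request.role!r} may not {request.action!r}")
    if used_gi + request.storage_gi > policy.storage_quota_gi:
        reasons.append("storage quota exceeded")
    if request.action in policy.needs_approval and not request.approved:
        reasons.append(f"{request.action!r} requires approval")
    return reasons


policy = Policy(
    allowed_actions={"steward": {"stage", "publish"}, "agent": {"stage"}},
    storage_quota_gi=2048,
    needs_approval={"publish"},
)

# An agent may stage data within quota: no rejection reasons are returned.
print(admit(Request("bot-1", "agent", "stage", storage_gi=100), policy, used_gi=500))
# The same agent may not publish: the role rule and the approval rule both fire.
print(admit(Request("bot-1", "agent", "publish"), policy, used_gi=500))
```

Returning every reason, rather than stopping at the first failure, keeps the decision explainable to the person or agent that made the request.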
What It Takes
Register datasets while they are still changing. Do not wait until publication, force a full transfer first, or manage the lifecycle in spreadsheets and tickets.
Keep working data separate from immutable published snapshots, with clear states such as staged, committed, and published.
Encode product structure, metadata, validation, lifecycle states, access policy, and publication checks once. EO can have an EO profile; other domains get their own profile.
Describe inspection, validation, snapshots, publication, and permissions as code so controllers can apply them and keep status visible to facilitators and platform teams.
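The lifecycle described above can be sketched as a small state machine with a publication gate. The states staged, committed, and published come from the text; everything else here is an assumption made for illustration, including the allowed transitions, the DomainProfile and Product types, and the metadata keys.

```python
from dataclasses import dataclass, field
from typing import Callable

# States named in the text; which transitions are allowed is an assumption.
TRANSITIONS = {
    "staged": {"committed"},
    "committed": {"staged", "published"},
    "published": set(),  # published snapshots are immutable
}

Check = Callable[[dict], bool]


@dataclass
class DomainProfile:
    """Hypothetical profile: required metadata plus publication checks."""
    name: str
    required_metadata: set[str]
    publication_checks: list[Check] = field(default_factory=list)


@dataclass
class Product:
    name: str
    metadata: dict
    state: str = "staged"


def transition(product: Product, target: str, profile: DomainProfile) -> None:
    """Move a product to `target`, enforcing the profile's publication gate."""
    if target not in TRANSITIONS[product.state]:
        raise ValueError(f"{product.state} -> {target} is not allowed")
    if target == "published":
        missing = profile.required_metadata - set(product.metadata)
        if missing:
            raise ValueError(f"missing metadata: {sorted(missing)}")
        if not all(check(product.metadata) for check in profile.publication_checks):
            raise ValueError("publication check failed")
    product.state = target


# Illustrative profile; the required keys are placeholders, not a real EO schema.
eo = DomainProfile(
    name="eo",
    required_metadata={"license", "datetime"},
    publication_checks=[lambda m: bool(m["license"])],
)

scene = Product("scene-001", metadata={"datetime": "2024-01-01T00:00:00Z"})
transition(scene, "committed", eo)
try:
    transition(scene, "published", eo)  # blocked: no license recorded yet
except ValueError as err:
    print(err)  # missing metadata: ['license']

scene.metadata["license"] = "CC-BY-4.0"
transition(scene, "published", eo)
print(scene.state)  # published
```

Because the profile carries the rules, a second domain would define its own DomainProfile and reuse the same transition function unchanged.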
How We Facilitate
Our Earth Observation work with EOX, EarthCODE, and EOEPCA is one concrete profile. The same platform pattern applies wherever domain teams need governed products, shared infrastructure, and reproducible publication.
We translate domain rules into reusable contracts: product models, metadata expectations, validation hooks, lifecycle states, access policies, and publication gates. The Earth Observation (EO) profile is the example we are proving in practice.
Data stewards, platform teams, and domain leads should own the operating context. We support them with implementation, automation, and hands-on engineering so stewardship becomes visible, repeatable, and less manual.
Existing object storage, shared filesystems, scientific tools, machine learning libraries, and cloud infrastructure can be connected without forcing teams into one new system or into repeated full-dataset copies. The control plane coordinates lifecycle and policy; it does not require a central copy of all data.
apiVersion: pkg.internal/v1beta2
kind: Datalab
metadata:
  name: s-research-team
spec:
  users: [jane, jim, john]
  sessions: [{name: default, state: started}, {name: analysis, state: stopped}]
  vcluster: true
  persistence:
    storageClassName: sbs-default-retain
  data:
    readOnlyMount: true
  quota:
    memory: 64Gi
    storage: 2Ti
    budget: x-large
  registry: # OCI container registry
    enabled: true
    storage: 500Gi
  security:
    policy: privileged
    kubernetesRole: admin
    kubernetesAccess: false
  databases: # PostgreSQL
    pg0:
      names: [analytics, dev, prod]
      storage: 250Gi
      backupStorage: 750Gi
  documentStores: # MongoDB
    prod:
      storage: 200Gi
  cacheStores: # Redis
    prod:
      storage: 100Gi
  vectorStores: # Qdrant
    prod:
      storage: 50Gi
Source: DataLab examples on GitHub
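As a quick sanity check on a claim like this one, the storage figures can be added up and compared with the quota. The sketch below copies the relevant values from the manifest into a Python dictionary. Two things are assumptions, not documented DataLab behavior: the nesting of the dictionary, and the idea that service volumes count against quota.storage. The helper names are hypothetical.

```python
# Binary suffixes used by Kubernetes resource quantities.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(text: str) -> int:
    """Convert a binary-suffixed quantity such as '500Gi' to bytes."""
    number, suffix = text[:-2], text[-2:]
    if suffix not in UNITS:
        raise ValueError(f"unsupported quantity: {text!r}")
    return int(number) * UNITS[suffix]


# Values copied from the manifest above; the nesting is assumed.
spec = {
    "quota": {"storage": "2Ti"},
    "registry": {"storage": "500Gi"},
    "databases": {"pg0": {"storage": "250Gi", "backupStorage": "750Gi"}},
    "documentStores": {"prod": {"storage": "200Gi"}},
    "cacheStores": {"prod": {"storage": "100Gi"}},
    "vectorStores": {"prod": {"storage": "50Gi"}},
}


def requested_storage(spec: dict) -> int:
    """Sum the storage requested by the registry, databases, and stores, in bytes."""
    total = parse_quantity(spec["registry"]["storage"])
    for database in spec["databases"].values():
        total += parse_quantity(database["storage"])
        total += parse_quantity(database["backupStorage"])
    for kind in ("documentStores", "cacheStores", "vectorStores"):
        for store in spec[kind].values():
            total += parse_quantity(store["storage"])
    return total


requested = requested_storage(spec)
quota = parse_quantity(spec["quota"]["storage"])
print(f"{requested // 2**30}Gi requested of {quota // 2**30}Gi")  # 1850Gi requested of 2048Gi
print("fits" if requested <= quota else "over quota")  # fits
```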
DataLab Foundation
The DataLab creates shared cloud workspaces on Kubernetes for
the people and agents working closest to the data. A single
Datalab claim says who can enter, which sessions
exist, whether the lab needs its own cluster space, which
storage credentials are mounted, and which services are
available.
The important part is the Kubernetes resource model behind the manifest. The claim is a durable platform contract with metadata, desired state, observed state, labels, RBAC, admission checks, audit, reconciliation, and status. Teams get self-service inside a bounded workspace; the platform keeps governance around it.
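The relationship between desired state, observed state, and reconciliation can be sketched in a few lines. This is a simplified illustration, not the DataLab controller: the reconcile function, the action names, and the rule that new sessions begin stopped are assumptions. Only the session names and states are taken from the manifest.

```python
def reconcile(desired: dict[str, str], observed: dict[str, str]) -> list[tuple[str, str]]:
    """Return (action, session) pairs that move `observed` toward `desired`."""
    actions = []
    for name, state in sorted(desired.items()):
        current = observed.get(name)
        if current is None:
            actions.append(("create", name))
            current = "stopped"  # assumption: a new session begins stopped
        if current != state:
            actions.append(("start" if state == "started" else "stop", name))
    # Anything running that the claim no longer declares is removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(("delete", name))
    return actions


# Desired state as declared by the claim's `sessions` field.
desired = {"default": "started", "analysis": "stopped"}
# Observed state as a controller might find it in the live system.
observed = {"default": "stopped", "scratch": "started"}

print(reconcile(desired, observed))
# [('create', 'analysis'), ('start', 'default'), ('delete', 'scratch')]

# Once the live system matches the claim, there is nothing left to do.
print(reconcile(desired, desired))  # []
```

The second call shows the property that matters: reconciliation is safe to repeat, so a controller can run it continuously instead of relying on one-off commands.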
This is the welcome experience we want for data engineering: humans and agents arrive in the same governed sandbox, with domain data, tools, storage access, quotas, security settings, and databases already in place.
Researchers, engineers, and stewards get a ready lab with storage, services, permissions, and enough room to work.
Agents can use the same governed sandbox with scoped credentials, quotas, services, and policy boundaries.
Our Offering
Versioneer can run the control plane where your organization needs it: inside your cloud and governance boundary. The data plane remains your storage, compute, identity, network, and workspaces. Domain facilitators stay close to end users while Versioneer handles lifecycle automation, policy rollout, monitoring, updates, and support.
Storage, compute, identity, network boundaries, sensitive data, pipelines, and DataLab workspaces stay close to the teams, stewards, and platforms that use them.
Versioneer coordinates snapshots, validation, monitoring, policy changes, updates, and support without becoming the place where all data must be copied.
Capability Map
These are the pieces needed to run governed data products across clouds, teams, workspaces, and domain-specific publication workflows.
States, metadata, versions, policies, publication rules, and domain profiles.
Actions for validation, promotion, publication, archive, and access.
Several buckets and storage systems with shared identity and rules, without forcing one central copy.
Controllers compare declared state with the live system, then run validation, indexing, copying, lifecycle changes, policy updates, and status checks.
Metadata, catalog updates, webhooks, and long-term storage steps.
Public, embargoed, and licensed access, established through common OIDC and STS concepts with secrets handled safely.
Git history, lifecycle events, current status, and user/API views.
Version graphs, delta copy, reused unchanged content, and clear change reports.
Reusable Helm or Kustomize packages, docs, reference deployments, and demos.
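The delta copy and change report items in the list above can be illustrated with content hashing. This is a generic sketch of the technique using SHA-256 digests from the Python standard library, not a description of how Versioneer stores snapshots. The function names and file names are hypothetical.

```python
import hashlib


def snapshot(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the SHA-256 digest of its content."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def delta(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Classify paths so that only added or changed content needs copying."""
    return {
        "added": sorted(p for p in current if p not in previous),
        "changed": sorted(p for p in current if p in previous and previous[p] != current[p]),
        "unchanged": sorted(p for p in current if previous.get(p) == current[p]),
        "removed": sorted(p for p in previous if p not in current),
    }


v1 = snapshot({"a.tif": b"aaa", "b.tif": b"bbb"})
v2 = snapshot({"a.tif": b"aaa", "b.tif": b"BBB", "c.json": b"{}"})

report = delta(v1, v2)
print(report)
# {'added': ['c.json'], 'changed': ['b.tif'], 'unchanged': ['a.tif'], 'removed': []}

# Only added and changed paths have to be transferred for the new version.
to_copy = report["added"] + report["changed"]
print(to_copy)  # ['c.json', 'b.tif']
```

The same report doubles as a human-readable change summary, and the unchanged entries are the content a new version can reuse without copying.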