
Data Platform from Scratch

Building a scalable analytics foundation on an open source stack — from raw data to business decisions.

Data engineering
Open source
Self-service analytics
ML-ready
dbt · Airflow · ClickHouse

0 → prod: full platform in 8–12 weeks from kickoff
100+: business-ready data models across domains
Days → hours: time from raw data to actionable insight

Who we build this for

Scale-up startups outgrowing spreadsheets
E-commerce with fragmented data sources
Foodtech & QSR with ops + digital data
Enterprises with legacy reporting chaos

The starting point is usually the same: data lives in production databases, Google Sheets, or third-party tools, and nobody has a single reliable number to answer even basic business questions. Analysts spend most of their time preparing data, not analyzing it.

What we build

A production-grade data platform that ingests, transforms, and serves data reliably — built entirely on battle-tested open source tools, with no vendor lock-in and full cost control.

Storage
- S3 / Azure Blob
- Data lake
- Raw → clean → curated zones
Transform
- dbt
- Versioned data models
- Tested & documented
Orchestration
- Airflow / Dagster
- Scheduled pipelines
- Monitoring & alerts
Analytics
- ClickHouse
- Sub-second queries
- Billions of rows

Unified data layer
All sources — app events, transactions, CRM, marketing — flow into one consistent, documented data model.
Self-service analytics
Analysts get clean, trusted tables and can answer business questions without involving engineers.
ML-ready datasets
Structured feature tables and historical snapshots ready to feed recommendation and prediction models.
Data quality built in
Automated tests, freshness checks, and lineage tracking, so the business can trust what it sees.
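
To make "freshness checks" concrete, here is a minimal sketch of the idea in plain Python: compare each table's last load time against an allowed maximum age and flag the ones that have fallen behind. The table names, thresholds, and the `stale_tables` helper are hypothetical and chosen for illustration only; on a real platform this kind of check would typically be configured in the transformation or orchestration layer instead of hand-written.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness thresholds per table; names and limits are illustrative.
MAX_AGE = {
    "orders": timedelta(hours=1),
    "crm_contacts": timedelta(hours=24),
}


def stale_tables(last_loaded, now, max_age=MAX_AGE):
    """Return the sorted names of tables whose latest load is older than allowed."""
    stale = []
    for table, limit in max_age.items():
        loaded_at = last_loaded.get(table)
        # A table with no recorded load counts as stale.
        if loaded_at is None or now - loaded_at > limit:
            stale.append(table)
    return sorted(stale)


# Example input: fixed timestamps so the result is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "orders": now - timedelta(minutes=30),      # within the 1-hour limit
    "crm_contacts": now - timedelta(hours=30),  # past the 24-hour limit
}
print(stale_tables(last_loaded, now))  # ['crm_contacts']
```

A check like this would normally run on a schedule and raise an alert when the returned list is non-empty.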

How we package delivery

Implementation and support plans

We deliver the platform as a working production system — not slides, not a reference architecture.

Pick the scope that matches where you are today; you can grow into the next tier when the platform earns its keep.

Foundation

Data lake + first pipelines — a real starting point, not a POC

Initial implementation: €18,000

+ support starting at €1,500/month

S3 / Azure Blob data lake (raw + clean zones)
1–2 source ingestion pipelines (database / API)
Airflow orchestration (managed or self-hosted)
dbt project scaffolding + 5–10 starter models
Production CI/CD for the data project
2-hour analyst onboarding session

4–6 weeks for the lake + first pipelines. Minimum: implementation + 1 month of support.

Full Platform

The default Drafted build — production-grade open-source data platform

Initial implementation: €55,000

+ support starting at €4,500/month

Everything in Foundation
All sources connected (events, transactions, CRM, marketing)
Curated zone with 100+ tested dbt models
ClickHouse analytics layer (sub-second on billions of rows)
Data quality framework: tests, freshness, lineage
Monitoring & alerting (pipeline + DQ incidents)
Self-service starter dashboards on Apache Superset
Documentation + 2 training sessions for analysts

8–12 weeks for the full platform. Minimum: implementation + 3 months of support.

Enterprise

When the platform is core infrastructure for the business

Initial implementation: €95,000

+ support starting at €7,000/month

Everything in Full Platform
ML feature store (versioned features, point-in-time correctness)
Multi-domain governance (data contracts, ownership, certification)
Embedded engineers in your team during delivery
Compliance hardening (PII tagging, RBAC, audit trails)
Dedicated SLAs and on-call coverage

14–20 weeks for the full enterprise rollout. Minimum: implementation + 6 months of support.

Other delivery options — coming soon

We're adding two more variants of this capability so you can match the platform to your existing tooling and budget.

Cloud-managed data platform: Same coverage on a managed warehouse stack (Snowflake / BigQuery / Databricks) — faster to ship, higher run cost.
Premium custom stack: Spark + Iceberg + Trino for petabyte-scale workloads with stricter governance and ML-first delivery.

What becomes possible

Unit economics on demand. The CFO opens a dashboard on Monday morning and sees margin by channel, country, and cohort — without waiting for an analyst to pull numbers from three different systems.
KPI monitoring without firefighting. When conversion drops in a specific market, the product team sees it within hours — not at the end-of-month review — and can act immediately.
Campaign performance in real time. Marketing launches a promo and tracks order volume, average check, and new user conversion as it happens — adjusting spend the same day.
ML that actually ships. With clean, versioned training data available, the data science team spends time on models — not on data wrangling — and gets experiments into production faster.

Result

Companies go from "we don't trust our numbers" to a reliable, scalable analytics foundation that grows with the business — without expensive proprietary tools or vendor dependency.

The platform becomes the backbone for dashboards, KPI tracking, strategic planning, and machine learning — all from a single, well-structured source of truth.

Ready to scope your data platform?

Let's map what your platform should look like and what it costs.

On the call we review your current data sources, business questions, and constraints. You leave with a concrete scope: which tier fits, what to ship in phase one, and a realistic timeline to production.

Source landscape and ingestion priorities
Storage and warehouse layout (lake / curated zones)
Orchestration and CI/CD model
Analytics layer and self-service surface
Data quality, monitoring and ownership
Tier selection, timeline and support model

Book a call
