Capability · Data platform
Data Platform from Scratch
Building a scalable analytics foundation on an open-source stack — from raw data to business decisions.
- Data engineering
- Open source
- Self-service analytics
- ML-ready
- dbt · Airflow · ClickHouse
0 → prod
Full platform in 8–12 weeks from kickoff
100+
Business-ready data models across domains
Days → hours
Time from raw data to actionable insight
Who we build this for
- Scale-up startups outgrowing spreadsheets
- E-commerce with fragmented data sources
- Foodtech & QSR with ops + digital data
- Enterprises with legacy reporting chaos
The starting point is usually the same: data lives in production databases, Google Sheets, or third-party tools — and nobody has a reliable single number to answer even basic business questions. Analysts spend most of their time preparing data, not analyzing it.
What we build
A production-grade data platform that ingests, transforms, and serves data reliably — built entirely on battle-tested open-source tools, with no vendor lock-in and full cost control.
- Storage: S3 / Azure Blob data lake with raw → clean → curated zones
- Transform: dbt with versioned, tested, and documented data models
- Orchestration: Airflow / Dagster with scheduled pipelines, monitoring, and alerts
- Analytics: ClickHouse with sub-second queries on billions of rows
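The raw → clean → curated zone flow above follows one simple pattern: raw data lands as-is, the clean zone normalizes types and drops invalid rows, and the curated zone serves business-ready aggregates. A minimal stdlib-only sketch of that idea (the order records and field names are invented for illustration, not from a delivered platform):

```python
from collections import defaultdict

# Hypothetical raw order events as they might land in the lake's raw zone.
raw_orders = [
    {"order_id": "1", "country": "DE", "amount": "49.90", "status": "paid"},
    {"order_id": "2", "country": "de", "amount": "19.00", "status": "paid"},
    {"order_id": "3", "country": "NL", "amount": "bad",   "status": "canceled"},
]

def to_clean(record):
    """Clean zone: normalize types and values; reject rows that fail validation."""
    try:
        return {
            "order_id": record["order_id"],
            "country": record["country"].upper(),
            "amount": float(record["amount"]),
            "status": record["status"],
        }
    except (KeyError, ValueError):
        return None  # quarantined in a real pipeline; dropped in this sketch

def to_curated(clean_records):
    """Curated zone: a business-ready aggregate, e.g. paid revenue per country."""
    revenue = defaultdict(float)
    for r in clean_records:
        if r["status"] == "paid":
            revenue[r["country"]] += r["amount"]
    return dict(revenue)

clean = [c for r in raw_orders if (c := to_clean(r)) is not None]
print(to_curated(clean))  # paid revenue per country, e.g. DE ≈ 68.90
```

In the actual platform these steps run as dbt models orchestrated by Airflow, with rejected rows surfaced by data-quality tests rather than silently dropped.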
Unified data layer
All sources — app events, transactions, CRM, marketing — flow into one consistent, documented data model.
Self-service analytics
Analysts get clean, trusted tables and can answer business questions without involving engineers.
ML-ready datasets
Structured feature tables and historical snapshots ready to feed recommendation and prediction models.
Data quality built in
Automated tests, freshness checks, and lineage tracking — so the business can trust what it sees.
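A freshness check reduces to one rule: the newest row in a table must be younger than an agreed threshold, or an alert fires. A minimal sketch in plain Python (the table, timestamps, and two-hour threshold are illustrative assumptions; in practice this is configured declaratively, e.g. in dbt source freshness settings):

```python
from datetime import datetime, timedelta, timezone

def check_freshness(latest_loaded_at, max_age, now=None):
    """Return (ok, age): ok is False when the newest row is older than max_age."""
    now = now or datetime.now(timezone.utc)
    age = now - latest_loaded_at
    return age <= max_age, age

# Illustrative check: the orders table must be at most 2 hours stale.
now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
ok, age = check_freshness(
    latest_loaded_at=datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc),
    max_age=timedelta(hours=2),
    now=now,
)
print(ok, age)  # False 2:30:00 — stale; this is what pages the on-call channel
```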
How we package delivery
Implementation and support plans
We deliver the platform as a working production system — not slides, not a reference architecture.
Pick the scope that matches where you are today; you can grow into the next tier when the platform earns its keep.
Foundation
Data lake + first pipelines — a real starting point, not a POC
+ support starting at €1,500/month
- S3 / Azure Blob data lake (raw + clean zones)
- 1–2 source ingestion pipelines (database / API)
- Airflow orchestration (managed or self-hosted)
- dbt project scaffolding + 5–10 starter models
- Production CI/CD for the data project
- 2-hour analyst onboarding session
4–6 weeks for the lake + first pipelines. Minimum: implementation + 1 month of support.
Full Platform
The default Drafted build — production-grade open-source data platform
+ support starting at €4,500/month
- Everything in Foundation
- All sources connected (events, transactions, CRM, marketing)
- Curated zone with 100+ tested dbt models
- ClickHouse analytics layer (sub-second on billions of rows)
- Data quality framework: tests, freshness, lineage
- Monitoring & alerting (pipeline + DQ incidents)
- Self-service starter dashboards on Apache Superset
- Documentation + 2 training sessions for analysts
8–12 weeks for the full platform. Minimum: implementation + 3 months of support.
Enterprise
When the platform is core infrastructure for the business
+ support starting at €7,000/month
- Everything in Full Platform
- ML feature store (versioned features, point-in-time correctness)
- Multi-domain governance (data contracts, ownership, certification)
- Embedded engineers in your team during delivery
- Compliance hardening (PII tagging, RBAC, audit trails)
- Dedicated SLAs and on-call coverage
14–20 weeks for the full enterprise rollout. Minimum: implementation + 6 months of support.
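Point-in-time correctness, named in the feature-store line above, means a training example may only see feature values that existed at its label timestamp — never later ones, which would leak the future into the model. A stdlib-only sketch of that as-of lookup (the feature history and timestamps are invented for illustration):

```python
import bisect

# Hypothetical feature history: (effective_timestamp, value), sorted by time —
# e.g. a user's rolling 30-day order count, updated by the pipeline.
feature_history = [
    (1, 3),   # at t=1 the feature value was 3
    (5, 7),   # updated to 7 at t=5
    (9, 2),   # updated to 2 at t=9
]

def as_of(history, label_ts):
    """Latest feature value effective at or before label_ts (point-in-time join).

    Returns None when no value existed yet; substituting a later value here
    would leak future information into the training set.
    """
    times = [ts for ts, _ in history]
    idx = bisect.bisect_right(times, label_ts) - 1
    return history[idx][1] if idx >= 0 else None

print(as_of(feature_history, 6))   # 7    (value set at t=5 is still current)
print(as_of(feature_history, 9))   # 2    (the t=9 update is visible at t=9)
print(as_of(feature_history, 0))   # None (no value existed yet)
```

A production feature store applies this lookup per entity across versioned feature tables; the binary-search idea stays the same.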
Other delivery options — coming soon
We're adding two more variants of this capability so you can match the platform to your existing tooling and budget.
- Cloud-managed data platform: Same coverage on a managed warehouse stack (Snowflake / BigQuery / Databricks) — faster to ship, higher run cost.
- Premium custom stack: Spark + Iceberg + Trino for petabyte-scale workloads with stricter governance and ML-first delivery.
What becomes possible
Unit economics on demand. The CFO opens a dashboard on Monday morning and sees margin by channel, country, and cohort — without waiting for an analyst to pull numbers from three different systems.
KPI monitoring without firefighting. When conversion drops in a specific market, the product team sees it within hours — not at the end-of-month review — and can act immediately.
Campaign performance in real time. Marketing launches a promo and tracks order volume, average check, and new user conversion as it happens — adjusting spend the same day.
ML that actually ships. With clean, versioned training data available, the data science team spends time on models — not on data wrangling — and gets experiments into production faster.
Result
Companies go from "we don't trust our numbers" to a reliable, scalable analytics foundation that grows with the business — without expensive proprietary tools or vendor dependency.
The platform becomes the backbone for dashboards, KPI tracking, strategic planning, and machine learning — all from a single, well-structured source of truth.
Ready to scope your data platform?
Let's map what your platform should look like and what it costs.
On the call we review your current data sources, business questions, and constraints. You leave with a concrete scope: which tier fits, what to ship in phase one, and a realistic timeline to production.
- Source landscape and ingestion priorities
- Storage and warehouse layout (lake / curated zones)
- Orchestration and CI/CD model
- Analytics layer and self-service surface
- Data quality, monitoring and ownership
- Tier selection, timeline and support model