Data Platform from Scratch

Building a scalable analytics foundation on an open source stack — from raw data to business decisions.

  • Data engineering
  • Open source
  • Self-service analytics
  • ML-ready
  • dbt · Airflow · ClickHouse

  • 0 → prod

    Full platform in 8–12 weeks from kickoff

  • 100+

    Business-ready data models across domains

  • Days → hours

    Time from raw data to actionable insight

Who we build this for

  • Scale-up startups outgrowing spreadsheets
  • E-commerce with fragmented data sources
  • Foodtech & QSR with ops + digital data
  • Enterprises with legacy reporting chaos

The starting point is usually the same: data lives in production databases, Google Sheets, or third-party tools — and nobody has a reliable single number to answer even basic business questions. Analysts spend most of their time preparing data, not analyzing it.

What we build

A production-grade data platform that ingests, transforms, and serves data reliably — built entirely on battle-tested open source tools, with no vendor lock-in and full cost control.

  • Storage
    • S3 / Azure Blob
    • Data lake
    • Raw → clean → curated zones
  • Transform
    • dbt
    • Versioned data models
    • Tested & documented
  • Orchestration
    • Airflow / Dagster
    • Scheduled pipelines
    • Monitoring & alerts
  • Analytics
    • ClickHouse
    • Sub-second queries
    • Billions of rows

  • Unified data layer

    All sources — app events, transactions, CRM, marketing — flow into one consistent, documented data model.

  • Self-service analytics

    Analysts get clean, trusted tables and can answer business questions without involving engineers.

  • ML-ready datasets

    Structured feature tables and historical snapshots ready to feed recommendation and prediction models.

  • Data quality built in

    Automated tests, freshness checks, and lineage tracking — so the business can trust what they see.
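
To make the data quality idea concrete, here is a minimal Python sketch of the kind of checks involved: a freshness check plus not-null and uniqueness tests. It is an illustration only, not the platform's actual implementation. In a real build these checks would typically live in dbt tests rather than hand-written code, and the function names, the `orders` rows, and the 6-hour threshold are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_fresh(last_loaded_at, max_lag, now=None):
    """Return True if the table was loaded within max_lag of now."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_lag


def rows_with_null(rows, column):
    """Return the rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def duplicate_values(rows, column):
    """Return the set of values that appear more than once in the column."""
    seen, dupes = set(), set()
    for r in rows:
        value = r.get(column)
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


# Hypothetical sample data: one duplicated order_id and one missing amount.
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 40.0},
]

# Hypothetical load time, 8 hours before the check, against a 6-hour limit.
check_time = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
loaded_at = check_time - timedelta(hours=8)

report = {
    "fresh": is_fresh(loaded_at, timedelta(hours=6), now=check_time),
    "null_amount_rows": len(rows_with_null(orders, "amount")),
    "duplicate_order_ids": sorted(duplicate_values(orders, "order_id")),
}
print(report)
```

Each check returns the offending rows or values instead of only a pass/fail flag, so an alert can point at what broke as well as that something broke.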

How we package delivery

Implementation and support plans

We deliver the platform as a working production system — not slides, not a reference architecture.

Pick the scope that matches where you are today; you can grow into the next tier when the platform earns its keep.

Foundation

Data lake + first pipelines — a real starting point, not a POC

Initial implementation: €18,000

+ support starting at €1,500/month

  • S3 / Azure Blob data lake (raw + clean zones)
  • 1–2 source ingestion pipelines (database / API)
  • Airflow orchestration (managed or self-hosted)
  • dbt project scaffolding + 5–10 starter models
  • Production CI/CD for the data project
  • 2-hour analyst onboarding session

4–6 weeks for the lake + first pipelines. Minimum: implementation + 1 month of support.

Full Platform

The default Drafted build — production-grade open-source data platform

Initial implementation: €55,000

+ support starting at €4,500/month

  • Everything in Foundation
  • All sources connected (events, transactions, CRM, marketing)
  • Curated zone with 100+ tested dbt models
  • ClickHouse analytics layer (sub-second on billions of rows)
  • Data quality framework: tests, freshness, lineage
  • Monitoring & alerting (pipeline + DQ incidents)
  • Self-service starter dashboards on Apache Superset
  • Documentation + 2 training sessions for analysts

8–12 weeks for the full platform. Minimum: implementation + 3 months of support.

Enterprise

When the platform is core infrastructure for the business

Initial implementation: €95,000

+ support starting at €7,000/month

  • Everything in Full Platform
  • ML feature store (versioned features, point-in-time correctness)
  • Multi-domain governance (data contracts, ownership, certification)
  • Embedded engineers in your team during delivery
  • Compliance hardening (PII tagging, RBAC, audit trails)
  • Dedicated SLAs and on-call coverage

14–20 weeks for the full enterprise rollout. Minimum: implementation + 6 months of support.
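
"Point-in-time correctness" in the feature store item means that a training row only sees feature values that were already known at the moment of the event being predicted. The sketch below shows the idea in plain Python, under stated assumptions: the feature history, the dates, and the function name are hypothetical, and a production feature store would do this as a join in the warehouse rather than in application code.

```python
from bisect import bisect_right
from datetime import date


def point_in_time_value(history, as_of):
    """Return the latest value whose effective date is on or before as_of.

    history is a list of (effective_date, value) pairs sorted by date.
    Returns None when no value was known yet at as_of.
    """
    dates = [d for d, _ in history]
    idx = bisect_right(dates, as_of)
    return history[idx - 1][1] if idx else None


# Hypothetical feature history: a customer's lifetime order count over time.
order_count_history = [
    (date(2024, 1, 5), 1),
    (date(2024, 2, 10), 2),
    (date(2024, 3, 1), 3),
]

# Hypothetical label events that need the feature value as of each date.
label_dates = [
    date(2024, 1, 1),
    date(2024, 2, 10),
    date(2024, 2, 20),
    date(2024, 4, 1),
]

training_rows = [
    (d, point_in_time_value(order_count_history, d)) for d in label_dates
]
for row in training_rows:
    print(row)
```

This sketch treats a value that became effective on the same day as the event as already known. If same-day values could leak information from the outcome, a stricter variant would use `bisect_left` so that only earlier values qualify.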

Other delivery options — coming soon

We're adding two more variants of this capability so you can match the platform to your existing tooling and budget.

  • Cloud-managed data platform: Same coverage on a managed warehouse stack (Snowflake / BigQuery / Databricks) — faster to ship, higher run cost.
  • Premium custom stack: Spark + Iceberg + Trino for petabyte-scale workloads with stricter governance and ML-first delivery.

What becomes possible

  • Unit economics on demand. The CFO opens a dashboard on Monday morning and sees margin by channel, country, and cohort — without waiting for an analyst to pull numbers from three different systems.

  • KPI monitoring without firefighting. When conversion drops in a specific market, the product team sees it within hours — not at the end-of-month review — and can act immediately.

  • Campaign performance in real time. Marketing launches a promo and tracks order volume, average check, and new user conversion as it happens — adjusting spend the same day.

  • ML that actually ships. With clean, versioned training data available, the data science team spends time on models — not on data wrangling — and gets experiments into production faster.

Result

Companies go from "we don't trust our numbers" to a reliable, scalable analytics foundation that grows with the business — without expensive proprietary tools or vendor dependency.

The platform becomes the backbone for dashboards, KPI tracking, strategic planning, and machine learning — all from a single, well-structured source of truth.

Ready to scope your data platform?

Let's map what your platform should look like and what it costs.

On the call we review your current data sources, business questions, and constraints. You leave with a concrete scope: which tier fits, what to ship in phase one, and a realistic timeline to production.

  • Source landscape and ingestion priorities
  • Storage and warehouse layout (lake / curated zones)
  • Orchestration and CI/CD model
  • Analytics layer and self-service surface
  • Data quality, monitoring and ownership
  • Tier selection, timeline and support model

Book a call