Apache Superset vs Microsoft Power BI

An engineering-first comparison for teams picking a BI stack inside, or outside, the Microsoft orbit

Where queries run, how Fabric capacity pricing actually behaves, what Entra ID lock-in means in practice, and how embedded licensing diverges — a side-by-side for engineers picking a BI stack, not a marketing funnel.

By drafted.work · Operational data team

Most "Superset vs Power BI" articles online belong to one of two genres: vendor marketing with a thin coat of pros/cons, or SEO filler written by people who have never carried either tool in production. This one is written for the engineer who has to pick a BI stack and then live with everything that comes with it: the licence bill, the on-call pager, the data contract with the warehouse, and the patience of stakeholders.

Both tools render dashboards. They disagree on almost everything else. Power BI is a deeply integrated piece of the Microsoft stack — Entra ID for identity, OneLake for storage, Fabric for compute, Purview for governance. Apache Superset is a thin, open-source SQL layer over a warehouse you already own. That single architectural gap drives everything below: cost, governance, hiring profile, failure modes.

1. Product snapshot

Apache Superset is an open-source data exploration and visualization platform, a Top-Level Project at the Apache Software Foundation, distributed under Apache License 2.0. The codebase lives at github.com/apache/superset — a high-traffic repo (70k+ stars, broad contributor base with strong representation from Preset, Airbnb, Lyft, Dropbox). The 4.x line is the stable baseline most production teams run; the project ships minor releases several times a year with a documented UPDATING.md for every breaking change.

Microsoft Power BI in 2026 is no longer a standalone product — it is one of the "experiences" inside Microsoft Fabric, the unified SaaS platform that merged Power BI Premium capacities, Synapse, Data Factory, Real-Time Analytics and OneLake under a single billing and governance plane. What you actually consume splits into:

  • Power BI Desktop — free, Windows-only authoring tool.
  • Power BI Service — the SaaS tenant at powerbi.com, reachable via Pro / PPU per-user licences or by running content on Fabric capacity.
  • Power BI Report Server — on-prem server for customers who cannot move reporting to the cloud.
  • Power BI Mobile — native iOS / Android apps.

The important shift: the old Premium P-SKUs are gone. Anything that used to require P1/P2/P3 capacity now runs on Fabric F-SKUs, which are billed as Azure resources rather than as Microsoft 365 subscriptions.

2. Pricing — the honest numbers

Power BI splits into per-user licences for small and mid-sized teams and Fabric capacity for organisational scale. Prices below are from microsoft.com/power-bi/pricing and azure.microsoft.com/pricing/details/microsoft-fabric (USD, pay-as-you-go monthly unless stated).

2.1 Per-user licences

Licence | Price | What it buys
Power BI Pro | $14 / user / mo | Individual authoring + consumption in shared capacity.
Premium Per User | $24 / user / mo | Pro features plus paginated reports, 100 GB models, AI features, deployment pipelines.
Fabric Free | $0 | View-only; useful only when the tenant has F64+ capacity (see below).

2.2 Fabric capacity (F-SKUs)

F-SKUs replaced the legacy P-SKUs. They are Azure resources — you pay per-second when provisioned, and you can pause / resume from the Azure portal. Capacity is measured in Capacity Units (CUs); 1 CU ≈ 0.125 v-cores in the old Power BI sizing.

SKU | CUs | Legacy equivalent | Monthly PAYG (≈) | Monthly reserved (≈)
F2 | 2 | - | $262 | $156
F8 | 8 | EM1 / A1 | $1,051 | $625
F32 | 32 | EM3 / A3 | $4,205 | $2,501
F64 | 64 | P1 / A4 | $8,410 | $5,003
F128 | 128 | P2 / A5 | $16,819 | $10,005
F256 | 256 | P3 / A6 | $33,638 | $20,011
F512 | 512 | - | $67,277 | $40,021
F1024 | 1024 | - | $134,554 | $80,043

Prices rounded. Always reconfirm against the Azure pricing page at sign time — Microsoft re-prices regionally, and reserved-instance discounts shift.
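The table is close to linear in CUs, which makes a quote easy to sanity-check. A minimal sketch using only the rounded figures above; the dictionary and function names are illustrative, and real prices vary by region:

```python
# Rounded monthly USD figures copied from the F-SKU table: (PAYG, reserved).
F_SKU_PRICES = {
    2: (262, 156),
    8: (1_051, 625),
    32: (4_205, 2_501),
    64: (8_410, 5_003),
    128: (16_819, 10_005),
    256: (33_638, 20_011),
    512: (67_277, 40_021),
    1024: (134_554, 80_043),
}

V_CORES_PER_CU = 0.125  # 1 CU is roughly 0.125 legacy v-cores


def per_cu_monthly(cus: int) -> float:
    """PAYG dollars per CU per month for a given SKU size."""
    payg, _ = F_SKU_PRICES[cus]
    return payg / cus


def reserved_discount(cus: int) -> float:
    """Fraction saved by reserving instead of paying as you go."""
    payg, reserved = F_SKU_PRICES[cus]
    return 1 - reserved / payg


for cus in F_SKU_PRICES:
    print(
        f"F{cus}: {cus * V_CORES_PER_CU:g} v-cores, "
        f"${per_cu_monthly(cus):.2f}/CU/mo, "
        f"{reserved_discount(cus):.1%} reserved discount"
    )
```

On these figures every SKU lands at about $131 per CU per month, and reservation saves roughly 40% across the board. A quote that deviates noticeably from either number is worth questioning.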

Two F-SKU thresholds matter more than the rest:

  • F64 is the "free viewers" threshold. On F64+ a user with the Fabric free licence (or a viewer role) can consume Power BI content without a Pro or PPU seat. Below F64, every consumer needs a paid licence. That's the cliff every medium-sized org hits: between roughly 50 and 500 end-users, you have to choose between paying per-user Pro/PPU or paying ~$8.4k/month for F64.
  • Power BI Embedded for external apps historically used A-SKUs (billed hourly, starting around $735/month for A1). On new greenfield deployments, Microsoft is steering customers onto F-SKUs for embedded scenarios as well.
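The F64 cliff can be expressed as a break-even viewer count. The sketch below is a simplification under stated assumptions: it compares only viewer seats against the flat capacity price, ignores the paid seats that authors still need on capacity, and takes the per-seat price as an input. The seat prices in the loop are illustrative inputs, not quotes.

```python
import math


def break_even_viewers(capacity_monthly: float, seat_monthly: float) -> int:
    """Smallest viewer count at which flat capacity costs no more than per-seat licences."""
    return math.ceil(capacity_monthly / seat_monthly)


F64_PAYG = 8_410      # monthly PAYG figure from the F-SKU table
F64_RESERVED = 5_003  # monthly reserved figure from the F-SKU table

# Illustrative per-seat prices; substitute the current list price for your region.
for seat in (10, 14, 20, 24):
    payg_viewers = break_even_viewers(F64_PAYG, seat)
    reserved_viewers = break_even_viewers(F64_RESERVED, seat)
    print(f"${seat}/seat: break-even at {payg_viewers} viewers (PAYG), "
          f"{reserved_viewers} (reserved)")
```

Below the break-even count, per-seat licences are cheaper; above it, capacity wins. Reserving the capacity pulls the break-even point down by roughly 40%.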

A 60-day Fabric trial equivalent to F64 is available to eligible tenants — enough to seriously evaluate the platform end-to-end before committing.

2.3 Apache Superset

Apache Superset is $0 licence fee under Apache 2.0. The real TCO is people and infrastructure:

  • Compute: web servers + Celery workers + Celery Beat scheduler + optional headless Chrome for alerts/reports.
  • Stateful deps: a metadata database (Postgres or MySQL), Redis (or another Celery broker + results backend).
  • Ops: someone who can read UPDATING.md, run Alembic migrations, and debug Celery backlogs. This role is real — budget for it.

If you want Superset without running it yourself, Preset and other vendors sell managed offerings that trade a chunk of that people cost for a per-user fee — which reintroduces a vendor, but on your own terms.

2.4 When the licence line crosses

The Superset-vs-Power-BI cost story is not "open source is cheaper." It is: the crossover point shifts dramatically with viewer count and warehouse ownership.

  • Under a hundred creators and a couple hundred viewers in an M365-native shop? Per-user Pro is fine; self-hosting Superset is probably not worth the headcount.
  • Thousands of viewers, or external embedding? The per-user or F-capacity cost of Power BI gets painful fast, and Superset's flat infra cost becomes the sane path.
  • Heavy warehouse investment (Snowflake, BigQuery, Databricks SQL, ClickHouse)? Superset leans on compute you're already paying for; Power BI's Import / Direct Lake model keeps a second copy of the data in Fabric.

3. Architecture where it actually matters

Skipping "both have dashboards". Here's the stuff that changes engineering decisions.

3.1 Where queries actually run

  • Superset is a thin layer over your warehouse. Every chart is a SQL query sent via SQLAlchemy / DB-API to Snowflake, BigQuery, Databricks, ClickHouse, Postgres, Trino, etc. There is no Superset-managed storage engine. Your warehouse's performance is Superset's performance.
  • Power BI has its own columnar in-memory engine — VertiPaq — and a multi-mode storage story on top of it:
    • Import mode. Data is copied into the semantic model and compressed into VertiPaq. Fastest interactive performance, but the model has to be refreshed on a schedule.
    • DirectQuery. Each visual fires a query to the source. Fresh but slower, and subject to warehouse / network latency.
    • Direct Lake (Fabric only). VertiPaq reads Delta-Parquet files directly out of OneLake with metadata "framing." Effectively Import performance without the refresh cycle — the flagship Fabric feature.
    • Composite models. Mix Import, DirectQuery and Direct Lake tables in a single report; useful in practice, complicated to reason about.

Consequence: if your organisation already treats the warehouse as the source of truth and invests in it (dbt, materialisations, proper clustering), Superset inherits that work for free. Power BI's flexibility comes at the cost of a second place where "the data" lives — the VertiPaq model — which is a different thing for your data engineers to own, refresh, secure and version.
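To make the "thin layer" idea concrete, here is a deliberately simplified sketch of a chart definition compiling to one SQL statement that the warehouse executes. This is not Superset's actual code: `ChartSpec` and `chart_to_sql` are hypothetical names, and `sqlite3` stands in for the warehouse so the example runs anywhere.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical, much-simplified stand-in for a chart definition."""
    table: str
    group_by: str
    metric_sql: str  # e.g. "SUM(amount)"
    row_limit: int = 1000


def chart_to_sql(spec: ChartSpec) -> str:
    """Compile a chart into one SQL statement; the warehouse does all the work.

    Identifiers are interpolated directly, so this is only safe for trusted specs.
    """
    return (
        f"SELECT {spec.group_by} AS dim, {spec.metric_sql} AS metric "
        f"FROM {spec.table} GROUP BY {spec.group_by} "
        f"ORDER BY metric DESC LIMIT {spec.row_limit}"
    )


# sqlite3 plays the role of the warehouse in this sketch.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (region TEXT, amount REAL)")
warehouse.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 120.0), ("emea", 80.0), ("apac", 50.0)],
)

spec = ChartSpec(table="orders", group_by="region", metric_sql="SUM(amount)")
rows = warehouse.execute(chart_to_sql(spec)).fetchall()
print(rows)  # [('emea', 200.0), ('apac', 50.0)]
```

Nothing is stored between the spec and the result set. Any tuning applied to the `orders` table in the warehouse (clustering, materialisation) shows up in the chart without the BI layer changing.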

3.2 Semantic / metrics layer

  • Superset datasets are either physical tables or virtual SQL-backed objects. Metrics, calculated columns, and Jinja-templated SQL (e.g. {{ current_user_id() }}) are defined at the dataset level. It's a lightweight layer — simple to understand, easy to put under Git (see our GitOps post for one workflow). Transformation logic belongs in dbt / the warehouse; Superset mostly avoids duplicating it.
  • Power BI's semantic model is a direct descendant of SQL Server Analysis Services (the Tabular engine). You get a full tabular model with relationships, hierarchies, and DAX measures, plus Power Query (M) for ingestion-time transformations. It is rich and battle-tested, but DAX is a specialised language, and teams without a dedicated Power BI architect routinely get measures subtly wrong (time intelligence, filter context, row context vs. filter context).

If you want metric definitions in Git with code review, Superset (paired with dbt in the warehouse) is much less of a fight. If you need deep multi-dimensional modelling, parent-child hierarchies, and non-technical modellers building complex calculations without writing SQL, Power BI's DAX/M world buys you real capability, at the cost of a hiring pipeline that can supply people who actually write it.

3.3 Query execution, refresh, and caching

  • Superset pushes everything to the warehouse. Long-running queries run async via Celery, results land in a Redis-backed cache, thumbnails in a separate cache. Failure modes are concentrated in two places: warehouse saturation and Celery-worker sizing.
  • Power BI juggles three different execution paths depending on mode: VertiPaq evaluates Import and Direct Lake queries in memory; DirectQuery folds back into the source; composite models route per-table. Capacity monitoring happens through the Microsoft Fabric Capacity Metrics app, which is the primary tool for spotting throttling, evictions, and long-running refreshes.

Failure modes differ. Superset usually fails because the warehouse is overloaded or a worker pool is wrong-sized. Power BI most often fails on capacity saturation — a heavy refresh burns CU budget and interactive reports slow down across the tenant. Autoscale exists; it can also quietly double your bill.
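On the Superset side, the two sizing knobs named above live in `superset_config.py`. A minimal fragment, assuming a single Redis instance reachable at a placeholder hostname; the database numbers, timeout and prefix are illustrative rather than recommended values:

```python
# superset_config.py (fragment). Hostname and Redis database numbers are placeholders.
REDIS_HOST = "redis"  # hypothetical service name
REDIS_PORT = 6379


class CeleryConfig:
    # Broker and result backend used by the Celery workers.
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/0"
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/1"
    imports = ("superset.sql_lab",)
    # Do not let a worker reserve extra tasks behind a long-running query.
    worker_prefetch_multiplier = 1
    # Acknowledge a task only after it finishes, so a crashed worker's task is redelivered.
    task_acks_late = True


CELERY_CONFIG = CeleryConfig

# Chart-data cache, kept in a separate Redis database from the Celery traffic.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 60 * 60,  # seconds; tune to how stale a chart may be
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": f"redis://{REDIS_HOST}:{REDIS_PORT}/2",
}
```

Async SQL Lab queries additionally need a `RESULTS_BACKEND`, omitted here to keep the fragment free of third-party imports. Worker concurrency itself is set on the Celery command line, not in this file.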

3.4 Visualizations

  • Superset renders via Apache ECharts plus a handful of legacy D3-based charts, and exposes a plugin system for custom React / TypeScript charts. The catalogue is broad — bars, lines, time series with forecasting, pivot tables, heatmaps, geo. Pixel-perfect formatting is not the strong suit.
  • Power BI ships a large built-in catalogue plus AppSource for community / commercial visuals and a custom-visual SDK (R, Python, TypeScript). The drag-and-drop formatting experience is still the benchmark for non-technical authors; "conditional formatting on a measure" takes two clicks instead of fifty lines of SQL.

3.5 SQL editor

  • Superset has SQL Lab: multi-tab editor, query history, metadata browser, "Create Table As Select", and a direct path from query result to saved dataset to chart. It's a genuine reason engineers tolerate Superset's rougher edges elsewhere.
  • Power BI has no first-class SQL editor. The DAX Query View lets you query semantic models in DAX; Power Query is visual-first M with a scripting escape hatch; deep ad-hoc SQL exploration belongs in SSMS / Azure Data Studio / Synapse / your warehouse's own IDE.

3.6 Alerts and scheduled reports

  • Superset: Celery Beat schedules jobs; headless Chrome (via Playwright) renders charts/dashboards; delivery via SMTP or Slack. Standing this up cleanly in production takes real work — headless-browser orchestration is a known rough patch, and fonts / chart layout quirks under Chrome are a recurring support channel.
  • Power BI: first-class subscriptions, data-driven alerts, and tight Power Automate integration for anything more complex. Email digests, Teams cards, triggered workflows — all paved road, configured from the UI.

3.7 Embedded analytics

  • Superset: @superset-ui/embedded-sdk with guest tokens (JWT), iframe + postMessage, CSS-level theming, full control over host-app → viz interactions. Apache 2.0 means you can embed in a commercial product without a per-end-user licence bill. For SaaS vendors, this is often the decisive factor.
  • Power BI Embedded historically ran on A-SKUs billed hourly from Azure; new deployments are increasingly steered onto Fabric F-SKUs. "App owns data" with service principals works cleanly and is well-documented, but every end user either consumes against your capacity (and you pay for that CU budget) or needs a Pro licence — which in a multi-tenant SaaS is a non-starter.

If you're embedding into a product you resell, the licence model alone decides this one in the vast majority of cases.
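For the Superset path, the host application's backend requests a guest token and hands it to the embedded SDK in the browser. A minimal sketch of building that request, assuming the backend already holds an API access token for an account allowed to grant guest tokens. The function name, the `tenant_id` column and the example URL are hypothetical, and depending on configuration a CSRF token header may also be required.

```python
import json
import urllib.request


def guest_token_request(base_url, access_token, dashboard_uuid, tenant_id, username):
    """Build (but do not send) the POST that asks Superset for a guest token."""
    payload = {
        "user": {"username": username, "first_name": "Guest", "last_name": "User"},
        # The dashboard's embedded UUID, not its numeric id.
        "resources": [{"type": "dashboard", "id": dashboard_uuid}],
        # Per-session RLS: int() keeps the interpolated value from carrying SQL.
        "rls": [{"clause": f"tenant_id = {int(tenant_id)}"}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/security/guest_token/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = guest_token_request(
    "https://superset.example.com/", "abc", "1234-uuid", 42, "tenant42-viewer"
)
print(req.get_method(), req.full_url)
print(json.loads(req.data)["rls"])
```

Sending the request with `urllib.request.urlopen(req)` should return JSON containing a `token` field. The per-tenant `rls` clause is what lets one dashboard serve many customers without a licence per end user.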

3.8 Mobile

  • Superset: responsive dashboards in a normal browser. No native mobile app; "mobile BI" is not part of its identity. Teams build bespoke wrappers or accept that phones get the web view.
  • Power BI: dedicated Power BI Mobile apps for iOS and Android with biometrics, offline caching, phone-specific report layouts, and push alerts. Microsoft Intune Mobile Application Management can also wrap the app with DLP policies, which is useful in regulated environments.

4. Governance, security, auth

4.1 Identity and authentication

  • Power BI has a hard dependency on Microsoft Entra ID (formerly Azure AD). You cannot deploy Power BI without an Entra tenant — every user, group, service principal, Conditional Access policy, and MFA rule runs through it. For M365 shops this is seamless. For multi-cloud or identity-federated orgs, it's a load-bearing dependency worth naming out loud.
  • Superset delegates auth to Flask-AppBuilder (FAB). Out of the box: DB-backed users, OAuth2 / OIDC (with PKCE), LDAP, SAML (via add-ons), REMOTE_USER for header-based integrations. Any identity provider that speaks OIDC will plug in — Okta, Keycloak, Auth0, Entra ID itself, a home-grown IdP.

If your identity strategy is anything other than "everything is in Entra", Superset is the more neutral choice.

4.2 Authorization, row-level security, data governance

  • Superset has role-based access control through FAB (roles granted permissions on views / menus / datasets) and Row-Level Security expressed as SQL filter clauses tied to roles or user attributes. For embedded scenarios, guest tokens can carry per-session RLS rules. Governance depth beyond that — data catalog, lineage, data contracts — lives in external tools (DataHub, OpenMetadata, dbt docs).
  • Power BI layers rich governance directly into the platform: RLS and OLS (object-level security) defined with DAX expressions in the semantic model; native Microsoft Purview integration for catalog and lineage; sensitivity labels that persist through Excel / PDF exports and trigger DLP in the wider M365 compliance stack. Combined with Intune MAM on mobile, this is the most integrated governance story on the market — and it only works if the rest of your stack is Microsoft's.

Rough heuristic: Superset forces you to be honest about where governance lives (usually the warehouse plus your own tooling). Power BI lets you put governance inside the BI tool itself — at the price of committing to the Microsoft data / compliance platform end-to-end.
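The "RLS as SQL filter clauses" model on the Superset side is simple enough to sketch in a few lines. This is a conceptual illustration rather than Superset's implementation: the rule table and `apply_rls` are hypothetical, every matching clause is simply ANDed together, and `sqlite3` again stands in for the warehouse.

```python
import sqlite3

# Hypothetical role -> filter-clause mapping, standing in for RLS rules.
RLS_RULES = {
    "emea_analyst": "region = 'emea'",
    "big_deals_only": "amount >= 100",
}


def apply_rls(dataset_sql: str, roles: list[str]) -> str:
    """Wrap the dataset query and AND together every clause matching the user's roles."""
    clauses = [f"({RLS_RULES[role]})" for role in roles if role in RLS_RULES]
    if not clauses:
        # No rule targets these roles, so the query passes through unfiltered.
        return dataset_sql
    return f"SELECT * FROM ({dataset_sql}) AS rls_base WHERE " + " AND ".join(clauses)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("emea", 120.0), ("emea", 80.0), ("apac", 500.0)],
)

sql = apply_rls("SELECT region, amount FROM orders", ["emea_analyst", "big_deals_only"])
print(db.execute(sql).fetchall())  # [('emea', 120.0)]
```

Because the rules are plain SQL predicates, they can be reviewed, tested and versioned like any other query. That is the trade: less built-in governance, more of it expressed in tooling you already own.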

4.3 Audit & lineage

  • Superset logs queries and dashboard views into its metadata DB; export to ELK / Splunk / Loki as needed. Lineage and impact analysis are external concerns.
  • Power BI / Fabric exposes activity logs through the Microsoft 365 compliance centre; Purview provides tenant-wide lineage across Fabric items and Power BI artefacts.

5. Deployment and day-2 operations

Superset (self-hosted)

  • Reference deployments: the official Helm chart on Kubernetes, or Docker Compose for smaller setups.
  • Runtime components: web, worker (Celery), beat (Celery Beat), optional headless Chrome container for alerts/reports, Redis, metadata DB.
  • Upgrade posture: every minor release has an UPDATING.md documenting breaking changes, config renames, and manual migrations. Reading it is non-optional.
  • Honest truth: self-hosting Superset is a real infra project, not a weekend docker-compose up. Budget the people, or buy the managed version.

Power BI / Fabric

  • SaaS by default. Power BI Service and Fabric are consumption products. You don't patch servers; you don't own a Kubernetes cluster. Capacity is a dial in the Azure portal — pause, resume, scale, enable autoscale bursting, all from the web UI or ARM templates.
  • On-premises Data Gateway is the one piece of customer-managed infrastructure most tenants need: a small Windows service (standard or personal mode) that bridges Power BI Service to firewalled on-prem sources.
  • Power BI Report Server exists for organisations that truly cannot move reporting to the cloud (sovereign workloads, air-gapped networks). It ships about one major release per year and lags Cloud features by design.
  • Day-2 ops centre on capacity management, not server administration. The Fabric Capacity Metrics app is the first stop when users complain about slow reports — throttling and CU overages surface there first.

6. Real-world scale — traceable signals

A short reality check on where each tool operates today, sourced from engineering blogs and public disclosures rather than vendor decks.

Apache Superset

  • Airbnb (project progenitor) scaled Superset to thousands of users and documented a cache-warmup strategy via Apache Airflow driving an 86% cache hit rate for Presto-backed charts.
  • Dropbox replaced 10+ legacy visualization tools with Superset and documented the migration publicly, citing reusable metrics and SQL-first workflows.
  • Lyft runs Superset against Presto / Hive with nodes cycled every 24 hours to keep query performance stable.
  • The project is listed by the ASF as in active use at American Express, Nielsen, X/Twitter and others.

Microsoft Power BI / Fabric

  • Microsoft reported on its Q2 FY2026 earnings call that Fabric passed $2B in annual recurring revenue within two years of GA, with 31,000 customers and 60% YoY growth — the fastest-growing product in their analytics portfolio.
  • Microsoft Cloud revenue crossed $51.5B in Q2 FY2026 (+26% YoY), driven by Fabric / AI.
  • More than 90% of the Fortune 500 now have M365 Copilot, which uses Power BI as its analytics surface.

Both sets of signals matter, in different ways. Superset scale is proven by engineering-led deployments with specific traceable architectures; Power BI scale is proven by enterprise-wide adoption across thousands of organisations that mostly don't write engineering blog posts.

7. Honest weaknesses

Apache Superset

  • Upgrades are not fire-and-forget. Metadata migrations and config changes between minor 4.x versions are a recurring operational tax for teams that lag behind a few releases.
  • Alerts & Reports (headless Chrome + Celery Beat) is a common source of incidents — rendering quirks, timeouts, font issues.
  • Filter UX and dashboard polish are visibly behind Power BI for non-technical consumers; teams serving execs sometimes keep Superset for internal exploration and ship a different layer for that audience.
  • Governance out of the box is thin. Lineage, catalog, sensitivity labels, DLP — all external tools you have to wire up.
  • No native mobile app. Responsive web only.

Power BI / Fabric

  • Windows-only authoring. Power BI Desktop does not run on macOS or Linux natively. Mac/Linux-heavy engineering teams end up with Parallels, VMs, or web modelling as a compromise.
  • Entra ID lock-in. Identity is not a pluggable layer here.
  • The F64 pricing cliff. Below F64 every viewer needs a paid licence; at F64 you jump to roughly $8.4k/month. For orgs with 50–500 consumers, neither side of the cliff is comfortable.
  • DirectQuery pitfalls. Live queries are marketed as a smooth solution for real-time; in practice complex reports hit performance cliffs and visuals arrive at slightly different times, creating "time-inconsistent" dashboards.
  • Capacity-based failure modes. A badly behaved semantic model can burn CU budget and slow the entire tenant. Diagnosing takes real Fabric-admin experience.
  • DAX learning curve. Specialised language; bad measures look correct until they aren't.
  • Fabric bundling. Modern Power BI features (Direct Lake, some governance controls) are tightly coupled to Fabric capacity, which means the "BI tool" decision turns into a "data platform" decision.

8. When to pick which

Pick Apache Superset when… | Pick Power BI when…
Your audience is SQL-fluent: data engineers, analytics engineers, product teams. | Your audience is non-technical business users, analysts, execs.
You want thousands of viewers without per-seat licences. | You're already on M365 / Entra ID / Fabric and want seamless SSO + governance.
You're embedding BI into a SaaS product you resell. | You need batteries-included RLS, sensitivity labels, Purview lineage, DLP.
Your data strategy is "warehouse-first" with live queries, no data duplication. | You want Direct Lake performance over Delta-Parquet in OneLake.
You want metric definitions in Git, under code review (via dbt + Superset). | You need deep DAX modelling with relationships, hierarchies, time intelligence.
You have platform capacity to operate a real Python + Celery + Redis stack. | You want the vendor to own infra, upgrades, mobile apps, and SLAs.
Your identity strategy is anything other than "everything in Entra ID". | Your executive audience lives on phones and demands polished dashboards.

9. TL;DR for the impatient

Both tools work. Superset optimises for engineering control, zero per-user pricing, and warehouse-first architecture at the cost of operating it yourself. Power BI optimises for polished UX, mobile, executive-friendliness, and deeply integrated Microsoft governance at the cost of per-user / capacity licences, Entra ID lock-in, and a data model that often duplicates your warehouse.

If you're engineering-led, warehouse-heavy, and embedding analytics into products you sell — Superset is the answer, possibly on a managed provider if you don't want to run it. If your organisation already lives inside M365 / Entra / Fabric and your BI audience is business users and execs consuming polished reports — Power BI is worth its price.

The honest answer to "which should we pick" falls out of two questions: who are the primary consumers, and is your identity / compliance stack already Microsoft's? The rest is detail.
