Why Metadata Matters in BI Product Development
Part 1 of 5: Introduction
Introduction: why metadata matters

I treat all the content on Tableau Server as my personal product. This product helps the business operate more efficiently and effectively. The general goal of my work is to assist the business in making more high-quality decisions based on my product.
Monitoring and analyzing internal data on how my BI product is performing — server infrastructure, server content, users, access permissions, refresh schedules, query timing — lets me actually run that product. Without it, the BI team is back to a request queue with no feedback loop.
I still remember the days when we operated without a full view of our Tableau Server environment. We had hundreds of workbooks and multiple user groups, and many users had completely non-standardized access rights. Questions like “Who has access to that sensitive workbook?”, “Which dashboards might be affected if we switch to another data mart in the database?”, “How are our licenses utilized?”, “Who has never visited the dashboards they asked us to build?”, or simply “Do we have duplicate users with the same email?” (our user management was fully manual) kept coming up, and I had to check each one by hand every time.
Initially, I relied on standard Admin Views in Tableau, hoping they’d reveal everything. They’re convenient for the basic questions and useless for the specific ones. That is when I began combining the REST API, the PostgreSQL repository (on-premise only), and the GraphQL Metadata API into a single operating layer for the platform.
This article is the introduction to a five-part series on doing exactly that.
What “Tableau metadata” actually contains
When people say “metadata” they usually mean three different layers stacked on top of each other. Tableau exposes all three, but through different endpoints:
- Inventory metadata — what content exists. Workbooks, views, published datasources, projects, sites, owners, tags. Stable, mostly hierarchical, easy to mirror into your warehouse. Available on Server and Cloud via the REST API.
- Lineage metadata — what depends on what. Embedded vs published datasources, upstream tables, columns, calculated fields, downstream workbooks. The foundation of impact analysis. Available on Server and Cloud via the GraphQL Metadata API.
- Behavioural metadata — what is actually happening. View counts, login frequency, refresh durations, query timings, audit events, permission changes. The richest layer — and the one that is on-premise only in any deep form, because it requires direct PostgreSQL repository access.
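To make the lineage layer concrete, here is a minimal Python sketch of an impact-analysis lookup. It assumes the Metadata API schema exposes `publishedDatasources` with a `name` filter and a `downstreamWorkbooks` field, and that `server` is an already signed-in `tableauserverclient.Server`; the helper names `flatten_downstream` and `fetch_downstream` are hypothetical, so check the field names against your own server's schema before relying on it.

```python
from typing import Any

# GraphQL query (assumed schema): workbooks downstream of a published datasource.
LINEAGE_QUERY = """
query downstream($name: String) {
  publishedDatasources(filter: {name: $name}) {
    name
    downstreamWorkbooks {
      luid
      name
      projectName
    }
  }
}
"""


def flatten_downstream(response: dict[str, Any]) -> list[dict[str, str]]:
    """Turn the nested GraphQL response into one row per (datasource, workbook)."""
    data = response.get("data") or {}  # "data" can be null when the query errors
    rows = []
    for datasource in data.get("publishedDatasources", []):
        for workbook in datasource.get("downstreamWorkbooks", []):
            rows.append(
                {
                    "datasource": datasource["name"],
                    "workbook_luid": workbook["luid"],
                    "workbook": workbook["name"],
                    "project": workbook["projectName"],
                }
            )
    return rows


def fetch_downstream(server, datasource_name: str) -> list[dict[str, str]]:
    """Run the query through a signed-in tableauserverclient.Server (assumed)."""
    response = server.metadata.query(LINEAGE_QUERY, variables={"name": datasource_name})
    return flatten_downstream(response)
```

Keeping the flattening step separate from the network call means the same rows can be loaded into a warehouse table or tested against a saved response.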
Part 2 walks through exactly which method covers which layer, and which workflows break when you move from Server to Cloud: Tableau Cloud vs On-Premise: Metadata Access Comparison.
Five workflows metadata unlocks
These are the workflows that paid back the entire investment for our team and moved BI from a “ticket factory” to a measurable platform.
- Permissions matrix as a queryable table. Who has Read/Write/Delete on every workbook, joined across users, groups, projects, and sites. Built once, queried every audit cycle. Repository-backed on-prem; on Cloud you reconstruct it from REST + Cloud Manager.
- Usage and adoption analytics. Daily view counts per dashboard, unique users per week, time-since-last-view per workbook. Identifies the long tail of “built but never used” content and the small set of dashboards that drive the business.
- Lineage and impact analysis. Before you deprecate a data source or rename a column, get the exact list of downstream workbooks with a single GraphQL Metadata API query.
- License utilization audits. Active vs inactive users per license tier. Triggers the conversation about reclaiming Creator licenses from people who only consume. Pays for itself in the first quarter.
- Refresh and performance monitoring. Extract refresh durations, failed refreshes, slow VizQL queries. The starting point for any platform tuning work.
The permissions, usage, license, and refresh workflows lean heavily on behavioural metadata, which is why the platform decision (Server vs Cloud) has direct consequences for what you can build.
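As an illustration of the license utilization audit, the sketch below counts active and inactive users per site role. It is a simplified stand-in, not the pipeline described in this series: the input is assumed to be a list of dicts with `site_role` and `last_login` keys (however you extracted them), and the 90-day inactivity threshold is an arbitrary assumption you should tune.

```python
from datetime import datetime, timedelta, timezone


def license_utilization(users, now, inactive_after_days=90):
    """Count active vs inactive users per site role.

    `users` is an iterable of dicts with "site_role" and "last_login"
    (a timezone-aware datetime, or None if the user never signed in).
    """
    cutoff = now - timedelta(days=inactive_after_days)
    summary = {}
    for user in users:
        last_login = user.get("last_login")
        is_active = last_login is not None and last_login >= cutoff
        bucket = summary.setdefault(user["site_role"], {"active": 0, "inactive": 0})
        bucket["active" if is_active else "inactive"] += 1
    return summary


# Illustrative data only; real rows would come from your own user extract.
example_users = [
    {"site_role": "Creator", "last_login": datetime(2024, 5, 20, tzinfo=timezone.utc)},
    {"site_role": "Creator", "last_login": None},
    {"site_role": "Viewer", "last_login": datetime(2023, 1, 1, tzinfo=timezone.utc)},
]
example_summary = license_utilization(
    example_users, now=datetime(2024, 6, 1, tzinfo=timezone.utc)
)
```

Inactive Creator counts are the ones that start the license reclamation conversation; passing `now` explicitly keeps the function deterministic and easy to rerun for past audit dates.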
How to access each layer
Three programmatic methods, each with its own sweet spot:
| Method | Best for | Available on |
|---|---|---|
| REST API | Automation, user/content lifecycle, embedding | Server + Cloud |
| PostgreSQL repository | Historical analytics, deep audit, custom permissions matrix | Server only |
| GraphQL Metadata API | Lineage, impact analysis, data catalog | Server + Cloud |
The next four parts of this series go deep on each method with code, query examples, and the real workflows I built on top of them:
- Part 2 — Cloud vs Server, side by side: Tableau Cloud vs On-Premise: Metadata Access Comparison
- Part 3 — REST API patterns: Mastering Tableau REST API for BI Automation
- Part 4 — PostgreSQL repository deep dive: Deep Dive into Tableau PostgreSQL Repository
- Part 5 — GraphQL Metadata API: GraphQL for Advanced Tableau Metadata Analysis
Treating BI as a product — concretely
A “BI as product” mindset is easy to say and hard to operationalize. Metadata is what makes it real. It is the difference between “we have dashboards” and “we have a platform with measurable usage, an inventory of assets, a permissions model, and a feedback loop.”
A short checklist of what changes when you start treating your Tableau instance as a product instrumented by metadata:
- You have a single source of truth for content inventory (auto-refreshed from REST) instead of a Confluence page that goes stale in a month.
- You have a usage scoreboard instead of guessing which dashboards matter.
- You have an audit-ready permissions report instead of clicking through workbook properties one at a time.
- You have a deprecation workflow for old content — measured, not gut-feeling.
- You have a performance dashboard for the platform itself, not just for the business.
That posture changes how the rest of the company perceives BI. You stop being a request queue and start being infrastructure with a roadmap.
FAQ
Where does Tableau metadata live? In three different places: the REST API (current inventory and operations), the GraphQL Metadata API (relationships and lineage), and the PostgreSQL repository (historical events and configuration). The repository is on-premise only.
Can I do all of this on Tableau Cloud? Most workflows — yes, but you must rebuild the historical analytics layer yourself by snapshotting REST and GraphQL responses into your own warehouse on a schedule. The on-premise repository does that work for free, which is the single biggest analytical difference between the platforms. The full comparison lives in Part 2 of this series.
What is the minimum useful metadata pipeline? A daily REST snapshot of users, groups, projects, workbooks, views, and view counts into your warehouse. From that one daily snapshot you can already answer roughly 80% of the platform questions a BI team faces. Add GraphQL lineage and (on-prem) repository queries to cover the remaining 20%.
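A minimal sketch of the snapshot idea, using SQLite as a stand-in for the warehouse. The table name, column names, and the `write_snapshot` and `views_between` helpers are all hypothetical, and the sketch assumes the view counts you pull are cumulative totals, so the views gained in a period are the difference between two snapshots.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS view_snapshot (
    snapshot_date TEXT NOT NULL,   -- ISO date, e.g. 2024-06-01
    view_id       TEXT NOT NULL,
    view_name     TEXT,
    workbook_id   TEXT,
    total_views   INTEGER,
    PRIMARY KEY (snapshot_date, view_id)
)
"""


def write_snapshot(conn, snapshot_date, views):
    """Store one day's view rows; rerunning the same day overwrites, not duplicates."""
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO view_snapshot VALUES (?, ?, ?, ?, ?)",
        [
            (snapshot_date, v["id"], v["name"], v["workbook_id"], v["total_views"])
            for v in views
        ],
    )
    conn.commit()


def views_between(conn, view_id, start_date, end_date):
    """Views gained between two snapshot dates, or None if either is missing."""
    rows = dict(
        conn.execute(
            "SELECT snapshot_date, total_views FROM view_snapshot "
            "WHERE view_id = ? AND snapshot_date IN (?, ?)",
            (view_id, start_date, end_date),
        ).fetchall()
    )
    if start_date not in rows or end_date not in rows:
        return None
    return rows[end_date] - rows[start_date]
```

The composite primary key makes the daily job safe to retry, which matters more than anything else in a pipeline that runs unattended on a schedule.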
Is the GraphQL Metadata API a paid add-on? The API is available on both Server and Cloud. Always confirm current licensing in the official Tableau documentation before scoping work, because entitlements have shifted across releases.
Where do I start if I have never queried Tableau metadata before? Generate a Personal Access Token, install tableauserverclient, and pull the list of workbooks. That five-line script is the foothold — every workflow above starts from there. Part 3 of this series walks through it end-to-end.
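A first script along those lines might look like the sketch below. It assumes `tableauserverclient` is installed and that you have a Personal Access Token; the server URL, token name, token secret, and site ID are placeholders you supply, and `workbook_row` and `list_workbooks` are illustrative helper names rather than part of the library.

```python
def workbook_row(workbook):
    """Reduce a workbook item to the fields worth storing in an inventory table."""
    return {
        "id": workbook.id,
        "name": workbook.name,
        "project": workbook.project_name,
        "owner_id": workbook.owner_id,
    }


def list_workbooks(server_url, token_name, token_secret, site_id=""):
    """Sign in with a Personal Access Token and return all workbooks as dicts."""
    # Third-party dependency, imported lazily: pip install tableauserverclient
    import tableauserverclient as TSC

    auth = TSC.PersonalAccessTokenAuth(token_name, token_secret, site_id=site_id)
    server = TSC.Server(server_url, use_server_version=True)
    with server.auth.sign_in(auth):
        # Pager walks every page of results, not just the first one.
        return [workbook_row(wb) for wb in TSC.Pager(server.workbooks)]
```

Calling `list_workbooks("https://tableau.example.com", "my-token-name", "my-token-secret", site_id="my-site")` with your own values returns a list of plain dicts that can go straight into a DataFrame or a warehouse load.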