Governed Surface Area
Governed surface area (GSA) is the percentage of an organization's analytical questions that a semantic layer can answer natively – without workarounds, custom SQL, derived tables, or manual analyst intervention. It is the single most practical metric for evaluating whether a semantic layer is actually doing its job.
A semantic layer with 90% GSA means nine out of ten real business questions flow through governed metric definitions. A semantic layer with 40% GSA means more than half of all questions escape into ungoverned territory – spreadsheets, ad hoc SQL, dashboard-level formulas, or analyst tickets.
How to measure it: the 20-question test
GSA is measured empirically rather than theoretically. The method:
Collect 20 real analytical questions from your organization. Pull them from Slack threads, analyst ticket queues, dashboard requests, and executive ask-me-anythings. Don't cherry-pick. Include the awkward ones – the period-over-period comparisons, the cross-grain ratios, the "rank customers by change in spend vs. last quarter" questions that actually drive decisions.
Attempt each question in the semantic layer. Use only the tool's native metric definitions, dimensions, and relationships. No derived tables. No table calculations. No custom SQL blocks.
Score each question. Full pass: the semantic layer produces the correct answer natively. Partial pass: the answer requires a workaround (derived table, dashboard formula) that lives outside the governed layer. Fail: the question can't be answered without custom SQL or an analyst writing a query from scratch.
Calculate GSA. Full passes divided by total questions. Partial passes score as fails: if the logic escapes the semantic layer, it falls outside governance.
The result is a concrete number you can compare across tools, track over time, and use to justify investment in semantic layer expressiveness.
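To make the arithmetic concrete, here is a minimal sketch of the scoring bookkeeping in Python. The `Score` enum, the sample questions, and the `gsa()` helper are all hypothetical illustrations, not part of any particular tool; the method itself is tool-agnostic.

```python
from enum import Enum

class Score(Enum):
    FULL = "full"        # answered natively by the semantic layer
    PARTIAL = "partial"  # needed a workaround outside the governed layer
    FAIL = "fail"        # required custom SQL or a from-scratch analyst query

# Hypothetical scores from one test run (remaining questions omitted).
results = {
    "revenue by region, trailing 12 months": Score.FULL,
    "rank customers by change in spend vs. last quarter": Score.PARTIAL,
    "new vs. returning revenue mix by cohort": Score.FAIL,
}

def gsa(scores: list[Score]) -> float:
    """GSA = full passes / total questions. Partial passes score as
    fails, because their logic escaped the governed layer."""
    return sum(s is Score.FULL for s in scores) / len(scores)

print(f"GSA: {gsa(list(results.values())):.0%}")  # GSA: 33%
```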
Why GSA matters for AI
AI-powered analytics amplifies whatever GSA you have – in both directions.
High GSA means the AI has a rich governed vocabulary to work with. When a user asks a complex question, the AI composes an answer from trusted metric definitions. The answer is consistent with every dashboard, every report, every other AI-generated response in the organization.
Low GSA means the AI frequently hits the semantic ceiling and falls back to generating raw SQL. The AI guesses at table relationships, filter logic, and aggregation rules. It produces answers that look correct but may diverge from official metric definitions. And the user has no way to tell the difference.
In an AI context, GSA is effectively the percentage of questions where the AI can give a trustworthy answer. Everything outside that percentage is an unverified guess.
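One way to picture the difference is a routing sketch: if the semantic layer can plan the question, the answer is governed; otherwise the AI falls back to generating raw SQL, and the result should be labeled as such. The `plan`, `execute`, `generate_sql`, and `run_raw_sql` calls below are assumed interfaces, not any real product's API; this is a sketch of the idea, not an implementation.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    governed: bool  # True only when composed entirely from governed metric definitions

def answer_question(question: str, semantic_layer, llm) -> Answer:
    """Hypothetical routing: prefer the governed path, and label any
    raw-SQL fallback so the user can tell the difference."""
    # Assumed interface: plan() returns None when the question exceeds
    # the layer's native expressiveness (the "semantic ceiling").
    plan = semantic_layer.plan(question)
    if plan is not None:
        return Answer(text=semantic_layer.execute(plan), governed=True)
    # Ceiling hit: the model guesses at joins, filters, and aggregations.
    sql = llm.generate_sql(question)
    return Answer(text=semantic_layer.run_raw_sql(sql), governed=False)
```

A GSA of 75% means roughly three in four questions take the first branch; everything on the second branch is the unverified-guess territory described above.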
What high vs. low GSA looks like
High GSA (75%+): Most business users get reliable answers through self-service. Analysts focus on novel questions rather than re-answering standard ones. AI interfaces produce consistent results. Derived table count stays stable – new questions are handled by the semantic layer instead of engineering new materializations.
Moderate GSA (40–75%): Simple questions work well. Multi-step questions – anything involving composition, comparison over time, or mixed granularity – require workarounds. Analysts spend significant time servicing requests that should be self-service. AI gives good answers for basic queries and unreliable answers for complex ones.
Low GSA (below 40%): The semantic layer covers lookup-style queries only. Anything beyond "metric by dimension" requires engineering support. The organization effectively operates in two modes – governed for simple stuff, ungoverned for real analysis. AI is trustworthy for basic questions and a liability for everything else.
GSA and composability
The primary technical driver of GSA is metric composability. A semantic layer where metrics can reference other metrics – where composable metrics are a first-class concept – handles a much wider range of questions natively. Nested aggregations, period-over-period comparisons, and multi-step calculations all require the ability to chain metric operations.
Semantic layers that treat metrics as terminal nodes – compute an aggregation, return a number, done – hit their ceiling quickly. The ceiling directly determines GSA: the lower the ceiling, the smaller the governed surface area, the more logic escapes into ungoverned workarounds.
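To show what "metrics referencing metrics" means mechanically, here is a minimal sketch in Python, assuming a toy `Metric` class. Real semantic layers express this in their own modeling languages; the point is only that a period-over-period calculation can be defined as a metric built from other metrics rather than a dashboard formula.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Metric:
    """A metric that can reference other metrics, so definitions chain."""
    name: str
    inputs: list["Metric"] = field(default_factory=list)
    combine: Callable[..., float] = lambda x: x

    def evaluate(self, base_values: dict[str, float]) -> float:
        # Base metric: read its aggregated value directly.
        if not self.inputs:
            return base_values[self.name]
        # Composed metric: evaluate the inputs first, then combine them.
        return self.combine(*(m.evaluate(base_values) for m in self.inputs))

revenue = Metric("revenue")
revenue_prev_q = Metric("revenue_prev_q")  # same aggregation, shifted one quarter
qoq_growth = Metric(
    "revenue_qoq_growth",
    inputs=[revenue, revenue_prev_q],
    combine=lambda cur, prev: cur / prev - 1,  # a metric built from two metrics
)

print(qoq_growth.evaluate({"revenue": 120.0, "revenue_prev_q": 100.0}))  # 0.2
```

In a terminal-node design, `qoq_growth` has no governed home: the ratio ends up re-implemented in each dashboard that needs it.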
Tracking GSA quarterly, using the same set of representative questions, gives the clearest signal of whether your semantic layer investment is actually paying off.
The Holistics perspective
Governed surface area is the core evaluation metric Holistics uses to compare semantic layers. The 20-question test – run real business questions through the semantic layer and count how many it handles natively – is a practical methodology any buyer can apply during a PoC.
See how Holistics approaches this →