Rewriting the AML Compiler in Rust: 27× Faster, 12× Less Memory

Why we re-implemented Holistics' AML compiler in Rust instead of TypeScript: ~27× faster compiles in aggregate and ~12× less memory at the median. On the largest project it went from 67s at 8.5 GB to 1.8s at 2.0 GB.

July 01, 2026 · 22 min read · Duy Phan
Holistics is a code-first AI analytics platform. On most BI tools, your metrics, models, and dashboards are UI configuration: clicked together in a web app and locked inside a database, where they can't be diffed, reviewed, or reproduced. Holistics inverts that. The entire semantic layer, from how tables relate to how "revenue" is calculated, lives in plain-text files that teams own, version in Git, and change through pull requests, the same way they ship the rest of their software. That governed layer is also what grounds AI, so it answers questions from defined metrics instead of guessing at raw SQL.

The language behind all of this is AML (Analytics Modeling Language). It defines datasets, models, dashboards, relationships, permissions, metrics, and governed business logic, and because it's typed and composable, a metric is written once and reused everywhere instead of copy-pasted across dashboards, with mistakes caught as you type rather than after deploy.

Over time, an AML project becomes the source of truth for how an entire organization understands its data: hundreds of models, thousands of metrics, years of encoded decisions. And like any codebase, the bigger it gets, the more it depends on fast feedback. A change should surface its errors in milliseconds, not minutes.

Sitting under all of that is the AML compiler. It checks every file for errors, powers the editor features developers lean on (diagnostics, hover, autocomplete, go-to-definition), and produces the compiled representation that drives dashboard rendering, deploys, live editing, and Holistics AI. When the compiler is slow, everything built on top of it feels slow. That makes it one of the most critical pieces of infrastructure we run.

How the AML compiler fits in: it validates AML files, powers AML Studio editor features, and produces the compiled representation consumed by dashboards, deploys, live editing, and Holistics AI.

For years, that compiler was written in TypeScript. As customer projects grew, it became a bottleneck.

The largest project in our benchmark had more than 3,400 AML files and roughly 1.4 million lines of AML. Under Node's default memory limit, the TypeScript compiler couldn't finish. With the heap raised to 8 GB, it completed in 67 seconds at 8.5 GB of RAM.

After the Rust rewrite, the same project compiles in 1.8 seconds using 2.0 GB. Across all customer tenants, the compiler is roughly 27× faster with a fraction of the memory.

This post is about why we re-implemented the AML compiler, what changed in Rust, and what it means for customers building large semantic models in Holistics.

The AML Language

A BI platform has a lot to configure: tables, metrics, semantic models, dashboards, delivery schedules, transform-and-persist pipelines, and more. Most platforms dump those configurations to YAML and call it a day. But at any real scale that stops being maintainable: the same logic gets copy-pasted across files, there are no types to catch mistakes before deploy, and changing one thing means find-and-replace across the project.

AML takes a different approach. It's a programmable configuration language for BI. You can think of it as YAML + functions + types + modules: the configuration stays declarative, but you define logic once and reuse it instead of duplicating it.

A single example shows the main moves. It extends a model, declares a function, and composes new metrics from existing ones:

// Build a model on top of another model
Model users_with_activation = users.extend({
  dimension first_order_date {
    type: 'date'
    definition: @aql min(orders.created_date) | exact_grains(users.id) ;;
  }
})

// Declare parameterized logic (a function)
Func avg_by(period: String) {
  @aql unique(users.created_date | ${period}()) | average(same_day_conversion_rate) ;;
}

Dataset users_conversion_rate {
  models: [users_with_activation]

  // A metric that returns a table
  metric users_same_day {
    type: 'number'
    definition: @aql users_with_activation | filter(created_date == first_order_date) ;;
  }

  // Define a new metric by composing existing metrics
  metric num_users_same_day_activation {
    type: 'number'
    definition: @aql users_with_activation.num_users | where(users_with_activation.id in users_same_day) ;;
  }

  // Invoke the parameterized function
  metric conversion_rate_monthly {
    type: 'number'
    definition: avg_by('month')
  }
}

The compiler turns an AML project into the compiled representation that dashboards, deploys, live editing, and Holistics AI consume. Across thousands of files, doing that (parse, resolve names, type-check, evaluate) is a real compiler workload, and it's what the Rust rewrite targets.

For more on why we chose to build AML over adopting YAML, see AML vs YAML.

The benchmark

We measured the Rust compiler against the old TypeScript compiler on every customer tenant.

The projects ranged from small AML codebases to large semantic-layer projects with thousands of files. We ran both compilers on the same machine (Apple M4 Pro) and compared two paths: the full compiler pipeline and the typecheck pass.

Full pipeline

The full pipeline is the path Holistics uses when it needs the compiled representation of an AML project:

parse → typecheck → interpret

The compiled representation is consumed by product surfaces such as dashboard rendering, deploys, live editing, and Holistics AI.

Across all tenants, the full pipeline dropped from 471.7 seconds on TypeScript to 17.5 seconds on Rust.

That is ~27× faster in aggregate and ~35× faster on the median project.

Aggregate full-pipeline compile time across all tenants: 471.7s on TypeScript vs 17.5s on Rust.

Typecheck pass

The typecheck pass is the compiler's semantic validation step:

parse → typecheck

It validates the AML project before interpretation, catches modeling errors, resolves names and types, and builds much of the semantic understanding used by later compiler phases.

We report it separately because typecheck is the operation users hit constantly while editing AML. It powers the instant validation, diagnostics, hover, and autocomplete that AML Studio surfaces as they type.

Across all tenants, the typecheck pass dropped from 199.0 seconds on TypeScript to 5.1 seconds on Rust.

That is ~39× faster in aggregate.

Aggregate typecheck-pass time across all tenants: 199.0s on TypeScript vs 5.1s on Rust.

Memory

Memory improved just as much as runtime.

At the median, the Rust compiler uses about a twelfth of the memory. On the largest projects, where TypeScript needed a raised heap just to finish, it uses roughly a quarter.

Project    TypeScript   Rust     Reduction
Median     480 MB       41 MB    ~12×
p95        2,005 MB     421 MB   ~4.8×
Largest    8.5 GB*      2.0 GB   ~4.3×

* TypeScript needed an 8 GB heap to finish this project at all. Under Node's default limit it couldn't complete.

Full-pipeline memory usage: TypeScript vs Rust across projects.

Why we chose TypeScript and why we outgrew it

The original compiler was written in TypeScript for one main reason: it had to run in two places. The same compiler powers AML Studio in the browser (diagnostics, hover, autocomplete as you type) and runs on the backend for deploys and server-side compilation. JavaScript was the only language native to both, and TypeScript added types, velocity, and a single shared codebase across front end and back end.

TypeScript served us well until customer projects reached thousands of files. At that scale the compiler became both CPU- and memory-bound, and TypeScript gives you little control over the things that now dominated: memory layout, allocation, and garbage collection. Every AST node, symbol, and type lives on the JavaScript heap, and V8 decides how it's laid out, hashed, and collected. That is convenient, but it meant we couldn't directly tune the hot paths a compiler spends most of its time in.

On the backend, we had an escape hatch: when the compiler ran out of memory, we could raise Node's heap. That was the only reason the largest project finished at all, at 8 GB. In the browser there's no such lever. The JavaScript engine caps how much memory a tab can use, and nothing lets you raise it the way --max-old-space-size does on the server. A compiler that needs multiple gigabytes simply hits a wall.

And we expect this to get harder, not easier. As Holistics AI generates more AML, project sizes will keep climbing, pushing the same memory and CPU costs further past what the old compiler could absorb.

So we needed a language that let us control memory and CPU directly, without giving up safety: compact data layouts, no garbage collector, and explicit allocation. That pointed to Rust.

Microsoft recently ported tsc to Go for similar performance reasons, which shows that compilers tend to outgrow their host language as projects scale. We chose Rust over Go because its WebAssembly story is better. Standard Go compiles its runtime and garbage collector into every WASM binary, producing multi-megabyte downloads. Rust has neither, so it compiles to compact WASM and has mature browser tooling like wasm-bindgen. Since the compiler also runs in the browser in AML Studio, that mattered. Rust let us target both the backend and the browser from one codebase, the same goal that originally put us on TypeScript.

What changed in Rust

The next few sections are a deep dive into the engineering. If you mainly care about what this means for your projects, skip ahead to Upcoming changes for customers.

We did not port the TypeScript implementation mechanically. We kept the behavior, but redesigned the internal representation around what profiling had shown us.

The biggest wins came from making compiler data smaller, flatter, and cheaper to traverse. These were design improvements that could in principle have been adopted in TypeScript, but that Rust made the natural default rather than a discipline we'd have to enforce by hand.

1. Packed arenas and handles instead of pointer-heavy object graphs

The TypeScript compiler represented AST nodes, symbols, and types as JavaScript objects connected by references. That is the textbook way to represent compiler data, but it creates a large object graph: many small allocations, many pointers, and memory layout controlled by the JavaScript runtime.

The Rust compiler uses a different shape.

Instead of storing references between JavaScript objects, the Rust compiler stores compiler data in packed arenas and passes around small typed indexes (NodeIdx, SymbolId, TypeId) that identify entries in those arenas. These are a form of handles. A handle is a reference stored as a small integer index into a pool rather than a raw pointer (as described by Andre "Floooh" Weissflog and Andrew Kelley).

Packed arenas with small typed handles (NodeIdx, SymbolId, TypeId) replace a pointer-heavy object graph.

The layout helped in three concrete areas.

1/ CPU cache efficiency. An arena is contiguous memory, so walking it is a tight loop the CPU's cache prefetcher handles well, unlike chasing pointers between scattered heap objects. And each handle is a u32, half the size of a 64-bit pointer and carrying no object header. So references take four bytes instead of eight, more of the tree fits in cache at once, and because equal entities share one canonical handle, comparing them is a single integer compare instead of walking two structures.

2/ Simpler memory management. Nodes are allocated together into the arena and freed in bulk when it's dropped, with no per-object allocation and none of the garbage-collection churn a large object graph creates. Lifetimes live with the arena instead of being scattered across millions of small objects.

3/ Serialization-friendly. Because entries reference each other by integer index, not pointer, the arena has the same shape in memory, on disk, and across the network, with nothing to fix up. That makes caches that persist between sessions cheap, and opens the door to shipping compiled data between the AML server and the browser instead of rebuilding it from scratch.
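The arena-and-handle layout described above can be sketched in a few lines. This is a minimal illustration, not the AML compiler's actual data structures: `NodeIdx` is the handle name used in this post, while `NodeKind`, `Arena`, `alloc`, and `eval` are hypothetical names chosen for the example.

```rust
/// Typed handle: a u32 index into the node arena (4 bytes, `Copy`).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeIdx(u32);

/// Hypothetical node kinds; a real AST would have many more.
#[derive(Debug, PartialEq)]
pub enum NodeKind {
    Number(i64),
    Add { lhs: NodeIdx, rhs: NodeIdx },
}

/// All nodes live in one contiguous Vec and are freed together on drop.
#[derive(Default)]
pub struct Arena {
    nodes: Vec<NodeKind>,
}

impl Arena {
    /// Append a node and return its handle.
    pub fn alloc(&mut self, kind: NodeKind) -> NodeIdx {
        assert!(self.nodes.len() < u32::MAX as usize, "arena is full");
        let idx = NodeIdx(self.nodes.len() as u32);
        self.nodes.push(kind);
        idx
    }

    /// Resolve a handle back to its node.
    pub fn get(&self, idx: NodeIdx) -> &NodeKind {
        &self.nodes[idx.0 as usize]
    }

    /// Evaluate an expression by following handles instead of pointers.
    pub fn eval(&self, idx: NodeIdx) -> i64 {
        match self.get(idx) {
            NodeKind::Number(n) => *n,
            NodeKind::Add { lhs, rhs } => self.eval(*lhs) + self.eval(*rhs),
        }
    }
}
```

Children are stored as `NodeIdx` values rather than `Box` pointers, so the whole tree is one allocation that grows by appending, and dropping the `Arena` frees every node at once.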

The same compactness shows up in trivia: whitespace, comments, tabs, and other source-level details used by formatting, UI-oriented code generation, and source-to-source transforms. Instead of materializing each piece of trivia as a separate heap object, the Rust compiler stores trivia as compact rows in a shared arena: a small tag, a span into the original source text, and a few bytes of optional data per entry. The text itself never gets a separate allocation. It's recovered by slicing the source string on demand.
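As a rough sketch of that trivia layout, each row below holds a one-byte tag and a span, and the text is recovered by slicing the source. The names (`TriviaKind`, `Trivia`, `scan_trivia`) and the two trivia kinds are assumptions for illustration, and the optional per-entry data mentioned above is left out.

```rust
/// Kind of trivia; one byte in memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum TriviaKind {
    Whitespace,
    LineComment,
}

/// One compact row: a tag plus a byte span into the source. No owned text.
#[derive(Clone, Copy, Debug)]
pub struct Trivia {
    pub kind: TriviaKind,
    pub start: u32,
    pub len: u32,
}

impl Trivia {
    /// Recover the text on demand by slicing the original source.
    pub fn text<'s>(&self, source: &'s str) -> &'s str {
        let start = self.start as usize;
        &source[start..start + self.len as usize]
    }
}

/// Collect leading whitespace and `//` comments as rows (simplified scanner).
pub fn scan_trivia(source: &str) -> Vec<Trivia> {
    let bytes = source.as_bytes();
    let mut rows = Vec::new();
    let mut pos = 0;
    while pos < bytes.len() {
        let start = pos;
        let kind = if bytes[pos].is_ascii_whitespace() {
            while pos < bytes.len() && bytes[pos].is_ascii_whitespace() {
                pos += 1;
            }
            TriviaKind::Whitespace
        } else if bytes[pos..].starts_with(b"//") {
            while pos < bytes.len() && bytes[pos] != b'\n' {
                pos += 1;
            }
            TriviaKind::LineComment
        } else {
            break // first real token: stop collecting trivia
        };
        rows.push(Trivia { kind, start: start as u32, len: (pos - start) as u32 });
    }
    rows
}
```

Each row is a fixed-size `Copy` value, so a file's trivia is a single `Vec` regardless of how many comments it contains.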

This layout changed the parser dramatically. Both the TypeScript and Rust parsers are LL(1) recursive-descent parsers over the same AML grammar. Only the representation they build differs.

On a large project with more than 3,400 AML files, the TypeScript parser took around 17 seconds. The Rust parser took 53 milliseconds.

Seventeen seconds is slow, and we will not pretend otherwise. TypeScript can parse faster than that, and we could have kept optimizing the TypeScript parser: profiling it, finding the hot paths, rewriting them, and reaching for low-level tricks.

A full move to Rust was the better choice, and paid off over time. A data-oriented design effectively solves this class of performance problem, and it leaves real headroom for aggressive optimization, because Rust gives the low-level control that JavaScript withholds. That 53 milliseconds already includes some of those optimizations, the kind Rust makes easy to reach for and JavaScript does not.

Memory followed the same pattern. On the same project:

  • Source on disk: 35 MB
  • Rust AST in memory: 148 MB
  • TypeScript AST in memory: ~1,159 MB

The Rust AST is ~7.8× smaller.

Parse time by project size: TypeScript vs Rust.

AST size in memory: ~1,159 MB on TypeScript vs 148 MB on Rust.

The same property compounds across the compiler. Name resolution, scope walks, type comparison, member lookup, dependency tracking, and cache access all benefit from the same underlying representation.
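Type comparison is a good place to see why canonical handles help. The sketch below interns types so that structurally equal types share one `TypeId`, which turns equality into an integer compare. `TypeId` is the handle name used in this post; `Type`, `TypeInterner`, and `intern` are hypothetical, and the real compiler's type representation is certainly richer.

```rust
use std::collections::HashMap;

/// Canonical handle for a type: equal types always get the same id.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TypeId(u32);

/// Hypothetical type shapes; nested types refer to children by handle.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum Type {
    Number,
    Text,
    List(TypeId),
}

#[derive(Default)]
pub struct TypeInterner {
    types: Vec<Type>,
    ids: HashMap<Type, TypeId>,
}

impl TypeInterner {
    /// Return the existing handle for `ty`, or store it and mint a new one.
    pub fn intern(&mut self, ty: Type) -> TypeId {
        if let Some(&id) = self.ids.get(&ty) {
            return id;
        }
        let id = TypeId(self.types.len() as u32);
        self.types.push(ty.clone());
        self.ids.insert(ty, id);
        id
    }

    /// Look up the type behind a handle.
    pub fn get(&self, id: TypeId) -> &Type {
        &self.types[id.0 as usize]
    }
}
```

Because `List` holds a `TypeId` rather than a nested `Type`, hashing and comparing even deeply nested types stays shallow: two `List` types are equal exactly when their element handles are equal.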

2. Coarser compiler caches

Long before the Rust port, our first answer to slow compiles was caching.

The driving use case was the AML Studio editor. As users edit AML in real time, they expect immediate feedback: diagnostics on the line they just typed, autocomplete that knows the surrounding types, hover info on every identifier. None of that works if every keystroke triggers a full project recompile.

So we built an incremental query system inspired by compiler frameworks like Salsa. The idea is simple: every derived value (a parse tree, a resolved name, an inferred type) is computed by a query whose inputs and dependencies are tracked. When a file changes, only the queries that transitively depend on it recompute. Everything else is reused.
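To make the idea concrete, here is a heavily simplified sketch of one memoized query with revision tracking. It is written in Rust for consistency with the other examples in this post (the original query system was TypeScript) and is not Salsa's API: `Db`, `set_file`, and `line_count` are hypothetical names, and the derived query has a single input rather than a tracked dependency graph.

```rust
use std::collections::HashMap;

/// Input: file text plus the revision at which it last changed.
struct Input {
    text: String,
    changed_at: u64,
}

/// Cached result of a derived query, stamped with the revision it was computed at.
struct Memo {
    value: usize,
    verified_at: u64,
}

#[derive(Default)]
pub struct Db {
    revision: u64,
    files: HashMap<String, Input>,
    line_counts: HashMap<String, Memo>,
    /// Number of times the derived query actually ran (for illustration).
    pub recomputes: usize,
}

impl Db {
    /// Setting an input bumps the global revision.
    pub fn set_file(&mut self, name: &str, text: &str) {
        self.revision += 1;
        let input = Input { text: text.to_string(), changed_at: self.revision };
        self.files.insert(name.to_string(), input);
    }

    /// Derived query: reruns only if its input changed after the cached result.
    /// Panics if `name` was never set.
    pub fn line_count(&mut self, name: &str) -> usize {
        let input = &self.files[name];
        if let Some(memo) = self.line_counts.get(name) {
            if memo.verified_at >= input.changed_at {
                return memo.value; // cache hit: input unchanged since last run
            }
        }
        let value = input.text.lines().count();
        self.recomputes += 1;
        let memo = Memo { value, verified_at: self.revision };
        self.line_counts.insert(name.to_string(), memo);
        value
    }
}
```

Editing one file invalidates only that file's cached result; queries over other files keep returning their memoized values.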

An incremental query graph. Every query, from the top-level typecheckFile(A) down to the raw fileContents, caches its result and tracks what it depends on.

After editing fileContent(A), only the dirty subtree (in red) recomputes. The rest of the graph is reused from cache.

On warm paths, it worked beautifully. The editor felt instant even as projects grew.

But two things kept hurting us.

The first was cold paths. Every fresh deploy, new compiler worker, new browser session, or cold dashboard compile still had to build the world from scratch. A cache only helps if there is a cache. On a cold start there's nothing to reuse.

The second only became clear once we started seriously profiling the compiler in production. We watched where time was actually going, and the result was surprising: for a large fraction of queries, the bookkeeping cost (hash lookups, dependency edges, revision checks) was higher than the cost of just recomputing the value. We were paying cache overhead millions of times per compile to save work that was cheap to redo in the first place. The fine-grained design felt obviously right when we built it. It took running the system at scale to learn otherwise.

The TypeScript engine kept a memoized entry per AST node; the Rust engine caches one entry per file per phase.

The TypeScript engine kept a memoized entry per AST node, which was fast to invalidate at the smallest unit but expensive to maintain when millions of them exist. The Rust engine caches one entry per file per phase, trading some invalidation precision for far less bookkeeping.

The lesson wasn't Rust-specific. We could have rebuilt the TypeScript cache around a coarser grain. During the port, we replaced the fine-grained query cache instead of carrying it forward.

So the Rust compiler caches at a coarser grain, roughly per file, per phase. Parse results, name resolution, type indexes, and interpretation results live as larger cached units, so the compiler no longer pays query overhead on every lookup. This is one reason the typecheck pass improved so much.
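A per-file, per-phase cache can be sketched as a map keyed by `(file, phase)`. Everything here is an assumption made for illustration: the names `Phase`, `PhaseCache`, and `get_or_compute`, the use of a content hash for invalidation, and the fact that it ignores cross-file dependencies, which a real typecheck cache would have to account for.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Phase {
    Parse,
    Typecheck,
}

/// One cached unit per (file, phase), valid for one version of the file's text.
struct Entry {
    content_hash: u64,
    output: String,
}

#[derive(Default)]
pub struct PhaseCache {
    entries: HashMap<(String, Phase), Entry>,
    /// Number of times a phase actually ran (for illustration).
    pub misses: usize,
}

fn hash_of(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

impl PhaseCache {
    /// Return the cached output for this file and phase, or run `compute` once.
    pub fn get_or_compute(
        &mut self,
        file: &str,
        phase: Phase,
        source: &str,
        compute: impl FnOnce(&str) -> String,
    ) -> String {
        let content_hash = hash_of(source);
        let key = (file.to_string(), phase);
        if let Some(entry) = self.entries.get(&key) {
            if entry.content_hash == content_hash {
                return entry.output.clone();
            }
        }
        self.misses += 1;
        let output = compute(source);
        self.entries.insert(key, Entry { content_hash, output: output.clone() });
        output
    }
}
```

The bookkeeping cost is one hash and one map lookup per file per phase, instead of one per AST node. The price is precision: any edit to a file reruns the whole phase for that file.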

3. Cheaper hashing and allocation

This is the headroom for optimization we mentioned earlier.

Start with the smallest example. Switching the whole compiler to a faster allocator was two lines at the top of the crate.

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

That single change made the parser benchmark about 11% faster. Every allocation in the compiler (every Vec, every Box, every hash map) now routes through mimalloc, with no other code changes.

Another example: SIMD. In Rust, we can reach for SIMD explicitly, through library calls or compiler intrinsics. The lexer uses it to scan string literals and comments at native vector speed. In TypeScript, SIMD is hidden inside the V8 runtime. You get whatever it chooses to vectorize internally, with no way to ask for the specific primitives a lexer needs.
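As an illustration of what explicit SIMD looks like, the sketch below scans for the byte that ends a plain string-literal run (a quote or a backslash) 16 bytes at a time. This post does not say which instructions or libraries the AML lexer uses, so treat everything here as an assumption: the function names are hypothetical, the fast path uses SSE2 intrinsics from `std::arch` on x86_64, and other targets fall back to a scalar loop.

```rust
/// Scalar search starting at `from`; returns `bytes.len()` if nothing is found.
fn find_scalar(bytes: &[u8], from: usize) -> usize {
    bytes[from..]
        .iter()
        .position(|&b| b == b'"' || b == b'\\')
        .map_or(bytes.len(), |i| from + i)
}

/// Index of the first `"` or `\` byte, or `bytes.len()` if there is none.
#[cfg(target_arch = "x86_64")]
pub fn find_quote_or_backslash(bytes: &[u8]) -> usize {
    use std::arch::x86_64::{
        __m128i, _mm_cmpeq_epi8, _mm_loadu_si128, _mm_movemask_epi8, _mm_or_si128,
        _mm_set1_epi8,
    };
    let mut pos = 0;
    // SAFETY: SSE2 is part of the x86_64 baseline, and each unaligned load
    // reads exactly the 16 bytes of `chunk`.
    unsafe {
        let quote = _mm_set1_epi8(b'"' as i8);
        let backslash = _mm_set1_epi8(b'\\' as i8);
        for chunk in bytes.chunks_exact(16) {
            let v = _mm_loadu_si128(chunk.as_ptr() as *const __m128i);
            // Compare all 16 bytes against both needles at once.
            let hits = _mm_or_si128(_mm_cmpeq_epi8(v, quote), _mm_cmpeq_epi8(v, backslash));
            // One bit per byte; the lowest set bit is the first match.
            let mask = _mm_movemask_epi8(hits);
            if mask != 0 {
                return pos + mask.trailing_zeros() as usize;
            }
            pos += 16;
        }
    }
    // Fewer than 16 bytes remain: finish with the scalar loop.
    find_scalar(bytes, pos)
}

/// Portable fallback for targets without the SSE2 path above.
#[cfg(not(target_arch = "x86_64"))]
pub fn find_quote_or_backslash(bytes: &[u8]) -> usize {
    find_scalar(bytes, 0)
}
```

Both paths return the same index for the same input, so the vectorized version can be checked against the scalar one.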

A third example: hashing. Rust lets us swap in faster hash functions tuned for the workload: non-cryptographic hashers like xxhash and FxHash (the one used by the Rust compiler itself) for short keys, or a passthrough hasher when the keys are already well-distributed integer IDs the compiler generates. In TypeScript, Map and Set are stuck with V8's built-in hash.
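The passthrough case needs nothing beyond the standard library. The sketch below plugs a hasher that returns the integer ID unchanged into a `HashMap`. `SymbolId` is the handle name used in this post; `IdHasher` and `IdMap` are hypothetical names, and whether a passthrough hasher is actually a win depends on how the IDs are distributed, so it is worth measuring.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Integer ID generated by the compiler; hashing it feeds one u32 to the hasher.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SymbolId(pub u32);

/// Passthrough hasher: uses the integer key itself as the hash value.
#[derive(Default)]
pub struct IdHasher(u64);

impl Hasher for IdHasher {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, _bytes: &[u8]) {
        // Only u32-based keys are supported in this sketch.
        unimplemented!("IdHasher only supports u32 keys");
    }

    fn write_u32(&mut self, id: u32) {
        self.0 = u64::from(id);
    }
}

/// A HashMap keyed by SymbolId that skips the default SipHash work.
pub type IdMap<V> = HashMap<SymbolId, V, BuildHasherDefault<IdHasher>>;
```

An `IdMap` is created with `IdMap::default()` and otherwise behaves like any other `HashMap`; only the hashing step changes.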

How we kept the migration safe

The rewrite had one hard rule: identical behavior. Customers shouldn't see their AML behave differently just because the compiler underneath changed language.

We built a parity harness that ran both compilers on the same inputs and compared outputs. It ran across internal test suites and real customer projects, and a compiler phase wasn't considered complete until it matched the TypeScript compiler's behavior.

This was the most expensive part of the work. Most failures were small mismatches rather than crashes: output ordering, edge-case type inference, floating-point quirks, two serializations of an equivalent value that downstream systems treated as different.

The parity harness ran both compilers on the same inputs and compared outputs, phase by phase.
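The core of such a harness is small. The sketch below is a guess at its general shape, not the harness we run: `PhaseOutput`, `Mismatch`, and `check_parity` are hypothetical names, and each compiler is modeled as a function from source text to a serialized result.

```rust
/// Result of running one compiler phase on one input, serialized for comparison.
pub type PhaseOutput = Result<String, String>;

/// One input where the two compilers disagreed.
#[derive(Debug, PartialEq)]
pub struct Mismatch {
    pub input: String,
    pub reference: PhaseOutput,
    pub candidate: PhaseOutput,
}

/// Run both implementations on every input and collect the disagreements.
/// Errors are compared too, so both sides must fail in the same way.
pub fn check_parity<R, C>(inputs: &[&str], reference: R, candidate: C) -> Vec<Mismatch>
where
    R: Fn(&str) -> PhaseOutput,
    C: Fn(&str) -> PhaseOutput,
{
    inputs
        .iter()
        .filter_map(|&input| {
            let expected = reference(input);
            let actual = candidate(input);
            if expected == actual {
                None
            } else {
                Some(Mismatch {
                    input: input.to_string(),
                    reference: expected,
                    candidate: actual,
                })
            }
        })
        .collect()
}
```

A phase passes when `check_parity` returns an empty list. Keeping the failing input alongside both outputs is what makes a mismatch, such as a difference in output ordering, quick to chase down.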

Early production impact

The benchmark showed the Rust compiler was faster and smaller. Production metrics showed the same pattern.

P95 AML server memory dropped sharply after rollout, from multi-gigabyte usage to a few hundred megabytes.

P95 AML server memory before and after the Rust rollout.

AML server latency also became more stable, with much smaller and less frequent p99 spikes.

AML server latency before and after the Rust rollout.

How we used AI agents

The TypeScript compiler had been in active development for about four years. The Rust migration took about three months. AI coding agents helped make that possible, but not by doing a giant one-shot rewrite. That would have produced an unreviewable diff and probably a subtly wrong compiler.

The useful pattern was smaller and more mechanical:

  • Humans set the boundaries: which compiler phase to port, which behavior had to match, and which tradeoffs were acceptable.
  • Agents handled bounded implementation work: porting one pass, fixing compiler errors, translating tests, comparing diffs, or chasing a parity failure down to a specific difference.
  • The parity harness was the first gate. A ported phase was not "done" because it compiled. It had to match the TypeScript compiler on real inputs first.
  • Then humans reviewed it: the implementation itself, plus any place the Rust version intentionally diverged from TypeScript (in design, not behavior, such as a different data structure) and why. Only an approved review marked the phase done.
  • The recurring bugs from each phase were written up into a short guide, so the next phase started out avoiding the mistakes the last one hit.
The port ran phase by phase in a tight loop. Humans scope the work, agents implement it, and nothing is marked done until it matches the old compiler and clears human review. Each phase's bugs feed a guide that sharpens the next.

Agents were most useful when the task had a narrow boundary and a parity test waiting at the end. When we let agents run too far without constraints, they produced changes that looked plausible and passed tests, but were wrong in more fundamental ways. The workflow mattered more than the model: small tasks, explicit invariants, parity checks, and human review at the boundaries.

Upcoming changes for customers

Customers will feel the change in three places: deploys, editing, and reliability.

Faster deploys on large projects

On large projects, AML compilation is a major part of publish-to-production time today. In the largest cases, it can dominate the release path or require operational workarounds.

After this ships, the AML compile step will be small enough that it no longer acts as the ceiling.

(Deploys include other server-side work, but the compiler will no longer be the dominant cost on large projects.)

Faster, more responsive editing

The whole edit-and-preview loop runs through the compiler: dashboard rendering, plus AML Studio services like diagnostics, hover, autocomplete, and go-to-definition. On large projects, cold compile paths add seconds of latency today and can push AML Studio toward freezes or browser memory limits.

With Rust, that cost will drop to a small part of each request, giving teams a faster feedback loop and much more headroom on the largest projects.

Fewer OOM failures and operational workarounds

The largest practical improvement will be reliability.

Projects that crash Node under the default heap today will compile. Projects that require heap-size tuning will have more memory headroom. Browser paths vulnerable to memory pressure will no longer hit the same ceiling.

That will remove an entire class of operational problems:

  • per-project heap workarounds
  • failed compiles on the largest projects
  • browser OOMs in AML Studio
  • runbooks for compiler memory failures
  • engineering time spent managing symptoms instead of improving the product

What's next

The Rust port was deliberately conservative on the first pass. Preserving TypeScript behavior was the goal. Now that the new compiler is stable, we can push it further.

Parallelism is the obvious next gain. Parsing and typechecking are already split per file in the cache. Running them in parallel is the natural step from there. The benchmark numbers in this post are all single-threaded, so the remaining headroom on large projects is real.
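As a sketch of what per-file parallelism could look like, the example below fans a per-file phase out across scoped threads from the standard library. It is an illustration under stated assumptions, not the planned implementation: `parse_file` is a stand-in that counts non-empty lines, and `parse_all` and its `workers` parameter are hypothetical.

```rust
use std::thread;

/// Stand-in for a per-file phase; here it just counts non-empty lines.
fn parse_file(source: &str) -> usize {
    source.lines().filter(|line| !line.trim().is_empty()).count()
}

/// Run `parse_file` over all files on up to `workers` threads.
/// Results are returned in the same order as `sources`.
pub fn parse_all(sources: &[String], workers: usize) -> Vec<usize> {
    let workers = workers.max(1);
    // Split the files into contiguous chunks, one chunk per worker.
    let chunk_len = ((sources.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = sources
            .chunks(chunk_len)
            .map(|chunk| {
                scope.spawn(move || chunk.iter().map(|s| parse_file(s)).collect::<Vec<_>>())
            })
            .collect();
        // Join in spawn order so the output order matches the input order.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker panicked"))
            .collect()
    })
}
```

Scoped threads let each worker borrow its slice of `sources` directly, with no copying and no reference counting, and the output does not depend on how many workers ran.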

We're not finished. But for the first time, the compiler is fast and lean enough that the next round of improvements is about what we build on top of it, not fighting the runtime underneath.