The scale of the wager

In 2025, the five largest hyperscalers are expected to invest more than US$300 billion in data‑center construction, graphics processing units (GPUs), memory and power contracts,1 with aggregate spending over 2025–2027 forecast to exceed US$1 trillion.

Adjusted for inflation, the projected 2025 spend alone is more than the US government spent on its entire 13-year Apollo moon landing program.2

Hyperscaler boards therefore face a familiar strategic dilemma: is this similar to the Internet bubble, where early CapEx overshot demand, or the cloud wave, where capacity was eventually absorbed and richly monetized? Potentially worse, each firm likely fears that underspending cedes a permanent artificial general intelligence (AGI) lead to a rival. The result is a high‑stakes “prisoner’s dilemma.”

A simple check-up before another billion

As the graph indicates, in 2023 and 2024,1 hyperscalers allocated over 50 percent of their operating cash flows to capital expenditure (CapEx). Despite these heavy investments, returns on compute have become a growing concern.

A simple back‑of‑envelope return‑on‑compute (ROC) check‑up could be something like this:

Net present value (NPV) ≈ Σ [(price per token – variable cost per token) × tokens served × utilization] – (CapEx + fixed operating expenditure (OpEx))

If the NPV stays positive after realistic sensitivity tests, the project clears the investment hurdle. If not, capital is at risk of being stranded in a fast‑depreciating asset class — GPUs typically amortize in 18‑24 months; some high‑bandwidth‑memory (HBM) cards in just twelve.
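As a minimal sketch, the check-up can be run in a few lines of Python. Every figure below (cluster cost, token volumes, prices, utilization) is an illustrative assumption rather than a forecast, and the discount rate is an addition the prose formula leaves implicit:

```python
# Back-of-envelope return-on-compute (ROC) check-up mirroring the formula above.
# All inputs are illustrative assumptions, not forecasts.

def roc_npv(price_per_m_tokens, var_cost_per_m_tokens, m_tokens_per_year,
            utilization, capex, fixed_opex_per_year, years=2, discount_rate=0.10):
    """NPV ~ sum[(price - variable cost) x tokens served x utilization] - (CapEx + fixed OpEx).

    The 2-year horizon mirrors an 18-24 month GPU amortization window;
    the discount rate is our addition to the prose formula.
    """
    npv = -capex
    for t in range(1, years + 1):
        contribution = ((price_per_m_tokens - var_cost_per_m_tokens)
                        * m_tokens_per_year * utilization)
        npv += (contribution - fixed_opex_per_year) / (1 + discount_rate) ** t
    return npv

# Hypothetical US$500m cluster serving 200m million-token units per year.
base = dict(price_per_m_tokens=6.0, var_cost_per_m_tokens=2.5,
            m_tokens_per_year=200_000_000, utilization=0.6,
            capex=500_000_000, fixed_opex_per_year=40_000_000)
print(f"Base case NPV: US${roc_npv(**base):,.0f}")

# Sensitivity test: competition halves the achievable token price.
stressed = {**base, "price_per_m_tokens": 3.0}
print(f"Price-war NPV: US${roc_npv(**stressed):,.0f}")
```

In this hypothetical case, halving the token price flips a comfortably positive NPV deeply negative, which is exactly the kind of sensitivity the check-up is designed to expose.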

Here is an overview of some of the critical drivers for each of the ‘swim lanes’ of this formula:

ROC swim-lanes (critical drivers only): what really moves the needle in 2025-27

1. Price per token
 (a) Closed- versus open-source quality gap
 (b) Monetization model: seat licenses, application programming interface (API) calls, revenue-share
 (c) Competitive discounting and hyperscaler bundles
 (d) Enterprise shift to on-premise (on-prem) / private deployments

2. Variable cost per token
 (a) Electricity (US$/kWh)
 (b) GPU and high-bandwidth memory (HBM) depreciation schedules
 (c) Foundation-model royalties / revenue-share
 (d) Data-center networking costs

3. Tokens served
 (a) Adoption of agentic copilot workflows
 (b) Average tokens per query (reasoning chains versus single-shot)
 (c) Physical-AI workloads (robots, autonomous vehicles (AVs), industrial Internet of Things (IoT))

4. Utilization rate
 (a) How often GPUs sit idle versus billed
 (b) Bottlenecks (memory, networking, energy, etc.)
 (c) Edge / off-prem inference off-loading

5. Fixed outlays (CapEx + OpEx)
 (a) GPU average selling price (ASP) trends
 (b) Data-center build cost per megawatt (MW): land, cooling, fit-out
 (c) Long-term power lock-in contracts (e.g., small modular reactor (SMR) nuclear power purchase agreements (PPAs))
 (d) Staff and support overhead; software licensing

The rest of this article explores six of the major variables that can swing the ROC calculation.

Where are we on the compute S-curve?

For fifteen years GPU growth followed a classic first S‑curve: every additional dollar of pre‑training compute delivered impressively higher model quality. The proposed xAI Colossus (100,000 GPUs requiring 300 MW of power) and OpenAI ‘Stargate’ project (500,000 GPUs) assume this exponential trajectory holds.

Source: Bonus Clouded Judgement – Inference Time Compute3

Compute scaling phases

1. Pre-training
 Primary spend: massive GPU clusters
 Bottlenecks: GPU supply; access to fresh data

2. Post-training (fine-tuning)
 Primary spend: GPU + high-bandwidth memory (HBM), human-in-the-loop
 Bottlenecks: data-labor for human feedback; fast, low-cost fine-tune silicon

3. Test-time (inference)
 Primary spend: inference accelerators, memory bandwidth
 Bottlenecks: latency and memory (context window, etc.); energy cost per token

But evidence from late‑2024 runs suggests marginal gains from brute‑force pre‑training are flattening or even diminishing.4 Many laboratories are now redirecting spend into the next two overlapping, higher S‑curves: post-training and test-time scaling.

In January 2025, DeepSeek-R1, a Chinese large language model (LLM), was recognized as one of the leading reasoning models, surpassing many US models. The development team achieved this at low CapEx through a focus on post-training, including reinforcement learning with human feedback (RLHF).

Strategic implication: As the era of massive pre-training subsides, ongoing inference for tasks such as agents and robotics pushes workloads toward distributed high-performance computing (HPC). Enterprises will also use on-prem solutions to keep data local, and re-use idle GPUs for inference workloads. With this compute shift, high-bandwidth memory becomes the critical factor: the ability to field longer ‘context windows’ at speed.
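A rough sketch of why: the key/value cache a transformer must hold per request grows linearly with context length, so long contexts are gated by memory capacity and bandwidth rather than raw compute. The model dimensions below are hypothetical, not any specific product:

```python
# Approximate key/value (KV) cache held in HBM per inference request.
# Model dimensions are illustrative assumptions.

def kv_cache_gb(context_len, n_layers=80, n_kv_heads=8, head_dim=128,
                bytes_per_value=2, batch_size=1):
    # Factor of 2: one key tensor and one value tensor per layer.
    total_bytes = (2 * n_layers * context_len * n_kv_heads
                   * head_dim * bytes_per_value * batch_size)
    return total_bytes / 1e9

for ctx in (8_000, 128_000, 1_000_000):
    print(f"{ctx:>9,} tokens -> {kv_cache_gb(ctx):6.1f} GB of HBM per request")
```

At these assumed dimensions, a single million-token request would outgrow the HBM of any single accelerator, which is why memory, not FLOPs, gates test-time scaling.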

Monetization and business models

How exactly will these AI capabilities make money? It’s one thing to have a jaw-dropping demo; it’s another to have users or enterprises pay for it sustainably. Several models are being tried:

  • cloud usage fees (e.g. pay per 1,000 API calls)
  • software-as-a-service (SaaS)-style subscriptions (e.g. ChatGPT Plus at US$20/month)
  • indirect monetization (more engagement leading to more advertising revenue)

The unit economics of AI services can be challenging – serving one AI query can cost 10× or 100× more than a traditional software query – and this cost is increasing with the latest ‘reasoning’ models. To remain viable, either prices must rise or costs fall.

At the same time, token floor prices for older models decline rapidly as new models are introduced. As of May 2025, hyperscaler wholesale rates for GPT‑4 Turbo hover around US$5–7 per million tokens in committed enterprise deals, down from US$30 just a year earlier.5
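A quick worked example of the squeeze, using the prices above and an assumed variable cost of US$2.50 per million tokens:

```python
# Contribution margin per million tokens as floor prices deflate.
# Prices follow the text; the variable cost figure is an assumption.

VAR_COST = 2.5  # assumed US$ per million tokens (energy, depreciation, royalties)

for label, price in [("Mid-2024 rate", 30.0), ("May 2025 committed rate", 6.0)]:
    margin = price - VAR_COST
    print(f"{label}: US${price:.0f}/M tokens -> US${margin:.1f} margin "
          f"({margin / price:.0%} of revenue)")
```

Unless variable cost per token falls at least as fast, the same price deflation compresses gross margin from roughly 92 percent to under 60 percent.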

Strategic implication: A big question is whether end-users will pay directly for AI, or whether it will mostly be an embedded feature where companies absorb the cost and try to recoup via higher productivity or retention. Enterprises appear to be willing to pay for tangible productivity gains – hence all the AI copilots targeting coding, document generation, etc. However, few consumers pay for search or email now, so will they be willing to pay for an AI assistant? OpenAI’s success in growing paid subscriptions suggests some will, but loyalty is fickle — consumers quickly migrate to the new ‘best’ model.

Open-source squeezes token prices

Will open-source models and tools dominate the landscape, or will proprietary offerings maintain an edge? This debate is heated.

When a recent, high-profile model was open‑sourced, few foresaw how quickly community fine‑tunes would close the quality gap with closed-source models such as ChatGPT. January 2025’s launch of a Chinese model went further, delivering frontier‑level accuracy on a cluster costing roughly a tenth of GPT‑4’s; MiLM‑2 followed weeks later.

Cheaper, high‑quality, open-source models place downward pressure on price per token and potentially push value up the stack toward applications and software, data owners, and device and distribution moats.

Strategic implication: Enterprise buyers’ perceptions could be critical: if companies feel that open-source models are “good enough” for their needs, it could shift spend away from proprietary APIs to more on-prem AI.

Energy moves to center stage

Silicon is no longer the only scarce input. The International Energy Agency (IEA) projects global data center electricity consumption to more than double to around 945 TWh by 2030, with more than half of such growth in the US driven by AI.

Many electricity grids are already under strain. This is compounded by the concentration of data centers in certain locations – i.e. near large population centers. Wait times for critical grid components are extending. Many in the sector believe that power will be the key future bottleneck.

To mitigate these risks, operators have been looking to new energy sources, with some signing two‑decade nuclear power purchase agreements (PPAs), including small‑modular‑reactor (SMR) deals, to secure 24/7 baseload at predictable cost. For more on this, please refer to our first Future Forward article on the Electricity Economy.

Strategic implication: For the ROC equation, variable cost per token may soon be driven as much by US$/kWh as by chip amortization.
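To gauge the scale of that exposure, consider a sketch of the annual power bill for a 300 MW campus (the Colossus-scale figure cited earlier); the load factor and tariffs are assumptions:

```python
# Annual electricity bill for a 300 MW AI campus under two tariff scenarios.
# Load factor and tariffs are illustrative assumptions.

SITE_MW = 300          # campus power envelope, per the Colossus example above
LOAD_FACTOR = 0.8      # assumed average draw versus the envelope
HOURS_PER_YEAR = 8760

annual_mwh = SITE_MW * LOAD_FACTOR * HOURS_PER_YEAR

for scenario, usd_per_mwh in [("long-term nuclear/SMR PPA", 40),
                              ("strained-grid spot market", 120)]:
    cost_m = annual_mwh * usd_per_mwh / 1e6
    print(f"{scenario}: ~US${cost_m:,.0f}m per year")
```

Under these assumptions, the tariff alone swings annual operating cost by more than US$150 million per site, which is precisely why operators are locking in two-decade PPAs.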

Demand: Agentic and physical AI multiply tokens

Many institutions believe agentic AI is the next frontier.6 There is already experimentation with systems where multiple specialized AI agents communicate via API calls into multiple software apps and collaborate to handle complex tasks (one agent might be good at math, another at coding, another at planning; together they solve a problem). This has the potential to be more efficient than one monolithic model trying to do everything. If multi-agent approaches prove effective, the infrastructure might shift to orchestrating many smaller models. That would change the profile of compute (more distributed, possibly more memory- and communication-heavy).

Related, inference demand is not linear. Early chatbots averaged 1,000–2,000 tokens per call; multi‑step agentic tasks easily consume 50,000+ tokens, while a single autonomous‑vehicle fleet can generate terabytes of inference data daily. That swings tokens served and utilization sharply higher, as the rough comparison below illustrates.
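In the sketch below, call volumes and token counts are illustrative assumptions:

```python
# Tokens served scale with tokens per task, not just user counts.
# Workload volumes below are illustrative assumptions.

workloads = {
    "single-shot chatbot": dict(calls_per_day=10_000_000, tokens_per_call=1_500),
    "agentic workflow":    dict(calls_per_day=1_000_000, tokens_per_call=50_000),
}

for name, w in workloads.items():
    daily_tokens = w["calls_per_day"] * w["tokens_per_call"]
    print(f"{name}: {daily_tokens / 1e9:.0f}B tokens per day")
```

In this sketch the agentic workload serves ten times fewer requests yet more than triples the tokens served.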

Strategic implication: Rapid adoption of agentic and physical AI could exponentially increase demand for inference. However, in the case of physical AI, much of this will need to take place at the edge rather than in the cloud.

Asia Pacific’s parallel supply chain

North American export controls on A100/H100 GPUs7 have accelerated indigenous GPU projects and local lithography efforts in Asia Pacific.

Strategic implication: A dual-track market could compress global chip prices. In the short term, this is likely to undermine hyperscalers’ past depreciation assumptions. In the longer term, it could make AI compute less capital-intensive, boosting return on capital and model access in cost-sensitive markets.

Investors see opportunities in real-world AI applications

As AI applications become more integrated into everyday operations, the demand for efficient, real-time, and specialized AI solutions grows. Investors are recognizing the value in startups that not only develop AI models but also focus on optimizing and deploying these models effectively. The surge in venture capital (VC) investments towards post-training and inference technologies underscores a broader industry transition.

Over the past three years, VC investments in AI have increasingly focused on post-training, inference, and fine-tuning technologies. This trend reflects a strategic shift towards deploying and optimizing AI models in real-world applications.

Looking forward

AI has already set a new high‑water mark for corporate investment, but history should remind us that extraordinary CapEx does not automatically translate into extraordinary returns. Whether the current cycle ends up looking like the profitable cloud build‑out or the over‑built dot‑com era may hinge on how management teams navigate six interlocking forces that feed directly into the return‑on‑compute (ROC) equation:

Factor 1: Compute S-curves (pre-training → post-training → test-time)
 ROC swim-lanes: tokens served; utilization
 Why it now matters: Diminishing pre-training gains force the industry to chase volume-heavy inference and memory-heavy reasoning, shifting demand from GPUs to HBM and smarter scheduling.

Factor 2: Token economics (pricing versus cost deflation)
 ROC swim-lanes: price per token; variable cost per token
 Why it now matters: Open-source fine-tunes and usage-based business models push price per token down; success will depend on driving cost per token down even faster.

Factor 3: Open-source acceleration
 ROC swim-lanes: fixed OpEx (royalties); CapEx agility
 Why it now matters: Community models such as DeepSeek-R1 prove frontier quality at a fraction of historical budgets, allowing enterprises to redeploy spend higher up the stack.

Factor 4: Emerging bottlenecks (energy, HBM, latency)
 ROC swim-lanes: variable cost per token; utilization
 Why it now matters: Power, memory bandwidth and edge latency, not GPUs, become the new choke-points, determining how much of the installed fleet is actually sweated.

Factor 5: Demand shock from agentic and physical AI
 ROC swim-lanes: tokens served
 Why it now matters: Agents, factory robots and autonomous fleets multiply inference load, but only if end-users perceive clear value and shoulder the bill.

Factor 6: China's parallel supply chain
 ROC swim-lanes: CapEx; variable cost per token
 Why it now matters: Export controls have catalyzed a domestic GPU and lithography stack that could cut accelerator average selling prices (ASPs) by ~40%, lowering both upfront build costs and ongoing depreciation.

Strategic takeaway: Infrastructure owners that embed a ‘return‑on‑compute’ discipline — optimizing each swim‑lane before committing the next dollar — should translate today’s spend into tomorrow’s cash‑flows. Those that chase capacity for capacity’s sake risk owning stranded, fast‑depreciating assets.




Our people

Barnaby Robson

Partner, Head of Value Creation, China, Head of Deal Strategy, Hong Kong, Head of Financial Services Deals, Hong Kong

KPMG in China

Javier Rodriguez

Global Head of Strategy

KPMG International

Sanjay Sehgal

Head of AI and Analytics Solutions – Global Advisory, KPMG US

KPMG in the U.S.


1 Capital IQ analysis and datacenterdynamics.com
2 The Planetary Society
3 Bonus Clouded Judgement - Inference Time Compute
4 Centre for Future Generations
5 https://gptforwork.com/help/billing/pay-per-use-packs/how-it-works
6 https://www.adamsstreetpartners.com/insights/the-next-frontier-the-rise-of-agentic-ai/; CMO Today; Harvard Business Review; WisdomTree
7 Silicon technology powering business