Understanding Vercel Usage

2026-05-30 · 12 min read

The first time I opened the Vercel usage dashboard, I genuinely did not know what I was looking at. I had expected a single number: bandwidth, maybe, or "compute hours." Instead I found something closer to fifty separate meters spread across eleven categories, with names like Fast Data Transfer, Fast Origin Transfer, Edge Requests, ISR Reads, ISR Writes, Image Transformations, Function Invocations, Active CPU, Provisioned Memory, Build CPU Minutes, Blob operations, Queue sends, Trace Spans, and a long tail of things I had never heard of. It looked less like a bill and more like a parts catalog for a machine I had been driving without ever opening the hood.

I'm on the free Hobby tier, so none of this costs me anything. But I think usage literacy matters even when you're paying zero, and here's why. The same meters that read "free" on Hobby are the exact meters that read "dollars" on Pro — the only thing that changes is the column to the right. If I understand what drives each number now, I'll already know where the money is when a project graduates to a paid plan. And the plans behave differently in a way worth internalizing up front. On Pro, when you hit 100% of an allowance, your deployments are not automatically stopped — that resource simply switches to on-demand (pay-as-you-go) billing so your site stays live through a traffic spike. On Hobby there is no overage to buy: exceed an included allowance and that feature pauses until the allowance resets (in most cases you wait until 30 days have passed). Hobby is also restricted to genuinely non-commercial personal use under Vercel's Fair Use Guidelines — anything commercial is supposed to be on Pro or above.

So this post is me reading my own dashboard, top to bottom, and writing down what every meter actually means. All the numbers and units below come straight from Vercel's official docs.

The mental model: five buckets and a billing rule

Once I stopped reading the meters alphabetically and started grouping them by what they're for, the whole thing collapsed into five buckets:

Get bytes to users — the CDN layer: data transfer and request counts.
Render and cache — turning data into pages and remembering the result: ISR, the Data Cache, image optimization.
Run your code — functions, Fluid Compute, edge runtimes, middleware.
Ship it — builds and the Turborepo remote cache.
Data and platform add-ons — Blob storage, Queues, Cron, observability, flags, BotID, Sandbox.

Underneath all of them sits one billing rule, and it's the most important sentence in this whole post. Vercel's "Managed Infrastructure" is billed by actual resource consumption — data transferred, requests made, compute duration — not a flat fee. Every product breaks down into resources, and each resource has an included allowance plus an on-demand rate beyond it. On Pro, the order of charging is: each resource first consumes its own included allocation; then usage beyond that draws against a $20/month credit that applies across all managed-infrastructure resources; and only after that credit is exhausted does it bill at the true on-demand per-unit rate. The credit and allocations reset monthly and don't roll over. On Hobby there's no credit and no overage — the included allowance is the whole story, and crossing it pauses the feature. Vercel's pricing overview is the canonical reference for this structure.
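To make that charging order concrete for myself, I wrote it out as a small calculation. This is a minimal sketch, not Vercel's billing code: the `ResourceUsage` shape and the `proBill` function are names I made up, and the only real figures are the ones already quoted in this post (the $20 monthly credit, and Function Invocations at 1M included and $0.60 per additional million).

```typescript
// Hypothetical shape for one metered resource; not a Vercel API.
interface ResourceUsage {
  name: string;
  used: number;        // units consumed this cycle
  included: number;    // units in the plan's included allocation
  ratePerUnit: number; // on-demand price in dollars per unit
}

interface Bill {
  overageBeforeCredit: number; // dollar value of usage beyond included allocations
  creditApplied: number;       // portion absorbed by the shared monthly credit
  amountDue: number;           // what is left to bill at on-demand rates
}

// Pro-style ordering: included allocation first, then the shared credit,
// then true on-demand billing for whatever remains.
function proBill(resources: ResourceUsage[], monthlyCredit = 20): Bill {
  const overageBeforeCredit = resources.reduce((sum, r) => {
    const beyondIncluded = Math.max(0, r.used - r.included);
    return sum + beyondIncluded * r.ratePerUnit;
  }, 0);
  const creditApplied = Math.min(monthlyCredit, overageBeforeCredit);
  return {
    overageBeforeCredit,
    creditApplied,
    amountDue: overageBeforeCredit - creditApplied,
  };
}

// Units here are "millions of invocations": 51M used, 1M included, $0.60 per extra million.
const bill = proBill([
  { name: "Function Invocations", used: 51, included: 1, ratePerUnit: 0.6 },
]);
console.log(bill);
```

In that example the 50 million invocations beyond the allowance are worth about $30, the credit absorbs $20 of it, and roughly $10 is billed on-demand. On Hobby there is no second or third step: the feature pauses at the first boundary.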

With that frame, here's the line-by-line tour.

Get bytes to users (the CDN)

This is the networking bucket: the cost of moving bytes between Vercel's edge and either the visitor or your own origin compute. Every page view touches at least two of these meters at once — a transfer and a request. Most of them are regionally priced, meaning the per-unit rate varies by the region where requests originate.

Metric	What it measures	Unit / how it's billed	What drives it up	How to lower it
Fast Data Transfer	Data sent between the CDN and the visitor's device	GB, regional; counts the full body, all headers, the full URL, and any compression	Big page weights, unoptimized images, large JS/CSS bundles, high traffic, large uploads	Image optimization, trim bundles, smaller assets; use the Regions filter to find heavy traffic
Fast Origin Transfer	Data sent between the CDN and Functions / Middleware / Blob / Data Cache	GB, regional; billed both directions of compute I/O	Large/extraneous function responses, big uploads, Middleware that also hits a function (can count twice), ISR generation	Return only needed fields, add Cache-Control, support ETag/If-Modified-Since, scope Middleware with a matcher
Edge Requests	Count of requests the CDN processes (static assets and functions)	Raw count, regional; each request = 1	High visit volume, many assets per page, re-mounting that refetches, polling / refetch-on-focus	Cache, cut re-mounting (watch repeated 304s), reduce polling, consolidate assets
Edge Request CPU Duration	CPU the CDN burns on routing per request	CPU time; ≤10 ms per request is free, then metered in 10 ms increments, regional	Many routes, many redirects, complex routing regexes	Fewer routes/redirects, simpler regexes; diff the deployment where it jumped
Microfrontends Routing	Requests routed across your microfrontends	Count of routed requests; each = 1	Traffic across a microfrontends setup, cross-microfrontend navigations, more projects	Optimize navigations, consolidate microfrontend projects

On included allowances: Hobby gives you 100 GB of Fast Data Transfer per month and 1,000,000 Edge Requests; Pro raises those to 1 TB and 10,000,000 respectively. Fast Origin Transfer and Edge Request CPU Duration have no published flat included tier in the pricing tables — the only "free" provision documented for CPU Duration is that 10 ms-per-request threshold. Microfrontends Routing includes 50,000 requests/month on Hobby (Pro has no fixed tier). The full breakdown lives in Vercel's CDN usage docs.

Render and cache

This bucket is about not doing work twice. It covers Incremental Static Regeneration (ISR), the Data Cache, and image optimization. The recurring theme: a free ephemeral CDN cache sits in front of a billed durable cache, and you pay when requests fall through to the durable layer.

ISR and the Data Cache share an elegant unit: the 8 KB Read Unit and 8 KB Write Unit. A read happens on a CDN miss that falls back to the durable cache in your function region; a write happens when content is first generated or when revalidation produces different content. The killer detail I didn't expect: if revalidation runs but the regenerated content is identical to what was cached, no write units are incurred at all — and that's true for both time-based and on-demand revalidation. Writes cost exactly 10x reads per unit.
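Here is how I picture the unit math. Treat it as a sketch under two assumptions of mine: that "8 KB" means 8,192 bytes, and that a payload rounds up to whole units with a minimum of one. The rates are the low end of the regional ranges quoted in this post, and the function names are hypothetical.

```typescript
const UNIT_BYTES = 8 * 1024; // assumption: "8 KB" means 8,192 bytes

// Assumption: a payload is rounded up to whole units, minimum one.
function unitsFor(payloadBytes: number): number {
  return Math.max(1, Math.ceil(payloadBytes / UNIT_BYTES));
}

// Low end of the regional ranges quoted in this post, in dollars per million units.
const READ_RATE_PER_MILLION = 0.4;
const WRITE_RATE_PER_MILLION = 4.0;

// Estimated cost of serving one cached payload `reads` times and storing it `writes` times.
function cacheCost(payloadBytes: number, reads: number, writes: number): number {
  const units = unitsFor(payloadBytes);
  const readCost = (units * reads * READ_RATE_PER_MILLION) / 1_000_000;
  const writeCost = (units * writes * WRITE_RATE_PER_MILLION) / 1_000_000;
  return readCost + writeCost;
}

// A 20,000-byte page occupies three units under these assumptions.
console.log(unitsFor(20_000), cacheCost(20_000, 1_000_000, 10_000));
```

The useful takeaway is the shape rather than the dollar figure: payload size multiplies every read and write, and a write costs as much as ten reads of the same payload.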

Metric	What it measures	Unit / how it's billed	What drives it up	How to lower it
ISR Reads	Data read from the durable ISR cache on CDN misses	8 KB Read Units, regional ($0.40–$0.64 per 1M)	CDN cache misses, short revalidation intervals, large payloads, frequent deploys (each gets a fresh cache)	Longer revalidation, selective pre-rendering, smaller payloads, cost-optimal region
ISR Writes	New data durably written to the ISR cache	8 KB Write Units, regional ($4.00–$6.40 per 1M)	Short revalidation, aggressive on-demand revalidation, non-deterministic output (`new Date()`, `Math.random()`)	Longer intervals, event-driven revalidation, remove non-determinism
Data Cache Reads	Cached `fetch`/`unstable_cache` results read back (Next.js 14 and below)	8 KB Read Units, regional ($0.40–$0.64 per 1M)	High cache-hit traffic, large cached payloads, running in multiple regions (one cache each)	Cache only needed fields, consolidate regions
Data Cache Writes	New/revalidated cache entries stored	8 KB Write Units, regional ($4.00–$6.40 per 1M)	Frequent changed-content revalidation, large items, non-deterministic output	Longer intervals, on-demand revalidation, deterministic output
Data Cache Bandwidth	Data movement for the cache (not a separate billed line)	Surfaces under Fast Data / Fast Origin Transfer, not its own meter	Larger payloads, higher traffic, cross-region movement	Optimize the underlying transfer resources
Data Cache Revalidations	Count of revalidation events	A count in Observability, not a priced unit	Short intervals, frequent on-demand calls, broad cache tags	Lengthen intervals, scope tags tightly, deterministic output
Image Transformations	Optimizing operations (per cache miss/stale)	1 transformation each, regional ($0.05–$0.0812 per 1K)	Many width/quality/format variants, multiple formats, short TTLs	Long TTL, single format, fewer sizes/qualities
Image Cache Reads	Optimized images fetched from the shared global cache	8 KB Read Units ($0.40–$0.64 per 1M)	Cross-region traffic, larger optimized files	Lower quality, efficient formats, keep in-region caches warm
Image Cache Writes	Optimized images written to the global cache (per miss/stale)	8 KB Write Units ($4.00–$6.40 per 1M)	Many first-time variants, short TTLs, large files	Long TTL, fewer variants, lower quality
Source Images (Legacy)	Unique source images requested (old model)	$5.00 per 1,000 source images	Number of distinct source `src` values requested	Reuse sources; mark tiny/SVG/GIF as unoptimized

Two things worth flagging. The ISR and Data Cache reads/writes don't list a published included allowance in the pricing tables — they appear only as on-demand rows, so I don't assume a free tier there. Image optimization does: Hobby includes 5,000 transformations, 300,000 image cache reads, and 100,000 image cache writes per month. The "Source Images" line is the legacy model that the transformation-based model replaced in early 2025; it bills per unique source rather than per variant. Vercel's ISR pricing page and image optimization pricing are the references here.

Run your code

This is the bucket I find most interesting, because the billing model under it genuinely changed. The headliner is Fluid Compute.

Here's the short version. Traditional serverless billed a single wall-clock number: memory multiplied by elapsed time, including every millisecond your function sat idle waiting on a database query or an AI model call. Fluid Compute splits that into separate dimensions and, crucially, bills Active CPU only while your code is actually running. If a function spends 100 ms processing data and 400 ms waiting on a DB query, you're billed for the 100 ms of CPU, not the 400 ms of waiting. Memory still bills for as long as the instance has work in progress, I/O wait included (all 500 ms in that example, because the memory stays reserved), but between requests, once the instance pauses, nothing is billed. Vercel's own one-liner is "you pay for memory whenever work is in progress, never for idle CPU, and nothing at all between requests." On top of that, multiple invocations can share one instance (optimized concurrency), so fewer instances run for the same traffic. It's been the default for new projects since April 2025.
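The 100 ms / 400 ms example is easier to hold onto as arithmetic. The sketch below compares the two models for a single request; the `Invocation` type and both functions are my own illustration, and it deliberately ignores concurrency (when several invocations share one instance, they share that memory too).

```typescript
interface Invocation {
  cpuMs: number;    // time spent actually executing code
  waitMs: number;   // time spent idle on I/O (database, AI model, and so on)
  memoryGb: number; // memory configured for the instance
}

// Legacy model: one number, memory x wall-clock time, I/O wait included.
function legacyGbSeconds(inv: Invocation): number {
  return (inv.memoryGb * (inv.cpuMs + inv.waitMs)) / 1000;
}

// Fluid model: CPU is metered only while code runs;
// memory is metered for the whole time work is in progress.
function fluidUsage(inv: Invocation): { activeCpuSeconds: number; provisionedGbSeconds: number } {
  return {
    activeCpuSeconds: inv.cpuMs / 1000,
    provisionedGbSeconds: (inv.memoryGb * (inv.cpuMs + inv.waitMs)) / 1000,
  };
}

const request: Invocation = { cpuMs: 100, waitMs: 400, memoryGb: 1 };
console.log(legacyGbSeconds(request), fluidUsage(request));
```

For this request the memory dimension covers the full half second in both models, but the CPU dimension shrinks to a tenth of a second. The more I/O-bound a function is, the bigger that gap gets.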

Metric	What it measures	Unit / how it's billed	What drives it up	How to lower it
Function Invocations	Each request that reaches the function	Count (per million); success + error both count, cache hits excluded	Traffic, chatty clients, uncached routes, polling, ISR revalidations	Cache responses (cache hits don't count), batch calls, static/ISR
Function Duration (GB-Hours)	Legacy/non-Fluid: memory × wall-clock time	GB-Hours, includes I/O wait	Long execution, high memory, high volume	Enable Fluid Compute; lower memory, cap `maxDuration`, cache
Function Throttles	Requests not served because the concurrency limit was hit	Count — not a billed metric (health signal)	Traffic spikes outrunning burst scaling, slow functions holding instances	Reduce duration, enable Fluid concurrency, spread across regions
Fluid Provisioned Memory	Memory reserved for the instance lifetime (incl. I/O)	GB-Hours, regional	Higher configured memory, long instance lifetimes, background `waitUntil`	Right-size memory, optimized concurrency, shorten I/O waits
Fluid Active CPU	CPU time while code is actively running (not I/O)	CPU time, billed per CPU-hour, regional	CPU-bound work (image processing, crypto, parsing)	Offload heavy compute, cache results, optimize algorithms
Edge Function Invocations	Times an Edge-runtime function runs (deprecated runtime)	Count; success + error, cache hits excluded	Raw traffic, low cache-hit ratio, polling	Cache; migrate to Node.js on Fluid Compute
Edge Execution Units	Edge CPU bucketed into 50 ms blocks	1 unit = up to 50 ms CPU (net, excludes I/O wait)	CPU-bound work per invocation	Keep per-request CPU under 50 ms, cache, migrate
Edge Function CPU Time	Net CPU the Edge function spends computing	CPU ms, then bucketed into 50 ms units	Computation (parse, crypto, render); not I/O waits	Reduce compute, cache, push I/O to async awaits
Edge Middleware Invocations	Times Routing Middleware runs (before the cache)	Count, priced via Fluid Compute model	Broad matchers, site-wide auth/redirect/personalization	Narrow the matcher, return early, exclude static assets
Edge Middleware CPU Time	Net CPU the middleware spends per invocation	CPU ms; Fair Use caps it at 50 ms average	Heavy routing/regex, JWT/crypto, big cookie parsing	Keep middleware lightweight, push heavy work downstream

On allowances: 1,000,000 Function Invocations are included on both Hobby and Pro, with Pro on-demand at $0.60 per million beyond that. Active CPU includes 4 CPU-hours on Hobby (Pro 16), and Provisioned Memory includes 360 GB-hours on Hobby (Pro 1440). Worth noting that Edge Functions are now officially deprecated — the docs literally title the page "Do not use Edge Functions" — and Vercel directs new work to the Node.js runtime on Fluid Compute, where the unified Invocations + Active CPU + Provisioned Memory model applies. Throttles, importantly, are not a chargeable metric; they're a reliability signal telling you a concurrency ceiling got hit. Details live in the Functions pricing docs and the Fluid Compute docs.

Ship it

Builds have their own little economy, and it took me a minute to realize there are two distinct billing shapes depending on the build machine. Fixed machines (Standard, Enhanced, Turbo) bill per build minute. The auto-scaling Elastic machine bills per CPU minute instead — because it varies its vCPU count (4 to 30) per build, Vercel charges by duration times the vCPUs it assigned. On Hobby you can't use the larger machines and aren't billed for builds at all.
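The two shapes are simple enough to write down. This is a rough sketch with hypothetical function names; the example rates are the "from" prices that appear in this post's build table ($0.014 per build minute, $0.0035 per CPU minute), so treat them as illustrative rather than a quote for any particular machine.

```typescript
// Fixed machines (Standard, Enhanced, Turbo): billed per build minute.
function fixedMachineCost(buildMinutes: number, ratePerMinute: number): number {
  return buildMinutes * ratePerMinute;
}

// Elastic machine: billed per CPU minute, where CPU minutes = duration x assigned vCPUs.
function elasticCpuMinutes(buildMinutes: number, assignedVcpus: number): number {
  return buildMinutes * assignedVcpus;
}

function elasticCost(buildMinutes: number, assignedVcpus: number, ratePerCpuMinute: number): number {
  return elasticCpuMinutes(buildMinutes, assignedVcpus) * ratePerCpuMinute;
}

// A 10-minute build on a fixed machine vs. on Elastic with 4 vCPUs assigned.
const fixed = fixedMachineCost(10, 0.014);
const elastic = elasticCost(10, 4, 0.0035);
console.log(fixed, elastic);
```

With these example rates the two come out about the same at 4 vCPUs. The Elastic number only pulls ahead if the extra vCPUs shorten the build enough to offset the multiplier, which is why build duration is the lever that matters in both shapes.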

Metric	What it measures	Unit / how it's billed	What drives it up	How to lower it
Build Time	Wall-clock duration of one build/deploy	Duration (diagnostic, not its own bill); 45 min hard cap	Large apps, heavy bundling/type-checking, cold caches, big monorepos	Build cache, remote caching, skip unaffected projects
Number of Builds	Count of build events (one per deploy)	Integer count	Frequent pushes, many previews, monorepo projects deploying per push	Skip unaffected projects, Ignored Build Step, fewer redeploys
Build Minutes	Billed compute for Standard/Enhanced/Turbo machines	Per build minute at the machine's rate (Pro from $0.014–$0.126/min)	Long builds, larger machines, On-Demand Concurrency, frequent deploys	Cut duration, prefer Elastic, don't over-provision to Turbo
Build CPU Minutes	Billed compute for the Elastic machine	Per CPU minute (from $0.0035), = duration × assigned vCPUs	Long Elastic builds, CPU/memory-bound work upgrading vCPUs	Cut duration, let Elastic down-size, trim CPU-heavy steps
Artifacts Time Saved	Time avoided by reusing cached task outputs	Aggregate time — not billed (higher is better)	More cache hits, shared work, expensive cached tasks	N/A — this is the savings, maximize it
Remote Cache Artifacts	Count of artifacts up/downloaded to the Remote Cache API	Count (not billed; fair-use limits apply to size, not count)	Many cacheable tasks, team/CI size, frequent builds	Cache deliberately; stay under per-minute request limits
Remote Cache Total Size	Size of artifacts moved through the Remote Cache	GB — not billed (free under fair use)	Caching large outputs (e.g. the whole `.next` folder)	Cache only needed files; scope task `outputs` tightly

This is where Turborepo remote caching earns its keep. The three Artifacts meters are all marked "Priced: No" — Vercel's Remote Cache is free on all plans subject to fair use. What it buys you is build time: unchanged tasks replay from the cache instead of rebuilding, which is exactly what "Artifacts Time Saved" measures. Fair-use upload limits are 100 GB/month on Hobby and 1 TB on Pro, and uploaded artifacts auto-expire after 7 days. The one trap: don't cache the .next/cache folder — it's for development and just bloats your remote cache size. Vercel's remote caching docs cover this.

Data and platform add-ons

This is the long tail — everything that isn't core request/render/run. I'll group it into one big reference table since most of these are opt-in products you either use or you don't.

Metric	What it measures	Unit / how it's billed	What drives it up	How to lower it
Blob Storage	Data stored in your blob store	GB-month (15-min snapshots averaged), regional	Large media, long retention, orphaned old versions	Delete unused blobs (free), compress media, prune promptly
Blob Simple Operations	`head()` calls + cache-miss URL accesses	Count, regional ($0.35–$0.56 per 1M)	Low cache-hit ratio, frequent `head()`, blobs >512 MB	Long `cacheControlMaxAge`, immutable blobs, conditional reads
Blob Advanced Operations	`put`/`upload`/`copy`/`list` actions	Count, regional ($4.50–$7.00 per 1M)	Frequent uploads, multipart (each part counts), unpaginated `list()`, dashboard browsing	Batch/paginate, fewer parts, cache `list()` results
Blob Data Transfer	Outbound download bandwidth	GB, regional ($0.05–$0.117); uploads don't count	Popular/large downloads, low cache-hit, blobs >512 MB	Cache aggressively, serve public assets via browser/CDN
Queue Message Sends	Each publish to a topic	Per API op, 4 KiB chunks; regional ($0.60–$0.96 per 1M); idempotency keys = 2x	High publish throughput, large payloads, idempotency keys	Batch messages, send IDs not blobs, key only when needed
Queue Message Receives	Each consume from a consumer group	Per API op, 4 KiB chunks; 1–10 messages per call	Frequent polling, fan-out across consumer groups	Receive batches of 10, back off empty polls, fewer groups
Queue Message Deletes	Each acknowledgment of a processed message	Per API op, 4 KiB chunks	Volume of processed messages, fan-out	Reduce total messages, trim unnecessary consumer groups
Queue Notifications	Push-mode callback deliveries to your function	Per API op, 4 KiB chunks; max-concurrency = 2x	Push mode, high throughput, retries, fan-out	Use poll mode where cheaper, idempotent consumers
Queue Visibility Changes	Each extension of a message's visibility timeout	Per API op, 4 KiB chunks	Long-running consumers heartbeating timeouts	Longer initial timeout, smaller messages, fewer extensions
Cron Job Invocations	Each time a scheduled cron fires	Billed as a Function Invocation (no separate meter)	Number of crons × schedule frequency	Run no more often than needed, enqueue heavy work to Queues
Drain Volume	Observability data forwarded to drains	$0.50 per 1 GB (Pro), 1 GB increments	Traffic volume, draining previews, verbose logs, no sampling	Drain production only, apply sampling, restrict log sources
Web Analytics Events	Page views + custom events (`track()`)	One event each, pooled team-wide	Traffic, heavy custom-event instrumentation, more projects	`beforeSend()` to exclude routes, trim custom events
Speed Insights Data Points	Web Vital measurements (~3–6 per visit)	One data point per metric; Pro $10/mo per project + per-point	Traffic × 3–6, hard navigations, more enabled projects	Lower `sampleRate`, `beforeSend()`, disable on unused projects
Monitoring data points	One event per request (legacy, being sunset)	$1.20 per 1M events, team-wide	Request volume (no per-request control)	Migrate to Observability Plus; reduce overall requests
Observability Events	Events per request (fans out into up to 5 types)	$1.20 per 1M (Plus); Pro $10/mo base	Traffic × architectural fan-out (middleware, functions, APIs)	Cache to serve 1 event not the chain; scope middleware
Flags Explorer Overrides	Override actions in the Flags Explorer toolbar	One per "apply" click; 150/mo free all plans	Frequent manual flag flipping, one-at-a-time applies	Batch flags into one apply, stay under 150/mo
Flag Requests	Requests reading the flags configuration	$30 per 1M ($0.00003 each); multiple flags, one source = 1	High traffic reading flags, multiple flag sources per request	Consolidate to one source, evaluate on cached paths
BotID Requests	`checkBotId()` Deep Analysis calls	$1 per 1,000 (Pro/Enterprise); Basic check is free	Deep Analysis on high-traffic or many routes	Deep Analysis only on high-value routes, WAF bypass
Trace Spans	Spans captured via Tracing	One per span; 1M/mo per team, all plans	Broad tracing, `@vercel/otel`, many custom spans, many fetches	Trace selectively, judicious custom spans, sample drains
Private Data Transfer	Egress leaving the Secure Compute network to the public internet	$0.15/GB; VPC-peered traffic is free	DB/API calls over public internet, high egress	Route to AWS over VPC peering, compress, cache externally
Sandbox Creations	Each `Sandbox.create()` microVM	$0.60 per 1M	Many short-lived sandboxes, fresh sandbox per task	Reuse sandboxes, use snapshots
Sandbox Active CPU	CPU time the sandbox actively uses	CPU-hours, $0.128/hr; I/O wait excluded	CPU-bound work, more vCPUs, long compute	Right-size vCPUs, `stop()` promptly, short timeouts
Sandbox Provisioned Memory	Allocated memory × runtime	GB-hours, $0.0212/GB-hr, 1-min minimum	High vCPU/memory, long runtime, idle provisioning	Right-size, stop promptly, snapshots
Sandbox Data Transfer	In/out network (packages, APIs, ports)	GB, $0.15/GB	Large dependency downloads, big datasets, chatty APIs	Snapshots to skip re-downloads, compress, reuse warm sandbox
Sandbox Storage	Snapshot storage	GB-month, $0.08/GB-mo	Many/large snapshots, long retention	Lean snapshots, default 30-day expiry, delete unused
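The Queue rows in the table above share one metering idea, so here it is as a sketch. The rounding rule is my assumption (each call rounds up to whole 4 KiB chunks, minimum one), and `queueOps` and `sendOps` are hypothetical helper names, not part of any Vercel SDK.

```typescript
const CHUNK_BYTES = 4 * 1024; // 4 KiB

// Assumption: each API call is metered as ceil(payload / 4 KiB) operations, minimum one.
function queueOps(payloadBytes: number, multiplier = 1): number {
  return Math.max(1, Math.ceil(payloadBytes / CHUNK_BYTES)) * multiplier;
}

// Sends with an idempotency key count double, per the table above.
function sendOps(payloadBytes: number, usesIdempotencyKey: boolean): number {
  return queueOps(payloadBytes, usesIdempotencyKey ? 2 : 1);
}

// A 10,000-byte message vs. a 64-byte message that only carries an ID.
console.log(sendOps(10_000, false), sendOps(10_000, true), sendOps(64, false));
```

Under these assumptions the 10,000-byte message is three operations (six with an idempotency key), while the ID-only message is one. That is the whole argument for "send IDs not blobs."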

A few included allowances I want on record: Web Analytics gives Hobby 50,000 events/month (then a grace period, then a pause; Pro bundles none and bills $3 per 100,000). Speed Insights caps Hobby at 10,000 data points/month across one project. Trace Spans allow 1,000,000 spans/month per team on every plan, including Hobby, which genuinely surprised me. Blob includes 1 GB of storage, 10,000 simple ops, 2,000 advanced ops, and 10 GB of data transfer on Hobby. And the whole Sandbox surface includes generous Hobby tiers (5,000 creations, 5 CPU-hours, 420 GB-hours, 20 GB transfer) that pause rather than bill when exceeded. Note also that a few meters from the earlier sections are not billed at all: Throttles and the three Artifacts meters are pure signals.

How to read your own dashboard

Now that the meters make sense, the dashboard reads completely differently. The key UI element is what Vercel calls the allotment indicator: for each resource it shows how much you've consumed this cycle and the projected cost. That's your early-warning system — it's how you spot a resource approaching 100% of its allowance before it pauses (Hobby) or starts billing on-demand (Pro).

A few things I now do every time I look:

Switch the view. The dashboard offers Count, Project, Region, Ratio, and Average groupings. The Ratio view is the one that finds waste — cached vs uncached requests, successful vs errored vs timed-out invocations, uploaded vs downloaded artifacts. A bad cache-hit ratio jumps out instantly.
Use the Project and Region breakdowns. Because many resources are regionally priced, the Region view tells you whether one expensive region is driving a metric. The Project view pinpoints which app is responsible.
Look at the last 30 days, not just today, to catch patterns rather than spikes.
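The Ratio view is doing a very small calculation for you, and it helps to know what it is. A minimal sketch, with a made-up `RequestCounts` type and a threshold I picked arbitrarily for illustration:

```typescript
interface RequestCounts {
  cached: number;   // requests served from the CDN cache
  uncached: number; // requests that fell through to compute or origin
}

// Fraction of requests served from cache; 0 when there is no traffic at all.
function cacheHitRatio({ cached, uncached }: RequestCounts): number {
  const total = cached + uncached;
  return total === 0 ? 0 : cached / total;
}

// Hypothetical threshold: flag anything below 80% as worth a closer look.
function looksWasteful(counts: RequestCounts, threshold = 0.8): boolean {
  return cacheHitRatio(counts) < threshold;
}

console.log(cacheHitRatio({ cached: 900, uncached: 100 }), looksWasteful({ cached: 300, uncached: 700 }));
```

The same ratio applies to the other pairs the dashboard shows, such as successful vs. errored invocations: swap in the two counts and read the fraction.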

Remember the plan asymmetry while you read: Pro raises the ceilings (1 TB vs 100 GB transfer, 10M vs 1M Edge Requests, 16 vs 4 Active CPU hours, and so on) and turns a "pause" into "on-demand billing." On Pro you also get Spend Management, which lets you set a dollar cap that either alerts you or auto-pauses deployments — the safety valve Hobby doesn't need because Hobby can't spend.

An optimization playbook

Reading all fifty meters, the same handful of levers kept reappearing. If I had to compress everything above into a playbook, it's this:

Cache aggressively. This is the single highest-leverage move because it hits multiple meters at once. A response served from the CDN cache is a cache hit: it doesn't count as a Function Invocation and it doesn't run the function. The Edge Request is still counted, but the work behind it (the invocation, the origin transfer) never happens, and the request never falls through to billed durable ISR/Data Cache reads. One Cache-Control header, several meters lower.
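For reference, this is roughly what that one header looks like when I build it by hand. The `CachePolicy` type and `cacheControl` helper are my own names; `s-maxage` and `stale-while-revalidate` are standard Cache-Control directives, and the durations are arbitrary examples, not a recommendation.

```typescript
interface CachePolicy {
  sMaxAgeSeconds: number;               // how long a shared cache (the CDN) may serve the copy
  staleWhileRevalidateSeconds?: number; // optional window to serve stale while refreshing
}

// Build a Cache-Control value aimed at shared caches.
function cacheControl(policy: CachePolicy): string {
  const parts = ["public", `s-maxage=${policy.sMaxAgeSeconds}`];
  if (policy.staleWhileRevalidateSeconds !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`);
  }
  return parts.join(", ");
}

// Headers a function response could carry: cache for an hour, tolerate a day of staleness.
const headers: Record<string, string> = {
  "Content-Type": "application/json",
  "Cache-Control": cacheControl({ sMaxAgeSeconds: 3600, staleWhileRevalidateSeconds: 86400 }),
};
console.log(headers["Cache-Control"]);
```

A headers object like this can be passed as the `headers` option when constructing a response in a function. How long to cache is a judgment call per route; the point is only that the function stops running for every request inside that window.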

Shrink payloads. Fast Data Transfer is billed on the full size of every response — body, headers, URL, the lot. Smaller pages, trimmed JS/CSS bundles, and returning only the fields an API actually needs all pull this number down directly. Smaller responses also mean fewer 8 KB units when those responses get cached.
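"Returning only the fields an API actually needs" is a one-function change in most codebases. Here is a small sketch with an invented `Product` record; `pickFields` and `jsonSize` are hypothetical helpers, and the size is measured in characters, which matches bytes only for ASCII data.

```typescript
// Hypothetical record with fields the client never displays.
interface Product {
  id: string;
  name: string;
  priceCents: number;
  description: string;
  internalNotes: string;
}

// Copy only the requested keys into a new object.
function pickFields<T extends object, K extends keyof T>(item: T, keys: readonly K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const key of keys) {
    out[key] = item[key];
  }
  return out;
}

// Serialized size in characters (equals bytes for ASCII-only data).
function jsonSize(value: unknown): number {
  return JSON.stringify(value).length;
}

const full: Product = {
  id: "p1",
  name: "Mug",
  priceCents: 1200,
  description: "A long marketing description that the list view never renders.",
  internalNotes: "Supplier details that should not leave the server anyway.",
};
const slim = pickFields(full, ["id", "name", "priceCents"] as const);
console.log(jsonSize(full), jsonSize(slim));
```

The slim version is a fraction of the full record, and that difference is paid on every response that carries it, then again in 8 KB units if the response is cached.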

Fewer image variants, longer TTLs. Every distinct (source × width × quality × format) combination is a separately billable transformation on first request, and each one writes to the global image cache in 8 KB Write Units. Dropping from two formats to one roughly halves transforms; trimming device sizes and qualities cuts them further; and a long cache TTL (up to 31 days) stops you paying to re-transform and re-write the same variants. Lower quality even shrinks the per-fetch 8 KB read/write counts.
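The multiplication is worth seeing with numbers in it. This sketch computes an upper bound only, since a variant is transformed when it is first requested, not up front; the `ImageConfig` shape and the counts are made up for illustration.

```typescript
interface ImageConfig {
  sources: number;   // distinct source images
  widths: number;    // configured output widths
  qualities: number; // configured quality levels
  formats: number;   // output formats offered
}

// Upper bound on distinct variants: every (source x width x quality x format) combination.
function maxTransformations(c: ImageConfig): number {
  return c.sources * c.widths * c.qualities * c.formats;
}

const before: ImageConfig = { sources: 100, widths: 8, qualities: 2, formats: 2 };
const after: ImageConfig = { ...before, formats: 1 };
console.log(maxTransformations(before), maxTransformations(after));
```

Going from two formats to one takes the ceiling from 3,200 to 1,600 in this example, and every other dimension you trim divides it again.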

Turborepo remote cache for builds. Since remote cache artifacts aren't billed, the whole win is build time — and on Pro, build time is money (Build Minutes or Build CPU Minutes). Replaying unchanged tasks from the remote cache, skipping unaffected monorepo projects, and not caching .next/cache keep builds short.

Kill non-determinism in cached/regenerated code. new Date() and Math.random() in ISR or Data Cache output make every revalidation look "changed," forcing a write even when the meaningful content is identical. Make the output deterministic and those writes drop to zero.
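Here is the failure mode in miniature. It is a simulation under my own assumptions, not Vercel's cache logic: `needsWrite` stands in for the "did the content change?" comparison, the render functions are invented, and the clock is passed in as a parameter so the example behaves the same on every run.

```typescript
interface Post {
  title: string;
  body: string;
}

// Non-deterministic: embeds the render time, so output differs on every regeneration.
function renderWithTimestamp(post: Post, now: Date): string {
  return `<article><h1>${post.title}</h1><p>${post.body}</p><footer>Rendered ${now.toISOString()}</footer></article>`;
}

// Deterministic: output depends only on the data.
function renderDeterministic(post: Post): string {
  return `<article><h1>${post.title}</h1><p>${post.body}</p></article>`;
}

// Stand-in for the cache's check: a write is needed only when the output changed.
function needsWrite(cached: string, regenerated: string): boolean {
  return cached !== regenerated;
}

const post: Post = { title: "Hello", body: "Same content as last time." };

// Two revalidations one second apart, with no change to the underlying data.
const stampedChanged = needsWrite(
  renderWithTimestamp(post, new Date(0)),
  renderWithTimestamp(post, new Date(1000)),
);
const cleanChanged = needsWrite(renderDeterministic(post), renderDeterministic(post));
console.log(stampedChanged, cleanChanged);
```

The timestamped version reports a change on every revalidation even though the post is identical, while the deterministic one reports none. If a page really does need to show the current time, rendering it on the client keeps it out of the cached output.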

Closing

What changed for me wasn't a dollar figure — I'm still on Hobby, still paying nothing. What changed is that the usage dashboard stopped being noise. I now read it as my application's bill of materials: a precise inventory of every byte transferred, every request processed, every CPU-millisecond spent, every image transformed. Each meter is a sentence about how my code actually behaves in production — where it's chatty, where it's wasteful, where it caches well.

That's the part I didn't anticipate. Understanding the meters didn't just prepare me for an eventual bill; it made me a better builder. A low cache-hit ratio, a function that quietly waits 400 ms on every request, an image config generating four formats nobody asked for — these are engineering problems that happen to show up on a billing page. Reading the meters is just reading your own system honestly. And whether the column on the right says "free" or "dollars," that's a skill worth having.