The ad auction — GSP, VCG, quality score, reserve

The last stage of an ads funnel. The mechanism design here is where ads diverge from recsys for good — the platform stops asking "what does the user want?" and starts asking "who pays, and how much?".

The economic question

For every impression, two coupled decisions: allocation (which ad gets which slot) and pricing (what each winner pays). Four constraints pull against each other:

Incentive compatibility. If truth-telling loses, advertisers spend their effort guessing competitors' bids instead of working out an impression's true value. Bids get noisier and less informative.
Revenue. The platform extracts surplus without destroying participation.
User welfare. Bad ads cost long-term engagement.
Latency. Auction runs once per impression, in milliseconds, on the same request as ranking.

Everything below is a different way of trading these off.

First-price: the obvious mechanism, and why it is hard

Highest bid wins, winner pays their bid. Allocation is efficient if bids equal values — but they won't. A bidder who values an impression at v = $1.00 and bids $1.00 makes zero surplus. So they shade — bid below value, near their estimate of the second-highest bid. Bidding becomes strategic:

Sophisticated bidders out-bid naive ones at the same true value — the platform rewards bidding-engine investment, not advertiser quality.
Pacing across the day gets harder, because each shade depends on competitors' shades, which depend on competitors' budgets.
Revenue per query is noisier because equilibrium is in mixed strategies.

The 2017 plot twist

Programmatic display moved back to sealed first-price around 2017–2019, for transparency. The strategic-bidding burden didn't vanish — it was shifted to autobidders (lesson 10), which solve shading on every impression. So "first-price is hard" is true of advertisers but no longer true of the systems advertisers use.

Second-price (Vickrey): truthful, single-slot

Highest bid wins, winner pays the second-highest bid (Vickrey 1961). The property: bidding true value is a dominant strategy. Fix value v and highest competing bid b:

Bid above v: you only change the outcome when b lands between v and your bid. In that range you win and pay b > v — lose money. Never strictly helps.
Bid below v: you only change the outcome when b lands between your bid and v. In that range you lose, but would have won at b < v — positive surplus walked away from. Never strictly helps.

So single-slot Vickrey learns true values for free, ranks correctly, and charges the externality. Beautiful — and almost useless on a real ads page.

The multi-slot problem

Real search results have multiple ad slots: mainline (three or four above organic), sidebar, sometimes a bottom block. Each slot has its own click-through rate; top slots get sometimes 5–10× the lower slots. Call the position-CTR multipliers α₁ > α₂ > … > α_K.

A naive "second-price per slot" — winner of slot k pays the (k+1)-th bid — is not truthful. A bidder who would win slot 2 can sometimes profit by underbidding to drop into slot 3 at a cheaper price, capturing surplus exceeding the lost CTR. Two mechanisms recover most of what we wanted.

GSP — Generalized Second Price

Run by Google Search Ads for ~15 years and still the standard mental model. Rank ads by rank_score = bid × pCTR (Ad Rank). Slot k goes to the k-th-ranked ad. The price for slot k is the lowest bid that would have kept the winner in slot k — equivalently, the rank score of slot k+1 divided by the slot-k winner's pCTR:

price_k = rank_score_{k+1} / pCTR_k

GSP is not dominant-strategy truthful (Edelman, Ostrovsky, Schwarz 2007). But there is a Bayesian-Nash equilibrium of GSP whose revenue and allocation match VCG, and the formula is simpler to explain to advertisers — that's why GSP shipped.

VCG — Vickrey-Clarke-Groves

Generalises Vickrey to arbitrary allocation. Each winner pays the externality they impose — the welfare other bidders lose because this winner exists. In multi-slot ads:

VCG_price_k (per click) = [ Σ_{j>k} (α_{j-1} − α_j) · bid_j · pCTR_j ] / (α_k · pCTR_k)

Each lower bidder whose slot got bumped down contributes a displacement term weighted by the position-CTR gap and by that bidder's own pCTR — the externality is on clicks, not slots. Bidding true value is dominant. The cost is conceptual complexity and (sometimes) lower revenue at the same bids. The simulator below reflects bidder-heterogeneous pCTRs.

	First-price	GSP	VCG	+ Myerson reserve
Truthful?	No — shading	No — Edelman 2007	Yes — dominant	Yes (any truthful base mechanism)
Revenue (equilibrium)	Equal to VCG by revenue equivalence under symmetric independent-private-values with risk-neutral bidders; binding reserves break it	Equal to VCG at one BNE	Baseline	Strictly > VCG when reserve binds
Advertiser complexity	High — must model competitors	Medium — must model competitors weakly	Low — just bid value	Low
Transparency	Highest — pay what you bid	Medium — formula is published	Lower — externality is hard to attribute	Lower
Implementation	Trivial	Easy	Easy for slots; harder for combinatorial	Add a per-query reserve

Google migrated several of its non-Search ad surfaces (contextual, YouTube) to VCG-style mechanisms around 2019–2020; Search Ads still runs a GSP-family mechanism with quality-score modifications. The reason VCG appealed on those surfaces was autobidders: once most bids come from ML systems, "easy to explain to humans" stops mattering and "easy to debug an autobidder against" starts mattering. A truthful mechanism is easier to debug because the autobidder's output has a single intended meaning — the marginal value of a click — rather than a strategic shade against an estimate of competitors' shades.

Quality score: rank by bid × pCTR, not by bid alone

Bid-only ranking gives the slot to whoever pays most per click even if nobody clicks. They pay nothing per impression, the user ignores the ad, and the platform earns nothing. Rank instead by expected revenue per impression:

rank_score = bid × pCTR

where pCTR is the calibrated click probability from the ranking model (lesson 5). For conversion-bid campaigns the chain extends to bid_cpa × pCTR × pCVR_given_click.

Calibration is non-optional. Uniform 2× miscalibration doesn't change ranking (every ad scaled). But per-segment miscalibration (e.g., new advertisers with little data) systematically outranks competitors — the auction mis-allocates. Lesson 5 calibration metrics are inputs to the price-setting machinery, not a nicety.
Quality score is a policy lever. Production "Ad Rank" usually folds in landing-page experience, ad relevance to query, historical CTR. Each addition is a policy choice between immediate revenue and long-term user welfare.

Reserve prices

A reserve is the minimum bid that can win. Three flavors:

Type	How it's set	What it does
Static	One global floor (e.g., per click).	Prevents nuisance ads; weak revenue effect.
Myerson / query-level	Reserve r(query) tied to that query's value distribution.	On low-competition queries, raise the reserve so only high-value ads show; the winners pay more because the alternative is "no ad" rather than "lowest-value ad". Strictly increases expected revenue.
Personalized	Reserve r(query, advertiser, segment).	Per-advertiser monopoly pricing. Captures more surplus, at the cost of complaints and antitrust risk if leaked.

Myerson (1981): for a single-item auction with known value distributions, the revenue-maximizing mechanism is second-price with a per-bidder reserve at the virtual-value zero-crossing. Production systems don't run textbook Myerson, but the intuition — reserves extract revenue when competition is thin — is the lever for long-tail queries.

Ad load — the upstream decision

Even before the auction, something decides how many slots to fill. Four mainline ads earn more per impression than two but push organic results down and reduce long-term engagement. Textbook approach: each candidate slot has a marginal revenue (from the auction) and a marginal long-term cost (from a user-engagement model). Show slot k iff the former exceeds the latter — usually a separate controller calibrated against a long-running holdout.

Where this composes with earlier lessons

The auction is the only ads-specific layer, but every layer feeds it. Retrieval defines the candidate set. Ranking and calibration define the rank score and the pCTR going into it. Negative sampling and debiasing determine whether pCTR is honest. Evaluation decides whether an auction change is launchable. A senior MLE talks about the auction as the consumer of everything else.

Interactive · GSP vs VCG on a two-slot page

Three advertisers, two slots. Pick bids and pCTRs; pick the mechanism. Watch which advertiser lands in which slot, what each pays, and whether deviating from a truthful bid would have helped them.

Multi-slot auction simulator

Slot 1 gets all clicks at multiplier α₁; slot 2 gets a fraction α₂. Rank score is bid × pCTR. Use the truthfulness probe to see whether advertiser A could have done better by misreporting.

A — bid ($): 3.00 pCTR: 0.10 B — bid ($): 2.00 pCTR: 0.15 C — bid ($): 1.50 pCTR: 0.12 slot-1 CTR mult α₁: 1.00 slot-2 mult α₂: 0.40 mechanism: GSP VCG truthfulness probe — A's reported bid ($): 3.00

slot 1 winner

—

slot 1 price (per click)

—

slot 2 winner

—

slot 2 price (per click)

—

expected platform revenue

—

A's surplus (truthful)

—

A's surplus (probed bid)

—

A's true value used

—

Reading

—

Three failure modes worth knowing

Failure	Symptom	What to look at
pCTR miscalibration in a segment	One ad class (new advertisers, new format) wins more slots than its actual CTR justifies.	Calibration plots stratified by cohort and format. Auction mis-allocation is downstream of a ranking bug.
Reserve too low on long-tail queries	Low-volume queries earn near-zero revenue and show ads users ignore.	Distribution of winning bids by query frequency. Long tail with one bidder is the Myerson-reserve case.
Mechanism change × autobidders	GSP → VCG flips autobidder bids more than predicted; revenue moves both directions before settling.	Whether the autobidder's bid is meant to be "value per click" (VCG) or "max willingness to pay" (GSP). Migrations need a parallel acceptance period.

Interview prompts you should be ready for

"Walk me through why second-price single-slot is truthful." (Two-case proof on overbidding and underbidding. Don't skip the "you only change the outcome when…" framing — that's the load-bearing step.)
"Why does Google rank by bid × pCTR rather than bid alone?" (Expected revenue per impression is what the platform actually cares about; ranking by bid alone gives the slot to ads nobody clicks. Tie it to calibration: pCTR has to be calibrated, not just correctly ordered.)
"Your CTR predictions are systematically 50% too high for one ad category. What happens to the auction?" (That category outranks competitors by ~50%, wins more slots, but pays at the next-best ad's rank score divided by the inflated pCTR — so per-click prices come out artificially low (since payment = next_score / your_pCTR — inflating the denominator shrinks the price), so revenue collapses while the ranker still ranks correctly. The mis-allocation shows up as low actual CTR on shown ads.)
"Why did GSP win over VCG historically, and why is that changing?" (Historical: simpler to explain, revenue-equivalent in BNE, legacy. Changing: autobidders absorb the strategic-bidding burden, so truthfulness becomes valuable as a debug property of the autobidder pipeline rather than a property humans use.)
"Design a reserve-price strategy for a long-tail query where you only ever see one bidder." (With one bidder, second-price degenerates to "pays whatever the reserve is". The Myerson-style answer: set reserve at the median or higher of that query's historical winning-bid distribution, refresh it with smoothing across similar queries.)
"Your CTR model gets recalibrated and the auction's ad mix shifts dramatically. Is that fine?" (Maybe. If the old model was miscalibrated unevenly across segments, the shift is a correction. If the new model is mis-shifted in the other direction, the shift is a regression. The honest answer is "I'd look at calibration error per segment before and after, and at the realized CTR on winning ads", not "yes" or "no".)

Takeaway

The auction is mechanism design wearing an ML hat. The truthfulness debate (GSP vs VCG) is about whose strategic burden you absorb — the advertiser's, the autobidder's, or the platform's. The quality-score multiplier is where the calibrated CTR from lesson 5 becomes load-bearing for revenue. Reserves are how you trade revenue against participation on thin markets. Every line of the auction code is downstream of a ranking decision; every ranking decision lives or dies by what the auction does with it.