Pessimistic vs optimistic concurrency for slot reservation: a benchmark and a decision
The standard 'use OCC' advice fits the median case. Under sustained contention with users who've already entered payment details, the honest choice is different.
Three players hit the same Friday-evening slot within 800ms of each other. One booking should land. Two should see "already booked." None should see a confirmation page that turns out to be a lie.
This is the load case for any real-world booking platform. Conflict is not the rare exception - Friday evenings are when conflicts happen. Building a booking engine for this load case is a different exercise from building one for the median case.
The internet's standard advice is "use optimistic concurrency control with a version column." We benchmarked it. We didn't pick it. Here's why.
The decision
We used pessimistic locking at the slot level - a SELECT ... FOR UPDATE inside the booking transaction, and a strict ordering of operations such that nothing else can interleave between "check availability" and "write the booking."
The standard alternative is optimistic concurrency control (OCC): each slot has a version column, you read it, you do your booking work, you write back with WHERE version = <what we read>, and if the row was changed under you (by another booking) the update returns 0 rows and you retry.
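As a concrete sketch of that read-check-write cycle, here is an in-memory TypeScript simulation. The `Slot` type and function names are illustrative, not our production code; the version comparison stands in for `UPDATE ... WHERE version = <what we read>` returning 0 rows:

```typescript
type Slot = { version: number; bookedBy?: string };

// Mirrors the conditional write: succeed only if nobody changed the row
// since we read it, and the slot is still free.
function tryBookOcc(slot: Slot, readVersion: number, playerId: string): boolean {
  if (slot.version !== readVersion || slot.bookedBy) return false;
  slot.bookedBy = playerId;
  slot.version += 1;
  return true;
}

function bookWithRetry(slot: Slot, playerId: string, attempts = 3): boolean {
  for (let i = 0; i < attempts; i++) {
    const readVersion = slot.version; // read the version
    // ...booking work would happen here...
    if (tryBookOcc(slot, readVersion, playerId)) return true;
    if (slot.bookedBy) return false; // slot genuinely taken; retrying is pointless
  }
  return false;
}

const slot: Slot = { version: 1 };
const first = bookWithRetry(slot, "player-a");  // wins
const second = bookWithRetry(slot, "player-b"); // sees "taken"
```

The retry loop is where the tail latency in the benchmark below comes from: every failed attempt is a full read-work-write round trip.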
Both work in the abstract. They have different real-world behaviours under our specific load.
The benchmark
We ran a synthetic load test: 10 concurrent clients each trying to book the same slot, repeated 1000 times.
| Approach | Successful bookings (should be 1) | Avg latency under contention | Tail behaviour |
|---|---|---|---|
| Pessimistic (FOR UPDATE) | 1 every time | 22ms p50, 80ms p99 | Waits queue cleanly |
| OCC with retry (3 attempts) | 1 every time | 18ms p50, 240ms p99 | Tail bloats with retry storm |
| OCC with no retry | 1 every time | 15ms p50, 35ms p99 | 9 of 10 see "try again" |
| Naive (no concurrency control) | 2-4 per run, randomly | 8ms p50 | Double-bookings - broken |
The numbers tell a specific story. OCC with retry is faster when every retry succeeds, but under sustained contention the retry storm makes the tail worse than pessimistic locking. OCC with no retry has the best tail but pushes "try again" to the user - the wrong UX for a booking flow, where the user has already entered their details and shouldn't be asked to redo them.
Pessimistic locking serialises the contended path with predictable latency. Under our load (peak ~50 conflicts/second on Friday evenings), it was the right shape.
The schema and the lock
```sql
-- The exclusion constraint's "turf_id with =" needs the btree_gist extension:
-- create extension if not exists btree_gist;

create table turfs (
  id uuid primary key,
  location_id uuid not null,
  name text not null,
  surface text not null, -- 'turf' | 'concrete' | 'grass'
  active boolean not null default true
);

create table bookings (
  id uuid primary key,
  turf_id uuid not null references turfs(id),
  player_id uuid not null,
  slot_start timestamptz not null,
  slot_end timestamptz not null,
  status text not null, -- 'reserved' | 'confirmed' | 'cancelled' | 'no_show'
  amount_paise bigint not null,
  payment_id text,
  created_at timestamptz not null default now(),
  cancelled_at timestamptz,
  -- The exclusion constraint enforces that no two reserved or confirmed
  -- bookings overlap on the same turf
  exclude using gist (
    turf_id with =,
    tstzrange(slot_start, slot_end, '[)') with &&
  ) where (status in ('reserved', 'confirmed'))
);
```
The exclusion constraint is the database-level safety net. Even if the application-level locking fails, the database will not accept two overlapping reserved or confirmed bookings on the same turf. This is the "in case our concurrency reasoning is wrong, what protects us" layer.
The booking transaction:
```typescript
async function reserveSlot(params: {
  turfId: string;
  playerId: string;
  slotStart: Date;
  slotEnd: Date;
}): Promise<BookingResult> {
  return await db.transaction(async (tx) => {
    // 1. Lock the turf row to serialise concurrent bookings on this turf
    const turfRows = await tx.query(
      "select * from turfs where id = $1 for update",
      [params.turfId]
    );
    if (turfRows.length === 0 || !turfRows[0].active) {
      return { ok: false, reason: "turf_inactive" };
    }

    // 2. Check overlap (with the same predicate as the exclusion constraint)
    const conflict = await tx.query(
      `select 1 from bookings
        where turf_id = $1
          and status in ('reserved', 'confirmed')
          and tstzrange(slot_start, slot_end, '[)')
              && tstzrange($2::timestamptz, $3::timestamptz, '[)')
        limit 1`,
      [params.turfId, params.slotStart, params.slotEnd]
    );
    if (conflict.length > 0) {
      return { ok: false, reason: "slot_taken" };
    }

    // 3. Compute price (peak-hour rules etc - see below)
    const amount = await computePrice(params);

    // 4. Insert the reservation. The exclusion constraint is the final safety net.
    const booking = await tx.query(
      `insert into bookings (
         id, turf_id, player_id, slot_start, slot_end, status, amount_paise
       ) values ($1, $2, $3, $4, $5, 'reserved', $6)
       returning *`,
      [uuid(), params.turfId, params.playerId, params.slotStart, params.slotEnd, amount]
    );

    return { ok: true, bookingId: booking[0].id, amountPaise: amount };
  });
}
```
The for update on the turf row is the lock. Two concurrent transactions targeting the same turf serialise - the second waits for the first to commit. The wait is short (the booking transaction is small), and the queue is fair.
The for update on the turf, not the bookings table, was deliberate. Locking the turf row is a coarser lock - it serialises all bookings on that turf, not just on the contested slot. The slot-level alternative would be a row in a turf_slots table, which we considered. Coarse turf-level locking was simpler, the contention was acceptable (a single turf has at most a handful of concurrent booking attempts at peak), and the alternative table added joins everywhere.
Pricing as a function, not a column
A common mistake: store price on each slot. It looks fine until you change pricing.
Slots don't have prices. Slot bookings have prices, computed at booking time:
```typescript
type PricingContext = {
  turfId: string;
  slotStart: Date;
  playerId: string;
  promoCode?: string;
};

async function computePrice(ctx: PricingContext): Promise<number> {
  const turf = await db.turfs.findById(ctx.turfId);
  const baseRate = turf.baseRatePaise;
  let multiplier = 1.0;

  // Peak-hour rule
  const hour = ctx.slotStart.getHours();
  const day = ctx.slotStart.getDay();
  if ([5, 6].includes(day) && hour >= 18 && hour < 22) {
    multiplier *= 1.5; // Friday/Saturday evening peak
  }

  // Member discount
  const player = await db.players.findById(ctx.playerId);
  if (player.membershipTier === "premium") {
    multiplier *= 0.85;
  }

  // Promo code
  if (ctx.promoCode) {
    const promo = await validatePromoCode(ctx.promoCode, ctx);
    if (promo) multiplier *= 1 - promo.discount;
  }

  return Math.round(baseRate * multiplier);
}
```
The function takes a context and returns a price. The price is recomputed every time. Operators can change the underlying rules (turf rate, peak hours, member tiers) without touching anything booking-side.
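To make the arithmetic easy to check, here is a pure sketch of the same multiplier rules with the turf rate and membership tier passed in rather than fetched. `priceFor` is an illustrative standalone version, not the production function:

```typescript
// Pure sketch of the pricing rules: peak multiplier, then member discount.
function priceFor(baseRatePaise: number, slotStart: Date, membershipTier: string): number {
  let multiplier = 1.0;
  const hour = slotStart.getHours();
  const day = slotStart.getDay(); // 5 = Friday, 6 = Saturday
  if ([5, 6].includes(day) && hour >= 18 && hour < 22) multiplier *= 1.5;
  if (membershipTier === "premium") multiplier *= 0.85;
  return Math.round(baseRatePaise * multiplier);
}

// Friday 7pm, premium member: 100000 * 1.5 * 0.85 = 127500 paise
const peak = priceFor(100_000, new Date(2024, 0, 5, 19, 0), "premium");
// Tuesday 10am, no membership: base rate unchanged
const offPeak = priceFor(100_000, new Date(2024, 0, 2, 10, 0), "basic");
```

Note that the multipliers compose: a premium member at peak pays 1.5 × 0.85 of base, which is why the order of rules doesn't matter here.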
This was crucial when the central operator wanted to A/B test peak-hour multipliers across cities. We changed the function. We didn't migrate slot rows.
Cancellation rules: also a function
Same pattern for refund logic on cancellation:
```typescript
function refundAmount(booking: Booking, cancelledAt: Date): number {
  const hoursUntilSlot = hoursBetween(cancelledAt, booking.slotStart);
  if (hoursUntilSlot >= 24) return booking.amountPaise; // full refund
  if (hoursUntilSlot >= 6) return Math.round(booking.amountPaise * 0.5); // 50%
  return 0; // no refund
}
```
The refund function reads booking metadata and a clock. It does not look at any "refund_eligibility" flag. The flag would have rotted; the function did not.
Operator-discretion overrides existed (weather cancellation, regular customer goodwill) but were always additive - the function returned the policy refund, the operator could choose to grant more on top, and that choice was logged with a reason.
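The additive shape can be sketched as a pure function: policy first, goodwill on top. The `Booking` shape, `hoursBetween`, and the goodwill parameter are illustrative; in the real system the grant and its reason were logged:

```typescript
type Booking = { amountPaise: number; slotStart: Date };

function hoursBetween(a: Date, b: Date): number {
  return (b.getTime() - a.getTime()) / 3_600_000;
}

// Same tiers as the policy function above
function refundAmount(booking: Booking, cancelledAt: Date): number {
  const hoursUntilSlot = hoursBetween(cancelledAt, booking.slotStart);
  if (hoursUntilSlot >= 24) return booking.amountPaise;
  if (hoursUntilSlot >= 6) return Math.round(booking.amountPaise * 0.5);
  return 0;
}

function totalRefund(
  booking: Booking,
  cancelledAt: Date,
  goodwill?: { amountPaise: number; reason: string }
): number {
  const policy = refundAmount(booking, cancelledAt);
  // Operator discretion is additive: it can only increase the refund,
  // and the grant + reason would be written to an audit log here.
  return policy + (goodwill?.amountPaise ?? 0);
}
```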
Buffer-time enforcement
Most "slot calendars" let bookings butt up against each other. Real cleaning needs 15 minutes. We enforced this at the slot-availability check:
```typescript
const BUFFER_MINUTES = 15;

function effectiveSlotRange(slotStart: Date, slotEnd: Date) {
  return {
    start: addMinutes(slotStart, -BUFFER_MINUTES),
    end: addMinutes(slotEnd, BUFFER_MINUTES),
  };
}
// Used in both the conflict check and the exclusion constraint
```
A 7pm-to-8pm booking on a turf made the 6:45pm-to-8:15pm window unavailable. The exclusion constraint enforced it. The UI showed it. The cleaning crew got their 15 minutes.
The constant lived in one place. When the operator wanted to test 10-minute buffers in some cities, we made the constant per-location and the rest of the code didn't change.
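A self-contained sketch of the buffer arithmetic with that per-location knob. The `overlaps` helper mirrors the half-open `[)` semantics of the `tstzrange ... &&` check; all names here are illustrative:

```typescript
function addMinutes(d: Date, m: number): Date {
  return new Date(d.getTime() + m * 60_000);
}

// Per-location buffer instead of the global constant
function effectiveRange(slotStart: Date, slotEnd: Date, bufferMinutes: number) {
  return {
    start: addMinutes(slotStart, -bufferMinutes),
    end: addMinutes(slotEnd, bufferMinutes),
  };
}

// Half-open [start, end) overlap - same semantics as tstzrange(..., '[)') &&
function overlaps(a: { start: Date; end: Date }, b: { start: Date; end: Date }): boolean {
  return a.start < b.end && b.start < a.end;
}

const at = (h: number, m = 0) => new Date(2024, 0, 5, h, m);
const booked = effectiveRange(at(19), at(20), 15); // 7-8pm blocks 6:45-8:15
const conflicts = overlaps(booked, { start: at(20), end: at(21) });      // 8pm request
const clears = overlaps(booked, { start: at(20, 15), end: at(21, 15) }); // 8:15pm request
```

The half-open convention matters: an 8:15pm booking starting exactly when the buffered window ends does not conflict.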
What surprised me
Database-level constraints saved us once. Early in the build, an application-level race got past the lock under a specific timing. The exclusion constraint caught it - the second booking failed with a constraint violation, the user saw "slot taken," nobody got a double-booking. The constraint is the safety net you hope to never need; we needed it once and were glad for it.
Cancellation races were as bad as booking races. Two players cancel the same booking simultaneously (one from a phone, one from a desktop, both refreshing the page). Without locking, both refunds get issued. We applied the same for update pattern on cancellation. Boring, important, easy to forget.
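The double-refund guard reduces to a status check and a status write that must happen atomically. An in-memory sketch of the invariant (the real system holds it under `select ... for update`; the types and refund policy here are simplified):

```typescript
type CancellableBooking = { status: "confirmed" | "cancelled"; amountPaise: number };

// Check-then-write must be atomic; only the first cancel issues a refund.
function cancel(booking: CancellableBooking): { refundedPaise: number } {
  if (booking.status === "cancelled") return { refundedPaise: 0 }; // already cancelled: no-op
  booking.status = "cancelled";
  return { refundedPaise: booking.amountPaise }; // refund tiers elided for brevity
}

const b: CancellableBooking = { status: "confirmed", amountPaise: 50_000 };
const firstCancel = cancel(b);  // issues the refund
const secondCancel = cancel(b); // phone + desktop race: the loser is a no-op
```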
Peak pricing felt obvious in hindsight but took a week to convince stakeholders. The "we should charge more for Friday evenings" rule was unintuitive to the operator at first. We shipped it as a per-city configurable multiplier, and adoption was rapid once they saw the revenue lift on Friday-evening slots.
What I'd do differently
Slot-level rows from day one. We started with turf-level locking and bookings as the only table. Eventually we added a turf_slots table for performance and richer per-slot metadata (operator notes, weather closures). Earlier introduction would have made some queries simpler.
Idempotent booking endpoint. A user double-tapping the "Book" button could in theory create two reservations. Application-level deduplication caught this in 99% of cases; making the endpoint truly idempotent (with a client-supplied idempotency key) would have closed the last 1%.
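The idempotency-key shape can be sketched in memory. The map stands in for a persisted key-to-booking table, and in the real system the lookup and insert would live inside the booking transaction so the key check is race-free; all names are illustrative:

```typescript
const seenKeys = new Map<string, string>(); // idempotency key -> booking id

function reserveIdempotent(idempotencyKey: string, doReserve: () => string): string {
  const existing = seenKeys.get(idempotencyKey);
  if (existing) return existing; // double-tap replays the same booking id
  const bookingId = doReserve();
  seenKeys.set(idempotencyKey, bookingId);
  return bookingId;
}

let calls = 0;
const make = () => { calls += 1; return `booking-${calls}`; };
const first = reserveIdempotent("key-1", make);
const second = reserveIdempotent("key-1", make); // the double-tap
```

The client generates the key once per booking attempt (per tap-through of the flow, not per HTTP request), which is what makes the retry safe.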
Booking holds with explicit TTL. We had reservations that auto-cancelled after 5 minutes if payment didn't complete. The TTL was implemented as a job that ran every minute. Better: an explicit expires_at column, with the booking-availability query treating expired-without-payment reservations as available, so the cleanup job is strictly cosmetic.
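The availability predicate for that approach is small. A sketch, assuming an `expiresAt` field set on unpaid holds (field names are illustrative):

```typescript
type Reservation = {
  status: "reserved" | "confirmed" | "cancelled" | "no_show";
  expiresAt?: Date; // set while status is 'reserved' and payment is pending
};

// An expired unpaid hold no longer blocks the slot, regardless of whether
// the cleanup job has flipped its status yet.
function blocksSlot(r: Reservation, now: Date): boolean {
  if (r.status === "confirmed") return true;
  if (r.status === "reserved") return !r.expiresAt || r.expiresAt > now;
  return false;
}
```

The correctness no longer depends on how often the cleanup job runs; the job only tidies rows the query already ignores.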
The load case is Friday at 7pm. Pick for that, not for the median row.