Overview

Festival traffic spikes caused app/API slowdowns and order failures. The platform introduced layered caching, autoscaling, and queue‑based workflows to stabilize performance during peaks.

Key Changes

  • CDN + edge caching for menus/assets; server‑side caching for hot categories (cache‑aside sketch below).
  • Database optimization: read replicas, slow‑query fixes, and connection pooling (read/write routing sketch below).
  • Message queues for order events (placed → accepted → picked → delivered), with resilient retries (consumer sketch below).
  • Dispatch optimization: geofencing and batched assignments to riders (batching sketch below).
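
A minimal cache‑aside sketch for the hot‑category read path, assuming an in‑memory TTL store and a hypothetical fetch_category_from_db helper; the platform's actual stack layered CDN/edge caching in front of this server‑side cache.

    import time

    # Simple in-memory TTL cache for hot menu categories (cache-aside pattern).
    _cache: dict[str, tuple[float, dict]] = {}
    TTL_SECONDS = 60  # short TTL so festival price/stock changes propagate quickly

    def fetch_category_from_db(category_id: str) -> dict:
        # Hypothetical placeholder for the real database read.
        return {"id": category_id, "items": []}

    def get_category(category_id: str) -> dict:
        now = time.time()
        hit = _cache.get(category_id)
        if hit and now - hit[0] < TTL_SECONDS:
            return hit[1]                     # cache hit: skip the database
        data = fetch_category_from_db(category_id)
        _cache[category_id] = (now, data)     # populate on miss (cache-aside)
        return data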
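
A minimal sketch of separating read and write paths with pooled connections, using SQLAlchemy as one illustrative option; the DSNs, pool sizes, and queries are assumptions rather than the platform's actual configuration.

    from sqlalchemy import create_engine, text

    # Writes go to the primary, reads to a replica; each engine keeps its own pool.
    primary = create_engine("postgresql://app@db-primary/orders",
                            pool_size=10, max_overflow=20)
    replica = create_engine("postgresql://app@db-replica/orders",
                            pool_size=30, max_overflow=10)

    def run_read(sql: str, **params):
        with replica.connect() as conn:       # pooled read connection
            return conn.execute(text(sql), params).fetchall()

    def run_write(sql: str, **params) -> None:
        with primary.begin() as conn:         # pooled write connection in a transaction
            conn.execute(text(sql), params)

    # Usage (illustrative): menu browsing hits the replica, order placement the primary.
    # run_read("SELECT * FROM menu_items WHERE category_id = :c", c=7)
    # run_write("INSERT INTO orders (user_id, total) VALUES (:u, :t)", u=1, t=499)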
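
A sketch of the queue‑driven order flow with bounded retries, using the standard‑library queue as a stand‑in for the real broker; the handler body, retry count, and back‑off values are illustrative assumptions.

    import queue
    import time

    ORDER_STATES = ["placed", "accepted", "picked", "delivered"]
    events: "queue.Queue[dict]" = queue.Queue()   # stand-in for the message broker

    def handle_event(event: dict) -> None:
        # Hypothetical handler: persist the transition, notify rider/customer, etc.
        print(f"order {event['order_id']} -> {event['state']}")

    def consume(max_retries: int = 3) -> None:
        while not events.empty():
            event = events.get()
            for attempt in range(1, max_retries + 1):
                try:
                    handle_event(event)
                    break                             # processed successfully
                except Exception:
                    if attempt == max_retries:
                        print("dead-letter:", event)  # park for manual inspection
                    else:
                        time.sleep(2 ** attempt)      # exponential back-off, then retry

    # Usage: enqueue one event per state transition, then drain the queue.
    for state in ORDER_STATES:
        events.put({"order_id": 42, "state": state})
    consume()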
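
A sketch of batched assignment by geofence cell, grouping nearby orders into rider‑sized batches; the grid cell size, batch size, and coordinates are assumptions, and the rider‑matching step is omitted.

    from collections import defaultdict

    CELL_DEG = 0.01  # roughly 1 km grid cell; assumed geofence granularity

    def cell(lat: float, lng: float) -> tuple[int, int]:
        # Snap a coordinate to a coarse grid cell so nearby orders share a key.
        return (int(lat / CELL_DEG), int(lng / CELL_DEG))

    def batch_orders(orders: list[dict], batch_size: int = 3) -> list[list[dict]]:
        by_cell: dict[tuple[int, int], list[dict]] = defaultdict(list)
        for order in orders:
            by_cell[cell(order["lat"], order["lng"])].append(order)
        batches = []
        for cell_orders in by_cell.values():
            # Split each geofence cell into rider-sized batches.
            for i in range(0, len(cell_orders), batch_size):
                batches.append(cell_orders[i:i + batch_size])
        return batches

    # Usage: the two nearby orders share a batch; the distant one gets its own.
    print(batch_orders([
        {"id": 1, "lat": 12.971, "lng": 77.594},
        {"id": 2, "lat": 12.972, "lng": 77.595},
        {"id": 3, "lat": 13.050, "lng": 77.700},
    ]))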

Outcomes

  • P95 latency down 45%; failed orders reduced by 70% during peak hours.
  • More stable rider utilization; fewer customer support tickets.

Lessons (Unit 3 lens)

  • Caching + queues are fundamental under bursty, location‑dependent traffic.
  • Separate read vs. write paths; monitor capacity in advance of seasonal peaks.

Chapters covered

  • Web performance and scalability (3.2–3.4)
  • Mobile considerations for high‑traffic events (3.5)