Video rooms on a LAN with zero infrastructure: no internet, no broker, no
local server. QR codes bootstrap the first connection; everything after rides
the connections themselves. Designed 2026-06-07 in conversation; not yet built.
The idea in one line
A QR code replaces the signaling broker, but only for one link per person.
Once the first data channel exists, all further signaling (roster, chat, mesh
formation) travels through it: the online room's star topology, transposed
onto RTCDataChannels.
Design
Layer 1 — pair bootstrap (the "phone kiss")
Browsers can't discover each other on a LAN (deliberately: every discovery
primitive is a surveillance primitive), and can't open raw sockets. The only
peer transport is RTCPeerConnection, which requires a mutual exchange of:
| field | size | why it can't be omitted |
|---|---|---|
| ICE ufrag + pwd | ~30 B | proves the connection is mutually intended |
| DTLS fingerprint | 32 B | THE security: authenticates mandatory encryption, blocks LAN MITM |
| host candidate ip:port | ~30 B | where to send packets |
Full SDPs are ~2 KB but ~95% reconstructable boilerplate: both sides inflate a
fixed per-browser template and the QR carries only the ~100-byte variable
part (known "minimal SDP" technique). Trickle ICE is OFF: a QR is a one-shot
payload, so gather all candidates before rendering it. MAC addresses are
unknowable from JS and unnecessary (ARP resolves them below IP).
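The variable part could be packed into a compact binary payload before QR encoding. The sketch below is one possible layout, not the design's actual codec: the `KissPayload` name, the field order, the version byte, and the IPv4-only address are all assumptions made for illustration. Extracting these fields from a real SDP and inflating the per-browser template are deliberately left out.

```typescript
// Hypothetical wire layout (an assumption, not part of the design):
// [version:1][ufragLen:1][ufrag][pwdLen:1][pwd][fingerprint:32][ipv4:4][port:2]

interface KissPayload {
  ufrag: string;           // ICE username fragment (ASCII)
  pwd: string;             // ICE password (ASCII)
  fingerprint: Uint8Array; // SHA-256 DTLS certificate fingerprint, 32 raw bytes
  ip: string;              // IPv4 host candidate address, dotted quad
  port: number;            // host candidate UDP port
}

const VERSION = 1;
const FP_LEN = 32;

function encodeKiss(p: KissPayload): Uint8Array {
  const enc = new TextEncoder();
  const ufrag = enc.encode(p.ufrag);
  const pwd = enc.encode(p.pwd);
  const parts = p.ip.split(".");

  if (ufrag.length > 255 || pwd.length > 255) throw new Error("credential too long");
  if (p.fingerprint.length !== FP_LEN) throw new Error("expected a 32-byte fingerprint");
  if (parts.length !== 4 || parts.some((s) => !/^\d{1,3}$/.test(s) || Number(s) > 255)) {
    throw new Error("expected a dotted-quad IPv4 address");
  }
  if (!Number.isInteger(p.port) || p.port < 1 || p.port > 65535) throw new Error("bad port");

  const out = new Uint8Array(3 + ufrag.length + pwd.length + FP_LEN + 6);
  let i = 0;
  out[i++] = VERSION;
  out[i++] = ufrag.length;
  out.set(ufrag, i); i += ufrag.length;
  out[i++] = pwd.length;
  out.set(pwd, i); i += pwd.length;
  out.set(p.fingerprint, i); i += FP_LEN;
  out.set(parts.map(Number), i); i += 4;
  out[i++] = p.port >> 8;   // port, big-endian
  out[i++] = p.port & 0xff;
  return out;
}

function decodeKiss(buf: Uint8Array): KissPayload {
  const dec = new TextDecoder();
  let i = 0;
  if (buf[i++] !== VERSION) throw new Error("unknown payload version");
  const ufragLen = buf[i++];
  const ufrag = dec.decode(buf.subarray(i, i + ufragLen)); i += ufragLen;
  const pwdLen = buf[i++];
  const pwd = dec.decode(buf.subarray(i, i + pwdLen)); i += pwdLen;
  // Everything after the credentials has a fixed size, so the total must match.
  if (buf.length !== i + FP_LEN + 6) throw new Error("truncated or oversized payload");
  const fingerprint = buf.slice(i, i + FP_LEN); i += FP_LEN;
  const ip = Array.from(buf.subarray(i, i + 4)).join("."); i += 4;
  const port = (buf[i] << 8) | buf[i + 1];
  return { ufrag, pwd, fingerprint, ip, port };
}
```

With a 4-character ufrag and a 24-character pwd (lengths vary by browser), this layout comes to 69 bytes, which fits the ~100-byte budget with room for a second candidate.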
Each bootstrap = two QRs (the handshake is mutual: each side needs the
OTHER's fingerprint/credentials). UX collapses both into one gesture: each
screen shows its QR on top and runs the scanner below, so two phones facing
each other complete the exchange hands-free.
Every bootstrapped connection carries: audio track + placeholder video track
(the no-churn architecture, verbatim) + one data channel ("the wire").
⚠️ The brittle bit: SDP templates drift across browser versions. Needs a
template per browser family (Chromium/WebKit/Gecko) and a test matrix; the
fallback is the lazy mode (full ~2 KB SDP in a denser QR).
Layer 2 — the offline room (founder star over data channels)
The FIRST device ("founder") is the offline analog of the room authority:
- keeps the roster, broadcasts it on change (reuse the protocol.ts message
  shapes: roster / chat / ping)
- relays chat (attribution by roster, never the wire; same rule as online)
- relays signaling for joiner↔joiner pair formation: when C joins, C's
  offers to B (and answers back) travel C → founder → B over the wires.
  These relayed handshakes have no QR constraints: full SDP, trickle ICE.
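The founder's three duties can be sketched as a small transport-agnostic relay. The message shapes here are hypothetical stand-ins (the real ones live in protocol.ts), and `FounderRelay`, `Send`, `join`, `leave`, and `receive` are illustrative names. The sketch also assumes one reading of the attribution rule: the founder stamps the sender from the roster entry bound to the connection a message arrived on and ignores any sender claimed in the payload.

```typescript
type PeerId = string;

// Hypothetical message shapes; the real ones are defined in protocol.ts.
type Inbound =
  | { type: "chat"; text: string }
  | { type: "signal"; to: PeerId; data: unknown };

type Outbound =
  | { type: "roster"; peers: PeerId[] }
  | { type: "chat"; from: PeerId; text: string }
  | { type: "signal"; from: PeerId; data: unknown };

// One send function per bootstrapped data channel (assumed wrapper).
type Send = (msg: Outbound) => void;

class FounderRelay {
  private wires = new Map<PeerId, Send>();

  constructor(private readonly founderId: PeerId) {}

  // Called once a newcomer's kiss has produced an open data channel.
  join(id: PeerId, send: Send): void {
    this.wires.set(id, send);
    this.broadcastRoster();
  }

  leave(id: PeerId): void {
    if (this.wires.delete(id)) this.broadcastRoster();
  }

  // `from` is the roster id bound to the wire the message arrived on,
  // never a field read out of the message itself.
  receive(from: PeerId, msg: Inbound): void {
    if (!this.wires.has(from)) return;
    if (msg.type === "chat") {
      this.wires.forEach((send, id) => {
        if (id !== from) send({ type: "chat", from, text: msg.text });
      });
    } else {
      // Relay an offer/answer/candidate to exactly one other joiner.
      this.wires.get(msg.to)?.({ type: "signal", from, data: msg.data });
    }
  }

  private broadcastRoster(): void {
    const peers = [this.founderId, ...Array.from(this.wires.keys())];
    this.wires.forEach((send) => send({ type: "roster", peers }));
  }
}
```

In a browser, each `Send` would wrap `channel.send(JSON.stringify(msg))` on that peer's RTCDataChannel; keeping the relay free of WebRTC types makes it testable without a browser.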
Glare-free rule for pair formation: smaller peer id initiates (same as
mesh.ts). Result: N people = N−1 QR kisses (each newcomer kisses only
the founder); the full mesh assembles itself.
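A minimal sketch of the glare rule and the resulting counts, assuming peer ids are strings compared lexicographically (the actual id type and comparison used in mesh.ts are not shown here). `shouldInitiate`, `offersToSend`, `qrKisses`, and `meshLinks` are illustrative names.

```typescript
type PeerId = string;

// True when this device should send the offer for the pair (me, other).
// Both sides evaluate the same comparison, so exactly one of them initiates.
function shouldInitiate(me: PeerId, other: PeerId): boolean {
  if (me === other) throw new Error("a peer never pairs with itself");
  return me < other;
}

// On each roster update: offer to every peer not linked yet for which
// this device holds the smaller id. The others will offer to us.
function offersToSend(me: PeerId, roster: PeerId[], linked: ReadonlySet<PeerId>): PeerId[] {
  return roster.filter((p) => p !== me && !linked.has(p) && shouldInitiate(me, p));
}

// N people need N-1 QR kisses (each newcomer kisses only the founder)...
const qrKisses = (n: number): number => Math.max(0, n - 1);
// ...but the full pairwise mesh has N(N-1)/2 links.
const meshLinks = (n: number): number => (n * (n - 1)) / 2;
```

For a table of 6, that is 5 kisses and 15 links, so 10 links form through relayed signaling alone.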
Layer 3 — media (unchanged)
Pairwise mesh, never through the founder. All battle-tested rules apply
unchanged: placeholder video lane, replaceTrack-only camera toggles,
offerToReceive both sections, cam-flag tile gating. LAN is the best-case mesh
environment (gigabit, ~0 ms RTT, no upload caps) — 6 tiles run better in one
house than online across town.
Reused vs new
Reused: media.ts whole; mesh principles; protocol.ts message shapes;
the entire call UI (Tile/CallSurface/widget panel, chat pane); the existing QR
scanner (BarcodeDetector + jsQR fallback, already battle-tested on iOS).
New: minimal-SDP template codec (the hard 20%); QR render/scan screens;
founder relay (a port of room.ts authority onto data channels); offline
entry UI; persisted DTLS certificate (IndexedDB) so repeat pairs shrink
future QRs further (fingerprint already known).
Setup (what users need before going offline)
The app, available offline: install kibitz.chat as a PWA while online
(service worker caches it; secure origin is retained offline, which
satisfies getUserMedia's HTTPS requirement). Alternative: any LAN machine
serving dist/ (the "Kibitz Local" bundle — separate design).
A common IP network: shared Wi-Fi — or the founder's phone opens a
hotspot and others join it (a LAN made from nothing; works in a field).
Permissions: camera + mic, granted during the flow (the camera doubles
as the QR scanner).
Flow
Duo ("the phone kiss") — ~30 s
A: "Start offline call" → grant cam/mic → screen = QR (top) + scanner (bottom)
B: "Join offline call" → grant cam/mic → same split screen
Face the phones at each other: B reads A's QR → B's screen flips to its
answer QR → A's scanner reads it → connected, normal call UI
Connection fully lost → re-kiss (no broker = no signaled reconnect; brief Wi-Fi blips heal on their own via ICE)
Group ("tickets at the door") — N−1 kisses
Founder: "Open an offline table" → shows the persistent door QR
Each newcomer: "Join" → kiss the founder once → roster updates, mesh
self-forms through relayed signaling (joiner↔joiner links appear in ~1 s,
no further ceremony)
Founder leaves → existing calls CONTINUE (media is pairwise) but the door
closes: no new joins, roster frozen. v2: any member re-founds and others
re-kiss.
Topologies (best first, per situation)
Hotspot + cellular present — not offline mode at all: regular Kibitz;
signaling sips cellular (KBs), ICE routes the media LOCALLY over the
hotspot LAN. Full features, near-zero data. Always prefer when ≥1 bar.
Hotspot, zero bars — ANDROID HOSTS ONLY (field-verified 2026-06-07):
iPhone Personal Hotspot is a cellular-sharing feature; with no cellular
(dead zone / airplane mode) iOS disables it entirely — an iPhone cannot
create a local network, ever. Android's local-only hotspot works SIM-less
in airplane mode. So the zero-infrastructure shape needs one Android (or a
battery travel router) in the party; iPhones join fine as clients. Costs:
founder battery, and some Android hotspots enforce AP isolation — test.
Failure recovery (mildest first)
Blips: ICE's own recovery heals disconnected states in seconds — no
signaling needed. Covers most real-world drops for free.
Failed link in a group: ICE restart (fresh offer/answer) relayed over
ANY surviving data channel. Design the relay as any-member (not
founder-only) so the room self-heals as long as the survivors still form a
connected path. Robustness grows with group size.
Total loss (duo / full partition): physically unreconnectable — zero
channels means no way to deliver a new handshake (the online broker is
precisely what was traded away). Mitigation: both devices detect the death
and flip straight back to the kiss screen; with persisted certificates the
re-pair QR is tiny and the ritual takes ~5 s.
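The three cases above amount to a small decision procedure. The sketch below assumes each device tracks which data channels are still open as undirected pairs. `recover`, `relayPath`, and the `Recovery` shape are illustrative names, and the state names follow RTCPeerConnection's `iceConnectionState` values `"disconnected"` and `"failed"`.

```typescript
type PeerId = string;

type Recovery =
  | { action: "wait" }                          // blip: let ICE heal it
  | { action: "ice-restart"; via: PeerId[] }    // relay a fresh offer/answer
  | { action: "re-kiss" };                      // no channel left: back to the QR

// Breadth-first search over the surviving data channels.
// Returns the full path including both endpoints, or null if partitioned.
function relayPath(from: PeerId, to: PeerId, open: Array<[PeerId, PeerId]>): PeerId[] | null {
  const neighbours = new Map<PeerId, PeerId[]>();
  const add = (a: PeerId, b: PeerId): void => {
    const list = neighbours.get(a) ?? [];
    list.push(b);
    neighbours.set(a, list);
  };
  open.forEach(([a, b]) => { add(a, b); add(b, a); });

  const cameFrom = new Map<PeerId, PeerId | null>([[from, null]]);
  const queue: PeerId[] = [from];
  for (let head = 0; head < queue.length; head++) {
    const here = queue[head];
    if (here === to) {
      const path: PeerId[] = [];
      for (let p: PeerId | null = to; p !== null; p = cameFrom.get(p) ?? null) path.unshift(p);
      return path;
    }
    for (const n of neighbours.get(here) ?? []) {
      if (!cameFrom.has(n)) { cameFrom.set(n, here); queue.push(n); }
    }
  }
  return null;
}

// `open` lists the channels that still work. The failed link's own channel
// is not among them, since it shares the connection that just died.
function recover(
  me: PeerId,
  peer: PeerId,
  state: "disconnected" | "failed",
  open: Array<[PeerId, PeerId]>,
): Recovery {
  if (state === "disconnected") return { action: "wait" };
  const path = relayPath(me, peer, open);
  return path ? { action: "ice-restart", via: path.slice(1, -1) } : { action: "re-kiss" };
}
```

Because the path search accepts any member as a hop, a room whose founder has left can still repair a failed link through whichever members remain connected to both ends.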
Limits (honest)
Mesh ceiling ~6 people (best-case environment though)
Template maintenance tax across browser updates
No ringing/notifications (nothing to ring through)
Founder is the door (not the call) — see group flow
Both displays needed for a kiss (accessibility fallback: read the ~100-byte
code aloud, or send it through the share sheet via AirDrop / Nearby Share,
which ride the OS's own local discovery)
Effort estimate
Duo mode: 2–3 days (the template codec + cross-browser test matrix dominates)
Group mode: +2–3 days (founder relay = room.ts authority port + mesh wiring)