How the platform is built
A technical overview of the runtime, data model, agent pipeline, and guardrails behind BRH.AI's voice and SMS agents.
The Coming AI Divide
Enterprise AI adoption is accelerating fast. Large organizations have the capital to hire ML engineers, negotiate API contracts, and build internal orchestration layers. For local businesses (contractors, clinics, hospitality operators, trades), the barrier to entry is effectively a wall: no in-house technical staff, no data science team, and no appetite for six-figure integration projects.
The result will be a two-tier economy. National chains and well-capitalized incumbents will deploy AI at scale, compressing margins and response times, while small and medium businesses watch from the sidelines. The gap is not capability; it is access. The tooling exists, but it is packaged for engineers, not for a roofing company in Petaluma or a dental clinic in Healdsburg.
This platform is built on the premise that the gap can be closed with the right architecture: config-driven agents, no-code dashboards, and a deployment model that treats every town as its own tenant. The long-term thesis is deployable infrastructure — a system light enough to spin up in a matter of minutes, rigorous enough to handle real customer traffic, and repeatable enough to franchise into every small and medium-sized market.
The Guided Deployment Model
We are in the dial-up era of AI. What is being deployed today (chatbots, voice agents, retrieval pipelines) is a small fraction of what the technology will eventually be capable of. In the same way that a 28.8k modem and an AOL chat room were simple, early expressions of the internet, today's AI tools are primitive compared to what is coming.
That matters because the trajectory is steep. As AI capability compounds, the gap between what is possible and what the average business owner can self-implement will widen, not narrow. Most local operators do not have time to learn prompt engineering, evaluate models, or wire together CRM integrations. They need a trusted, local expert who can handle the full lifecycle: discovery, scoping, build, testing, and deployment.
This platform is designed to be deployed by a Product Provider — a local partner who understands the community, the trade, and the business. The provider runs the provisioning flow, configures the agent to match local hours and pricing, tests the handoff rules, and trains the owner on the dashboard. The business owner never touches a config file. The result is faster adoption, lower resistance, and a human layer of accountability that pure self-serve software cannot replicate.
Local Product Provider
A single point of contact who handles sales, onboarding, configuration, and ongoing support — turning a technical product into a local service.
Dial-Up Analogy
Early internet needed ISPs and local technicians to get households online. Early AI needs the same human layer to bridge the gap between capability and adoption.
Faster Widespread Adoption
A distributed network of local providers can onboard businesses faster than any centralized SaaS funnel, overcoming resistance through trust and proximity.
Design Goals
The platform is designed around three constraints: low-latency response on the inbound message path, deterministic behavior at the tool-call boundary, and tenant-level isolation enforced at the database — not in application code.
Everything below — the edge runtime choice, the typed RPC layer, the row-level security model, the config-as-data agent definition — follows from those constraints. The dashboard, admin tools, and integrations are built on the same primitives the agents use.
System Map
Marketing site, authenticated portal, agent runtime, and admin tooling share a single codebase, router, and database — wired together by typed server functions and a small set of public webhook endpoints. The same stack can be replicated per market without forking code.
Marketing Website
brh-ai.com — SEO-optimized, multi-page, lead-generating
AI Receptionist
Voice AI Agent with client portal
- Dedicated client login + KPI dashboard
- Real-time call handling & booking
- Admin controls for hours, pricing, voice
- Call transcripts & lead analytics
AI SMS Text Agent
Conversational SMS AI with client portal
- Dedicated client login + KPI dashboard
- Stateful multi-turn SMS conversations
- Admin controls for responses & guardrails
- Lead qualification & auto-booking
Unified Infrastructure
Supabase Auth with role-based access
Postgres + RLS, per-client data scoping
Cross-product support & issue tracking
Single cognitive layer powering both agents
Platform at a Glance
A production platform built as deployable infrastructure. The marketing site is SEO-optimized with AI-assisted content strategy and live paid campaigns driving qualified local leads. Under the hood: TanStack Start frontend architecture with Tailwind and shadcn/ui, edge-deployed serverless functions handling Twilio voice and SMS webhooks, a multi-tenant Postgres database with row-level security, and a unified AI orchestration layer that powers both SaaS products through a single cognitive engine. Ticket system, client portals with real-time KPI dashboards, full admin configuration, RBAC, CRM connectors, and every API route were designed as one cohesive, repeatable system — built to scale town by town without code divergence.
Request lifecycle
Each inbound message flows through three layers: Perception (webhook ingest, signature verification, envelope normalization), Reasoning (context rehydration, prompt compilation, model call), and Action (validated tool calls, idempotent side-effects, response dispatch).
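The three layers can be sketched as three small functions composed in order. This is a minimal illustration, not the platform's code: the `Envelope`, `Decision`, and `Reasoner` types and every function name are assumptions, the reasoner is a synchronous stub standing in for an asynchronous model call, and only `From` and `Body` (standard Twilio SMS webhook parameters) come from a real API.

```typescript
// Illustrative types only; field names are assumptions, not the platform's schema.
type Channel = "sms" | "voice";

interface Envelope {
  tenantId: string;
  channel: Channel;
  peerId: string; // the customer's phone number
  text: string;
}

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface Decision {
  reply: string;
  toolCalls: ToolCall[];
}

// A real reasoner would call a model asynchronously; a synchronous stub keeps the sketch short.
type Reasoner = (envelope: Envelope) => Decision;

// Perception: normalize a raw (already signature-verified) webhook payload into one envelope shape.
function perceive(
  tenantId: string,
  channel: Channel,
  raw: Record<string, string | undefined>,
): Envelope {
  const peerId = raw["From"];
  if (!peerId) throw new Error("payload has no sender");
  return { tenantId, channel, peerId, text: (raw["Body"] ?? "").trim() };
}

// Action: split the model's tool calls into allow-listed ones and rejected ones.
function act(decision: Decision, allowedTools: ReadonlySet<string>) {
  const accepted: string[] = [];
  const rejected: string[] = [];
  for (const call of decision.toolCalls) {
    (allowedTools.has(call.name) ? accepted : rejected).push(call.name);
  }
  return { reply: decision.reply, accepted, rejected };
}

// The full path: Perception, then Reasoning, then Action.
function handleInbound(
  tenantId: string,
  channel: Channel,
  raw: Record<string, string | undefined>,
  reason: Reasoner,
  allowedTools: ReadonlySet<string>,
) {
  const envelope = perceive(tenantId, channel, raw);
  const decision = reason(envelope);
  return act(decision, allowedTools);
}
```

Keeping the reasoner behind a function type means the model call can be swapped or stubbed in tests without touching the perception or action code.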
Dashboard & Admin Surface
The portal is a thin client over the same typed server functions the agents use. Reads go through TanStack Query loaders; writes are Zod-validated server functions guarded by RLS and role checks. This is the layer that makes the system usable by non-technical operators.
KPI surface
Call volume, booking conversion, lead capture, and median response latency aggregated from the message log and rendered through TanStack Query with live invalidation on new events.
Agent configuration
Hours, service areas, pricing, and guardrails are editable rows with Zod-validated forms. Writes invalidate the agent's compiled prompt cache so the next inbound message uses the new config.
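The invalidate-on-write behavior might look roughly like the following. It is a sketch under stated assumptions: `createPromptCache`, `saveConfig`, and `promptFor` are hypothetical names, the `AgentConfig` shape is trimmed to two fields, and in-memory maps stand in for the Postgres rows and whatever cache the platform actually uses.

```typescript
// Hypothetical config shape; the real rows also hold pricing and guardrails.
interface AgentConfig {
  hours: string;
  serviceAreas: string[];
}

type Compile = (config: AgentConfig) => string;

function createPromptCache(compile: Compile) {
  const configs = new Map<string, AgentConfig>(); // stands in for the config rows
  const compiled = new Map<string, string>(); // compiled prompt per tenant

  return {
    // Write path: persist the new config, then drop the stale compiled prompt.
    saveConfig(tenantId: string, config: AgentConfig): void {
      configs.set(tenantId, config);
      compiled.delete(tenantId);
    },
    // Read path: compile lazily on the first inbound message after a write.
    promptFor(tenantId: string): string {
      const cached = compiled.get(tenantId);
      if (cached !== undefined) return cached;
      const config = configs.get(tenantId);
      if (!config) throw new Error(`no config for tenant ${tenantId}`);
      const prompt = compile(config);
      compiled.set(tenantId, prompt);
      return prompt;
    },
  };
}
```

Deleting the cache entry on write, rather than recompiling eagerly, keeps the form submission fast and defers the compile cost to the next inbound message.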
Conversation log
Full transcripts with model inputs, tool calls, and outputs preserved per turn. Flagged turns feed back into the prompt regression set.
Roles & access
Roles live in a dedicated user_roles table and are checked server-side via a SECURITY DEFINER function — never trusted from the client.
Billing
Subscription state synced from Stripe via signature-verified webhooks; entitlements derived server-side and joined into the session claims.
Support tickets
Tickets are first-class rows with status transitions, threaded messages, and the same RLS model as the rest of the platform.
Runtime Stack
TanStack Start (React 19, Vite 7) on Cloudflare Workers; Postgres with Row Level Security as the system of record; Tailwind v4 for the design layer; Zod at every trust boundary.
Orchestration Layer
Per-tenant system prompts compiled from a structured config (services, hours, pricing, geofence). LLM calls route through a thin adapter so providers are swappable; tool calls are constrained to a typed, allow-listed schema.
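As an illustration of compiling a prompt from structured config, the sketch below turns one tenant's settings into a system prompt. The `TenantConfig` shape, the `compileSystemPrompt` name, and the prompt wording are all assumptions made for the example; only the four config categories (services, hours, pricing, geofence) come from the description above.

```typescript
// Hypothetical structured config for one tenant.
interface TenantConfig {
  businessName: string;
  services: string[];
  hours: string; // e.g. "Mon-Fri 08:00-17:00"
  pricing: Record<string, string>; // service name -> approved price text
  geofence: string[]; // service-area names
}

// Compile the config into a system prompt. Sorting the list fields makes the
// output independent of row order, so identical config yields an identical prompt.
function compileSystemPrompt(config: TenantConfig): string {
  const services = [...config.services].sort();
  const pricing = services
    .filter((service) => config.pricing[service] !== undefined)
    .map((service) => `${service}: ${config.pricing[service]}`);
  const areas = [...config.geofence].sort();

  return [
    `You are the receptionist for ${config.businessName}.`,
    `Services offered: ${services.join(", ")}.`,
    `Hours: ${config.hours}.`,
    `Approved prices: ${pricing.length > 0 ? pricing.join("; ") : "none, do not quote prices"}.`,
    `Service area: ${areas.join(", ")}.`,
    "Do not offer services, prices, or areas that are not listed above.",
  ].join("\n");
}
```

A stable output for equivalent config is what makes versioning and prompt regression testing practical: two prompt versions differ only when the underlying rows differ.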
Communications Layer
Inbound SMS and voice handled via webhook endpoints under /api/public/*. Signature-verified at the edge, normalized into a single message envelope, and persisted before any model call so conversations are replayable.
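For Twilio webhooks, signature verification follows Twilio's documented scheme: an HMAC-SHA1, keyed with the account auth token, over the full request URL followed by each POST parameter's name and value (parameters sorted by name), base64-encoded and compared against the `X-Twilio-Signature` header. The sketch below implements that check with `node:crypto`, which is available on Workers when `nodejs_compat` is enabled. The function names are hypothetical, and this is not necessarily how the platform itself performs the check.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Build the signed string (URL + sorted key/value pairs) and HMAC it with the auth token.
function computeTwilioSignature(
  authToken: string,
  url: string,
  params: Record<string, string>,
): string {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac("sha1", authToken).update(data, "utf8").digest("base64");
}

// Compare the expected signature with the header value in constant time.
function isValidTwilioRequest(
  authToken: string,
  url: string,
  params: Record<string, string>,
  headerSignature: string,
): boolean {
  const expected = Buffer.from(computeTwilioSignature(authToken, url, params));
  const received = Buffer.from(headerSignature);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The URL passed in must match the one Twilio called, including scheme and query string, which is worth checking when the endpoint sits behind a proxy that rewrites the host.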
Action & Integration Layer
Side-effects (calendar writes, CRM upserts, notifications) run as typed server functions with idempotency keys. OAuth tokens are stored encrypted; failed actions retry with exponential backoff and surface in the admin log.
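Two pieces of that description lend themselves to a short sketch: the retry delay schedule and the idempotency check. The code below assumes a base delay of 200 ms, a doubling factor, and a 5-second cap, none of which are stated in the source, and uses an in-memory map where a real system would use a uniquely keyed table. `backoffDelayMs` and `createIdempotentRunner` are hypothetical names.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at capMs.
// Production systems commonly add random jitter; it is omitted here to keep the output deterministic.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run a side-effect at most once per idempotency key and replay the stored result afterwards.
function createIdempotentRunner<T>() {
  const results = new Map<string, T>(); // stands in for a table with a unique key constraint

  return function runOnce(key: string, effect: () => T): T {
    if (results.has(key)) return results.get(key) as T;
    const result = effect(); // if this throws, nothing is recorded and a retry can run it again
    results.set(key, result);
    return result;
  };
}
```

A typical key would combine the tenant, the conversation turn, and the tool name, so that a retried webhook delivery cannot create a second calendar event for the same request.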
Provisioning Flow
Sign-up through activation is four steps, each a discrete transaction executed by a Product Provider on behalf of the client. There is no separate deploy pipeline — agent definitions are data, so going live is a row-state change, not a build. A new location can be onboarded in minutes.
Account provisioning
Email + OAuth sign-up creates an auth user, a tenant row, and a default role assignment in a single transaction.
Agent configuration
Services, hours, pricing, and voice preferences are written through validated forms; the prompt compiler produces a versioned agent definition.
Integrations
OAuth flows for Google Workspace; webhook + secret pairs for CRMs and telephony. Credentials encrypted at rest, scoped per tenant.
Activation
Flipping the agent live registers the inbound number routes and unblocks the public webhook endpoints. No redeploy step — minutes, not months.
How Deployment Works
A deployment is not a build-and-ship event. It is the act of wiring a tenant's configuration, integrations, and phone numbers into the live runtime. Everything below is data — no code is forked, no server is provisioned.
Agents
Voice and SMS agent definitions are compiled from tenant config rows (services, hours, pricing, geofence) into versioned system prompts. Each agent is a deterministic state machine: perception → reasoning → action. No model weights are deployed; only prompt templates and tool schemas change.
Integrations
Twilio phone numbers and webhook endpoints are mapped per tenant. Google Workspace OAuth provides calendar write access. CRM connectors push lead data via signed webhook POSTs. Credentials are encrypted at rest with tenant-scoped keys; integration health is checked before activation.
Testing
Shadow mode runs the agent against real inbound traffic without dispatching responses, logging divergence between model output and human baseline. Conversation replay rehydrates historical threads against new prompt versions. Escalation simulation validates handoff triggers end-to-end before the agent handles live customers.
Release
Going live is a row-state update: the tenant's agent_status flips to active, the Twilio number routes to the production webhook, and the public endpoint begins accepting traffic. Rollback is the inverse: one atomic write. There is no build pipeline, no container restart, and no DNS switch.
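Release as a state change can be modeled in a few lines. In this sketch only the `agent_status` column and its `active` value come from the description above; the `inactive` value, the `integrationsHealthy` flag, and the function names are assumptions for illustration.

```typescript
// "inactive" is an assumed name for the non-live state.
type AgentStatus = "inactive" | "active";

interface TenantRow {
  tenantId: string;
  agent_status: AgentStatus;
  integrationsHealthy: boolean; // assumed summary of the pre-activation health check
}

// Activation: refuse if integrations are unhealthy, otherwise flip the status.
function activate(row: TenantRow): TenantRow {
  if (!row.integrationsHealthy) {
    throw new Error("integration health check failed; refusing to activate");
  }
  return { ...row, agent_status: "active" };
}

// Rollback: the inverse write.
function rollback(row: TenantRow): TenantRow {
  return { ...row, agent_status: "inactive" };
}

// The webhook handler consults the row on each request, so the change
// takes effect without a redeploy.
function acceptsTraffic(row: TenantRow): boolean {
  return row.agent_status === "active";
}
```

Because the gate is evaluated per request against current data, activation and rollback are both as fast as a single row update.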
Engineering Principles
The codebase is opinionated: strict types end-to-end, server-side validation as a non-negotiable, and migrations that fail loudly. LLM-assisted tooling is used during authoring; the runtime is plain, auditable TypeScript.
Typed end-to-end
TypeScript strict mode across the client, server functions, and the generated DB types. The router, RPC layer, and Postgres schema share types so a column rename surfaces as a compile error, not a 500.
Edge-first runtime
SSR and server functions run on Cloudflare Workers (workerd) with nodejs_compat. Cold starts are sub-100ms; static assets are served from the edge cache; webhooks resolve close to the originating carrier.
Config-driven agents
Agent behavior is data, not code. Hours, pricing, escalation rules, and service areas live in Postgres rows that compile into the prompt at request time, so updates propagate without a redeploy.
Guardrails
Safety lives in the system boundary, not the prompt. These are the structural controls that hold whether or not the model behaves.
| Control | Implementation | Effect |
|---|---|---|
| Scoped tool surface | LLM tool calls validated against a Zod schema; out-of-scope intents are rejected before the model can act. | The agent cannot quote prices, promise services, or write to systems it was never given access to. |
| Durable conversation state | Messages stored in Postgres keyed by (tenant_id, channel, peer_id); context window rehydrated per turn with a bounded token budget. | A customer can reply hours later and the agent resumes the thread with full prior context. |
| Deterministic handoff | Classifier + rule layer flags escalation triggers (sentiment, keywords, repeated failures) and pauses the agent atomically while paging the owner. | Edge cases reach a human instead of being improvised by the model. |
| Tenant isolation | Row Level Security on every public table; server functions execute under the caller's JWT, admin paths require an explicit has_role check. | One tenant's data is structurally unreachable from another tenant's session. |
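The scoped tool surface in the first row can be illustrated with a small validator. The platform is described as using Zod for this; the sketch below deliberately uses hand-written type guards instead so that it runs without dependencies, and the tool names (`book_appointment`, `capture_lead`), their fields, and `validateToolCall` are hypothetical.

```typescript
// Hypothetical allow-listed tool surface, expressed as a discriminated union.
type ToolCall =
  | { name: "book_appointment"; args: { service: string; startsAt: string } }
  | { name: "capture_lead"; args: { phone: string; note: string } };

type Validation = { ok: true; call: ToolCall } | { ok: false; reason: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Validate untrusted model output before any side-effect runs.
function validateToolCall(raw: unknown, allowedServices: ReadonlySet<string>): Validation {
  if (!isRecord(raw)) return { ok: false, reason: "malformed tool call" };
  const name = raw["name"];
  const args = raw["args"];
  if (!isRecord(args)) return { ok: false, reason: "malformed arguments" };

  if (name === "book_appointment") {
    const service = args["service"];
    const startsAt = args["startsAt"];
    if (typeof service !== "string" || !allowedServices.has(service)) {
      return { ok: false, reason: "service not offered by this tenant" };
    }
    if (typeof startsAt !== "string" || Number.isNaN(Date.parse(startsAt))) {
      return { ok: false, reason: "invalid start time" };
    }
    return { ok: true, call: { name: "book_appointment", args: { service, startsAt } } };
  }

  if (name === "capture_lead") {
    const phone = args["phone"];
    const note = args["note"];
    if (typeof phone !== "string" || typeof note !== "string") {
      return { ok: false, reason: "invalid lead fields" };
    }
    return { ok: true, call: { name: "capture_lead", args: { phone, note } } };
  }

  // Anything not explicitly listed above is rejected.
  return { ok: false, reason: "tool is not on the allow-list" };
}
```

The validated result is rebuilt from checked fields rather than passed through, so extra properties the model invents never reach the action layer.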
Replicable Local Infrastructure
Because the stack is tenant-isolated, config-driven, and edge-deployed, it is structurally replicable. The same codebase can serve a roofing contractor in Santa Rosa and a dental clinic in Marin with zero code divergence — only rows in Postgres change. That means a deployment model that scales town by town, not server by server.
The barrier for a local operator is not the runtime; it is the setup. The platform compresses that setup into a guided flow: sign up, configure, connect, activate. In under ten minutes a business has a working AI receptionist and SMS agent with a live dashboard. That speed changes the economics of local AI adoption.
The end-state is a network of locally deployed, centrally maintained agent instances — each one independent, each one auditable, each one operated by people who understand the community they serve. National players optimize for scale. This is built to optimize for place.

Concept Deployment Model — Town as Tenant Neural Network
Enterprise Gap
Large organizations deploy AI at scale. Small businesses lack the technical staff and capital to keep pace. The divide widens unless the tooling is rebuilt for operators, not engineers.
Product Provider Network
Local experts handle the full lifecycle — sales, configuration, testing, deployment, and support. Human trust accelerates adoption where self-serve software stalls.
Network Effects
As more local businesses adopt, shared infrastructure (model adapters, guardrails, CRM connectors) improves for everyone. The platform gets stronger as the network grows.
Want a technical walkthrough?
Happy to screen-share the codebase, schema, and a live agent trace — useful for prospective partners, technical collaborators, and hiring conversations alike.