LLM Council — Local Multi-LLM Orchestrator

Local-first multi-LLM consensus system using Ollama: parallel answers → anonymized peer review → chairman synthesis.

Role: Systems engineering demo · Timeframe: Prototype build · Stack: Node.js • TypeScript • Ollama • Zod • SQLite • Vite
Focus: Local AI • Distributed Systems • Ollama • Zod • SQLite • Observability • LAN Deployment
At a glance
  • Problem
    Local-first multi-LLM consensus system using Ollama: parallel answers → anonymized peer review → chairman synthesis.
  • Role
    Systems engineering demo
  • Timeframe
    Prototype build
  • Stack
    Node.js • TypeScript • Ollama • Zod
  • Focus
    Local AI • Distributed Systems • Ollama
  • Results
    A local-first orchestration pattern that turns multi-LLM disagreement into a traceable, reviewable consensus flow without relying on cloud inference.

Problem

Single-model answers can be brittle, biased, or inconsistent. Teams struggle to compare models in a repeatable way, and it’s hard to explain why a final answer was chosen.

Local-first matters: sensitive data stays on the machine, inference works offline, costs stay predictable, and you avoid vendor lock-in.

Executive summary
  • Local-only inference (Ollama) with no cloud calls
  • Strict JSON/Zod contracts between services
  • Observability + run history for reproducible demos

Context

Architecture at a glance
[Architecture diagram: LLM Council, local-first multi-LLM consensus flow. Client UI (prompt + policy) → Orchestrator (Zod contracts) → Ollama peers (anonymized votes) → Chairman (synthesis + JSON); SQLite stores run history; health/heartbeat reports LAN status. Local only, no cloud inference.]

Architecture at a glance — A client UI submits a prompt and policy to the orchestrator. The orchestrator fans out to Ollama peers, anonymizes answers, collects reviews, and passes a ranked signal to the chairman service for a final JSON response. Health/heartbeat metrics and run history persist locally.
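
As a rough sketch of the fan-out step (Stage 1), the snippet below asks each Ollama peer the same prompt in parallel over Ollama's HTTP API and tolerates missing peers; the peer URLs and model names are placeholders, not the project's actual configuration:

  // fanout.ts: Stage 1 sketch, query every Ollama peer in parallel (Node 18+ fetch).
  type PeerAnswer = { peer: string; model: string; text: string };

  // Illustrative peers; the real list comes from the orchestrator's config.
  const peers = [
    { url: "http://127.0.0.1:11434", model: "llama3.1" },
    { url: "http://127.0.0.1:11435", model: "mistral" },
  ];

  export async function fanOut(prompt: string): Promise<PeerAnswer[]> {
    const requests = peers.map(async (p) => {
      const res = await fetch(`${p.url}/api/generate`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ model: p.model, prompt, stream: false }),
      });
      const data = (await res.json()) as { response: string };
      return { peer: p.url, model: p.model, text: data.response };
    });
    // Partial failure tolerance: keep fulfilled answers, skip unreachable peers.
    const settled = await Promise.allSettled(requests);
    return settled
      .filter((s): s is PromiseFulfilledResult<PeerAnswer> => s.status === "fulfilled")
      .map((s) => s.value);
  }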

How to verify
  • Quickstart (local)
    npm install
    npm --prefix orchestrator install
    npm --prefix chairman install
    npm --prefix ui install
    
    cp .env.example .env
    cp orchestrator/.env.example orchestrator/.env
    cp chairman/.env.example chairman/.env
    cp ui/.env.example ui/.env
    
    ./scripts/run_all_local.sh
    open http://localhost:5173

Local multi-LLM orchestration with Ollama and strict JSON contracts

LLM Council runs fully on-device, coordinating multiple models with Zod-validated JSON between services.

This keeps inference private, offline-capable, and reproducible.
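
A minimal sketch of one such contract, assuming illustrative field names (the real schemas live alongside each service):

  // contracts.ts: Zod-validated payload exchanged between services.
  import { z } from "zod";

  export const PeerAnswerSchema = z.object({
    answerId: z.string(),                // anonymized ID, never the model name
    text: z.string().min(1),
    latencyMs: z.number().nonnegative(),
  });

  export const CouncilRoundSchema = z.object({
    runId: z.string().uuid(),
    prompt: z.string(),
    answers: z.array(PeerAnswerSchema).min(1),
  });

  export type CouncilRound = z.infer<typeof CouncilRoundSchema>;

  // Parsing at the service boundary rejects drifted payloads instead of passing them on.
  export function parseRound(payload: unknown): CouncilRound {
    return CouncilRoundSchema.parse(payload); // throws ZodError on mismatch
  }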

Consensus pipeline: anonymized peer review and chairman synthesis

Responses are anonymized, reviewed, and ranked before the chairman produces the final structured answer.

This makes the selection process transparent and auditable.
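
The sketch below illustrates the idea with hypothetical types: answers are relabeled with neutral IDs before review, and review scores are aggregated into a ranking (the numeric scoring scheme is an assumption, not necessarily the project's):

  // review.ts: Stage 2 sketch, anonymize answers and aggregate peer scores.
  type Anonymized = { answerId: string; text: string };
  type Review = { answerId: string; score: number }; // reviewers never see model names

  export function anonymize(answers: { model: string; text: string }[]) {
    const keyTable = new Map<string, string>(); // answerId -> model, held only by the orchestrator
    const anonymized: Anonymized[] = answers.map((a, i) => {
      const answerId = `answer-${i + 1}`;
      keyTable.set(answerId, a.model);
      return { answerId, text: a.text };
    });
    return { anonymized, keyTable };
  }

  export function rank(reviews: Review[]): { answerId: string; total: number }[] {
    const totals = new Map<string, number>();
    for (const r of reviews) totals.set(r.answerId, (totals.get(r.answerId) ?? 0) + r.score);
    return [...totals.entries()]
      .map(([answerId, total]) => ({ answerId, total }))
      .sort((a, b) => b.total - a.total); // highest combined score first
  }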

Architecture

  1. Pipeline stages (Stage 1 → 3)
    • Stage 1: council members generate independent answers.
    • Stage 2: anonymized peer review ranks responses without identity bias.
    • Stage 3: chairman synthesizes the final response with strict JSON output (a sketch follows this list).
    • Partial failure tolerance: missing peers can be skipped without blocking the run.
  2. Service boundaries & contracts
    • Orchestrator coordinates stages, anonymizes responses, and persists run metadata.
    • Each service exchanges validated JSON payloads via Zod schemas to prevent drift.
    • Run IDs and latency telemetry make outputs traceable for audits and demos.
  3. Observability & run history
    • UI surfaces stage timing, peer health, and final synthesis details.
    • SQLite stores run summaries, rankings, and exportable JSON artifacts.
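
As a sketch of Stage 3 under illustrative assumptions (the chairman model name and output fields are placeholders), the chairman can be asked for strictly structured output via Ollama's JSON format option and the result validated with Zod before it is returned:

  // chairman.ts: Stage 3 sketch, synthesize a final answer as validated JSON.
  import { z } from "zod";

  const FinalAnswerSchema = z.object({
    answer: z.string(),
    rationale: z.string(),
    rankingUsed: z.array(z.string()), // anonymized answer IDs in ranked order
  });

  export async function synthesize(prompt: string, rankedAnswers: string[]) {
    const res = await fetch("http://127.0.0.1:11434/api/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama3.1", // illustrative chairman model
        prompt: `${prompt}\n\nRanked candidate answers:\n${rankedAnswers.join("\n")}`,
        format: "json",    // Ollama constrains the reply to valid JSON
        stream: false,
      }),
    });
    const data = (await res.json()) as { response: string };
    return FinalAnswerSchema.parse(JSON.parse(data.response)); // reject malformed output
  }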

Security / Threat Model

  • Local-only inference via Ollama; no cloud model calls or external data egress.
  • Run history is stored in SQLite and can be disabled via environment settings for sensitive demos (see the sketch after this list).
  • No auth or rate limiting by default (appropriate for local demos; add gateway controls for shared LAN use).
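
A hypothetical illustration of such a toggle; the variable name is invented for this sketch and is not the project's actual setting:

  // config.ts: env-driven toggle for persisting run history (sketch).
  // COUNCIL_PERSIST_RUNS is a hypothetical name used only for illustration.
  export const persistRuns = process.env.COUNCIL_PERSIST_RUNS !== "false";

  export function maybeRecordRun(write: () => void): void {
    if (persistRuns) write(); // skip persistence entirely for sensitive demos
  }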

Results

A local-first orchestration pattern that turns multi-LLM disagreement into a traceable, reviewable consensus flow without relying on cloud inference.

Stack

Node.js • TypeScript • Ollama • Zod • SQLite • Vite

FAQ

Why local-first instead of a cloud LLM API?

Local inference keeps sensitive data on-device, works offline, and avoids variable per‑request costs or vendor lock‑in.

Does the orchestrator call LLMs directly?

No. The orchestrator coordinates requests and aggregates responses, while member services handle LLM calls.

What makes the results reproducible?

Strict JSON/Zod contracts plus SQLite run history make it easy to replay and audit outputs.
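
For example, a stored run could be read back out of SQLite for an audit roughly like this; the table and column names are assumptions, and better-sqlite3 stands in for whichever driver the project actually uses:

  // history.ts: sketch of loading a past run for replay or audit.
  import Database from "better-sqlite3";

  const db = new Database("council.db", { readonly: true });

  export function loadRun(runId: string) {
    const run = db.prepare("SELECT id, prompt, created_at FROM runs WHERE id = ?").get(runId);
    const answers = db
      .prepare("SELECT answer_id, score, text FROM answers WHERE run_id = ? ORDER BY score DESC")
      .all(runId);
    return { run, answers };
  }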
