
Architecture Guide

Three-Layer Architecture

Singing Bird separates concerns into three strict layers:

+--------------------------------------------------+
|  Content Pack (YAML)                             |
|  Locations, entities, synths, processes,         |
|  latent facts, initial conditions                |
+--------------------------------------------------+
                      |
                      v
+--------------------------------------------------+
|  Ruleset (physical_social_v1)                    |
|  Component schemas, affordance definitions,      |
|  validation rules                                |
+--------------------------------------------------+
                      |
                      v
+--------------------------------------------------+
|  Kernel                                          |
|  Scheduling, transactions, event propagation,    |
|  perception delivery, snapshots, replay          |
+--------------------------------------------------+

The kernel does not know what any scenario entity means. It knows locations, adjacency, entities with typed components, sensory channels, and world processes. The ruleset defines what component types are valid. The content pack defines the specific world.

A content pack references its ruleset through ruleset_ref. The loader validates all components against the ruleset schemas at load time: unknown component types generate warnings; missing required properties generate errors.
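As a rough illustration of this load-time check, the sketch below warns on unknown component types and errors on missing required properties. The schema shapes and names here are invented for the example, not the real ruleset API:

```python
# Hypothetical schema registry: component_type -> required property names.
RULESET_SCHEMAS = {
    "Repairable": {"state", "repair_requirements"},
    "Container": {"open", "capacity"},
}

def validate_components(components: dict) -> tuple[list[str], list[str]]:
    """Return (warnings, errors) for one entity's component dict."""
    warnings, errors = [], []
    for ctype, props in components.items():
        schema = RULESET_SCHEMAS.get(ctype)
        if schema is None:
            # Unknown component type: warn, but do not fail the load.
            warnings.append(f"unknown component type: {ctype}")
            continue
        for missing in sorted(schema - props.keys()):
            # Missing required property: hard error.
            errors.append(f"{ctype}: missing required property '{missing}'")
    return warnings, errors
```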

The Director Cycle

Each cycle follows the normative execution sequence from the spec:

1. Deep-copy canonical state into working copies
2. Advance simulation clock on working copy
3. Evaluate due world processes (statechart transitions)
4. Begin fixed-point microcycle loop:
   a. Compose sensory bundles (channel-filtered + action surface)
   b. Run synth cognitive cycles concurrently (LLM calls)
   c. Collect intents from synths
   d. Resolve communication intents -> CommunicationEvents
   e. Resolve other intents sequentially (priority order)
   f. Check for new events (communication, world changes)
   g. If new events: re-compose affected bundles, continue loop
   h. If quiescent or budget exhausted: exit loop
5. Validate invariants against working state
6. On success: promote working copies to canonical, commit snapshot
7. On failure: discard working copies (canonical state unchanged)

Transactional rollback is real: the runtime deep-copies world, director, synths, clock, and pending outcomes before any mutation. If invariant validation fails, the working copies are discarded. The canonical state on self.world, self.director, self.synths, self.clock, and self._pending_outcomes is never touched until commit.
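The commit-or-discard shape of the cycle can be sketched as follows. The stage callables are placeholders for steps 3-5, not the real DirectorRuntime API:

```python
import copy

def run_cycle(canonical: dict, step_processes, run_microcycles, check_invariants):
    working = copy.deepcopy(canonical)   # 1. deep-copy canonical state
    working["clock"] += 1                # 2. advance clock on the working copy
    step_processes(working)              # 3. evaluate due world processes
    run_microcycles(working)             # 4. fixed-point microcycle loop
    if not check_invariants(working):    # 5. validate invariants
        return canonical                 # 7. failure: discard the working copy
    return working                       # 6. success: promote to canonical
```

Note that on failure the original `canonical` object is returned untouched; nothing is mutated in place.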

Fixed-point microcycles allow intra-cycle propagation. If Thomas speaks to Margaret in microcycle 1, she hears it and can respond in microcycle 2 of the same director cycle. The loop continues until no new events are generated (quiescent) or the event budget is exhausted.
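The termination condition (quiescence or budget exhaustion) can be sketched minimally, with collect_new_events standing in for one full microcycle (steps 4a-4f):

```python
def microcycle_loop(collect_new_events, budget: int) -> int:
    """Run microcycles until quiescent or the budget is exhausted.

    Returns the number of microcycles run. collect_new_events is a stand-in
    for composing bundles, running synths, and resolving intents; it returns
    whatever new events that microcycle generated.
    """
    ran = 0
    while ran < budget:
        events = collect_new_events()
        ran += 1
        if not events:  # quiescent: nothing new to propagate
            break
    return ran
```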

Perception Pipeline

Perception enforces epistemic separation (SYS-03): a synth never reads canonical state directly.

Stage 1: Structured Observations

The perception engine scans the synth's canonical location and adjacent locations, filtered by the synth's sensory capabilities.

A blind synth (no visual channel) receives zero visual observations. This is enforced at the perception layer, not by prompt tricks.
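The channel gate can be sketched as a simple filter: an observation is delivered only if the synth possesses its sensory channel. The observation dict shape here is illustrative:

```python
def filter_observations(observations: list[dict], channels: set[str]) -> list[dict]:
    """Keep only observations whose channel the synth can perceive."""
    return [obs for obs in observations if obs["channel"] in channels]
```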

Stage 2: Action Surface

The director builds a grounded action menu for each synth each cycle:

Available actions:
  - move_to_<uuid>: Move to Ground Floor
  - inspect_<uuid>: Examine Beacon Lamp closely
  - repair_<uuid>: Attempt to repair Beacon Lamp (jammed)
  - switch_<uuid>: Switch Flashlight on
  - take_<uuid>: Pick up Wrench
  - open_<uuid>: Open Tool Chest
  - wait: Do nothing this turn
  - communicate: Speak (Margaret, James can hear you)

Every action has a stable symbolic action_id. Every entity ref is a UUID. The synth LLM selects from this menu by ID. The resolver looks up refs directly -- no substring matching, no English parsing.

Stage 3: Perception Policy (Optional)

A PerceptionPolicy hook can modify observations before delivery. The default is a no-op pass-through. An example LowLightAmbiguityPolicy reduces visual salience in dim conditions.
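The hook shape can be sketched as below; the salience-halving rule and dict shapes are invented for the example, not the real perception_policy.py API:

```python
class PerceptionPolicy:
    """Default policy: a no-op pass-through."""
    def apply(self, observations: list[dict]) -> list[dict]:
        return observations

class LowLightAmbiguityPolicy(PerceptionPolicy):
    """Illustrative: halve the salience of visual observations in dim light."""
    def __init__(self, dim: bool):
        self.dim = dim

    def apply(self, observations: list[dict]) -> list[dict]:
        if not self.dim:
            return observations
        return [{**o, "salience": o["salience"] * 0.5}
                if o["channel"] == "visual" else o
                for o in observations]
```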

Intent Resolution

The resolver is component-based: it dispatches on the action_id prefix.
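A minimal sketch of prefix dispatch; the handler table and outcome shapes are invented for the example, not the real resolver API:

```python
def resolve(action_id: str, handlers: dict):
    """Dispatch an action_id like 'move_to_<uuid>' to its prefix handler."""
    for prefix, handler in handlers.items():
        if action_id == prefix or action_id.startswith(prefix + "_"):
            ref = action_id[len(prefix) + 1:] or None  # trailing ref, if any
            return handler(ref)
    return ("blocked", f"no handler for {action_id}")
```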

Tool-gated repair: if a Repairable component specifies repair_requirements.tools, the resolver checks whether the synth's body entity carries entities with matching Tool components. Missing tools result in blocked outcome.
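The tool gate reduces to a set check; the flat set-of-names shape below is a simplification of the component lookup described above:

```python
def check_repair_tools(required: set[str], carried: set[str]):
    """Block the repair if any required tool is not carried."""
    missing = required - carried
    if missing:
        return ("blocked", "missing tools: " + ", ".join(sorted(missing)))
    return ("ok", None)
```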

Conflict resolution: intents are resolved sequentially in priority order. If two synths try to take the same item, the first succeeds and the second gets blocked (item already carried).
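The first-taker-wins rule can be sketched as below; the intent and outcome shapes are invented for the example:

```python
def resolve_takes(intents: list[tuple[str, str]], carried_by: dict) -> list[tuple[str, str]]:
    """Resolve (synth, item) take intents sequentially; first taker wins."""
    outcomes = []
    for synth, item in intents:  # intents arrive already sorted by priority
        if item in carried_by:
            outcomes.append((synth, "blocked"))  # item already carried
        else:
            carried_by[item] = synth
            outcomes.append((synth, "ok"))
    return outcomes
```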

LLM Integration

Single-Call Architecture

Each synth gets one LLM call per microcycle. The system prompt IS the synth's current mind:

System prompt = identity + body + inner state + beliefs + projects + mental models + memories + instructions
User message  = structured observations + action surface + previous outcomes
Response      = InnerMonologue + belief updates + mood + intents + memories + theory of mind

No conversation history is maintained. State is carried in the system prompt via beliefs, memories, and theory of mind. This keeps token costs bounded.

Structured Output

The LLM returns a SynthCognitiveResponse via OpenAI's .parse() with a pydantic response model. All fields are typed and validated. The InnerMonologue (react -> interpret -> deliberate -> resolve) forces genuine cognitive processing before action selection.
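The shape of this is roughly as follows. These models are illustrative only: the real SynthCognitiveResponse in schemas.py carries more fields (belief updates, mood, memories, theory of mind), and the commented-out call is shown for context:

```python
from pydantic import BaseModel

class InnerMonologue(BaseModel):
    react: str
    interpret: str
    deliberate: str
    resolve: str

class SynthCognitiveResponse(BaseModel):
    monologue: InnerMonologue
    mood: str
    intents: list[str]

# In live mode the call looks roughly like (requires an OpenAI client):
# completion = client.beta.chat.completions.parse(
#     model=model,
#     messages=messages,
#     response_format=SynthCognitiveResponse,
# )
# response = completion.choices[0].message.parsed
```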

Live vs Replay Modes

In live mode, the world's random_seed is passed to OpenAI calls for approximate reproducibility. True determinism requires replay mode.

Containment Model

Containment is first-class in the canonical world model:

Opening a container sets open=True on the container component (canonical state change). The action surface respects this: an already-open container does not re-offer the "open" affordance.
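The gating rule can be sketched in a few lines; the dict-based container shape is illustrative:

```python
def container_affordances(container: dict) -> list[str]:
    """Offer 'open' only while the container is closed."""
    return [] if container["open"] else ["open"]

def open_container(container: dict) -> dict:
    """Canonical state change: set open=True on the container component."""
    return {**container, "open": True}
```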

Theory of Mind

Each synth carries thin mental models of other known synths:

MentalModel:
  name, believed_location, believed_mood, believed_goals,
  believed_knowledge, believed_beliefs_about_me, confidence

The recursive layer is believed_beliefs_about_me -- what I think they think about me. This is populated from the LLM's TheoryOfMindEntry response and fed back into the next cycle's system prompt. Depth is capped at 1.
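A sketch of the depth-1 structure and its update step; apply_tom_entry stands in for the logic in theory_of_mind.py, and the entry dict shape is an assumption:

```python
from dataclasses import dataclass, field

@dataclass
class MentalModel:
    name: str
    believed_location: str = ""
    believed_mood: str = ""
    believed_goals: list[str] = field(default_factory=list)
    believed_knowledge: list[str] = field(default_factory=list)
    believed_beliefs_about_me: list[str] = field(default_factory=list)  # depth capped at 1
    confidence: float = 0.5

def apply_tom_entry(models: dict, entry: dict) -> MentalModel:
    """Fold one TheoryOfMindEntry-like dict into the synth's mental models."""
    model = models.setdefault(entry["name"], MentalModel(name=entry["name"]))
    model.believed_mood = entry.get("believed_mood", model.believed_mood)
    model.believed_beliefs_about_me = entry.get(
        "believed_beliefs_about_me", model.believed_beliefs_about_me)
    return model
```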

Snapshot and Persistence

Module Map

Models (src/singing_bird/models/)

File           What it defines
enums.py       All enumerations used across the system
components.py  Component/affordance ontology and query utilities
world.py       World, LocationNode, EntityNode, WorldProcess, LatentFact, Environment
synth.py       Synth, Identity, Belief, MentalModel, Phenomenology, Affect, Project, Memory
director.py    Director configuration and runtime state
payloads.py    SensoryBundle, ActionSurface, ActuationIntent, ActionOutcome, CommunicationEvent
ruleset.py     Ruleset with ComponentSchema registry; PHYSICAL_SOCIAL_V1 definition

Engine (src/singing_bird/engine/)

File                  What it does
director_runtime.py   The cycle orchestrator: rollback, microcycles, commit
clock.py              Simulation clock with configurable chronon and turn advance
perception.py         Channel-filtered observation assembly + action surface generation
perception_policy.py  PerceptionPolicy hook + LowLightAmbiguityPolicy example
communication.py      Speech resolution: communicate intent -> CommunicationEvent
resolver.py           Symbolic intent resolution against canonical state
processes.py          Statechart-driven world process evaluation
statechart.py         Generic hierarchical state machine interpreter
invariants.py         Scenario-agnostic invariant checks + containment consistency

LLM (src/singing_bird/llm/)

File        What it does
client.py   Async OpenAI client with live/replay modes and call logging
schemas.py  Pydantic response models for structured LLM output
prompts.py  System prompt builder and templates

Synth (src/singing_bird/synth/)

File               What it does
agent.py           Cognitive cycle: build prompt, call LLM, apply state updates, convert intents
theory_of_mind.py  Mental model management from LLM responses

Content (src/singing_bird/content/)

File       What it does
loader.py  YAML scenario loader with stable ID compilation and ruleset validation

Persistence (src/singing_bird/persistence/)

File          What it does
snapshot.py   Save/load/restore full system state as JSON
audit.py      JSONL event log + token/cost summary
migration.py  Schema migration registry and chain

API Control Plane (src/singing_bird/api/)

File                      What it does
app.py                    FastAPI application with all routes under /v1/
session_manager.py        Session lifecycle: create, list, close, snapshot restore
refs.py                   Stable ref resolution (name → UUID lookup)
auth.py                   Bearer token auth with read/operate/admin scopes
diff.py                   Change computation: what changed since a cursor or snapshot
streaming.py              SSE event stream generator
services/observation.py   Overview, object detail, events, cycle reports
services/intervention.py  Typed admin patches with dry_run/commit/commit_and_step

API Control Plane

The API layer sits above the kernel and exposes it as an HTTP/JSON service. It does not reimplement simulation logic.

Three Execution Lanes

Query Lane (GET)            Turn Lane (POST /turns)       Admin Patch Lane (POST /patches)
  |                         |                              |
  | read-only               | director-mediated            | working copy
  | no mutation             | payload → stimulus           | invariant validation
  | no cycle                | triggers cycle               | commit or discard
  |                         | full rollback semantics      | optional step-after
  v                         v                              v
  JSON response             CycleResult                    PatchResult

Session Model

One session = one simulation lineage. Each session owns a scenario, a DirectorRuntime, a snapshot chain, and an audit trail. Sessions start paused. Snapshot restore provides undo.

Stimulus Queue

External stimuli are queued on the session and delivered through compose_sensory_bundle on the next step or turn. They enter through the same mediated perception path as all other observations — the API never writes directly into synth state.
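The queue-then-drain shape can be sketched as below, with drain_into standing in for the hand-off to compose_sensory_bundle; the class and dict shapes are illustrative:

```python
from collections import deque

class StimulusQueue:
    """Queued external stimuli, drained into perception on the next step."""
    def __init__(self):
        self._q = deque()

    def enqueue(self, stimulus: dict) -> None:
        self._q.append(stimulus)

    def drain_into(self, observations: list) -> list:
        # Stimuli join ordinary observations; they never bypass perception.
        while self._q:
            observations.append(self._q.popleft())
        return observations
```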

Auth and Scopes

Auth is enabled by setting the SINGING_BIRD_API_TOKEN environment variable; when it is unset, the API runs with open access.