Architecture Guide
Three-Layer Architecture
Singing Bird separates concerns into three strict layers:
+--------------------------------------------------+
|  Content Pack (YAML)                             |
|  Locations, entities, synths, processes,         |
|  latent facts, initial conditions                |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|  Ruleset (physical_social_v1)                    |
|  Component schemas, affordance definitions,      |
|  validation rules                                |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|  Kernel                                          |
|  Scheduling, transactions, event propagation,    |
|  perception delivery, snapshots, replay          |
+--------------------------------------------------+
The kernel does not know what any scenario entity means. It knows locations, adjacency, entities with typed components, sensory channels, and world processes. The ruleset defines what component types are valid. The content pack defines the specific world.
A content pack names its ruleset via ruleset_ref. The loader validates all components against the ruleset schemas at load time. Unknown component types generate warnings; missing required properties generate errors.
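The load-time check can be sketched as follows. This is a minimal illustration, not the loader's actual API: the schema shape and the names `RULESET_SCHEMAS` and `validate_components` are assumptions.

```python
import warnings

# Assumed schema shape: component type -> {"required": [property names]}.
# Real schemas live in the ruleset; these two entries are illustrative.
RULESET_SCHEMAS = {
    "LightSource": {"required": ["lit", "intensity"]},
    "Repairable": {"required": ["condition"]},
}

def validate_components(entity_name, components, schemas=RULESET_SCHEMAS):
    """Unknown component types warn; missing required properties are errors."""
    errors = []
    for ctype, props in components.items():
        schema = schemas.get(ctype)
        if schema is None:
            warnings.warn(f"{entity_name}: unknown component type {ctype!r}")
            continue
        for req in schema["required"]:
            if req not in props:
                errors.append(f"{entity_name}.{ctype}: missing required {req!r}")
    return errors
```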
The Director Cycle
Each cycle follows the normative execution sequence from the spec:
1. Deep-copy canonical state into working copies
2. Advance simulation clock on working copy
3. Evaluate due world processes (statechart transitions)
4. Begin fixed-point microcycle loop:
a. Compose sensory bundles (channel-filtered + action surface)
b. Run synth cognitive cycles concurrently (LLM calls)
c. Collect intents from synths
d. Resolve communication intents -> CommunicationEvents
e. Resolve other intents sequentially (priority order)
f. Check for new events (communication, world changes)
g. If new events: re-compose affected bundles, continue loop
h. If quiescent or budget exhausted: exit loop
5. Validate invariants against working state
6. On success: promote working copies to canonical, commit snapshot
7. On failure: discard working copies (canonical state unchanged)
Transactional rollback is real: the runtime deep-copies world, director, synths, clock, and pending outcomes before any mutation. If invariant validation fails, the working copies are discarded. The canonical state on self.world, self.director, self.synths, self.clock, and self._pending_outcomes is never touched until commit.
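The rollback discipline reduces to a small pattern. A minimal sketch, assuming the attribute names from the prose; `execute` and `validate` are hypothetical callables standing in for the cycle body and invariant checks:

```python
import copy

def run_cycle_transactionally(runtime, execute, validate):
    """Deep-copy canonical state, mutate only the working copies, and
    promote them on success; on failure the canonical attributes are
    never touched."""
    working = {name: copy.deepcopy(getattr(runtime, name))
               for name in ("world", "director", "synths", "clock")}
    execute(working)                    # all mutation hits the working copies
    if not validate(working):           # invariant check against working state
        return False                    # discard: canonical state unchanged
    for name, value in working.items():
        setattr(runtime, name, value)   # commit: promote to canonical
    return True
```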
Fixed-point microcycles allow intra-cycle propagation. If Thomas speaks to Margaret in microcycle 1, she hears it and can respond in microcycle 2 of the same director cycle. The loop continues until no new events are generated (quiescent) or the event budget is exhausted.
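The fixed-point loop can be sketched like this; the callable names are illustrative placeholders for the real bundle composition, cognitive cycle, and resolver stages:

```python
def run_microcycles(synths, compose_bundle, run_synth, resolve, budget=8):
    """Loop until quiescence (no new events) or the event budget runs out.
    Events from microcycle N are folded into the bundles for microcycle N+1,
    so speech in microcycle 1 can be answered in microcycle 2."""
    events = []
    for _ in range(budget):
        bundles = {s: compose_bundle(s, events) for s in synths}
        intents = [run_synth(s, bundles[s]) for s in synths]
        new_events = resolve(intents)        # priority-ordered resolution
        if not new_events:                   # fixed point reached: quiescent
            break
        events.extend(new_events)            # re-compose affected bundles
    return events
```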
Perception Pipeline
Perception enforces epistemic separation (SYS-03): a synth never reads canonical state directly.
Stage 1: Structured Observations
The perception engine scans the synth's canonical location and adjacent locations, filtered by the synth's sensory capabilities:
- Visual: entities, light sources, ambient light, celestial state (only if synth has visual channel)
- Auditory: sound sources, speech events, weather sounds (only if synth has auditory channel)
- Tactile: temperature, environmental touch (only if synth has tactile channel)
- Proprioceptive: self-location awareness (always available)
A blind synth (no visual channel) receives zero visual observations. This is enforced at the perception layer, not by prompt tricks.
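A minimal sketch of the channel gate, under assumed data shapes (the location dict keys here are illustrative, not the engine's real model):

```python
def compose_observations(channels, location):
    """Emit observations only on channels the synth actually has; a synth
    without a visual channel gets zero visual observations."""
    obs = [("proprioceptive", location["name"])]        # always available
    if "visual" in channels:
        obs += [("visual", e) for e in location.get("visible", [])]
    if "auditory" in channels:
        obs += [("auditory", s) for s in location.get("sounds", [])]
    if "tactile" in channels:
        obs.append(("tactile", location.get("temperature")))
    return obs
```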
Stage 2: Action Surface
The director builds a grounded action menu for each synth each cycle:
Available actions:
- move_to_<uuid>: Move to Ground Floor
- inspect_<uuid>: Examine Beacon Lamp closely
- repair_<uuid>: Attempt to repair Beacon Lamp (jammed)
- switch_<uuid>: Switch Flashlight on
- take_<uuid>: Pick up Wrench
- open_<uuid>: Open Tool Chest
- wait: Do nothing this turn
- communicate: Speak (Margaret, James can hear you)
Every action has a stable symbolic action_id. Every entity ref is a UUID. The synth LLM selects from this menu by ID. The resolver looks up refs directly -- no substring matching, no English parsing.
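A sketch of how such a grounded menu might be built. The entity dict layout and component names here are simplified assumptions; the point is that every offered action carries a stable id over an entity ref, so the resolver never parses English:

```python
def build_action_surface(world, location_id):
    """Offer only grounded affordances; every ref is a stable entity id."""
    actions = [("wait", "Do nothing this turn")]
    for eid, ent in world["entities"].items():
        if ent["location"] != location_id:
            continue
        comps = ent["components"]
        if "Portable" in comps and ent.get("carried_by") is None:
            actions.append((f"take_{eid}", f"Pick up {ent['name']}"))
        if "Switchable" in comps:
            state = "off" if comps["Switchable"]["on"] else "on"
            actions.append((f"switch_{eid}", f"Switch {ent['name']} {state}"))
    return actions
```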
Stage 3: Perception Policy (Optional)
A PerceptionPolicy hook can modify observations before delivery. The default is a no-op pass-through. An example LowLightAmbiguityPolicy reduces visual salience in dim conditions.
Intent Resolution
The resolver is component-based. It dispatches on action_id prefix:
- repair_<ref> -> check Repairable component, enforce tool requirements, apply cascade to linked LightSource/Switchable
- switch_<ref> -> toggle Switchable component, cascade to linked LightSource
- take_<ref> -> check Portable, check not already carried, set canonical carrying relationship
- consume_<ref> -> decrement Consumable uses, apply effect
- open_<ref> -> set container open=True (canonical state change), reveal contents
- move_to_<ref> -> check adjacency, check constraints, move entity + carried items
Tool-gated repair: if a Repairable component specifies repair_requirements.tools, the resolver checks whether the synth's body entity carries entities with matching Tool components. Missing tools result in blocked outcome.
Conflict resolution: intents are resolved sequentially in priority order. If two synths try to take the same item, the first succeeds and the second gets blocked (item already carried).
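The take conflict can be shown in a few lines. A simplified sketch assuming a flat entity dict; intents are assumed to arrive already sorted by priority:

```python
def resolve_take_intents(intents, world):
    """Resolve one intent at a time: the first take of an item succeeds
    and sets the canonical carrying relationship; later takes are blocked."""
    outcomes = {}
    for synth_id, action_id in intents:
        _, _, ref = action_id.partition("_")
        ent = world["entities"][ref]
        if ent.get("carried_by") is None:
            ent["carried_by"] = synth_id        # canonical carrying relationship
            outcomes[synth_id] = "success"
        else:
            outcomes[synth_id] = "blocked: already carried"
    return outcomes
```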
LLM Integration
Single-Call Architecture
Each synth gets one LLM call per microcycle. The system prompt IS the synth's current mind:
System prompt = identity + body + inner state + beliefs + projects + mental models + memories + instructions
User message = structured observations + action surface + previous outcomes
Response = InnerMonologue + belief updates + mood + intents + memories + theory of mind
No conversation history is maintained. State is carried in the system prompt via beliefs, memories, and theory of mind. This keeps token costs bounded.
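The assembly can be sketched as a pure function of synth state. Section names and the synth dict shape are illustrative assumptions, not the real prompt builder:

```python
def build_system_prompt(synth):
    """Re-serialize the synth's whole mind into the system prompt each call;
    no chat history is kept, so token cost stays bounded."""
    sections = [
        ("IDENTITY", synth["identity"]),
        ("INNER STATE", synth["mood"]),
        ("BELIEFS", "; ".join(synth["beliefs"])),
        ("MEMORIES", "; ".join(synth["memories"][-5:])),  # bounded window
    ]
    return "\n\n".join(f"## {header}\n{body}" for header, body in sections)
```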
Structured Output
The LLM returns a SynthCognitiveResponse via OpenAI's .parse() with a pydantic response model. All fields are typed and validated. The InnerMonologue (react -> interpret -> deliberate -> resolve) forces genuine cognitive processing before action selection.
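The shape of such a response model might look like the following. The field names here are inferred from the prose and are not the project's actual schema (the real model also carries belief updates, memories, and theory-of-mind entries):

```python
from pydantic import BaseModel

class InnerMonologue(BaseModel):
    react: str        # immediate reaction to observations
    interpret: str    # what the observations mean
    deliberate: str   # weigh options
    resolve: str      # commit to a course of action

class Intent(BaseModel):
    action_id: str    # selected from the action surface by id
    rationale: str

class SynthCognitiveResponse(BaseModel):
    inner_monologue: InnerMonologue
    mood: str
    intents: list[Intent]
```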
Live vs Replay Modes
- Live mode: calls OpenAI, logs every prompt + response to audit/llm_calls.jsonl
- Replay mode: reads logged responses, no API calls, deterministic
In live mode, the world's random_seed is passed to OpenAI calls for approximate reproducibility. True determinism requires replay mode.
Containment Model
Containment is first-class in the canonical world model:
- Location contains entity: LocationNode.contained_entity_ids
- Entity contains entity: EntityNode.contained_entity_ids
- Synth carries entity: EntityNode.carried_by_synth_id + carrier's contained_entity_ids
- Carried items move with carrier: move resolution updates carried entity locations automatically
Opening a container sets open=True on the container component (canonical state change). The action surface respects this: an already-open container does not re-offer the "open" affordance.
Theory of Mind
Each synth carries thin mental models of other known synths:
MentalModel:
name, believed_location, believed_mood, believed_goals,
believed_knowledge, believed_beliefs_about_me, confidence
The recursive layer is believed_beliefs_about_me -- what I think they think about me. This is populated from the LLM's TheoryOfMindEntry response and fed back into the next cycle's system prompt. Depth is capped at 1.
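A minimal sketch of the mental-model merge, with the field names from the structure above; the helper `apply_tom_entry` and the dict-based entry format are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class MentalModel:
    name: str
    believed_location: str = "unknown"
    believed_mood: str = "unknown"
    believed_goals: list = field(default_factory=list)
    believed_beliefs_about_me: list = field(default_factory=list)  # depth capped at 1
    confidence: float = 0.5

def apply_tom_entry(models, entry):
    """Merge one theory-of-mind entry (a dict from the LLM response) into
    the synth's mental models, creating a model on first contact."""
    model = models.setdefault(entry["name"], MentalModel(entry["name"]))
    for key, value in entry.items():
        if key != "name" and hasattr(model, key):
            setattr(model, key, value)
    return model
```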
Snapshot and Persistence
- Snapshots: full system state (World + Director + Synths) serialized as JSON, atomic write via temp-file-then-rename
- Audit log: JSONL append-only event trail (cycle events, synth responses, intent resolutions, communications)
- LLM call log: JSONL with full prompt, model, seed, temperature, response for every LLM call
- Schema migration: migration registry chains versioned transforms (0.1.0 -> 0.2.0 -> 0.3.0); snapshot loader auto-migrates
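The temp-file-then-rename pattern for atomic snapshot writes can be sketched as follows (function name is illustrative; the mechanism is the standard one):

```python
import json
import os
import tempfile

def write_snapshot_atomic(path, state):
    """Write JSON to a temp file in the same directory, fsync, then rename
    over the target; readers see either the old snapshot or the new one,
    never a half-written file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)   # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

Keeping the temp file in the same directory matters: `os.replace` is only atomic within a single filesystem.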
Module Map
Models (src/singing_bird/models/)
| File | What it defines |
|---|---|
| enums.py | All enumerations used across the system |
| components.py | Component/affordance ontology and query utilities |
| world.py | World, LocationNode, EntityNode, WorldProcess, LatentFact, Environment |
| synth.py | Synth, Identity, Belief, MentalModel, Phenomenology, Affect, Project, Memory |
| director.py | Director configuration and runtime state |
| payloads.py | SensoryBundle, ActionSurface, ActuationIntent, ActionOutcome, CommunicationEvent |
| ruleset.py | Ruleset with ComponentSchema registry; PHYSICAL_SOCIAL_V1 definition |
Engine (src/singing_bird/engine/)
| File | What it does |
|---|---|
| director_runtime.py | The cycle orchestrator: rollback, microcycles, commit |
| clock.py | Simulation clock with configurable chronon and turn advance |
| perception.py | Channel-filtered observation assembly + action surface generation |
| perception_policy.py | PerceptionPolicy hook + LowLightAmbiguityPolicy example |
| communication.py | Speech resolution: communicate intent -> CommunicationEvent |
| resolver.py | Symbolic intent resolution against canonical state |
| processes.py | Statechart-driven world process evaluation |
| statechart.py | Generic hierarchical state machine interpreter |
| invariants.py | Scenario-agnostic invariant checks + containment consistency |
LLM (src/singing_bird/llm/)
| File | What it does |
|---|---|
| client.py | Async OpenAI client with live/replay modes and call logging |
| schemas.py | Pydantic response models for structured LLM output |
| prompts.py | System prompt builder and templates |
Synth (src/singing_bird/synth/)
| File | What it does |
|---|---|
| agent.py | Cognitive cycle: build prompt, call LLM, apply state updates, convert intents |
| theory_of_mind.py | Mental model management from LLM responses |
Content (src/singing_bird/content/)
| File | What it does |
|---|---|
| loader.py | YAML scenario loader with stable ID compilation and ruleset validation |
Persistence (src/singing_bird/persistence/)
| File | What it does |
|---|---|
| snapshot.py | Save/load/restore full system state as JSON |
| audit.py | JSONL event log + token/cost summary |
| migration.py | Schema migration registry and chain |
API Control Plane (src/singing_bird/api/)
| File | What it does |
|---|---|
| app.py | FastAPI application with all routes under /v1/ |
| session_manager.py | Session lifecycle: create, list, close, snapshot restore |
| refs.py | Stable ref resolution (name → UUID lookup) |
| auth.py | Bearer token auth with read/operate/admin scopes |
| diff.py | Change computation: what changed since a cursor or snapshot |
| streaming.py | SSE event stream generator |
| services/observation.py | Overview, object detail, events, cycle reports |
| services/intervention.py | Typed admin patches with dry_run/commit/commit_and_step |
API Control Plane
The API layer sits above the kernel and exposes it as an HTTP/JSON service. It does not reimplement simulation logic.
Three Execution Lanes
Query Lane (GET)        Turn Lane (POST /turns)        Admin Patch Lane (POST /patches)
      |                           |                                 |
  read-only               director-mediated                   working copy
  no mutation             payload → stimulus                  invariant validation
  no cycle                triggers cycle                      commit or discard
                          full rollback semantics             optional step-after
      v                           v                                 v
 JSON response              CycleResult                        PatchResult
- Query: overview, resolve, object detail, events, changes, reports, SSE stream
- Turn: external input converted to mediated stimulus, delivered through perception
- Admin patch: typed operations (move_entity, create_entity, set_component_property, add_belief, etc.) applied to working copy, validated, committed as snapshot
Session Model
One session = one simulation lineage. Each session owns a scenario, a DirectorRuntime, a snapshot chain, and an audit trail. Sessions start paused. Snapshot restore provides undo.
Stimulus Queue
External stimuli are queued on the session and delivered through compose_sensory_bundle on the next step or turn. They enter through the same mediated perception path as all other observations — the API never writes directly into synth state.
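The queue itself is a small structure. A sketch under assumed names (`StimulusQueue`, `drain_for` are illustrative, not the session manager's real API):

```python
class StimulusQueue:
    """Queue external stimuli per synth; drained into sensory-bundle
    composition on the next step, never written into synth state directly."""

    def __init__(self):
        self._pending = []

    def push(self, synth_id, stimulus):
        self._pending.append((synth_id, stimulus))

    def drain_for(self, synth_id):
        """Return and remove everything queued for one synth."""
        mine = [st for sid, st in self._pending if sid == synth_id]
        self._pending = [(sid, st) for sid, st in self._pending if sid != synth_id]
        return mine
```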
Auth and Scopes
- read: query lane only
- operate: query + control (step, pause, turns)
- admin: full access including patches, stimuli, inner monologue
Auth is enabled by setting the SINGING_BIRD_API_TOKEN environment variable; when it is unset, access is open.
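A sketch of the scope check, assuming the three scopes form a strict hierarchy (admin covers operate, which covers read); that ordering and the function name are assumptions, not the documented auth contract:

```python
SCOPES = {"read": 0, "operate": 1, "admin": 2}

def check_scope(token_scope, required, auth_enabled=True):
    """Assumed ordering: admin covers operate, which covers read.
    With auth disabled, every request passes."""
    if not auth_enabled:
        return True
    return SCOPES[token_scope] >= SCOPES[required]
```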