Architecture Guide
Three-Layer Architecture
Singing Bird separates concerns into three strict layers:
+--------------------------------------------------+
|  Content Pack (YAML)                             |
|  Locations, entities, synths, processes,         |
|  latent facts, initial conditions                |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|  Ruleset (physical_social_v1)                    |
|  Component schemas, affordance definitions,      |
|  validation rules                                |
+--------------------------------------------------+
                         |
                         v
+--------------------------------------------------+
|  Kernel                                          |
|  Scheduling, transactions, event propagation,    |
|  perception delivery, snapshots, replay          |
+--------------------------------------------------+
The kernel does not know what any scenario entity means. It knows locations, adjacency, entities with typed components, sensory channels, and world processes. The ruleset defines what component types are valid. The content pack defines the specific world.
A content pack names its ruleset via ruleset_ref. The loader validates all components against the ruleset schemas at load time. Unknown component types generate warnings; missing required properties generate errors.
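The load-time check can be sketched as follows. This is a minimal illustration, not the loader's actual API: the schema shape and the names `RULESET_SCHEMAS` and `validate_components` are assumptions.

```python
import warnings

# Assumed schema shape: component type -> {"required": [property names]}.
# Real schemas live in the ruleset; these two entries are illustrative.
RULESET_SCHEMAS = {
    "LightSource": {"required": ["lit", "intensity"]},
    "Repairable": {"required": ["condition"]},
}

def validate_components(entity_name, components, schemas=RULESET_SCHEMAS):
    """Unknown component types warn; missing required properties are errors."""
    errors = []
    for ctype, props in components.items():
        schema = schemas.get(ctype)
        if schema is None:
            warnings.warn(f"{entity_name}: unknown component type {ctype!r}")
            continue
        for req in schema["required"]:
            if req not in props:
                errors.append(f"{entity_name}.{ctype}: missing required {req!r}")
    return errors
```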
The Director Cycle
Each cycle follows the normative execution sequence from the spec:
1. Deep-copy canonical state into working copies
2. Advance simulation clock on working copy
3. Evaluate due world processes (statechart transitions)
4. Begin fixed-point microcycle loop:
a. Compose sensory bundles (channel-filtered + action surface)
b. Run synth cognitive cycles concurrently (LLM calls)
c. Collect intents from synths
d. Resolve communication intents -> CommunicationEvents
e. Resolve other intents sequentially (priority order)
f. Check for new events (communication, world changes)
g. If new events: re-compose affected bundles, continue loop
h. If quiescent or budget exhausted: exit loop
5. Validate invariants against working state
6. On success: promote working copies to canonical, commit snapshot
7. On failure: discard working copies (canonical state unchanged)
Transactional rollback is real: the runtime deep-copies world, director, synths, clock, and pending outcomes before any mutation. If invariant validation fails, the working copies are discarded. The canonical state on self.world, self.director, self.synths, self.clock, and self._pending_outcomes is never touched until commit.
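The rollback discipline reduces to a small pattern. A minimal sketch, assuming the attribute names from the prose; `execute` and `validate` are hypothetical callables standing in for the cycle body and invariant checks:

```python
import copy

def run_cycle_transactionally(runtime, execute, validate):
    """Deep-copy canonical state, mutate only the working copies, and
    promote them on success; on failure the canonical attributes are
    never touched."""
    working = {name: copy.deepcopy(getattr(runtime, name))
               for name in ("world", "director", "synths", "clock")}
    execute(working)                    # all mutation hits the working copies
    if not validate(working):           # invariant check against working state
        return False                    # discard: canonical state unchanged
    for name, value in working.items():
        setattr(runtime, name, value)   # commit: promote to canonical
    return True
```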
Fixed-point microcycles allow intra-cycle propagation. If Thomas speaks to Margaret in microcycle 1, she hears it and can respond in microcycle 2 of the same director cycle. The loop continues until no new events are generated (quiescent) or the event budget is exhausted.
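The fixed-point loop can be sketched like this; the callable names are illustrative placeholders for the real bundle composition, cognitive cycle, and resolver stages:

```python
def run_microcycles(synths, compose_bundle, run_synth, resolve, budget=8):
    """Loop until quiescence (no new events) or the event budget runs out.
    Events from microcycle N are folded into the bundles for microcycle N+1,
    so speech in microcycle 1 can be answered in microcycle 2."""
    events = []
    for _ in range(budget):
        bundles = {s: compose_bundle(s, events) for s in synths}
        intents = [run_synth(s, bundles[s]) for s in synths]
        new_events = resolve(intents)        # priority-ordered resolution
        if not new_events:                   # fixed point reached: quiescent
            break
        events.extend(new_events)            # re-compose affected bundles
    return events
```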
Perception Pipeline
Perception enforces epistemic separation (SYS-03): a synth never reads canonical state directly.
Stage 1: Structured Observations
The perception engine scans the synth's canonical location and adjacent locations, filtered by the synth's sensory capabilities:
- Visual: entities, light sources, ambient light, celestial state (only if synth has visual channel)
- Auditory: sound sources, speech events, weather sounds (only if synth has auditory channel)
- Tactile: temperature, environmental touch (only if synth has tactile channel)
- Proprioceptive: self-location awareness (always available)
A blind synth (no visual channel) receives zero visual observations. This is enforced at the perception layer, not by prompt tricks.
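A minimal sketch of the channel gate, under assumed data shapes (the location dict keys here are illustrative, not the engine's real model):

```python
def compose_observations(channels, location):
    """Emit observations only on channels the synth actually has; a synth
    without a visual channel gets zero visual observations."""
    obs = [("proprioceptive", location["name"])]        # always available
    if "visual" in channels:
        obs += [("visual", e) for e in location.get("visible", [])]
    if "auditory" in channels:
        obs += [("auditory", s) for s in location.get("sounds", [])]
    if "tactile" in channels:
        obs.append(("tactile", location.get("temperature")))
    return obs
```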
Stage 2: Action Surface
The director builds a grounded action menu for each synth each cycle:
Available actions:
- move_to_<uuid>: Move to Ground Floor
- inspect_<uuid>: Examine Beacon Lamp closely
- repair_<uuid>: Attempt to repair Beacon Lamp (jammed)
- switch_<uuid>: Switch Flashlight on
- take_<uuid>: Pick up Wrench
- open_<uuid>: Open Tool Chest
- wait: Do nothing this turn
- communicate: Speak (Margaret, James can hear you)
Every action has a stable symbolic action_id. Every entity ref is a UUID. The synth LLM selects from this menu by ID. The resolver looks up refs directly -- no substring matching, no English parsing.
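A sketch of how such a grounded menu might be built. The entity dict layout and component names here are simplified assumptions; the point is that every offered action carries a stable id over an entity ref, so the resolver never parses English:

```python
def build_action_surface(world, location_id):
    """Offer only grounded affordances; every ref is a stable entity id."""
    actions = [("wait", "Do nothing this turn")]
    for eid, ent in world["entities"].items():
        if ent["location"] != location_id:
            continue
        comps = ent["components"]
        if "Portable" in comps and ent.get("carried_by") is None:
            actions.append((f"take_{eid}", f"Pick up {ent['name']}"))
        if "Switchable" in comps:
            state = "off" if comps["Switchable"]["on"] else "on"
            actions.append((f"switch_{eid}", f"Switch {ent['name']} {state}"))
    return actions
```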
Stage 3: Perception Policy (Optional)
A PerceptionPolicy hook can modify observations before delivery. The default is a no-op pass-through. An example LowLightAmbiguityPolicy reduces visual salience in dim conditions.
Intent Resolution
The resolver is component-based. It dispatches on action_id prefix:
- repair_<ref> -> check Repairable component, enforce tool requirements, apply cascade to linked LightSource/Switchable
- switch_<ref> -> toggle Switchable component, cascade to linked LightSource
- take_<ref> -> check Portable, check not already carried, set canonical carrying relationship
- consume_<ref> -> decrement Consumable uses, apply effect
- open_<ref> -> set container open=True (canonical state change), reveal contents
- move_to_<ref> -> check adjacency, check constraints, move entity + carried items
Tool-gated repair: if a Repairable component specifies repair_requirements.tools, the resolver checks whether the synth's body entity carries entities with matching Tool components. Missing tools result in blocked outcome.
Conflict resolution: intents are resolved sequentially in priority order. If two synths try to take the same item, the first succeeds and the second gets blocked (item already carried).
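The take conflict can be shown in a few lines. A simplified sketch assuming a flat entity dict; intents are assumed to arrive already sorted by priority:

```python
def resolve_take_intents(intents, world):
    """Resolve one intent at a time: the first take of an item succeeds
    and sets the canonical carrying relationship; later takes are blocked."""
    outcomes = {}
    for synth_id, action_id in intents:
        _, _, ref = action_id.partition("_")
        ent = world["entities"][ref]
        if ent.get("carried_by") is None:
            ent["carried_by"] = synth_id        # canonical carrying relationship
            outcomes[synth_id] = "success"
        else:
            outcomes[synth_id] = "blocked: already carried"
    return outcomes
```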
LLM Integration
Single-Call Architecture
Each synth gets one LLM call per microcycle. The system prompt IS the synth's current mind:
System prompt = identity + body + inner state + beliefs + projects + mental models + memories + instructions
User message = structured observations + action surface + previous outcomes
Response = InnerMonologue + belief updates + mood + intents + memories + theory of mind
No conversation history is maintained. State is carried in the system prompt via beliefs, memories, and theory of mind. This keeps token costs bounded.
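The assembly can be sketched as a pure function of synth state. Section names and the synth dict shape are illustrative assumptions, not the real prompt builder:

```python
def build_system_prompt(synth):
    """Re-serialize the synth's whole mind into the system prompt each call;
    no chat history is kept, so token cost stays bounded."""
    sections = [
        ("IDENTITY", synth["identity"]),
        ("INNER STATE", synth["mood"]),
        ("BELIEFS", "; ".join(synth["beliefs"])),
        ("MEMORIES", "; ".join(synth["memories"][-5:])),  # bounded window
    ]
    return "\n\n".join(f"## {header}\n{body}" for header, body in sections)
```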
Structured Output
The LLM returns a SynthCognitiveResponse via OpenAI's .parse() with a pydantic response model. All fields are typed and validated. The InnerMonologue (react -> interpret -> deliberate -> resolve) forces genuine cognitive processing before action selection.
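The shape of such a response model might look like the following. The field names here are inferred from the prose and are not the project's actual schema (the real model also carries belief updates, memories, and theory-of-mind entries):

```python
from pydantic import BaseModel

class InnerMonologue(BaseModel):
    react: str        # immediate reaction to observations
    interpret: str    # what the observations mean
    deliberate: str   # weigh options
    resolve: str      # commit to a course of action

class Intent(BaseModel):
    action_id: str    # selected from the action surface by id
    rationale: str

class SynthCognitiveResponse(BaseModel):
    inner_monologue: InnerMonologue
    mood: str
    intents: list[Intent]
```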
Live vs Replay Modes
- Live mode: calls OpenAI, logs every prompt + response to audit/llm_calls.jsonl
- Replay mode: reads logged responses, no API calls, deterministic
In live mode, the world's random_seed is passed to OpenAI calls for approximate reproducibility. True determinism requires replay mode.
Containment Model
Containment is first-class in the canonical world model:
- Location contains entity: LocationNode.contained_entity_ids
- Entity contains entity: EntityNode.contained_entity_ids
- Synth carries entity: EntityNode.carried_by_synth_id + carrier's contained_entity_ids
- Carried items move with carrier: move resolution updates carried entity locations automatically
Opening a container sets open=True on the container component (canonical state change). The action surface respects this: an already-open container does not re-offer the "open" affordance.
Theory of Mind
Each synth carries thin mental models of other known synths:
MentalModel:
name, believed_location, believed_mood, believed_goals,
believed_knowledge, believed_beliefs_about_me, confidence
The recursive layer is believed_beliefs_about_me -- what I think they think about me. This is populated from the LLM's TheoryOfMindEntry response and fed back into the next cycle's system prompt. Depth is capped at 1.
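A minimal sketch of the mental-model merge, with the field names from the structure above; the helper `apply_tom_entry` and the dict-based entry format are illustrative assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class MentalModel:
    name: str
    believed_location: str = "unknown"
    believed_mood: str = "unknown"
    believed_goals: list = field(default_factory=list)
    believed_beliefs_about_me: list = field(default_factory=list)  # depth capped at 1
    confidence: float = 0.5

def apply_tom_entry(models, entry):
    """Merge one theory-of-mind entry (a dict from the LLM response) into
    the synth's mental models, creating a model on first contact."""
    model = models.setdefault(entry["name"], MentalModel(entry["name"]))
    for key, value in entry.items():
        if key != "name" and hasattr(model, key):
            setattr(model, key, value)
    return model
```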
Snapshot and Persistence
- Snapshots: full system state (World + Director + Synths) serialized as JSON, atomic write via temp-file-then-rename
- Audit log: JSONL append-only event trail (cycle events, synth responses, intent resolutions, communications)
- LLM call log: JSONL with full prompt, model, seed, temperature, response for every LLM call
- Schema migration: migration registry chains versioned transforms (0.1.0 -> 0.2.0 -> 0.3.0); snapshot loader auto-migrates
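The temp-file-then-rename pattern for atomic snapshot writes can be sketched as follows (function name is illustrative; the mechanism is the standard one):

```python
import json
import os
import tempfile

def write_snapshot_atomic(path, state):
    """Write JSON to a temp file in the same directory, fsync, then rename
    over the target; readers see either the old snapshot or the new one,
    never a half-written file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)   # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise
```

Keeping the temp file in the same directory matters: `os.replace` is only atomic within a single filesystem.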
Module Map
Models (src/singing_bird/models/)
| File | What it defines |
|---|---|
| enums.py | All enumerations used across the system |
| components.py | Component/affordance ontology and query utilities |
| world.py | World, LocationNode, EntityNode, WorldProcess, LatentFact, Environment |
| synth.py | Synth, Identity, Belief, MentalModel, Phenomenology, Affect, Project, Memory |
| director.py | Director configuration and runtime state |
| payloads.py | SensoryBundle, ActionSurface, ActuationIntent, ActionOutcome, CommunicationEvent |
| ruleset.py | Ruleset with ComponentSchema registry; PHYSICAL_SOCIAL_V1 definition |
Engine (src/singing_bird/engine/)
| File | What it does |
|---|---|
| director_runtime.py | The cycle orchestrator: rollback, microcycles, commit |
| clock.py | Simulation clock with configurable chronon and turn advance |
| perception.py | Channel-filtered observation assembly + action surface generation |
| perception_policy.py | PerceptionPolicy hook + LowLightAmbiguityPolicy example |
| communication.py | Speech resolution: communicate intent -> CommunicationEvent |
| resolver.py | Symbolic intent resolution against canonical state |
| processes.py | Statechart-driven world process evaluation |
| statechart.py | Generic hierarchical state machine interpreter |
| invariants.py | Scenario-agnostic invariant checks + containment consistency |
LLM (src/singing_bird/llm/)
| File | What it does |
|---|---|
| client.py | Async OpenAI client with live/replay modes and call logging |
| schemas.py | Pydantic response models for structured LLM output |
| prompts.py | System prompt builder and templates |
Synth (src/singing_bird/synth/)
| File | What it does |
|---|---|
| agent.py | Cognitive cycle: build prompt, call LLM, apply state updates, convert intents |
| theory_of_mind.py | Mental model management from LLM responses |
Content (src/singing_bird/content/)
| File | What it does |
|---|---|
| loader.py | YAML scenario loader with stable ID compilation and ruleset validation |
Persistence (src/singing_bird/persistence/)
| File | What it does |
|---|---|
| snapshot.py | Save/load/restore full system state as JSON |
| audit.py | JSONL event log + token/cost summary |
| migration.py | Schema migration registry and chain |
API Control Plane (src/singing_bird/api/)
| File | What it does |
|---|---|
| app.py | FastAPI application with all routes under /v1/ |
| session_manager.py | Session lifecycle: create, list, close, snapshot restore |
| refs.py | Stable ref resolution (name → UUID lookup) |
| auth.py | Bearer token auth with read/operate/admin scopes |
| diff.py | Change computation: what changed since a cursor or snapshot |
| streaming.py | SSE event stream generator |
| services/observation.py | Overview, object detail, events, cycle reports |
| services/intervention.py | Typed admin patches with dry_run/commit/commit_and_step |
API Control Plane
The API layer sits above the kernel and exposes it as an HTTP/JSON service. It does not reimplement simulation logic.
Three Execution Lanes
Query Lane (GET)        Turn Lane (POST /turns)        Admin Patch Lane (POST /patches)
      |                           |                                 |
  read-only               director-mediated                   working copy
  no mutation             payload → stimulus                  invariant validation
  no cycle                triggers cycle                      commit or discard
                          full rollback semantics             optional step-after
      v                           v                                 v
 JSON response              CycleResult                        PatchResult
- Query: overview, resolve, object detail, events, changes, reports, SSE stream
- Turn: external input converted to mediated stimulus, delivered through perception
- Admin patch: typed operations (move_entity, create_entity, set_component_property, add_belief, etc.) applied to working copy, validated, committed as snapshot
Session Model
One session = one simulation lineage. Each session owns a scenario, a DirectorRuntime, a snapshot chain, and an audit trail. Sessions start paused. Snapshot restore provides undo.
Stimulus Queue
External stimuli are queued on the session and delivered through compose_sensory_bundle on the next step or turn. They enter through the same mediated perception path as all other observations — the API never writes directly into synth state.
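The queue itself is a small structure. A sketch under assumed names (`StimulusQueue`, `drain_for` are illustrative, not the session manager's real API):

```python
class StimulusQueue:
    """Queue external stimuli per synth; drained into sensory-bundle
    composition on the next step, never written into synth state directly."""

    def __init__(self):
        self._pending = []

    def push(self, synth_id, stimulus):
        self._pending.append((synth_id, stimulus))

    def drain_for(self, synth_id):
        """Return and remove everything queued for one synth."""
        mine = [st for sid, st in self._pending if sid == synth_id]
        self._pending = [(sid, st) for sid, st in self._pending if sid != synth_id]
        return mine
```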
Auth and Scopes
- read: query lane only
- operate: query + control (step, pause, turns)
- admin: full access including patches, stimuli, inner monologue
Auth is enabled by setting the SINGING_BIRD_API_TOKEN environment variable; when it is unset, access is open.
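A sketch of the scope check, assuming the three scopes form a strict hierarchy (admin covers operate, which covers read); that ordering and the function name are assumptions, not the documented auth contract:

```python
SCOPES = {"read": 0, "operate": 1, "admin": 2}

def check_scope(token_scope, required, auth_enabled=True):
    """Assumed ordering: admin covers operate, which covers read.
    With auth disabled, every request passes."""
    if not auth_enabled:
        return True
    return SCOPES[token_scope] >= SCOPES[required]
```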