
# Singing Bird

A declarative, scenario-driven, LLM-mediated embodied simulation kernel.

Singing Bird simulates canonical worlds containing synthetic agents (synths) whose subjective experience may differ from reality. A director maintains authoritative world truth. Synths receive filtered sensory input, deliberate via real LLM calls, and emit grounded action intents. The engine resolves those intents against canonical physics, propagates consequences, and commits transactional snapshots.

Scenarios are declarative YAML content packs. The engine does not know what a lighthouse is, or a clinic, or a dock. It knows locations, adjacency, entities with typed components, sensory channels, and world processes. All scenario-specific content lives in the packs, not in engine code.
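As a minimal sketch of that separation (the dict shape and every field name below are invented for illustration, not the actual pack schema), the engine can validate structure without knowing any scenario nouns:

```python
# Hypothetical sketch -- the real pack format is defined by the Scenario
# Authoring Contract; these keys are illustrative only.
pack = {
    "locations": {"lamp_room": {}, "gallery": {}},
    "adjacency": [("lamp_room", "gallery")],
    "entities": {
        "lamp": {"location": "lamp_room",
                 "components": {"repairable": {"broken": True}}},
    },
}

def validate_pack(pack, known_components):
    """Generic check: the engine knows component *types*, never nouns."""
    for name, ent in pack["entities"].items():
        if ent["location"] not in pack["locations"]:
            raise ValueError(f"{name}: unknown location {ent['location']}")
        for comp in ent["components"]:
            if comp not in known_components:
                raise ValueError(f"{name}: unknown component type {comp}")
    return True
```

Nothing in `validate_pack` mentions a lighthouse or a lamp; the word "lamp" exists only in pack data, which is the decoupling the paragraph above describes.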

## Key Concepts

| Concept | What it is |
| --- | --- |
| Director | The sole authority over canonical world truth. Runs the cycle, delivers perceptions, resolves actions, commits snapshots. |
| World | The canonical simulation state: topology, entities, environment, processes. Ground truth. |
| Synth | A subjective agent with identity, beliefs, mood, projects, memories, and theory of mind. Receives sensory bundles, emits intents. Never directly reads canonical state. |
| Content Pack | A YAML file defining a complete scenario: locations, entities, synths, processes, latent facts. Loaded and validated against a ruleset. |
| Ruleset | The ontology layer: defines what component types exist, validates content at load time. Currently ships `physical_social_v1`. |
| Action Surface | A grounded menu of available actions presented to each synth each cycle. Synths select by symbolic ID, not free text. |
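The action-surface idea can be pictured in a few lines. This is a toy sketch with invented names, not the real resolver, which is component-based; the point is that synths pick from a grounded menu and anything off-menu is rejected:

```python
def build_action_surface(synth_location, entities):
    """Ground a menu of symbolic action IDs from what is co-present."""
    menu = []
    for eid, ent in entities.items():
        if ent["location"] != synth_location:
            continue
        if "repairable" in ent.get("components", {}):
            menu.append(f"repair:{eid}")
        menu.append(f"inspect:{eid}")
    return menu

def resolve(intent_id, menu):
    """Reject any intent not on the grounded menu -- no free-text actions."""
    if intent_id not in menu:
        return {"ok": False, "reason": "ungrounded intent"}
    verb, _, target = intent_id.partition(":")
    return {"ok": True, "verb": verb, "target": target}
```

Because selection is by symbolic ID, the LLM can never act on an entity it cannot currently perceive.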

## Quick Start

```bash
# By default, singing_bird calls the Centroid LLM router (llm.benac.dev).
# The router picks whichever backend is currently scaled up (local GGUF or
# OpenAI pipe). Models are addressed by capability alias — "@chat" in the
# shipped scenarios — so consumers don't need to be reconfigured when the
# backend changes.
#
# Override with: export OPENAI_BASE_URL=https://api.openai.com/v1  (direct)
#                export OPENAI_API_KEY=sk-...                     (needed for direct)

# Run the lighthouse scenario (3 cycles)
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/lighthouse.yaml --cycles 3

# Run the workshop blackout (different world, same kernel)
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/workshop_blackout.yaml --cycles 3

# Start the API control plane (observe, step, intervene via HTTP)
PYTHONPATH=src uvicorn singing_bird.api.app:app --host 0.0.0.0 --port 8420

# Run unit tests (no API key needed)
PYTHONPATH=src pytest tests/ -m "not llm" -v

# Run full test suite including LLM integration (needs API key)
PYTHONPATH=src pytest tests/ --run-llm -v
```

## Architecture

```
Content Pack (YAML) --> Loader/Compiler --> Initial Snapshot
                                                  |
                                           DirectorRuntime
                                        (pure transition kernel)
                                                  |
                   +------------------------------+------------------------------+
                   |                              |                              |
              Perception                     SynthAgent                     Resolver
          (channel-filtered,              (LLM cognition,             (component-based,
           action surface)                structured I/O)              symbolic refs)
                   |                              |                              |
              SensoryBundle              CognitiveResponse               ActionOutcome
          (observations +                (intents, beliefs,            (success/failure +
           grounded menu)                speech, memories)             canonical effects)
```

Three layers, strictly separated:

- **Content packs**: scenario-specific nouns, expressed in YAML only, never in engine code
- **Ruleset**: the ontology that validates pack content at load time
- **Engine kernel**: generic mechanics over topology, components, sensory channels, and processes

See the Architecture Guide for details.
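The cycle the diagram describes can be sketched as a pure transition function. Everything here is hypothetical (names, signatures, and the dict-shaped world are invented); the real `DirectorRuntime` is richer, but the transactional shape, working on a draft and committing only on success, is the property the README claims:

```python
import copy

def run_cycle(world, synths, perceive, deliberate, resolve):
    """One director cycle, sketched: mutate a copy, commit only on success."""
    draft = copy.deepcopy(world)              # transactional snapshot
    try:
        for synth in synths:
            bundle = perceive(draft, synth)        # channel-filtered view
            response = deliberate(synth, bundle)   # LLM call in the real system
            for intent in response["intents"]:
                resolve(draft, synth, intent)      # canonical effects
    except Exception:
        return world                          # rollback: canonical state untouched
    draft["cycle"] = draft.get("cycle", 0) + 1
    return draft                              # commit
```

A failed resolution leaves the previous snapshot byte-for-byte intact, which is what makes forced-rollback tests (see the Test Suite below) possible.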

## API Control Plane

An HTTP/JSON service lets external agents observe, control, and intervene in running simulations:

```bash
# Start the server
PYTHONPATH=src uvicorn singing_bird.api.app:app --host 0.0.0.0 --port 8420

# Create a session, step it, see what happened
curl -X POST localhost:8420/v1/sessions -d '{"scenario_path":"scenarios/lighthouse.yaml"}'
curl -X POST localhost:8420/v1/sessions/{id}/step -d '{"n":1}'
curl localhost:8420/v1/sessions/{id}/reports/last-cycle
```

Three execution lanes:

- **Query**: read-only inspection (overview, resolve, object detail, events, changes, reports)
- **Turn**: external input routed through the director as mediated stimuli
- **Admin patch**: typed interventions (move/create entities, change weather, inject beliefs) with `dry_run`/`commit`/`commit_and_step` modes
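The same three calls shown with `curl` above can be driven from Python with only the standard library. Only the paths shown above are taken from the docs; the response field names (e.g. the session `id` key) are assumptions:

```python
import json
import urllib.request

BASE = "http://localhost:8420"

def build(method, path, body=None):
    """Construct a JSON request for the control plane (paths as documented)."""
    data = json.dumps(body).encode() if body is not None else None
    return urllib.request.Request(BASE + path, data=data, method=method,
                                  headers={"Content-Type": "application/json"})

def api(method, path, body=None):
    """Send a request and decode the JSON reply (no error handling, sketch only)."""
    with urllib.request.urlopen(build(method, path, body)) as resp:
        return json.loads(resp.read())

# Usage (requires a running server; the "id" field name is assumed):
# session = api("POST", "/v1/sessions",
#               {"scenario_path": "scenarios/lighthouse.yaml"})
# api("POST", f"/v1/sessions/{session['id']}/step", {"n": 1})
# report = api("GET", f"/v1/sessions/{session['id']}/reports/last-cycle")
```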

Plus SSE live event streaming, queued stimuli, auth scopes, idempotency, and a 20-tool manifest for LLM agent consumption. See the Observer Guide for full details.

## Scenario Packs

Six scenarios ship with the project, all running on the same kernel with zero engine code changes:

| Scenario | Setting | Synths | Focus |
| --- | --- | --- | --- |
| `lighthouse` | Island lighthouse, 1847 | Thomas (keeper), Margaret (blind wife), James (castaway) | Blind embodiment, repair, weather, epistemic asymmetry |
| `park_bench` | Lakeside boardwalk | Alice (artist), Bob (retiree) | Social deliberation, low-stakes conversation |
| `workshop_blackout` | Auto repair shop | Mara (owner), Eli (apprentice), June (courier) | Local repair, switchable light, weather pressure |
| `storm_cellar` | Farmhouse in a storm | Ruth, Ben, Alma | Shelter, consumption, testimony, worsening weather |
| `night_clinic` | Late-night clinic | Ana (nurse), Victor (patient), Hale (orderly) | Care, waiting, low-light, small-object handling |
| `dockside_outpost` | Harbor dock at dawn | Inez, Pavel, Sora | Fog, signal maintenance, mixed interior/exterior |

See the Scenario Authoring Contract for how to write new packs.

## Test Suite

107 tests total (96 unit + 11 LLM integration):

```bash
# Unit tests only (fast, no API key)
PYTHONPATH=src pytest tests/ -m "not llm" -v

# Full suite (slower, needs OPENAI_API_KEY)
PYTHONPATH=src pytest tests/ --run-llm -v
```
| Test file | What it proves |
| --- | --- |
| `test_models` | Serialization, components, belief provenance, theory of mind |
| `test_statechart` | Generic HSM: events, timers, guards, history states |
| `test_perception` | Channel filtering, communication delivery, action surface grounding |
| `test_content_loader` | YAML loading, referential integrity, ruleset validation |
| `test_invariants` | Engine invariants, rollback isolation |
| `test_kernel` | Component validation, anonymized scenario, symbolic grounding, replay log |
| `test_hardening` | Snapshot roundtrip, forced rollback, container reveal, tool-gated repair, containment invariants |
| `test_preplay` | Schema migration, conflict arbitration, cascade regression |
| `test_scenario_matrix` | All 6 packs load and run on the same kernel; stable IDs across loads |
| `test_engine` | Full LLM integration: single cycle, blind synth, intent production, multi-scenario |
| `test_api` | API control plane: sessions, observation, intervention, stimuli, idempotency, auth |

## Project Status

**Version:** 0.3.0 (experimental)

| Category | Met | Partial | Total |
| --- | ---: | ---: | ---: |
| System requirements | 18 | 2 | 20 |
| Non-functional | 7 | 1 | 8 |
| Director | 15 | 1 | 16 |
| Synth | 14 | 2 | 16 |
| Hierarchical state | 7 | 1 | 8 |
| Invariants | 8 | 0 | 8 |
| **Total** | **69** | **7** | **76** |

See the full Compliance Matrix.

**What works:** Content decoupling, symbolic grounding, transactional rollback, replay infrastructure, the six-scenario matrix, the containment model, tool-gated repair, schema migration.

**What is partial:** Percept distortion (the hook exists, but the rich distortion system is not built), recursive cascade (communication and world events cascade, but not all event types), the synth internal statechart (the field exists but is not engine-evaluated), and sensory range/occlusion (channel and location heuristics, not true physics).
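To make the percept-distortion hook concrete: it can be pictured as a per-channel filter applied to observations before delivery. This is a toy sketch with an invented signature, not the actual hook; it shows the simplest case the scenarios exercise, a blind synth (like Margaret in `lighthouse`) receiving no visual channel:

```python
def distort(observations, synth_profile):
    """Drop or degrade observations per channel before delivery.
    Toy version of the hook: a blind synth gets no visual channel at all;
    the shipped hook point exists, the rich distortion system does not yet."""
    delivered = []
    for obs in observations:
        if obs["channel"] == "visual" and synth_profile.get("blind"):
            continue                      # filtered out entirely
        delivered.append(obs)
    return delivered
```

A "rich" system would degrade rather than drop (blur, mishear, misattribute), which is why the table above marks this requirement partial.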

## Documentation

| Document | Audience | What it covers |
| --- | --- | --- |
| Architecture Guide | Developers | How the system works, module map, design decisions |
| Development Guide | Developers | Setup, testing, extending, debugging, cost management |
| Scenario Authoring Contract | Scenario authors | Pack format, component types, what works and what doesn't |
| Requirements Spec | Architects | Full system/director/synth requirements |
| Compliance Matrix | QA/reviewers | Requirement-by-requirement implementation status |
| Observer Guide | Operators/agents | API control plane: observe, step, intervene, stream |
| Contributing | Contributors | How to work with the codebase, areas for expansion |

## Dependencies