# Singing Bird
A declarative, scenario-driven, LLM-mediated embodied simulation kernel.
Singing Bird simulates canonical worlds containing synthetic agents (synths) whose subjective experience may differ from reality. A director maintains authoritative world truth. Synths receive filtered sensory input, deliberate via real LLM calls, and emit grounded action intents. The engine resolves those intents against canonical physics, propagates consequences, and commits transactional snapshots.
Scenarios are declarative YAML content packs. The engine does not know what a lighthouse is, or a clinic, or a dock. It knows locations, adjacency, entities with typed components, sensory channels, and world processes. All scenario-specific content lives in the packs, not in engine code.
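To make that concrete, a pack might look roughly like the sketch below. This is illustrative only: the field names are assumptions, not the actual pack schema (see the Scenario Authoring Contract for the real format).

```yaml
# Hypothetical pack sketch -- field names are illustrative, not the real schema.
locations:
  - id: loc_tower
    adjacent: [loc_gallery]
entities:
  - id: ent_lamp
    location: loc_tower
    components:
      light_source: { lit: false }
synths:
  - id: synth_keeper
    location: loc_tower
processes:
  - id: proc_weather
latent_facts:
  - the storm will worsen after nightfall
```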
## Key Concepts
| Concept | What it is |
|---|---|
| Director | The sole authority over canonical world truth. Runs the cycle, delivers perceptions, resolves actions, commits snapshots. |
| World | The canonical simulation state: topology, entities, environment, processes. Ground truth. |
| Synth | A subjective agent with identity, beliefs, mood, projects, memories, and theory of mind. Receives sensory bundles, emits intents. Never directly reads canonical state. |
| Content Pack | A YAML file defining a complete scenario: locations, entities, synths, processes, latent facts. Loaded and validated against a ruleset. |
| Ruleset | The ontology layer: defines what component types exist, validates content at load time. Currently ships physical_social_v1. |
| Action Surface | A grounded menu of available actions presented to each synth each cycle. Synths select by symbolic ID, not free text. |
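The action-surface idea can be sketched in a few lines of plain Python (an illustrative stand-in, not the engine's actual types): the director offers a menu of action IDs, and a synth's choice is grounded only if it echoes one of them back.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActionOption:
    """One grounded entry in a synth's action surface."""
    action_id: str      # symbolic ID the synth must echo back
    description: str    # human/LLM-readable gloss

def validate_intent(surface: list[ActionOption], chosen_id: str) -> bool:
    """An intent is grounded only if it names an offered action ID."""
    return any(opt.action_id == chosen_id for opt in surface)

surface = [
    ActionOption("act.climb_stairs", "Climb the tower stairs"),
    ActionOption("act.trim_wick", "Trim the lamp wick"),
]

validate_intent(surface, "act.trim_wick")   # symbolic ID -> True
validate_intent(surface, "light the lamp")  # free text -> False
```

Selecting by symbolic ID rather than free text keeps LLM output resolvable against canonical state.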
## Quick Start
```bash
# By default, singing_bird calls the Centroid LLM router (llm.benac.dev).
# The router picks whichever backend is currently scaled up (local GGUF or
# OpenAI pipe). Models are addressed by capability alias -- "@chat" in the
# shipped scenarios -- so consumers don't need to be reconfigured when the
# backend changes.
#
# Override with: export OPENAI_BASE_URL=https://api.openai.com/v1 (direct)
#                export OPENAI_API_KEY=sk-...  (needed for direct)

# Run the lighthouse scenario (3 cycles)
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/lighthouse.yaml --cycles 3

# Run the workshop blackout (different world, same kernel)
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/workshop_blackout.yaml --cycles 3

# Start the API control plane (observe, step, intervene via HTTP)
PYTHONPATH=src uvicorn singing_bird.api.app:app --host 0.0.0.0 --port 8420

# Run unit tests (no API key needed)
PYTHONPATH=src pytest tests/ -m "not llm" -v

# Run full test suite including LLM integration (needs API key)
PYTHONPATH=src pytest tests/ --run-llm -v
```
## Architecture
```
Content Pack (YAML) --> Loader/Compiler --> Initial Snapshot
                               |
                        DirectorRuntime
                    (pure transition kernel)
                               |
        +----------------------+----------------------+
        |                      |                      |
   Perception             SynthAgent              Resolver
(channel-filtered,     (LLM cognition,       (component-based,
 action surface)        structured I/O)       symbolic refs)
        |                      |                      |
 SensoryBundle        CognitiveResponse        ActionOutcome
(observations +       (intents, beliefs,     (success/failure +
 grounded menu)        speech, memories)      canonical effects)
```
Three layers, strictly separated:
- Kernel: scheduling, transactions, event propagation, perception delivery, snapshots, replay
- Ruleset: component schemas, affordance definitions, validation rules
- Content Pack: locations, entities, synths, processes, initial conditions
See Architecture Guide for details.
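The cycle these layers cooperate on can be sketched as a loop. The sketch below is illustrative pseudocode made runnable with a stub synth; every name in it is an assumption, not the engine's real module API.

```python
# Minimal illustrative director cycle -- names are assumptions, not the real API.
def run_cycle(world: dict, synths: list) -> dict:
    staged = dict(world)                          # transactional working copy
    for synth in synths:
        bundle = synth["perceive"](staged)        # filtered view + action menu
        intent = synth["cognize"](bundle)         # in the real engine: an LLM call
        staged = synth["resolve"](staged, intent) # apply against canonical state
    return staged                                 # commit: replace world with snapshot

# Stub synth: sees only the lamp, always chooses the offered lighting action.
stub = {
    "perceive": lambda w: {"lamp_lit": w["lamp_lit"], "menu": ["act.light_lamp"]},
    "cognize": lambda b: b["menu"][0],
    "resolve": lambda w, i: {**w, "lamp_lit": True} if i == "act.light_lamp" else w,
}
world = run_cycle({"lamp_lit": False}, [stub])  # -> {"lamp_lit": True}
```

The real kernel additionally propagates consequences and can roll back the staged copy if resolution fails.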
## API Control Plane
An HTTP/JSON service lets external agents observe, control, and intervene in running simulations:
```bash
# Start the server
PYTHONPATH=src uvicorn singing_bird.api.app:app --host 0.0.0.0 --port 8420

# Create a session, step it, see what happened
curl -X POST localhost:8420/v1/sessions -d '{"scenario_path":"scenarios/lighthouse.yaml"}'
curl -X POST localhost:8420/v1/sessions/{id}/step -d '{"n":1}'
curl localhost:8420/v1/sessions/{id}/reports/last-cycle
```
Three execution lanes:

- **Query**: read-only inspection (overview, resolve, object detail, events, changes, reports)
- **Turn**: external input routed through the director as mediated stimuli
- **Admin patch**: typed interventions (move/create entities, change weather, inject beliefs) with dry_run/commit/commit_and_step modes
Plus SSE live event streaming, queued stimuli, auth scopes, idempotency, and a 20-tool manifest for LLM agent consumption. See the Observer Guide for full details.
## Scenario Packs
Six scenarios ship with the project, all running on the same kernel with zero engine code changes:
| Scenario | Setting | Synths | Focus |
|---|---|---|---|
| `lighthouse` | Island lighthouse, 1847 | Thomas (keeper), Margaret (blind wife), James (castaway) | Blind embodiment, repair, weather, epistemic asymmetry |
| `park_bench` | Lakeside boardwalk | Alice (artist), Bob (retiree) | Social deliberation, low-stakes conversation |
| `workshop_blackout` | Auto repair shop | Mara (owner), Eli (apprentice), June (courier) | Local repair, switchable light, weather pressure |
| `storm_cellar` | Farmhouse in a storm | Ruth, Ben, Alma | Shelter, consumption, testimony, worsening weather |
| `night_clinic` | Late-night clinic | Ana (nurse), Victor (patient), Hale (orderly) | Care, waiting, low-light, small-object handling |
| `dockside_outpost` | Harbor dock at dawn | Inez, Pavel, Sora | Fog, signal maintenance, mixed interior/exterior |
See the Scenario Authoring Contract for how to write new packs.
## Test Suite
107 tests total (96 unit + 11 LLM integration):
```bash
# Unit tests only (fast, no API key)
PYTHONPATH=src pytest tests/ -m "not llm" -v

# Full suite (slower, needs OPENAI_API_KEY)
PYTHONPATH=src pytest tests/ --run-llm -v
```
| Test file | What it proves |
|---|---|
| `test_models` | Serialization, components, belief provenance, theory of mind |
| `test_statechart` | Generic HSM: events, timers, guards, history states |
| `test_perception` | Channel filtering, communication delivery, action surface grounding |
| `test_content_loader` | YAML loading, referential integrity, ruleset validation |
| `test_invariants` | Engine invariants, rollback isolation |
| `test_kernel` | Component validation, anonymized scenario, symbolic grounding, replay log |
| `test_hardening` | Snapshot roundtrip, forced rollback, container reveal, tool-gated repair, containment invariants |
| `test_preplay` | Schema migration, conflict arbitration, cascade regression |
| `test_scenario_matrix` | All 6 packs load and run on the same kernel; stable IDs across loads |
| `test_engine` | Full LLM integration: single cycle, blind synth, intent production, multi-scenario |
| `test_api` | API control plane: sessions, observation, intervention, stimuli, idempotency, auth |
## Project Status
Version: 0.3.0 (experimental)
| Category | Met | Partial | Total |
|---|---|---|---|
| System requirements | 18 | 2 | 20 |
| Non-functional | 7 | 1 | 8 |
| Director | 15 | 1 | 16 |
| Synth | 14 | 2 | 16 |
| Hierarchical state | 7 | 1 | 8 |
| Invariants | 8 | 0 | 8 |
| Total | 69 | 7 | 76 |
See the full Compliance Matrix.
**What works:** content decoupling, symbolic grounding, transactional rollback, replay infrastructure, the six-scenario matrix, the containment model, tool-gated repair, schema migration.

**What is partial:**

- Percept distortion: the hook exists, but the rich distortion system is not built
- Recursive cascade: covers communication and world events, not yet all event types
- Synth internal statechart: the field exists but is not engine-evaluated
- Sensory range/occlusion: channel and location heuristics, not true physics
## Documentation
| Document | Audience | What it covers |
|---|---|---|
| Architecture Guide | Developers | How the system works, module map, design decisions |
| Development Guide | Developers | Setup, testing, extending, debugging, cost management |
| Scenario Authoring Contract | Scenario authors | Pack format, component types, what works and what doesn't |
| Requirements Spec | Architects | Full system/director/synth requirements |
| Compliance Matrix | QA/reviewers | Requirement-by-requirement implementation status |
| Observer Guide | Operators/agents | API control plane: observe, step, intervene, stream |
| Contributing | Contributors | How to work with the codebase, areas for expansion |
## Dependencies
- Python 3.10+
- `openai >= 1.50`
- `pydantic >= 1.10, < 2`
- `pyyaml >= 6.0`
- `fastapi >= 0.99, < 0.100` (API control plane)
- `uvicorn >= 0.20` (API server)
- `OPENAI_API_KEY` or Centroid LLM router access (for LLM-powered simulation)