
Development Guide

Environment Setup

Requirements

  - A recent Python 3 and pip
  - Git with SSH access to gitlab.benac.dev (for cloning)
  - An OpenAI API key (only needed for LLM-backed runs and tests)

Installation

# Clone
git clone git@gitlab.benac.dev:toys/singing-bird.git
cd singing-bird

# Install dependencies (globally or in a venv)
pip install openai pydantic pyyaml "fastapi>=0.99,<0.100" "uvicorn[standard]" pytest pytest-asyncio

# Set your API key
export OPENAI_API_KEY=sk-...

Project Layout

singing-bird/
  src/
    singing_bird/       # all source code
      models/           # data models (world, synth, director, payloads, ruleset)
      engine/           # kernel (director, perception, resolver, processes, invariants)
      llm/              # OpenAI integration (client, schemas, prompts)
      synth/            # synth agent (cognitive cycle, theory of mind)
      content/          # scenario loader
      persistence/      # snapshots, audit, migration
      api/              # HTTP/JSON control plane
        app.py          # FastAPI application
        session_manager.py
        auth.py         # bearer token + scopes
        refs.py         # name → ref resolution
        diff.py         # change computation
        streaming.py    # SSE event stream
        services/       # observation + intervention logic
      cli.py            # CLI entry point
  scenarios/            # YAML content packs
  tests/                # test suite
  docs/                 # documentation

The src/ layout requires PYTHONPATH=src for all commands.

Running the API Server

# Start the control plane
PYTHONPATH=src uvicorn singing_bird.api.app:app --host 0.0.0.0 --port 8420

# With auth enabled
SINGING_BIRD_API_TOKEN=my-secret-token PYTHONPATH=src uvicorn singing_bird.api.app:app --port 8420

# OpenAPI docs at http://localhost:8420/docs

See the Observer Guide for full API usage including session management, observation, intervention, and LLM agent integration.

Running the Simulation (CLI)

# Run a scenario
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/lighthouse.yaml --cycles 5

# Verbose logging
PYTHONPATH=src python -m singing_bird.cli run --scenario scenarios/workshop_blackout.yaml --cycles 3 -v

Output includes per-cycle summaries showing synth moods, speech, actions, and a token usage summary at the end.

Output Structure

Each run creates a directory under data/runs/<run_id>/:

data/runs/abc12345/
  snapshots/
    snapshot_000000.json   # initial state
    snapshot_000001.json   # after cycle 0
    ...
  audit/
    events.jsonl           # cycle events, resolutions, communications
    llm_calls.jsonl        # full LLM prompts + responses (for replay)
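
Both logs are JSON Lines, one event per line, so they are easy to post-process. A sketch that tallies events by kind (the "type" field name is an assumption; check a real events.jsonl for the actual key):

```python
import json
from collections import Counter
from pathlib import Path

def tally_events(path: str) -> Counter:
    """Count audit events by their type field, skipping blank lines."""
    counts: Counter = Counter()
    for line in Path(path).read_text().splitlines():
        if line.strip():
            event = json.loads(line)
            counts[event.get("type", "unknown")] += 1
    return counts
```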

Running Tests

# All unit tests (no API key needed, fast)
PYTHONPATH=src pytest tests/ -m "not llm" -v

# Full suite including LLM integration (needs API key, ~10 min)
PYTHONPATH=src pytest tests/ --run-llm -v

# Single test file
PYTHONPATH=src pytest tests/test_kernel.py -v

# Scenario matrix only
PYTHONPATH=src pytest tests/test_scenario_matrix.py -v -m "not llm"

Test Files

| File | Tests | LLM? | What it proves |
|---|---|---|---|
| test_models.py | 9 | No | Serialization, components, provenance, theory of mind |
| test_statechart.py | 6 | No | HSM: events, timers, guards, history |
| test_perception.py | 8 | No | Channel filtering, speech delivery, action surface |
| test_content_loader.py | 8 | No | YAML loading, referential integrity, ruleset validation |
| test_invariants.py | 5 | No | Invariants, rollback isolation |
| test_kernel.py | 12 | No | Component validation, grounding, anonymization, replay |
| test_hardening.py | 7 | 1 | Snapshot roundtrip, forced rollback, containers, tools |
| test_preplay.py | 7 | No | Migration, conflict, cascades |
| test_scenario_matrix.py | 12 | 6 | All 6 packs load + stable IDs + one LLM cycle each |
| test_engine.py | 4 | 4 | LLM integration: single cycle, blind synth, intents |
| test_api.py | 21 | No | API control plane: sessions, observation, intervention, stimuli, idempotency, auth |

The --run-llm flag enables tests marked with @pytest.mark.llm. Without it, they are skipped.
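
The flag is wired up in the standard pytest way; a conftest.py sketch of the likely mechanism (the actual implementation in tests/ may differ):

```python
import pytest

def pytest_addoption(parser):
    # Register the custom --run-llm command-line flag.
    parser.addoption("--run-llm", action="store_true", default=False,
                     help="run tests marked with @pytest.mark.llm")

def pytest_collection_modifyitems(config, items):
    # Without --run-llm, attach a skip marker to every llm-marked test.
    if config.getoption("--run-llm"):
        return
    skip_llm = pytest.mark.skip(reason="needs --run-llm")
    for item in items:
        if "llm" in item.keywords:
            item.add_marker(skip_llm)
```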

Adding a New Scenario Pack

  1. Read the Scenario Authoring Contract
  2. Create scenarios/your_scenario.yaml
  3. Use an existing pack as a template
  4. Required fields: scenario_id, schema_version, ruleset_ref, random_seed
  5. Define topology (locations + adjacency), entities (with registered components), synths (with embodiment profiles)
  6. Validate: PYTHONPATH=src python -c "from singing_bird.content.loader import load_scenario; load_scenario('scenarios/your_scenario.yaml')"
  7. The scenario matrix test will automatically pick it up
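
A skeleton of the required top-level shape (a hedged sketch: the four required fields come from step 4, the nested structure of the other sections should be copied from an existing pack):

```yaml
scenario_id: your_scenario
schema_version: "0.3.0"          # match the current schema version
ruleset_ref: PHYSICAL_SOCIAL_V1
random_seed: 42
topology:      # locations + adjacency — copy from an existing pack
entities:      # entities with registered components
synths:        # synths with embodiment profiles
```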

Rules

Adding a New Component Type

  1. Add the schema to PHYSICAL_SOCIAL_V1 in src/singing_bird/models/ruleset.py:
"climbable": ComponentSchema(
    component_type="climbable",
    description="Can be climbed",
    required_properties=["height"],
    optional_properties=["difficulty", "description"],
    defaults={"difficulty": 0.5},
),
  2. Add affordance generation in compose_action_surface() in src/singing_bird/engine/perception.py:
elif ct == "climbable":
    actions.append(AvailableAction(
        action_id=f"climb_{ent_ref}",
        action_type=ActionType.manipulate,
        description=f"Climb {ent.name}",
        target_entity_ref=ent_ref,
        target_entity_name=ent.name,
        affordance="climb",
    ))
  3. Add resolution in _resolve_manipulate() in src/singing_bird/engine/resolver.py:
elif aid.startswith("climb_"):
    return _resolve_climb(intent, synth, target, world), events
  4. Write the resolver function and a test.
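
As a shape reference only, here is a standalone sketch of what a climb resolver might look like. The real engine types (IntentResolution, events) live in src/singing_bird/engine/resolver.py, and the success rule below is invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ClimbOutcome:
    """Stand-in for the engine's real resolution type."""
    success: bool
    description: str

def resolve_climb(height: float, difficulty: float, agility: float = 0.5) -> ClimbOutcome:
    # Invented rule: the synth succeeds when its agility meets the
    # component's difficulty (which defaults to 0.5 in the schema above).
    if agility >= difficulty:
        return ClimbOutcome(True, f"climbed {height}m")
    return ClimbOutcome(False, "slipped while climbing")
```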

Note: the current architecture hardcodes affordance/resolution dispatch in the engine. A future improvement (ruleset-driven handler registries) would make this fully declarative.

Adding a Perception Policy

  1. Subclass PerceptionPolicy in src/singing_bird/engine/perception_policy.py:
class FogPolicy(PerceptionPolicy):
    def filter(self, observations, synth, world):
        if world.environment.weather.condition != "foggy":
            return observations
        return [obs for obs in observations if obs.channel.value != "visual" or obs.salience > 0.7]
  2. Pass it to compose_sensory_bundle() via the perception_policy parameter.
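
To sanity-check the filter rule in isolation, it can be exercised with stand-in observations (the real Observation model and channel enum in the engine are richer than these):

```python
from dataclasses import dataclass
from enum import Enum

class Channel(Enum):
    visual = "visual"
    auditory = "auditory"

@dataclass
class Obs:
    """Stand-in for the engine's observation model."""
    channel: Channel
    salience: float

def fog_filter(observations: list[Obs]) -> list[Obs]:
    # Same rule FogPolicy applies when the weather is foggy:
    # drop low-salience visual observations, keep everything else.
    return [o for o in observations if o.channel.value != "visual" or o.salience > 0.7]
```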

Snapshot Format and Migration

Snapshots are JSON files containing the full serialized state:

{
  "snapshot_ref": "snapshot_000003",
  "created_at": "2026-04-16T18:13:42",
  "schema_version": "0.3.0",
  "world": { ... },
  "director": { ... },
  "synths": { "uuid": { ... }, ... }
}

Adding a Migration

When you change the data model, register a migration in src/singing_bird/persistence/migration.py:

@register_migration("0.3.0", "0.4.0")
def _migrate_030_to_040(data):
    # Transform the snapshot dict
    for eid, entity in data["world"]["entities"].items():
        entity["new_field"] = "default_value"
    return data

Update CURRENT_SCHEMA_VERSION in the same file. The snapshot loader auto-migrates when it encounters older versions.
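
The registry can be pictured as a chain keyed by source version, which the loader walks until it reaches the current version. A simplified model of the mechanism (the real implementation in persistence/migration.py will differ in detail):

```python
from typing import Callable

# Each registered migration maps one schema version to the next.
MIGRATIONS: dict[str, tuple[str, Callable]] = {}

def register_migration(src: str, dst: str):
    def decorator(fn):
        MIGRATIONS[src] = (dst, fn)
        return fn
    return decorator

def migrate(data: dict, target: str) -> dict:
    """Walk the migration chain until the snapshot reaches the target version."""
    while data["schema_version"] != target:
        dst, fn = MIGRATIONS[data["schema_version"]]
        data = fn(data)
        data["schema_version"] = dst
    return data

@register_migration("0.3.0", "0.4.0")
def _migrate_030_to_040(data):
    for entity in data["world"]["entities"].values():
        entity["new_field"] = "default_value"
    return data
```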

Debugging

Audit Log

Every cycle writes structured events to audit/events.jsonl:

# Watch events as they happen
tail -f data/runs/<run_id>/audit/events.jsonl

# Pretty-print the log (one JSON document per line)
python -m json.tool --json-lines --no-ensure-ascii data/runs/<run_id>/audit/events.jsonl

# Find all speech
grep '"communication"' data/runs/<run_id>/audit/events.jsonl

# Find all failed intents
grep '"failure"' data/runs/<run_id>/audit/events.jsonl

LLM Call Log

Every LLM call is logged with full prompt and response:

# Count calls per run
wc -l data/runs/<run_id>/audit/llm_calls.jsonl

# Inspect a specific call
head -1 data/runs/<run_id>/audit/llm_calls.jsonl | python -m json.tool

Snapshot Inspection

# Read a snapshot
python -c "
import json
with open('data/runs/<run_id>/snapshots/snapshot_000003.json') as f:
    d = json.load(f)
print(f'Revision: {d[\"world\"][\"revision\"]}')
print(f'Time: {d[\"world\"][\"simulation_time\"]}')
for sid, s in d['synths'].items():
    print(f'  {s[\"identity\"][\"name\"]}: mood={s[\"affect\"][\"mood\"]}')
"

Cost Management

The CLI prints token usage at the end of each run:

--- Token Usage ---
LLM calls: 79
Prompt tokens: 219,809
Completion tokens: 43,786
Total tokens: 263,595
Estimated cost: $0.9874
  gpt-4o: 79 calls, 263,595 tokens
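
The estimate is a simple per-rate calculation over the two token counts. A sketch that reproduces the number above (the rates are assumptions chosen to match this example, not authoritative pricing):

```python
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prompt_rate: float = 2.50, completion_rate: float = 10.00) -> float:
    """Rates are USD per 1M tokens (assumed gpt-4o-style pricing)."""
    return prompt_tokens / 1e6 * prompt_rate + completion_tokens / 1e6 * completion_rate

# Matches the run above: 219,809 prompt + 43,786 completion tokens
print(f"${estimate_cost(219_809, 43_786):.4f}")  # $0.9874
```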

Reducing Costs

  - Run fewer cycles (--cycles) while iterating on scenario content
  - Skip the LLM integration tests during development (pytest -m "not llm")
  - Prefer scenarios with fewer synths; LLM calls scale with synth count and with how much the synths talk

Typical Costs (gpt-4o, 3 synths)

| Cycles | LLM Calls | Tokens | Estimated Cost |
|---|---|---|---|
| 1 | 3-9 | ~20K-60K | $0.05-0.20 |
| 3 | 9-30 | ~60K-150K | $0.20-0.50 |
| 5 | 15-80 | ~100K-270K | $0.35-1.00 |

Costs vary with microcycle count, which depends on how much the synths talk.