Chapter 6

Case Study: The Identity Crisis

When every AI companion thinks it is the same person, you have a prompt engineering bug that looks like a personality disorder. Here is how hardcoded identity in AI prompts causes failures at scale, and how dynamic identity injection solves it.

The Bug

Athena, goddess of wisdom, was asked a simple question: "Who are you?"

She answered confidently: "I'm Bob, your AI companion!"

This was not a philosophical crisis. It was a hardcoded string telling every companion in the system to pretend they were the same character.

poqpoq World runs multiple AI companions, each with a distinct name and personality. Athena, Iris, Odin, Hermes — each is backed by the same language model, but differentiated through prompt context. That context includes a critical instruction: "Respond naturally as [name]."

Except the instruction did not say [name]. It said Bob. In four separate locations across two services.

Scope of the Problem

- 4 hardcoded locations
- 2 affected services
- 100% of companions affected
- 1-line fix per site

The identity instruction appeared in four code paths: the SSE streaming endpoint, the WebSocket streaming endpoint, the cognition-with-perception endpoint, and the direct cognition API. Every companion, regardless of which companion_id the client sent, received the same prompt telling it to respond as Bob.

How It Manifested

The bug produced two distinct symptoms.

1. Identity Confusion

Every companion responded with Bob's personality. Users who expected to speak with Athena, Iris, or Odin got the same voice, the same mannerisms, the same name. The companion_id parameter was correctly routed through the system — memory storage was properly keyed — but the prompt overrode the personality at the last step.

2. Semantic Memory Mismatch

The memory system stored conversations under the correct companion key (user_id, companion_id), so Athena's memories were filed under "athena." But when those memories were retrieved and fed back into context, the companion still identified as Bob. The result: correctly stored memories attached to an incorrectly identified personality.
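
The mismatch can be reproduced with a minimal sketch (names and structure here are illustrative, not the actual poqpoq World code): the storage layer keys memories by (user_id, companion_id), while prompt assembly ignores the companion and hardcodes the name.

```python
# Minimal sketch of the mismatch: storage is keyed correctly,
# but prompt assembly hardcodes the identity. Illustrative only.
memory_store = {}

def store_memory(user_id, companion_id, text):
    # Storage layer: correctly keyed by (user_id, companion_id)
    memory_store.setdefault((user_id, companion_id), []).append(text)

def build_prompt(user_id, companion_id):
    # Retrieval pulls the right memories...
    context_parts = list(memory_store.get((user_id, companion_id), []))
    # ...but the identity instruction ignores companion_id (the bug)
    context_parts.append("Respond naturally as Bob, considering the context.")
    return "\n".join(context_parts)

store_memory("alice", "athena", "User asked about owls.")
prompt = build_prompt("alice", "athena")
# Athena's memories are present, yet the prompt still says "Bob"
```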

Root Cause

In each of the four affected code paths, the prompt context was assembled with a hardcoded string rather than a dynamic parameter.

Before (hardcoded)
context_parts.append("Respond naturally as Bob, considering the context.")

The companion_id was already available in the request object. It simply was not used.

The Fix

The solution was a single f-string interpolation, applied to each of the four locations.

After (dynamic)
context_parts.append(f"Respond naturally as {request.companion_id}, considering the context.")

Now each companion receives its own name in the prompt.
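
For example, interpolating the companion_id produces a distinct instruction per companion (a sketch; the helper name is illustrative):

```python
# Sketch of the fixed prompt assembly: one template, many identities.
def identity_instruction(companion_id: str) -> str:
    return f"Respond naturally as {companion_id}, considering the context."

# Each companion now gets its own name in the instruction
prompts = {cid: identity_instruction(cid)
           for cid in ["athena", "iris", "odin", "hermes"]}
```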

The memory system, which was already correctly keyed, now aligned with the prompt layer. Both storage and generation agreed on who the companion was.

The Second Bug: Shared User Context

While tracing the identity issue, a second hardcoding problem surfaced. The function that called the language model backend was missing a user_id parameter entirely, falling back to a single hardcoded UUID for every request.

Every user's conversation was attributed to the same account. The backend could not distinguish between users, breaking multi-user memory isolation. Alice's memories bled into Bob's context.

The fix followed the same pattern: accept user_id as a parameter, propagate it through the call chain, and use it in the downstream request.

# Before: no user_id parameter
async def call_backend(prompt, companion_id="bob"):
    payload = {"user_id": "hardcoded-uuid-here"}

# After: dynamic user_id with fallback
async def call_backend(prompt, companion_id="bob", user_id=None):
    payload = {"user_id": user_id or "fallback-for-testing"}
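
The propagation through the call chain can be sketched as follows (the function names and request shape are assumptions for illustration, not the actual service code):

```python
import asyncio

# Hypothetical call chain: user_id must travel from ingress to the backend payload.
async def call_backend(prompt, companion_id="bob", user_id=None):
    # Fall back only for test contexts; real requests supply user_id
    return {"user_id": user_id or "fallback-for-testing",
            "companion_id": companion_id,
            "prompt": prompt}

async def handle_request(request):
    # Ingress layer: pull both identity parameters off the request
    # and pass them all the way down
    return await call_backend(
        prompt=request["message"],
        companion_id=request["companion_id"],
        user_id=request["user_id"],
    )

payload = asyncio.run(handle_request(
    {"message": "hi", "companion_id": "athena", "user_id": "alice-uuid"}))
```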

Lessons for Multi-Agent Systems

1. Prompts Are Code

Hardcoded values in prompt templates are invisible type errors. They pass every syntax check and unit test, but produce semantic bugs that only surface when a second character enters the system. Treat prompt templates with the same rigor as function signatures.
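
One way to give a prompt template the rigor of a function signature is to make it a function that fails loudly when identity is missing (a sketch, not the system's actual code):

```python
def render_identity_prompt(companion_id: str) -> str:
    """Prompt template with a required identity parameter.

    Raising here turns a silent personality bug into a loud error,
    the same way a missing function argument would.
    """
    if not companion_id:
        raise ValueError("companion_id is required for prompt assembly")
    return f"Respond naturally as {companion_id}, considering the context."
```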

2. Trace Parameters End to End

The companion_id was available in the request, correctly routed through memory storage, but dropped at the prompt assembly step. The user_id was never passed at all. In multi-agent systems, identity and isolation parameters must be traced from ingress to generation.

3. Test Multiple Characters Early

A single-character test suite will never catch this bug. The moment you add a second companion, write a test that asks each one "Who are you?" and asserts the response contains the correct name and not any other companion's name.

4. Storage and Generation Must Agree

Correct memory isolation is necessary but not sufficient. If the storage layer keys by companion_id but the prompt layer ignores it, you get well-organized memories delivered by the wrong personality. Both layers must use the same identity source.
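
A sketch of the principle: both layers draw identity from a single shared object, so they cannot disagree (the `CompanionContext` name is illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class CompanionContext:
    # Single source of identity, shared by storage and generation
    user_id: str
    companion_id: str
    memories: list = field(default_factory=list)

def storage_key(ctx: CompanionContext):
    # Storage layer derives its key from the shared context
    return (ctx.user_id, ctx.companion_id)

def build_prompt(ctx: CompanionContext) -> str:
    # Generation layer derives the identity from the same context
    parts = list(ctx.memories)
    parts.append(f"Respond naturally as {ctx.companion_id}, considering the context.")
    return "\n".join(parts)

ctx = CompanionContext(user_id="alice", companion_id="athena",
                       memories=["Likes owls."])
```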

The Test That Would Have Caught It

This is the minimal test. Parameterize across all companion IDs, ask the identity question, and assert both inclusion and exclusion.

import pytest

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
@pytest.mark.parametrize("companion_id", ["athena", "iris", "odin", "hermes"])
async def test_companion_identity(companion_id):
    response = await chat(
        message="Who are you?",
        companion_id=companion_id,
        user_id="test-user"
    )

    # Must contain own name
    assert companion_id in response.lower()

    # Must not contain any other companion's name
    others = {"athena", "iris", "odin", "hermes"} - {companion_id}
    for other in others:
        assert other not in response.lower()

For user isolation, a second test creates two separate user sessions with the same companion and verifies that one user's memories do not appear in the other's context.
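
That isolation test can be sketched end to end with a fake in-memory chat, so the shape of the check is concrete (the real system's chat API is an assumption; only the test structure matters):

```python
import asyncio

# Fake chat backend: per-(user, companion) memory, so leakage is detectable.
memories: dict = {}

async def chat(message: str, companion_id: str, user_id: str) -> str:
    history = memories.setdefault((user_id, companion_id), [])
    history.append(message)
    # The "response" echoes only this user's own history
    return " ".join(history)

async def test_user_memory_isolation():
    # Seed a distinctive fact into Alice's session only
    await chat(message="My favorite color is vermilion.",
               companion_id="athena", user_id="user-alice")
    # Bob's session with the same companion must not see Alice's memory
    response = await chat(message="What is my favorite color?",
                          companion_id="athena", user_id="user-bob")
    assert "vermilion" not in response.lower()

asyncio.run(test_user_memory_isolation())
```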

The Takeaway

In AI systems with multiple agents, identity is infrastructure. It is not a cosmetic label applied at the end — it is a parameter that must flow through every layer: routing, memory storage, prompt assembly, and response generation. When any layer substitutes a hardcoded value for a dynamic one, the entire personality model breaks silently.

The fix was four lines of code. The lesson applies to every multi-agent system: never hardcode identity in prompts.