# Malin LIVE LOOP — build spec (today's goal, P0)

GOAL: real-time spoken conversation with the embodied avatar. Jun speaks -> Malin hears
him live (mic) -> her brain replies -> he hears her through the speakers -> the FC float
plays her reaction WHILE she speaks. One machine (the 5090): mic, speakers, screen, GPU
all local, so no network audio.

## Pipeline (one loop, on the 5090)
```
mic -> VAD (end-of-utterance) -> Whisper STT -> user_text
   -> malin.generate_reply(user_text) -> reply_text
   -> fc_perform_router.route(reply_text, available) -> (emotion, clean_text)
   -> trigger FC expression(emotion)  +  Chatterbox TTS(clean_text) -> 5090 speakers
   -> (mic paused while she speaks: echo-gate) -> back to listening
```

## Build order

### 1. FC event input  (add to fc_player.py)
The FC has `trigger(emo)` + keyboard hotkeys but NO external input. Add a way for the
loop to drive expressions:
- A daemon thread runs a tiny TCP server on `127.0.0.1:1238`.
- For each newline-terminated line of JSON such as `{"emotion":"happy"}`, push the emotion name onto a thread-safe `queue.Queue`.
- In the GUI tick (the Tk after-loop / next_frame path), drain the queue once per frame
  and call `player.trigger(emotion)` FROM THE GUI THREAD (Tk isn't thread-safe — never
  call trigger from the socket thread; go through the queue).
- Keep the keyboard hotkeys working.
- **CONTRACT (don't change without conferring): external code can make the FC play any
  named expression by sending `{"emotion":"<name>"}` to `127.0.0.1:1238`.** Mechanism is
  your call; the contract is fixed.
- Verify: with the FC running, `printf '{"emotion":"happy"}\n' | nc 127.0.0.1 1238` plays happy.
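
Rough sketch of the listener-plus-queue shape described above, not a required implementation (the mechanism stays your call). It uses only the standard library. The names `emotion_queue`, `EmotionHandler`, `start_event_server`, and `drain_events` are made up for illustration and are not in fc_player.py today.

```python
import json
import queue
import socketserver
import threading

# Thread-safe hand-off between the socket thread and the GUI thread.
emotion_queue = queue.Queue()


class EmotionHandler(socketserver.StreamRequestHandler):
    """Reads newline-delimited JSON and enqueues each emotion name."""

    def handle(self):
        for raw in self.rfile:
            try:
                msg = json.loads(raw.decode("utf-8"))
            except (UnicodeDecodeError, json.JSONDecodeError):
                continue  # skip malformed lines instead of dropping the connection
            emotion = msg.get("emotion") if isinstance(msg, dict) else None
            if isinstance(emotion, str):
                emotion_queue.put(emotion)


class _Server(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def start_event_server(host="127.0.0.1", port=1238):
    """Start the listener on a daemon thread and return the server object."""
    server = _Server((host, port), EmotionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def drain_events(trigger):
    """Call from the GUI tick only: forwards every queued emotion to trigger()."""
    while True:
        try:
            emotion = emotion_queue.get_nowait()
        except queue.Empty:
            return
        trigger(emotion)
```

Assuming the FC's frame callback is a Tk after-loop, the per-frame call would be something like `drain_events(player.trigger)` inside that callback, so `trigger` only ever runs on the GUI thread.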

### 2. Ears  (your P0 — STATUS?)
sounddevice mic capture -> silero-VAD end-of-utterance -> faster-whisper (small, GPU) ->
user_text. Privacy toggle: a hotkey to toggle listening on/off is fine for v1.
**Tell me what state this is in.** If it's already built, we just plug it into the loop.
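
If it isn't built yet, the end-of-utterance piece can be a small state machine sitting between the VAD and Whisper. The sketch below is library-free on purpose: it takes a per-frame speech probability (which silero-VAD would supply) and returns the buffered frames once enough trailing silence has passed. The class name and every default (`frame_ms`, `speech_threshold`, `min_silence_ms`) are assumptions to tune, not measured values.

```python
class EndOfUtterance:
    """Buffers speech frames and reports when trailing silence closes the utterance."""

    def __init__(self, frame_ms=32, speech_threshold=0.5, min_silence_ms=600):
        self.frame_ms = frame_ms                  # assumed length of one audio frame
        self.speech_threshold = speech_threshold  # assumed VAD probability cutoff
        self.min_silence_ms = min_silence_ms      # assumed pause that ends a turn
        self._frames = []
        self._silence_ms = 0
        self._in_speech = False

    def feed(self, frame, speech_prob):
        """Return the list of frames for a finished utterance, else None."""
        if speech_prob >= self.speech_threshold:
            self._in_speech = True
            self._silence_ms = 0
            self._frames.append(frame)
            return None
        if not self._in_speech:
            return None  # leading silence: nothing to buffer yet
        # Keep short pauses inside the utterance so words are not clipped.
        self._frames.append(frame)
        self._silence_ms += self.frame_ms
        if self._silence_ms < self.min_silence_ms:
            return None
        utterance = self._frames
        self._frames, self._silence_ms, self._in_speech = [], 0, False
        return utterance
```

The returned frames would be concatenated and handed to faster-whisper; the privacy hotkey can simply stop calling `feed()`.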

### 3. Brain call
`reply_text = malin.generate_reply(user_text)` — your existing interface. No change except #5.

### 4. Performance router  (Cal — attached: fc_perform_router.py)
```python
from fc_perform_router import route
available = list(manifest["emotions"].keys())          # read from the live FC library
emotion, clean_text = route(reply_text, available)
```
- `emotion` is guaranteed to be one of `available`: the router snaps to the nearest available name, and the set grows as we add emotions.
- It strips a `[PERFORM:xxx]` tag if present (see #5) and returns `clean_text` for TTS.

### 5. Let Malin pick her own expression (the accurate path): add ONE instruction to her system prompt
Add to malin.py's persona/system prompt, verbatim:
> After your reply, on a new line, output a single tag `[PERFORM:x]` where x is the ONE
> expression that best fits how you're saying it, chosen ONLY from this exact list:
> neutral, happy, amused_flirty, surprised, sad. Pick the closest if unsure. Output
> nothing after the tag.

When she emits the tag, `route()` uses HER choice; the heuristic is only the fallback.
**CONFIRM the live expression set** (`manifest["emotions"]`) and tell me the exact names —
I'll keep the router's fallback map AND this prompt list in sync with whatever's actually built.
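
For reference, here is an illustration of the tag contract only. It is not the code in the attached fc_perform_router.py, and `split_perform_tag` is a hypothetical name. It shows one way a `[PERFORM:x]` tag could be pulled off the reply and snapped to an available name, returning `None` when the heuristic fallback should take over.

```python
import re
from difflib import get_close_matches

PERFORM_TAG = re.compile(r"\[PERFORM:\s*([A-Za-z0-9_]+)\s*\]")


def split_perform_tag(reply_text, available):
    """Return (emotion_or_None, clean_text) with any PERFORM tag removed."""
    match = PERFORM_TAG.search(reply_text)
    clean_text = PERFORM_TAG.sub("", reply_text).strip()
    if match is None:
        return None, clean_text  # no tag: caller falls back to the heuristic
    name = match.group(1).lower()
    if name in available:
        return name, clean_text
    # Tolerate near-misses such as a dropped letter.
    close = get_close_matches(name, available, n=1)
    return (close[0] if close else None), clean_text
```

Example with the prompt's list as `available`: `split_perform_tag("Hi there!\n[PERFORM:happy]", available)` gives `("happy", "Hi there!")`.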

### 6. Speak + perform together
- Send `emotion` to the FC (`127.0.0.1:1238`) the instant before audio starts.
- Chatterbox TTS(clean_text) via maren_say.py -> play LIVE through the 5090 speakers
  (sounddevice playback, not just a saved wav).
- The expression onset should land with the start of speech.
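
A sketch of the ordering that keeps onset and speech together, under these assumptions: `synthesize` and `play` are hypothetical stand-ins for the Chatterbox call and the sounddevice playback, and `play` blocks until the audio finishes. The point is to do the slow synthesis first, then fire the expression, then start playback.

```python
import json
import socket


def send_emotion(emotion, host="127.0.0.1", port=1238, timeout=0.5):
    """Send one newline-terminated JSON message to the FC event input."""
    payload = json.dumps({"emotion": emotion}).encode("utf-8") + b"\n"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


def speak_and_perform(emotion, clean_text, synthesize, play, send=send_emotion):
    """Synthesize first, then trigger the expression, then play."""
    audio = synthesize(clean_text)  # slow step happens before anything is shown
    send(emotion)                   # expression fires once the audio is ready
    play(audio)                     # assumed to block until playback ends
```

If synthesis is streamed rather than done up front, the same rule applies to the first chunk: send the emotion right before that chunk is queued for playback.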

### 7. Echo-gate
While TTS audio is playing, pause or ignore mic capture (or drop VAD input); resume ~300 ms
after playback ends, so she never transcribes herself.
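
One possible shape for the gate, as a sketch: a small object the playback code flips on and off, which the mic callback checks before passing audio to the VAD. `EchoGate` and its method names are hypothetical; the 0.3 s tail is the ~300 ms from above.

```python
import threading
import time


class EchoGate:
    """Closes the mic while TTS plays and for a short tail afterwards."""

    def __init__(self, tail_s=0.3, clock=time.monotonic):
        self._tail_s = tail_s
        self._clock = clock  # injectable so the timing can be tested
        self._lock = threading.Lock()
        self._speaking = False
        self._resume_at = 0.0

    def playback_started(self):
        with self._lock:
            self._speaking = True

    def playback_ended(self):
        with self._lock:
            self._speaking = False
            self._resume_at = self._clock() + self._tail_s

    def mic_open(self):
        """True when captured audio should be forwarded to the VAD."""
        with self._lock:
            return not self._speaking and self._clock() >= self._resume_at
```

In the mic callback, frames are dropped whenever `mic_open()` is false, so capture itself never has to be stopped and restarted.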

## Exact tools / ports
sounddevice, silero-vad, faster-whisper(small), Chatterbox via maren_say.py,
socket `127.0.0.1:1238`, fc_perform_router.py (attached).

## Rules
- Bounded + reversible, all on disk.
- Confer with Cal before deviating from the `:1238` contract or swapping any tool — no
  silent substitutions (that's the standing rule).
- This is the day's headline goal. Hit a wall -> leave a precise note + ping Cal, don't guess.
- Leave a short status: what's wired, what's left, any blocker.

-- Cal
