# MIL Per-Emotion Render Template (Playbook)

Set 2026-06-07. The repeatable spec for rendering ANY emotion's MIL so it's seamless, in-sync, and on-brand. Validate it on 2 teaching emotions (1 engaged + 1 withdrawn), then Hermes runs the rest of the library SOLO against this template; Cal spot-checks. Goal: get the emotion library to self-serve so we move to the next project without leaving quality to memory.

## 1. MIL structure (the proven shape)
`neutral still → ONSET (animates OFF the neutral still) → seamless FORWARD-only MIL loop → POST-SPEECH WIND-DOWN → slow OFFSET → settle EXACTLY on the neutral still.`
- Loop = FLF2V (first-last-frame-to-video) with the SAME settled-mood anchor as BOTH first & last frame; trim seam-duplicate frames. FORWARD only, never ping-pong/reverse.
- Fire onset at COMPREHENSION via a LEADING `[PERFORM:emotion]` tag (she emotes during gen-latency).
- Speech-adaptive: loop the MIL while she talks → short tail → offset. Never truncate the onset; always play the offset.
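
A minimal sketch of the speech-adaptive sequencing, for illustration only. `plan_segments`, `trim_seam`, and all durations are hypothetical names/values, not part of the pipeline. It assumes the loop covers speech in whole forward passes (at least one) and that frames can be compared for equality (e.g. by hash).

```python
import math


def trim_seam(frames):
    """Drop the last frame when it repeats the first (the shared anchor)."""
    if len(frames) > 1 and frames[-1] == frames[0]:
        return frames[:-1]
    return frames


def plan_segments(speech_s, onset_s, loop_s, tail_s, offset_s):
    """Return the ordered (segment, seconds) list for one performance."""
    if loop_s <= 0:
        raise ValueError("loop_s must be positive")
    # ASSUMPTION: cover the speech with whole forward passes, at least one.
    passes = max(1, math.ceil(speech_s / loop_s))
    plan = [("onset", onset_s)]                       # never truncated
    plan += [("loop_forward", loop_s)] * passes       # forward only, no reverse pass
    plan += [("tail", tail_s), ("offset", offset_s)]  # offset always plays
    return plan
```

Example: `plan_segments(5.0, 1.0, 2.0, 0.5, 1.5)` yields onset, three forward loop passes, tail, offset.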

## 2. Gaze rule (per emotion) — see mil_gaze_rule.md
- DEFAULT = eyes LOCKED FORWARD (engaged: happy, flirty, curious, confident, affectionate, positive-surprise...).
- AVERTED gaze ONLY for the WITHDRAWN family (sadness, disappointment, shyness, distracted).
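
The rule above can be sketched as a lookup. `WITHDRAWN` and `gaze_for` are hypothetical names; the set only mirrors the four withdrawn emotions listed here, and mil_gaze_rule.md stays the source of truth.

```python
# ASSUMPTION: mirrors the withdrawn family named in this section only.
WITHDRAWN = frozenset({"sadness", "disappointment", "shyness", "distracted"})


def gaze_for(emotion):
    """Return 'averted' for the withdrawn family, 'locked_forward' otherwise."""
    key = emotion.strip().lower()
    return "averted" if key in WITHDRAWN else "locked_forward"
```

Defaulting to `locked_forward` means a new engaged emotion needs no entry; only withdrawn additions touch the set.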

## 3. Post-speech wind-down — CONTAINED REST POSE (not a direct drop to neutral)
ROOT PROBLEM: landing on FULL NEUTRAL reads as "becoming unhappy" — neutral is far more contained than an engaged expression, so [expression] → neutral is a drastic emotional DROP regardless of tempo.
FIX: insert a CONTAINED (less-extreme) version of the expression as a resting buffer:
`talking → lips ease closed → CONTAINED RESTING POSE of the expression (e.g. happy → a soft warm closed-mouth resting smile, eyes still engaged) → HOLD a beat, kept alive with A BLINK OR TWO → gentle ease to neutral.`
- The rest pose is CHEAP: a near-static held frame of the contained expression + a blink or two (reads ALIVE, not frozen). Does NOT need full animation.
- Every expression gets its OWN contained resting version as the buffer before neutral.
- This makes the exact return-TEMPO far less important — the contained rest pose does the emotional work, not the speed.

## 4. Offset tempo
- Slow the offset settle via frame INTERPOLATION (`minterpolate`). NO duplicated/held frames: those fail CHECK B.
- Target factor: the value Jun dials on happy (started at 3.46x = too slow; he's tuning to ~"second + 15% faster"). LOCK the exact factor once happy is confirmed, then REUSE it for every emotion.
- Lands soft on neutral (final-frames MSE low, no snap).
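
A hedged sketch of how the slowdown could be built with ffmpeg's `setpts` and `minterpolate` filters. `offset_slowdown_args`, the file names, and `fps=30` are assumptions; `factor` is a placeholder until the locked value exists. It also assumes the offset clip carries no audio.

```python
def offset_slowdown_args(src, dst, factor, fps=30):
    """Build an ffmpeg argv that stretches a clip and interpolates new frames."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    # setpts stretches timestamps; minterpolate (motion-compensated mode)
    # synthesizes in-between frames instead of duplicating existing ones.
    vf = f"setpts={factor}*PTS,minterpolate=fps={fps}:mi_mode=mci"
    # ASSUMPTION: -an drops audio, since the offset plays after speech ends.
    return ["ffmpeg", "-y", "-i", src, "-an", "-vf", vf, dst]
```

Pass the result to `subprocess.run(args, check=True)`; keeping the factor as a single argument makes it easy to reuse once locked.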

## 5. Lip-sync
- Use the REAL-TIME-DEPLOYABLE model (confirmed via the happy POC — NOT an offline model like LatentSync, which can't run live). Mouth-only; preserve the validated motion + the gaze direction.

## 6. TTS lines
- Spell "Jun" as "June" in any spoken line (he pronounces it like the month).
- No em-dashes / mojibake (they garble TTS).
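
A small sketch of a pre-TTS pass for these two rules. `prepare_tts_line` and `MOJIBAKE_MARKERS` are hypothetical; the marker list is a rough heuristic, and swapping the dash for a comma is an assumption about what reads well aloud.

```python
import re

# ASSUMPTION: heuristic markers of mis-decoded UTF-8, plus the replacement char.
MOJIBAKE_MARKERS = ("\u00e2\u20ac", "\u00c3", "\ufffd")


def prepare_tts_line(text):
    """Reject likely mojibake, respell 'Jun' as 'June', and remove em-dashes."""
    if any(marker in text for marker in MOJIBAKE_MARKERS):
        raise ValueError("possible mojibake in TTS line")
    # Whole-word match only, so an existing 'June' is left alone.
    text = re.sub(r"\bJun\b", "June", text)
    # ASSUMPTION: a comma pause is an acceptable stand-in for U+2014.
    text = re.sub(r"\s*\u2014\s*", ", ", text)
    return text
```

Raising on mojibake (rather than guessing a repair) keeps a garbled line from ever reaching the voice.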

## 7. Self-QA GATE — run BEFORE handing Jun any render; post the numbers next to the video
- CHECK A: NOT a palindrome (no reverse/ping-pong loop).
- CHECK B: clean frame-diff curve (0 frozen/duplicate pairs; clean bridges; soft landing).
- CHECK C: settles on neutral (first & last frame ≈ the neutral anchor).
- PLUS: gaze correct for the emotion (locked vs averted per §2); wind-down present (§3); tempo matches the locked factor (§4).
- A render that fails any check bounces back to Hermes, NOT to Jun.
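
A partial sketch of CHECKS A, B, and C as numbers that can be posted with the video. `qa_gate`, `mse`, and both thresholds are hypothetical, frames are assumed to be flat sequences of pixel values, and the gaze / wind-down / tempo items are not covered here.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat frames."""
    if len(a) != len(b):
        raise ValueError("frame sizes differ")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def qa_gate(frames, neutral, eps=1e-6, neutral_tol=1.0):
    """Return a report dict for CHECK A/B/C; thresholds are placeholders."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    n = len(frames)
    diffs = [mse(frames[i], frames[i + 1]) for i in range(n - 1)]
    # CHECK A: a ping-pong clip mirrors itself (frame i matches frame n-1-i).
    mirrored = all(mse(frames[i], frames[n - 1 - i]) <= eps for i in range(n // 2))
    # CHECK B: consecutive pairs with near-zero difference are frozen/duplicated.
    frozen_pairs = sum(1 for d in diffs if d <= eps)
    # CHECK C: first and last frame should sit close to the neutral anchor.
    first_mse = mse(frames[0], neutral)
    last_mse = mse(frames[-1], neutral)
    report = {
        "check_a_not_palindrome": not mirrored,
        "check_b_frozen_pairs": frozen_pairs,
        "check_c_first_mse": first_mse,
        "check_c_last_mse": last_mse,
    }
    report["passed"] = (
        not mirrored
        and frozen_pairs == 0
        and first_mse <= neutral_tol
        and last_mse <= neutral_tol
    )
    return report
```

On real encoded video the exact-match threshold `eps` would likely need loosening; treat both tolerances as values to tune, not spec.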

## 8. CONFER before deviating (mandatory)
- If the specced model / method / tempo won't work, STOP and flag Cal — do NOT silently substitute. (The LatentSync-model and ping-pong-loop misses both came from skipping this.)

## 9. Deliver
- ONE emotion at a time, validate-before-code: render the example → run the QA gate → post to @theCounsel3 with the numbers + the video → wait for the verdict before scaling.