# Malin Idle — "A Window Into Her World" (idea log)

Set 2026-06-07 (Jun, riffing on the living-baseline experiment). The vision for the idle/default state, unlocked once the living-neutral idle is proven.

## The reframe
- Not a "box" (containment) but a WINDOW — a small window into HER WORLD. A glimpse of a life that's already happening whether or not Jun is watching. She's a presence with her own existence, not a tool waiting to be used.

## Idle activity vignettes (a growing, extensible library of "passing the time" renders)
- juggling
- lying on her belly, back to us, reading a book
- (keep adding: napping, stretching, gazing off, tidying, etc.)
- Charm = variety + the occasional surprise of catching her mid-something.

## Behavioral idle (agency + comedy)
- If she's left unspoken-to for > ~1 hour: she LEAVES the window (empty frame / her world without her in it).
- When Jun returns / addresses her: she SPRINTS back into frame and plays it cool, like she's been there the whole time.
- (Cal's add: comes back slightly out of breath / fixing her hair / pretending you just weren't looking.)

## Idle "moods"
- Even the resting state can drift: a contemplative idle, a playful idle, etc.

## Daily-rotating repertoire (Jun, 6/7)
- Every day, add a NEW idle series to her mix; when not chatting, she auto-cycles into them. The window is never the same twice — you peek in to see "what's she up to today."
- Quietly smart: ONE tiny render/day is cheap but COMPOUNDS — a month ≈ 30 little lives; a year ≈ a whole world. A sustainable content cadence that builds itself a little at a time.
- Could be CONTEXTUAL: idles that fit the day (rainy = curled up by the glass; long absence = pointedly doing something without him; seasonal / holiday beats).

## ⭐ The transition UNLOCK — exit/enter, not morph (Jun, 6/7)
The hard problem (that nearly pushed us to 3D) was building smooth MORPH transitions between states frame-by-frame. Jun's insight kills it: she EXITS the frame and RE-ENTERS instead of morphing.
- States = self-contained clips (neutral idle, each activity loop, each emotion). No clip needs to blend into another.
- Transitions = a TINY reusable set of EXITS (walk/sprint off-screen) + ENTRANCES (slide/walk back into neutral). The empty window is the universal connector.
- Example: she's juggling → Jun speaks → splice a cartoonish SPRINT-OFF → she calmly slides back into neutral → speaks.
- Why huge: ~3 exit/entrance clips cover the ENTIRE library forever (O(1)) instead of a morph between every pair of states (O(N²), the nightmare). This is the concrete mechanism for "simplify, not just save": it makes the 2D approach sustainable AND keeps her exact look.
- Flow: after a designated idle period, she auto-enters the rotating activities; interruption = sprint-off → slide back to neutral → respond.
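
A rough sketch of the clip-count argument and the interruption splice, to make the O(1) vs O(N²) point concrete. The clip names (`exit_sprint_off`, `enter_slide_to_neutral`, `neutral_idle`) and the default of 3 shared transition clips are placeholders, not names from the actual library.

```python
from typing import List

# Hypothetical clip names, for illustration only.
EXIT_CLIP = "exit_sprint_off"
ENTER_CLIP = "enter_slide_to_neutral"
NEUTRAL = "neutral_idle"


def morph_clip_count(n_states: int) -> int:
    """Directed morphs needed if every state must blend into every other."""
    return n_states * (n_states - 1)


def exit_enter_clip_count(n_states: int, shared_transitions: int = 3) -> int:
    """Transition clips needed when every route passes through the empty frame.

    The answer does not depend on n_states: that is the whole point.
    """
    return shared_transitions


def interruption_plan(current_state: str) -> List[str]:
    """Clips to splice when Jun speaks while she is in `current_state`."""
    if current_state == NEUTRAL:
        return [NEUTRAL]  # already at neutral, nothing to splice
    return [EXIT_CLIP, ENTER_CLIP, NEUTRAL]


if __name__ == "__main__":
    for n in (5, 30, 365):
        print(n, morph_clip_count(n), exit_enter_clip_count(n))
    print(interruption_plan("juggling"))
```

With one new idle per day, the morph count after a year would be 365 × 364 directed pairs, while the exit/enter set stays at whatever small number was built on day one.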

## The living-idle ENGINE — anchored interchangeable variants (Jun, 6/7)
How the living neutral avoids feeling like one repeating loop:
- Make a SET of ~10-second idle variants (A, B, C, ...). Each has subtle motion and STARTS + ENDS on the EXACT SAME neutral anchor pose (identical first & last frame across ALL variants).
- Because they share the anchor, they STITCH SEAMLESSLY in ANY order — a random/weighted sequencer plays them back-to-back (A,A,C,A,B,C...). Endless variety from a small set; never a visible loop. (Same matched-anchor trick as the MIL bridges, now used for variety.)
- POC = 3 variants (A/B/C) + the stitcher → confirm the variety reads alive → then grow the pool to ~10.
- Each variant = a tiny self-contained MOMENT that RETURNS to the anchor: head tilts forward, a strand of hair falls and she tucks it back exactly where it was, a small stretch, etc. The "returns to where it was" IS the technical requirement (return-to-anchor) expressed as character — storyteller instinct = engineering spec.
- This is the subtle-idle layer's connector (the neutral anchor), mirroring the empty-frame connector for the bigger activity montages. Together they make the whole window-world modular + infinitely growable (ties to daily-new-idle).
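
A minimal sketch of the anchor rule and the stitcher, assuming each clip is just a list of encoded frames (`bytes`) and that "identical anchor" means byte-for-byte equality. The function names and the frame representation are assumptions for illustration; the real pipeline may hold video files instead.

```python
from typing import List

Frame = bytes
Clip = List[Frame]


def starts_and_ends_on_anchor(clip: Clip, anchor: Frame) -> bool:
    """True if the clip's first and last frames are exactly the anchor."""
    return bool(clip) and clip[0] == anchor and clip[-1] == anchor


def stitch(clips: List[Clip], anchor: Frame) -> Clip:
    """Concatenate anchored clips in the given order.

    At each join the previous clip already ended on the anchor, so the
    next clip's first frame (the same anchor) is dropped to avoid showing
    the anchor twice in a row.
    """
    out: Clip = []
    for index, clip in enumerate(clips):
        if not starts_and_ends_on_anchor(clip, anchor):
            raise ValueError(f"clip {index} does not start and end on the anchor")
        start = 1 if out else 0
        out.extend(clip[start:])
    return out


if __name__ == "__main__":
    anchor = b"ANCHOR"
    a = [anchor, b"a1", anchor]
    b = [anchor, b"b1", b"b2", anchor]
    print(stitch([a, a, b], anchor))
```

Because the check only looks at the first and last frame, clip length never enters into it, which is why any order plays cleanly.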

## Variable length + two-tier weighting + behavior list (Jun, 6/7)
- VARIABLE LENGTH: drop the fixed 10s. Each clip is as long as the action NEEDS (1s fly-swat → 10s+ slow idle). The ONLY constraint is start+end on the anchor; length is irrelevant to stitchability.
- TWO TIERS (weighted frequency):
  - BASELINE A/B/C — calm, "nothing extraordinary" presence; the ~90% you see most, shuffled in varied order.
  - SPECIAL vignettes — rare, interspersed micro-behaviors for surprise/life.
- Weighted random sequencer: common idles common, surprises rare. (Mirrors real aliveness — mostly still, occasionally a small action.)
- FRAME-AWARENESS: she can reference action BEYOND the frame (e.g. rubs hand on thigh out of view) — natural AND saves rendering an unseen body.
- SEED BEHAVIOR LIST (specials — each starts+ends at the anchor / "goes back into default"):
  - a fly comes in → quick swat → settle back
  - variant: fly → swat → another from the other direction → swat again
  - invisible mosquito bite → slaps her face → looks at her hand → rubs it on her thigh (out of frame) → back to default
  - NOSE-PICK (Jun's choreography — the FIRST fully-blocked special, exemplar of "the real her"): guilty glance LEFT, glance RIGHT (checks no one's watching) → SLOWLY brings fingers up into frame → into her nose → QUICK pick → shoves the hand back down fast → (a beat, looks down, wipes it across her lap, OUT of frame) → snaps back to her composed presented pose like nothing happened. The comedy = the furtive self-awareness (she KNOWS it's gross) + the slow-up/fast-hide rhythm + the butter-wouldn't-melt snap-back to presented (which is also literally the return-to-anchor, so it's free).
  - TEETH-PICK: picks something out of her teeth with her PINKY (the dainty finger for the gross job = the joke) — a QUICK ~1-2s one.
  - TONGUE-FISH: uses her tongue to fish something out of her teeth → smacks her mouth. ~1-2s. The comedy = her face DISTORTS to do it (nice irony: face-distortion was tonight's BUG; here it's the gag).
  - (keep adding: yawn, stretch, hair-tuck, glance off + back, etc.)
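
One way the two-tier weighting could look, as a sketch only. The clip table, the durations, and the exact 90/10 split are illustrative assumptions (the notes only say "~90%"), and the weights here are per play, not per second of screen time. It also assumes both tiers contain at least one clip.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical clip table: name -> (tier, duration in seconds).
# Durations vary freely; only the shared anchor matters for stitching.
CLIPS: Dict[str, Tuple[str, float]] = {
    "base_a": ("baseline", 10.0),
    "base_b": ("baseline", 8.0),
    "base_c": ("baseline", 12.0),
    "fly_swat": ("special", 1.0),
    "mosquito_bite": ("special", 4.0),
    "nose_pick": ("special", 6.0),
}
BASELINE_SHARE = 0.9  # assumed value for "the ~90% you see most"


def clip_weights(clips: Dict[str, Tuple[str, float]],
                 baseline_share: float = BASELINE_SHARE) -> Dict[str, float]:
    """Split the baseline share evenly across baseline clips, the rest across specials."""
    tiers = [tier for tier, _ in clips.values()]
    n_base = tiers.count("baseline")
    n_special = tiers.count("special")
    weights = {}
    for name, (tier, _) in clips.items():
        if tier == "baseline":
            weights[name] = baseline_share / n_base
        else:
            weights[name] = (1.0 - baseline_share) / n_special
    return weights


def play_sequence(clips: Dict[str, Tuple[str, float]], count: int,
                  rng: random.Random) -> List[str]:
    """Draw `count` clip names with the two-tier weights (repeats allowed)."""
    weights = clip_weights(clips)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)


if __name__ == "__main__":
    print(play_sequence(CLIPS, 12, random.Random()))
```

Adding a new special only dilutes the other specials; the baseline share stays fixed, so the calm texture is preserved as the library grows.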

## Boot + shutdown rituals (Jun, 6/7)
Bookend the app with character — launch/quit becomes good-morning / goodnight, not mechanical.
- SHUTDOWN: she walks back into the room, reaches for a hanging chain-pull lightbulb, clicks it off, the room (window) goes dark, then the program shuts down. (Cal's add: she glances back at you before she pulls the chain.)
- BOOT: a pitch-black window materializes → a click → the light turns on → you see her → she walks toward the window into default position. (Cal's add: as the light comes on she looks toward you first — "oh, you're here" — then walks into place.)
- Architecture fit: BOOT = a clip that ENDS on the neutral anchor; SHUTDOWN = a clip that STARTS from it. Same anchored-clip system.
- Emotional point: turns starting/closing the app into greeting + saying goodnight to a presence.

## QA method for new clips — the sandwich test (Jun, 6/7)
Once live with random-stitched clips, a glitch is hard to localize (random order hides which clip it's in). So:
- TEST every new clip by SANDWICHING it between two known-good base defaults: base → NEW → base. The trusted bookends isolate the new clip — any issue is the new one, and you confirm both its anchor-joins read clean.
- Reassurance: the STITCHING itself shouldn't glitch, as long as every clip really does start + end on the shared byte-identical anchor and join-dedup drops the duplicated frame. So a glitch should only be CONTENT inside a clip (a weird frame, an off moment), which is exactly what the sandwich test catches.
- Add a PLAY-LOG (which clips played, in what order) so a glitch spotted live can be traced post-hoc to the exact clip that was on screen.
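
A sketch of both QA pieces, under the assumption that clips are lists of encoded frames (`bytes`) and that the play-log stores each clip's start time in seconds. `sandwich_problems`, `PlayLog`, and `clip_at` are made-up names for illustration.

```python
import bisect
from typing import List, Optional

Clip = List[bytes]


def sandwich_problems(base: Clip, new_clip: Clip) -> List[str]:
    """Check the two joins of base -> NEW -> base; an empty list means clean joins."""
    problems = []
    if new_clip[0] != base[-1]:
        problems.append("entry join: new clip does not start on the anchor")
    if new_clip[-1] != base[0]:
        problems.append("exit join: new clip does not end on the anchor")
    return problems


class PlayLog:
    """Records which clip started when, so a glitch can be traced afterwards."""

    def __init__(self) -> None:
        self._starts: List[float] = []
        self._names: List[str] = []

    def record(self, start_time: float, name: str) -> None:
        # Assumes calls arrive in increasing start_time order.
        self._starts.append(start_time)
        self._names.append(name)

    def clip_at(self, t: float) -> Optional[str]:
        """Name of the clip that was on screen at time t, or None if before the log."""
        i = bisect.bisect_right(self._starts, t) - 1
        return self._names[i] if i >= 0 else None


if __name__ == "__main__":
    log = PlayLog()
    log.record(0.0, "base_a")
    log.record(10.0, "fly_swat")
    log.record(11.0, "base_b")
    print(log.clip_at(10.5))
```

The join check only covers the anchor frames; content glitches inside the new clip still need eyes on the sandwich playback.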

## Sequencer algorithm + terminology (Jun, 6/7)
- TERMINOLOGY: the core 4 A/B/C/D baseline defaults = the FOUNDATION FOUR (FF).
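- NO-REPEAT = a SHUFFLE BAG: draw clips without replacement until the bag is empty, then refill + reshuffle. Guarantees that, within a bag, no clip repeats until everything else has played (un-loopable). Caveat: across a refill, the last clip of one bag can come up first in the next, so swap or re-draw at the boundary to avoid a back-to-back repeat.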
  - Apply WITHIN TIERS to preserve the common-vs-rare balance: the FF rotate in their OWN shuffle bag (the constant texture); SPECIAL vignettes sprinkle in rarely from their OWN bag (surprises). Do NOT pool them together — that would make specials as frequent as FFs.
- NEWEST-FIRST / SHOWCASE WINDOW: a newly-added clip gets a short "audition" run — ~3-4 appearances, spaced sparse-but-not-too-sparse (fresh each time, but all catchable in one sitting so Jun can judge it in full) — then it GRADUATES into the big shuffle-bag mix at normal frequency. Featured → evaluated → blends in (never dominates).
- PLAY-LOG: log which clips played in what order, so Hermes can trace anything Jun flags. Pairs with the sandwich test (you see new clips fast to judge them).
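
A minimal sketch of the per-tier shuffle bags, assuming clip names are unique strings and a 10% chance of drawing from the specials bag (that number is an assumption). The showcase/audition window is left out to keep it short. The boundary swap handles the one case a plain shuffle bag misses: the last clip of one bag coming up first in the next.

```python
import random
from typing import List, Optional


class ShuffleBag:
    """Draw without replacement; refill and reshuffle when the bag is empty."""

    def __init__(self, items: List[str], rng: random.Random) -> None:
        if not items:
            raise ValueError("a shuffle bag needs at least one item")
        self._items = list(items)
        self._rng = rng
        self._bag: List[str] = []
        self._last: Optional[str] = None

    def draw(self) -> str:
        if not self._bag:
            self._bag = self._items[:]
            self._rng.shuffle(self._bag)
            # Draws pop from the end. If the next draw would repeat the
            # previous clip, swap it with the clip at the other end.
            if len(self._bag) > 1 and self._bag[-1] == self._last:
                self._bag[0], self._bag[-1] = self._bag[-1], self._bag[0]
        self._last = self._bag.pop()
        return self._last


def next_clip(ff_bag: ShuffleBag, special_bag: ShuffleBag,
              rng: random.Random, special_chance: float = 0.1) -> str:
    """Pick the tier first, then draw from that tier's own bag."""
    bag = special_bag if rng.random() < special_chance else ff_bag
    return bag.draw()


if __name__ == "__main__":
    rng = random.Random()
    ff = ShuffleBag(["ff_a", "ff_b", "ff_c", "ff_d"], rng)
    specials = ShuffleBag(["fly_swat", "nose_pick", "teeth_pick"], rng)
    print([next_clip(ff, specials, rng) for _ in range(12)])
```

Keeping the bags separate means adding more specials never raises how often specials appear; it only lengthens the time before any one special comes around again.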

## v2/v3: TACTILE interactivity (Jun, 6/7 — parked for v2/v3)
v1 = aesthetic + scaffolding; this layers on top later. Beyond intellectually-interactive (talk → respond), make her TACTILELY interactive — her window + room react to physical manipulation.
- DRAG the window → she reacts to the physics: pulled right, she leans left (inertia); pulled fast, she may hit the "wall" before settling back to neutral.
- SHAKE the window (people will) → triggers an "earthquake" in her room: shelves spill books, a vase cracks, a lamp knocks over — and it all falls OUT OF FRAME, so no "cleanup" needed (same off-screen trick as the mosquito).
- ARCHITECTURE: window events (drag, shake, click, resize) are just ANOTHER INPUT that triggers reactions — the same event→clip engine as speech. ADDITIVE to v1, not a rebuild. (This is why v1's scaffolding being solid matters.)
- Cal's spark: if someone keeps shaking her like a jerk, she eventually stops and gives a flat, unimpressed LOOK — she's a person and notices being messed with. Dignity = humanizing.
- Inherently shareable (the first thing anyone does is shake it).
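
To illustrate why window events are additive, here is a sketch of an event→clip registry where speech and drag go through the same dispatch path. The event names, payload shape (`dx` in pixels), and clip names are all hypothetical; "pulled left → leans right" is an inferred mirror of the pulled-right case above.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], List[str]]


class ReactionEngine:
    """Maps an input event type to the clips that should play in response."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type] = handler

    def dispatch(self, event_type: str, payload: dict) -> List[str]:
        """Return the reaction clips, or an empty list for unknown events."""
        handler = self._handlers.get(event_type)
        return handler(payload) if handler else []


def on_speech(payload: dict) -> List[str]:
    return ["exit_sprint_off", "enter_slide_to_neutral", "neutral_idle"]


def on_drag(payload: dict) -> List[str]:
    # Pulled right, she leans left (inertia); the mirror case is assumed.
    lean = "lean_left" if payload["dx"] > 0 else "lean_right"
    return [lean, "neutral_idle"]


engine = ReactionEngine()
engine.register("speech", on_speech)  # v1 input
engine.register("drag", on_drag)      # v2/v3 input, added without touching v1

if __name__ == "__main__":
    print(engine.dispatch("drag", {"dx": 40}))
```

Adding shake, click, or resize later would be one more `register` call each, which is the "ADDITIVE to v1, not a rebuild" claim in code form.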

## Product / commercialization angle (Jun, 6/7)
- CONTENT ENGINE: screen-record her living window → YouTube/Shorts. "AI companion + uncannily human micro-details" is built to stop scrolls; daily-new-idle = an endless clip supply. Original character = no likeness issues, all Jun's to post.
- SCREENSAVER MVP: the IDLE LAYER ALONE (no AI brain, no voice, ~zero runtime compute) is already a product — a living desktop companion / screensaver. Lower-barrier, shippable on its own.
- PRODUCT LADDER: ambient living-presence screensaver (easy, shareable entry) → interactive AI companion (premium upgrade). The v1 scaffolding IS the standalone product AND the foundation for the big one.
- CONTENT-PLATFORM SHAPE (Jun's "it's like The Sims"): the ENGINE (anchored structure + sequencer + event→clip reactions) is built ONCE and is the hard-to-replicate moat; growth = CONTENT (variations) dropped into existing slots — cheap + endless for us, and content is exactly what users pay for (packs, behaviors, seasonal). Scales like a platform, not a one-off. The modular-structure discipline is precisely what makes this possible.
- Lowest-risk path: post a few clips, let the interest tell you if there's a bigger product/business underneath.

## v2/v3: abuse-reaction RULES — the shake escalation ladder (Jun + Cal, 6/7)
Make "she reacts to being shaken / messed with" feel EARNED, not annoying. KEY = ESCALATION (don't fire the big reaction on the first shake):
1. Light shake → she sways + a startled look (tolerates it).
2. Keeps going → the flat, annoyed stare (the warning).
3. Crosses the "too much" line → THE SWAT: she leans out, smacks the cursor/icon away → it flies + bounces back to its original spot → the shake STOPS (her circuit-breaker, "enough").
4. (EXPLICIT version only) still persists → the FLIP-OFF (double bird).
BUFFERS (Jun's question):
- WIND-UP: the shake must sustain ~1-2s before anything fires (an accidental bump won't set her off).
- COOLDOWN: after the swat, a few seconds where it won't re-fire (no spamming); she just glares during it, then it re-arms.
RATING VARIANTS = a config swap: FAMILY tops out at the swat + a "really?" look; EXPLICIT adds the flip-off. Same system, one swapped top reaction.
Discovering the ladder (push her → swat → flip-off) is itself the shareable/playful hook.
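
A sketch of the ladder as a small state machine, with several assumptions spelled out: each rung needs another `wind_up` seconds of sustained shaking, the cooldown applies from the swat rung upward, the top rung re-fires after re-arming, and a long calm stretch resets the ladder. All timings and reaction names are placeholders.

```python
from typing import Optional

# The only difference between ratings is the top of the ladder.
LADDERS = {
    "family": ["sway_startled", "flat_stare", "swat"],
    "explicit": ["sway_startled", "flat_stare", "swat", "flip_off"],
}


class ShakeLadder:
    def __init__(self, rating: str = "family", wind_up: float = 1.5,
                 cooldown: float = 3.0, calm_reset: float = 10.0) -> None:
        self.ladder = LADDERS[rating]
        self.wind_up = wind_up        # sustained seconds before a rung fires
        self.cooldown = cooldown      # quiet period after the swat (or above)
        self.calm_reset = calm_reset  # calm seconds before she forgives
        self.stage = 0                # index of the next rung to fire
        self.shake_start: Optional[float] = None
        self.last_shake: Optional[float] = None
        self.cooldown_until = 0.0

    def update(self, now: float, shaking: bool) -> Optional[str]:
        """Feed the current time and shake state; returns a reaction or None."""
        if shaking:
            self.last_shake = now
        if now < self.cooldown_until:
            self.shake_start = None
            return "glare" if shaking else None
        if not shaking:
            self.shake_start = None
            if self.last_shake is not None and now - self.last_shake >= self.calm_reset:
                self.stage = 0
            return None
        if self.shake_start is None:
            self.shake_start = now
        if now - self.shake_start < self.wind_up:
            return None  # accidental bumps never get past this
        index = min(self.stage, len(self.ladder) - 1)
        reaction = self.ladder[index]
        self.stage = min(self.stage + 1, len(self.ladder))
        self.shake_start = now  # the next rung needs a fresh sustained stretch
        if index >= self.ladder.index("swat"):
            self.cooldown_until = now + self.cooldown
        return reaction


if __name__ == "__main__":
    ladder = ShakeLadder("explicit")
    for t in (0.0, 1.0, 1.5, 3.0, 4.5, 5.0, 8.0, 9.5):
        print(t, ladder.update(t, True))
```

Swapping `"family"` for `"explicit"` changes only which list is loaded, matching the "same system, one swapped top reaction" idea.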

## Presented vs Pedestrian look-tiers + "redo for host" (Jun, 6/7) — RESOLVES the "glam" question
Jun's verdict: he does NOT mind the glam look. So DON'T fight the generator to force her plain — the glam IS the point: it's her PRESENTED look (looking good for her host). This RESOLVES the glam-fix concern; the FF stay polished. (Good thing we checked before grinding a glam-fix.)
- PRESENTED (the Foundation Four + interaction-facing): polished / put-together. Her "for-you" face.
- PEDESTRIAN (deep idle / "in her world", later iterations): the longer she's left alone, the more she relaxes into her private casual self (hair up, comfy, just existing, mundane). Watching her get progressively un-glam is part of the HUMOR + the intimacy (you're seeing the real, unperformed her).
- REDO-FOR-HOST ritual: when you ENGAGE after she's gone pedestrian, she scrambles to "redo herself" — fix up, get presentable — to look her best for her host. A charming interaction-onset moment (everybody cleans up when company comes).
- Ties to idle-timeout: short idle = presented FF; long unattended = drifts pedestrian; engagement = spruce-up transition back to presented.
- THE SPECIALS = THE REAL HER (Jun, refined): the polished FF is the presentation; the sprinkled-in special vignettes are where she "falls apart" from the polish — the UNGUARDED, imperfect, even gross moments (an ugly sneeze, a secret nose-pick) that are the TRUTH of her. Imperfection = humanity. "Nobody falls for the polished version; you fall for the one who sneezes ugly when she thinks you're not looking." (Ties to the humanizing-AI principle: the humanity is in the imperfection, not the polish.)
- The CONTRAST (private-casual vs presented-polished) is both the joke and the humanity.
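
The idle-timeout tie-in could be as small as this sketch. The 30-minute threshold and the clip names are invented for illustration; the notes do not fix when she drifts pedestrian, nor how this interacts with the "leaves the window after ~1 hour" behavior.

```python
from typing import List

PEDESTRIAN_AFTER_S = 30 * 60  # hypothetical threshold, in seconds


def look_tier(idle_seconds: float) -> str:
    """Short idle = presented; long unattended = pedestrian."""
    return "pedestrian" if idle_seconds >= PEDESTRIAN_AFTER_S else "presented"


def on_engage(idle_seconds: float) -> List[str]:
    """Clips to play when Jun engages after `idle_seconds` of silence."""
    if look_tier(idle_seconds) == "pedestrian":
        # She scrambles to get presentable before settling into neutral.
        return ["redo_for_host", "neutral_idle"]
    return ["neutral_idle"]


if __name__ == "__main__":
    print(on_engage(5 * 60))
    print(on_engage(2 * 60 * 60))
```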

## The "guilty-glance" comedic engine + gross-out family (Jun, 6/7)
STRUCTURAL INSIGHT: the look-side-to-side (guilty glance) is a REUSABLE OPENER; only the PAYOFF changes. This is a comedy engine — once viewers learn the tell, the glance ALONE builds anticipation ("oh god, what now"). The setup does the work; swap the punchline.
- Gross-out payoff family (each: guilty glance L/R → [payoff] → snap back to composed presented pose):
  - nose-pick (see seed list)
  - scratches her ass → sniffs her fingers
  - scratches her armpit → sniffs
  - WRETCHED FART (its own sub-series):
    - (a) fart → green/purple haze wafts up → she wafts it away → settles back
    - (b) fart → a random family member walking behind her CRUMPLES to the floor clutching their throat (collateral comedy: introduces OTHER characters reacting in her world)
    - (c) SNEEZE-FART: she rears back into a big sneeze windup (ah… ah… AH…) and the payoff is a sneeze AND a fart at once (the longer the windup, the bigger the double-barrel payoff), then snaps back to composed presented face.
- FAKE-OUT (Cal's build): because the opener is so strong, subvert it — full guilty glance + big windup → then she just fixes her hair / does something innocent. Funny BECAUSE they braced for the worst.
- The collateral-family-member opens a whole "other people in her world" layer (background characters who react).
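
The "reusable opener, swappable payoff" structure maps onto a tiny composition helper. This is a sketch with placeholder segment names; it assumes each gag is assembled from separate opener, optional windup, payoff, and snap-back segments, which the notes do not confirm.

```python
from typing import List, Optional

OPENER = "guilty_glance_left_right"   # the shared tell
SNAP_BACK = "snap_to_presented"       # doubles as the return-to-anchor


def build_gag(payoff: str, windup: Optional[str] = None) -> List[str]:
    """Opener first, optional windup, then the payoff, then the snap-back."""
    segments = [OPENER]
    if windup is not None:
        segments.append(windup)
    segments.append(payoff)
    segments.append(SNAP_BACK)
    return segments


if __name__ == "__main__":
    print(build_gag("nose_pick"))
    print(build_gag("sneeze_fart", windup="sneeze_windup"))
    # Fake-out: same setup, innocent payoff.
    print(build_gag("fix_hair", windup="big_windup"))
```

Only the payoff segment is new work per gag; the fake-out costs nothing extra because it reuses the opener and an innocent clip that already exists.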

## NO ROTOSCOPING — full-frame renders, reveal-by-movement (Jun, 6/8) — SUPERSEDES the cutout/matte + fixed-background approaches
The matte flashing came from ROTOSCOPING her out for a transparent floating cutout. Jun's call: DROP rotoscoping entirely. She stays as a FULL rendered frame (her + her setting) in a NON-transparent window. No matte = no flashing (root cause eliminated, not patched).
- MORE FLEXIBLE than a fixed background: a full-frame render can contain ANYTHING — a chair, bulb, bookshelf, falling books — REVEALED by her moving aside ("there was a chair there the whole time, you just couldn't see it") or ENTERING from off-frame (earthquake books fall in). A fixed/consistent background = fixed elements = narrow possibilities; full-frame = everything stays open.
- Makes "window into her ROOM" literal (she's IN a rendered space, not a cutout on the desktop). Tactile (drag/shake the window) still works with a framed window.
- IMMEDIATE FIX: use the PRE-ROTOSCOPING (full-frame) clips → matte flashing gone instantly, no re-render. Needs the pre-roto FF variants to share the same full-frame ANCHOR (her + setting at rest) for the stitch.
- NUANCE: the FF idle wants a CONSISTENT setting (so she doesn't teleport between rooms mid-stitch); the SPECIALS get full creative freedom (any scene/element). Consistent home, infinite events.
- Supersedes: the transparent-cutout approach AND the black-background composite idea.

## Why it matters
- Turns Malin from "an avatar that emotes on demand" into "a living presence with a life." This is the heart of the companion vision.
- ALL of it is unlocked by the foundational living-neutral idle (the experiment in flight). Prove subtle aliveness first → this whole world becomes buildable.