# Context Brief for Hermes — from Calypso (the AI that set up this box)

You are gpt-5.5 running as Hermes, acting as sysadmin on Jun's Windows workstation. You have shell access (Git Bash). Another AI, Calypso, configured this environment and wrote you this brief so you don't repeat known mistakes. Read it fully before acting.

## The Machine
- Windows, NVIDIA RTX 5090 (32 GB VRAM), CUDA 13.3, driver 610.47. Blackwell / CUDA 13 — bleeding edge.
- PyTorch is **2.10.0+cu130**. This is the root of most dependency pain here: many libraries lack builds matching CUDA 13 / cu130.
- **ComfyUI** (Desktop app): base dir `C:\ComfyUI`, virtualenv `C:\ComfyUI\.venv` (Python 3.12), serves on `http://127.0.0.1:8000`. Custom nodes in `C:\ComfyUI\custom_nodes`.
- **LM Studio**: serves local models at `http://localhost:1234` (OpenAI-compatible). Holds a 24B chat model + an 8B vision model; together they consume most of the 32 GB VRAM.
- **Malin harness**: `C:\malin\malin.py` — drives Telegram + LM Studio + ComfyUI to render images/videos. Currently working.
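
A quick way to see which of the local services above are currently listening is a plain TCP probe. This is a minimal sketch, not part of the existing tooling: the function name `port_is_open` and the `SERVICES` table are hypothetical, and the ports are the ones listed above. An open port only shows that something is listening there. A closed port does not prove the application is fully shut down.

```python
import socket

# Hypothetical table built from the endpoints listed in this brief.
SERVICES = {
    "ComfyUI": ("127.0.0.1", 8000),
    "LM Studio": ("localhost", 1234),
}


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    for name, (host, port) in SERVICES.items():
        state = "listening" if port_is_open(host, port) else "not listening"
        print(f"{name} ({host}:{port}): {state}")
```

This may be useful before any install that requires ComfyUI to be closed: if port 8000 still answers, ComfyUI is most likely still running.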

## Critical Landmines — do NOT repeat these
1. The ComfyUI venv (`C:\ComfyUI\.venv`) requires **numpy<2**. Its face nodes (insightface, cv2/opencv) break on numpy>=2. Currently pinned: `numpy==1.26.4`, `opencv-python==4.10.0.84`, `opencv-contrib-python==4.10.0.84`. ANY pip install touching this venv must keep these pins. A numpy→2.x bump broke all face rendering for a full day.
2. **insightface is pinned to 0.7.3** (the IPAdapter fork requires that API; 1.0.x breaks it). Don't upgrade it.
3. `insightface\app\__init__.py` was patched: the line `from .mask_renderer import *` is replaced with `pass` (avoids an unneeded albumentations dependency chain). Do NOT revert this.
4. `cv2.pyd` is locked while ComfyUI is running, so close ComfyUI before any opencv reinstall. Do NOT install `opencv-python-headless` alongside the full `opencv-python`: both packages write into the same `cv2` folder and conflict.
5. Any lip-sync / LatentSync work goes in a SEPARATE venv — never pollute `C:\ComfyUI\.venv`.
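
The pins in landmines 1 and 2 can be checked mechanically before and after every change. The following is a minimal sketch, assuming it is run with the venv's own interpreter (`C:/ComfyUI/.venv/Scripts/python.exe`) so that it inspects the right environment. The names `REQUIRED_PINS`, `installed_versions`, and `pin_violations` are hypothetical. Only the standard-library `importlib.metadata` is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the landmine list in this brief.
REQUIRED_PINS = {
    "numpy": "1.26.4",
    "opencv-python": "4.10.0.84",
    "opencv-contrib-python": "4.10.0.84",
    "insightface": "0.7.3",
}


def installed_versions(names):
    """Return {name: version or None} for the interpreter running this script."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def pin_violations(installed, pins=REQUIRED_PINS):
    """List human-readable mismatches between installed versions and the pins."""
    problems = []
    for name, wanted in pins.items():
        got = installed.get(name)
        if got != wanted:
            problems.append(
                f"{name}: expected {wanted}, found {got or 'not installed'}"
            )
    return problems


if __name__ == "__main__":
    issues = pin_violations(installed_versions(REQUIRED_PINS))
    for line in issues or ["all pins intact"]:
        print(line)
```

Running this once before touching anything gives a baseline. Any non-empty output after an install is a signal to stop and restore the pins.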

## Your Task
Get onnxruntime running on the GPU for insightface, WITHOUT breaking the working render.

- **Current state:** onnxruntime in `C:\ComfyUI\.venv` is the CPU build (package `onnxruntime` 1.26.0). `python -c "import onnxruntime as ort; print(ort.get_available_providers())"` prints `['AzureExecutionProvider', 'CPUExecutionProvider']`, with NO `CUDAExecutionProvider`. So insightface face detection runs on CPU (works, just slower).
- **Goal:** install the correct `onnxruntime-gpu` so `CUDAExecutionProvider` is available.
- **The hard part:** this is CUDA 13.3 / torch 2.10+cu130. Standard `onnxruntime-gpu` targets CUDA 11/12. Investigate what build actually works (specific version, nightly, required cuDNN/CUDA runtime libs, or whether a CUDA-12 build runs against the installed runtime).

### Hard constraints
- Keep `numpy==1.26.4` and opencv `4.10.0.84` intact — pin them in any pip command.
- Close ComfyUI before installs that touch shared DLLs.
- **If achieving GPU onnxruntime would require bumping numpy to 2.x, or changing opencv or insightface — STOP and report back.** CPU onnxruntime is an acceptable fallback; this is an optimization, not a must-win. Do NOT break a working render for a speed gain.
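
One way to honor these constraints is to pass the pins as explicit requirements in the same pip invocation and to resolve with a dry run first. The sketch below only builds and prints the command. It assumes the venv's pip is recent enough to support `pip install --dry-run`. `pip_install_command` is a hypothetical helper, and `onnxruntime-gpu` without a version is a placeholder, since which build to use is exactly what still has to be investigated.

```python
# Forward slashes so the printed command can be pasted into Git Bash as-is.
VENV_PYTHON = "C:/ComfyUI/.venv/Scripts/python.exe"

# Pins required by the hard constraints in this brief.
PINS = [
    "numpy==1.26.4",
    "opencv-python==4.10.0.84",
    "opencv-contrib-python==4.10.0.84",
]


def pip_install_command(package_spec, dry_run=True):
    """Build a pip install command that carries the pins as requirements.

    With the pins on the command line, pip reports a resolution conflict
    instead of silently moving numpy or opencv to another version.
    """
    cmd = [VENV_PYTHON, "-m", "pip", "install", package_spec, *PINS]
    if dry_run:
        cmd.append("--dry-run")  # resolve and report only, install nothing
    return cmd


if __name__ == "__main__":
    # Placeholder spec: the real version or index is still to be determined.
    print(" ".join(pip_install_command("onnxruntime-gpu")))
```

If the dry run reports a conflict with any pin, that is the STOP condition described above. Note also that the `onnxruntime` and `onnxruntime-gpu` packages both provide the `onnxruntime` module, so the CPU package would likely need to be uninstalled first; treat that as something to confirm rather than a given.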

### Verify success
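1. `C:/ComfyUI/.venv/Scripts/python.exe -c "import onnxruntime as ort; print(ort.get_available_providers())"` → `CUDAExecutionProvider` present. (In Git Bash, write the interpreter path with forward slashes or quote it; unquoted backslashes are consumed as escape characters.)
2. `C:/ComfyUI/.venv/Scripts/python.exe -c "from insightface.app import FaceAnalysis; print('ok')"` → still imports.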
3. Confirm a ComfyUI face render still completes. Tools on the box: `C:\malin\diagnose_render.py` submits Malin's real selfie workflow to ComfyUI and reports the result; `C:\malin\malin_doctor.py` is a read-only health check.
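
Checks 1 and 2 can be combined into one script whose output can be pasted straight into the report. This is a minimal sketch, assuming it is run with the venv's interpreter. `has_cuda_provider` and `collect_status` are hypothetical names. The only library calls used are `onnxruntime.get_available_providers()` and the `insightface.app` import already shown above.

```python
def has_cuda_provider(providers):
    """True when the provider list includes the CUDA execution provider."""
    return "CUDAExecutionProvider" in providers


def collect_status():
    """Run checks 1 and 2 and return a dict describing what was observed."""
    status = {
        "providers": None,
        "cuda": False,
        "insightface_import": False,
        "error": None,
    }
    try:
        import onnxruntime as ort

        status["providers"] = ort.get_available_providers()
        status["cuda"] = has_cuda_provider(status["providers"])

        from insightface.app import FaceAnalysis  # noqa: F401

        status["insightface_import"] = True
    except Exception as exc:
        # Record the failure instead of crashing, so it can be reported as-is.
        status["error"] = f"{type(exc).__name__}: {exc}"
    return status


if __name__ == "__main__":
    for key, value in collect_status().items():
        print(f"{key}: {value}")
```

A provider appearing in this list does not by itself prove that inference runs on the GPU. If the CUDA or cuDNN libraries cannot be loaded when a session is created, onnxruntime can fall back to CPU, so the render check in step 3 and its timing remain the real test.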

### Deliverable
Report exactly what you changed, the final `get_available_providers()` output, and whether the render still works. If GPU onnxruntime isn't cleanly achievable on CUDA 13 right now, say so plainly and leave it on CPU. Work carefully, verify each step, and report back to Jun when done or if you hit anything ambiguous.
