RESEARCH.

AZIT LORA PIPELINE

Every decision in the AZIT pipeline was first written up as a document, then implemented in code, then verified with a benchmark. The documents are written in Korean (project context); English summaries are provided here.

Pipeline Manual

Top-level execution map · 12 KB

A single-page index that maps the nine pipeline stages to specific scripts, lists the key results (model, optimal checkpoint per character, ArcFace deltas, optimal CFG/Steps/Shift), and shows the file tree, so anyone joining the project can orient themselves in one read. It serves as the anchor document.

CONTAINS

▸ 9-stage flow chart from data prep to overfit watch
▸ File-to-stage mapping table
▸ Key results: AVG 0.732 at 2200 steps for female actor
▸ Final hyperparameters: CFG 3.0, Steps 70, max_shift 0.5

Dataset Bible

Construction principles for realistic-person LoRA datasets · 47 KB

The longest document in the project. Synthesizes consensus from FPHam (500+ LoRAs trained), djdante's verified Klein 9B run, CivitAI threads, and Reddit/SD community findings. Distinguishes what to keep constant (identity) versus what to vary (angle, distance, expression, lighting, wardrobe), and explains why caption-only diversity cannot replace image-level diversity.

CONTAINS

▸ Image count guide: 10-30 optimal, 50+ degrades
▸ 'Identity constant, everything else varied' rule
▸ Caption mechanics: tagged features become conditional, untagged become identity
▸ Common pitfalls (e.g. statistical bias from overly uniform backgrounds)
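
The caption rule above (tagged features become conditional, untagged features become identity) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the helper name `build_caption`, the placeholder trigger word `azitperson`, and the example tags are hypothetical and are not taken from the project's scripts or documents.

```python
def build_caption(trigger, variable_tags):
    """Join a trigger word with tags for features that should stay promptable.

    Only features meant to vary (wardrobe, lighting, framing) are tagged.
    Identity features such as face shape are deliberately left untagged,
    so that they are absorbed into the trigger word.
    """
    parts = [trigger] + [t.strip() for t in variable_tags if t.strip()]
    return ", ".join(parts)


# Hypothetical example: trigger word and tags are placeholders.
caption = build_caption(
    "azitperson",
    ["chest-up photo", "red jacket", "soft window light"],
)
print(caption)
```

Under this sketch, leaving "red jacket" out of the caption would push the jacket toward being learned as part of the identity, which is the failure mode the rule is meant to avoid.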

Training Strategy (djdante-based)

Verified reference setup adapted for our characters · 27 KB

Built on djdante's published Klein 9B training config (29 chest-up images, simple captions, ~4000 steps). Documents the trigger word strategy, caption philosophy, hyperparameter choices, and the deliberate departures we made for our two-character project on an RTX 5090.

CONTAINS

▸ Trigger word convention: trigger + descriptive caption
▸ 29-image guideline tested for chest-up identity capture
▸ AdamW · bf16 · qfloat8 quantize · EMA decay 0.99
▸ Reasoned departures from djdante where our use case differed
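
As a rough illustration of what an EMA decay of 0.99 means, here is a minimal sketch of the standard exponential-moving-average update. It assumes the common formulation `ema = decay * ema + (1 - decay) * current`; whether the trainer used in this project applies exactly this form is an assumption, and the function name `ema_update` is hypothetical.

```python
def ema_update(ema_weights, weights, decay=0.99):
    """Return updated EMA weights: decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


# Toy example: two "weights" held constant for three training steps.
ema = [0.0, 0.0]
for _ in range(3):
    current = [1.0, 2.0]
    ema = ema_update(ema, current)

print(ema)
```

With a decay of 0.99, each step moves the averaged weights only 1% of the way toward the current weights, which smooths out step-to-step noise at the cost of lagging behind the raw weights.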

ArcFace Benchmark Results

Full 3000-step experiment log · 19 KB

Per-checkpoint ArcFace identity similarity, measured every 100 steps for both actors. Picks the optimum at 2200 steps for the female actor (AVG 0.732) and at 1500 steps for the male actor (AVG 0.692). Against the commercial baseline (Higgsfield Soul 2.0), the LoRA scores 0.6050 versus 0.5090. Documents the shape of the underfit / optimal / overfit transition.

CONTAINS

▸ Female actor optimum: Step 2200, AVG 0.732
▸ Male actor optimum: Step 1500, AVG 0.692
▸ LoRA 0.6050 vs commercial baseline (Higgsfield Soul 2.0) 0.5090 (+18.9%)
▸ Curve shape evidence for 'pick balance, not max' rule
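
The figures above can be read with a minimal sketch of the arithmetic behind them. It assumes the identity score is the usual cosine similarity between ArcFace face embeddings; the embedding extraction itself is omitted, and the function names are illustrative rather than taken from the project's scripts.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_similarity(reference, generated):
    """Mean similarity of each generated embedding against one reference."""
    scores = [cosine_similarity(reference, g) for g in generated]
    return sum(scores) / len(scores)


def relative_gain_percent(score, baseline):
    """Relative improvement of score over baseline, in percent."""
    return (score - baseline) / baseline * 100.0


# Reproduces the reported gain of the LoRA over the commercial baseline.
gain = round(relative_gain_percent(0.6050, 0.5090), 1)
print(gain)  # 18.9
```

The +18.9% figure is therefore a relative gain over the baseline score, not a difference in absolute similarity points (which would be 0.096).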

Pipeline Execution Report

End-to-end run report (female actor v3) · 14 KB

Single-run write-up with timestamps, decisions, deviations from plan, and final selected checkpoint. Functions as both a record for future runs and a reproducibility note for anyone wanting to repeat the experiment.

CONTAINS

▸ Timestamped run log
▸ Decisions and their evidence
▸ Final selected checkpoint and why
▸ Re-run instructions

LoRA Fine-tuning Settings Comparison

Side-by-side configuration matrix · 22 KB

Cross-tabulation of training configurations across runs. Surfaces the tradeoffs between learning rate, optimizer, EMA, batch size, gradient accumulation, and quantization choice. Used to justify the final chosen config over the alternatives.

CONTAINS

▸ Configurations matrix (lr, optimizer, batch, EMA)
▸ Tradeoff annotations per setting
▸ Justification for picked config

Dataset Guide (Web Document)

Operational checklist for actor onboarding · 26 KB

Companion document distilling the dataset bible into a checklist for actor onboarding sessions: pose list, lighting requirements, wardrobe variety, do/don't examples. The on-the-ground operations layer above the bible.

CONTAINS

▸ Pose checklist for shoot day
▸ Lighting / wardrobe requirements
▸ Do / don't examples
