9 SCRIPTS · ~100 KB

SCRIPTS.

PRODUCTION PYTHON

Production source of the AZIT LoRA pipeline. Click any filename to open the raw Python file. Sizes are approximate; variable names and comments are in Korean (project context).

1_data_preprocessing.py ↗

Image crop, resize, normalize. Largest script in the pipeline. Handles facial bounding box detection, square crop with padding, target resolution resize, color profile normalization, and alpha channel stripping. Outputs to a clean training-ready directory.

Data preprocessing · 28 KB
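The square-crop-with-padding step can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function name `square_crop_box`, the `padding` fraction, and the clamping rule are assumptions, and the face box is taken as given rather than detected.

```python
def square_crop_box(face_box, image_size, padding=0.4):
    """Return (left, top, right, bottom) of a square crop around a face box.

    face_box is (x0, y0, x1, y1) in pixels; image_size is (width, height).
    padding is a hypothetical fraction of the face size added on each side.
    """
    x0, y0, x1, y1 = face_box
    width, height = image_size

    # Side length: the longer face edge, padded on both sides,
    # but never larger than the image itself.
    side = max(x1 - x0, y1 - y0) * (1 + 2 * padding)
    side = min(side, width, height)

    # Center the square on the face, then shift it back inside the image.
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    left = min(max(cx - side / 2, 0), width - side)
    top = min(max(cy - side / 2, 0), height - side)

    return (round(left), round(top), round(left + side), round(top + side))
```

The returned box can be passed to any image library's crop call before resizing to the target resolution.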

2_data_validation.py ↗

Resolution / tone / composition gates. Rejects images below resolution threshold, outside acceptable tone range, or with bad framing (face too small, off-center, multi-face). Generates a rejection log so the operator can re-shoot.

Quality validation · 21 KB
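A gate of this kind might look like the sketch below. All thresholds (`min_side`, `luma_range`, `min_face_ratio`, `max_center_offset`) and the reason strings are hypothetical; the real script's values and measurements are not shown on this page.

```python
def validate_image(width, height, mean_luma, faces,
                   min_side=1024, luma_range=(60, 200),
                   min_face_ratio=0.15, max_center_offset=0.2):
    """Return a list of rejection reasons; an empty list means the image passes.

    faces is a list of (x0, y0, x1, y1) boxes from any face detector.
    Every threshold here is an illustrative assumption.
    """
    reasons = []

    if min(width, height) < min_side:
        reasons.append("resolution")

    if not (luma_range[0] <= mean_luma <= luma_range[1]):
        reasons.append("tone")

    if len(faces) != 1:
        # Zero faces or multiple faces: framing checks are meaningless.
        reasons.append("face_count")
    else:
        x0, y0, x1, y1 = faces[0]
        if (x1 - x0) / width < min_face_ratio:
            reasons.append("face_too_small")
        cx = (x0 + x1) / 2 / width
        cy = (y0 + y1) / 2 / height
        if abs(cx - 0.5) > max_center_offset or abs(cy - 0.5) > max_center_offset:
            reasons.append("off_center")

    return reasons
```

Writing the returned reasons next to each filename gives the operator a rejection log to work from.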

3_comfyui_workflow_send.py ↗

Sends a templated workflow JSON to a running ComfyUI instance over HTTP, polls until completion, and downloads the resulting images. Used during the eval loop to run the test prompts against each LoRA checkpoint.

ComfyUI API integration · 7 KB
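The send, poll, download cycle could be sketched with the standard library alone. The endpoints used here (`/prompt`, `/history/<prompt_id>`, `/view`) reflect the ComfyUI server API as commonly documented and should be checked against the running version; the helper names, polling interval, and timeout are assumptions.

```python
import json
import time
import urllib.parse
import urllib.request


def queue_workflow(base_url, workflow):
    """POST a workflow dict to ComfyUI and return the prompt id."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/prompt", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["prompt_id"]


def wait_for_history(base_url, prompt_id, interval=1.0, timeout=300.0):
    """Poll the history endpoint until the prompt id shows up or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base_url}/history/{prompt_id}") as response:
            history = json.load(response)
        if prompt_id in history:
            return history[prompt_id]
        time.sleep(interval)
    raise TimeoutError(f"prompt {prompt_id} did not finish in {timeout} s")


def image_urls(base_url, history_entry):
    """Build one download URL per image listed in a history entry."""
    urls = []
    for node_output in history_entry.get("outputs", {}).values():
        for image in node_output.get("images", []):
            query = urllib.parse.urlencode({
                "filename": image["filename"],
                "subfolder": image.get("subfolder", ""),
                "type": image.get("type", "output"),
            })
            urls.append(f"{base_url}/view?{query}")
    return urls
```

Each URL from `image_urls` can then be fetched with `urllib.request.urlretrieve` into the eval output directory.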

4_full_training_run.py ↗

Top-level orchestrator. Reads the YAML config, sets environment variables, launches the AI-Toolkit training subprocess with logging, and captures its exit code. Used to start runs from the terminal or to schedule them overnight.

Batch training execution · 4 KB
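The launch-and-capture part might be sketched as below. YAML parsing is left out, and the training command is passed in by the caller, since the exact AI-Toolkit invocation is not stated here; `launch_run` and its parameters are hypothetical names.

```python
import os
import subprocess
import sys
from pathlib import Path


def launch_run(command, log_path, extra_env=None):
    """Run a training command, stream its output to a log file, return the exit code.

    command is a list such as [sys.executable, "train_script.py", "config.yaml"];
    the actual command for a given trainer is an assumption left to the caller.
    """
    env = os.environ.copy()
    env.update(extra_env or {})

    log_path = Path(log_path)
    log_path.parent.mkdir(parents=True, exist_ok=True)

    # stderr is merged into stdout so the log keeps the original ordering.
    with log_path.open("w", encoding="utf-8") as log_file:
        completed = subprocess.run(
            command, env=env, stdout=log_file, stderr=subprocess.STDOUT
        )
    return completed.returncode
```

A scheduler (cron, `at`, or a shell loop) can call this and branch on a non-zero return value.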

5_arcface_per_step_measurement.py ↗

For each saved checkpoint (every 100 steps), generates images from a fixed eval prompt set, computes face embeddings via InsightFace ArcFace, compares them to the source actor's embedding, and writes the per-step similarity to disk.

Per-step ArcFace similarity · 4 KB
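The comparison itself is commonly a cosine similarity between embedding vectors. The sketch below assumes the embeddings have already been extracted (for example by InsightFace) and are plain lists of floats; the function names and the choice to average over the eval images of each step are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def per_step_similarity(reference, embeddings_by_step):
    """Mean similarity to the reference embedding for each checkpoint step.

    embeddings_by_step maps a step number to the list of embeddings
    extracted from that checkpoint's eval images.
    """
    return {
        step: sum(cosine_similarity(reference, e) for e in embeddings) / len(embeddings)
        for step, embeddings in sorted(embeddings_by_step.items())
    }
```

The resulting mapping can be dumped with `json.dump` to give the per-step series the later scripts read.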

6_benchmark_comparison.py ↗

Loads the LoRA-step similarity series and the commercial baseline (Higgsfield Soul 2.0) reference series, computes deltas, picks the best LoRA checkpoint by averaged identity score, and emits a summary table.

LoRA vs commercial baseline · 4 KB
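One way to sketch the delta computation and checkpoint selection, assuming the LoRA series is a mapping from step to a list of similarity scores and the baseline is a flat list of scores (both input shapes and the name `compare_to_baseline` are assumptions):

```python
from statistics import mean


def compare_to_baseline(lora_scores, baseline_scores):
    """Return (rows, best): per-step mean and delta vs baseline, plus the best row."""
    baseline_mean = mean(baseline_scores)
    rows = []
    for step, scores in sorted(lora_scores.items()):
        step_mean = mean(scores)
        rows.append({
            "step": step,
            "mean": step_mean,
            "delta": step_mean - baseline_mean,  # positive: LoRA beats baseline
        })
    # Best checkpoint: highest averaged identity score.
    best = max(rows, key=lambda row: row["mean"])
    return rows, best
```

Printing `rows` with fixed-width formatting yields a summary table; `best["step"]` names the checkpoint to carry forward.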

7_parameter_sweep.py ↗

Sweeps inference-time parameters (CFG, Steps, max_shift) across three rounds, narrowing the search each round. Runs each combination through ComfyUI, scores via ArcFace, and writes a structured sweep log. Second-largest script.

Three-round parameter sweep · 20 KB
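A narrowing sweep can be sketched as a coarse-to-fine grid search. In this illustration the ComfyUI run and ArcFace scoring are replaced by a caller-supplied `score` function, and the narrowing rule (re-center on the round's best point, halve each span) is an assumption rather than the script's documented behavior.

```python
from itertools import product


def make_grid(center, span, points=3):
    """Evenly spaced values across [center - span, center + span]."""
    step = 2 * span / (points - 1)
    return [center - span + i * step for i in range(points)]


def three_round_sweep(score, centers, spans, rounds=3):
    """Coarse-to-fine grid search; returns (best_params, log).

    score: callable taking a params dict and returning a number (higher is better).
    centers / spans: dicts keyed by parameter name, e.g. "cfg", "max_shift".
    """
    names = list(centers)
    log = []
    for round_index in range(1, rounds + 1):
        grids = [make_grid(centers[name], spans[name]) for name in names]
        round_entries = []
        for values in product(*grids):
            params = dict(zip(names, values))
            round_entries.append({"round": round_index, **params, "score": score(params)})
        log.extend(round_entries)

        # Re-center on this round's best point and halve every span.
        best = max(round_entries, key=lambda entry: entry["score"])
        centers = {name: best[name] for name in names}
        spans = {name: spans[name] / 2 for name in names}
    return centers, log
```

Integer parameters such as the step count would need rounding before use; that detail is omitted here.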

8_sweep_result_analysis.py ↗

Parses the sweep log, computes mean similarity per parameter combination, identifies the optimum, and surfaces the top-N candidates with their tradeoff axes (quality vs identity vs diversity).

Sweep result analysis · 4 KB
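The per-combination averaging might look like this. The log format is assumed to be one JSON object per line with the hypothetical keys `cfg`, `steps`, `max_shift`, and `similarity`; the real sweep log may be structured differently.

```python
import json
from collections import defaultdict
from statistics import mean


def top_combinations(log_lines, n=3):
    """Group similarity scores by parameter combination and return the top n means."""
    scores = defaultdict(list)
    for line in log_lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        key = (entry["cfg"], entry["steps"], entry["max_shift"])
        scores[key].append(entry["similarity"])

    ranked = sorted(((mean(values), key) for key, values in scores.items()), reverse=True)
    return [
        {"cfg": key[0], "steps": key[1], "max_shift": key[2], "mean_similarity": value}
        for value, key in ranked[:n]
    ]
```

Ranking on mean similarity alone covers only the identity axis; quality and diversity would need their own scores alongside it.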

9_overfit_monitor.py ↗

Long-running monitor that tails the AI-Toolkit log directory, runs a quick eval whenever a new checkpoint appears, and emits a warning when the similarity curve starts to overshoot or plateau (early-stop signal).

Real-time overfit monitor · 11 KB
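The early-stop check could be sketched as below. The page does not define "overshoot" or "plateau" numerically, so this sketch assumes overshoot means the latest similarity exceeds a ceiling and plateau means the gain over the last few checkpoints falls under a minimum; `window`, `min_gain`, and `ceiling` are hypothetical values.

```python
def early_stop_signal(similarities, window=3, min_gain=0.005, ceiling=0.85):
    """Classify the latest point of a similarity curve.

    Returns "overshoot", "plateau", or None. similarities is the ordered
    list of per-checkpoint scores seen so far. All thresholds are assumptions.
    """
    if similarities and similarities[-1] > ceiling:
        return "overshoot"

    if len(similarities) > window:
        # Gain across the last `window` checkpoints.
        recent_gain = similarities[-1] - similarities[-1 - window]
        if recent_gain < min_gain:
            return "plateau"

    return None
```

The monitor loop would append each new checkpoint's score and log a warning whenever the return value is not `None`.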
