Visual Attribute Reliability

A reproducible, reliability-first framework for visual attribute verification. The reference study uses a category-scoped Fashionpedia neckline task to compare a matched frozen SigLIP2 control with an audited vision-attention LoRA adaptation under leakage-aware evaluation and explicit evidence contracts.

Why this repository exists

Vision adaptation results are easy to overstate when task construction, validation access, control design, or evidence provenance are unclear. This repository makes those boundaries explicit:

Train-only task definition with category 33 (neckline), exactly one target attribute, and positive bounding-box area.
Image-group-disjoint development split so an image cannot appear in both train and development.
Matched control comparison between a frozen image encoder with a trainable head and a vision-attention LoRA arm with the same head.
Untouched official validation confirmation after selecting development checkpoints.
Separate hosted-CI and local evidence gates so a green GitHub check is not misrepresented as full model retraining or full-release verification.
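
As an illustration of the image-group-disjoint split described above, the following minimal sketch assigns every instance to a split by hashing its image id, so all instances from one image land on the same side. The hashing scheme, the `salt` value, the 0.2 development fraction, and the `image_id` field name are assumptions made for this example; the repository's actual split procedure may differ.

```python
import hashlib


def assign_split(image_id, dev_fraction=0.2, salt="dev-split-v1"):
    """Deterministically map an image id to 'train' or 'dev' (hypothetical scheme)."""
    digest = hashlib.sha256(f"{salt}:{image_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "dev" if bucket < dev_fraction else "train"


def split_instances(instances, dev_fraction=0.2):
    """Split instances so that every instance of an image shares one split."""
    splits = {"train": [], "dev": []}
    for inst in instances:
        splits[assign_split(inst["image_id"], dev_fraction)].append(inst)
    # Leakage guard: no image id may appear on both sides.
    train_images = {inst["image_id"] for inst in splits["train"]}
    dev_images = {inst["image_id"] for inst in splits["dev"]}
    assert train_images.isdisjoint(dev_images), "image leaked across splits"
    return splits


# Toy data: 100 images with two instances each (illustrative only).
instances = [{"image_id": i // 2, "label": "v-neck"} for i in range(200)]
splits = split_instances(instances)
```

Because the assignment depends only on the image id and the salt, rerunning the split reproduces the same partition without storing a random seed.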

Headline result

On the fixed official Fashionpedia validation subset, the audited LoRA arm achieved 0.6681 Macro-F1 versus 0.5798 Macro-F1 for the matched frozen control: an absolute improvement of 0.0883, or +8.83 percentage points.

Final confirmation metric	Matched frozen control	Vision-attention LoRA	Difference
Macro-F1	0.5798	0.6681	+0.0883
Top-label ECE, raw	0.0840	0.0652	-0.0187
Selected development epoch	6	5	—
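
For readers unfamiliar with the two metrics in the table, here is a minimal pure-Python sketch of Macro-F1 and top-label expected calibration error (ECE). The equal-width binning and the bin count are assumptions for this example and are not taken from the repository's metric contract; the toy labels and confidences are illustrative, not study data.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over the given label set."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def top_label_ece(confidences, correct, n_bins=10):
    """ECE over equal-width bins of the top-label confidence (assumed binning)."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece


# Toy inputs, not study data.
toy_f1 = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"], labels=["a", "b"])
toy_ece = top_label_ece([0.9, 0.9, 0.6, 0.6], [True, True, True, False])

# The reported Macro-F1 difference follows directly from the two table values.
macro_f1_delta = round(0.6681 - 0.5798, 4)
```

Macro-F1 weights each class equally regardless of its frequency, which is why it is a common choice when the class distribution is uneven.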

The final-confirmation subset contains 654 eligible instances across 644 images, with zero image overlap with source train/development data. The source-task reconstruction contract covers 20,800 source train/development pairs.
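
The zero-overlap statement above amounts to a set-intersection check on image ids. The sketch below shows that check on toy data; the `image_id` field name and both helper functions are hypothetical and are not the repository's own audit code.

```python
def image_overlap(source_instances, confirmation_instances):
    """Return the image ids that appear in both instance collections."""
    source_ids = {inst["image_id"] for inst in source_instances}
    confirmation_ids = {inst["image_id"] for inst in confirmation_instances}
    return source_ids & confirmation_ids


def summarize_subset(instances):
    """Return (instance count, distinct image count) for a subset."""
    return len(instances), len({inst["image_id"] for inst in instances})


# Toy data: two confirmation instances share one image.
source = [{"image_id": 1}, {"image_id": 2}]
confirmation = [{"image_id": 10}, {"image_id": 10}, {"image_id": 11}]

overlap = image_overlap(source, confirmation)
counts = summarize_subset(confirmation)
```

An instance count larger than the image count, as in the reported 654 instances across 644 images, simply means some images contribute more than one eligible instance.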

Scope and claim boundaries

This is a category-scoped, seven-class neckline-attribute verification study, not a full Fashionpedia benchmark, consumer-to-shop retrieval benchmark, or production catalog system.

The seven target attributes are:

round (neck), v-neck, oval (neck), sweetheart (neckline), boat (neck), scoop (neck), and straight across (neck).
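
The eligibility rule (category 33, exactly one of the seven target attributes, positive bounding-box area) can be sketched as a small filter. The COCO-style field names (`category_id`, `attribute_ids`, `bbox` as `[x, y, width, height]`) and the attribute ids in the toy mapping are assumptions for illustration; the repository's task configuration is the authoritative definition.

```python
NECKLINE_CATEGORY_ID = 33  # category stated in the task definition

TARGET_ATTRIBUTES = (
    "round (neck)", "v-neck", "oval (neck)", "sweetheart (neckline)",
    "boat (neck)", "scoop (neck)", "straight across (neck)",
)


def eligible_label(annotation, attribute_names):
    """Return the single target attribute name, or None if the instance is ineligible."""
    if annotation["category_id"] != NECKLINE_CATEGORY_ID:
        return None
    _, _, width, height = annotation["bbox"]
    if width <= 0 or height <= 0:
        return None  # require positive bounding-box area
    targets = [
        attribute_names[attr_id]
        for attr_id in annotation["attribute_ids"]
        if attribute_names.get(attr_id) in TARGET_ATTRIBUTES
    ]
    # Exactly one target attribute; non-target attributes are ignored.
    return targets[0] if len(targets) == 1 else None


# Hypothetical id-to-name mapping; real attribute ids come from the annotation file.
names = {1: "v-neck", 2: "round (neck)", 3: "plain (pattern)"}

ok = eligible_label(
    {"category_id": 33, "attribute_ids": [1, 3], "bbox": [0, 0, 10, 5]}, names
)
ambiguous = eligible_label(
    {"category_id": 33, "attribute_ids": [1, 2], "bbox": [0, 0, 10, 5]}, names
)
```

Returning `None` for ambiguous instances, rather than picking one attribute, keeps the task single-label by construction.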

The evidence supports the stated frozen-versus-LoRA comparison under the documented protocol. It does not support a post-calibration LoRA-superiority claim, a full-dataset claim, or a claim of deployment performance.

Evidence and reproducibility

Evidence layer	What it verifies	What it does not verify
Hosted CI fixture contracts	Static task, split, metric, claim-boundary, and fixture-hash contracts tracked in Git	Model loading, checkpoints, raw Fashionpedia data, inference, training, or validation rescoring
Local full evidence contracts	Checkpoint presence, staged source-artifact hashes, and full release-file SHA-256 manifest	Hosted retraining or a public model-serving workflow
Evidence report	Experiment protocol, baselines, final confirmation, and error-transition analysis	A general claim beyond the fixed task and evidence boundary
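
To make the idea of a release-file SHA-256 manifest concrete, the following sketch checks files under a root directory against expected digests. The manifest shape (a mapping from relative path to hex digest), the file names, and the function names are assumptions for this example; the repository's own contract tests define the real format.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(root, manifest):
    """Return (relative_path, problem) pairs for missing or mismatched files."""
    problems = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file():
            problems.append((rel_path, "missing"))
        elif sha256_of(target) != expected:
            problems.append((rel_path, "hash mismatch"))
    return problems


# Toy release directory: one file present and correct, one file absent.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "report.txt").write_bytes(b"evidence")
    manifest = {
        "report.txt": hashlib.sha256(b"evidence").hexdigest(),
        "checkpoint.bin": "0" * 64,  # placeholder digest for a file that is absent
    }
    problems = verify_manifest(root, manifest)
```

A check like this can only run where the full bundle exists, which is consistent with the table's separation between hosted fixture contracts and local full evidence contracts.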

Verify the tracked hosted-CI fixture

python -m pip install -r requirements-ci.txt
python -m pytest -q tests/test_ci_release_fixture_contracts.py
python -B scripts/validate_documentation.py

Verify the local full evidence release

The full release bundle is intentionally excluded from Git history. It must be present locally under dist/ before running the local evidence-contract suite.

.\.venv\Scripts\python.exe -B -m pytest -q `
  tests\test_release_evidence_contracts.py `
  -p no:cacheprovider

See Local evidence release for the release hash, archive structure, and verification boundary.

Repository layout

.github/workflows/      Hosted static-contract CI
configs/                Immutable task and experiment configurations
scripts/                Dataset, model, audit, and documentation utilities
tests/                  Hosted-CI fixture and local full-release contract tests
docs/                   Evidence, CI, release, and reproducibility documentation

Documentation

Data and model access

Raw Fashionpedia images and annotations, Hugging Face model cache, local checkpoints, and full release evidence archives are intentionally not committed to Git. Follow the applicable dataset and pretrained-model terms before obtaining or using those resources.

