[AAAI'26] INTENT: Invariance and Discrimination-aware Noise Mitigation for Robust Composed Image Retrieval

¹School of Software, Shandong University
*Corresponding author.

Abstract


Noise Types and Decision Boundaries in CIR


(a) shows typical modality-inherent noise in CIR. (b) shows cross-modal correspondence noise. (c) presents two cases: the retrieval on the left finds the target but with low confidence (50%), while the one on the right fails with high confidence (70%), illustrating that decision boundaries vary across retrievals.


Framework: Invariance and discrimiNaTion-awarE Noise neTwork (INTENT)


The framework of our proposed INTENT. It consists of (a) Visual Invariant Composition, which mitigates modality-inherent noise, and (b) Bi-Objective Discriminative Learning, which mitigates cross-modal correspondence noise.


Experiment


Performance comparison on the FashionIQ validation set in terms of R@K (%). The best and second-best results are shown in bold and underlined, respectively.



Performance comparison on the CIRR test set in terms of R@K (%) and Rsub@K (%). The best and second-best results are shown in bold and underlined, respectively.
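For readers unfamiliar with the metrics in these tables, the following is a minimal sketch of how R@K and Rsub@K are commonly computed, not the evaluation code used for INTENT. It assumes the usual definitions: R@K is the percentage of queries whose target image appears among the top K ranked candidates, and Rsub@K applies the same check after restricting each ranking to that query's candidate subset. The function names `recall_at_k` and `recall_subset_at_k` and the toy image IDs are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(rankings: Sequence[Sequence[str]],
                targets: Sequence[str],
                k: int) -> float:
    """Percentage of queries whose target is among the top-k ranked candidates."""
    hits = sum(1 for ranked, target in zip(rankings, targets)
               if target in ranked[:k])
    return 100.0 * hits / len(targets)


def recall_subset_at_k(rankings: Sequence[Sequence[str]],
                       targets: Sequence[str],
                       subsets: Sequence[Set[str]],
                       k: int) -> float:
    """Recall@k after restricting each ranking to its query's candidate subset.

    Assumption: each subset contains the target, and the original ranking
    order is preserved among the surviving candidates.
    """
    restricted = [[c for c in ranked if c in subset]
                  for ranked, subset in zip(rankings, subsets)]
    return recall_at_k(restricted, targets, k)


# Toy example with hypothetical image IDs (two queries).
rankings = [["a", "b", "c", "d"], ["x", "y", "z", "w"]]
targets = ["c", "x"]
subsets = [{"c", "d"}, {"x", "y"}]

print(recall_at_k(rankings, targets, k=1))
print(recall_subset_at_k(rankings, targets, subsets, k=1))
```

In the toy data, the first query ranks its target third overall but first within its subset, which is why a subset-restricted score can be higher than the global one at the same K.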


Ablation study on the modules of INTENT.


Ablation study on the loss functions of INTENT.



Case study on CIRR and FashionIQ.

BibTeX


        @inproceedings{INTENT,
            title={INTENT: Invariance and Discrimination-aware Noise Mitigation for Robust Composed Image Retrieval},
            author={Chen, Zhiwei and Hu, Yupeng and Fu, Zhiheng and Li, Zixu and Huang, Jiale and Huang, Qinlei and Wei, Yinwei},
            booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
            year={2026}
        }