Paper Accepted @CVPR
Happy to share that our paper "ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering" has been accepted at CVPR 2026 in Denver!
I completed my Ph.D. at the University of Modena and Reggio Emilia, conducting my research at AImageLab, co-advised by Prof. Rita Cucchiara, Prof. Lorenzo Baraldi, and Prof. Marcella Cornia.
My current research focuses on multimodal architectures and their integration with retrieval techniques. I have extensive experience with NLP tasks and foundation models such as CLIP, and have worked on vision-and-language architectures, focusing primarily on their evaluation and on the problem of hallucination. Recently, I have been working on the training and development of multimodal large language models, including LLaVA and its derivatives.
During my Ph.D., I also spent six months as a research intern at Amazon in London.
AImageLab, University of Modena and Reggio Emilia
6-month internship at Amazon London.
"Retrieval-augmented Transformer for Image Captioning"
AImageLab, University of Modena and Reggio Emilia
Premio alla Memoria Davide Rabotti
University of Modena and Reggio Emilia.
Happy to share that our paper "Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval" has been accepted at CVPR 2025 in Nashville!
Our project "VISTA: Versatile Intelligent Systems for Tailored and Adaptive Next-Generation Multimodal AI" was accepted for the EuroHPC Extreme Scale grant, with an allocation of almost 1M GPU hours. Read the news on the UNIMORE website.
We are introducing LLaVA-MORE, a family of models that enhances LLaVA by integrating LLaMA 3.1 as the language model. Check out our GitHub repo!
Participation in National and European Projects
Contributing to the European initiative for developing multimodal AI systems.
Participating in the Italian National Research Project on multimodal reasoning and vision-language alignment.