I completed my Ph.D. at the University of Modena and Reggio Emilia, conducting my research at AImageLab, co-advised by Prof. Rita Cucchiara, Prof. Lorenzo Baraldi, and Prof. Marcella Cornia.

My current research focuses on cutting-edge multimodal architectures and their integration with advanced retrieval techniques. I have extensive experience with NLP tasks and foundation models such as CLIP, and I have worked with vision-and-language architectures, primarily focusing on their evaluation and on addressing the problem of hallucination. Recently, I have been working on the training and development of multimodal large language models, including LLaVA and its derivatives.

During my Ph.D., I also spent six months as a research intern at Amazon in London.


Some Important Milestones

  • 2026

    Ph.D. Dissertation

    AImageLab, University of Modena and Reggio Emilia

  • 2025

    Doctoral Consortium at CVPR 2025

  • 2025

    Amazon Research Internship

    Six-month research internship at Amazon in London

  • 2022

    Best Student Paper Award (CBMI)

    "Retrieval-augmented Transformer for Image Captioning"

  • 2022

    Started Ph.D.

    AImageLab, University of Modena and Reggio Emilia

  • 2022

    Master Thesis Award

    Premio alla Memoria Davide Rabotti (Davide Rabotti Memorial Prize)

  • 2022

    M.S. in Artificial Intelligence

    University of Modena and Reggio Emilia

All News

Paper Accepted @CVPR 2026

Publication date: 17/03/2025

🎉 Happy to share that our paper "ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering" has been accepted at CVPR 2026 in Denver! 🎉

Paper Accepted @CVPR 2025

Publication date: 22/11/2025

🎉 Happy to share that our paper "Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval" has been accepted at CVPR 2025 in Nashville! 🎉

EuroHPC Extreme Scale grant

Publication date: 10/04/2025

Our project "VISTA – Versatile Intelligent Systems for Tailored and Adaptive Next-Generation Multimodal AI" was accepted for the EuroHPC Extreme Scale grant, with an allocation of almost 1M GPU hours. Read the news on the UNIMORE website.

Introducing LLaVA-MORE

Publication date: 08/03/2024

🎉 We are introducing LLaVA-MORE, a family of models that enhances LLaVA by integrating LLaMA 3.1 as the language model. Check out our GitHub repo! 🎉

Participation in National and European Projects

ELLIOT Project

2025 – Ongoing

Contributing to the European initiative for developing multimodal AI systems.

PRIN 2022: Vision-Language Reasoning

2023 – Ongoing

Participating in the Italian National Research Project on multimodal reasoning and vision-language alignment.