|
J. Chen, Dannong Xu, J. Fei, Chun-Mei Feng, M. Elhoseiny:
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents
CVPR, 2025
|
|
M. Ahmed, J. Fei, J. Ding, E. Abdelrahman, M. Elhoseiny:
Kestrel: 3D Multimodal LLM for Part-Aware Grounded Description
ICCV, 2025
|
|
Zhongyu Yang, J. Chen, Dannong Xu, J. Fei, X. Shen, L. Zhao, Chun-Mei Feng, M. Elhoseiny:
WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation
ICCV, 2025
|
|
Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe, J. Fei, Sayan Nag, Salman Khan, Mohamed Elhoseiny, Dinesh Manocha:
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
NeurIPS, 2025
|