Y. Mohamed, Runjia Li, Ibrahim Said Ahmad, K. Haydarov, Philip Torr, Kenneth Church, M. Elhoseiny:
No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
EMNLP, 2024
Botos Csaba, W. Zhang, Matthias Müller, Ser-Nam Lim, M. Elhoseiny, Philip H.S. Torr, Adel Bibi:
Label Delay in Online Continual Learning
NeurIPS, 2024
M. Ahmed, X. Li, Arpit Prajapati, M. Elhoseiny:
3DCoMPaT200: Language Grounded Large-Scale 3D Vision Dataset for Compositional Recognition
NeurIPS Datasets and Benchmarks Track, 2024
X. Li, J. Ding, M. Elhoseiny:
VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding
NeurIPS, 2024
Kirolos Ataallah, X. Shen, E. Abdelrahman, Essam Sleiman, Mingchen Zhuge, J. Ding, D. Zhu, Jürgen Schmidhuber, M. Elhoseiny:
Goldfish: Vision-Language Understanding of Arbitrarily Long Videos
ECCV, 2024
X. Shen, F. Farooq Khan, Abdelrahman Mohamed, M. Elhoseiny:
EmoTalker: Audio Driven Emotion Aware Talking Head Generation
ACCV, 2024
Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta, J. Chen, M. Elhoseiny, Ruohan Gao, Dinesh Manocha:
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
ECCV, 2024
K. Haydarov, X. Shen, Avinash Madasu, Mahmoud Salem, Li-Jia Li, Gamaleldin Elsayed, M. Elhoseiny:
Affective Visual Dialog: A Large-Scale Benchmark for Emotional Reasoning Based on Visually Grounded Conversation
ECCV, 2024
X. Li, J. Ding, Zhaoyang Chen, M. Elhoseiny:
Uni3DL: A Unified Model for 3D Vision-Language Understanding
ECCV, 2024
K. Haydarov, Aashiq Muhamed, X. Shen, Jovana Lazarevic, Ivan Skorokhodov, Chamuditha Jayanga Galappaththige, M. Elhoseiny:
Adversarial Text to Continuous Image Generation
CVPR, 2024
H. Slim, M. Elhoseiny:
ShapeWalk: A Benchmark for Compositional Shape Editing through Language-Guided Chains
CVPR, 2024
W. Zhang, Paul Janson, Rahaf Aljundi, M. Elhoseiny:
Overcoming Generic Knowledge Loss with Selective Parameter Update
CVPR, 2024
D. Zhu, J. Chen, K. Haydarov, X. Shen, W. Zhang, M. Elhoseiny:
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions
TMLR, 2024
Yuanpeng Li, Joel Hestness, M. Elhoseiny, Liang Zhao, Kenneth Church:
Efficiently Disentangle Causal Representations
CPAL (Proceedings of Machine Learning Research), 2024
D. Zhu, J. Chen, X. Shen, X. Li, M. Elhoseiny:
MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
ICLR, 2024
E. Abdelrahman, Mohamed Ayman, M. Ahmed, H. Slim, M. Elhoseiny:
CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding
ICLR, 2024
W. Zhang, Y. Mohamed, Bernard Ghanem, Philip Torr, Adel Bibi, M. Elhoseiny:
Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation
ICLR, 2024
Salman Khan, Izzeddin Teeti, Andrew Bradley, M. Elhoseiny, Fabio Cuzzolin:
A Hybrid Graph Network for Complex Activity Detection in Video
WACV, 2024
E. Abdelrahman, Pengzhan Sun, Li Erran Li, M. Elhoseiny:
ImageCaptioner^2: Image Captioner for Image Captioning Bias Amplification Assessment
AAAI, 2024