AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding
Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-André Noël, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vazquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar
CoRR (2025)
