5 docs tagged with "LLM"

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

Abstract and Introduction

GPT4RoI-Instruction Tuning Large Language Model on Region-of-Interest

Paper title: GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest

LLaVA

Paper title: Visual Instruction Tuning

Shikra-Unleashing Multimodal LLM’s Referential Dialogue Magic

Paper title: Shikra: Unleashing Multimodal LLM’s Referential Dialogue Magic

VisionLLM

Paper title: VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks