
AI Paper Research

AI paper survey and summaries


Multimodal: 2019

1 paper

NeurIPS 2019 | 3,000+

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks


Jiasen Lu, Dhruv Batra, Devi Parikh et al. (2019)
