SimVLM explained | What the paper doesn’t tell you
SimVLM explained. What the authors tell us, what they don't tell us, and how this all works. Enjoy with coffee!

Vision & Language Transformer (ViLBERT) explained: https://youtu.be/dd7nE4nbxN0
ViT explained: https://youtu.be/DVoHvmww2lQ

Thanks to our Patrons who support us in Tiers 2, 3, and 4: donor, Dres. Trost GbR, Yannik Schneider

Paper: Wang, Zirui, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, and Yuan Cao. "SimVLM: Simple Visual Language Model Pretraining with Weak Supervision." arXiv preprint arXiv:2108.10904 (2021). https://arxiv.org/abs/2108.10904

SimVLM Google AI Blog post: https://ai.googleblog.com/2021/10/simvlm-simple-visual-language-model-pre.html

Jia, Chao, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, and Tom Duerig. "Scaling up visual and vision-language representation learning with noisy text supervision."