The workshop will take place on Monday, November 28, 2022. The schedule is planned as follows (all times in Eastern Time, ET):
- 08:50 - 09:00 - Opening Remarks
- 09:00 - 09:30 - Invited Talk: Danqi Chen: Building Language Models Based on Retrieval
- 09:30 - 10:00 - Invited Talk: Justin Lin: Towards Unified Multimodal Pretraining
- 10:00 - 10:10 - Contributed Oral: What Do All Audio Transformer Models Hear? Probing Acoustic Representations For Language Delivery And Structure
- 10:10 - 10:20 - Contributed Oral: STT: Soft Template Tuning for Few-Shot Adaptation
- 10:20 - 10:40 - Coffee Break
- 10:40 - 11:10 - Invited Talk: Letitia Parcalabescu: VALSE: Phenomenon-Centered Testing of Vision and Language Models
- 11:10 - 11:40 - Invited Talk: Xifeng Yan: Limitations of Language Models in Arithmetic Induction
- 11:40 - 12:30 - Poster Session
- 12:30 - 13:30 - Lunch Break
- 13:30 - 14:00 - Invited Talk: Lu Yuan: TBD
- 14:00 - 14:30 - Invited Talk: Jiasen Lu: Unified-IO: A Unified Model for Vision, Language and Multi-Modal Tasks
- 14:30 - 14:40 - Contributed Oral: Zero-shot Object Detection Through Vision-Language Embedding Alignment
- 14:40 - 14:50 - Contributed Oral: A Multi-level Alignment Training Scheme for Video-and-Language Grounding
- 14:50 - 15:10 - Coffee Break
- 15:10 - 15:40 - Invited Talk: Jason Baldridge: What's Missing in Text-to-Image Generation? Current Models and Paths Forward
- 15:40 - 16:10 - Invited Talk: Tengyu Ma: Toward Understanding Foundation Models
- 16:10 - 17:00 - Panel Discussion: Jianfeng Gao, Trishul Chilimbi, Ruslan Salakhutdinov, Christoph Schuhmann, Ludwig Schmidt, Mohammad Norouzi
- 17:00 - 17:05 - Closing Remarks