
Cohere For AI - Guest Speaker: Kaiyu Yue, Student at Univ. of Maryland


Date: Jun 12, 2024

Time: 4:00 PM - 5:00 PM

Location: Online

Abstract: We present an approach that poses object recognition as next-token prediction. The idea is to apply a language decoder that auto-regressively predicts text tokens from image embeddings to form labels. To ground this prediction process in auto-regression, we customize a non-causal attention mask for the decoder with two key features: tokens from different labels are modeled as independent, and image tokens are treated as a prefix. This masking mechanism inspires an efficient method, one-shot sampling, which samples the tokens of multiple labels in parallel and ranks the generated labels by their probabilities during inference. To further improve efficiency, we propose a simple strategy for constructing a compact decoder: discard the intermediate blocks of a pretrained language model. This approach yields a decoder that matches the full model's performance while being notably more efficient. abs:
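To make the masking idea in the abstract concrete, here is a minimal sketch of one way such a mask could be built. It is an illustration under stated assumptions, not the speaker's implementation: the function name `build_label_mask`, the boolean convention (`True` means "row token may attend to column token"), and the choice to let image tokens attend to each other bidirectionally while label tokens attend causally within their own label are all assumptions made for this example.

```python
import numpy as np

def build_label_mask(num_image_tokens, label_lengths):
    """Hypothetical helper: build a boolean attention mask.

    mask[i, j] is True when token i may attend to token j.
    Assumed layout: image tokens first (the prefix), then the
    tokens of each label laid out one label after another.
    """
    total = num_image_tokens + sum(label_lengths)
    mask = np.zeros((total, total), dtype=bool)

    # Assumption: image tokens form a prefix that attends to itself
    # in both directions (the non-causal part of the mask).
    mask[:num_image_tokens, :num_image_tokens] = True

    start = num_image_tokens
    for length in label_lengths:
        end = start + length
        # Every label token can see the whole image prefix.
        mask[start:end, :num_image_tokens] = True
        # Within a label, attention is causal (lower triangular).
        mask[start:end, start:end] = np.tril(np.ones((length, length), dtype=bool))
        # No entries are set between different labels, so labels
        # stay independent of one another.
        start = end
    return mask

# Example: 2 image tokens, one 2-token label and one 1-token label.
mask = build_label_mask(2, [2, 1])
print(mask.astype(int))
```

Because no label can see another label's tokens under this layout, the tokens of all labels could in principle be decoded side by side in a single forward pass, which is the intuition behind the parallel sampling mentioned in the abstract.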
