Introducing Aya
Aya is a global initiative led by Cohere For AI involving over 3,000 independent researchers across 119 countries. It delivers a state-of-the-art model and dataset, pushing the boundaries of multilingual AI for 101 languages through open science.
The word Aya comes from the Twi word for “fern”, a symbol of endurance and resourcefulness. Aya embodies our dedication to advancing multilingual AI.
3 Models
513M Total Release Dataset Size
3K Independent Researchers
56 Language Ambassadors
119 Countries
204K Original Human Annotations
101 Languages
31K Discord Messages
The Aya Models
Aya 23 - 8B: State of the Art, Accessible Research LLM
Aya 23 - 35B: State of the Art Research LLM
Aya 101: Massively Multilingual Research LLM
Multilingual AI Research
Aya provides AI researchers with a groundbreaking foundation to accelerate multilingual AI progress. Aya is one of the largest open science endeavors in machine learning to date, redefining the research landscape through collaboration with independent researchers from across the globe. The result is a fully open-sourced dataset and model.
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
The Aya Collection is the most extensive assembly of multilingual instruction fine-tuning datasets to date, featuring 513 million prompts and completions across 114 languages. We fully open-source the collection, which includes rare, human-curated annotations from fluent speakers worldwide.
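For researchers who want to explore the data, here is a minimal sketch of loading it with the Hugging Face datasets library. The repository identifier CohereForAI/aya_dataset is an assumption about where the human-annotated portion is hosted, and the record fields are illustrative rather than an official schema.

```python
from datasets import load_dataset

# Assumed Hugging Face Hub identifier for the human-annotated Aya Dataset;
# the larger Aya Collection is released separately in per-subset repositories.
aya = load_dataset("CohereForAI/aya_dataset", split="train")

# Each record pairs an instruction-style prompt with a completion plus language
# metadata (exact field names may differ in the released schema).
print(aya[0])
```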
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
The Aya Model is a massively multilingual language model capable of following instructions in 101 languages. Trained on a diverse mix of instructions drawn from the Aya Dataset and Aya Collection, among other sources, it achieves state-of-the-art performance across numerous multilingual benchmarks.
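As a rough illustration of how a researcher might query Aya 101 with the Hugging Face transformers library, a sketch follows. The checkpoint name CohereForAI/aya-101 and the encoder-decoder (seq2seq) loading path are assumptions based on Aya 101 being an mT5-style model, not an official quickstart.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumed checkpoint name on the Hugging Face Hub.
checkpoint = "CohereForAI/aya-101"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Instructions can be written in any of the supported languages.
prompt = "Translate to Turkish: Aya supports over one hundred languages."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```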
Aya 23: Open Weight Releases to Further Multilingual Progress
Our technical report shares evaluation results on multiple multilingual NLP benchmarks and generation quality assessments.
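For comparison, a similar sketch for the decoder-only Aya 23 open weights is shown below. The checkpoint name CohereForAI/aya-23-8B and the chat-template call are assumptions based on common transformers conventions for chat-tuned causal language models, not instructions from the technical report.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name for the 8B open-weights release.
checkpoint = "CohereForAI/aya-23-8B"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16)

# Format the user turn with the tokenizer's chat template before generating.
messages = [{"role": "user", "content": "Explain briefly what multilingual AI is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```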
A Step Forward For Multilingual Generative AI
Evaluations for Aya 101, our original, massively multilingual research LLM.
Human evaluation of the Aya model shows consistent gains
The Aya model follows instructions and generates responses of significantly higher quality than mT0x. In human evaluations, professional annotators who compared model responses to instructions given in multiple languages preferred the Aya model 77% of the time on average.
Aya Press
The New York Times
"When A.I. Fails the Language Test, Who Is Left Out of the Conversation?"
VentureBeat
"Cohere launches open weights AI model Aya 23 with support for 23 languages"
AI Business
"Build Multilingual AI Solutions with Cohere’s New Aya Model"
Axios
"New AI polyglot launched to help fill massive language gap in field"
The Washington Post
"Helping the second class citizens of the AI boom"
The Globe and Mail
"AI falls short in many languages. A non-profit project aims to fix that"
Aya at a Glance
Learn more about the journey of Aya, from our collaborators to key breakthroughs and the responsible use of our open-source model.
What’s next?
Aya 23 Blog
Learn about our latest 8- and 35-billion-parameter open-weights release.
Join others in the Aya movement
Connect with people worldwide working towards a multilingual future.
Expedition Aya
Explore the 20 projects from our global, multilingual build challenge.
Cohere For AI
Aya is led by Cohere For AI, a non-profit research lab that seeks to solve complex ML problems. We support fundamental research and are focused on creating more points of entry into ML research.