Introducing Aya
Aya is a global initiative led by Cohere For AI to advance the state of the art in multilingual AI and bridge gaps between people and cultures across the world. It is an open science project to create new models and datasets that expand the number of languages covered by AI, involving over 3,000 independent researchers across 119 countries.
The word Aya comes from the Twi language, where it means “fern”, a symbol of endurance and resourcefulness. Aya embodies our dedication to advancing multilingual AI.
3 Models
513M Total Release Dataset Size
3K Independent Researchers
250+ Language Ambassadors
119 Countries
204K Original Human Annotations
101 Languages
81K Discord Messages
The Aya Models
State of the Art, Accessible Research LLM
Aya Expanse - 8B
State of the Art Research LLM
Aya Expanse - 32B
Massively Multilingual Research LLM
Aya 101
Multilingual AI Research
Aya provides AI researchers a groundbreaking foundation to accelerate multilingual AI progress. Aya is one of the largest open science endeavors in machine learning to date – redefining the research landscape by collaborating with independent researchers from across the globe. The result is a fully open-sourced dataset and model.
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
The Aya Collection stands as the most extensive assembly of multilingual instruction fine-tuning datasets to date, featuring 513 million prompts and completions across 114 languages. We fully open-source the collection, which includes rare, human-curated annotations from fluent speakers worldwide.
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
The Aya Model is a massively multilingual language model capable of following instructions in 101 languages. Developed using a diverse mix of instructions from the Aya Dataset and the Aya Collection, among others, it achieves state-of-the-art performance across numerous multilingual benchmarks.
Aya 23: Open Weight Releases to Further Multilingual Progress
Our technical report shares evaluation results on multiple multilingual NLP benchmarks, along with generation quality assessments.
A Step Forward For Multilingual Generative AI
The numbers behind Aya, our family of massively multilingual research LLMs.
Advancing the state of the art for global languages
The Aya Collection and our first Aya model, Aya 101, cover 101 languages. Half of these were completely under-served by pre-existing language models. Aya Expanse offers enhanced performance for 23 of these languages.
Aya Press
VentureBeat
"Cohere launches new AI models to bridge global language divide"
Silicon Angle
"Cohere announces Aya Expanse multilingual AI model family for researchers"
The New York Times
"When A.I. Fails the Language Test, Who Is Left Out of the Conversation?"
Axios
"New AI polyglot launched to help fill massive language gap in field"
The Washington Post
"Helping the second class citizens of the AI boom"
The Globe and Mail
"AI falls short in many languages. A non-profit project aims to fix that"
Aya at a Glance
Learn more about the journey of Aya, from our collaborators to key breakthroughs and the responsible use of our open-source model.
What’s next?
Aya Expanse Blog
Learn about our latest open-weights release, with 8-billion and 32-billion parameter models.
Join others in the Aya movement
Connect with people worldwide working towards a multilingual future.
Expedition Aya
Explore the 20 projects from our global, multilingual build challenge.
Cohere For AI
Aya is led by Cohere For AI, a non-profit research lab that seeks to solve complex ML problems. We support fundamental research and are focused on creating more points of entry into ML research.