Pricing

Empower your team to put AI to work

Move from proof of concept into production with our enterprise-ready AI solutions — private, secure, and built to work with your existing systems.

North

Our all-in-one AI platform built to empower humans, automate tasks, and innovate safely across your enterprise.

Get in touch for custom enterprise pricing.

Intuitive interface
Purpose-built generative models
Intelligent search
AI agents for routine tasks and complex workflows

Compass

Our intelligent search and discovery system designed to surface insights from across your business.

Get in touch for custom enterprise pricing.

Pre-built data connectors
Intelligent search
Document parsing
Managed index

Model Vault

Model            Performance Tier    Hourly rate per instance    Monthly rate per instance
Embed 4          Small               $4.00                       $2,500
Embed 4          Medium              $5.00                       $3,250
Rerank 3.5       Medium              $5.00                       $3,250
Rerank 4 Fast    Medium              $5.00                       $3,250
Rerank 4 Pro     Medium              $5.00                       $3,250
Rerank 4 Pro     Large               $10.00                      $6,500

Model Vault is our dedicated, fully managed platform to run Cohere models securely, at scale, and with guaranteed performance.

Pricing is determined per instance based on the selected model and its performance tier. Billing can be calculated hourly or through longer-term commitments (monthly or annual).
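
As a rough illustration of how the hourly and monthly rates compare, the sketch below computes the number of running hours at which the monthly rate becomes the cheaper option for a single instance. The rates are copied from the table above; the function names, and the assumption that hourly billing only accrues while the instance is running, are ours and are not confirmed by this page.

```python
# Rates copied from the Model Vault table: (hourly USD, monthly USD) per instance.
# Assumption: hourly billing accrues only for hours the instance actually runs.
RATES = {
    ("Embed 4", "Small"): (4.00, 2500),
    ("Embed 4", "Medium"): (5.00, 3250),
    ("Rerank 3.5", "Medium"): (5.00, 3250),
    ("Rerank 4 Fast", "Medium"): (5.00, 3250),
    ("Rerank 4 Pro", "Medium"): (5.00, 3250),
    ("Rerank 4 Pro", "Large"): (10.00, 6500),
}


def break_even_hours(model, tier):
    """Hours per month at which hourly billing equals the monthly rate."""
    hourly, monthly = RATES[(model, tier)]
    return monthly / hourly


def cheaper_plan(model, tier, hours_per_month):
    """Return (plan name, cost in USD) for the cheaper option on one instance."""
    hourly, monthly = RATES[(model, tier)]
    hourly_cost = hourly * hours_per_month
    if hourly_cost < monthly:
        return ("hourly", hourly_cost)
    return ("monthly", float(monthly))


if __name__ == "__main__":
    print(break_even_hours("Embed 4", "Small"))
    print(cheaper_plan("Rerank 4 Pro", "Large", 200))
```

Under these assumptions, an instance that runs around the clock (about 720 hours in a 30-day month) costs less on the monthly rate, while an instance used for only a few hundred hours costs less on the hourly rate. Annual commitments are not modeled because this page does not list their rates.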

Model Vault is easy to set up for North or model deployments through your Cohere dashboard, or contact sales to learn more.

Fully managed model deployment
No shared resources or multi-tenancy overhead
Seamless integration with Cohere North
Simple startup and self-serve model access
Fixed or Flex pricing plans available

Frequently asked questions

How do I inquire about model customization and private deployment?

Cohere supports bespoke model customization for our enterprise customers along with private deployment of all our products. Custom pricing is based on your enterprise needs. Learn more about customization here and private deployment here.

How do I get a Trial API key?

When an account is created, we automatically create a Trial API key for you. This API key will be available on the dashboard for you to copy, as well as in the dashboard section called “API Keys.”

How do I get a Production API key?

To get a Production key, you'll need to have Owner privileges (or ask your organization Owner to complete the following steps). Navigate to the Billing and Usage page in your Cohere dashboard. Click on the Get Your Production key button and fill out the Go to Production workflow.

What’s the difference between a Trial API key and Production API key?

API calls made from a Trial API key are free. However, trial keys are rate limited and are not permitted to be used for production or commercial purposes. API calls made from a Production API key will be charged on a pay-as-you-go basis. Production API keys are designed for production use at scale.

Are there any account limitations upon signup?

Every account begins as a personal account and only has access to Trial API keys. As a personal account, you will not be able to add other members until you become part of an organization.

What’s the difference between an organization and a personal account?

At Cohere, an organization is a group of personal accounts that share a single billing portal. Organizations are not automatically given Production API key access; a member of the organization must still fill out our application form for production access. Personal accounts cannot share billing information with other accounts.

Which model should I pick?

Your model selection reflects how you prioritize model performance relative to speed. Larger models offer better performance and are capable of more complex tasks, while smaller models have faster response times.

When do I get billed?

API calls made from a Trial API key are free. API calls made from a Production key are billed on a pay-as-you-go basis. Your bill is issued at the end of every calendar month or when your outstanding balance reaches $250, whichever comes first.

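The billing rule above can be sketched as a small simulation. This is an illustration only: the `invoice_dates` helper is hypothetical, and it assumes that the whole outstanding balance is invoiced and reset to zero each time a bill is issued, and that the $250 check happens once at the end of each day. Neither detail is stated on this page.

```python
import calendar
from datetime import date, timedelta

THRESHOLD_USD = 250.00  # outstanding balance that triggers a bill (from the FAQ)


def invoice_dates(daily_charges, start):
    """Return a list of (date, amount) bills for a run of daily charges.

    Assumptions (not confirmed by the source): the balance resets to zero
    after each bill, and the threshold is checked once per day.
    """
    invoices = []
    balance = 0.0
    day = start
    for charge in daily_charges:
        balance += charge
        last_day = calendar.monthrange(day.year, day.month)[1]
        month_end = day.day == last_day
        if balance >= THRESHOLD_USD or (month_end and balance > 0):
            invoices.append((day, round(balance, 2)))
            balance = 0.0
        day += timedelta(days=1)
    return invoices


if __name__ == "__main__":
    # $20 of usage per day throughout January 2025.
    for when, amount in invoice_dates([20.0] * 31, date(2025, 1, 1)):
        print(when.isoformat(), amount)
```

With steady usage of $20 per day, the sketch produces two threshold-triggered bills during the month and a smaller bill on the last calendar day.
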
The endpoint I’m using is billed by token. What is a token?

Language models understand “tokens” rather than characters or bytes. The number of tokens per word depends on the complexity of the text. Simple text may approach 1 token per word on average, while complex texts may use less common words that require 3-4 tokens per word on average. For more details on tokens, refer to this page.

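For a quick back-of-the-envelope estimate, the tokens-per-word range above can be turned into lower and upper bounds on a text's token count. The helper below is a hypothetical sketch that splits on whitespace; it does not run a real tokenizer, so treat the result as a rough range rather than a billable figure.

```python
import math


def estimate_token_range(text, low_per_word=1.0, high_per_word=4.0):
    """Return (low, high) token estimates from a whitespace word count.

    The 1 and 4 tokens-per-word defaults come from the range quoted in the
    FAQ; actual counts depend on the model's tokenizer.
    """
    words = len(text.split())
    return (math.ceil(words * low_per_word), math.ceil(words * high_per_word))


if __name__ == "__main__":
    print(estimate_token_range("The quick brown fox"))
```
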
Where do I find pricing for our legacy models?

For existing customers:
Command pricing is $1.00/1M tokens for input and $2.00/1M tokens for output
Command-light pricing is $0.30/1M tokens for input and $0.60/1M tokens for output
Command R 03-2024 pricing is $0.50/1M tokens for input and $1.50/1M tokens for output
Command R+ 04-2024 pricing is $3.00/1M tokens for input and $15.00/1M tokens for output
Command R+ 08-2024 pricing is $2.50/1M tokens for input and $10.00/1M tokens for output

What is the cost for accessing the research Aya models via the API?

Aya Expanse models (8B and 32B) on the API are charged at $0.50/1M tokens for input and $1.50/1M tokens for output. Find more information about the Aya models here.

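To make the per-million-token rates concrete, here is a minimal cost calculator built from the legacy and Aya Expanse prices listed above. The dictionary keys are display labels taken from this page, not necessarily the model identifiers the API expects, and `estimate_cost` is a hypothetical helper.

```python
# (input USD, output USD) per 1M tokens, copied from the FAQ answers above.
# Keys are display labels from this page, not confirmed API model IDs.
PRICES_PER_1M = {
    "Command": (1.00, 2.00),
    "Command-light": (0.30, 0.60),
    "Command R 03-2024": (0.50, 1.50),
    "Command R+ 04-2024": (3.00, 15.00),
    "Command R+ 08-2024": (2.50, 10.00),
    "Aya Expanse": (0.50, 1.50),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated charge in USD for one workload."""
    in_rate, out_rate = PRICES_PER_1M[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on Command R+ 08-2024.
    print(estimate_cost("Command R+ 08-2024", 2_000_000, 500_000))
```

In the example workload, input and output each contribute $5.00, which shows how quickly the higher output rate can dominate even when output volume is much smaller.
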
How is a “search” defined for Rerank pricing?

A single search unit is defined as one query with up to 100 documents to be ranked.

If any document exceeds 500 tokens (including the length of the search query), it is automatically split into multiple chunks. Each chunk is treated as an individual document and counts toward the total number of documents ranked for that search.

This ensures consistent performance and accurate pricing when working with longer documents.
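
The chunking rule can be sketched as follows. This is an estimate under stated assumptions, not Cohere's billing logic: it assumes each chunk can hold 500 tokens minus the query length, and that a query ranking more than 100 documents (after chunking) is counted as additional search units, rounded up. The `count_search_units` helper is hypothetical.

```python
import math

MAX_DOCS_PER_SEARCH = 100  # documents covered by one search unit (from the FAQ)
MAX_TOKENS_PER_DOC = 500   # per-document limit, including the query (from the FAQ)


def count_search_units(query_tokens, doc_token_counts):
    """Return (documents after chunking, estimated search units).

    Assumptions: a chunk holds MAX_TOKENS_PER_DOC - query_tokens document
    tokens, and every started block of 100 documents is one search unit.
    """
    budget = MAX_TOKENS_PER_DOC - query_tokens
    if budget <= 0:
        raise ValueError("query leaves no room for document tokens")
    chunks = sum(max(1, math.ceil(n / budget)) for n in doc_token_counts)
    return chunks, math.ceil(chunks / MAX_DOCS_PER_SEARCH)


if __name__ == "__main__":
    # 90 short documents plus 6 long ones, with a 20-token query.
    docs = [100] * 90 + [1000] * 6
    print(count_search_units(20, docs))
```

In this example only 96 documents are submitted, but the six long ones are each split into three chunks, which pushes the ranked total past 100 and into a second search unit under the assumptions above.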

Why enterprises and innovators choose Cohere

“With Cohere's latest highly secure enterprise LLMs, we aim to provide businesses with powerful and adaptable AI solutions that address specific needs and accelerate the adoption of generative AI globally.”

— Vivek Mahajan, Corporate Vice President, CTO and CPO

Ready to put AI to work?

Request a demo and see how Cohere's secure and private AI platform can unlock productivity for your business.

See how Cohere's AI models can accommodate your specific enterprise use cases
Determine the best deployment options for your enterprise
Learn how Cohere can swiftly move AI into production

Frequently asked questions

How do I inquire about model customization and private deployment?

How do I get a Trial API key?

How do I get a Production API key?

What’s the difference between a Trial API key and Production API key?

Are there any account limitations upon signup?

What’s the difference between an organization and a personal account?

Which model should I pick?

When do I get billed?

The endpoint I’m using is billed by token. What is a token?

Where do I find pricing for our legacy models?

What is the cost for accessing the research Aya models via the API?

How is a “search” defined for Rerank pricing?
