Deployment options

AI on your terms

Run our AI anywhere with deployment options tailored to your unique infrastructure, security, and performance needs.

Private

Run our models on-premises or within an isolated virtual private cloud (VPC) environment for complete data sovereignty and governance.


Best suited for:

  • Highly regulated industries

  • Meeting strict data residency needs

  • Protecting highly sensitive data

Public cloud

Deploy on leading cloud AI platforms such as AWS, Azure, OCI, and GCP for seamless integration, built-in scalability, and robust security.


Best suited for:

  • Cloud-first enterprises

  • Companies with variable workloads

  • Organizations requiring global availability

Hybrid cloud

Integrate private infrastructure with public cloud resources to balance compliance requirements with flexibility and scalability.


Best suited for:

  • Enterprises needing local control and cloud flexibility

  • Optimizing cost and performance across environments

SaaS

Deploy through our fully managed SaaS platform to scale securely without the cost and complexity of managing your own infrastructure. Available for model deployments only (Command, Rerank, and Embed); North and Compass are not offered through SaaS.


Best suited for:

  • Small to midsize businesses

  • Handling non-sensitive data

  • Running AI without infrastructure overhead

Compare deployment options

| | Private | Public cloud | Hybrid cloud | SaaS |
|---|---|---|---|---|
| Available for | Command, Rerank, Embed, North, Compass | Command, Rerank, Embed, North, Compass | Command, Rerank, Embed, North, Compass | Command, Rerank, Embed |
| Time to get started | < 1 day | < 1 hour | < 1 day | Instantly |
| Key benefits | Run securely behind your firewall; achieve complete data sovereignty; scale and customize to your exact needs | Run on any cloud AI/ML platform; access elastic capacity on demand; protect your data with robust cloud security standards | Run each workload in its optimal environment for maximum efficiency; run regulated workloads privately | Get started instantly with no setup; simplify operations with fully managed compute; protect data with dedicated instances |
| Pricing structure | Per model instance | Per token and per instance | Per token and per instance | Per token |
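
For teams that want to shortlist options programmatically, the comparison above can be captured as plain data. The following is a minimal sketch in Python, not an official tool or API: the `DeploymentOption` class and the `options_for` helper are hypothetical names introduced here for illustration, and the values are copied from the comparison above.

```python
from dataclasses import dataclass

# Hypothetical data model for illustration; values mirror the comparison above.
@dataclass(frozen=True)
class DeploymentOption:
    name: str
    products: frozenset
    time_to_start: str
    pricing: str

ALL_PRODUCTS = frozenset({"Command", "Rerank", "Embed", "North", "Compass"})
MODELS_ONLY = frozenset({"Command", "Rerank", "Embed"})

OPTIONS = [
    DeploymentOption("Private", ALL_PRODUCTS, "< 1 day", "Per model instance"),
    DeploymentOption("Public cloud", ALL_PRODUCTS, "< 1 hour", "Per token and per instance"),
    DeploymentOption("Hybrid cloud", ALL_PRODUCTS, "< 1 day", "Per token and per instance"),
    DeploymentOption("SaaS", MODELS_ONLY, "Instantly", "Per token"),
]

def options_for(required_products):
    """Return the names of deployment options that offer every required product."""
    required = set(required_products)
    return [option.name for option in OPTIONS if required <= option.products]

if __name__ == "__main__":
    # Models alone are available everywhere, including SaaS.
    print(options_for(["Embed", "Rerank"]))
    # North is not offered through SaaS, so SaaS drops out.
    print(options_for(["North", "Command"]))
```

Filtering by product is only a first pass; data residency, workload variability, and pricing structure would still need to be weighed against the guidance for each option.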