Introducing Command R+: Our new, most powerful model in the Command R family.
A large professional services firm wanted to build a knowledge assistant with retrieval-augmented generation (RAG) over its knowledge base of tens of millions of documents, initially for internal users and ultimately as a product for its clients. Its proof of concept combined commercial generative models with open-source embedding and reranking solutions for search, but long term the customer needed a solution that could run in secure customer environments.
The customer replaced its open-source retrieval models with Cohere Embed and Cohere Rerank after testing them side by side and seeing substantial increases in retrieval accuracy. The team is currently replacing its generative model with a fine-tuned Cohere Command, making it possible to deploy the solution anywhere within its own network and in any client environment while staying fully compliant with demanding security requirements.
STEP 1. Unstructured knowledge base documents are embedded by Embed and stored in a vector database.
STEP 2. Command interprets a user's request and creates queries across legacy and vector search sources.
STEP 3. Rerank re-orders the responses based on relevance to the original queries, improving the accuracy of the search results.
STEP 4. Command synthesizes a conversational response, with citations, back to the user.
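The four-step flow above can be sketched end to end. The snippet below is a minimal, self-contained illustration only: it stands in toy bag-of-words vectors for Cohere Embed, a keyword-overlap scorer for Cohere Rerank, and string templating for Command's grounded generation. All function and variable names here are illustrative, not part of Cohere's API.

```python
from collections import Counter
import math

# STEP 1: "embed" documents with toy bag-of-words vectors (stand-in for Embed)
# and store them in an in-memory "vector database".
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Expense reports must be filed within 30 days of travel.",
    "The cafeteria is open from 8am to 3pm on weekdays.",
    "Travel expenses over 500 dollars need manager approval.",
]
vector_db = [(doc, embed(doc)) for doc in documents]

# STEP 2: turn the user's request into a search query. In the real system
# Command would interpret the request and fan queries out to multiple sources.
user_request = "How do I get my travel expenses approved?"
query = user_request  # a real system would let the model rewrite/expand this

q_vec = embed(query)
candidates = sorted(vector_db, key=lambda d: cosine(q_vec, d[1]), reverse=True)[:2]

# STEP 3: re-order candidates by relevance to the query (stand-in for Rerank).
def rerank_score(query: str, doc: str) -> float:
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

reranked = sorted(candidates, key=lambda d: rerank_score(query, d[0]), reverse=True)

# STEP 4: synthesize a grounded answer with a citation back to the source
# document (stand-in for Command's cited generation).
top_doc, _ = reranked[0]
answer = f"{top_doc} [1]"
citations = {1: top_doc}
print(answer)
```

In production, the scoring stand-ins would be replaced by calls to the hosted or privately deployed models, but the data flow (embed, query, rerank, generate with citations) is the same.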
Increased staff productivity
More relevant search results
New business offering for clients
Cohere’s retrieval prioritizes accurate responses and citations
Cohere’s models come with connectors to common data sources
Cohere’s models can be fine-tuned to further improve domain performance
Cohere’s powerful inference frameworks optimize throughput and reduce compute requirements
Cohere models can be accessed through a SaaS API, on cloud infrastructure (Amazon SageMaker, Amazon Bedrock, OCI Data Science, Google Vertex AI, Azure AI), and in private deployments (virtual private cloud and on-premises)
Over 100 languages are supported, so the same topics, products, and issues are identified the same way