Quickstart

Not Diamond intelligently identifies which LLM is best suited to respond to any given query. In this example, we'll learn how Not Diamond works by making our first API request.

πŸ‘

Try it in Colab

You can follow along with the Python and TypeScript code below, or try it in Colab.

Installation

Python: Requires Python 3.10+. We recommend creating and activating a virtualenv before installing the package. For this example, we'll also install the optional create dependencies, which you can learn more about here.

pip install "notdiamond[create]"

TypeScript:

npm install notdiamond dotenv

Setting up

Create a .env file with your Not Diamond API key and the API keys of the models you want to route between:

NOTDIAMOND_API_KEY="YOUR_NOTDIAMOND_API_KEY"
OPENAI_API_KEY="YOUR_OPENAI_API_KEY"
ANTHROPIC_API_KEY="YOUR_ANTHROPIC_API_KEY"

You can also define API keys programmatically.
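For instance, here is a minimal Python sketch of setting keys at runtime instead of via a .env file. Setting provider keys through os.environ is standard Python; the api_key argument to NotDiamond is an assumption based on the SDK accepting an explicit key:

import os
from notdiamond import NotDiamond

# Provider keys can be set in the environment at runtime instead of in a .env file
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"
os.environ["ANTHROPIC_API_KEY"] = "YOUR_ANTHROPIC_API_KEY"

# Pass the Not Diamond key directly to the client (assumed api_key parameter)
client = NotDiamond(api_key="YOUR_NOTDIAMOND_API_KEY")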

Sending your first Not Diamond API request

Create a new file in the same directory as your .env file, then copy and run the Python or TypeScript code below:

Python:

from notdiamond import NotDiamond

# Define the Not Diamond routing client
client = NotDiamond()

# The best LLM is determined by Not Diamond based on the messages and specified models
result, session_id, provider = client.chat.completions.create(
    messages=[ 
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    model=['openai/gpt-4o', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20240620']
)

print("Not Diamond session ID: ", session_id)  # A unique ID of Not Diamond's recommendation
print("LLM called: ", provider.model)  # The LLM routed to
print("LLM output: ", result.content)  # The LLM response

TypeScript:

import { NotDiamond } from 'notdiamond';
import dotenv from 'dotenv';
dotenv.config();

// Initialize the Not Diamond client
const notDiamond = new NotDiamond({apiKey: process.env.NOTDIAMOND_API_KEY});

// The best LLM is determined by Not Diamond based on the messages and specified models
const result = await notDiamond.modelSelect({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'openai', model: 'gpt-4o-mini' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ]
});

if ('detail' in result) {
  console.error('Error:', result.detail);
} 
else {
  console.log('Not Diamond session ID:', result.session_id);  // A unique ID of Not Diamond's recommendation
  console.log('LLM called:', result.providers);  // The LLM routed to
}

Breaking down this example

We first define the routing client, which you can think of as a meta-LLM that combines multiple underlying LLMs. Throughout our application, we can define multiple clients, each configured for a different purpose.

Python:

client = NotDiamond()

TypeScript:

// Initialize the Not Diamond client
const notDiamond = new NotDiamond({apiKey: process.env.NOTDIAMOND_API_KEY});

After initializing the client, we pass in an array of messages along with the models we want to route between:

Python:

result, session_id, provider = client.chat.completions.create(
    messages=[ 
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}  # Adjust as desired
    ],
    model=['openai/gpt-4o', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20240620']
)

TypeScript:

const result = await notDiamond.modelSelect({
  messages: [
    { role: 'system', content: 'You are a world class programmer.' },
    { role: 'user', content: 'Concisely explain merge sort.' }  // Adjust as desired
  ],
  llmProviders: [
    { provider: 'openai', model: 'gpt-4o' },
    { provider: 'openai', model: 'gpt-4o-mini' },
    { provider: 'anthropic', model: 'claude-3-5-sonnet-20240620' }
  ]
});

This returns a session ID and a recommended model:

  • Session ID: a unique ID for this specific recommendation, useful for submitting feedback on routing decisions.
  • Provider: the LLM selected by the Not Diamond API as the most appropriate for responding to the query.
  • LLM response (Python only): in addition to returning a recommended LLM, the Not Diamond Python SDK can also facilitate client-side requests to the recommended LLM with the create method. Alternatively, we can use model_select to simply return a session ID and a provider, as in the sketch below. You can learn more about these two methods here.
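If we only want the routing recommendation without making an LLM call, a minimal Python sketch using model_select might look like the following. The client.chat.completions.model_select call path and its (session_id, provider) return pair are assumptions modeled on the create example above:

from notdiamond import NotDiamond

client = NotDiamond()

# model_select returns only the routing decision; no LLM request is made client-side
session_id, provider = client.chat.completions.model_select(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Concisely explain merge sort."}
    ],
    model=['openai/gpt-4o', 'openai/gpt-4o-mini', 'anthropic/claude-3-5-sonnet-20240620']
)

print("Not Diamond session ID: ", session_id)
print("LLM recommended: ", provider.model)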

πŸ“˜

Good for use cases with diverse inputs

Not Diamond's out-of-the-box router (which we leverage in this example) is most useful for applications that handle diverse inputs, such as a chatbot or a code generation assistant. For narrower tasks, we can train a custom router optimized to our own data.

Next steps

In this example, we've learned how to dynamically route an array of messages to the best-suited LLM among a set of candidates. To explore all the features Not Diamond offers, check out the following guides.