notdiamond.llms
notdiamond.llms.client
- class notdiamond.llms.client.NotDiamond(nd_api_url: str | None = 'https://api.notdiamond.ai', user_agent: str | None = 'Python-SDK/0.3.32', *args, api_key: str, llm_configs: List[LLMConfig | str] | None = None, default: LLMConfig | int | str, max_model_depth: int | None = None, latency_tracking: bool, hash_content: bool, tradeoff: str | None = None, preference_id: str | None = None, tools: Sequence[Dict[str, Any] | Callable] | None = None, callbacks: List | None = None, max_retries: int, timeout: float)[source]
Bases:
_NDRouterClient
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- Parameters:
nd_api_url (str | None)
user_agent (str | None)
api_key (str)
llm_configs (List[LLMConfig | str] | None)
default (LLMConfig | int | str)
max_model_depth (int | None)
latency_tracking (bool)
hash_content (bool)
tradeoff (str | None)
preference_id (str | None)
tools (Sequence[Dict[str, Any] | Callable] | None)
callbacks (List | None)
max_retries (int)
timeout (float)
- api_key: str
API key required for making calls to NotDiamond. You can get an API key via our dashboard: https://app.notdiamond.ai. If an API key is not set, the client will look for NOTDIAMOND_API_KEY in your .env file.
- default: LLMConfig | int | str
Set a default LLM so that if anything goes wrong in the flow (for example, the NotDiamond API call fails), your code won't break and you have a fallback model. There are several ways to configure a default model (see the sketch after this list):
Integer, specifying the index of the default provider in the llm_configs list
String, in the same 'provider_name/model_name' format used for llm_configs
LLMConfig, directly specifying the provider config object
If no default is set, the first LLM in the list is used as the default.
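A minimal construction sketch; the top-level NotDiamond import and the model strings are illustrative assumptions, so check the current docs for supported models and import paths:

```python
# Hedged sketch: constructing the client and showing each form of `default`.
from notdiamond import NotDiamond  # assumed top-level export
from notdiamond.llms.config import LLMConfig

client = NotDiamond(
    api_key="YOUR_NOTDIAMOND_API_KEY",  # or set NOTDIAMOND_API_KEY in your .env
    llm_configs=[
        "openai/gpt-4o",
        "anthropic/claude-3-5-sonnet-20240620",
    ],
    default=0,  # index into llm_configs
    # default="openai/gpt-4o",                               # 'provider_name/model_name' string
    # default=LLMConfig(provider="openai", model="gpt-4o"),  # explicit config object
)
```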
- hash_content: bool
Whether to hash message content before it is sent to the NotDiamond API. Defaults to False.
- latency_tracking: bool
Whether to track the latency of each LLM call and send it to the NotDiamond server as feedback, so we can improve our router. Enabled by default; set it to False to disable.
- max_model_depth: int | None
If your top recommended model is down, specify up to which routing depth you're willing to go. If max_model_depth is not set, it defaults to the length of the llm_configs list. If max_model_depth is set to 0, initialization fails. If the value is larger than the length of llm_configs, it is reset to len(llm_configs).
- max_retries: int
The maximum number of retries to make when calling the Not Diamond API.
- nd_api_url: str | None
The URL of the NotDiamond API. Defaults to settings.NOTDIAMOND_API_URL.
- preference_id: str | None
The ID of the router preference that was configured via the Dashboard. Defaults to None.
- timeout: float
The timeout for the Not Diamond API call.
- tools: Sequence[Dict[str, Any] | Callable] | None
Bind tools to the LLM object. The tools will be passed to the LLM object when invoking it.
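A hedged sketch of binding a plain callable as a tool (the tool name and body are hypothetical; the schema is derived from the function signature and docstring):

```python
# Hedged sketch: passing a plain Python callable as a tool.
from notdiamond import NotDiamond  # assumed top-level export

def lookup_order(order_id: str) -> str:
    """Look up the shipping status of an order."""
    return f"Order {order_id}: shipped"

client = NotDiamond(
    api_key="YOUR_NOTDIAMOND_API_KEY",
    llm_configs=["openai/gpt-4o"],
    tools=[lookup_order],  # forwarded to the selected LLM when it is invoked
)
```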
- tradeoff: str | None
[DEPRECATED] The tradeoff constructor parameter is deprecated and will be removed in a future version. Please specify the tradeoff when using model_select or invocation methods.
Defines the tradeoff between "cost" and "latency" for the router when determining the best LLM for a given query. If None, the router considers neither cost nor latency.
Supported values: "cost", "latency".
Defaults to None.
- user_agent: str | None
The User-Agent string sent with NotDiamond API requests. Defaults to the SDK identifier (e.g. 'Python-SDK/0.3.32').
notdiamond.llms.config
- class notdiamond.llms.config.EmbeddingConfig(provider: str, model: str, api_key: str | None = None, **kwargs)[source]
Bases:
object
A NotDiamond embedding provider config (or EmbeddingConfig) is represented by a combination of provider and model. Provider refers to the company behind the foundational model, such as openai, anthropic, or google. Model is the model name as defined by the owning company, such as text-embedding-3-large. Besides these, you can also specify the API key for each provider, as well as extra arguments that are also supported by LangChain.
All supported providers and models can be found in our docs.
If the API key is not specified, the Config will try to read the key from an .env file before failing. For example, the Config will look for OPENAI_API_KEY to authenticate any OpenAI provider.
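A minimal sketch of defining an embedding provider, either explicitly or via the string form (the model name is illustrative):

```python
# Hedged sketch: two equivalent ways to define an embedding provider config.
from notdiamond.llms.config import EmbeddingConfig

embedding = EmbeddingConfig(
    provider="openai",
    model="text-embedding-3-large",
    # api_key="sk-...",  # optional; otherwise OPENAI_API_KEY is read from the environment
)

# The 'provider_name/model_name' string form:
embedding_from_string = EmbeddingConfig.from_string("openai/text-embedding-3-large")
```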
- provider
The name of the embedding provider (e.g., "openai", "anthropic"). Must be one of the predefined providers in POSSIBLE_EMBEDDING_PROVIDERS.
- Type:
str
- model
The name of the embedding model to use (e.g., "text-embedding-3-large"). Must be one of the predefined models in POSSIBLE_MODELS.
- Type:
str
- api_key
The API key for accessing the LLM provider’s services. Defaults to None, in which case it tries to fetch from the environment.
- Type:
Optional[str], optional
- **kwargs
Additional keyword arguments that might be necessary for specific providers or models.
- Raises:
UnsupportedEmbeddingProvider – If the provider or model specified is not supported.
- Parameters:
provider (str)
model (str)
api_key (str | None)
- Parameters:
provider (str) – The name of the embedding provider (e.g., “openai”, “anthropic”).
model (str) – The name of the embedding model to use (e.g., “text-embedding-3-large”).
api_key (Optional[str], optional) – The API key for accessing the embedding provider’s services. Defaults to None.
**kwargs – Additional keyword arguments that might be necessary for specific providers or models.
- Raises:
UnsupportedEmbeddingProvider – If the provider or model specified is not supported.
- __init__(provider: str, model: str, api_key: str | None = None, **kwargs)[source]
- Parameters:
provider (str) – The name of the embedding provider (e.g., “openai”, “anthropic”).
model (str) – The name of the embedding model to use (e.g., “text-embedding-3-large”).
api_key (Optional[str], optional) – The API key for accessing the embedding provider’s services. Defaults to None.
**kwargs – Additional keyword arguments that might be necessary for specific providers or models.
- Raises:
UnsupportedEmbeddingProvider – If the provider or model specified is not supported.
- classmethod from_string(llm_provider: str)[source]
We allow our users to specify embedding providers for NotDiamond in the string format 'provider_name/model_name', for example 'openai/text-embedding-3-large'. Our workflows expect EmbeddingConfig as the base type, so this class method converts a string specification of an embedding provider into an EmbeddingConfig object.
- Parameters:
llm_provider (str) – the 'provider_name/model_name' string definition of the provider
- Returns:
initialized object with the correct provider and model
- Return type:
EmbeddingConfig
- set_api_key(api_key: str) → EmbeddingConfig[source]
- Parameters:
api_key (str)
- Return type:
EmbeddingConfig
- class notdiamond.llms.config.LLMConfig(provider: str, model: str, is_custom: bool = False, system_prompt: str | None = None, context_length: int | None = None, input_price: float | None = None, custom_input_price: float | None = None, output_price: float | None = None, custom_output_price: float | None = None, latency: float | None = None, custom_latency: float | None = None, api_key: str | None = None, **kwargs)[source]
Bases:
object
A NotDiamond LLM provider config (or LLMConfig) is represented by a combination of provider and model. Provider refers to the company behind the foundational model, such as openai, anthropic, or google. Model is the model name as defined by the owning company, such as gpt-3.5-turbo. Besides these, you can also specify the API key for each provider, extra arguments that are also supported by LangChain (e.g. temperature), and a system prompt to be used with the provider. If the provider is selected during routing, its system prompt will be used, replacing any system message already present in the message array.
All supported providers and models can be found in our docs.
If the API key is not specified, the config will try to pick it up from an .env file before failing. For example, for OpenAI it will look for OPENAI_API_KEY.
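A minimal sketch of an LLMConfig with a per-provider system prompt and an extra LangChain-style keyword argument (the values shown are illustrative):

```python
# Hedged sketch: an LLMConfig with a provider-level system prompt and extra kwargs.
from notdiamond.llms.config import LLMConfig

gpt_4o = LLMConfig(
    provider="openai",
    model="gpt-4o",
    system_prompt="You are a concise assistant.",  # used if this model is routed to
    temperature=0.2,  # extra kwarg forwarded to the underlying LangChain client
    # api_key="sk-...",  # optional; otherwise OPENAI_API_KEY is read from the environment
)
```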
- provider
The name of the LLM provider (e.g., “openai”, “anthropic”). Must be one of the predefined providers in POSSIBLE_PROVIDERS.
- Type:
str
- model
The name of the LLM model to use (e.g., “gpt-3.5-turbo”). Must be one of the predefined models in POSSIBLE_MODELS.
- Type:
str
- system_prompt
The system prompt to use for the provider. Defaults to None.
- Type:
Optional[str], optional
- api_key
The API key for accessing the LLM provider’s services. Defaults to None, in which case it tries to fetch from the settings.
- Type:
Optional[str], optional
- openrouter_model
The OpenRouter model equivalent for this provider / model
- Type:
str
- **kwargs
Additional keyword arguments that might be necessary for specific providers or models.
- Raises:
UnsupportedLLMProvider – If the provider or model specified is not supported.
- Parameters:
provider (str)
model (str)
is_custom (bool)
system_prompt (str | None)
context_length (int | None)
input_price (float | None)
custom_input_price (float | None)
output_price (float | None)
custom_output_price (float | None)
latency (float | None)
custom_latency (float | None)
api_key (str | None)
- Parameters:
provider (str) – The name of the LLM provider (e.g., “openai”, “anthropic”).
model (str) – The name of the LLM model to use (e.g., “gpt-3.5-turbo”).
is_custom (bool) – Whether this is a custom model. Defaults to False.
system_prompt (Optional[str], optional) – The system prompt to use for the provider. Defaults to None.
context_length (Optional[int], optional) – Custom context window length for the provider/model.
custom_input_price (Optional[float], optional) – Custom input price (USD) per million tokens for this provider/model; will default to public input price if available.
custom_output_price (Optional[float], optional) – Custom output price (USD) per million tokens for this provider/model; will default to public output price if available.
custom_latency (Optional[float], optional) – Custom latency (time to first token) for provider/model.
api_key (Optional[str], optional) – The API key for accessing the LLM provider’s services. Defaults to None.
**kwargs – Additional keyword arguments that might be necessary for specific providers or models.
input_price (float | None)
output_price (float | None)
latency (float | None)
- Raises:
UnsupportedLLMProvider – If the provider or model specified is not supported.
- __init__(provider: str, model: str, is_custom: bool = False, system_prompt: str | None = None, context_length: int | None = None, input_price: float | None = None, custom_input_price: float | None = None, output_price: float | None = None, custom_output_price: float | None = None, latency: float | None = None, custom_latency: float | None = None, api_key: str | None = None, **kwargs)[source]
- Parameters:
provider (str) – The name of the LLM provider (e.g., “openai”, “anthropic”).
model (str) – The name of the LLM model to use (e.g., “gpt-3.5-turbo”).
is_custom (bool) – Whether this is a custom model. Defaults to False.
system_prompt (Optional[str], optional) – The system prompt to use for the provider. Defaults to None.
context_length (Optional[int], optional) – Custom context window length for the provider/model.
custom_input_price (Optional[float], optional) – Custom input price (USD) per million tokens for this provider/model; will default to public input price if available.
custom_output_price (Optional[float], optional) – Custom output price (USD) per million tokens for this provider/model; will default to public output price if available.
custom_latency (Optional[float], optional) – Custom latency (time to first token) for provider/model.
api_key (Optional[str], optional) – The API key for accessing the LLM provider’s services. Defaults to None.
**kwargs – Additional keyword arguments that might be necessary for specific providers or models.
input_price (float | None)
output_price (float | None)
latency (float | None)
- Raises:
UnsupportedLLMProvider – If the provider or model specified is not supported.
- classmethod from_string(llm_provider: str)[source]
We allow our users to specify LLM providers for NotDiamond in the string format 'provider_name/model_name', for example 'openai/gpt-3.5-turbo'. Our workflows expect LLMConfig as the base type, so this class method converts a string specification of an LLM provider into an LLMConfig object.
- Parameters:
llm_provider (str) – the 'provider_name/model_name' string definition of the LLM provider
- Returns:
initialized object with the correct provider and model
- Return type:
LLMConfig
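For example, the string form below is shorthand for an explicit LLMConfig (model name illustrative):

```python
from notdiamond.llms.config import LLMConfig

config = LLMConfig.from_string("openai/gpt-4o")
assert config.provider == "openai"
assert config.model == "gpt-4o"
```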
- prepare_for_request()[source]
Converts the LLMConfig object to a dict in the format accepted by the NotDiamond API.
- Returns:
dict
- property openrouter_model
notdiamond.llms.providers
- class notdiamond.llms.providers.NDLLMProviders(value)[source]
Bases:
Enum
NDLLMProviders serves as a registry for the supported LLM models by NotDiamond. It allows developers to easily specify available LLM providers for the router.
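A hedged sketch of using registry members when building a router. Each member's value is an LLMConfig, so .value can be mixed with string specifiers; the top-level NotDiamond import is an assumption:

```python
# Hedged sketch: using registry members alongside string specifiers.
from notdiamond import NotDiamond  # assumed top-level export
from notdiamond.llms.providers import NDLLMProviders

client = NotDiamond(
    api_key="YOUR_NOTDIAMOND_API_KEY",
    llm_configs=[
        NDLLMProviders.GPT_4o.value,             # LLMConfig(openai/gpt-4o)
        "anthropic/claude-3-5-sonnet-20240620",  # equivalent string form
    ],
)
```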
- GPT_3_5_TURBO
refers to ‘gpt-3.5-turbo’ model by OpenAI
- Type:
NDLLMProvider
- GPT_3_5_TURBO_0125
refers to ‘gpt-3.5-turbo-0125’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4
refers to ‘gpt-4’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_0613
refers to ‘gpt-4-0613’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_1106_PREVIEW
refers to ‘gpt-4-1106-preview’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_TURBO
refers to ‘gpt-4-turbo’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_TURBO_PREVIEW
refers to ‘gpt-4-turbo-preview’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_TURBO_2024_04_09
refers to ‘gpt-4-turbo-2024-04-09’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4o_2024_05_13
refers to ‘gpt-4o-2024-05-13’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4o_2024_08_06
refers to ‘gpt-4o-2024-08-06’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4o
refers to ‘gpt-4o’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4o_MINI_2024_07_18
refers to ‘gpt-4o-mini-2024-07-18’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4o_MINI
refers to ‘gpt-4o-mini’ model by OpenAI
- Type:
NDLLMProvider
- GPT_4_0125_PREVIEW
refers to ‘gpt-4-0125-preview’ model by OpenAI
- Type:
NDLLMProvider
- O1_PREVIEW
refers to ‘o1-preview’ model by OpenAI
- Type:
NDLLMProvider
- O1_PREVIEW_2024_09_12
refers to ‘o1-preview-2024-09-12’ model by OpenAI
- Type:
NDLLMProvider
- O1_MINI
refers to ‘o1-mini’ model by OpenAI
- Type:
NDLLMProvider
- O1_MINI_2024_09_12
refers to ‘o1-mini-2024-09-12’ model by OpenAI
- Type:
NDLLMProvider
- CLAUDE_2_1
refers to ‘claude-2.1’ model by Anthropic
- Type:
NDLLMProvider
- CLAUDE_3_OPUS_20240229
refers to ‘claude-3-opus-20240229’ model by Anthropic
- Type:
NDLLMProvider
- CLAUDE_3_SONNET_20240229
refers to ‘claude-3-sonnet-20240229’ model by Anthropic
- Type:
NDLLMProvider
- CLAUDE_3_5_SONNET_20240620
refers to ‘claude-3-5-sonnet-20240620’ model by Anthropic
- Type:
NDLLMProvider
- CLAUDE_3_5_HAIKU_20241022
refers to ‘claude-3-5-haiku-20241022’ model by Anthropic
- Type:
NDLLMProvider
- CLAUDE_3_HAIKU_20240307
refers to ‘claude-3-haiku-20240307’ model by Anthropic
- Type:
NDLLMProvider
- GEMINI_PRO
refers to ‘gemini-pro’ model by Google
- Type:
NDLLMProvider
- GEMINI_1_PRO_LATEST
refers to ‘gemini-1.0-pro-latest’ model by Google
- Type:
NDLLMProvider
- GEMINI_15_PRO_LATEST
refers to ‘gemini-1.5-pro-latest’ model by Google
- Type:
NDLLMProvider
- GEMINI_15_PRO_EXP_0801
refers to ‘gemini-1.5-pro-exp-0801’ model by Google
- Type:
NDLLMProvider
- GEMINI_15_FLASH_LATEST
refers to ‘gemini-1.5-flash-latest’ model by Google
- Type:
NDLLMProvider
- COMMAND_R
refers to ‘command-r’ model by Cohere
- Type:
NDLLMProvider
- COMMAND_R_PLUS
refers to ‘command-r-plus’ model by Cohere
- Type:
NDLLMProvider
- MISTRAL_LARGE_LATEST
refers to ‘mistral-large-latest’ model by Mistral AI
- Type:
NDLLMProvider
- MISTRAL_LARGE_2407
refers to ‘mistral-large-2407’ model by Mistral AI
- Type:
NDLLMProvider
- MISTRAL_LARGE_2402
refers to ‘mistral-large-2402’ model by Mistral AI
- Type:
NDLLMProvider
- MISTRAL_MEDIUM_LATEST
refers to ‘mistral-medium-latest’ model by Mistral AI
- Type:
NDLLMProvider
- MISTRAL_SMALL_LATEST
refers to ‘mistral-small-latest’ model by Mistral AI
- Type:
NDLLMProvider
- OPEN_MISTRAL_7B
refers to ‘open-mistral-7b’ model by Mistral AI
- Type:
NDLLMProvider
- OPEN_MIXTRAL_8X7B
refers to ‘open-mixtral-8x7b’ model by Mistral AI
- Type:
NDLLMProvider
- OPEN_MIXTRAL_8X22B
refers to ‘open-mixtral-8x22b’ model by Mistral AI
- Type:
NDLLMProvider
- OPEN_MISTRAL_NEMO
refers to ‘open-mistral-nemo’ model by Mistral AI
- Type:
NDLLMProvider
- TOGETHER_MISTRAL_7B_INSTRUCT_V0_2
refers to ‘Mistral-7B-Instruct-v0.2’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_MIXTRAL_8X7B_INSTRUCT_V0_1
refers to ‘Mixtral-8x7B-Instruct-v0.1’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_MIXTRAL_8X22B_INSTRUCT_V0_1
refers to ‘Mixtral-8x22B-Instruct-v0.1’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_LLAMA_3_70B_CHAT_HF
refers to ‘Llama-3-70b-chat-hf’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_LLAMA_3_8B_CHAT_HF
refers to ‘Llama-3-8b-chat-hf’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_QWEN2_72B_INSTRUCT
refers to ‘Qwen2-72B-Instruct’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_LLAMA_3_1_8B_INSTRUCT_TURBO
refers to ‘Meta-Llama-3.1-8B-Instruct-Turbo’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_LLAMA_3_1_70B_INSTRUCT_TURBO
refers to ‘Meta-Llama-3.1-70B-Instruct-Turbo’ model served via TogetherAI
- Type:
NDLLMProvider
- TOGETHER_LLAMA_3_1_405B_INSTRUCT_TURBO
refers to ‘Meta-Llama-3.1-405B-Instruct-Turbo’ model served via TogetherAI
- Type:
NDLLMProvider
- REPLICATE_MISTRAL_7B_INSTRUCT_V0_2
refers to “mistral-7b-instruct-v0.2” model served via Replicate
- Type:
NDLLMProvider
- REPLICATE_MIXTRAL_8X7B_INSTRUCT_V0_1
refers to “mixtral-8x7b-instruct-v0.1” model served via Replicate
- Type:
NDLLMProvider
- REPLICATE_META_LLAMA_3_70B_INSTRUCT
refers to “meta-llama-3-70b-instruct” model served via Replicate
- Type:
NDLLMProvider
- REPLICATE_META_LLAMA_3_8B_INSTRUCT
refers to “meta-llama-3-8b-instruct” model served via Replicate
- Type:
NDLLMProvider
- REPLICATE_META_LLAMA_3_1_405B_INSTRUCT
refers to “meta-llama-3.1-405b-instruct” model served via Replicate
- Type:
NDLLMProvider
- LLAMA_3_1_SONAR_LARGE_128K_ONLINE
refers to “llama-3.1-sonar-large-128k-online” model by Perplexity
- Type:
NDLLMProvider
- CHATGPT_4o_LATEST = LLMConfig(openai/chatgpt-4o-latest)
- CLAUDE_2_1 = LLMConfig(anthropic/claude-2.1)
- CLAUDE_3_5_HAIKU_20241022 = LLMConfig(anthropic/claude-3-5-haiku-20241022)
- CLAUDE_3_5_SONNET_20240620 = LLMConfig(anthropic/claude-3-5-sonnet-20240620)
- CLAUDE_3_5_SONNET_20241022 = LLMConfig(anthropic/claude-3-5-sonnet-20241022)
- CLAUDE_3_5_SONNET_LATEST = LLMConfig(anthropic/claude-3-5-sonnet-latest)
- CLAUDE_3_HAIKU_20240307 = LLMConfig(anthropic/claude-3-haiku-20240307)
- CLAUDE_3_OPUS_20240229 = LLMConfig(anthropic/claude-3-opus-20240229)
- CLAUDE_3_SONNET_20240229 = LLMConfig(anthropic/claude-3-sonnet-20240229)
- CODESTRAL_LATEST = LLMConfig(mistral/codestral-latest)
- COMMAND_R = LLMConfig(cohere/command-r)
- COMMAND_R_PLUS = LLMConfig(cohere/command-r-plus)
- GEMINI_15_FLASH_LATEST = LLMConfig(google/gemini-1.5-flash-latest)
- GEMINI_15_PRO_EXP_0801 = LLMConfig(google/gemini-1.5-pro-exp-0801)
- GEMINI_15_PRO_LATEST = LLMConfig(google/gemini-1.5-pro-latest)
- GEMINI_1_PRO_LATEST = LLMConfig(google/gemini-1.0-pro-latest)
- GEMINI_PRO = LLMConfig(google/gemini-pro)
- GPT_3_5_TURBO = LLMConfig(openai/gpt-3.5-turbo)
- GPT_3_5_TURBO_0125 = LLMConfig(openai/gpt-3.5-turbo-0125)
- GPT_4 = LLMConfig(openai/gpt-4)
- GPT_4_0125_PREVIEW = LLMConfig(openai/gpt-4-0125-preview)
- GPT_4_0613 = LLMConfig(openai/gpt-4-0613)
- GPT_4_1106_PREVIEW = LLMConfig(openai/gpt-4-1106-preview)
- GPT_4_TURBO = LLMConfig(openai/gpt-4-turbo)
- GPT_4_TURBO_2024_04_09 = LLMConfig(openai/gpt-4-turbo-2024-04-09)
- GPT_4_TURBO_PREVIEW = LLMConfig(openai/gpt-4-turbo-preview)
- GPT_4o = LLMConfig(openai/gpt-4o)
- GPT_4o_2024_05_13 = LLMConfig(openai/gpt-4o-2024-05-13)
- GPT_4o_2024_08_06 = LLMConfig(openai/gpt-4o-2024-08-06)
- GPT_4o_MINI = LLMConfig(openai/gpt-4o-mini)
- GPT_4o_MINI_2024_07_18 = LLMConfig(openai/gpt-4o-mini-2024-07-18)
- LLAMA_3_1_SONAR_LARGE_128K_ONLINE = LLMConfig(perplexity/llama-3.1-sonar-large-128k-online)
- MISTRAL_LARGE_2402 = LLMConfig(mistral/mistral-large-2402)
- MISTRAL_LARGE_2407 = LLMConfig(mistral/mistral-large-2407)
- MISTRAL_LARGE_LATEST = LLMConfig(mistral/mistral-large-latest)
- MISTRAL_MEDIUM_LATEST = LLMConfig(mistral/mistral-medium-latest)
- MISTRAL_SMALL_LATEST = LLMConfig(mistral/mistral-small-latest)
- O1_MINI = LLMConfig(openai/o1-mini)
- O1_MINI_2024_09_12 = LLMConfig(openai/o1-mini-2024-09-12)
- O1_PREVIEW = LLMConfig(openai/o1-preview)
- O1_PREVIEW_2024_09_12 = LLMConfig(openai/o1-preview-2024-09-12)
- OPEN_MISTRAL_7B = LLMConfig(mistral/open-mistral-7b)
- OPEN_MISTRAL_NEMO = LLMConfig(mistral/open-mistral-nemo)
- OPEN_MIXTRAL_8X22B = LLMConfig(mistral/open-mixtral-8x22b)
- OPEN_MIXTRAL_8X7B = LLMConfig(mistral/open-mixtral-8x7b)
- REPLICATE_META_LLAMA_3_1_405B_INSTRUCT = LLMConfig(replicate/meta-llama-3.1-405b-instruct)
- REPLICATE_META_LLAMA_3_70B_INSTRUCT = LLMConfig(replicate/meta-llama-3-70b-instruct)
- REPLICATE_META_LLAMA_3_8B_INSTRUCT = LLMConfig(replicate/meta-llama-3-8b-instruct)
- REPLICATE_MISTRAL_7B_INSTRUCT_V0_2 = LLMConfig(replicate/mistral-7b-instruct-v0.2)
- REPLICATE_MIXTRAL_8X7B_INSTRUCT_V0_1 = LLMConfig(replicate/mixtral-8x7b-instruct-v0.1)
- TOGETHER_LLAMA_3_1_405B_INSTRUCT_TURBO = LLMConfig(togetherai/Meta-Llama-3.1-405B-Instruct-Turbo)
- TOGETHER_LLAMA_3_1_70B_INSTRUCT_TURBO = LLMConfig(togetherai/Meta-Llama-3.1-70B-Instruct-Turbo)
- TOGETHER_LLAMA_3_1_8B_INSTRUCT_TURBO = LLMConfig(togetherai/Meta-Llama-3.1-8B-Instruct-Turbo)
- TOGETHER_LLAMA_3_70B_CHAT_HF = LLMConfig(togetherai/Llama-3-70b-chat-hf)
- TOGETHER_LLAMA_3_8B_CHAT_HF = LLMConfig(togetherai/Llama-3-8b-chat-hf)
- TOGETHER_MISTRAL_7B_INSTRUCT_V0_2 = LLMConfig(togetherai/Mistral-7B-Instruct-v0.2)
- TOGETHER_MIXTRAL_8X22B_INSTRUCT_V0_1 = LLMConfig(togetherai/Mixtral-8x22B-Instruct-v0.1)
- TOGETHER_MIXTRAL_8X7B_INSTRUCT_V0_1 = LLMConfig(togetherai/Mixtral-8x7B-Instruct-v0.1)
- TOGETHER_QWEN2_72B_INSTRUCT = LLMConfig(togetherai/Qwen2-72B-Instruct)
notdiamond.llms.request
- async notdiamond.llms.request.amodel_select(messages: List[Dict[str, str]], llm_configs: List[LLMConfig], metric: Metric, notdiamond_api_key: str, max_model_depth: int, hash_content: bool, tradeoff: str | None = None, preference_id: str | None = None, tools: Sequence[Dict[str, Any] | Callable] | None = [], previous_session: str | None = None, timeout: float | int | None = 60, max_retries: int | None = 3, nd_api_url: str | None = 'https://api.notdiamond.ai', _user_agent: str = 'Python-SDK/0.3.32')[source]
This endpoint receives the prompt and routing settings, and makes a call to the NotDiamond API. It returns the best fitting LLM to call and a session ID that can be used for feedback.
- Parameters:
messages (List[Dict[str, str]]) – list of messages to be used for the LLM call
llm_configs (List[LLMConfig]) – a list of available LLMs that the router can decide from
metric (Metric) – the metric the router uses to make its decision. Currently only 'accuracy' is supported.
notdiamond_api_key (str) – API key generated via the NotDiamond dashboard.
max_model_depth (int) – if your top recommended model is down, specify up to which depth of routing you’re willing to go.
hash_content (Optional[bool]) – Flag for hashing content before sending to NotDiamond API.
tradeoff (Optional[str], optional) – Define the “cost” or “latency” tradeoff for the router to determine the best LLM for a given query.
preference_id (Optional[str], optional) – The ID of the router preference that was configured via the Dashboard. Defaults to None.
previous_session (Optional[str], optional) – The session ID of a previous session, allowing you to link requests.
timeout (int, optional) – timeout for the request. Defaults to 60.
max_retries (int, optional) – The maximum number of retries to make when calling the Not Diamond API.
nd_api_url (Optional[str], optional) – The URL of the NotDiamond API. Defaults to the public NotDiamond API URL.
tools (Sequence[Dict[str, Any] | Callable] | None)
_user_agent (str)
- Returns:
- returns a tuple of the chosen LLMConfig to call and a session ID string.
In case of an error the LLM defaults to None and the session ID defaults to ‘NO-SESSION-ID’.
- Return type:
tuple(LLMConfig, string)
- notdiamond.llms.request.create_preference_id(notdiamond_api_key: str, name: str | None = None, nd_api_url: str | None = 'https://api.notdiamond.ai', _user_agent: str = 'Python-SDK/0.3.32') str [source]
Create a preference id with an optional name. The preference name will appear in your dashboard on Not Diamond.
- Parameters:
notdiamond_api_key (str)
name (str | None)
nd_api_url (str | None)
_user_agent (str)
- Return type:
str
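A minimal usage sketch (the preference name is illustrative):

```python
from notdiamond.llms.request import create_preference_id

preference_id = create_preference_id(
    notdiamond_api_key="YOUR_NOTDIAMOND_API_KEY",
    name="my-production-router",  # optional; shown in the NotDiamond dashboard
)
print(preference_id)
```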
- notdiamond.llms.request.get_tools_in_openai_format(tools: Sequence[Dict[str, Any] | Callable] | None)[source]
This function converts the tools list into the format that OpenAI expects. It does this using LangChain's model classes, which automatically create the tool dictionary via bind_tools.
- Parameters:
tools (Optional[Sequence[Union[Dict[str, Any], Callable]]]) – list of tools to be converted
- Returns:
dictionary of tools in the format that OpenAI expects
- Return type:
dict
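A hedged sketch of converting a plain callable into the OpenAI tool format (the exact output shape depends on the installed LangChain version; the function below is hypothetical):

```python
from notdiamond.llms.request import get_tools_in_openai_format

def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"Sunny in {city}"

openai_tools = get_tools_in_openai_format([get_weather])
print(openai_tools)  # tool schemas in OpenAI's function-calling format
```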
- notdiamond.llms.request.model_select(messages: List[Dict[str, str]], llm_configs: List[LLMConfig], metric: Metric, notdiamond_api_key: str, max_model_depth: int, hash_content: bool, tradeoff: str | None = None, preference_id: str | None = None, tools: Sequence[Dict[str, Any] | Callable] | None = [], previous_session: str | None = None, timeout: float | int | None = 60, max_retries: int | None = 3, nd_api_url: str | None = 'https://api.notdiamond.ai', _user_agent: str = 'Python-SDK/0.3.32')[source]
This endpoint receives the prompt and routing settings, and makes a call to the NotDiamond API. It returns the best fitting LLM to call and a session ID that can be used for feedback.
- Parameters:
messages (List[Dict[str, str]]) – list of messages to be used for the LLM call
llm_configs (List[LLMConfig]) – a list of available LLMs that the router can decide from
metric (Metric) – the metric the router uses to make its decision. Currently only 'accuracy' is supported.
notdiamond_api_key (str) – API key generated via the NotDiamond dashboard.
max_model_depth (int) – if your top recommended model is down, specify up to which depth of routing you’re willing to go.
hash_content (Optional[bool]) – Flag for hashing content before sending to NotDiamond API.
tradeoff (Optional[str], optional) – Define the “cost” or “latency” tradeoff for the router to determine the best LLM for a given query.
preference_id (Optional[str], optional) – The ID of the router preference that was configured via the Dashboard. Defaults to None.
previous_session (Optional[str], optional) – The session ID of a previous session, allowing you to link requests.
timeout (int, optional) – timeout for the request. Defaults to 60.
max_retries (int, optional) – The maximum number of retries to make when calling the Not Diamond API. Defaults to 3.
nd_api_url (Optional[str], optional) – The URL of the NotDiamond API. Defaults to the public NotDiamond API URL.
tools (Sequence[Dict[str, Any] | Callable] | None)
_user_agent (str)
- Returns:
- returns a tuple of the chosen LLMConfig to call and a session ID string.
In case of an error the LLM defaults to None and the session ID defaults to ‘NO-SESSION-ID’.
- Return type:
tuple(LLMConfig, string)
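A hedged sketch of calling model_select directly (most users route through the NotDiamond client instead; the Metric import path and model strings are assumptions and may differ in your installed SDK version):

```python
# Hedged sketch: direct model_select call. The Metric import path is an assumption.
from notdiamond.llms.config import LLMConfig
from notdiamond.llms.request import model_select
from notdiamond.metrics.metric import Metric  # assumed import path

best_llm, session_id = model_select(
    messages=[{"role": "user", "content": "Summarize this contract clause."}],
    llm_configs=[
        LLMConfig.from_string("openai/gpt-4o"),
        LLMConfig.from_string("anthropic/claude-3-5-sonnet-20240620"),
    ],
    metric=Metric("accuracy"),  # currently the only supported metric
    notdiamond_api_key="YOUR_NOTDIAMOND_API_KEY",
    max_model_depth=2,
    hash_content=False,
    tradeoff="cost",  # optional: bias routing toward cheaper models
)
print(best_llm, session_id)  # best_llm is None and session_id is 'NO-SESSION-ID' on error
```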
- notdiamond.llms.request.model_select_prepare(messages: List[Dict[str, str]], llm_configs: List[LLMConfig], metric: Metric, notdiamond_api_key: str, max_model_depth: int, hash_content: bool, tradeoff: str | None = None, preference_id: str | None = None, tools: Sequence[Dict[str, Any] | Callable] | None = [], previous_session: str | None = None, nd_api_url: str | None = 'https://api.notdiamond.ai', _user_agent: str = 'Python-SDK/0.3.32')[source]
This is the core method for the model_select endpoint. It returns the best fitting LLM to call and a session ID that can be used for feedback.
- Parameters:
messages (List[Dict[str, str]]) – list of messages to be used for the LLM call
llm_configs (List[LLMConfig]) – a list of available LLMs that the router can decide from
metric (Metric) – the metric the router uses to make its decision. Currently only 'accuracy' is supported.
notdiamond_api_key (str) – API key generated via the NotDiamond dashboard.
max_model_depth (int) – if your top recommended model is down, specify up to which depth of routing you’re willing to go.
hash_content (Optional[bool]) – Flag for hashing content before sending to NotDiamond API.
tradeoff (Optional[str], optional) – Define the “cost” or “latency” tradeoff for the router to determine the best LLM for a given query.
preference_id (Optional[str], optional) – The ID of the router preference that was configured via the Dashboard. Defaults to None.
previous_session (Optional[str], optional) – The session ID of a previous session, allowing you to link requests.
async_mode (bool, optional) – whether to run the request in async mode. Defaults to False.
nd_api_url (Optional[str], optional) – The URL of the NotDiamond API. Defaults to the public NotDiamond API URL.
tools (Sequence[Dict[str, Any] | Callable] | None)
_user_agent (str)
- Returns:
returns data to be used for the API call of modelSelect
- Return type:
tuple(url, payload, headers)
- notdiamond.llms.request.report_latency(session_id: str, llm_config: LLMConfig, tokens_per_second: float, notdiamond_api_key: str, nd_api_url: str | None = 'https://api.notdiamond.ai', _user_agent: str = 'Python-SDK/0.3.32')[source]
This method makes an API call to the NotDiamond server to report the latency of an LLM call. It helps fine-tune our model router and ensures the recommendations we offer meet your latency expectations.
This feature can be disabled on the NDLLM class level by setting latency_tracking to False.
- Parameters:
session_id (str) – the session ID that was returned from the invoke or model_select calls, so we know which router call your latency report refers to.
llm_config (LLMConfig) – the LLM provider and model for which the latency is reported
tokens_per_second (float) – latency of the model call, calculated from elapsed time, input tokens, and output tokens
notdiamond_api_key (str) – NotDiamond API key used for authentication
nd_api_url (Optional[str], optional) – The URL of the NotDiamond API. Defaults to the public NotDiamond API URL.
_user_agent (str)
- Returns:
status code of the API call; 200 on success
- Return type:
int
- Raises:
ApiError – if the API call to the NotDiamond backend fails, this error is raised
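A hedged sketch of reporting latency for a prior session (values are illustrative; the session ID comes from an earlier model_select or invoke call):

```python
# Hedged sketch: reporting tokens-per-second latency back to NotDiamond.
from notdiamond.llms.config import LLMConfig
from notdiamond.llms.request import report_latency

status = report_latency(
    session_id="SESSION-ID-FROM-MODEL-SELECT",  # ties the report to a router call
    llm_config=LLMConfig.from_string("openai/gpt-4o"),
    tokens_per_second=42.0,  # throughput estimate from elapsed time and token counts
    notdiamond_api_key="YOUR_NOTDIAMOND_API_KEY",
)
assert status == 200  # 200 indicates the report was accepted
```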