notdiamond.toolkit.litellm

notdiamond.toolkit.litellm.litellm_notdiamond

exception notdiamond.toolkit.litellm.litellm_notdiamond.NotDiamondError(status_code, message, url='https://api.notdiamond.ai')[source]

Bases: Exception
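
A minimal sketch of raising and catching this error; only the constructor arguments documented above (status_code, message, optional url) are used, and no instance attributes beyond standard Exception behavior are assumed:

    from notdiamond.toolkit.litellm.litellm_notdiamond import NotDiamondError

    # Raise and catch the error directly, using only the documented
    # constructor arguments.
    try:
        raise NotDiamondError(status_code=401, message="invalid API key")
    except NotDiamondError as err:
        print(err)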

class notdiamond.toolkit.litellm.litellm_notdiamond.NotDiamondConfig(llm_providers: List[Dict[str, str]], tools: List[Dict[str, str]] | None = None, max_model_depth: int = 1, tradeoff: str | None = None, preference_id: str | None = None, hash_content: bool | None = False)[source]

Bases: object

Parameters:
  • llm_providers (List[Dict[str, str]])

  • tools (List[Dict[str, str]] | None)

  • max_model_depth (int)

  • tradeoff (str | None)

  • preference_id (str | None)

  • hash_content (bool | None)

classmethod get_config()[source]

hash_content: bool | None = False

llm_providers: List[Dict[str, str]]

max_model_depth: int = 1

preference_id: str | None = None

tools: List[Dict[str, str]] | None = None

tradeoff: str | None = None
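
For illustration, constructing a config from the fields above. The exact keys expected inside each provider dict are an assumption here; consult the Not Diamond docs for the canonical format:

    from notdiamond.toolkit.litellm.litellm_notdiamond import NotDiamondConfig

    config = NotDiamondConfig(
        llm_providers=[
            # Key names in these dicts are assumed, not documented here.
            {"provider": "openai", "model": "gpt-4o"},
            {"provider": "anthropic", "model": "claude-3-5-sonnet-20240620"},
        ],
        tradeoff="cost",    # optional: bias routing toward a cost tradeoff
        max_model_depth=2,  # consider up to two candidate models
    )
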
notdiamond.toolkit.litellm.litellm_notdiamond.completion(model: str, messages: list, api_base: str, model_response: ModelResponse, print_verbose: Callable, encoding, api_key, logging_obj, optional_params=None, litellm_params=None, logger_fn=None)[source]
Parameters:
  • model (str)

  • messages (list)

  • api_base (str)

  • model_response (ModelResponse)

  • print_verbose (Callable)

notdiamond.toolkit.litellm.litellm_notdiamond.get_litellm_model(response: dict) → str[source]
Parameters:

response (dict)

Return type:

str
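
Only the dict-in, str-out contract is documented above, so the payload in this sketch is purely hypothetical:

    from notdiamond.toolkit.litellm.litellm_notdiamond import get_litellm_model

    # Hypothetical Not Diamond routing response; the real schema may differ.
    nd_response = {"providers": [{"provider": "openai", "model": "gpt-4o"}]}
    litellm_model = get_litellm_model(nd_response)  # e.g. "openai/gpt-4o"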

notdiamond.toolkit.litellm.litellm_notdiamond.update_litellm_params(litellm_params: dict)[source]

Create a new litellm_params dict containing the non-default litellm_params from the original call, plus custom_llm_provider and api_base.

Parameters:

litellm_params (dict)
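
A sketch of the documented contract; the input keys shown here are illustrative only:

    from notdiamond.toolkit.litellm.litellm_notdiamond import update_litellm_params

    # Per the description above, the result keeps the non-default entries
    # from the original call and adds custom_llm_provider and api_base.
    original = {"metadata": {"prompt_version": "v2"}}  # illustrative keys
    new_params = update_litellm_params(original)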

notdiamond.toolkit.litellm.litellm_notdiamond.validate_environment(api_key)[source]

notdiamond.toolkit.litellm.main

async notdiamond.toolkit.litellm.main.acompletion(model: str, messages: List = [], functions: List | None = None, function_call: str | None = None, timeout: float | int | None = None, temperature: float | None = None, top_p: float | None = None, n: int | None = None, stream: bool | None = None, stream_options: dict | None = None, stop=None, max_tokens: int | None = None, presence_penalty: float | None = None, frequency_penalty: float | None = None, logit_bias: dict | None = None, user: str | None = None, response_format: dict | Type[BaseModel] | None = None, seed: int | None = None, tools: List | None = None, tool_choice: str | None = None, parallel_tool_calls: bool | None = None, logprobs: bool | None = None, top_logprobs: int | None = None, deployment_id=None, base_url: str | None = None, api_version: str | None = None, api_key: str | None = None, model_list: list | None = None, extra_headers: dict | None = None, **kwargs) → ModelResponse | CustomStreamWrapper[source]

Asynchronously executes a litellm.completion() call for any LLM supported by litellm (e.g. gpt-4, gpt-3.5-turbo, claude-2, command-nightly).

Parameters:
  • model (str) – The name of the language model to use for text completion. See all supported LLMs: https://docs.litellm.ai/docs/providers/

  • messages (List) – A list of message objects representing the conversation context (default is an empty list).

OPTIONAL PARAMS

  • functions (List, optional) – A list of functions to apply to the conversation messages (default is an empty list).

  • function_call (str, optional) – The name of the function to call within the conversation (default is an empty string).

  • temperature (float, optional) – The temperature parameter for controlling the randomness of the output (default is 1.0).

  • top_p (float, optional) – The top-p parameter for nucleus sampling (default is 1.0).

  • n (int, optional) – The number of completions to generate (default is 1).

  • stream (bool, optional) – If True, return a streaming response (default is False).

  • stream_options (dict, optional) – A dictionary containing options for the streaming response. Only use this if stream is True.

  • stop (str | list, optional) – Up to 4 sequences where the LLM API will stop generating further tokens.

  • max_tokens (int, optional) – The maximum number of tokens in the generated completion (default is infinity).

  • presence_penalty (float, optional) – Penalizes new tokens based on whether they appear in the text so far.

  • frequency_penalty (float, optional) – Penalizes new tokens based on their frequency in the text so far.

  • logit_bias (dict, optional) – Used to modify the probability of specific tokens appearing in the completion.

  • user (str, optional) – A unique identifier representing your end-user. This can help the LLM provider to monitor and detect abuse.

  • metadata (dict, optional) – Pass in additional metadata to tag your completion calls, e.g. prompt version, details, etc.

  • api_base (str, optional) – Base URL for the API (default is None).

  • api_version (str, optional) – API version (default is None).

  • api_key (str, optional) – API key (default is None).

  • model_list (list, optional) – List of API base URLs, versions, and keys.

  • timeout (float, optional) – The maximum execution time in seconds for the completion request.

LITELLM Specific Params

  • mock_response (str, optional) – If provided, return a mock completion response for testing or debugging purposes (default is None).

  • custom_llm_provider (str, optional) – Used for non-OpenAI LLMs. For example, for Bedrock set model="amazon.titan-tg1-large" and custom_llm_provider="bedrock".

  • response_format (dict | Type[BaseModel] | None)

  • seed (int | None)

  • tools (List | None)

  • tool_choice (str | None)

  • parallel_tool_calls (bool | None)

  • logprobs (bool | None)

  • top_logprobs (int | None)

  • base_url (str | None)

  • extra_headers (dict | None)

Returns:

A response object containing the generated completion and associated metadata.

Return type:

ModelResponse

Notes

  • This function is an asynchronous version of the completion function.

  • The synchronous completion function is run via run_in_executor so that it does not block the event loop.

  • If stream is True, the function returns an async generator that yields completion lines.
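
A minimal async usage sketch; any litellm-supported model string works here, per the description above, and the response is accessed via the standard litellm ModelResponse shape:

    import asyncio

    from notdiamond.toolkit.litellm.main import acompletion

    async def main():
        response = await acompletion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        # ModelResponse follows the OpenAI-style response layout.
        print(response.choices[0].message.content)

    asyncio.run(main())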

notdiamond.toolkit.litellm.main.completion(model: str, messages: List = [], timeout: float | str | Timeout | None = None, temperature: float | None = None, top_p: float | None = None, n: int | None = None, stream: bool | None = None, stream_options: dict | None = None, stop=None, max_tokens: int | None = None, presence_penalty: float | None = None, frequency_penalty: float | None = None, logit_bias: dict | None = None, user: str | None = None, response_format: dict | Type[BaseModel] | None = None, seed: int | None = None, tools: List | None = None, tool_choice: str | dict | None = None, logprobs: bool | None = None, top_logprobs: int | None = None, parallel_tool_calls: bool | None = None, deployment_id=None, extra_headers: dict | None = None, functions: List | None = None, function_call: str | None = None, base_url: str | None = None, api_version: str | None = None, api_key: str | None = None, model_list: list | None = None, **kwargs) → ModelResponse | CustomStreamWrapper[source]

Perform a completion() using any LLM supported by litellm (e.g. gpt-4, gpt-3.5-turbo, claude-2, command-nightly).

  • model (str) – The name of the language model to use for text completion. See all supported LLMs: https://docs.litellm.ai/docs/providers/

  • messages (List) – A list of message objects representing the conversation context (default is an empty list).

OPTIONAL PARAMS

  • functions (List, optional) – A list of functions to apply to the conversation messages (default is an empty list).

  • function_call (str, optional) – The name of the function to call within the conversation (default is an empty string).

  • temperature (float, optional) – The temperature parameter for controlling the randomness of the output (default is 1.0).

  • top_p (float, optional) – The top-p parameter for nucleus sampling (default is 1.0).

  • n (int, optional) – The number of completions to generate (default is 1).

  • stream (bool, optional) – If True, return a streaming response (default is False).

  • stream_options (dict, optional) – A dictionary containing options for the streaming response. Only set this when stream is True.

  • stop (str | list, optional) – Up to 4 sequences where the LLM API will stop generating further tokens.

  • max_tokens (int, optional) – The maximum number of tokens in the generated completion (default is infinity).

  • presence_penalty (float, optional) – Penalizes new tokens based on whether they appear in the text so far.

  • frequency_penalty (float, optional) – Penalizes new tokens based on their frequency in the text so far.

  • logit_bias (dict, optional) – Used to modify the probability of specific tokens appearing in the completion.

  • user (str, optional) – A unique identifier representing your end-user. This can help the LLM provider monitor and detect abuse.

  • logprobs (bool, optional) – Whether to return log probabilities of the output tokens. If True, returns the log probability of each output token in the message content.

  • top_logprobs (int, optional) – An integer between 0 and 5 specifying the number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to True to use this parameter.

  • metadata (dict, optional) – Pass in additional metadata to tag your completion calls, e.g. prompt version, details, etc.

  • api_base (str, optional) – Base URL for the API (default is None).

  • api_version (str, optional) – API version (default is None).

  • api_key (str, optional) – API key (default is None).

  • model_list (list, optional) – List of API base URLs, versions, and keys.

  • extra_headers (dict, optional) – Additional headers to include in the request.

LITELLM Specific Params

  • mock_response (str, optional) – If provided, return a mock completion response for testing or debugging purposes (default is None).

  • custom_llm_provider (str, optional) – Used for non-OpenAI LLMs. For example, for Bedrock set model="amazon.titan-tg1-large" and custom_llm_provider="bedrock".

  • max_retries (int, optional) – The number of retries to attempt (default is 0).

Returns:

A response object containing the generated completion and associated metadata.

Return type:

ModelResponse

Parameters:
  • model (str)

  • messages (List)

  • timeout (float | str | Timeout | None)

  • temperature (float | None)

  • top_p (float | None)

  • n (int | None)

  • stream (bool | None)

  • stream_options (dict | None)

  • max_tokens (int | None)

  • presence_penalty (float | None)

  • frequency_penalty (float | None)

  • logit_bias (dict | None)

  • user (str | None)

  • response_format (dict | Type[BaseModel] | None)

  • seed (int | None)

  • tools (List | None)

  • tool_choice (str | dict | None)

  • logprobs (bool | None)

  • top_logprobs (int | None)

  • parallel_tool_calls (bool | None)

  • extra_headers (dict | None)

  • functions (List | None)

  • function_call (str | None)

  • base_url (str | None)

  • api_version (str | None)

  • api_key (str | None)

  • model_list (list | None)

Note

  • This function is used to perform a completion() using the specified language model.

  • It supports various optional parameters for customizing the completion behavior.

  • If ‘mock_response’ is provided, a mock completion response is returned for testing or debugging.
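
A synchronous sketch using the documented mock_response shortcut, which returns a canned completion without a live provider call:

    from notdiamond.toolkit.litellm.main import completion

    response = completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "What is litellm?"}],
        # LITELLM-specific param documented above: useful for testing
        # or debugging without hitting a provider API.
        mock_response="litellm is a unified client for many LLM providers.",
    )
    print(response.choices[0].message.content)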

notdiamond.toolkit.litellm.main.get_api_key(llm_provider: str, dynamic_api_key: str | None)[source]
Parameters:
  • llm_provider (str)

  • dynamic_api_key (str | None)

notdiamond.toolkit.litellm.main.get_llm_provider(model: str, custom_llm_provider: str | None = None, api_base: str | None = None, api_key: str | None = None, litellm_params: LiteLLM_Params | None = None) → Tuple[str, str, str | None, str | None][source]

Returns the provider for a given model name, e.g. 'azure/chatgpt-v-2' -> 'azure'.

For the router, you can also pass the whole litellm params dict; this function will extract the relevant details.

Raises an error if the model cannot be mapped to a provider.

Parameters:
  • model (str)

  • custom_llm_provider (str | None)

  • api_base (str | None)

  • api_key (str | None)

  • litellm_params (LiteLLM_Params | None)

Return type:

Tuple[str, str, str | None, str | None]
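
A sketch built on the 'azure/chatgpt-v-2' -> 'azure' example above; the meaning of the last two tuple elements is inferred from the return annotation and is an assumption:

    from notdiamond.toolkit.litellm.main import get_llm_provider

    # Tuple layout assumed: (model, provider, dynamic_api_key, api_base).
    model, provider, dynamic_api_key, api_base = get_llm_provider("azure/chatgpt-v-2")
    print(provider)  # "azure"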