utils.token_counters

The utils.token_counters module provides a set of tools for counting and managing tokens in text strings. This is particularly useful for applications interfacing with language models like GPT-4, where understanding and monitoring token usage is crucial for optimizing API calls and managing computational resources efficiently.

Key Components:
  • token_counter_og: A function that returns the exact number of tokens in a text string for a specified API, using the tiktoken library for encoding. It is designed for cases where accuracy is paramount.

  • token_counter_est: Provides a quick estimate of the token count based on word count, using a heuristic. It is useful when speed matters more than an exact count.

  • TokenCounter: A class that tracks the number of tokens in requests and responses.

Usage: This module is intended to be used in applications that require detailed monitoring and management of token usage, especially when interfacing with language models. It provides both precise and estimated token counting functions, along with a mechanism to track token usage throughout the application lifecycle via the TokenCounter class and its singleton instances. The decorators offer a convenient way to add token counting to API calls without significant modification to existing code.

class celi_framework.utils.token_counters.TokenCounter(token_budget=0, token_counter_fn=<function token_counter_est>)

Bases: object

Counts tokens in requests and responses and applies an optional budget.

cached_token_count: int = 0
count_cached_tokens(*prompt_fields)
count_request_tokens(*prompt_fields)

Counts request tokens by converting all prompt_fields to strings, and raises an exception if the budget is exceeded.

count_response_tokens(response)
current_token_count: int = 0
token_budget: int = 0
token_counter_fn()

The function used to count tokens. It defaults to token_counter_est, whose documentation follows.

Returns an estimated number of tokens in a text string based on word count.

This function provides a quick estimation of token count by multiplying the word count by 4/3, a heuristic for approximating tokens in typical English text.

Parameters:

string (str) – The text string to estimate token count for.

Returns:

The estimated number of tokens.

Return type:

int
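
The budget behaviour described for count_request_tokens can be illustrated with a simplified stand-in. This is a minimal sketch, not the framework's implementation: the names BudgetedCounter, TokenBudgetExceeded and estimate_tokens are hypothetical, and it assumes that a budget of 0 means "no limit" and that the budget applies to the running total rather than to each request.

```python
from typing import Callable


class TokenBudgetExceeded(Exception):
    """Hypothetical exception type; the real class may raise something else."""


def estimate_tokens(string: str) -> int:
    # Word count times 4/3, following the documented heuristic.
    # Rounding down is an assumption.
    return len(string.split()) * 4 // 3


class BudgetedCounter:
    """Simplified stand-in for TokenCounter (illustrative only)."""

    def __init__(
        self,
        token_budget: int = 0,
        token_counter_fn: Callable[[str], int] = estimate_tokens,
    ) -> None:
        self.token_budget = token_budget  # assumption: 0 disables the budget
        self.token_counter_fn = token_counter_fn
        self.current_token_count = 0

    def count_request_tokens(self, *prompt_fields) -> int:
        # Convert every prompt field to a string before counting it.
        tokens = sum(self.token_counter_fn(str(field)) for field in prompt_fields)
        self.current_token_count += tokens
        # Assumption: the budget is checked against the running total.
        if self.token_budget and self.current_token_count > self.token_budget:
            raise TokenBudgetExceeded(
                f"{self.current_token_count} tokens used, "
                f"budget is {self.token_budget}"
            )
        return tokens
```

Passing the counting function as a constructor argument mirrors the token_counter_fn parameter in the class signature, so an exact counter can be swapped in for the estimator without changing the tracking logic.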

celi_framework.utils.token_counters.token_counter_est(string)

Returns an estimated number of tokens in a text string based on word count.

This function provides a quick estimation of token count by multiplying the word count by 4/3, a heuristic for approximating tokens in typical English text.

Parameters:

string (str) – The text string to estimate token count for.

Returns:

The estimated number of tokens.

Return type:

int
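
The 4/3 heuristic can be sketched in a few lines. This is an illustration under stated assumptions rather than the module's source: the name estimate_tokens is hypothetical, words are taken to be whitespace-separated, and the result is rounded down.

```python
def estimate_tokens(string: str) -> int:
    # Split on whitespace to get the word count.
    words = string.split()
    # Multiply by 4/3 using integer arithmetic; rounding down is an assumption.
    return len(words) * 4 // 3


# Six words give an estimate of eight tokens.
print(estimate_tokens("The quick brown fox jumps over"))
```

Because it never loads a tokenizer, an estimator like this is cheap to call on every request, at the cost of drifting from the true count on code, non-English text, or unusual vocabulary.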

celi_framework.utils.token_counters.token_counter_og(string, api='gpt-4')

Returns the exact number of tokens in a text string for a specified API.

This function encodes the string with the tiktoken library, using the encoding for the GPT-4 model, and logs an error if an unsupported API is specified.

Parameters:
  • string (str) – The text string to encode.

  • api (str) – The API specification, defaulting to ‘gpt-4’.

Returns:

The number of tokens in the encoded string.

Return type:

int
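
An exact count of this kind can be sketched with tiktoken's public API. The function name count_tokens_exact is hypothetical, and returning 0 after logging the error is an assumption, since the documentation above does not say what is returned for an unsupported API. tiktoken is a third-party package and may need to download its encoding data on first use.

```python
import logging

logger = logging.getLogger(__name__)


def count_tokens_exact(string: str, api: str = "gpt-4") -> int:
    """Illustrative stand-in for token_counter_og."""
    if api != "gpt-4":
        # Assumption: log the problem and return 0 for unsupported APIs.
        logger.error("Unsupported API: %s", api)
        return 0
    # Imported lazily so the unsupported path works without tiktoken installed.
    import tiktoken

    # Look up the encoding that tiktoken associates with the GPT-4 model.
    encoding = tiktoken.encoding_for_model("gpt-4")
    # The token count is the length of the encoded token list.
    return len(encoding.encode(string))
```

Calling count_tokens_exact("Hello, world!") returns the number of tokens the GPT-4 encoding produces for that string, which is the figure to rely on when a request must fit a hard context or billing limit.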