
set_llm_token_count_callback (Python agent API)

Syntax

newrelic.agent.set_llm_token_count_callback(callback, application=None)

Registers a callback function that will be used to calculate token counts on Large Language Model (LLM) events.

Requirements

Python agent version 9.8.0 or higher.

Description

This API registers a callback to calculate and store token counts on LlmEmbedding and LlmChatCompletionMessage events.

  • Use this API when ai_monitoring.record_content.enabled is set to false. That setting prevents the agent from sending AI content to New Relic, where token counts would otherwise be calculated and attached server side.
  • If you'd still like to capture token counts for LLM events, you can implement a callback in your app code to determine the token counts locally and send this information to New Relic.

In most cases, this API will be called exactly once, but you can call it multiple times. Each new call overwrites the previously registered callback with the one provided. To unset the callback completely, pass None in place of a callback function.

API Parameters

| Parameter | Description |
| --- | --- |
| callback (callable or None) | Required. The callback to calculate token counts. To unset the current callback, pass None instead of a callback function. |
| application (object) | Optional. The specific application object to associate the API call with. An application object can be obtained using the newrelic.agent.application function. |

Return values

None.

Callback Requirements

The provided callback must return a positive integer token count; otherwise, no token count will be captured on LLM events.
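Any counting strategy satisfies this requirement as long as the callback returns a positive integer. As a minimal sketch, the hypothetical callback below uses a rough four-characters-per-token heuristic (a common rule of thumb for English text, not an exact tokenizer); the function name is illustrative:

```python
def approximate_token_count(model, content):
    """Rough token estimate: assume ~4 characters per token.

    Returning None means no token count is recorded for the event.
    """
    if not content:
        return None  # Nothing to count; skip recording a token count.
    # Always return a positive integer, as the API requires.
    return max(1, len(content) // 4)

# Registered the same way as any other token count callback:
# newrelic.agent.set_llm_token_count_callback(approximate_token_count)
```

For production use, a model-aware tokenizer (such as tiktoken, shown in the examples below) gives more accurate counts.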

Callback Parameters

| Parameter | Description |
| --- | --- |
| model (string) | Required. The name of the LLM model. |
| content (string) | Required. The message content/prompt or embedding input. |

Examples

Calculating token counts and registering the callback

Example with tiktoken:

import newrelic.agent

def token_count_callback(model, content):
    """
    Calculate token counts locally based on the model being used and the content.

    This callback will be invoked for each message sent or received during an LLM call.
    If the application supports more than one model, it may require finding libraries
    for each model to support token counts appropriately.

    Arguments:
    model -- name of the LLM model
    content -- the LLM message content
    """
    import tiktoken

    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        return None  # Unknown model
    return len(enc.encode(content))

newrelic.agent.set_llm_token_count_callback(token_count_callback)

Example API usage with an application object passed in:

application = newrelic.agent.register_application(timeout=10.0)
newrelic.agent.set_llm_token_count_callback(token_count_callback, application)
Copyright © 2024 New Relic Inc.
