
set_llm_token_count_callback (Python agent API)


newrelic.agent.set_llm_token_count_callback(callback, application=None)

Registers a callback function that will be used to calculate token counts on Large Language Model (LLM) events.


Requirements

Python agent version 9.8.0 or higher.


Description

This API registers a callback to calculate and store token counts on LlmEmbedding and LlmChatCompletionMessage events.

  • This function should be used when ai_monitoring.record_content.enabled is set to false. That setting prevents the agent from sending AI content to the New Relic server, where token counts would otherwise be attached server-side.
  • If you'd still like to capture token counts for LLM events, you can implement a callback in your app code to determine the token counts locally and send this information to New Relic.

In most cases, this API will be called exactly once, but you can call it multiple times. Each new call overwrites the previously registered callback with the one provided. To unset the callback completely, pass None in place of a callback function.

API Parameters

callback (callable or None)

Required. The callback that calculates token counts. To unset the current callback, pass None instead of a callback function.

application (application object)

Optional. The specific application object to associate the API call with. An application object can be obtained using the newrelic.agent.application function.

Return values

None.


Callback Requirements

Provided callbacks must return a positive integer token count; otherwise, no token count will be captured on LLM events.

Callback Parameters

model (string)

Required. The name of the LLM model.

content (string)

Required. The message content, prompt, or embedding input.
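A callback matching this (model, content) signature can be as simple as a rough heuristic. The whitespace split below is an assumption for illustration only, not an accurate tokenizer, and the function name `approximate_token_count` is hypothetical:

```python
def approximate_token_count(model, content):
    """Very rough token estimate: counts whitespace-delimited words.

    model -- name of the LLM model (unused by this crude heuristic)
    content -- the message content, prompt, or embedding input
    """
    if not content:
        return None  # no positive count to report, so nothing is captured
    return len(content.split())
```

Because the agent only records a token count when the callback returns a positive integer, returning None for empty content simply leaves the count unset.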


Calculating token counts and registering the callback

Example with tiktoken:

import newrelic.agent

def token_count_callback(model, content):
    """
    Calculate token counts locally based on the model being used and the content.
    This callback will be invoked for each message sent or received during an LLM call.
    If the application supports more than one model, it may require finding libraries
    for each model to support token counts appropriately.

    model -- name of the LLM model
    content -- the LLM message content
    """
    import tiktoken

    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        return None  # Unknown model
    return len(enc.encode(content))

newrelic.agent.set_llm_token_count_callback(token_count_callback)

Example API usage with an application object passed in:

application = newrelic.agent.register_application(timeout=10.0)
newrelic.agent.set_llm_token_count_callback(token_count_callback, application)
Copyright © 2024 New Relic Inc.
