What is OpenLLMetry?

OpenLLMetry is an open-source SDK built by Traceloop that automatically instruments your LLM calls. It captures request and response data and sends it to any OpenTelemetry-compatible backend, including Moda.

Installation

pip install traceloop-sdk

Setup

Add a few lines at the start of your application:
from traceloop.sdk import Traceloop

Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY"
)

# Your existing code works as normal
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Configuration options

  • base_url / baseUrl: The Moda ingest endpoint URL
  • api_key / apiKey: Your Moda API key
  • service_name: Optional name to identify your application
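
Putting the options together, a minimal init call might look like this (the service_name value is illustrative):
from traceloop.sdk import Traceloop

Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY",
    service_name="my-chatbot"  # illustrative name identifying this application
)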

What data is captured

OpenLLMetry automatically captures:
  • Messages: User prompts and assistant responses
  • Content blocks: Structured content including tool use, extended thinking, and images (when supported by the provider)
  • Token usage: Input, output, and reasoning token counts (for cost tracking)
  • Model info: Model name and provider
  • Timing: Request duration and timestamps
Content blocks are automatically captured when using providers that support structured responses (Anthropic, OpenAI with tools). This enables features like conversation replay in the Moda dashboard.
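
As an example of a call that produces structured content blocks, here is a tool-enabled OpenAI request (the get_weather tool is illustrative):
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool definition
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
# Any tool_calls in the response are recorded as structured content blocks.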

Semantic conventions

OpenLLMetry uses OpenTelemetry semantic conventions for LLM spans:
  • llm.prompts.{n}.role: Role of the nth prompt message
  • llm.prompts.{n}.content: Content of the nth prompt message
  • llm.completions.{n}.role: Role of the nth completion
  • llm.completions.{n}.content: Content of the nth completion
  • llm.usage.prompt_tokens: Number of input tokens
  • llm.usage.completion_tokens: Number of output tokens
  • llm.request.model: Model name
  • llm.system: Provider name
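
Put together, a single chat completion produces a span whose attributes look roughly like this (values are illustrative):
# Illustrative attributes recorded on one chat-completion span
span_attributes = {
    "llm.system": "openai",
    "llm.request.model": "gpt-4o",
    "llm.prompts.0.role": "user",
    "llm.prompts.0.content": "Hello!",
    "llm.completions.0.role": "assistant",
    "llm.completions.0.content": "Hi! How can I help you today?",
    "llm.usage.prompt_tokens": 9,
    "llm.usage.completion_tokens": 12,
}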

Supported providers

OpenLLMetry automatically captures calls to:
  • OpenAI
  • Anthropic
  • Cohere
  • Azure OpenAI
  • AWS Bedrock
  • Google AI (Vertex AI, Gemini)
  • Mistral
  • Ollama
  • And more
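
No per-provider setup is required: once Traceloop.init has run, calls made through any supported client are traced. For example, an Anthropic call (model name illustrative):
from anthropic import Anthropic

client = Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative model name
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)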

Verifying it works

After setting up, make a few LLM calls and check:
  1. No errors in your application logs
  2. Data appears in the Moda dashboard within a few seconds
OpenLLMetry sends data asynchronously, so it does not slow down your LLM calls.
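
A minimal end-to-end check, reusing the setup from earlier:
from traceloop.sdk import Traceloop
from openai import OpenAI

Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY"
)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "ping"}]
)
print(response.choices[0].message.content)
# If this prints a reply and no errors are logged, check the Moda dashboard for the trace.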

Advanced: Custom attributes

You can enrich your traces with additional context. The workflow decorator groups related LLM calls under a named span:
from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

@workflow(name="my-chatbot")
def handle_chat(user_message: str):
    # LLM calls made inside this function are grouped under
    # the "my-chatbot" workflow span in your traces
    pass
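
If your SDK version supports association properties, you can also attach arbitrary key-value context to subsequent spans; a sketch with illustrative keys:
from traceloop.sdk import Traceloop

# Attach custom context to spans created after this call (keys are illustrative)
Traceloop.set_association_properties({
    "user_id": "user-123",
    "session_id": "session-456",
})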

Troubleshooting

Data not appearing?
  • Check that your API key is correct
  • Verify the endpoint URL is https://moda-ingest.modas.workers.dev/v1/traces
  • Look for error messages in your application logs
Getting connection errors?
  • Make sure your firewall allows outbound HTTPS connections
  • Verify the endpoint URL does not have typos
Token counts missing?
  • Some providers may not return token usage in streaming mode; for OpenAI you can request it explicitly (see the sketch below)
  • Check if the provider supports token counting
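
For OpenAI streaming specifically, usage can be requested in the final chunk via stream_options (a standard OpenAI API option; whether your instrumentation version records it may vary):
from openai import OpenAI

client = OpenAI()
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True},  # final chunk carries usage
)
for chunk in stream:
    if chunk.usage is not None:  # populated only on the last chunk
        print(chunk.usage.prompt_tokens, chunk.usage.completion_tokens)
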
Need help? Check the Traceloop documentation for more details on OpenLLMetry configuration.