# What is OpenLLMetry?
OpenLLMetry is an open-source SDK built by Traceloop that automatically instruments your LLM calls. It captures request and response data and sends it to any OpenTelemetry-compatible backend, including Moda.
## Installation

```bash
pip install traceloop-sdk
```
## Setup

Add a few lines at the start of your application, before any LLM calls are made:

```python
from traceloop.sdk import Traceloop

Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY",
)

# Your existing code works as normal
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
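If you prefer to keep credentials out of your code, the SDK can also read its settings from environment variables. The variable names below follow Traceloop's conventions but are an assumption here; verify them against your SDK version:

```bash
# Assumed variable names -- check your traceloop-sdk version's docs.
export TRACELOOP_BASE_URL="https://moda-ingest.modas.workers.dev/v1/traces"
export TRACELOOP_API_KEY="YOUR_MODA_API_KEY"
```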
## Configuration options

| Option | Description |
|---|---|
| `base_url` / `baseUrl` | The Moda ingest endpoint URL |
| `api_key` / `apiKey` | Your Moda API key |
| `service_name` | Optional name to identify your application |
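Putting the options together, initialization looks roughly like this. The `service_name` keyword follows the table above; the exact parameter name accepted by `Traceloop.init` may differ between SDK versions, so check its signature:

```python
from traceloop.sdk import Traceloop

# service_name is taken from the options table above; confirm the exact
# keyword against your SDK version's Traceloop.init signature.
Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY",
    service_name="my-chatbot",
)
```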
## What data is captured
OpenLLMetry automatically captures:
- Messages: User prompts and assistant responses
- Content blocks: Structured content including tool use, extended thinking, and images (when supported by the provider)
- Token usage: Input, output, and reasoning token counts (for cost tracking)
- Model info: Model name and provider
- Timing: Request duration and timestamps
Content blocks are automatically captured when using providers that support structured responses (Anthropic, OpenAI with tools). This enables features like conversation replay in the Moda dashboard.
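One way to exercise content-block capture is a tool-calling request. This sketch uses the OpenAI SDK with a hypothetical `get_weather` tool; the resulting tool-use blocks should appear on the captured span:

```python
from openai import OpenAI

client = OpenAI()

# The tool call in the response is recorded as a structured content block.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool for illustration
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(response.choices[0].message.tool_calls)
```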
## Semantic conventions

OpenLLMetry uses OpenTelemetry semantic conventions for LLM spans:

| Attribute | Description |
|---|---|
| `llm.prompts.{n}.role` | Role of the *n*th prompt message |
| `llm.prompts.{n}.content` | Content of the *n*th prompt message |
| `llm.completions.{n}.role` | Role of the *n*th completion |
| `llm.completions.{n}.content` | Content of the *n*th completion |
| `llm.usage.prompt_tokens` | Number of input tokens |
| `llm.usage.completion_tokens` | Number of output tokens |
| `llm.request.model` | Model name |
| `llm.system` | Provider name |
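As a concrete illustration, the quick-start call from the Setup section would produce a span with attributes roughly like the following (the completion text and token counts are invented for illustration):

```text
llm.system                   = "openai"
llm.request.model            = "gpt-4o"
llm.prompts.0.role           = "user"
llm.prompts.0.content        = "Hello!"
llm.completions.0.role       = "assistant"
llm.completions.0.content    = "Hi there! How can I help you today?"
llm.usage.prompt_tokens      = 9
llm.usage.completion_tokens  = 10
```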
## Supported providers
OpenLLMetry automatically captures calls to:
- OpenAI
- Anthropic
- Cohere
- Azure OpenAI
- AWS Bedrock
- Google AI (Vertex AI, Gemini)
- Mistral
- Ollama
- And more
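Because instrumentation happens at the SDK level, switching providers needs no extra setup. For example, once `Traceloop.init` has run, an Anthropic call is captured the same way (a minimal sketch using the `anthropic` package):

```python
import anthropic

# No additional configuration needed -- the call below is instrumented
# automatically once Traceloop.init has run.
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content)
```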
## Verifying it works
After setting up, make a few LLM calls and check:
- No errors in your application logs
- Data appears in the Moda dashboard within a few seconds
OpenLLMetry sends data asynchronously, so it does not slow down your LLM calls.
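For short-lived scripts, the asynchronous batch exporter may not get a chance to flush before the process exits. If your SDK version supports it (the `disable_batch` flag below is an assumption to verify against your version), you can export each span immediately while testing:

```python
from traceloop.sdk import Traceloop

# disable_batch=True (if supported by your SDK version) sends each span
# as soon as it ends -- useful when verifying a short-lived script.
Traceloop.init(
    base_url="https://moda-ingest.modas.workers.dev/v1/traces",
    api_key="YOUR_MODA_API_KEY",
    disable_batch=True,
)
```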
## Advanced: Custom attributes

You can annotate your traces with additional context. The `@workflow` decorator groups the LLM calls made inside a function under a named span:
```python
from openai import OpenAI
from traceloop.sdk.decorators import workflow

client = OpenAI()

@workflow(name="my-chatbot")
def handle_chat(user_message: str) -> str:
    # Spans from this call are grouped under the "my-chatbot" workflow
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": user_message}],
    )
    return response.choices[0].message.content
```
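To attach arbitrary key-value context (for example, a user or session ID), the Traceloop SDK provides `Traceloop.set_association_properties` (check your SDK version); the keys below are illustrative:

```python
from traceloop.sdk import Traceloop

# Illustrative keys: these properties are attached to subsequent spans,
# so you can filter traces by user or session in the Moda dashboard.
Traceloop.set_association_properties({
    "user_id": "user-123",
    "session_id": "session-456",
})
```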
## Troubleshooting

### Data not appearing?

- Check that your API key is correct
- Verify the endpoint URL is `https://moda-ingest.modas.workers.dev/v1/traces`
- Look for error messages in your application logs (the snippet below shows how to surface them)
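If the logs are quiet, raising Python's log level usually exposes exporter errors such as failed HTTP requests (a generic sketch, not specific to Moda):

```python
import logging

# OpenTelemetry exporters report transport failures through the standard
# logging module; DEBUG level makes retries and HTTP errors visible.
logging.basicConfig(level=logging.DEBUG)
```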
### Getting connection errors?
- Make sure your firewall allows outbound HTTPS connections
- Verify the endpoint URL does not have typos
### Token counts missing?
- Some providers may not return token usage in streaming mode
- Check if the provider supports token counting